plant-disease-id/tasks/hyper-specific-plant-disease-id/04-ml-model-integration.md
Michael Freno 820a872f07 Initial commit: Plant Disease Identification app
- Next.js 16 App Router project with Tailwind CSS
- Plant disease knowledge base (93 diseases, 25 plants)
- Image upload with client+server preprocessing
- ML inference pipeline with mock/demo fallback
- Responsive results page with disease cards and treatment
- Full test suite (285 passing tests)
2026-06-05 19:21:16 -04:00

04. ML Model Loading, Inference Pipeline, and Confidence Scoring

meta:
  id: hyper-specific-plant-disease-id-04
  feature: hyper-specific-plant-disease-id
  priority: P1
  depends_on: [hyper-specific-plant-disease-id-02, hyper-specific-plant-disease-id-03]
  tags: [ml, inference, backend]

objective:

  • Integrate a custom TensorFlow.js or ONNX-compatible plant disease classifier model into the Next.js API layer — handle model loading, batched inference, confidence scoring, and result ranking against the knowledge base.

deliverables:

  • lib/ml/model-loader.ts: singleton model loader that lazy-loads the TF.js/ONNX model and caches it in memory
  • lib/ml/inference.ts: runInference(imageTensor: Float32Array): Promise<RawPrediction[]>, returning top-K class probabilities
  • lib/ml/labels.ts: class label mapping (model output index → disease ID / "healthy" / "unknown")
  • lib/ml/confidence.ts: softmax + confidence calibration, threshold logic (high ≥0.8, medium ≥0.5, low <0.5)
  • app/api/identify/route.ts: POST /api/identify accepting { imageId }, running the full pipeline, and returning ranked results with knowledge base enrichment
  • lib/api/identify.ts: client helper to call the identify endpoint
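
A minimal sketch of the threshold logic listed for lib/ml/confidence.ts. The label cutoffs (0.8 and 0.5) come from the deliverable above. The Confidence type, the labelFor helper, the clamp-only "calibration", and applying the label to the adjusted value are all assumptions, since this task does not specify a calibration method.

```typescript
type ConfidenceLabel = "high" | "medium" | "low";

// Superset of the { adjusted, label } shape: the raw probability is kept
// alongside so the API response can report both values.
interface Confidence {
  raw: number;
  adjusted: number;
  label: ConfidenceLabel;
}

// Cutoffs from the deliverables list: high >= 0.8, medium >= 0.5, low < 0.5.
const HIGH_THRESHOLD = 0.8;
const MEDIUM_THRESHOLD = 0.5;

function labelFor(probability: number): ConfidenceLabel {
  if (probability >= HIGH_THRESHOLD) return "high";
  if (probability >= MEDIUM_THRESHOLD) return "medium";
  return "low";
}

// ASSUMPTION: placeholder calibration that only clamps to [0, 1].
// A real implementation would substitute a calibration fitted on
// held-out validation data.
function calibrateConfidence(rawProb: number): Confidence {
  const adjusted = Math.min(1, Math.max(0, rawProb));
  return { raw: rawProb, adjusted, label: labelFor(adjusted) };
}
```

Keeping labelFor separate from the calibration step means the cutoffs can be unit-tested on their own, whatever calibration is eventually chosen.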

steps:

  1. Set up model storage and loading:
    • Place the converted model files (model.json + weight shards for TF.js, or a single .onnx file for ONNX Runtime) in public/models/plant-disease-classifier/.
    • Implement lib/ml/model-loader.ts with lazy singleton pattern — loads model on first call, keeps in globalThis cache for subsequent calls.
    • Support both TensorFlow.js (@tensorflow/tfjs-node for server, @tensorflow/tfjs for client fallback) and ONNX Runtime (onnxruntime-node).
    • Graceful fallback: if no model file found, use a deterministic mock returning "model not loaded" with explanatory message.
  2. Build lib/ml/inference.ts:
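    • Accept a normalized Float32Array of shape [1, 3, 224, 224] (NCHW, the layout ONNX models exported from PyTorch typically expect). TF.js image models usually take NHWC ([1, 224, 224, 3]), so match the layout to the runtime in use.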
    • Run model forward pass.
    • Apply softmax to the logits (skip this if the model's final layer already outputs probabilities).
    • Return top-5 predictions with class indices and raw probabilities.
    • Measure inference time and attach to result.
  3. Implement lib/ml/labels.ts:
    • Map model output index → disease ID string (e.g., 0 → "tomato-early-blight", 1 → "tomato-late-blight", …).
    • Include "healthy" class for each plant.
    • Include "unknown" as final catch-all class.
  4. Implement lib/ml/confidence.ts:
    • calibrateConfidence(rawProb: number): { adjusted: number, label: "high" | "medium" | "low" }.
    • Apply threshold logic: only return predictions above minConfidence (configurable, default 0.15).
  5. Build app/api/identify/route.ts:
    • Accept { imageId } in request body.
    • Load image from public/uploads/{imageId} and preprocess (reuse pipeline from task 03).
    • Run inference.
    • Look up each top-K disease ID in knowledge base (from task 02) to enrich with name, description, symptoms, treatment.
    • Enrich with lookalike disease cross-references.
    • Return:
      {
        "predictions": [
          {
            "diseaseId": "tomato-early-blight",
            "disease": { /* enriched from knowledge base */ },
            "confidence": { "raw": 0.87, "adjusted": 0.91, "label": "high" },
            "lookalikes": ["tomato-septoria-leaf-spot"]
          }
        ],
        "metadata": { "model": "plant-classifier-v1", "inferenceTimeMs": 320, "imageId": "..." }
      }
      
  6. Add lib/api/identify.ts — a typed client-side function that POSTs to /api/identify with the imageId and returns the typed response.
  7. If no model file is present at build/runtime, return a deterministic mock response with a "demo_mode": true flag so the UI still works for development.
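
Step 5's enrichment can be sketched as a pure function, which keeps the route handler thin and easy to test. Everything named here (DiseaseEntry, buildIdentifyResponse, the Map-based knowledge base, the calibrate callback) is hypothetical, because the real knowledge base API comes from task 02. Skipping class labels that have no knowledge base entry is also an assumption, chosen so that every returned diseaseId resolves.

```typescript
interface RawPrediction { classIndex: number; probability: number; }
interface Confidence { raw: number; adjusted: number; label: "high" | "medium" | "low"; }
// ASSUMPTION: a simplified knowledge base record; the real one has more fields.
interface DiseaseEntry { id: string; name: string; lookalikes: string[]; }

interface EnrichedPrediction {
  diseaseId: string;
  disease: DiseaseEntry;
  confidence: Confidence;
  lookalikes: string[];
}
interface IdentifyMetadata { model: string; inferenceTimeMs: number; imageId: string; }
interface IdentifyResponse { predictions: EnrichedPrediction[]; metadata: IdentifyMetadata; }

function buildIdentifyResponse(
  raw: RawPrediction[],
  labels: readonly string[],
  knowledgeBase: ReadonlyMap<string, DiseaseEntry>,
  calibrate: (rawProb: number) => Confidence,
  metadata: IdentifyMetadata,
  minConfidence = 0.15, // default from step 4
): IdentifyResponse {
  const predictions: EnrichedPrediction[] = [];
  for (const { classIndex, probability } of raw) {
    // Drop predictions below the configurable threshold.
    if (probability < minConfidence) continue;
    // Out-of-range indices map to the catch-all class.
    const diseaseId = labels[classIndex] ?? "unknown";
    const disease = knowledgeBase.get(diseaseId);
    // ASSUMPTION: labels without a knowledge base entry are skipped.
    if (!disease) continue;
    predictions.push({
      diseaseId,
      disease,
      confidence: calibrate(probability),
      lookalikes: disease.lookalikes,
    });
  }
  // Rank by adjusted confidence, highest first.
  predictions.sort((a, b) => b.confidence.adjusted - a.confidence.adjusted);
  return { predictions, metadata };
}
```

How "healthy" and "unknown" results should be surfaced to the UI is left open here; if they need to appear in the response, they would need knowledge base entries or a separate branch.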

tests:

  • Unit: softmax([1, 2, 3]) sums to ~1.0.
  • Unit: calibrateConfidence(0.9) returns label "high".
  • Unit: Top-5 extraction returns exactly 5 entries sorted descending.
  • Integration: POST /api/identify with valid imageId returns 200 with predictions array.
  • Integration: POST /api/identify with invalid imageId returns 404.
  • Integration: Each prediction's diseaseId exists in knowledge base (cross-reference).
  • Load: Inference completes in under 3 seconds, well inside the Vercel serverless function timeout.
    • Potential issue: Vercel serverless functions have no GPU, and cold starts add model-loading latency on top of inference.
    • Mitigation: run ONNX Runtime on CPU in a Node.js function, or move inference to a dedicated GPU-backed service if CPU latency proves too high.
    • For initial deployment, CPU inference with a MobileNet-derived model under 5 MB is acceptable (target <1s per image).
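
The unit tests above imply helpers roughly like the following. This is a sketch rather than the project's actual code: the RawPrediction field names and the topK helper are assumptions. The softmax subtracts the maximum logit before exponentiating, a standard way to avoid overflow in Math.exp without changing the result.

```typescript
// ASSUMPTION: field names for RawPrediction are illustrative.
interface RawPrediction {
  classIndex: number;
  probability: number;
}

// Numerically stable softmax over a vector of logits.
function softmax(logits: ArrayLike<number>): number[] {
  const values = Array.from(logits);
  if (values.length === 0) return [];
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map((e) => e / sum);
}

// Returns the k most probable classes, sorted by descending probability.
function topK(probabilities: readonly number[], k = 5): RawPrediction[] {
  return probabilities
    .map((probability, classIndex) => ({ classIndex, probability }))
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
}
```

With these, softmax([1, 2, 3]) sums to 1 within floating-point tolerance, and topK on a vector of at least five probabilities returns five entries in descending order. When the model has fewer than k classes, topK returns all of them.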

acceptance_criteria:

  • Model loads once and caches for subsequent requests.
  • Inference returns top-5 predictions with confidence scores.
  • Each prediction is enriched with full knowledge base data.
  • Predictions include lookalike cross-references.
  • Response includes inference timing metadata.
  • Mock mode works when model file is absent.
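
The first and last criteria (load once, fall back to a mock when the model is absent) can be sketched as below. This is illustrative only: LoadedModel, getModel, createMockModel, and the __plantDiseaseModel cache key are hypothetical names, and loadReal stands in for whichever TF.js or ONNX Runtime loading call the project ends up using. Caching the promise rather than the resolved model means concurrent first requests share a single load.

```typescript
// ASSUMPTION: a minimal runtime-agnostic model interface.
interface LoadedModel {
  name: string;
  demoMode: boolean;
  predict(input: Float32Array): Float32Array;
}

type ModelLoader = () => Promise<LoadedModel>;

// globalThis survives module re-evaluation (e.g. dev hot reload),
// so the cache lives there instead of in a module-level variable.
type GlobalWithModel = typeof globalThis & {
  __plantDiseaseModel?: Promise<LoadedModel>;
};

// Deterministic stand-in used when no real model can be loaded.
function createMockModel(numClasses: number): LoadedModel {
  return {
    name: "mock",
    demoMode: true,
    predict: () => {
      const out = new Float32Array(numClasses);
      out[0] = 1; // always "predicts" class 0
      return out;
    },
  };
}

function getModel(loadReal: ModelLoader, numClasses = 3): Promise<LoadedModel> {
  const g = globalThis as GlobalWithModel;
  if (!g.__plantDiseaseModel) {
    // Simplification: any load failure is treated as "model missing".
    // Production code would likely distinguish and log other errors.
    g.__plantDiseaseModel = loadReal().catch(() => createMockModel(numClasses));
  }
  return g.__plantDiseaseModel;
}
```

The route handler can then check demoMode on the resolved model to decide whether to set the "demo_mode": true flag on the response.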

validation:

# First upload an image
UPLOAD_RESP=$(curl -s -X POST -F "image=@test-assets/tomato-leaf.jpg" http://localhost:3000/api/upload)
IMAGE_ID=$(echo "$UPLOAD_RESP" | jq -r '.imageId')

# Then identify
curl -s -X POST -H "Content-Type: application/json" \
  -d "{\"imageId\": \"$IMAGE_ID\"}" \
  http://localhost:3000/api/identify | jq '.predictions[0].disease.name'
# → "Early Blight"
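
The same identify call from TypeScript, as a sketch of the lib/api/identify.ts helper. The response type mirrors the JSON shape in step 5. FetchLike, identifyImage, and passing the fetch implementation as a parameter are assumptions, made so the helper can be exercised without a running server.

```typescript
interface IdentifyResponse {
  predictions: Array<{
    diseaseId: string;
    disease: Record<string, unknown>; // enriched from the knowledge base
    confidence: { raw: number; adjusted: number; label: "high" | "medium" | "low" };
    lookalikes: string[];
  }>;
  metadata: { model: string; inferenceTimeMs: number; imageId: string };
  demo_mode?: boolean; // set when the mock model answered
}

// ASSUMPTION: the minimal slice of the fetch API this helper needs.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function identifyImage(
  imageId: string,
  fetchImpl: FetchLike,
): Promise<IdentifyResponse> {
  const res = await fetchImpl("/api/identify", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ imageId }),
  });
  if (!res.ok) {
    // Covers the 404 returned for an invalid imageId.
    throw new Error(`identify failed with status ${res.status}`);
  }
  // The cast trusts the server; add runtime validation if that is not enough.
  return (await res.json()) as IdentifyResponse;
}
```

In the browser, the global fetch can be passed as fetchImpl; in tests, a stub that returns a canned response works the same way.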

notes:

  • A pre-trained MobileNetV2 fine-tuned on PlantVillage + augmented custom data is recommended — it's small (<10 MB), fast on CPU, and reasonably accurate.
  • The actual model training process is OUT OF SCOPE for this task. This task assumes a trained model file is provided. Log a placeholder warning if the model file is missing.
  • If the TF.js Node binding causes problems, fall back to ONNX Runtime (onnxruntime-node), whose native C++ runtime tends to be more stable on Lambda/Vercel.
  • Keep Vercel's maximum serverless function duration in mind (60s on Pro, 10s on Hobby; verify against the current plan limits): keep the model <10 MB and inference <3s.