04. ML Model Loading, Inference Pipeline, and Confidence Scoring
meta:
- id: hyper-specific-plant-disease-id-04
- feature: hyper-specific-plant-disease-id
- priority: P1
- depends_on: [hyper-specific-plant-disease-id-02, hyper-specific-plant-disease-id-03]
- tags: [ml, inference, backend]
objective:
- Integrate a custom TensorFlow.js or ONNX-compatible plant disease classifier model into the Next.js API layer — handle model loading, batched inference, confidence scoring, and result ranking against the knowledge base.
deliverables:
- `lib/ml/model-loader.ts`: singleton model loader that lazy-loads the TF.js/ONNX model and caches it in memory
- `lib/ml/inference.ts`: `runInference(imageTensor: Float32Array): Promise<RawPrediction[]>` returning top-K class probabilities
- `lib/ml/labels.ts`: class label mapping (model output index → disease ID / "healthy" / "unknown")
- `lib/ml/confidence.ts`: softmax + confidence calibration, threshold logic (high ≥0.8, medium ≥0.5, low <0.5)
- `app/api/identify/route.ts`: `POST /api/identify` accepting `{ imageId }`, running the full pipeline, returning ranked results with knowledge base enrichment
- `lib/api/identify.ts`: client helper to call the identify endpoint
steps:
- Set up model storage and loading:
  - Place compiled model files (`model.json` + weight shards) in `public/models/plant-disease-classifier/`.
  - Implement `lib/ml/model-loader.ts` with a lazy singleton pattern: load the model on first call and keep it in a `globalThis` cache for subsequent calls.
  - Support both TensorFlow.js (`@tensorflow/tfjs-node` for server, `@tensorflow/tfjs` for client fallback) and ONNX Runtime (`onnxruntime-node`).
  - Graceful fallback: if no model file is found, use a deterministic mock returning "model not loaded" with an explanatory message.
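The lazy singleton with mock fallback described in this step could look roughly like the sketch below. It is not tied to either runtime: `ClassifierModel`, `ModelFactory`, `createMockModel`, `getModel`, and the `__plantDiseaseModel` cache key are hypothetical names, and the factory is assumed to resolve to `null` when no model file is found. A real loader would wrap a TF.js model or an ONNX session behind the same interface.

```typescript
// Hypothetical minimal model interface; a real TF.js model or ONNX session
// would be wrapped in an adapter that satisfies it.
export interface ClassifierModel {
  readonly name: string;
  readonly isMock: boolean;
  predict(input: Float32Array): Promise<Float32Array>; // raw logits
}

// Assumed convention: resolves to null when no model file is found.
export type ModelFactory = () => Promise<ClassifierModel | null>;

// Assumed shape of the slot on globalThis that holds the cached load.
type ModelCache = { __plantDiseaseModel?: Promise<ClassifierModel> };

// Deterministic mock: always returns the same descending logits.
export function createMockModel(numClasses: number): ClassifierModel {
  return {
    name: "mock (model not loaded)",
    isMock: true,
    predict: async (): Promise<Float32Array> => {
      const logits = new Float32Array(numClasses);
      for (let i = 0; i < numClasses; i++) logits[i] = numClasses - i;
      return logits;
    },
  };
}

// Lazy singleton: the factory runs on the first call only. The promise itself
// is cached, so concurrent first requests share a single load.
export function getModel(
  factory: ModelFactory,
  numClasses = 5,
): Promise<ClassifierModel> {
  const cache = globalThis as unknown as ModelCache;
  let cached = cache.__plantDiseaseModel;
  if (!cached) {
    cached = factory()
      .then((model) => model ?? createMockModel(numClasses))
      // A load error also falls back to the mock; a real loader would log it.
      .catch(() => createMockModel(numClasses));
    cache.__plantDiseaseModel = cached;
  }
  return cached;
}
```

Caching on `globalThis` rather than in a module-level variable is what lets the instance survive module re-evaluation during development hot reloads.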
- Build `lib/ml/inference.ts`:
  - Accept normalized Float32Array of shape `[1, 3, 224, 224]`.
  - Run model forward pass.
  - Apply softmax to logits.
  - Return top-5 predictions with class indices and raw probabilities.
  - Measure inference time and attach to result.
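The post-processing part of this step (softmax, top-K extraction, timing) can be sketched independently of the model runtime. In the sketch below the forward pass is injected as a function, so `runInferenceWith` is a hypothetical variant of the `runInference` deliverable, and the `RawPrediction` field names are assumptions.

```typescript
// Assumed shape of one raw prediction; field names are illustrative.
export interface RawPrediction {
  classIndex: number;
  probability: number;
}

// Numerically stable softmax: subtract the max logit before exponentiating.
export function softmax(logits: ArrayLike<number>): number[] {
  let max = -Infinity;
  for (let i = 0; i < logits.length; i++) max = Math.max(max, logits[i]);
  const exps: number[] = [];
  let sum = 0;
  for (let i = 0; i < logits.length; i++) {
    const e = Math.exp(logits[i] - max);
    exps.push(e);
    sum += e;
  }
  return exps.map((e) => e / sum);
}

// Pair each probability with its class index, sort descending, keep k.
export function topK(probs: number[], k = 5): RawPrediction[] {
  return probs
    .map((probability, classIndex) => ({ classIndex, probability }))
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
}

const EXPECTED_LENGTH = 1 * 3 * 224 * 224; // flattened [1, 3, 224, 224]

// Hypothetical wrapper: `forward` stands in for the real model call.
export async function runInferenceWith(
  forward: (input: Float32Array) => Promise<ArrayLike<number>>,
  imageTensor: Float32Array,
  k = 5,
): Promise<{ predictions: RawPrediction[]; inferenceTimeMs: number }> {
  if (imageTensor.length !== EXPECTED_LENGTH) {
    throw new Error(
      `expected ${EXPECTED_LENGTH} values, got ${imageTensor.length}`,
    );
  }
  const start = Date.now();
  const logits = await forward(imageTensor);
  const predictions = topK(softmax(logits), k);
  return { predictions, inferenceTimeMs: Date.now() - start };
}
```

For example, `softmax([1, 2, 3])` is approximately `[0.090, 0.245, 0.665]`, which sums to 1.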
- Implement `lib/ml/labels.ts`:
  - Map model output index → disease ID string (e.g., `0 → "tomato-early-blight"`, `1 → "tomato-late-blight"`, …).
  - Include `"healthy"` class for each plant.
  - Include `"unknown"` as final catch-all class.
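A label table for this step might be as simple as an ordered array whose position matches the model's output index. The sketch below assumes a `<plant>-healthy` ID format for healthy classes and uses a deliberately tiny table; the real ordering must come from the trained model's class list, which is not part of this task. `CLASS_LABELS`, `labelForIndex`, and `isDiseaseLabel` are hypothetical names.

```typescript
// Illustrative table only: position in the array is the model output index.
// The "<plant>-healthy" ID format is an assumption.
export const CLASS_LABELS: readonly string[] = [
  "tomato-early-blight",
  "tomato-late-blight",
  "tomato-healthy",
  "unknown", // final catch-all class
];

// Map an output index to its label; out-of-range or non-integer indices
// fall back to the catch-all.
export function labelForIndex(index: number): string {
  const inRange =
    Number.isInteger(index) && index >= 0 && index < CLASS_LABELS.length;
  return inRange ? CLASS_LABELS[index] : "unknown";
}

// True only for labels that should be looked up as a disease.
export function isDiseaseLabel(label: string): boolean {
  return label !== "unknown" && !label.endsWith("-healthy");
}
```

Falling back to `"unknown"` for unexpected indices keeps a mismatch between the label table and the model's output size from throwing at request time.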
- Implement `lib/ml/confidence.ts`:
  - `calibrateConfidence(rawProb: number): { adjusted: number, label: "high" | "medium" | "low" }`.
  - Apply threshold logic: only return predictions above `minConfidence` (configurable, default 0.15).
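The threshold logic can be sketched as follows, using the cut-offs from the deliverables (high ≥0.8, medium ≥0.5, low otherwise). The calibration method itself is not specified in this task, so the sketch takes it as an optional `adjust` function that defaults to the identity; that parameter, the choice to label by the adjusted value, and `filterByMinConfidence` are assumptions.

```typescript
export type ConfidenceLabel = "high" | "medium" | "low";

const clamp01 = (p: number): number => Math.min(1, Math.max(0, p));

// `adjust` is a hypothetical hook for the real calibration; identity by default.
export function calibrateConfidence(
  rawProb: number,
  adjust: (p: number) => number = (p) => p,
): { adjusted: number; label: ConfidenceLabel } {
  const adjusted = clamp01(adjust(clamp01(rawProb)));
  // Assumption: the label is derived from the adjusted value, not the raw one.
  const label: ConfidenceLabel =
    adjusted >= 0.8 ? "high" : adjusted >= 0.5 ? "medium" : "low";
  return { adjusted, label };
}

// Drop predictions below the floor. Treating a value exactly equal to
// minConfidence as passing is an assumption.
export function filterByMinConfidence<T extends { probability: number }>(
  predictions: T[],
  minConfidence = 0.15,
): T[] {
  return predictions.filter((p) => p.probability >= minConfidence);
}
```

With the identity default, `calibrateConfidence(0.9)` yields the label `"high"`, matching the unit test listed for this task.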
- Build `app/api/identify/route.ts`:
  - Accept `{ imageId }` in request body.
  - Load image from `public/uploads/{imageId}` and preprocess (reuse pipeline from task 03).
  - Run inference.
  - Look up each top-K disease ID in knowledge base (from task 02) to enrich with name, description, symptoms, treatment.
  - Enrich with lookalike disease cross-references.
  - Return:
    { "predictions": [ { "diseaseId": "tomato-early-blight", "disease": { /* enriched from knowledge base */ }, "confidence": { "raw": 0.87, "adjusted": 0.91, "label": "high" }, "lookalikes": ["tomato-septoria-leaf-spot"] } ], "metadata": { "model": "plant-classifier-v1", "inferenceTimeMs": 320, "imageId": "..." } }
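The enrichment part of the route can be kept as a pure function, separate from request handling, which makes it easy to unit test. The sketch below assumes a knowledge base exposed as a `Map` keyed by disease ID and a `DiseaseEntry` shape with a `lookalikes` field; the real types come from task 02. Dropping predictions whose ID is not in the knowledge base is also an assumption.

```typescript
// Assumed knowledge base entry shape (the real one is defined in task 02).
export interface DiseaseEntry {
  id: string;
  name: string;
  description: string;
  symptoms: string[];
  treatment: string[];
  lookalikes: string[];
}

export interface Confidence {
  raw: number;
  adjusted: number;
  label: "high" | "medium" | "low";
}

export interface IdentifyMetadata {
  model: string;
  inferenceTimeMs: number;
  imageId: string;
}

export interface IdentifyResponse {
  predictions: {
    diseaseId: string;
    disease: DiseaseEntry;
    confidence: Confidence;
    lookalikes: string[];
  }[];
  metadata: IdentifyMetadata;
}

// Join ranked predictions with knowledge base data, preserving rank order.
export function buildIdentifyResponse(
  ranked: { diseaseId: string; confidence: Confidence }[],
  knowledgeBase: Map<string, DiseaseEntry>,
  metadata: IdentifyMetadata,
): IdentifyResponse {
  const predictions: IdentifyResponse["predictions"] = [];
  for (const { diseaseId, confidence } of ranked) {
    const disease = knowledgeBase.get(diseaseId);
    // Assumption: IDs with no knowledge base entry are skipped.
    if (!disease) continue;
    predictions.push({
      diseaseId,
      disease,
      confidence,
      lookalikes: disease.lookalikes,
    });
  }
  return { predictions, metadata };
}
```

The route handler would then only parse the body, load and preprocess the image, run inference, and pass the ranked list through this function before serializing the result.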
- Add `lib/api/identify.ts`: a typed client-side function that `POST`s to `/api/identify` with the imageId and returns the typed response.
- If no model file is present at build/runtime, return a deterministic mock response with a `"demo_mode": true` flag so the UI still works for development.
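A typed client helper for the identify endpoint could be sketched like this. To keep the example self-contained and testable, the fetch implementation is passed in as a parameter; `FetchLike`, `IdentifyResult`, and `identifyImage` are hypothetical names, and the response type is reduced to the fields shown in the example payload plus the optional `demo_mode` flag.

```typescript
// Reduced response type; the full shape follows the route's JSON payload.
export interface IdentifyResult {
  predictions: {
    diseaseId: string;
    confidence: { raw: number; adjusted: number; label: "high" | "medium" | "low" };
    lookalikes: string[];
  }[];
  metadata: { model: string; inferenceTimeMs: number; imageId: string };
  demo_mode?: boolean; // present when the server answered from the mock
}

// Minimal structural type for the parts of fetch this helper uses.
export type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POST the imageId and return the parsed response, or throw on a non-2xx status.
export async function identifyImage(
  imageId: string,
  fetchImpl: FetchLike,
): Promise<IdentifyResult> {
  const res = await fetchImpl("/api/identify", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ imageId }),
  });
  if (!res.ok) {
    throw new Error(`identify request failed with status ${res.status}`);
  }
  // The cast trusts the server contract; runtime validation could be added.
  return (await res.json()) as IdentifyResult;
}
```

In the browser the global `fetch` would be passed as `fetchImpl`; callers can branch on `demo_mode` to show a banner when results come from the mock.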
tests:
- Unit: `softmax([1, 2, 3])` sums to ~1.0.
- Unit: `calibrateConfidence(0.9)` returns label `"high"`.
- Unit: Top-5 extraction returns exactly 5 entries sorted descending.
- Integration: `POST /api/identify` with valid imageId returns 200 with predictions array.
- Integration: `POST /api/identify` with invalid imageId returns 404.
- Integration: Each prediction's `diseaseId` exists in knowledge base (cross-reference).
- Load: Inference completes under 3 seconds, a budget chosen to stay well inside the Vercel serverless function timeout.
  - Potential issue: standard serverless functions run on CPU, so inference latency may be higher than on GPU-backed hosting.
  - Mitigation: consider a GPU-backed inference host, or a Node.js function with ONNX Runtime on CPU.
  - For initial deployment, CPU inference with a MobileNet-derived model under 5 MB is acceptable (expected <1s per image in a Node.js runtime).
acceptance_criteria:
- Model loads once and caches for subsequent requests.
- Inference returns top-5 predictions with confidence scores.
- Each prediction is enriched with full knowledge base data.
- Predictions include lookalike cross-references.
- Response includes inference timing metadata.
- Mock mode works when model file is absent.
validation:
# First upload an image
UPLOAD_RESP=$(curl -s -X POST -F "image=@test-assets/tomato-leaf.jpg" http://localhost:3000/api/upload)
IMAGE_ID=$(echo "$UPLOAD_RESP" | jq -r '.imageId')
# Then identify
curl -s -X POST -H "Content-Type: application/json" \
-d "{\"imageId\": \"$IMAGE_ID\"}" \
http://localhost:3000/api/identify | jq '.predictions[0].disease.name'
# → "Early Blight"
notes:
- A pre-trained MobileNetV2 fine-tuned on PlantVillage + augmented custom data is recommended — it's small (<10 MB), fast on CPU, and reasonably accurate.
- The actual model training process is OUT OF SCOPE for this task. This task assumes a trained model file is provided. Seed a placeholder warning if missing.
- If the TF.js Node binding has issues, fall back to ONNX Runtime, whose Node package wraps a native C++ runtime and tends to be more stable on Lambda/Vercel.
- Consider Vercel's maximum serverless function duration (60s on Pro, 10s on Hobby at the time of writing; limits vary by plan and configuration): keep model <10 MB and inference <3s.