beepboop

2026-06-06 15:09:46 -04:00
parent 78220d3568
commit 06295c83ca
56 changed files with 12018 additions and 440 deletions
--- a/apps/web/tasks/production-ml-pipeline/05-pipeline-integration.md
+++ b/apps/web/tasks/production-ml-pipeline/05-pipeline-integration.md
@@ -0,0 +1,279 @@
+# 05. Real Model Integration into Identification Pipeline
+
+meta:
+id: production-ml-pipeline-05
+feature: production-ml-pipeline
+priority: P0
+depends_on: [production-ml-pipeline-02, production-ml-pipeline-03, production-ml-pipeline-04]
+tags: [implementation, integration, tests-required]
+
+objective:
+
+- Wire the real TF.js model into the `/api/identify` endpoint
+- Replace demo/mock predictions with real model inference
+- Use the PlantVillage label mapping (task 02) to resolve class indices to disease IDs
+- Apply confidence calibration (task 04) to produce meaningful confidence scores
+- Remove the `demo_mode` fallback path
+- Handle healthy class predictions correctly (return "no disease detected" message)
+
+deliverables:
+
+- `src/app/api/identify/route.ts` — rewritten to use real model inference
+- `src/lib/ml/inference.ts` — updated to use calibration and return structured results
+- `src/lib/api/identify.ts` — client-side API updated for new response shape
+- `src/components/ResultsDashboard.tsx` — handle healthy predictions and remove demo mode badge
+- `src/components/HealthyResult.tsx` — new component for "no disease detected" state
+
+steps:
+
+1. **Rewrite `/api/identify` route handler** to use real inference:
+
+   ```typescript
+   export async function POST(request: NextRequest) {
+     // 1. Parse request, validate imageId
+     // 2. Load and preprocess image (existing code)
+     // 3. Run inference with real model
+     const { probabilities, inferenceTimeMs } = await runInference(tensor);
+
+     // 4. Calibrate confidence. `probabilities` is already normalized, so isLogits is false.
+     const calibrated = calibratePrediction(probabilities, false);
+
+     // 5. Check whether the top class is a PlantVillage "healthy" class
+     const isHealthy = isHealthyClass(calibrated.classIndex);
+
+     // 6. If healthy, return healthy result
+     if (isHealthy && calibrated.adjusted > 0.5) {
+       return NextResponse.json({
+         healthy: true,
+         plantId: getPlantIdForIndex(calibrated.classIndex),
+         confidence: calibrated,
+         metadata: { model: MODEL_ID, inferenceTimeMs, imageId },
+       });
+     }
+
+     // 7. Get top-K predictions (not just top-1)
+     const topK = getTopKFloat32(probabilities, 5);
+     const predictions = await enrichPredictions(topK);
+
+     // 8. Return results
+     return NextResponse.json({
+       predictions,
+       metadata: { model: MODEL_ID, inferenceTimeMs, imageId },
+       demo_mode: false, // or remove this field entirely
+     });
+   }
+   ```
+
+2. **Update `runInference()` to return calibrated results**:
+
+   ```typescript
+   export async function runInference(
+     imageTensor: Float32Array,
+     topK: number = 5,
+   ): Promise<InferenceResult> {
+     const model = await getModel();
+     const modelStatus = model.getStatus();
+
+     if (!modelStatus.loaded) {
+       throw new Error("Model not loaded. Cannot run inference.");
+     }
+
+     const { output, inferenceTimeMs } = await model.predict(imageTensor);
+
+     // Determine if output is logits or probabilities
+     const isLogits = !isProbabilities(output);
+
+     // Apply calibration
+     const calibration = calibratePrediction(output, isLogits);
+
+     // Get top-K predictions
+     const probs = isLogits ? temperatureScaledSoftmax(output) : output;
+     const topKPredictions = getTopKFloat32(probs, topK);
+
+     return {
+       predictions: topKPredictions,
+       probabilities: probs, // full distribution, destructured by the route handler in step 1
+       inferenceTimeMs,
+       calibration: {
+         temperature: PLANTVILLAGE_CALIBRATION.temperature,
+         entropy: calibration.entropy,
+         entropyConfidence: calibration.entropyConfidence,
+       },
+     };
+   }
+
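+   function isProbabilities(output: Float32Array): boolean {
+     // Probabilities lie in [0, 1] and sum to ~1; logits can be negative or exceed 1.
+     if (output.some((v) => v < 0 || v > 1)) return false;
+     const sum = output.reduce((a, b) => a + b, 0);
+     return Math.abs(sum - 1.0) < 0.01;
+   }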
+   ```
+
+3. **Update `enrichPredictions()` to use new label mapping**:
+
+   ```typescript
+   async function enrichPredictions(
+     topPredictions: Array<{ classIndex: number; probability: number }>,
+   ): Promise<PredictionResult[]> {
+     const results: PredictionResult[] = [];
+
+     for (const pred of topPredictions) {
+       // Skip healthy classes in top-K (they're handled separately)
+       if (isHealthyClass(pred.classIndex)) continue;
+
+       const diseaseId = getDiseaseIdForIndex(pred.classIndex);
+
+       if (!diseaseId || diseaseId === "healthy") continue;
+
+       const disease = await getDiseaseById(diseaseId);
+       if (!disease) continue;
+
+       // Treat the class probability as the raw confidence and calibrate it (task 04)
+       const confidence = calibrateConfidence(pred.probability);
+
+       const plant = await getPlantById(disease.plantId).catch(() => null);
+
+       results.push({
+         diseaseId,
+         disease,
+         confidence,
+         lookalikes: disease.lookalikeDiseaseIds,
+         plant: plant ?? null,
+       });
+     }
+
+     results.sort((a, b) => b.confidence.adjusted - a.confidence.adjusted);
+     return results;
+   }
+   ```
+
+4. **Update response types** to support healthy result:
+
+   ```typescript
+   // src/lib/types.ts
+   export interface IdentifyResponse {
+     predictions?: PredictionResult[];
+     healthy?: boolean;
+     plantId?: string;
+     confidence?: ConfidenceResult;
+     metadata: InferenceMetadata;
+     demo_mode?: boolean; // Remove or always false
+   }
+   ```
+
+5. **Update `ResultsDashboard` component** to handle healthy result:
+
+   ```tsx
+   // If response.healthy === true, show HealthyResult component instead of prediction cards
+   if (response?.healthy) {
+     return <HealthyResult plantId={response.plantId} confidence={response.confidence} />;
+   }
+   ```
+
+6. **Create `HealthyResult` component** `src/components/HealthyResult.tsx`:
+
+   ```tsx
+   interface HealthyResultProps {
+     plantId?: string;
+     confidence?: ConfidenceResult;
+   }
+
+   export default function HealthyResult({ plantId, confidence }: HealthyResultProps) {
+     const plant = usePlant(plantId); // fetch plant data
+     return (
+       <div className="...">
+         <div className="text-6xl">🌿</div>
+         <h2>No Disease Detected</h2>
+         <p>
+           The image appears healthy{plant ? ` (${plant.commonName})` : ""}.
+           {confidence ? ` Confidence: ${Math.round(confidence.adjusted * 100)}%` : ""}
+         </p>
+         <p className="text-sm text-zinc-500">
+           If symptoms persist, try uploading a clearer photo of the affected area.
+         </p>
+       </div>
+     );
+   }
+   ```
+
+7. **Remove demo mode logic**:
+   - In `model-loader.ts`: remove `createMockModel()` fallback (or keep it but only for development)
+   - In `route.ts`: remove `demo_mode: true` branch
+   - In `ResultsDashboard.tsx`: remove "Demo mode" badge
+   - In `src/lib/api/identify.ts`: remove `demo_mode` from response type
+
+8. **Add error handling for model not loaded**:
+
+   ```typescript
+   const model = await getModel();
+   if (!model.getStatus().loaded) {
+     return NextResponse.json(
+       {
+         error: "Model not available",
+         message: "ML model failed to load. Please try again later.",
+       },
+       { status: 503 },
+     );
+   }
+   ```
+
+9. **Update client-side API** `src/lib/api/identify.ts`:
+
+   ```typescript
+   export interface IdentifyResponse {
+     predictions?: PredictionResult[];
+     healthy?: boolean;
+     plantId?: string;
+     confidence?: ConfidenceResult;
+     metadata: InferenceMetadata;
+   }
+   ```
+
+10. **Add structured logging** for inference requests:
+    ```typescript
+    console.log(
+      JSON.stringify({
+        event: "inference",
+        imageId,
+        modelId: MODEL_ID,
+        inferenceTimeMs,
+        topPrediction: predictions[0]?.diseaseId,
+        confidence: predictions[0]?.confidence.adjusted,
+        entropy: calibrated.entropy, // `calibrated` is the calibratePrediction() result from step 1
+      }),
+    );
+    ```
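+
+The calibration helpers used in steps 1 to 3 come from task 04 and are not shown in this file. As a rough sketch of what they are assumed to compute, the block below uses hypothetical stand-ins: the names `softmaxWithTemperature`, `normalizedEntropy`, and `topK` are illustrative and not taken from the codebase, and the real signatures may differ.
+
+```typescript
+// Hypothetical stand-ins for the task 04 helpers; names and signatures are assumptions.
+
+// Softmax over logits divided by a temperature. T > 1 flattens the distribution.
+export function softmaxWithTemperature(logits: Float32Array, temperature: number): Float32Array {
+  if (temperature <= 0) throw new Error("temperature must be positive");
+  // Subtract the max logit before exponentiating to avoid overflow.
+  let max = -Infinity;
+  for (let i = 0; i < logits.length; i++) max = Math.max(max, logits[i]);
+  const out = new Float32Array(logits.length);
+  let sum = 0;
+  for (let i = 0; i < logits.length; i++) {
+    out[i] = Math.exp((logits[i] - max) / temperature);
+    sum += out[i];
+  }
+  for (let i = 0; i < out.length; i++) out[i] /= sum;
+  return out;
+}
+
+// Shannon entropy normalized to [0, 1]: 0 for a one-hot output, 1 for a uniform one.
+export function normalizedEntropy(probs: Float32Array): number {
+  if (probs.length < 2) return 0;
+  let h = 0;
+  for (let i = 0; i < probs.length; i++) {
+    if (probs[i] > 0) h -= probs[i] * Math.log(probs[i]);
+  }
+  return h / Math.log(probs.length);
+}
+
+// Top-k classes by probability, highest first.
+export function topK(probs: Float32Array, k: number): Array<{ classIndex: number; probability: number }> {
+  return Array.from(probs, (probability, classIndex) => ({ classIndex, probability }))
+    .sort((a, b) => b.probability - a.probability)
+    .slice(0, k);
+}
+```
+
+An entropy-based confidence could then be derived as `1 - normalizedEntropy(probs)`, though whether task 04 defines `entropyConfidence` this way is an assumption.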
+
+tests:
+
+- Integration: POST `/api/identify` with valid imageId returns real predictions (no `demo_mode: true`)
+- Integration: response includes `predictions` array with valid diseaseIds from KB
+- Integration: confidence scores are calibrated (not raw softmax)
+- Integration: healthy predictions return `healthy: true` with plantId
+- Unit: `enrichPredictions()` skips healthy classes in top-K
+- Unit: `isProbabilities()` correctly identifies probability output
+- Unit: `runInference()` throws error if model not loaded
+- E2E: upload a tomato leaf image → get tomato disease predictions
+- E2E: upload a healthy plant image → get healthy result
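+
+The `isProbabilities()` unit test could be driven by a small table of cases. The sketch below is framework-free so it runs on its own; in the project it would presumably live in `inference.test.ts` under Vitest. It uses a local copy of the function with a range check in addition to the sum check, which is an assumption about the intended behavior.
+
+```typescript
+// Local copy for the sketch: values must be finite, lie in [0, 1], and sum to ~1.
+function isProbabilities(output: Float32Array, tolerance = 0.01): boolean {
+  let sum = 0;
+  for (let i = 0; i < output.length; i++) {
+    const v = output[i];
+    if (!Number.isFinite(v) || v < 0 || v > 1) return false;
+    sum += v;
+  }
+  return Math.abs(sum - 1) < tolerance;
+}
+
+const cases: Array<{ name: string; input: number[]; expected: boolean }> = [
+  { name: "softmax output", input: [0.7, 0.2, 0.1], expected: true },
+  { name: "logits that happen to sum to 1", input: [3, -2.5, 0.5], expected: false },
+  { name: "unnormalized scores", input: [0.9, 0.8, 0.7], expected: false },
+  { name: "empty output", input: [], expected: false },
+];
+
+// Collect failures instead of stopping at the first one.
+const failures: string[] = [];
+for (const c of cases) {
+  const actual = isProbabilities(Float32Array.from(c.input));
+  if (actual !== c.expected) failures.push(`${c.name}: expected ${c.expected}, got ${actual}`);
+}
+if (failures.length > 0) throw new Error(failures.join("; "));
+```
+
+The second case is the one a sum-only check misses: raw logits can add up to 1 by coincidence.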
+
+acceptance_criteria:
+
+- `/api/identify` returns real model predictions (not mock)
+- All diseaseIds in response are valid KB entries (verifiable via `getDiseaseById()`)
+- Confidence scores use temperature-scaled calibration (not raw softmax)
+- Healthy predictions return `{ healthy: true, plantId, confidence }` instead of disease predictions
+- Demo mode is completely removed from production path
+- Error handling: model not loaded → 503 response with clear message
+- Structured logging for every inference request
+- Client-side API handles new response shape (healthy vs predictions)
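+
+Because `IdentifyResponse` expresses the healthy and disease cases through optional fields, client code may find it easier to collapse the response into one tagged value before rendering. The following is a minimal sketch: `toIdentifyView` and `IdentifyView` are hypothetical names, and the three supporting interfaces are trimmed stand-ins for the project's real types.
+
+```typescript
+// Trimmed stand-ins for the project's types (assumed shapes).
+interface ConfidenceResult { adjusted: number }
+interface PredictionResult { diseaseId: string; confidence: ConfidenceResult }
+interface InferenceMetadata { model: string; inferenceTimeMs: number; imageId: string }
+
+interface IdentifyResponse {
+  predictions?: PredictionResult[];
+  healthy?: boolean;
+  plantId?: string;
+  confidence?: ConfidenceResult;
+  metadata: InferenceMetadata;
+}
+
+// Hypothetical tagged union consumed by the UI.
+type IdentifyView =
+  | { kind: "healthy"; plantId?: string; confidence?: ConfidenceResult }
+  | { kind: "predictions"; predictions: PredictionResult[] }
+  | { kind: "empty" };
+
+// Hypothetical helper: decide once which of the three states the response represents.
+export function toIdentifyView(response: IdentifyResponse): IdentifyView {
+  if (response.healthy === true) {
+    return { kind: "healthy", plantId: response.plantId, confidence: response.confidence };
+  }
+  const predictions = response.predictions ?? [];
+  return predictions.length > 0 ? { kind: "predictions", predictions } : { kind: "empty" };
+}
+```
+
+The `empty` case covers a response where every top-K entry was skipped by `enrichPredictions()`, for example when only healthy classes ranked highly but the healthy threshold was not met.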
+
+validation:
+
+- `npx vitest run src/app/api/identify/identify.test.ts`
+- `npx vitest run src/lib/ml/inference.test.ts`
+- `curl -X POST http://localhost:3000/api/identify -H "Content-Type: application/json" -d '{"imageId":"<test-id>"}'` — response has real predictions
+- Upload a test image via UI → see real disease names (not demo mode)
+- Check server logs: `event: "inference"` with valid modelId and inferenceTimeMs
+
+notes:
+
+- This task depends on tasks 02, 03, and 04 being complete. Do not start until all dependencies are met.
+- The `enrichPredictions()` function now skips healthy classes — they're handled by the healthy result path
+- If the model is not loaded, return 503 (Service Unavailable) instead of falling back to mock
+- Structured logging should be JSON for easy parsing by log aggregators
+- The `demo_mode` field can be removed entirely or kept as `false` for backwards compatibility
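+
+The logging snippet in step 10 reads from `predictions[0]`, which does not exist on the healthy path. One way to log both paths with the same shape is a small formatter, sketched below. `formatInferenceLog`, `InferenceLogFields`, and the `healthy` and `timestamp` fields are hypothetical additions, not part of the task spec.
+
+```typescript
+// Hypothetical field set; optional fields are simply omitted from the JSON when undefined.
+interface InferenceLogFields {
+  imageId: string;
+  modelId: string;
+  inferenceTimeMs: number;
+  healthy: boolean;
+  topPrediction?: string;
+  confidence?: number;
+  entropy?: number;
+}
+
+// Build one JSON line per inference request. `now` is injectable so tests are deterministic.
+export function formatInferenceLog(fields: InferenceLogFields, now: Date = new Date()): string {
+  return JSON.stringify({ event: "inference", timestamp: now.toISOString(), ...fields });
+}
+
+// Usage on the healthy path: no topPrediction, so that key is absent from the output.
+const line = formatInferenceLog(
+  { imageId: "img-1", modelId: "example-model", inferenceTimeMs: 42, healthy: true, confidence: 0.93 },
+  new Date(0),
+);
+console.log(line);
+```
+
+Keeping each request on a single JSON line matches the note above about log aggregators, which generally parse one record per line.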