Production ML Pipeline

Objective: Bring the plant disease identification ML pipeline to production readiness, with real model inference, correct class mapping, and production-grade error handling.

Status legend: [ ] todo, [~] in-progress, [x] done

Tasks

01 — PlantVillage class inventory and knowledge base mapping → 01-plantvillage-class-inventory.md
02 — Label mapping layer implementation → 02-label-mapping-implementation.md
03 — TensorFlow.js model loading verification and fixes → 03-model-loading-verification.md
04 — Confidence calibration for PlantVillage model → 04-confidence-calibration.md
05 — Real model integration into identification pipeline → 05-pipeline-integration.md
06 — Plant-context-aware identification → 06-plant-context-identification.md
07 — End-to-end integration testing → 07-end-to-end-testing.md
08 — Production hardening and observability → 08-production-hardening.md

Dependencies

01 → 02 (mapping data feeds label layer)
02 → 05 (labels feed pipeline)
03 → 05 (verified model loading feeds pipeline)
04 → 05 (calibration feeds pipeline)
05 → 06 (real model enables plant context)
05 → 07 (integrated pipeline enables e2e testing)
07 → 08 (tested pipeline enables production hardening)
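One valid execution order can be derived from these edges with a topological sort. The sketch below is illustrative only: `executionOrder` is a hypothetical helper, not part of the project code, and it breaks ties by task number.

```typescript
// Dependency edges copied from the list above: [prerequisite, dependent].
const edges: Array<[string, string]> = [
  ["01", "02"],
  ["02", "05"],
  ["03", "05"],
  ["04", "05"],
  ["05", "06"],
  ["05", "07"],
  ["07", "08"],
];

// Kahn's algorithm: repeatedly emit tasks whose prerequisites are all done.
// Hypothetical helper for illustration.
function executionOrder(deps: Array<[string, string]>): string[] {
  const unmet: Record<string, number> = {}; // task -> unmet prerequisite count
  const dependents: Record<string, string[]> = {}; // task -> tasks it unblocks
  for (const [from, to] of deps) {
    unmet[from] = unmet[from] ?? 0;
    unmet[to] = (unmet[to] ?? 0) + 1;
    (dependents[from] = dependents[from] ?? []).push(to);
  }

  const total = Object.keys(unmet).length;
  const ready = Object.keys(unmet)
    .filter((task) => unmet[task] === 0)
    .sort();
  const order: string[] = [];

  while (ready.length > 0) {
    const task = ready.shift() as string;
    order.push(task);
    for (const next of dependents[task] ?? []) {
      unmet[next] -= 1;
      if (unmet[next] === 0) ready.push(next);
    }
    ready.sort(); // lowest task number first among the unblocked tasks
  }

  if (order.length !== total) {
    throw new Error("dependency cycle detected");
  }
  return order;
}

console.log(executionOrder(edges).join(" → "));
// prints "01 → 02 → 03 → 04 → 05 → 06 → 07 → 08"
```

Tasks 01, 03, and 04 have no prerequisites, so they can start in parallel. Task 06 depends only on 05 and does not block 07 or 08.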

Exit Criteria

The feature is complete when:
- Model loads successfully and produces real (non-mock) predictions
- All 38 PlantVillage classes map to valid knowledge base disease IDs
- End-to-end pipeline works: upload image → get real disease diagnoses with calibrated confidence
- Confidence scores are calibrated (high for clear cases, low for ambiguous ones)
- Plant context optionally boosts relevant predictions
- Full integration test suite passes
- Error handling, logging, and monitoring are in place
- Demo mode fallback is disabled in production
- Rate limiting and input sanitization are active
- Health endpoint reports model status and inference metrics
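
The class-mapping criterion above can be checked mechanically. The following is a minimal sketch under assumed data shapes: `LabelMap`, `MappingReport`, and `checkLabelCoverage` are hypothetical names, and the real label map and knowledge base types may differ.

```typescript
// Hypothetical shape: model class label -> knowledge base disease ID.
type LabelMap = Record<string, string>;

interface MappingReport {
  unmapped: string[]; // class labels with no entry in the label map
  dangling: string[]; // class labels mapped to an ID missing from the knowledge base
}

// Reports every model class that would fail the mapping criterion.
function checkLabelCoverage(
  classLabels: string[],
  labelMap: LabelMap,
  knownDiseaseIds: string[],
): MappingReport {
  const report: MappingReport = { unmapped: [], dangling: [] };
  for (const label of classLabels) {
    if (!Object.prototype.hasOwnProperty.call(labelMap, label)) {
      report.unmapped.push(label);
    } else if (knownDiseaseIds.indexOf(labelMap[label]) === -1) {
      report.dangling.push(label);
    }
  }
  return report;
}

// Illustrative data only; the knowledge base IDs here are made up.
const sampleLabels = ["Apple___Apple_scab", "Tomato___Late_blight", "Potato___Early_blight"];
const sampleMap: LabelMap = {
  "Apple___Apple_scab": "apple-scab",
  "Tomato___Late_blight": "late-blight-typo",
};
const sampleReport = checkLabelCoverage(sampleLabels, sampleMap, ["apple-scab", "late-blight"]);
console.log(sampleReport);
```

In this sample, `Potato___Early_blight` is reported as unmapped and `Tomato___Late_blight` as dangling. The criterion would be met when the model's full class list has 38 entries and both arrays come back empty.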