# Production ML Pipeline

Objective: Bring the plant disease identification ML pipeline to production readiness, with real model inference, a complete mapping from model classes to knowledge base entries, and production-grade error handling.

Status legend: [ ] todo, [~] in-progress, [x] done

## Tasks

- [ ] 01 — PlantVillage class inventory and knowledge base mapping → `01-plantvillage-class-inventory.md`
- [ ] 02 — Label mapping layer implementation → `02-label-mapping-implementation.md`
- [ ] 03 — TensorFlow.js model loading verification and fixes → `03-model-loading-verification.md`
- [ ] 04 — Confidence calibration for PlantVillage model → `04-confidence-calibration.md`
- [ ] 05 — Real model integration into identification pipeline → `05-pipeline-integration.md`
- [ ] 06 — Plant-context-aware identification → `06-plant-context-identification.md`
- [ ] 07 — End-to-end integration testing → `07-end-to-end-testing.md`
- [ ] 08 — Production hardening and observability → `08-production-hardening.md`
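
Tasks 01 and 02 amount to a coverage check between the model's class labels and the knowledge base. The sketch below shows one way such a check could look. The function name `checkLabelMapping`, the `MappingReport` shape, and the representation of the mapping as a `Map` from label to disease ID are all assumptions for illustration, since this plan does not specify the actual schema.

```typescript
// Hypothetical report shape; the real mapping layer may track more detail.
interface MappingReport {
  unmapped: string[];   // model labels with no entry in the mapping
  invalidIds: string[]; // model labels mapped to an ID missing from the knowledge base
  ok: boolean;          // true only when every label maps to a known ID
}

// Verifies that every model class label resolves to a knowledge base disease ID.
function checkLabelMapping(
  modelLabels: string[],
  labelMap: Map<string, string>,
  knowledgeBaseIds: Set<string>,
): MappingReport {
  const unmapped: string[] = [];
  const invalidIds: string[] = [];
  for (const label of modelLabels) {
    const id = labelMap.get(label);
    if (id === undefined) {
      unmapped.push(label);
    } else if (!knowledgeBaseIds.has(id)) {
      invalidIds.push(label);
    }
  }
  return {
    unmapped,
    invalidIds,
    ok: unmapped.length === 0 && invalidIds.length === 0,
  };
}
```

Running a check like this in CI against the full label list would turn the "all classes map to valid IDs" requirement into a failing build rather than a runtime surprise.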

## Dependencies

- 01 → 02 (mapping data feeds label layer)
- 02 → 05 (labels feed pipeline)
- 03 → 05 (verified model loading feeds pipeline)
- 04 → 05 (calibration feeds pipeline)
- 05 → 06 (real model enables plant context)
- 05 → 07 (integrated pipeline enables e2e testing)
- 07 → 08 (tested pipeline enables production hardening)
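
The dependency list above forms a directed acyclic graph, so a valid execution order can be derived mechanically. The following is a small sketch using Kahn's algorithm; the `executionOrder` helper is hypothetical, and the edges are copied from the list above.

```typescript
// Edges as [prerequisite, dependent], taken from the Dependencies list.
const taskEdges: Array<[string, string]> = [
  ["01", "02"], ["02", "05"], ["03", "05"], ["04", "05"],
  ["05", "06"], ["05", "07"], ["07", "08"],
];

// Returns the tasks in an order that respects every dependency (Kahn's algorithm).
function executionOrder(deps: Array<[string, string]>): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const [from, to] of deps) {
    indegree.set(from, indegree.get(from) ?? 0);
    indegree.set(to, (indegree.get(to) ?? 0) + 1);
    dependents.set(from, [...(dependents.get(from) ?? []), to]);
  }
  // Start with tasks that have no prerequisites; sort for a deterministic result.
  const ready = Array.from(indegree.entries())
    .filter(([, count]) => count === 0)
    .map(([task]) => task)
    .sort();
  const order: string[] = [];
  while (ready.length > 0) {
    const task = ready.shift()!;
    order.push(task);
    for (const next of dependents.get(task) ?? []) {
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) {
        ready.push(next);
        ready.sort();
      }
    }
  }
  if (order.length !== indegree.size) {
    throw new Error("dependency cycle detected");
  }
  return order;
}
```

Tasks 01, 03, and 04 have no prerequisites and can start in parallel; task 05 is the join point that everything downstream waits on.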

## Exit Criteria

- The feature is complete when:
  - Model loads successfully and produces real (non-mock) predictions
  - All 38 PlantVillage classes map to valid knowledge base disease IDs
  - End-to-end pipeline works: upload image → get real disease diagnoses with calibrated confidence
  - Confidence scores are meaningful: high for clear cases, low for ambiguous ones
  - Plant context optionally boosts relevant predictions
  - Full integration test suite passes
  - Error handling, logging, and monitoring are in place
  - No demo mode fallback in production
  - Rate limiting and input sanitization are active
  - Health endpoint reports model status and inference metrics
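
To make the calibration and plant-context criteria above more concrete, here is a rough sketch of how the two could combine. It rests on several assumptions this plan does not confirm: that the model exposes raw logits, that temperature scaling is the chosen calibration method (it is only one common option), and that class labels begin with the plant name. The function names and the default `boost` factor are hypothetical.

```typescript
// Temperature-scaled softmax: temperature > 1 softens the distribution,
// temperature < 1 sharpens it. The temperature would be fitted on held-out data.
function calibratedSoftmax(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be positive");
  }
  const scaled = logits.map((z) => z / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Optional plant context: up-weights classes whose label starts with the given
// plant name, then renormalizes so the result still sums to 1.
function applyPlantContext(
  probs: number[],
  labels: string[],
  plant?: string,
  boost = 1.5, // assumed value, not taken from the plan
): number[] {
  if (probs.length !== labels.length) {
    throw new Error("probs and labels must have the same length");
  }
  if (!plant) {
    return probs; // context is optional; without it, predictions are unchanged
  }
  const prefix = plant.toLowerCase();
  const weighted = probs.map((p, i) =>
    labels[i].toLowerCase().startsWith(prefix) ? p * boost : p,
  );
  const total = weighted.reduce((a, b) => a + b, 0);
  return weighted.map((w) => w / total);
}
```

Renormalizing after the boost keeps the scores interpretable as probabilities, which matters if downstream thresholds depend on calibrated confidence.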