# 01. PlantVillage Class Inventory and Knowledge Base Mapping

meta:
id: production-ml-pipeline-01
feature: production-ml-pipeline
priority: P0
depends_on: []
tags: [data, mapping, research]

objective:

- Document all 38 PlantVillage model output classes
- Map each class index to a definitive disease ID in the knowledge base
- Identify which plants and diseases are missing from the KB and must be added
- Produce a complete, authoritative mapping file that subsequent tasks consume

deliverables:

- `src/lib/ml/plantvillage-classes.ts`: definitive mapping of all 38 class indices to structured metadata
- Updated `tasks/production-ml-pipeline/class-mapping-reference.md`: human-readable reference document

steps:

1. Document the canonical 38 PlantVillage class labels in order (index 0–37):

   ```
   0: Apple___Apple_scab
   1: Apple___Black_rot
   2: Apple___Cedar_apple_rust
   3: Apple___healthy
   4: Blueberry___healthy
   5: Cherry_(including_sour)___Powdery_mildew
   6: Cherry_(including_sour)___healthy
   7: Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot
   8: Corn_(maize)___Common_rust_
   9: Corn_(maize)___Northern_Leaf_Blight
   10: Corn_(maize)___healthy
   11: Grape___Black_rot
   12: Grape___Esca_(Black_Measles)
   13: Grape___Leaf_blight_(Isariopsis_Leaf_Spot)
   14: Grape___healthy
   15: Orange___Haunglongbing_(Citrus_greening)
   16: Peach___Bacterial_spot
   17: Peach___healthy
   18: Pepper,_bell___Bacterial_spot
   19: Pepper,_bell___healthy
   20: Potato___Early_blight
   21: Potato___Late_blight
   22: Potato___healthy
   23: Raspberry___healthy
   24: Soybean___healthy
   25: Squash___Powdery_mildew
   26: Strawberry___Leaf_scorch
   27: Strawberry___healthy
   28: Tomato___Bacterial_spot
   29: Tomato___Early_blight
   30: Tomato___Late_blight
   31: Tomato___Leaf_Mold
   32: Tomato___Septoria_leaf_spot
   33: Tomato___Spider_mites Two-spotted_spider_mite
   34: Tomato___Target_Spot
   35: Tomato___Tomato_Yellow_Leaf_Curl_Virus
   36: Tomato___Tomato_mosaic_virus
   37: Tomato___healthy
   ```

2. For each class, determine the mapping target:
   - **Healthy classes** (12 total: indices 3, 4, 6, 10, 14, 17, 19, 22, 23, 24, 27, 37): map to a special `"healthy"` sentinel, stored as `diseaseId: null` with `isHealthy: true`. These indicate the model detected no disease.
   - **Disease classes with an existing KB match** (7 total, exact or closest available): map directly to the existing disease ID.
     - 28 → `bacterial-leaf-spot-tomato` (Tomato Bacterial_spot ≈ bacterial-leaf-spot-tomato)
     - 29 → `early-blight`
     - 30 → `late-blight`
     - 32 → `septoria-leaf-spot`
     - 25 → `squash-powdery-mildew`
     - 26 → `strawberry-leaf-scorch`
     - 18 → `pepper-bacterial-wilt` (closest match to Pepper Bacterial_spot)
   - **Disease classes needing new KB entries** (19 total, no existing disease in our KB):
     - 0: Apple_scab → new disease `apple-scab` under plant `apple`
     - 1: Apple_black_rot → new disease `apple-black-rot` under plant `apple`
     - 2: Apple_cedar_apple_rust → new disease `apple-cedar-apple-rust` under plant `apple`
     - 5: Cherry_powdery_mildew → new disease `cherry-powdery-mildew` under plant `cherry`
     - 7: Corn_cercospora_leaf_spot → new disease `corn-gray-leaf-spot` under plant `corn`
     - 8: Corn_common_rust → new disease `corn-common-rust` under plant `corn`
     - 9: Corn_northern_leaf_blight → new disease `corn-northern-leaf-blight` under plant `corn`
     - 11: Grape_black_rot → new disease `grape-black-rot` under plant `grape`
     - 12: Grape_esca → new disease `grape-esca` under plant `grape`
     - 13: Grape_leaf_blight → new disease `grape-leaf-blight` under plant `grape`
     - 15: Orange_huanglongbing → new disease `orange-citrus-greening` under plant `orange`
     - 16: Peach_bacterial_spot → new disease `peach-bacterial-spot` under plant `peach`
     - 20: Potato_early_blight → new disease `potato-early-blight` under plant `potato`
     - 21: Potato_late_blight → new disease `potato-late-blight` under plant `potato`
     - 31: Tomato_leaf_mold → new disease `tomato-leaf-mold` under plant `tomato`
     - 33: Tomato_spider_mites → new disease `tomato-spider-mites` under plant `tomato`
     - 34: Tomato_target_spot → new disease `tomato-target-spot` under plant `tomato`
     - 35: Tomato_yellow_leaf_curl_virus → new disease `tomato-yellow-leaf-curl-virus` under plant `tomato`
     - 36: Tomato_mosaic_virus → new disease `tomato-mosaic-virus` under plant `tomato`

3. Create the mapping type and data structure in `src/lib/ml/plantvillage-classes.ts`:

   ```typescript
   export interface PlantVillageClass {
     index: number;
     rawLabel: string;
     plantId: string; // KB plant slug
     diseaseId: string | null; // null for healthy classes
     isHealthy: boolean;
     displayName: string; // human-readable disease name
   }

   export const PLANTVILLAGE_CLASSES: readonly PlantVillageClass[] = [
     // ... one entry per class index, 0–37
   ];
   ```

4. For each class, also record:
   - The PlantVillage plant name (e.g., "Tomato", "Apple")
   - The target KB plantId (e.g., "tomato", "apple")
   - The target KB diseaseId (e.g., "early-blight") or null for healthy
   - Whether the disease needs to be added to the KB (boolean flag for task 02)

5. Verify the mapping covers all 38 indices with no gaps or duplicates.

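The inventory in step 1 and the coverage check in step 5 can be sketched as follows. This is a minimal illustration, not the final contents of `plantvillage-classes.ts`: the names `RAW_LABELS`, `ParsedLabel`, and `parseLabel` are hypothetical, and the sketch assumes every raw label contains exactly one `___` separator and that healthy classes are exactly those whose suffix is `healthy`.

```typescript
// Raw labels copied from step 1, in model output order. Do NOT reorder.
const RAW_LABELS: readonly string[] = [
  "Apple___Apple_scab",
  "Apple___Black_rot",
  "Apple___Cedar_apple_rust",
  "Apple___healthy",
  "Blueberry___healthy",
  "Cherry_(including_sour)___Powdery_mildew",
  "Cherry_(including_sour)___healthy",
  "Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot",
  "Corn_(maize)___Common_rust_",
  "Corn_(maize)___Northern_Leaf_Blight",
  "Corn_(maize)___healthy",
  "Grape___Black_rot",
  "Grape___Esca_(Black_Measles)",
  "Grape___Leaf_blight_(Isariopsis_Leaf_Spot)",
  "Grape___healthy",
  "Orange___Haunglongbing_(Citrus_greening)",
  "Peach___Bacterial_spot",
  "Peach___healthy",
  "Pepper,_bell___Bacterial_spot",
  "Pepper,_bell___healthy",
  "Potato___Early_blight",
  "Potato___Late_blight",
  "Potato___healthy",
  "Raspberry___healthy",
  "Soybean___healthy",
  "Squash___Powdery_mildew",
  "Strawberry___Leaf_scorch",
  "Strawberry___healthy",
  "Tomato___Bacterial_spot",
  "Tomato___Early_blight",
  "Tomato___Late_blight",
  "Tomato___Leaf_Mold",
  "Tomato___Septoria_leaf_spot",
  "Tomato___Spider_mites Two-spotted_spider_mite",
  "Tomato___Target_Spot",
  "Tomato___Tomato_Yellow_Leaf_Curl_Virus",
  "Tomato___Tomato_mosaic_virus",
  "Tomato___healthy",
];

// Hypothetical intermediate shape, before KB plantId/diseaseId are assigned.
interface ParsedLabel {
  index: number;
  rawLabel: string;
  plantName: string; // text before "___"
  condition: string; // text after "___"
  isHealthy: boolean;
}

// Split a raw label on the triple-underscore separator.
function parseLabel(rawLabel: string, index: number): ParsedLabel {
  const sep = rawLabel.indexOf("___");
  if (sep < 0) {
    throw new Error(`Unexpected label format at index ${index}: ${rawLabel}`);
  }
  const plantName = rawLabel.slice(0, sep);
  const condition = rawLabel.slice(sep + 3);
  return { index, rawLabel, plantName, condition, isHealthy: condition === "healthy" };
}

const parsed = RAW_LABELS.map(parseLabel);
const healthyIndices = parsed.filter((p) => p.isHealthy).map((p) => p.index);

console.log(`classes: ${parsed.length}, healthy: ${healthyIndices.length}`);
console.log(`healthy indices: ${healthyIndices.join(", ")}`);
```

Run against the labels from step 1, this reports 38 classes, of which 12 are healthy (indices 3, 4, 6, 10, 14, 17, 19, 22, 23, 24, 27, 37), leaving 26 disease classes.
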
tests:

- Unit: mapping has exactly 38 entries
- Unit: indices 0–37 are all present, no gaps
- Unit: each non-healthy entry has a non-null diseaseId
- Unit: each healthy entry has null diseaseId and isHealthy=true
- Unit: no duplicate diseaseIds across non-healthy entries
- Unit: all plantIds are valid slugs (lowercase, kebab-case)

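The checks listed above could be expressed as a single validation helper that the Vitest suite then asserts against. The sketch below is one possible shape, not the actual test file: `validateClasses`, `SLUG_PATTERN`, and the `expectedCount` parameter are hypothetical names, the interface is repeated only to keep the example self-contained, and the two sample entries (including their `displayName` values) are illustrative.

```typescript
interface PlantVillageClass {
  index: number;
  rawLabel: string;
  plantId: string;
  diseaseId: string | null;
  isHealthy: boolean;
  displayName: string;
}

// Assumed slug rule: lowercase alphanumeric words joined by single hyphens.
const SLUG_PATTERN = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Returns a list of human-readable problems; an empty list means the mapping passes.
function validateClasses(
  classes: readonly PlantVillageClass[],
  expectedCount: number = 38,
): string[] {
  const problems: string[] = [];
  if (classes.length !== expectedCount) {
    problems.push(`expected ${expectedCount} entries, got ${classes.length}`);
  }

  const seenIndices = new Set<number>();
  const seenDiseaseIds = new Set<string>();
  for (const c of classes) {
    if (seenIndices.has(c.index)) problems.push(`duplicate index ${c.index}`);
    seenIndices.add(c.index);

    if (!SLUG_PATTERN.test(c.plantId)) {
      problems.push(`index ${c.index}: plantId "${c.plantId}" is not a valid slug`);
    }

    if (c.isHealthy) {
      if (c.diseaseId !== null) {
        problems.push(`index ${c.index}: healthy entry must have null diseaseId`);
      }
    } else if (c.diseaseId === null) {
      problems.push(`index ${c.index}: disease entry must have a diseaseId`);
    } else {
      if (seenDiseaseIds.has(c.diseaseId)) {
        problems.push(`duplicate diseaseId "${c.diseaseId}"`);
      }
      seenDiseaseIds.add(c.diseaseId);
    }
  }

  // Every index in 0..expectedCount-1 must appear.
  for (let i = 0; i < expectedCount; i++) {
    if (!seenIndices.has(i)) problems.push(`missing index ${i}`);
  }
  return problems;
}

// Illustrative two-entry sample; the real array has 38 entries.
const sample: PlantVillageClass[] = [
  { index: 0, rawLabel: "Apple___Apple_scab", plantId: "apple", diseaseId: "apple-scab", isHealthy: false, displayName: "Apple scab" },
  { index: 1, rawLabel: "Apple___Black_rot", plantId: "apple", diseaseId: "apple-black-rot", isHealthy: false, displayName: "Apple black rot" },
];
console.log(validateClasses(sample, 2)); // prints []
```

Returning a list of problems rather than throwing on the first failure means a single test run can report every broken entry at once, which is useful when the mapping is edited by hand.
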
acceptance_criteria:

- `src/lib/ml/plantvillage-classes.ts` exports `PLANTVILLAGE_CLASSES` array with exactly 38 entries
- Every index 0–37 maps to exactly one entry
- 12 entries are healthy (isHealthy=true, diseaseId=null)
- 26 entries are diseases with valid plantId and diseaseId
- Each entry includes rawLabel, plantId, diseaseId, displayName
- All new disease IDs follow kebab-case convention matching existing KB pattern
- Reference document `class-mapping-reference.md` lists all 38 classes with their KB mappings

validation:

- `npx vitest run src/lib/ml/plantvillage-classes.test.ts`: all mapping tests pass
- Manual review: each of the 26 disease entries maps to a plausible disease in our KB

notes:

- This task produces the authoritative mapping consumed by task 02 (KB expansion) and task 03 (label mapping)
- The PlantVillage class order is fixed by the model's training; do NOT reorder
- "Tomato Bacterial_spot" maps to our existing `bacterial-leaf-spot-tomato`; this is the closest match, not a perfect one
- "Pepper Bacterial_spot" maps to `pepper-bacterial-wilt`; imperfect, but the closest available match
- 10 new plants must be added to the KB: apple, blueberry, cherry, corn, grape, orange, peach, potato, raspberry, soybean
- Blueberry, Raspberry, and Soybean have only a "healthy" class; they still need plant entries for context but no new disease entries
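
The list of new plants can be cross-checked by normalizing the PlantVillage plant names to KB slugs and filtering out plants the KB already has. The sketch below rests on two assumptions that are not stated explicitly in this task: that the KB slug is the lowercased leading word of the PlantVillage name (so `Pepper,_bell` becomes `pepper`), and that the KB currently contains exactly pepper, squash, strawberry, and tomato. The names `PLANTVILLAGE_PLANTS`, `EXISTING_KB_PLANTS`, and `toPlantSlug` are hypothetical.

```typescript
// Distinct plant names as they appear before "___" in the raw labels.
const PLANTVILLAGE_PLANTS: readonly string[] = [
  "Apple",
  "Blueberry",
  "Cherry_(including_sour)",
  "Corn_(maize)",
  "Grape",
  "Orange",
  "Peach",
  "Pepper,_bell",
  "Potato",
  "Raspberry",
  "Soybean",
  "Squash",
  "Strawberry",
  "Tomato",
];

// Assumption: plants already present in the KB (inferred, not confirmed here).
const EXISTING_KB_PLANTS = new Set<string>(["pepper", "squash", "strawberry", "tomato"]);

// Assumed slug rule: keep the leading alphabetic word and lowercase it,
// dropping qualifiers such as "_(maize)" or ",_bell".
function toPlantSlug(plantName: string): string {
  const match = /^[A-Za-z]+/.exec(plantName);
  if (match === null) {
    throw new Error(`Cannot derive a slug from "${plantName}"`);
  }
  return match[0].toLowerCase();
}

const newPlants = PLANTVILLAGE_PLANTS.map(toPlantSlug).filter(
  (slug) => !EXISTING_KB_PLANTS.has(slug),
);

console.log(newPlants.join(", "));
// prints: apple, blueberry, cherry, corn, grape, orange, peach, potato, raspberry, soybean
```

If the KB turns out to use different slugs (for example `bell-pepper`), only `toPlantSlug` and `EXISTING_KB_PLANTS` need to change.
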