beepboop
@@ -0,0 +1,152 @@
# 01. PlantVillage Class Inventory and Knowledge Base Mapping

meta:
  id: production-ml-pipeline-01
  feature: production-ml-pipeline
  priority: P0
  depends_on: []
  tags: [data, mapping, research]

objective:

- Document all 38 PlantVillage model output classes
- Map each class index to a definitive disease ID in the knowledge base
- Identify which plants and diseases are missing from the KB and must be added
- Produce a complete, authoritative mapping file that subsequent tasks consume

deliverables:

- `src/lib/ml/plantvillage-classes.ts`: definitive mapping of all 38 class indices to structured metadata
- Updated `tasks/production-ml-pipeline/class-mapping-reference.md`: human-readable reference document

steps:

1. Document the canonical 38 PlantVillage class labels in order (index 0–37):

   ```
   0: Apple___Apple_scab
   1: Apple___Black_rot
   2: Apple___Cedar_apple_rust
   3: Apple___healthy
   4: Blueberry___healthy
   5: Cherry_(including_sour)___Powdery_mildew
   6: Cherry_(including_sour)___healthy
   7: Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot
   8: Corn_(maize)___Common_rust_
   9: Corn_(maize)___Northern_Leaf_Blight
   10: Corn_(maize)___healthy
   11: Grape___Black_rot
   12: Grape___Esca_(Black_Measles)
   13: Grape___Leaf_blight_(Isariopsis_Leaf_Spot)
   14: Grape___healthy
   15: Orange___Haunglongbing_(Citrus_greening)
   16: Peach___Bacterial_spot
   17: Peach___healthy
   18: Pepper,_bell___Bacterial_spot
   19: Pepper,_bell___healthy
   20: Potato___Early_blight
   21: Potato___Late_blight
   22: Potato___healthy
   23: Raspberry___healthy
   24: Soybean___healthy
   25: Squash___Powdery_mildew
   26: Strawberry___Leaf_scorch
   27: Strawberry___healthy
   28: Tomato___Bacterial_spot
   29: Tomato___Early_blight
   30: Tomato___Late_blight
   31: Tomato___Leaf_Mold
   32: Tomato___Septoria_leaf_spot
   33: Tomato___Spider_mites Two-spotted_spider_mite
   34: Tomato___Target_Spot
   35: Tomato___Tomato_Yellow_Leaf_Curl_Virus
   36: Tomato___Tomato_mosaic_virus
   37: Tomato___healthy
   ```

2. For each class, determine the mapping target:
   - **Healthy classes** (12 total: indices 3, 4, 6, 10, 14, 17, 19, 22, 23, 24, 27, 37): map to a special `"healthy"` sentinel, represented in the mapping as `isHealthy: true` with a null `diseaseId`. These indicate the model detected no disease.
   - **Disease classes with an existing KB match** (exact or closest available): map directly to the existing disease ID.
     - 28 → `bacterial-leaf-spot-tomato` (closest match to Tomato Bacterial_spot)
     - 29 → `early-blight`
     - 30 → `late-blight`
     - 32 → `septoria-leaf-spot`
     - 25 → `squash-powdery-mildew`
     - 26 → `strawberry-leaf-scorch`
     - 18 → `pepper-bacterial-wilt` (closest match to Pepper Bacterial_spot)
   - **Disease classes needing new KB entries** (19 total; no existing disease in our KB):
     - 0: Apple_scab → new disease `apple-scab` under plant `apple`
     - 1: Apple_black_rot → new disease `apple-black-rot` under plant `apple`
     - 2: Apple_cedar_apple_rust → new disease `apple-cedar-apple-rust` under plant `apple`
     - 5: Cherry_powdery_mildew → new disease `cherry-powdery-mildew` under plant `cherry`
     - 7: Corn_cercospora_leaf_spot → new disease `corn-gray-leaf-spot` under plant `corn`
     - 8: Corn_common_rust → new disease `corn-common-rust` under plant `corn`
     - 9: Corn_northern_leaf_blight → new disease `corn-northern-leaf-blight` under plant `corn`
     - 11: Grape_black_rot → new disease `grape-black-rot` under plant `grape`
     - 12: Grape_esca → new disease `grape-esca` under plant `grape`
     - 13: Grape_leaf_blight → new disease `grape-leaf-blight` under plant `grape`
     - 15: Orange_huanglongbing → new disease `orange-citrus-greening` under plant `orange`
     - 16: Peach_bacterial_spot → new disease `peach-bacterial-spot` under plant `peach`
     - 20: Potato_early_blight → new disease `potato-early-blight` under plant `potato`
     - 21: Potato_late_blight → new disease `potato-late-blight` under plant `potato`
     - 31: Tomato_leaf_mold → new disease `tomato-leaf-mold` under plant `tomato`
     - 33: Tomato_spider_mites → new disease `tomato-spider-mites` under plant `tomato`
     - 34: Tomato_target_spot → new disease `tomato-target-spot` under plant `tomato`
     - 35: Tomato_yellow_leaf_curl_virus → new disease `tomato-yellow-leaf-curl-virus` under plant `tomato`
     - 36: Tomato_mosaic_virus → new disease `tomato-mosaic-virus` under plant `tomato`

3. Create the mapping type and data structure in `src/lib/ml/plantvillage-classes.ts`:

   ```typescript
   export interface PlantVillageClass {
     index: number; // model output index, 0–37
     rawLabel: string; // original PlantVillage label, unmodified
     plantId: string; // KB plant slug
     diseaseId: string | null; // null for healthy classes
     isHealthy: boolean;
     displayName: string; // human-readable disease name
   }

   export const PLANTVILLAGE_CLASSES: readonly PlantVillageClass[] = [
     /* ...all 38 entries, in index order... */
   ];
   ```

4. For each class, also record:
   - The PlantVillage plant name (e.g., "Tomato", "Apple")
   - The target KB plantId (e.g., "tomato", "apple")
   - The target KB diseaseId (e.g., "early-blight") or null for healthy
   - Whether the disease needs to be added to the KB (boolean flag for task 02)

5. Verify the mapping covers all 38 indices with no gaps or duplicates.

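The coverage check in step 5 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual test code: `checkIndexCoverage` and `CoverageReport` are hypothetical names introduced here.

```typescript
// Hypothetical helper (not part of the task spec): reports which class
// indices are missing, duplicated, or out of range for a mapping.
interface CoverageReport {
  missing: number[];
  duplicates: number[];
  outOfRange: number[];
}

function checkIndexCoverage(
  indices: readonly number[],
  expectedCount: number = 38, // PlantVillage has 38 output classes
): CoverageReport {
  const seen = new Set<number>();
  const duplicates: number[] = [];
  const outOfRange: number[] = [];

  for (const index of indices) {
    if (!Number.isInteger(index) || index < 0 || index >= expectedCount) {
      outOfRange.push(index);
    } else if (seen.has(index)) {
      duplicates.push(index);
    } else {
      seen.add(index);
    }
  }

  // Every index in [0, expectedCount) that never appeared is a gap.
  const missing: number[] = [];
  for (let i = 0; i < expectedCount; i++) {
    if (!seen.has(i)) missing.push(i);
  }

  return { missing, duplicates, outOfRange };
}
```

A mapping passes when all three arrays are empty, for example when called as `checkIndexCoverage(PLANTVILLAGE_CLASSES.map((c) => c.index))`.
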
tests:

- Unit: mapping has exactly 38 entries
- Unit: indices 0–37 are all present, no gaps
- Unit: each non-healthy entry has a non-null diseaseId
- Unit: each healthy entry has null diseaseId and isHealthy=true
- Unit: no duplicate diseaseIds across non-healthy entries
- Unit: all plantIds are valid slugs (lowercase, kebab-case)

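The per-entry checks listed above (healthy entries carry a null `diseaseId`, disease entries carry a unique non-null one, and plant IDs are kebab-case slugs) might be expressed as a single validator. The sketch below is an assumption about how that could look: `findEntryProblems` and the `KEBAB_CASE` pattern are hypothetical, and the interface is copied from step 3 only so the block is self-contained.

```typescript
// Copy of the PlantVillageClass shape from step 3, repeated for self-containment.
interface PlantVillageClass {
  index: number;
  rawLabel: string;
  plantId: string;
  diseaseId: string | null;
  isHealthy: boolean;
  displayName: string;
}

// Assumed slug rule: lowercase alphanumeric words joined by single hyphens.
// Accepts "tomato" and "early-blight"; rejects "Tomato", "early_blight", "a--b".
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Hypothetical helper: returns one message per violated rule, or [] if valid.
function findEntryProblems(entries: readonly PlantVillageClass[]): string[] {
  const problems: string[] = [];
  const seenDiseaseIds = new Set<string>();

  for (const entry of entries) {
    if (!KEBAB_CASE.test(entry.plantId)) {
      problems.push(`index ${entry.index}: plantId "${entry.plantId}" is not kebab-case`);
    }
    if (entry.isHealthy) {
      if (entry.diseaseId !== null) {
        problems.push(`index ${entry.index}: healthy entry must have a null diseaseId`);
      }
      continue;
    }
    if (entry.diseaseId === null) {
      problems.push(`index ${entry.index}: disease entry is missing a diseaseId`);
      continue;
    }
    if (seenDiseaseIds.has(entry.diseaseId)) {
      problems.push(`index ${entry.index}: duplicate diseaseId "${entry.diseaseId}"`);
    }
    seenDiseaseIds.add(entry.diseaseId);
  }
  return problems;
}
```

Returning messages rather than throwing on the first failure lets a single test run report every broken entry at once.
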
acceptance_criteria:

- `src/lib/ml/plantvillage-classes.ts` exports `PLANTVILLAGE_CLASSES` array with exactly 38 entries
- Every index 0–37 maps to exactly one entry
- 12 entries are healthy (isHealthy=true, diseaseId=null)
- 26 entries are diseases with valid plantId and diseaseId
- Each entry includes rawLabel, plantId, diseaseId, displayName
- All new disease IDs follow kebab-case convention matching existing KB pattern
- Reference document `class-mapping-reference.md` lists all 38 classes with their KB mappings

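To make the expected entry shape concrete, here is a small sketch with three sample entries and an index lookup. The IDs and raw labels come from the mapping in step 2; the `displayName` values, the `SAMPLE_CLASSES` constant, and the `findClassByIndex` helper are illustrative assumptions rather than the final file contents.

```typescript
// Copy of the PlantVillageClass shape from step 3, repeated for self-containment.
interface PlantVillageClass {
  index: number;
  rawLabel: string;
  plantId: string;
  diseaseId: string | null;
  isHealthy: boolean;
  displayName: string;
}

// Three sample entries only; the real array must hold all 38 in index order.
// displayName strings are placeholders chosen for this sketch.
const SAMPLE_CLASSES: readonly PlantVillageClass[] = [
  {
    index: 0,
    rawLabel: "Apple___Apple_scab",
    plantId: "apple",
    diseaseId: "apple-scab", // new KB entry
    isHealthy: false,
    displayName: "Apple Scab",
  },
  {
    index: 3,
    rawLabel: "Apple___healthy",
    plantId: "apple",
    diseaseId: null, // healthy classes carry no disease ID
    isHealthy: true,
    displayName: "Healthy",
  },
  {
    index: 29,
    rawLabel: "Tomato___Early_blight",
    plantId: "tomato",
    diseaseId: "early-blight", // existing KB entry
    isHealthy: false,
    displayName: "Early Blight",
  },
];

// Hypothetical lookup: searches by the stored index instead of array position,
// so it still works on a partial sample like the one above.
function findClassByIndex(
  classes: readonly PlantVillageClass[],
  index: number,
): PlantVillageClass | undefined {
  return classes.find((entry) => entry.index === index);
}
```

Once the full 38-entry array exists and the coverage check passes, direct positional access such as `PLANTVILLAGE_CLASSES[index]` would be equivalent.
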
validation:

- `npx vitest run src/lib/ml/plantvillage-classes.test.ts`: all mapping tests pass
- Manual review: each of the 26 disease entries maps to a plausible disease in our KB

notes:

- This task produces the authoritative mapping consumed by task 02 (KB expansion) and task 03 (label mapping)
- The PlantVillage class order is fixed by the model's training; do NOT reorder
- "Tomato Bacterial_spot" maps to our existing `bacterial-leaf-spot-tomato`; this is the closest match, not a perfect one
- "Pepper Bacterial_spot" maps to `pepper-bacterial-wilt`; imperfect, but the closest available match
- 10 new plants must be added to the KB: apple, blueberry, cherry, corn, grape, orange, peach, potato, raspberry, soybean
- Blueberry, Raspberry, and Soybean have only a "healthy" class; they still need plant entries for context but no new disease entries

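The plant-level notes above can be cross-checked against the raw labels from step 1 instead of being counted by hand. The following is a minimal sketch, assuming labels always use `___` to separate plant from condition; `RAW_LABELS` and `summarizeLabels` are hypothetical names introduced for this illustration.

```typescript
// The 38 raw labels from step 1, in model output order.
const RAW_LABELS: readonly string[] = [
  "Apple___Apple_scab",
  "Apple___Black_rot",
  "Apple___Cedar_apple_rust",
  "Apple___healthy",
  "Blueberry___healthy",
  "Cherry_(including_sour)___Powdery_mildew",
  "Cherry_(including_sour)___healthy",
  "Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot",
  "Corn_(maize)___Common_rust_",
  "Corn_(maize)___Northern_Leaf_Blight",
  "Corn_(maize)___healthy",
  "Grape___Black_rot",
  "Grape___Esca_(Black_Measles)",
  "Grape___Leaf_blight_(Isariopsis_Leaf_Spot)",
  "Grape___healthy",
  "Orange___Haunglongbing_(Citrus_greening)",
  "Peach___Bacterial_spot",
  "Peach___healthy",
  "Pepper,_bell___Bacterial_spot",
  "Pepper,_bell___healthy",
  "Potato___Early_blight",
  "Potato___Late_blight",
  "Potato___healthy",
  "Raspberry___healthy",
  "Soybean___healthy",
  "Squash___Powdery_mildew",
  "Strawberry___Leaf_scorch",
  "Strawberry___healthy",
  "Tomato___Bacterial_spot",
  "Tomato___Early_blight",
  "Tomato___Late_blight",
  "Tomato___Leaf_Mold",
  "Tomato___Septoria_leaf_spot",
  "Tomato___Spider_mites Two-spotted_spider_mite",
  "Tomato___Target_Spot",
  "Tomato___Tomato_Yellow_Leaf_Curl_Virus",
  "Tomato___Tomato_mosaic_virus",
  "Tomato___healthy",
];

// Hypothetical helper: groups conditions by plant and derives summary counts.
function summarizeLabels(labels: readonly string[]) {
  const separator = "___";
  const conditionsByPlant = new Map<string, string[]>();

  for (const label of labels) {
    const at = label.indexOf(separator);
    if (at === -1) throw new Error(`Label has no "${separator}" separator: ${label}`);
    const plant = label.slice(0, at);
    const condition = label.slice(at + separator.length);
    const conditions = conditionsByPlant.get(plant) ?? [];
    conditions.push(condition);
    conditionsByPlant.set(plant, conditions);
  }

  const plants: string[] = [];
  const healthyOnlyPlants: string[] = [];
  let healthyCount = 0;
  conditionsByPlant.forEach((conditions, plant) => {
    plants.push(plant);
    const healthy = conditions.filter((c) => c === "healthy").length;
    healthyCount += healthy;
    // A plant is "healthy-only" when every one of its classes is healthy.
    if (healthy === conditions.length) healthyOnlyPlants.push(plant);
  });

  return { plants, healthyOnlyPlants, healthyCount, diseaseCount: labels.length - healthyCount };
}
```

For the labels listed in step 1, `summarizeLabels(RAW_LABELS)` yields 14 distinct plants, 12 healthy labels, and 26 disease labels, with Blueberry, Raspberry, and Soybean as the healthy-only plants.
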