Hierarchical Model Architecture Upgrade
Scale: 1.47M images across 11,499 disease-plant classes
Goal: Replace flat MobileNetV2 (38-class PlantVillage) with hierarchical Swin-Tiny (species → disease)
Deployment: Hybrid, with a lightweight browser model (TF.js) plus the full server model (ONNX Runtime)
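To make the species → disease hierarchy concrete, here is a minimal sketch of two-stage prediction in plain Python. It assumes the model has already produced one logit vector for species and one logit vector per species for diseases. The function and argument names (`predict_hierarchical`, `disease_logits_by_species`, and so on) are hypothetical and not taken from the project code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_hierarchical(species_logits, disease_logits_by_species,
                         species_names, disease_names, top_k=3):
    """Pick the most likely species, then rank diseases for that species only.

    All argument names are illustrative assumptions:
      species_logits            list of floats, one per species
      disease_logits_by_species dict: species name -> list of disease logits
      species_names             list of species names, aligned with species_logits
      disease_names             dict: species name -> list of disease names
    """
    species_probs = softmax(species_logits)
    best = max(range(len(species_probs)), key=species_probs.__getitem__)
    species = species_names[best]

    # Stage 2: only the head belonging to the predicted species is consulted.
    disease_probs = softmax(disease_logits_by_species[species])
    ranked = sorted(zip(disease_names[species], disease_probs),
                    key=lambda pair: pair[1], reverse=True)
    return species, species_probs[best], ranked[:top_k]
```

One consequence of this design is that a wrong species prediction cannot be recovered in stage 2, which is why the species classifier carries the stricter accuracy target.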
Hardware
| Machine | Role | Specs |
|---|---|---|
| Strix Halo | Primary training + inference | AMD Ryzen AI Max+ 395 (ROCm), 128GB unified memory |
| RTX 3090 | Secondary training / CUDA path | 24GB VRAM |
| M3 Pro | Development only (work machine) | — |
Key advantage: Strix Halo's 128GB unified memory allows loading the entire 1.47M-image dataset into RAM and training with very large effective batch sizes, because the GPU draws from the same 128GB pool instead of a separate, smaller VRAM budget.
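Whether the full dataset fits in memory depends on how the images are held. The rough check below compares two storage choices against the 128GB pool. The 224×224 RGB input size and the 50 KB average JPEG size are assumptions for illustration, not measurements from this dataset, and the 128GB figure is treated as decimal gigabytes.

```python
NUM_IMAGES = 1_470_000          # from the plan: 1.47M images
RAM_BYTES = 128 * 10**9         # 128GB pool, treated as decimal GB (assumption)


def dataset_bytes(num_images, bytes_per_image):
    """Total bytes needed to hold every image at the given per-image size."""
    return num_images * bytes_per_image


def fits_in_ram(total_bytes, ram_bytes=RAM_BYTES):
    """True if the dataset alone fits, ignoring model, activations, and OS."""
    return total_bytes <= ram_bytes


# Assumption: decoded uint8 tensors at 224x224x3 (150,528 bytes each).
decoded_224 = dataset_bytes(NUM_IMAGES, 224 * 224 * 3)

# Assumption: images kept as compressed JPEG bytes, about 50 KB each.
jpeg_50kb = dataset_bytes(NUM_IMAGES, 50_000)

for label, total in [("decoded 224x224 RGB", decoded_224),
                     ("compressed JPEG (~50 KB)", jpeg_50kb)]:
    print(f"{label}: {total / 10**9:.1f} GB, fits: {fits_in_ram(total)}")
```

Under these assumptions, pre-decoded tensors come to roughly 221 GB and would not fit, while compressed bytes come to roughly 74 GB and would, leaving decode work to the data loader.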
Status Legend
[ ] not started · [~] in progress · [x] done · [-] skipped
Task Map
Phase 1 ──→ Phase 2 ──→ Phase 3 ──→ Phase 4 ──→ Phase 5
Dataset     Model       Model       Server      Integration
Reorg       Training    Export      Inference   + Testing
                        & Quant.    Pipeline
Phases
- Phase 1: Dataset Reorganization. Parse 11,499 flat directories into a hierarchical species→disease structure, create train/val splits, and build a species index.
- Phase 2: Hierarchical Model Training. Train a Swin-Tiny backbone + species head + disease heads using PyTorch + ROCm on Strix Halo.
- Phase 3: ONNX Export & Quantization. Export trained models to ONNX, apply INT8 quantization, and verify accuracy.
- Phase 4: Server Inference Pipeline. Build a server-side inference API with ONNX Runtime, OOD detection, and species routing.
- Phase 5: Browser Model & Hybrid Integration. Build a lightweight TF.js model for the client, add hybrid confidence-based routing, and complete the full integration.
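As an illustration of the Phase 1 work, the sketch below groups flat class directory names into a species index and assigns each image to a split deterministically. It assumes directory names follow a `Species___Disease` pattern like the original PlantVillage folders; the actual naming of the 11,499 directories is not stated here, so the separator, the 10% validation fraction, and both function names are hypothetical.

```python
import hashlib
from collections import defaultdict

SEPARATOR = "___"  # assumption: PlantVillage-style "Species___Disease" names


def build_species_index(class_dirs, separator=SEPARATOR):
    """Map each species to the sorted list of its disease labels."""
    index = defaultdict(set)
    for name in class_dirs:
        species, found, disease = name.partition(separator)
        if not found or not species or not disease:
            # Refuse to guess when a name does not match the assumed pattern.
            raise ValueError(f"cannot split class directory name: {name!r}")
        index[species].add(disease)
    return {species: sorted(diseases)
            for species, diseases in sorted(index.items())}


def assign_split(relative_path, val_fraction=0.1):
    """Hash the path so an image lands in the same split on every run."""
    digest = hashlib.sha256(relative_path.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")   # uniform in [0, 2**64)
    return "val" if bucket < val_fraction * 2**64 else "train"


if __name__ == "__main__":
    dirs = ["Tomato___Early_blight", "Tomato___healthy", "Apple___Apple_scab"]
    print(build_species_index(dirs))
    print(assign_split("Tomato___healthy/img_0001.jpg"))
```

Hashing the path instead of shuffling with a random seed keeps the split stable when images are added later, at the cost of the validation fraction being only approximately 10% per class.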
Dependencies
01 (dataset) ──→ 02 (training) ──→ 03 (export) ──→ 04 (server)
│
└──→ 05 (browser + hybrid)
Exit Criteria
- Species classifier achieves ≥95% top-1 accuracy on held-out val set
- Disease classifiers achieve ≥90% top-3 accuracy per species
- ONNX INT8 models infer in <200ms on CPU, <50ms on GPU
- Browser TF.js model loads and runs in <100ms on mid-range devices
- Hybrid routing works: high-confidence results served instantly from browser
- Server fallback fires automatically when browser confidence is low
- OOD detection rejects non-plant images with ≥99% precision
- Full integration: upload → result in <500ms (browser) or <1s (server)
- Existing app functionality preserved (all routes, pages, API endpoints)
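The hybrid routing criteria above can be sketched as a single threshold decision. This is written in Python for consistency with the training code, although the real check would presumably run in the browser. The 0.85 threshold, the `Prediction` type, and the `route` function are all assumptions for illustration; the plan does not specify them.

```python
from dataclasses import dataclass
from typing import Callable

BROWSER_CONFIDENCE_THRESHOLD = 0.85  # assumption: not specified in the plan


@dataclass
class Prediction:
    label: str
    confidence: float
    source: str  # "browser" or "server"


def route(browser_pred: Prediction,
          server_predict: Callable[[], Prediction],
          threshold: float = BROWSER_CONFIDENCE_THRESHOLD) -> Prediction:
    """Serve the browser result when confident, otherwise fall back.

    server_predict is a zero-argument callable so the server round trip
    only happens when the fallback actually fires.
    """
    if browser_pred.confidence >= threshold:
        return browser_pred
    return server_predict()


if __name__ == "__main__":
    def fake_server() -> Prediction:
        # Stand-in for a call to the server inference API.
        return Prediction("Tomato / Early_blight", 0.97, "server")

    confident = Prediction("Tomato / Early_blight", 0.93, "browser")
    unsure = Prediction("Tomato / Late_blight", 0.41, "browser")
    print(route(confident, fake_server).source)
    print(route(unsure, fake_server).source)
```

The threshold trades latency against accuracy: raising it sends more traffic to the server path and its <1s budget, while lowering it serves more results from the smaller browser model.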