task to get this here done

2026-06-12 13:20:33 -04:00
parent 6379860123
commit 34855eff55
7 changed files with 1307 additions and 85 deletions
--- a/tasks/hierarchical-model-upgrade/05-browser-hybrid.md
+++ b/tasks/hierarchical-model-upgrade/05-browser-hybrid.md
@@ -0,0 +1,208 @@
+# Phase 5 — Browser Model & Hybrid Integration
+
+**Blocked by**: Phase 4 (server inference pipeline)
+**Est. time**: 2-3 days
+**Machine**: Any (development on Strix Halo or M3 Pro)
+
+## Objective
+
+Train a lightweight browser-compatible model (TF.js) and implement the hybrid routing logic: fast first pass in-browser, server fallback when confidence is low.
+
+## Hybrid Flow
+
+```
+User uploads image
+        │
+        ▼
+┌──────────────────────┐
+│ Browser:             │
+│ EfficientNet-Lite    │  ← ~5MB TF.js model in browser
+│ (TF.js)              │     Predicts species + top-5 diseases
+│                      │
+│ Species confidence?  │
+│ ┌────┴────┐         │
+│ │ ≥90%    │ <90%    │
+│ └────┬────┘         │
+│      │               │
+│  Show result         │
+│  (instant)  │        │
+└────────────┼────────┘
+             │ (background if ≥90%,
+             │  foreground if <90%)
+             ▼
+┌──────────────────────┐
+│ Server:              │
+│ Full Swin-Tiny       │  ← Only when browser is uncertain
+│ (ONNX Runtime)       │     or user requests "detailed analysis"
+│                      │
+│ Returns enriched     │
+│ results with full    │
+│ treatment info       │
+└──────────────────────┘
+```
+
+## Steps
+
+### 5.1 Train lightweight browser model
+
+Use the hierarchical training data to train an **EfficientNet-Lite0** model that outputs both species and disease predictions:
+
+```python
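+import timm
+import torch.nn as nn
+
+NUM_SPECIES = 320
+NUM_DISEASES = 11_499  # flat head over all diseases
+
+# Train in PyTorch first (for accuracy), then convert for TF.js
+class BrowserModel(nn.Module):
+    def __init__(self):
+        super().__init__()
+        # num_classes=0 drops the ImageNet classifier and returns pooled features
+        self.backbone = timm.create_model('efficientnet_lite0', pretrained=True, num_classes=0)
+        self.species_head = nn.Linear(self.backbone.num_features, NUM_SPECIES)
+        # Or use hierarchical heads with just the top-50 diseases per species
+        self.disease_head = nn.Linear(self.backbone.num_features, NUM_DISEASES)
+
+    def forward(self, x):
+        feats = self.backbone(x)
+        return self.species_head(feats), self.disease_head(feats)
+
+# Training: 10 epochs frozen backbone, 10 epochs fine-tune
+# Target: <5MB model size, runs in <100ms on mobile device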
+```
+
+**Export to TF.js**:
+
+```bash
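+# Convert PyTorch → ONNX → TF SavedModel → TF.js
+# Step 1 (PyTorch → ONNX) runs in Python via torch.onnx.export; tf2onnx only converts TensorFlow models.
+# Step 2 (ONNX → SavedModel) assumes the onnx2tf package; another ONNX-to-TensorFlow converter may work as well.
+onnx2tf -i browser_model.onnx -o browser_model/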
+tensorflowjs_converter --input_format=tf_saved_model browser_model/ browser_tfjs/
+```
+
+**Model size target**: < 5MB (EfficientNet-Lite0 is ~4.7MB with INT8 quantization).
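+
+The size target can be sanity-checked with simple arithmetic on the two classification heads. The sketch below is only an estimate: `FEATURE_DIM = 1280` is an assumption about the pooled feature width of EfficientNet-Lite0, and `headParams` and `sizeMB` are hypothetical helper names.
+
+```typescript
+// Rough size estimate for the species and disease heads.
+// FEATURE_DIM is an assumed pooled feature width, not a value from this plan.
+const FEATURE_DIM = 1280;
+
+/** Parameters in a dense layer: weights plus one bias per output class. */
+function headParams(numClasses: number, featureDim: number = FEATURE_DIM): number {
+  return featureDim * numClasses + numClasses;
+}
+
+/** Size in MB (10^6 bytes) for a parameter count at a given byte width. */
+function sizeMB(params: number, bytesPerParam: number): number {
+  return (params * bytesPerParam) / 1e6;
+}
+
+const speciesHead = headParams(320);   // 409,920 parameters
+const diseaseHead = headParams(11499); // 14,730,219 parameters
+console.log(sizeMB(speciesHead, 1).toFixed(2)); // INT8: "0.41"
+console.log(sizeMB(diseaseHead, 1).toFixed(2)); // INT8: "14.73"
+```
+
+Under these assumptions, a flat 11,499-way disease head would by itself be about 14.7MB at one byte per parameter, so the top-50-per-species variant looks like the option that can fit the 5MB budget.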
+
+### 5.2 Browser inference integration
+
+```typescript
+// src/lib/ml/inference.ts — Updated with hybrid routing
+
+export type InferenceSource = "browser" | "server";
+export type InferenceMode = "quick" | "detailed";
+
+export async function identifyPlant(
+  image: HTMLImageElement | File,
+  mode: InferenceMode = "quick",
+): Promise<InferenceResult> {
+  // 1. Run browser model (always, it's fast)
+  const browserResult = await runBrowserInference(image);
+
+  // 2. Decide: is this confident enough?
+  if (mode === "quick" && browserResult.topConfidence >= 0.9) {
+    // Browser alone is sufficient
+    return {
+      ...browserResult,
+      source: "browser",
+      inferenceTimeMs: browserResult.inferenceTimeMs,
+    };
+  }
+
+  // 3. Fall back to server for detailed analysis
+  const serverResult = await runServerInference(image);
+
+  return {
+    ...serverResult,
+    source: "server",
+    browserConfidence: browserResult.topConfidence,
+    serverConfidence: serverResult.topConfidence,
+  };
+}
+
+async function runBrowserInference(image: HTMLImageElement | File): Promise<BrowserResult> {
+  const model = await getBrowserModel(); // Lazy load EfficientNet-Lite
+  const tensor = await preprocessBrowser(image); // TF.js preprocessing
+  const output = model.predict(tensor); // predict() is synchronous in TF.js
+  return parseOutput(output);
+}
+```
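+
+`getBrowserModel` is described above as a lazy loader. One way to make sure that concurrent uploads share a single in-flight load is to memoize the loading promise. This is a minimal sketch, not the project's implementation: `lazyOnce` is a hypothetical helper, and the model path in the usage comment is a placeholder.
+
+```typescript
+// Memoize an async loader so concurrent callers share one in-flight load.
+// `lazyOnce` is a hypothetical helper, not part of TF.js.
+function lazyOnce<T>(loader: () => Promise<T>): () => Promise<T> {
+  let pending: Promise<T> | null = null;
+  return () => {
+    if (pending === null) {
+      // Reset on failure so a later call can retry instead of caching the rejection.
+      pending = loader().catch((err) => {
+        pending = null;
+        throw err;
+      });
+    }
+    return pending;
+  };
+}
+
+// Usage sketch (placeholder path): wrap the real model load.
+// const getBrowserModel = lazyOnce(() => tf.loadGraphModel("/models/browser_tfjs/model.json"));
+```
+
+Calling the returned function on page mount starts the download early, and later calls reuse the same promise.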
+
+### 5.3 UI integration
+
+```typescript
+// src/components/ImageUpload.tsx — Updated
+
+import { useState } from 'react';
+
+function ImageUpload() {
+  const [result, setResult] = useState<InferenceResult | null>(null);
+  const [mode, setMode] = useState<InferenceMode>('quick');
+  const [source, setSource] = useState<InferenceSource | null>(null);
+
+  async function handleUpload(image: File) {
+    // identifyPlant resolves with the browser result on the fast path,
+    // or with the server result once the fallback has completed.
+    const inferenceResult = await identifyPlant(image, mode);
+    setResult(inferenceResult);
+    setSource(inferenceResult.source);
+
+    // Any server call has already finished by this point, so a
+    // "Getting detailed analysis..." spinner has to be shown before the await.
+  }
+
+  return (
+    <div>
+      <ImageUploader onUpload={handleUpload} />
+      {result && (
+        <div>
+          <ResultCard result={result} />
+          <ConfidenceBadge
+            confidence={result.topConfidence}
+            source={source}  // "browser" or "server"
+          />
+        </div>
+      )}
+    </div>
+  );
+}
+```
+
+### 5.4 User-facing indication
+
+Show a subtle badge indicating which model made the prediction:
+
+| Source              | Badge                | UX                                    |
+| ------------------- | -------------------- | ------------------------------------- |
+| Browser (high conf) | ✅ Instant ID        | Green badge, "Analyzed on device"     |
+| Server (full model) | 🧠 Detailed Analysis | Blue badge, "Deep analysis"           |
+| Server (fallback)   | 🔄 Upgraded          | Yellow badge, "Upgraded for accuracy" |
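+
+The badge table can be expressed as a small lookup. The sketch below assumes that a server result in "quick" mode means the request was upgraded because the browser was not confident enough; `Badge` and `badgeFor` are hypothetical names.
+
+```typescript
+type InferenceSource = "browser" | "server";
+type InferenceMode = "quick" | "detailed";
+
+interface Badge {
+  label: string;
+  color: "green" | "blue" | "yellow";
+  caption: string;
+}
+
+// Map a result to its badge row.
+// Assumption: server + "quick" mode implies a low-confidence fallback.
+function badgeFor(source: InferenceSource, mode: InferenceMode): Badge {
+  if (source === "browser") {
+    return { label: "Instant ID", color: "green", caption: "Analyzed on device" };
+  }
+  if (mode === "detailed") {
+    return { label: "Detailed Analysis", color: "blue", caption: "Deep analysis" };
+  }
+  return { label: "Upgraded", color: "yellow", caption: "Upgraded for accuracy" };
+}
+```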
+
+### 5.5 Progressive enhancement
+
+The system should degrade gracefully:
+
+| Scenario                           | Behavior                                                              |
+| ---------------------------------- | --------------------------------------------------------------------- |
+| Offline                            | Browser model only (may be less accurate for unusual diseases)        |
+| Slow network                       | Browser model shows results immediately, server updates in background |
+| Server down                        | Browser model alone, with note: "Limited to quick analysis"           |
+| New disease (not in browser model) | Server model handles it, browser shows "could be unusual"             |
+| No camera / file                   | Error message, "Upload an image to identify"                          |
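+
+The offline, slow-network, and server-down rows share one pattern: try the server, and keep the browser prediction if the call fails or takes too long. A minimal sketch follows, assuming a 1500 ms timeout taken from the hybrid server path target; `withServerFallback`, `callServer`, and the result types are hypothetical.
+
+```typescript
+interface Prediction {
+  label: string;
+  topConfidence: number;
+}
+
+interface HybridResult extends Prediction {
+  source: "browser" | "server";
+  note?: string;
+}
+
+// Try the server, but fall back to the browser prediction on failure or timeout.
+// `callServer` and `timeoutMs` are hypothetical parameters for this sketch.
+async function withServerFallback(
+  browser: Prediction,
+  callServer: () => Promise<Prediction>,
+  timeoutMs: number = 1500,
+): Promise<HybridResult> {
+  let timer: ReturnType<typeof setTimeout> | undefined;
+  const timeout = new Promise<never>((_, reject) => {
+    timer = setTimeout(() => reject(new Error("server timeout")), timeoutMs);
+  });
+  try {
+    const server = await Promise.race([callServer(), timeout]);
+    return { ...server, source: "server" };
+  } catch {
+    // Offline, server down, or too slow: keep the on-device answer.
+    return { ...browser, source: "browser", note: "Limited to quick analysis" };
+  } finally {
+    clearTimeout(timer);
+  }
+}
+```
+
+In `identifyPlant`, a wrapper like this could replace the bare `runServerInference` call so that a failed request still yields a result.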
+
+## Edge Cases & Gotchas
+
+- **Model loading race**: If the browser model hasn't loaded yet, show a loading spinner rather than falling through to the server. Start loading the model on page mount so it is usually ready before the first upload.
+- **Discrepancy between browser and server**: If browser and server disagree on the top prediction, show both with confidence bars. The server model is authoritative.
+- **Retina / high-DPI images**: TF.js may handle these differently from ONNX. Ensure preprocessing (resize, normalize) produces identical tensors.
+- **Cache busting**: When the model is updated, increment a version hash in the URL to avoid stale cached models.
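+- **Memory**: The EfficientNet-Lite weights are ~5MB, and inference allocates additional tensors on top of that. Older phones may struggle; dispose intermediate tensors after each inference (`tf.tidy()` or `tensor.dispose()`), and call `model.dispose()` only when the model is no longer needed, since disposing it forces a reload.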
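+
+Two of the items above, discrepancy handling and cache busting, reduce to small pure functions. The sketch below is illustrative only: `reconcile` and `versionedModelUrl` are hypothetical names, and placing the version in the path (rather than in a query string) is an assumption made so that weight files resolved relative to `model.json` are versioned too.
+
+```typescript
+interface Prediction {
+  label: string;
+  topConfidence: number;
+}
+
+interface Reconciled {
+  primary: Prediction;      // what the UI treats as the answer
+  alternative?: Prediction; // shown alongside when the models disagree
+  agree: boolean;
+}
+
+// The server model is authoritative; the browser prediction is kept only as
+// an alternative when its top label differs.
+function reconcile(browser: Prediction, server: Prediction): Reconciled {
+  const agree = browser.label === server.label;
+  return agree
+    ? { primary: server, agree }
+    : { primary: server, alternative: browser, agree };
+}
+
+// Build a versioned model URL so an updated model is not served from a stale cache.
+function versionedModelUrl(baseDir: string, version: string): string {
+  const trimmed = baseDir.replace(/\/+$/, ""); // drop trailing slashes
+  return `${trimmed}/${encodeURIComponent(version)}/model.json`;
+}
+```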
+
+## Performance Targets
+
+| Metric                          | Target                           |
+| ------------------------------- | -------------------------------- |
+| Browser model load time (warm)  | < 1s                             |
+| Browser model inference         | < 100ms                          |
+| Server model inference (warm)   | < 200ms                          |
+| Hybrid fast path (browser only) | < 200ms total                    |
+| Hybrid server path              | < 1.5s total (including network) |
+| Model file size (browser)       | < 5MB                            |
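+
+The timing targets can be checked mechanically during verification. The following sketch copies the millisecond budgets from the table; `BUDGET_MS` and `overBudget` are hypothetical names, and the measured values are assumed to come from something like `performance.now()` deltas.
+
+```typescript
+// Targets copied from the table above, in milliseconds.
+const BUDGET_MS = {
+  browserLoadWarm: 1000,
+  browserInference: 100,
+  serverInferenceWarm: 200,
+  hybridFastPath: 200,
+  hybridServerPath: 1500,
+} as const;
+
+type Metric = keyof typeof BUDGET_MS;
+
+// Return the metrics whose measured time is not strictly under budget.
+function overBudget(measured: Partial<Record<Metric, number>>): Metric[] {
+  return (Object.keys(BUDGET_MS) as Metric[]).filter((m) => {
+    const value = measured[m];
+    return value !== undefined && value >= BUDGET_MS[m];
+  });
+}
+
+console.log(overBudget({ browserInference: 120, hybridFastPath: 150 }));
+// logs [ 'browserInference' ]
+```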
+
+## Verification
+
+- [ ] Browser model loads in Chrome, Firefox, Safari (desktop + mobile)
+- [ ] Browser model inference completes in < 100ms on mid-range phone
+- [ ] Hybrid routing works: conf ≥90% → browser result, conf <90% → server result
+- [ ] Server fallback fires within 200ms of browser model completing
+- [ ] UI shows source badge ("Instant ID" vs "Detailed Analysis")
+- [ ] Offline mode: browser model works without network
+- [ ] Server degraded: system still works with browser model only
+- [ ] No memory leaks on repeated inferences (10+ images in succession)
+- [ ] Identical image produces same top prediction on browser and server (within margin)
+- [ ] All existing tests pass with hybrid pipeline