# Phase 5 — Browser Model & Hybrid Integration

**Blocked by**: Phase 4 (server inference pipeline)
**Est. time**: 2-3 days
**Machine**: Any (development on Strix Halo or M3 Pro)

## Objective

Train a lightweight browser-compatible model (TF.js) and implement the hybrid routing logic: a fast first pass in the browser, with server fallback when confidence is low.

## Hybrid Flow

```
User uploads image
        │
        ▼
┌──────────────────────┐
│  Browser:            │
│  EfficientNet-Lite   │  ← ~5MB TF.js model in browser
│  (TF.js)             │    Predicts species + top-5 diseases
│                      │
│  Species confidence? │
│  ┌────┴────┐         │
│  │ ≥90%    │ <90%    │
│  └────┬────┘         │
│       │              │
│  Show result         │
│  (instant)           │
└────────────┼─────────┘
             │ (background if ≥90%,
             │  foreground if <90%)
             ▼
┌──────────────────────┐
│  Server:             │
│  Full Swin-Tiny      │  ← Only when browser is uncertain
│  (ONNX Runtime)      │    or user requests "detailed analysis"
│                      │
│  Returns enriched    │
│  results with full   │
│  treatment info      │
└──────────────────────┘
```

## Steps

### 5.1 Train lightweight browser model

Use the hierarchical training data to train an **EfficientNet-Lite0** model that outputs both species and disease predictions:

```python
import timm
import tensorflow as tf  # For TF.js export

# Train in PyTorch first (for accuracy), then convert
model = timm.create_model('efficientnet_lite0', pretrained=True)

# Add: species head (320) + disease head (11,499 flat)
# Or use hierarchical with just top-50 diseases per species
# Training: 10 epochs frozen backbone, 10 epochs fine-tune
# Target: <5MB model size, runs in <100ms on mobile device
```

**Export to TF.js**:

```bash
# Convert PyTorch → ONNX → SavedModel → TF.js
# 1. Export browser_model.onnx with torch.onnx.export (tf2onnx handles TensorFlow models only)
# 2. ONNX → SavedModel; onnx2tf is one option (assumed tool choice)
onnx2tf -i browser_model.onnx -o browser_model/
# 3. SavedModel → TF.js
tensorflowjs_converter --input_format=tf_saved_model browser_model/ browser_tfjs/
```

**Model size target**: < 5MB (EfficientNet-Lite0 is ~4.7MB with INT8 quantization).
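As a rough sanity check on the size target, the weight count of a dense classification head is (feature width + 1) × number of classes. The sketch below assumes a 1280-wide pooled feature vector for EfficientNet-Lite0 and 1 byte per weight after INT8 quantization. Both figures are assumptions to confirm against the actual checkpoint, and `headBytes` is a hypothetical helper, not part of any library.

```typescript
// Rough size estimate for a dense classification head.
// Assumptions (not from the plan): 1280 input features, 1 byte per INT8 weight.
const FEATURE_DIM = 1280;
const BYTES_PER_WEIGHT = 1;

function headBytes(numClasses: number, featureDim: number = FEATURE_DIM): number {
  // Weight matrix (featureDim x numClasses) plus one bias per class.
  return (featureDim * numClasses + numClasses) * BYTES_PER_WEIGHT;
}

const toMB = (bytes: number): number => bytes / 1e6;

const speciesHeadMB = toMB(headBytes(320)); // species head (320 classes)
const flatDiseaseMB = toMB(headBytes(11499)); // flat disease head (11,499 classes)
const top50HeadMB = toMB(headBytes(50)); // a single 50-way disease head

console.log({ speciesHeadMB, flatDiseaseMB, top50HeadMB });
```

Under these assumptions the flat 11,499-way disease head alone comes to roughly 14.7MB, well over the 5MB budget, while the species head is about 0.4MB and a single 50-way head about 0.06MB. Whether the hierarchical option fits depends on how many such heads are actually stored, so it is worth measuring the exported files rather than relying on this estimate.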
### 5.2 Browser inference integration

```typescript
// src/lib/ml/inference.ts (updated with hybrid routing)
// InferenceResult is an assumed name for the shared result type; it needs
// topConfidence and inferenceTimeMs plus the optional fields set below.

export type InferenceSource = "browser" | "server";
export type InferenceMode = "quick" | "detailed";

export async function identifyPlant(
  image: HTMLImageElement | File,
  mode: InferenceMode = "quick",
): Promise<InferenceResult> {
  // 1. Run browser model (always, it's fast)
  const browserResult = await runBrowserInference(image);

  // 2. Decide: is this confident enough?
  if (mode === "quick" && browserResult.topConfidence >= 0.9) {
    // Browser alone is sufficient
    return {
      ...browserResult,
      source: "browser",
      inferenceTimeMs: browserResult.inferenceTimeMs,
    };
  }

  // 3. Fall back to server for detailed analysis
  const serverResult = await runServerInference(image);
  return {
    ...serverResult,
    source: "server",
    browserConfidence: browserResult.topConfidence,
    serverConfidence: serverResult.topConfidence,
  };
}

// Accepts the same input types as identifyPlant, which passes the image straight through
async function runBrowserInference(
  image: HTMLImageElement | File,
): Promise<InferenceResult> {
  const model = await getBrowserModel(); // Lazy load EfficientNet-Lite
  const tensor = await preprocessBrowser(image); // TF.js preprocessing
  const output = await model.predict(tensor);
  return parseOutput(output);
}
```

### 5.3 UI integration

```typescript
// src/components/ImageUpload.tsx (updated)
function ImageUpload() {
  const [result, setResult] = useState<InferenceResult | null>(null);
  const [mode, setMode] = useState<InferenceMode>('quick');
  const [source, setSource] = useState<InferenceSource | null>(null);

  async function handleUpload(image: File) {
    // Browser model runs first; identifyPlant falls back to the server if needed
    const browserResult = await identifyPlant(image, 'quick');
    setResult(browserResult);
    setSource(browserResult.source);

    // If the server was called, show a loading indicator while it runs
    if (browserResult.source === 'server') {
      // Show "Getting detailed analysis..." spinner
    }
  }

  return (
    <div>
      {/* Upload control that calls handleUpload (placeholder markup) */}
      {result && (
        <div>{/* Prediction and source badge (placeholder markup) */}</div>
      )}
    </div>
  );
}
```

### 5.4 User-facing indication

Show a subtle badge indicating which model made the prediction:

| Source              | Badge                | UX                                    |
| ------------------- | -------------------- | ------------------------------------- |
| Browser (high conf) | ✅ Instant ID        | Green badge, "Analyzed on device"     |
| Server (full model) | 🧠 Detailed Analysis | Blue badge, "Deep analysis"           |
| Server (fallback)   | 🔄 Upgraded          | Yellow badge, "Upgraded for accuracy" |

### 5.5 Progressive enhancement

The system should degrade gracefully:

| Scenario                           | Behavior                                                              |
| ---------------------------------- | --------------------------------------------------------------------- |
| Offline                            | Browser model only (may be less accurate for unusual diseases)        |
| Slow network                       | Browser model shows results immediately, server updates in background |
| Server down                        | Browser model alone, with note: "Limited to quick analysis"           |
| New disease (not in browser model) | Server model handles it, browser shows "could be unusual"             |
| No camera / file                   | Error message, "Upload an image to identify"                          |

## Edge Cases & Gotchas

- **Model loading race**: If the browser model hasn't loaded yet, show a loading spinner rather than falling through to the server. Start loading the model on page mount so it is usually ready by the first upload.
- **Discrepancy between browser and server**: If browser and server disagree on the top prediction, show both with confidence bars. The server model is authoritative.
- **Retina / high-DPI images**: TF.js may handle these differently from ONNX. Ensure preprocessing (resize, normalize) produces identical tensors on both paths.
- **Cache busting**: When the model is updated, increment a version hash in the URL to avoid stale cached models.
- **Memory**: The EfficientNet-Lite weights take ~5MB, and activations add to that during inference. Older phones may struggle; add a cleanup step after inference (dispose intermediate tensors, and call `model.dispose()` once the model is no longer needed).
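The discrepancy rule above can be sketched as a small pure function. The `Prediction` and `Reconciled` shapes and the `reconcile` name are hypothetical and exist only for illustration; the real result types may carry more fields.

```typescript
// Hypothetical shapes for illustration; the project's real types may differ.
interface Prediction {
  label: string;
  confidence: number; // 0..1
}

interface Reconciled {
  primary: Prediction; // what the UI treats as the answer
  alternatives: Prediction[]; // extra predictions to render with confidence bars
  disagreement: boolean;
}

function reconcile(browserTop: Prediction, serverTop: Prediction): Reconciled {
  const disagreement = browserTop.label !== serverTop.label;
  return {
    primary: serverTop, // the server model is authoritative
    alternatives: disagreement ? [browserTop] : [],
    disagreement,
  };
}

// Example: the two models disagree, so both predictions are surfaced.
const outcome = reconcile(
  { label: "tomato_early_blight", confidence: 0.62 },
  { label: "tomato_late_blight", confidence: 0.91 },
);
console.log(outcome.primary.label, outcome.alternatives.length);
```

Keeping this logic free of UI and network code makes it easy to unit test, and the component only has to render `primary` plus any `alternatives`.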
## Performance Targets

| Metric                          | Target                           |
| ------------------------------- | -------------------------------- |
| Browser model load time (warm)  | < 1s                             |
| Browser model inference         | < 100ms                          |
| Server model inference (warm)   | < 200ms                          |
| Hybrid fast path (browser only) | < 200ms total                    |
| Hybrid server path              | < 1.5s total (including network) |
| Model file size (browser)       | < 5MB                            |

## Verification

- [ ] Browser model loads in Chrome, Firefox, Safari (desktop + mobile)
- [ ] Browser model inference completes in < 100ms on mid-range phone
- [ ] Hybrid routing works: conf ≥90% → browser result, conf <90% → server result
- [ ] Server fallback fires within 200ms of browser model completing
- [ ] UI shows source badge ("Instant ID" vs "Deep Analysis")
- [ ] Offline mode: browser model works without network
- [ ] Server degraded: system still works with browser model only
- [ ] No memory leaks on repeated inferences (10+ images in succession)
- [ ] Identical image produces same top prediction on browser and server (within margin)
- [ ] All existing tests pass with hybrid pipeline
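The routing and degraded-mode checks in the list above are easier to automate if the decision is isolated from the model and network calls. The following is a minimal sketch under that assumption; `chooseRoute`, the `Route` type, and the `serverReachable` flag are hypothetical names, and only the 0.9 threshold comes from this plan.

```typescript
type InferenceMode = "quick" | "detailed";
type Route = "browser" | "server" | "browser-degraded";

const CONFIDENCE_THRESHOLD = 0.9; // from the plan: conf >= 90% stays in the browser

function chooseRoute(
  confidence: number,
  mode: InferenceMode,
  serverReachable: boolean,
): Route {
  // The server is wanted for detailed mode or when the browser model is unsure.
  const wantsServer = mode === "detailed" || confidence < CONFIDENCE_THRESHOLD;
  if (!wantsServer) return "browser";
  // Offline or server down: keep the browser result and flag it as limited.
  return serverReachable ? "server" : "browser-degraded";
}

console.log(chooseRoute(0.93, "quick", true));
console.log(chooseRoute(0.42, "quick", false));
```

A test suite can then cover the threshold boundary and the offline case with plain assertions, without loading either model.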