task to get this here done
tasks/hierarchical-model-upgrade/05-browser-hybrid.md
# Phase 5 — Browser Model & Hybrid Integration

**Blocked by**: Phase 4 (server inference pipeline)
**Est. time**: 2-3 days
**Machine**: Any (development on Strix Halo or M3 Pro)

## Objective

Train a lightweight browser-compatible model (TF.js) and implement the hybrid routing logic: fast first pass in-browser, server fallback when confidence is low.

## Hybrid Flow

```
    User uploads image
             │
             ▼
┌────────────────────────┐
│  Browser:              │
│  EfficientNet-Lite     │  ← ~5MB TF.js model in browser
│  (TF.js)               │    Predicts species + top-5 diseases
│                        │
│  Species confidence?   │
│       ┌────┴────┐      │
│       │  ≥90%   │ <90% │
│       └────┬────┘      │
│            │           │
│  Show result           │
│  (instant) │           │
└────────────┼───────────┘
             │ (background if ≥90%,
             │  foreground if <90%)
             ▼
┌────────────────────────┐
│  Server:               │
│  Full Swin-Tiny        │  ← Only when browser is uncertain
│  (ONNX Runtime)        │    or user requests "detailed analysis"
│                        │
│  Returns enriched      │
│  results with full     │
│  treatment info        │
└────────────────────────┘
```

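The branching in the diagram can be sketched as a pure function, which keeps the routing rule testable without a model or a network. This is a minimal sketch, not the project's implementation: `planRoute`, `RoutePlan`, and `RouteOptions` are hypothetical names, and only the 0.9 threshold comes from the diagram. Because the diagram mentions both a background server call at ≥90% and a server that runs "only when browser is uncertain", the background refresh is modeled here as an opt-in flag.

```typescript
// Hypothetical names throughout; only the 0.9 threshold comes from the diagram.
type ServerCall = "none" | "background" | "foreground";

interface RoutePlan {
  showBrowserResult: boolean; // render the on-device prediction immediately
  serverCall: ServerCall; // whether and how the server model is invoked
}

interface RouteOptions {
  threshold?: number; // species-confidence cutoff (0.9 in the diagram)
  backgroundRefresh?: boolean; // assumption: opt-in server refresh on the fast path
}

function planRoute(
  speciesConfidence: number,
  detailedRequested: boolean,
  { threshold = 0.9, backgroundRefresh = false }: RouteOptions = {},
): RoutePlan {
  if (speciesConfidence < threshold) {
    // Uncertain: the user waits for the server result.
    return { showBrowserResult: false, serverCall: "foreground" };
  }
  if (detailedRequested) {
    // Confident, but the user asked for the full model anyway.
    return { showBrowserResult: true, serverCall: "foreground" };
  }
  return {
    showBrowserResult: true,
    serverCall: backgroundRefresh ? "background" : "none",
  };
}
```

Keeping the threshold and the background behavior as parameters makes it easy to tune either one once real confidence distributions are available.
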
## Steps

### 5.1 Train lightweight browser model

Use the hierarchical training data to train an **EfficientNet-Lite0** model that outputs both species and disease predictions:

```python
import timm
import torch.nn as nn

# Train in PyTorch first (for accuracy), then convert for TF.js
# num_classes=0 drops the ImageNet classifier and returns pooled features
backbone = timm.create_model('efficientnet_lite0', pretrained=True, num_classes=0)

# Add: species head (320) + disease head (11,499 flat)
# Or use hierarchical with just top-50 diseases per species
species_head = nn.Linear(backbone.num_features, 320)
disease_head = nn.Linear(backbone.num_features, 11499)

# Training: 10 epochs frozen backbone, 10 epochs fine-tune
# Target: <5MB model size, runs in <100ms on mobile device
```

**Export to TF.js**:

```bash
# Convert PyTorch → ONNX → TF SavedModel → TF.js
# 1. PyTorch → ONNX: export with torch.onnx.export in Python (tf2onnx converts TensorFlow models, not PyTorch)
# 2. ONNX → SavedModel (assumes the onnx-tf package is installed)
onnx-tf convert -i browser_model.onnx -o browser_model/
# 3. SavedModel → TF.js graph model
tensorflowjs_converter --input_format=tf_saved_model --output_format=tfjs_graph_model browser_model/ browser_tfjs/
```

**Model size target**: < 5MB. EfficientNet-Lite0 has roughly 4.7M parameters, so about 4.7MB at INT8 with its stock 1000-class classifier; the custom species and disease heads must fit in the same budget.

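Since the classification heads count toward the 5MB budget, it helps to estimate their size before training. The sketch below is a back-of-the-envelope calculation under stated assumptions: a pooled feature width of 1280 (the usual value for EfficientNet-Lite0, but check `backbone.num_features` for the actual model) and one byte per weight for INT8. `headParams` and `sizeMB` are hypothetical helper names.

```typescript
// Assumption: pooled feature width of 1280 (verify against the real backbone).
const FEATURE_WIDTH = 1280;

/** Weights plus biases of one fully connected head. */
function headParams(features: number, classes: number): number {
  return features * classes + classes;
}

/** Size in megabytes (10^6 bytes) at a given precision. */
function sizeMB(params: number, bytesPerWeight: number): number {
  return (params * bytesPerWeight) / 1_000_000;
}

const speciesHead = headParams(FEATURE_WIDTH, 320); // 409,920 parameters
const diseaseHead = headParams(FEATURE_WIDTH, 11499); // 14,730,219 parameters

console.log(sizeMB(speciesHead, 1).toFixed(2)); // prints "0.41"
console.log(sizeMB(diseaseHead, 1).toFixed(2)); // prints "14.73"
```

Under these assumptions the flat 11,499-way disease head alone is about 14.7MB at INT8, roughly three times the whole budget. That makes the hierarchical option (top-50 diseases per species) or some other reduction of the disease head worth evaluating early.
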
### 5.2 Browser inference integration

```typescript
// src/lib/ml/inference.ts: updated with hybrid routing
// Assumes InferenceResult, BrowserResult, getBrowserModel, preprocessBrowser,
// parseOutput, and runServerInference are defined elsewhere in this module.

export type InferenceSource = "browser" | "server";
export type InferenceMode = "quick" | "detailed";

const CONFIDENCE_THRESHOLD = 0.9;

export async function identifyPlant(
  image: HTMLImageElement | File,
  mode: InferenceMode = "quick",
): Promise<InferenceResult> {
  // 1. Run browser model (always, it's fast)
  const browserResult = await runBrowserInference(image);

  // 2. Decide: is this confident enough?
  if (mode === "quick" && browserResult.topConfidence >= CONFIDENCE_THRESHOLD) {
    // Browser alone is sufficient (the spread already carries inferenceTimeMs)
    return { ...browserResult, source: "browser" };
  }

  // 3. Fall back to server for detailed analysis
  const serverResult = await runServerInference(image);

  return {
    ...serverResult,
    source: "server",
    browserConfidence: browserResult.topConfidence,
    serverConfidence: serverResult.topConfidence,
  };
}

// Accepts the same input types as identifyPlant so the call above type-checks
async function runBrowserInference(
  image: HTMLImageElement | File,
): Promise<BrowserResult> {
  const model = await getBrowserModel(); // Lazy load EfficientNet-Lite
  const tensor = await preprocessBrowser(image); // TF.js preprocessing (must decode File inputs)
  try {
    const output = model.predict(tensor); // predict() is synchronous in TF.js
    return parseOutput(output); // Expected to read and dispose the output tensor
  } finally {
    tensor.dispose(); // Free the input tensor
  }
}
```

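The `parseOutput` step is not spelled out above. One way it could turn a raw output vector into the "top-5 diseases" list is sketched below, assuming the model emits logits (skip `softmax` if the exported model already ends in a softmax layer). `Prediction`, `softmax`, and `topK` are hypothetical names, and the input is a plain number array such as the one obtained from a tensor's data.

```typescript
interface Prediction {
  label: string;
  confidence: number;
}

/** Numerically stable softmax over raw logits. */
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

/** The k most probable labels, highest confidence first. */
function topK(probs: number[], labels: string[], k: number): Prediction[] {
  return probs
    .map((confidence, i) => ({ label: labels[i], confidence }))
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, k);
}

// Usage with made-up logits and labels:
const example = topK(softmax([2.0, 0.5, 1.0]), ["rust", "blight", "mildew"], 2);
console.log(example.map((p) => p.label)); // prints [ 'rust', 'mildew' ]
```

The first entry's `confidence` would then play the role of `topConfidence` in the routing check.
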
### 5.3 UI integration

```typescript
// src/components/ImageUpload.tsx: updated
// Assumes ImageUploader, ResultCard, and ConfidenceBadge exist in this project,
// and that InferenceResult is exported from the inference module.
import { useState } from "react";
import { identifyPlant } from "../lib/ml/inference";
import type { InferenceMode, InferenceResult } from "../lib/ml/inference";

function ImageUpload() {
  const [result, setResult] = useState<InferenceResult | null>(null);
  // setMode is meant for a "detailed analysis" toggle (not shown)
  const [mode, setMode] = useState<InferenceMode>("quick");
  const [loading, setLoading] = useState(false);

  async function handleUpload(image: File) {
    // identifyPlant runs the browser model first and also awaits the server
    // when confidence is low, so show a spinner until it resolves
    setLoading(true);
    try {
      setResult(await identifyPlant(image, mode));
    } finally {
      setLoading(false);
    }
  }

  return (
    <div>
      <ImageUploader onUpload={handleUpload} />
      {loading && <p>Getting detailed analysis...</p>}
      {result && (
        <div>
          <ResultCard result={result} />
          <ConfidenceBadge
            confidence={result.topConfidence}
            source={result.source} // "browser" or "server"
          />
        </div>
      )}
    </div>
  );
}
```

### 5.4 User-facing indication

Show a subtle badge indicating which model made the prediction:

| Source              | Badge                | UX                                    |
| ------------------- | -------------------- | ------------------------------------- |
| Browser (high conf) | ✅ Instant ID        | Green badge, "Analyzed on device"     |
| Server (full model) | 🧠 Detailed Analysis | Blue badge, "Deep analysis"           |
| Server (fallback)   | 🔄 Upgraded          | Yellow badge, "Upgraded for accuracy" |

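The table distinguishes two server cases, but `InferenceSource` has only two values, so the badge needs one more input. A minimal sketch, assuming the requested mode is the distinguishing signal (a server result in "quick" mode means the browser fell back); `Badge` and `badgeFor` are hypothetical names:

```typescript
type InferenceSource = "browser" | "server";
type InferenceMode = "quick" | "detailed";

interface Badge {
  icon: string;
  label: string;
  color: "green" | "blue" | "yellow";
  caption: string;
}

function badgeFor(source: InferenceSource, mode: InferenceMode): Badge {
  if (source === "browser") {
    return { icon: "✅", label: "Instant ID", color: "green", caption: "Analyzed on device" };
  }
  if (mode === "detailed") {
    // The user explicitly asked for the full model.
    return { icon: "🧠", label: "Detailed Analysis", color: "blue", caption: "Deep analysis" };
  }
  // Server result in quick mode: the browser was not confident enough.
  return { icon: "🔄", label: "Upgraded", color: "yellow", caption: "Upgraded for accuracy" };
}
```

An alternative would be to record the reason on the result itself (for example a third source value), which avoids threading the mode through to the badge component.
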
### 5.5 Progressive enhancement

The system should degrade gracefully:

| Scenario                           | Behavior                                                              |
| ---------------------------------- | --------------------------------------------------------------------- |
| Offline                            | Browser model only (may be less accurate for unusual diseases)        |
| Slow network                       | Browser model shows results immediately, server updates in background |
| Server down                        | Browser model alone, with note: "Limited to quick analysis"           |
| New disease (not in browser model) | Server model handles it, browser shows "could be unusual"             |
| No camera / file                   | Error message, "Upload an image to identify"                          |

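The network-related rows of the table can be expressed as a single decision over the browser result and whatever state the server call is in. This is an illustrative sketch only: `ServerOutcome`, `Display`, and `chooseDisplay` are hypothetical names, and how each state is detected (timeouts, connectivity checks) is left out.

```typescript
type ServerOutcome<T> =
  | { status: "ok"; result: T }
  | { status: "offline" } // no network, request never sent
  | { status: "failed" } // server down or request errored
  | { status: "pending" }; // slow network, still waiting

interface Display<T> {
  result: T;
  source: "browser" | "server";
  note?: string;
}

function chooseDisplay<T>(browser: T, server: ServerOutcome<T>): Display<T> {
  switch (server.status) {
    case "ok":
      return { result: server.result, source: "server" };
    case "failed":
      return { result: browser, source: "browser", note: "Limited to quick analysis" };
    case "offline":
      return {
        result: browser,
        source: "browser",
        note: "May be less accurate for unusual diseases",
      };
    case "pending":
      // Show the browser result now; call again when the server responds.
      return { result: browser, source: "browser" };
  }
}
```

Calling `chooseDisplay` again when a pending request settles gives the "server updates in background" behavior without special-casing it in the UI.
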
## Edge Cases & Gotchas

- **Model loading race**: If the browser model hasn't finished loading when the user uploads an image, show a loading spinner rather than falling through to the server. Start loading the model on page mount so it is usually ready by then.
- **Discrepancy between browser and server**: If the two models disagree on the top prediction, show both with confidence bars. The server model is authoritative.
- **Retina / high-DPI images**: TF.js preprocessing in the browser may handle these differently from the server's ONNX pipeline. Ensure preprocessing (resize, normalize) produces identical tensors on both sides.
- **Cache busting**: When the model is updated, change a version hash in its URL to avoid stale cached models.
- **Memory**: EfficientNet-Lite takes ~5MB for weights alone, plus activation memory during inference. Older phones may struggle; dispose of input and output tensors after each inference (`tensor.dispose()` or `tf.tidy()`), and call `model.dispose()` only when the model is no longer needed, since disposing it after every inference forces a reload.

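Two of the points above, cache busting and browser/server disagreement, reduce to small helpers. The sketch below is one possible shape, with assumptions: the version is placed in the URL path (so every file in the model directory is versioned together), `model.json` is the file name the TF.js converter writes, and `versionedModelUrl`, `reconcile`, and `Top` are hypothetical names.

```typescript
interface Top {
  label: string;
  confidence: number;
}

/** Build a model URL whose path changes with each release, bypassing stale caches. */
function versionedModelUrl(baseUrl: string, version: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/${encodeURIComponent(version)}/model.json`;
}

/** Server is authoritative; on disagreement, keep the browser's pick for side-by-side display. */
function reconcile(browser: Top, server: Top): { primary: Top; alternative?: Top } {
  return browser.label === server.label
    ? { primary: server }
    : { primary: server, alternative: browser };
}

// Usage with a placeholder host and version hash:
console.log(versionedModelUrl("https://example.com/models/", "abc123"));
// prints "https://example.com/models/abc123/model.json"
```

When `alternative` is present, the UI can render both predictions with their confidence bars; when it is absent, a single result is enough.
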
## Performance Targets

| Metric                          | Target                           |
| ------------------------------- | -------------------------------- |
| Browser model load time (warm)  | < 1s                             |
| Browser model inference         | < 100ms                          |
| Server model inference (warm)   | < 200ms                          |
| Hybrid fast path (browser only) | < 200ms total                    |
| Hybrid server path              | < 1.5s total (including network) |
| Model file size (browser)       | < 5MB                            |

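The latency targets can be encoded once and checked against measured timings, for example in a benchmark script. A minimal sketch, assuming timings are collected in milliseconds by some other means; `TARGETS_MS`, `Metric`, and `overBudget` are hypothetical names, and the file-size target is left out because it is not a latency.

```typescript
// Latency targets from the table above, in milliseconds.
const TARGETS_MS = {
  browserModelLoadWarm: 1000,
  browserInference: 100,
  serverInferenceWarm: 200,
  hybridFastPath: 200,
  hybridServerPath: 1500,
} as const;

type Metric = keyof typeof TARGETS_MS;

/** Metrics whose measured value is not strictly below its target. */
function overBudget(measured: Partial<Record<Metric, number>>): Metric[] {
  return (Object.keys(TARGETS_MS) as Metric[]).filter((metric) => {
    const value = measured[metric];
    return value !== undefined && value >= TARGETS_MS[metric];
  });
}

// Usage with made-up measurements:
console.log(overBudget({ browserInference: 120, hybridFastPath: 150 }));
// prints [ 'browserInference' ]
```

Metrics that were not measured are skipped rather than reported, so a partial run does not produce false failures.
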
## Verification

- [ ] Browser model loads in Chrome, Firefox, Safari (desktop + mobile)
- [ ] Browser model inference completes in < 100ms on mid-range phone
- [ ] Hybrid routing works: conf ≥90% → browser result, conf <90% → server result
- [ ] Server fallback fires within 200ms of browser model completing
- [ ] UI shows source badge ("Instant ID" vs "Detailed Analysis")
- [ ] Offline mode: browser model works without network
- [ ] Server degraded: system still works with browser model only
- [ ] No memory leaks on repeated inferences (10+ images in succession)
- [ ] Identical image produces same top prediction on browser and server (within margin)
- [ ] All existing tests pass with hybrid pipeline