# 16. Backend Router — VoicePrint (Voice Cloning Detection)

meta:
  id: shieldai-unified-restructure-16
  feature: shieldai-unified-restructure
  priority: P1
  depends_on: [shieldai-unified-restructure-12, shieldai-unified-restructure-13, shieldai-unified-restructure-14]
  tags: [backend, trpc, voiceprint, ml, api]

objective:
- Build the tRPC router for VoicePrint, the AI voice cloning detection service. Port all logic from `services/voiceprint/` and `packages/api/src/routes/voiceprint.routes.ts` into a unified `voiceprint` router and service layer.

deliverables:
- `web/src/server/api/routers/voiceprint.ts` — VoicePrint router:
  - `voiceprint.getEnrollments` — `protectedProcedure` returning the user's voice enrollments
  - `voiceprint.createEnrollment` — `protectedProcedure` uploading and processing a voice sample
  - `voiceprint.deleteEnrollment` — `protectedProcedure` removing an enrollment
  - `voiceprint.analyzeAudio` — `protectedProcedure` analyzing audio for synthetic voice detection
  - `voiceprint.getAnalyses` — `protectedProcedure` returning analysis history
  - `voiceprint.getAnalysisResult` — `protectedProcedure` returning detailed analysis results
  - `voiceprint.getJobStatus` — `protectedProcedure` checking batch analysis job status
- `web/src/server/services/voiceprint.service.ts` — Core business logic:
  - `createEnrollment(userId, name, audioBuffer, metadata)` — save audio, generate embedding hash
  - `deleteEnrollment(userId, enrollmentId)` — remove audio file and DB record
  - `analyzeAudio(userId, audioBuffer, enrollmentId?)` — run ML detection:
    - Preprocess audio (VAD, noise reduction)
    - Run ECAPA-TDNN model for synthetic detection
    - If an enrollment is provided, run FAISS vector matching
    - Return confidence score and verdict
  - `getAnalyses(userId, filters?)` — query analysis history
  - `createBatchJob(userId, audioFilePath)` — create analysis job for async processing
- `web/src/server/services/voiceprint/ml.engine.ts` — ML inference:
  - `preprocessAudio(audioBuffer)` — VAD, resampling, noise reduction
  - `detectSynthetic(audioFeatures)` — ECAPA-TDNN inference
  - `matchVoice(embedding, enrollmentId)` — FAISS vector index search
  - `generateEmbedding(audioFeatures)` — create voice embedding vector
- `web/src/server/services/voiceprint/storage.ts` — Audio file storage:
  - `saveAudio(userId, audioBuffer)` — save to local disk or S3-compatible storage
  - `getAudioUrl(userId, audioHash)` — generate signed URL for retrieval
  - `deleteAudio(audioHash)` — remove file

steps:
1. Create `web/src/server/api/routers/voiceprint.ts`.
2. Define Zod schemas:
   - `createEnrollmentSchema`: `name: z.string().min(1)`, `audioBase64: z.string()` (or multipart handling)
   - `analyzeAudioSchema`: `audioBase64: z.string()`, `enrollmentId: z.string().uuid().optional()`
   - `analysisFilterSchema`: `page`, `limit`, `verdict` (all optional)
3. Implement router procedures:
   - Enrollment CRUD with user ownership
   - Audio analysis with optional enrollment matching
   - Job status queries
4. Create `web/src/server/services/voiceprint.service.ts`:
   - Port from `services/voiceprint/src/voiceprint.service.ts`
   - Handle audio preprocessing pipeline
   - Integrate with ML engine
5. Create ML engine:
   - `preprocessAudio`: use WebRTC VAD logic or a Node.js equivalent (e.g., `node-vad`)
   - `detectSynthetic`: placeholder for ECAPA-TDNN model integration. If the model is not available in JS, create a Python microservice bridge or use ONNX Runtime.
   - `matchVoice`: placeholder for FAISS integration. If FAISS is not available in JS, use `faiss-node` or a Python bridge.
   - `generateEmbedding`: create embedding vector for storage
6. Create storage module:
   - For local dev: save to `uploads/voiceprint/{userId}/{hash}.wav`
   - For production: integrate with S3, R2, or similar
   - Generate presigned URLs for client retrieval
7. Implement analysis pipeline:
   - Save audio → preprocess → run detection → store result → create alert if synthetic voice is detected
   - If an enrollment is provided, also run matching and include the similarity score
8. Wire router into `web/src/server/api/root.ts`.
9. Write unit tests for service functions (mock ML engine).

tests:
- Unit: `createEnrollment` saves audio and creates DB record
- Unit: `analyzeAudio` returns verdict and confidence
- Unit: `matchVoice` returns similarity score for enrolled voice
- Unit: Storage module saves and retrieves files correctly
- Unit: ML engine placeholders return mock results
- Integration: tRPC procedures enforce user ownership of enrollments

acceptance_criteria:
- [ ] Voice enrollments can be created, listed, and deleted per user
- [ ] Audio analysis returns synthetic/natural/uncertain verdict with confidence score
- [ ] If an enrollment is provided, analysis includes voice matching similarity
- [ ] Analysis history is queryable with pagination
- [ ] Batch jobs can be created and their status tracked
- [ ] Audio files are stored securely with user-scoped access
- [ ] Synthetic voice detection triggers an alert notification

validation:
- Upload a test audio file via tRPC client, verify enrollment created
- Request analysis on test audio, verify result structure (verdict, confidence, metadata)
- Verify that user A cannot access user B's enrollments or analyses
- Run `cd web && pnpm test` for VoicePrint unit tests

notes:
- Reference legacy: `services/voiceprint/src/`, `packages/api/src/routes/voiceprint.routes.ts`
- The ECAPA-TDNN and FAISS components may require Python or compiled native modules. If they cannot run in the Node.js monolith:
  - Option A: Create a lightweight Python gRPC/HTTP service for ML inference, call it from the monolith
  - Option B: Use ONNX Runtime Node.js bindings if a converted model is available
  - Option C: Keep the Python service separate but unify the API layer in tRPC (the monolith calls the Python service internally)
- For this task, implement the service layer with a pluggable ML engine interface. Use mock/stub implementations if native ML is not yet available.
- Audio files can be large. Consider streaming uploads instead of base64 encoding for production.
- The analysis pipeline should be idempotent: analyzing the same audio twice should return cached results.
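
The pluggable ML engine interface and the idempotency note can be sketched together as follows. This is a minimal sketch under stated assumptions, not the legacy implementation: the four engine method names come from the deliverables above, while `MlEngine`, `StubMlEngine`, `VoicePrintAnalyzer`, `toVerdict`, the 0.7/0.3 thresholds, the fixed stub scores, and the in-memory `Map` cache (standing in for a DB lookup) are all hypothetical. The hash function is injected so the sketch has no runtime dependencies; in the monolith it would presumably be a SHA-256 digest of the audio bytes.

```typescript
// Hypothetical types for illustration; none are taken from the legacy service.
type Verdict = "synthetic" | "natural" | "uncertain";

interface AnalysisResult {
  verdict: Verdict;
  confidence: number; // assumed here to be the model's synthetic score in [0, 1]
  similarity?: number; // set only when an enrollment was supplied
}

// Pluggable engine: a stub, ONNX Runtime, or a Python bridge could implement this.
interface MlEngine {
  preprocessAudio(audioBuffer: Uint8Array): Promise<Float32Array>;
  detectSynthetic(audioFeatures: Float32Array): Promise<number>;
  matchVoice(embedding: Float32Array, enrollmentId: string): Promise<number>;
  generateEmbedding(audioFeatures: Float32Array): Promise<Float32Array>;
}

// Stub engine returning fixed scores, for use until native ML is available.
class StubMlEngine implements MlEngine {
  detectCalls = 0; // lets tests confirm that cached analyses skip inference

  async preprocessAudio(audioBuffer: Uint8Array): Promise<Float32Array> {
    return Float32Array.from(audioBuffer, (b) => b / 255);
  }
  async detectSynthetic(_audioFeatures: Float32Array): Promise<number> {
    this.detectCalls += 1;
    return 0.9; // arbitrary mock score
  }
  async matchVoice(_embedding: Float32Array, _enrollmentId: string): Promise<number> {
    return 0.5; // arbitrary mock similarity
  }
  async generateEmbedding(audioFeatures: Float32Array): Promise<Float32Array> {
    return audioFeatures;
  }
}

// Assumed thresholds; the spec does not define where "uncertain" begins.
function toVerdict(score: number): Verdict {
  if (score >= 0.7) return "synthetic";
  if (score <= 0.3) return "natural";
  return "uncertain";
}

class VoicePrintAnalyzer {
  private readonly engine: MlEngine;
  private readonly hashAudio: (audio: Uint8Array) => string;
  private readonly cache = new Map<string, AnalysisResult>(); // stand-in for a DB lookup

  constructor(engine: MlEngine, hashAudio: (audio: Uint8Array) => string) {
    this.engine = engine;
    this.hashAudio = hashAudio;
  }

  async analyzeAudio(
    userId: string,
    audioBuffer: Uint8Array,
    enrollmentId?: string,
  ): Promise<AnalysisResult> {
    // Same user + same audio + same enrollment yields the stored result.
    const key = `${userId}:${this.hashAudio(audioBuffer)}:${enrollmentId ?? ""}`;
    const cached = this.cache.get(key);
    if (cached) return cached;

    const features = await this.engine.preprocessAudio(audioBuffer);
    const confidence = await this.engine.detectSynthetic(features);
    const result: AnalysisResult = { verdict: toVerdict(confidence), confidence };

    if (enrollmentId !== undefined) {
      const embedding = await this.engine.generateEmbedding(features);
      result.similarity = await this.engine.matchVoice(embedding, enrollmentId);
    }
    this.cache.set(key, result);
    return result;
  }
}
```

A unit test could construct `new VoicePrintAnalyzer(new StubMlEngine(), hashFn)`, call `analyzeAudio` twice with the same bytes, and check that `detectCalls` stays at 1. Swapping in Option A, B, or C from the notes would then only require another class that implements `MlEngine`.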