# 17. Backend Router — SpamShield (Spam Detection & Call Analysis)

meta:
  id: shieldai-unified-restructure-17
  feature: shieldai-unified-restructure
  priority: P1
  depends_on: [shieldai-unified-restructure-12, shieldai-unified-restructure-13, shieldai-unified-restructure-14]
  tags: [backend, trpc, spamshield, ml, api]

objective:
- Build the tRPC router for SpamShield, the spam detection and call analysis service. Port all logic from `services/spamshield/` and `packages/api/src/routes/spamshield.routes.ts` into a unified `spamshield` router and service layer.

deliverables:
- `web/src/server/api/routers/spamshield.ts` — SpamShield router:
  - `spamshield.checkNumber` — `publicProcedure` (or API-key protected) checking phone number reputation
  - `spamshield.classifySMS` — `publicProcedure` classifying SMS text as spam/ham
  - `spamshield.classifyCall` — `publicProcedure` analyzing call metadata for spam likelihood
  - `spamshield.getRules` — `protectedProcedure` returning the user's spam rules
  - `spamshield.createRule` — `protectedProcedure` creating a custom spam rule
  - `spamshield.deleteRule` — `protectedProcedure` deleting a rule
  - `spamshield.submitFeedback` — `protectedProcedure` submitting false positive/negative feedback
  - `spamshield.getStats` — `protectedProcedure` returning spam detection statistics
- `web/src/server/services/spamshield.service.ts` — Core business logic:
  - `checkNumberReputation(phoneNumber)` — query Hiya/Truecaller/other reputation APIs
  - `classifySMS(text)` — run BERT-based spam classification
  - `classifyCall(metadata)` — run rule engine + ML model on call data
  - `createRule(userId, ruleType, pattern, action)` — save custom rule
  - `applyRules(userId, phoneNumber, text?)` — evaluate custom rules against input
  - `submitFeedback(userId, phoneNumber, isSpam, feedbackType)` — log feedback for model retraining
  - `getStats(userId, period?)` — aggregate detection stats
- `web/src/server/services/spamshield/ml.engine.ts` — ML inference:
  - `classifyTextBERT(text)` — BERT model inference for SMS spam
  - `extractFeatures(metadata)` — feature extraction for call analysis
  - `ruleEngine(rules, input)` — evaluate user-defined and global rules
- `web/src/server/services/spamshield/reputation.api.ts` — External reputation lookups:
  - `lookupHiya(phoneNumber)` — Hiya API
  - `lookupTruecaller(phoneNumber)` — Truecaller API
  - `lookupInternalDB(phoneNumber)` — query cached reputation scores

steps:
1. Create `web/src/server/api/routers/spamshield.ts`.
2. Define Zod schemas:
   - `checkNumberSchema`: `phoneNumber: z.string()` (E.164 format validation)
   - `classifySMSSchema`: `text: z.string().max(2000)`
   - `classifyCallSchema`: `callerNumber: z.string()`, `duration: z.number().optional()`, `timeOfDay: z.number().optional()`
   - `createRuleSchema`: `ruleType: z.enum([...])`, `pattern: z.string()`, `action: z.enum([...])`, `priority: z.number().default(0)`
   - `feedbackSchema`: `phoneNumber: z.string()`, `isSpam: z.boolean()`, `feedbackType: z.enum([...])`
3. Implement router procedures:
   - Number reputation check (may be called by the extension or mobile apps)
   - SMS and call classification
   - Rule CRUD with user scoping
   - Feedback submission
4. Create `web/src/server/services/spamshield.service.ts`:
   - Port from `services/spamshield/src/`
   - Implement number normalization (E.164)
   - Implement reputation caching (Redis or in-memory with TTL)
5. Create ML engine:
   - `classifyTextBERT`: placeholder for the BERT model. If it is not available in JS, create a Python bridge or use a pre-trained ONNX model.
   - `extractFeatures`: derive features from call metadata (time patterns, area code, duration)
   - `ruleEngine`: evaluate regex patterns, area code blocks, prefix blocks, reputation scores
6. Create reputation API module:
   - Implement a circuit breaker for external APIs (reference legacy `services/spamshield/test/circuit-breaker.test.ts`)
   - Cache results in DB or Redis for 24 hours
   - Fall back to the internal database if external APIs fail
7. Implement audit logging:
   - Every classification decision is logged to the `AuditLog` table
   - Include input, output, confidence, model version, timestamp
8. Wire router into `web/src/server/api/root.ts`.
9. Write unit tests with mocked ML engine and reputation APIs.

tests:
- Unit: `checkNumberReputation` normalizes the phone number and queries APIs with circuit breaker
- Unit: `classifySMS` returns spam/ham with confidence
- Unit: `ruleEngine` evaluates custom rules correctly
- Unit: `submitFeedback` creates feedback record
- Unit: Audit logging captures all classification decisions
- Integration: tRPC `checkNumber` returns reputation for valid E.164 number

acceptance_criteria:
- [ ] Phone numbers are normalized to E.164 before processing
- [ ] Number reputation checks query external APIs with circuit breaker and caching
- [ ] SMS classification returns spam/ham verdict with confidence score
- [ ] Call analysis evaluates rules and ML model
- [ ] Users can create, list, and delete custom spam rules
- [ ] Feedback submissions are logged for model improvement
- [ ] All classification decisions are audit-logged
- [ ] Stats endpoint returns aggregated detection metrics per user

validation:
- Call `spamshield.checkNumber` with a test phone number → verify reputation response
- Call `spamshield.classifySMS` with known spam text → verify high spam score
- Create a custom rule and verify it blocks matching numbers
- Submit feedback and verify record created in DB
- Run `cd web && pnpm test` for SpamShield unit tests

notes:
- Reference legacy: `services/spamshield/src/`, `packages/api/src/routes/spamshield.routes.ts`
- The BERT model for SMS classification may require Python. Use the same approach as VoicePrint: pluggable ML engine with Python bridge or ONNX.
- Hiya and Truecaller APIs require commercial agreements. For development, mock these or use free alternatives like NumVerify.
- The `checkNumber` endpoint may receive high traffic from the browser extension. Ensure it is rate-limited and cached aggressively.
- Consider adding a global spam database that accumulates feedback from all users (anonymized) to improve detection.
- The rule engine should support both user-specific rules and global admin rules.
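
Step 4 and the first acceptance criterion require E.164 normalization before any lookup. A minimal sketch of what such a helper could look like follows. The name `normalizeToE164`, its signature, and the default country code `1` are assumptions, not taken from the legacy service. Input without a leading `+` or `00` is treated as a national number with no trunk prefix, which is a simplification; a production port would more likely rely on a dedicated library such as libphonenumber-js.

```typescript
// Hypothetical helper: name, signature, and default country code are assumptions.
function normalizeToE164(raw: string, defaultCountryCode = "1"): string | null {
  const trimmed = raw.trim();
  const hasPlus = trimmed.startsWith("+");

  // Keep digits only; spaces, dashes, dots, and parentheses are dropped.
  let digits = trimmed.replace(/\D/g, "");

  if (!hasPlus && digits.startsWith("00")) {
    // The international call prefix "00" is treated like a leading "+".
    digits = digits.slice(2);
  } else if (!hasPlus) {
    // Assumed to be a national number: prepend the default country code.
    digits = defaultCountryCode + digits;
  }

  // E.164 allows at most 15 digits and the first digit must be non-zero.
  return /^[1-9]\d{1,14}$/.test(digits) ? `+${digits}` : null;
}
```

Returning `null` for unusable input lets the router reject the request at the boundary instead of passing a malformed number to the reputation APIs.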
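
Step 5 describes `ruleEngine` as evaluating regex patterns, area code blocks, and prefix blocks. The sketch below shows one possible shape for that evaluation. The `ruleType` and `action` values, the `SpamRule` and `RuleInput` shapes, the NANP-only area code check, and the "higher priority wins" ordering are all assumptions, since the spec leaves the enums unspecified. Reputation-score rules are omitted for brevity.

```typescript
// Illustrative only: the spec does not define these enum values or shapes.
type RuleType = "regex" | "areaCode" | "prefix";
type RuleAction = "block" | "allow" | "flag";

interface SpamRule {
  id: string;
  ruleType: RuleType;
  pattern: string;
  action: RuleAction;
  priority: number; // assumption: a higher number is evaluated first
}

interface RuleInput {
  phoneNumber: string; // expected in E.164, e.g. "+14155552671"
  text?: string; // SMS body, when available
}

interface RuleMatch {
  ruleId: string;
  action: RuleAction;
}

function ruleMatches(rule: SpamRule, input: RuleInput): boolean {
  switch (rule.ruleType) {
    case "regex":
      if (input.text === undefined) return false;
      try {
        return new RegExp(rule.pattern, "i").test(input.text);
      } catch {
        return false; // an invalid user-supplied pattern never matches
      }
    case "prefix":
      return input.phoneNumber.startsWith(rule.pattern);
    case "areaCode":
      // Assumes NANP numbers: "+1" followed by a three-digit area code.
      return (
        input.phoneNumber.startsWith("+1") &&
        input.phoneNumber.slice(2, 5) === rule.pattern
      );
    default:
      return false;
  }
}

// Returns the first matching rule in descending priority order, or null.
function ruleEngine(rules: SpamRule[], input: RuleInput): RuleMatch | null {
  const ordered = [...rules].sort((a, b) => b.priority - a.priority);
  for (const rule of ordered) {
    if (ruleMatches(rule, input)) {
      return { ruleId: rule.id, action: rule.action };
    }
  }
  return null;
}
```

Global admin rules and user rules could be merged into the same `rules` array before the call. Because user-supplied regex patterns can be expensive to evaluate, it may be worth bounding or validating them in `createRule` before they reach the engine.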
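
Step 6 combines three behaviours for reputation lookups: a circuit breaker around external APIs, a 24-hour cache, and a fallback to the internal database. The following sketch shows how they might fit together using an in-memory cache. `TtlCache`, `CircuitBreaker`, `createReputationLookup`, the failure threshold of 3, and the 30-second cool-down are hypothetical; the legacy `circuit-breaker.test.ts` should be treated as the authority on the expected behaviour.

```typescript
// Illustrative sketch: class names, thresholds, and the injectable clock are assumptions.
type Clock = () => number;

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: Clock;

  constructor(ttlMs: number, now: Clock = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly resetAfterMs: number;
  private readonly now: Clock;

  constructor(failureThreshold = 3, resetAfterMs = 30_000, now: Clock = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  isOpen(): boolean {
    if (this.openedAt === null) return false;
    // Once the cool-down has elapsed, one trial call is let through (half-open).
    return this.now() - this.openedAt < this.resetAfterMs;
  }

  async call<T>(primary: () => Promise<T>, fallback: () => Promise<T>): Promise<T> {
    if (this.isOpen()) return fallback();
    try {
      const result = await primary();
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      return fallback();
    }
  }
}

interface Reputation {
  score: number;
  source: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Wires cache -> breaker(external) -> internal fallback into one lookup function.
function createReputationLookup(
  external: (phone: string) => Promise<Reputation>,
  internal: (phone: string) => Promise<Reputation>,
  now: Clock = Date.now,
): (phone: string) => Promise<Reputation> {
  const cache = new TtlCache<Reputation>(DAY_MS, now);
  const breaker = new CircuitBreaker(3, 30_000, now);
  return async (phone) => {
    const cached = cache.get(phone);
    if (cached) return cached;
    const result = await breaker.call(() => external(phone), () => internal(phone));
    cache.set(phone, result);
    return result;
  };
}
```

This sketch caches fallback results for the full TTL as well. A real implementation may prefer a shorter TTL for fallback answers so that fresh external data is picked up once the breaker closes, and would swap `TtlCache` for Redis if the cache must be shared across instances.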