17. Backend Router — SpamShield (Spam Detection & Call Analysis)
meta:
- id: shieldai-unified-restructure-17
- feature: shieldai-unified-restructure
- priority: P1
- depends_on: [shieldai-unified-restructure-12, shieldai-unified-restructure-13, shieldai-unified-restructure-14]
- tags: [backend, trpc, spamshield, ml, api]
objective:
- Build the tRPC router for SpamShield, the spam detection and call analysis service. Port all logic from `services/spamshield/` and `packages/api/src/routes/spamshield.routes.ts` into a unified `spamshield` router and service layer.
deliverables:
- `web/src/server/api/routers/spamshield.ts` (SpamShield router):
  - `spamshield.checkNumber`: `publicProcedure` (or API-key protected) checking phone number reputation
  - `spamshield.classifySMS`: `publicProcedure` classifying SMS text as spam/ham
  - `spamshield.classifyCall`: `publicProcedure` analyzing call metadata for spam likelihood
  - `spamshield.getRules`: `protectedProcedure` returning the user's spam rules
  - `spamshield.createRule`: `protectedProcedure` creating a custom spam rule
  - `spamshield.deleteRule`: `protectedProcedure` deleting a rule
  - `spamshield.submitFeedback`: `protectedProcedure` submitting false positive/negative feedback
  - `spamshield.getStats`: `protectedProcedure` returning spam detection statistics
- `web/src/server/services/spamshield.service.ts` (core business logic):
  - `checkNumberReputation(phoneNumber)`: query Hiya/Truecaller/other reputation APIs
  - `classifySMS(text)`: run BERT-based spam classification
  - `classifyCall(metadata)`: run rule engine + ML model on call data
  - `createRule(userId, ruleType, pattern, action)`: save custom rule
  - `applyRules(userId, phoneNumber, text?)`: evaluate custom rules against input
  - `submitFeedback(userId, phoneNumber, isSpam, feedbackType)`: log feedback for model retraining
  - `getStats(userId, period?)`: aggregate detection stats
- `web/src/server/services/spamshield/ml.engine.ts` (ML inference):
  - `classifyTextBERT(text)`: BERT model inference for SMS spam
  - `extractFeatures(metadata)`: feature extraction for call analysis
  - `ruleEngine(rules, input)`: evaluate user-defined and global rules
- `web/src/server/services/spamshield/reputation.api.ts` (external reputation lookups):
  - `lookupHiya(phoneNumber)`: Hiya API
  - `lookupTruecaller(phoneNumber)`: Truecaller API
  - `lookupInternalDB(phoneNumber)`: query cached reputation scores
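As a rough illustration of the `extractFeatures(metadata)` deliverable, the sketch below derives a few features from call metadata. The field names mirror the `classifyCall` inputs used elsewhere in this spec; the units (duration in seconds, `timeOfDay` as an hour from 0 to 23), the thresholds, and the `CallFeatures` shape are assumptions for illustration only.

```typescript
// Hypothetical shapes; units and thresholds are assumptions, not legacy behavior.
interface CallMetadata {
  callerNumber: string; // expected in E.164, e.g. "+14155552671"
  duration?: number; // assumed to be seconds
  timeOfDay?: number; // assumed to be the local hour, 0-23
}

interface CallFeatures {
  areaCode: string | null; // only extracted for NANP ("+1") numbers
  isShortCall: boolean;
  isOffHours: boolean;
}

function extractFeatures(metadata: CallMetadata): CallFeatures {
  // NANP numbers are "+1", a 3-digit area code, then 7 subscriber digits.
  const nanp = /^\+1(\d{3})\d{7}$/.exec(metadata.callerNumber);
  return {
    areaCode: nanp?.[1] ?? null,
    // Assumed threshold: calls under 10 seconds count as "short".
    isShortCall: metadata.duration !== undefined && metadata.duration < 10,
    // Assumed window: before 08:00 or from 21:00 onward counts as off-hours.
    isOffHours:
      metadata.timeOfDay !== undefined &&
      (metadata.timeOfDay < 8 || metadata.timeOfDay >= 21),
  };
}
```

Missing optional fields produce `false` flags rather than errors, so the downstream model always receives a complete feature object.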
steps:
- Create `web/src/server/api/routers/spamshield.ts`.
- Define Zod schemas:
  - `checkNumberSchema`: `phoneNumber: z.string()` (E.164 format validation)
  - `classifySMSSchema`: `text: z.string().max(2000)`
  - `classifyCallSchema`: `callerNumber: z.string()`, `duration: z.number().optional()`, `timeOfDay: z.number().optional()`
  - `createRuleSchema`: `ruleType: z.enum([...])`, `pattern: z.string()`, `action: z.enum([...])`, `priority: z.number().default(0)`
  - `feedbackSchema`: `phoneNumber: z.string()`, `isSpam: z.boolean()`, `feedbackType: z.enum([...])`
- Implement router procedures:
  - Number reputation check (may be called by extension or mobile apps)
  - SMS and call classification
  - Rule CRUD with user scoping
  - Feedback submission
- Create `web/src/server/services/spamshield.service.ts`:
  - Port from `services/spamshield/src/`
  - Implement number normalization (E.164)
  - Implement reputation caching (Redis or in-memory with TTL)
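The normalization and caching items in this step could look roughly like the sketch below. Both `normalizeToE164` and `TtlCache` are hypothetical names, the default country code of `"1"` is an assumption, and the cache is the in-memory variant only. The normalizer is deliberately naive (for example, it does not detect a national number that already carries its country code), so a real port would likely rely on a dedicated phone-number library or the legacy logic instead.

```typescript
// Hypothetical helpers: names, defaults, and the default country code are
// assumptions for illustration, not the legacy implementation.

/** Normalize a raw phone string to E.164, or return null if that fails. */
function normalizeToE164(raw: string, defaultCountryCode = "1"): string | null {
  const trimmed = raw.trim();
  const hasPlus = trimmed.startsWith("+");
  let digits = trimmed.replace(/\D/g, ""); // drop spaces, dashes, parentheses
  if (!hasPlus) {
    digits = digits.startsWith("00")
      ? digits.slice(2) // an international "00" prefix behaves like "+"
      : defaultCountryCode + digits.replace(/^0+/, ""); // national number
  }
  const candidate = "+" + digits;
  // E.164: a plus sign, a non-zero first digit, at most 15 digits overall.
  return /^\+[1-9]\d{1,14}$/.test(candidate) ? candidate : null;
}

/** Minimal in-memory cache whose entries expire after a fixed TTL. */
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so that tests can control time.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Example: cache reputation scores keyed by the normalized number for 24 hours.
const reputationCache = new TtlCache<number>(24 * 60 * 60 * 1000);
const key = normalizeToE164("(415) 555-2671");
if (key !== null) reputationCache.set(key, 0.12);
```

Keying the cache on the normalized number means that differently formatted inputs for the same caller share one cache entry.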
- Create ML engine:
  - `classifyTextBERT`: placeholder for the BERT model. If the model is not available in JS, create a Python bridge or use a pre-trained ONNX model.
  - `extractFeatures`: derive features from call metadata (time patterns, area code, duration)
  - `ruleEngine`: evaluate regex patterns, area code blocks, prefix blocks, reputation scores
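One possible shape for `ruleEngine` is sketched below. The four rule kinds come from this step; everything else is an assumption for illustration: the `SpamRule` and `RuleInput` shapes, the `"allow"` and `"flag"` actions, the `scope` field, and the policy that the highest priority wins with user rules beating global rules on ties.

```typescript
// Hypothetical types; only the four rule kinds are taken from the spec.
type RuleType = "regex" | "area_code" | "prefix" | "reputation";
type RuleAction = "block" | "allow" | "flag"; // "allow" and "flag" are assumed

interface SpamRule {
  id: string;
  scope: "user" | "global";
  ruleType: RuleType;
  pattern: string; // regex source, area code, number prefix, or minimum score
  action: RuleAction;
  priority: number;
}

interface RuleInput {
  phoneNumber: string; // E.164, e.g. "+14155552671"
  text?: string;
  reputationScore?: number; // assumed range: 0 (clean) to 1 (spam)
}

function ruleMatches(rule: SpamRule, input: RuleInput): boolean {
  switch (rule.ruleType) {
    case "regex":
      try {
        // Match against the SMS text when present, else the phone number.
        return new RegExp(rule.pattern, "i").test(input.text ?? input.phoneNumber);
      } catch {
        return false; // an invalid user-supplied pattern never matches
      }
    case "area_code":
      // Assumes NANP numbers: "+1" followed by the area code.
      return input.phoneNumber.startsWith("+1" + rule.pattern);
    case "prefix":
      return input.phoneNumber.startsWith(rule.pattern);
    case "reputation":
      return (
        input.reputationScore !== undefined &&
        input.reputationScore >= Number(rule.pattern)
      );
    default:
      return false;
  }
}

/** Return the first matching rule in precedence order, or null if none match. */
function ruleEngine(rules: SpamRule[], input: RuleInput): SpamRule | null {
  const ordered = [...rules].sort((a, b) => {
    if (b.priority !== a.priority) return b.priority - a.priority;
    if (a.scope === b.scope) return 0;
    return a.scope === "user" ? -1 : 1; // user rules win ties
  });
  return ordered.find((rule) => ruleMatches(rule, input)) ?? null;
}
```

Because user-supplied regular expressions run on the server, a real implementation would probably also bound pattern length or execution time; that is outside this sketch.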
- Create reputation API module:
  - Implement a circuit breaker for external APIs (reference legacy `services/spamshield/test/circuit-breaker.test.ts`)
  - Cache results in DB or Redis for 24 hours
  - Fall back to the internal database if external APIs fail
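The circuit breaker and fallback in this step might be structured as in the sketch below. It is not derived from the legacy test file: the `CircuitBreaker` class, the threshold and timeout defaults, and the `lookupWithFallback` helper are all hypothetical, and the half-open state here lets every request through until one of them succeeds or fails.

```typescript
type BreakerState = "closed" | "open" | "half_open";

// Hypothetical breaker; defaults are assumptions, not legacy values.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 3, resetTimeoutMs = 30_000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  /** True if a request to the external API may be attempted right now. */
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half_open"; // timeout elapsed: let trial requests through
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial request, or too many consecutive failures, opens the circuit.
    if (this.state === "half_open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}

/** Try the external lookup unless the circuit is open; otherwise use the fallback. */
async function lookupWithFallback<T>(
  breaker: CircuitBreaker,
  external: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  if (!breaker.canRequest()) return fallback();
  try {
    const result = await external();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure();
    return fallback();
  }
}
```

In this arrangement `external` would wrap something like `lookupHiya(phoneNumber)` and `fallback` would wrap `lookupInternalDB(phoneNumber)`, with one breaker instance per external provider so that an outage at one does not block the other.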
- Implement audit logging:
  - Every classification decision is logged to the `AuditLog` table
  - Include input, output, confidence, model version, timestamp
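A minimal sketch of the audit step follows. The five logged fields are the ones listed above; the `AuditLogEntry` shape, the `action` and `userId` columns, the `Verdict` type, and the `AuditSink` callback (standing in for whatever writes to the `AuditLog` table) are assumptions for illustration.

```typescript
// Hypothetical shapes; the real AuditLog columns may differ.
interface Verdict {
  label: "spam" | "ham";
  confidence: number;
  modelVersion: string;
}

interface AuditLogEntry {
  action: "checkNumber" | "classifySMS" | "classifyCall";
  userId: string | null; // null for anonymous public calls
  input: string; // JSON-serialized request payload
  output: string;
  confidence: number;
  modelVersion: string;
  timestamp: string; // ISO 8601
}

type AuditSink = (entry: AuditLogEntry) => Promise<void>;

function buildAuditEntry(
  action: AuditLogEntry["action"],
  userId: string | null,
  input: unknown,
  verdict: Verdict,
  at: Date = new Date(),
): AuditLogEntry {
  return {
    action,
    userId,
    input: JSON.stringify(input),
    output: verdict.label,
    confidence: verdict.confidence,
    modelVersion: verdict.modelVersion,
    timestamp: at.toISOString(),
  };
}

/** Run a classifier and record its decision before returning the verdict. */
async function classifyWithAudit<I>(
  action: AuditLogEntry["action"],
  userId: string | null,
  input: I,
  classify: (input: I) => Promise<Verdict>,
  sink: AuditSink,
): Promise<Verdict> {
  const verdict = await classify(input);
  await sink(buildAuditEntry(action, userId, input, verdict));
  return verdict;
}
```

Routing every procedure through a wrapper like this keeps the "all classification decisions are audit-logged" criterion in one place instead of relying on each procedure to remember it.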
- Wire router into `web/src/server/api/root.ts`.
- Write unit tests with mocked ML engine and reputation APIs.
tests:
- Unit: `checkNumberReputation` normalizes phone and queries APIs with circuit breaker
- Unit: `classifySMS` returns spam/ham with confidence
- Unit: `ruleEngine` evaluates custom rules correctly
- Unit: `submitFeedback` creates feedback record
- Unit: Audit logging captures all classification decisions
- Integration: tRPC `checkNumber` returns reputation for valid E.164 number
acceptance_criteria:
- Phone numbers are normalized to E.164 before processing
- Number reputation checks query external APIs with circuit breaker and caching
- SMS classification returns spam/ham verdict with confidence score
- Call analysis evaluates rules and ML model
- Users can create, list, and delete custom spam rules
- Feedback submissions are logged for model improvement
- All classification decisions are audit-logged
- Stats endpoint returns aggregated detection metrics per user
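To make the last criterion concrete, the sketch below aggregates per-user detection metrics in memory. The `DetectionRecord` and `DetectionStats` shapes and the chosen metrics are assumptions; in the real service this would more likely be a grouped database query over stored classification results.

```typescript
// Hypothetical record and result shapes, for illustration only.
interface DetectionRecord {
  userId: string;
  kind: "sms" | "call" | "number";
  verdict: "spam" | "ham";
  timestamp: number; // epoch milliseconds
}

interface DetectionStats {
  total: number;
  spam: number;
  ham: number;
  spamRate: number; // spam / total, or 0 when there are no records
  byKind: Record<string, number>;
}

/** Aggregate one user's detections, optionally limited to records since `sinceMs`. */
function aggregateStats(
  records: DetectionRecord[],
  userId: string,
  sinceMs = 0,
): DetectionStats {
  const stats: DetectionStats = { total: 0, spam: 0, ham: 0, spamRate: 0, byKind: {} };
  for (const record of records) {
    if (record.userId !== userId || record.timestamp < sinceMs) continue;
    stats.total += 1;
    stats[record.verdict] += 1;
    stats.byKind[record.kind] = (stats.byKind[record.kind] ?? 0) + 1;
  }
  stats.spamRate = stats.total === 0 ? 0 : stats.spam / stats.total;
  return stats;
}
```

The optional `sinceMs` argument is one way to express the `period?` parameter of `getStats(userId, period?)`; the spec does not say how the period is encoded.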
validation:
- Call `spamshield.checkNumber` with a test phone number → verify reputation response
- Call `spamshield.classifySMS` with known spam text → verify high spam score
- Create a custom rule and verify it blocks matching numbers
- Submit feedback and verify record created in DB
- Run `cd web && pnpm test` for SpamShield unit tests
notes:
- Reference legacy: `services/spamshield/src/`, `packages/api/src/routes/spamshield.routes.ts`
- The BERT model for SMS classification may require Python. Use the same approach as VoicePrint: pluggable ML engine with Python bridge or ONNX.
- Hiya and Truecaller APIs require commercial agreements. For development, mock these or use free alternatives like NumVerify.
- The `checkNumber` endpoint may receive high traffic from the browser extension. Ensure it is rate-limited and cached aggressively.
- Consider adding a global spam database that accumulates feedback from all users (anonymized) to improve detection.
- The rule engine should support both user-specific rules and global admin rules.
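For the rate-limiting note on `checkNumber`, one common option is a per-client token bucket, sketched below. The class names, the capacity and refill values, and the choice of client key (IP address, API key, or extension ID) are assumptions. This in-memory version only works within a single process and never evicts idle buckets, so a multi-instance deployment would need a shared store such as Redis.

```typescript
// Hypothetical limiter; capacity and refill rate are illustrative values.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so short bursts are allowed
    this.lastRefill = now();
  }

  /** Take one token if available; refills proportionally to elapsed time. */
  tryConsume(): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

class RateLimiter {
  private readonly buckets = new Map<string, TokenBucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
  }

  /** True if the client identified by `clientKey` may make a request now. */
  allow(clientKey: string): boolean {
    let bucket = this.buckets.get(clientKey);
    if (!bucket) {
      bucket = new TokenBucket(this.capacity, this.refillPerSecond, this.now);
      this.buckets.set(clientKey, bucket);
    }
    return bucket.tryConsume();
  }
}

// Example: bursts of up to 20 lookups per client, refilled at 5 per second.
const checkNumberLimiter = new RateLimiter(20, 5);
```

A procedure would call `checkNumberLimiter.allow(clientKey)` before doing any lookup and reject the request when it returns `false`, ideally after consulting the cache so that cached answers do not consume tokens.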