Kordant/tasks/core-services-implementation/06-spamshield-reputation.md
2026-05-31 22:03:18 -04:00

# 06. Twilio Lookup and Phone Reputation API Integration
meta:
id: core-services-06
feature: core-services-implementation
priority: P1
depends_on: [core-services-01]
tags: [spamshield, reputation, twilio, caller-id, api-integration, table-stakes]
objective:
- Replace the stub Hiya/Truecaller lookup functions that return `{ score: 0, isSpam: false }` with real phone reputation API integrations (Twilio Lookup) and integrate results into the spam classification pipeline.
deliverables:
- Twilio Lookup API client for caller name, line type, and carrier info
- Phone reputation scoring system with caching
- Integration with existing rule engine (reputation score augments rule-based decisions)
- STIR/SHAKEN attestation verification (if carrier partnership available)
- Rate-limited, cost-aware API usage
steps:
1. Sign up for Twilio account and enable Lookup API at https://www.twilio.com/lookup
2. Add `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN` to `.env.example`
3. Create `spamshield/twilio.client.ts`:
- `lookupPhone(phoneNumber, type?)` — caller name, line type (mobile/landline/VoIP), carrier
- `lookupReputation(phoneNumber)` — spam risk score, call volume, report counts
- `verifyStirShaken(phoneNumber)` — attestation level (A/B/C) if available
4. Replace stub `lookupHiya()` and `lookupTruecaller()` in `reputation.api.ts` with real Twilio calls
5. Implement reputation scoring algorithm:
- Twilio spam risk score (0-100) mapped to internal confidence (0.0-1.0)
- Line type weighting: VoIP = higher risk, landline = lower risk
- Carrier reputation: known spam carriers = +20 risk
- STIR/SHAKEN attestation: Full attestation (A) = -30 risk, Gateway attestation (C) = +20 risk
6. Cache results in Redis with 24h TTL (phone numbers don't change reputation rapidly)
7. Wire into `spamshield.service.ts`:
- Before rule engine, check reputation
- If reputation confidence > 0.7, block immediately
- If reputation confidence is 0.4-0.7, flag for review
- If reputation confidence < 0.4, proceed to rule engine + ML classifier
8. Add cost tracking: $0.004-$0.03 per lookup, track monthly usage per user
9. Implement fallback: if Twilio API fails, use internal rule engine only (graceful degradation)
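
Steps 5 and 7 can be sketched as a pure scoring function plus a routing function. This is a minimal sketch, not the final implementation: the `ReputationSignals` shape and all function names are hypothetical, and several weights are assumptions (VoIP +10 and landline -10 to give the 20-point gap, mobile and partial attestation B as neutral). Only the +20 carrier penalty, the A/C attestation weights, and the 0.4/0.7 thresholds come from the steps above.

```typescript
// Hypothetical input shape; field names are illustrative, not Twilio's response schema.
type LineType = "mobile" | "landline" | "voip" | "unknown";
type Attestation = "A" | "B" | "C" | null; // null when STIR/SHAKEN data is unavailable
type Decision = "block" | "review" | "continue";

interface ReputationSignals {
  spamRiskScore: number; // 0-100, as returned by the reputation lookup
  lineType: LineType;
  knownSpamCarrier: boolean;
  attestation: Attestation;
}

// Assumption: split the 20-point VoIP/landline gap evenly around a neutral mobile line.
const LINE_TYPE_ADJUSTMENT: Record<LineType, number> = {
  voip: 10,
  landline: -10,
  mobile: 0,
  unknown: 0,
};

// A and C weights come from step 5; B = 0 is an assumption.
const ATTESTATION_ADJUSTMENT: Record<"A" | "B" | "C", number> = { A: -30, B: 0, C: 20 };

const KNOWN_SPAM_CARRIER_ADJUSTMENT = 20;

// Combine all signals into a 0-100 risk value, clamp it, and scale to 0.0-1.0.
function reputationConfidence(signals: ReputationSignals): number {
  let risk = signals.spamRiskScore;
  risk += LINE_TYPE_ADJUSTMENT[signals.lineType];
  if (signals.knownSpamCarrier) risk += KNOWN_SPAM_CARRIER_ADJUSTMENT;
  if (signals.attestation !== null) risk += ATTESTATION_ADJUSTMENT[signals.attestation];
  const clamped = Math.min(100, Math.max(0, risk));
  return clamped / 100;
}

// Thresholds from step 7: > 0.7 block, 0.4-0.7 review, < 0.4 continue to rule engine + ML.
function routeByReputation(confidence: number): Decision {
  if (confidence > 0.7) return "block";
  if (confidence >= 0.4) return "review";
  return "continue";
}
```

Keeping the scoring free of I/O means the unit tests can exercise it with plain objects, without mocking the Twilio client.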
tests:
- Unit: Mock Twilio API responses, verify reputation scoring algorithm
- Integration: Test with real Twilio Lookup API using known spam number
- E2E: Submit spam check for phone number → verify reputation lookup → get classification result
acceptance_criteria:
- [ ] `lookupPhone()` makes real HTTP request to Twilio Lookup API
- [ ] Reputation scores are calculated from real Twilio data (not hardcoded zeros)
- [ ] High-risk numbers (reputation confidence > 0.7) trigger automatic block without rule/ML processing
- [ ] Cache stores reputation results for 24 hours, reducing API costs
- [ ] Twilio API failures gracefully fall back to rule engine (no crashes)
- [ ] Cost tracking records each lookup for billing analytics
- [ ] STIR/SHAKEN attestation is checked and factored into score when available
- [ ] VoIP lines get +20 risk weighting compared to landline
- [ ] Internal DB cache (`lookupInternalDB`) is checked before Twilio API call
- [ ] Rate limits: max 100 lookups/minute per user to prevent abuse
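
The rate-limit criterion could be met with a per-user sliding window. The sketch below is in-memory and single-process, with a hypothetical class name. A multi-instance deployment would likely need a shared store such as Redis, which is an assumption outside this task's stated scope.

```typescript
// Sliding-window limiter: allows at most `limit` lookups per user in any `windowMs` span.
class LookupRateLimiter {
  private hits = new Map<string, number[]>(); // userId -> timestamps (ms) of recent lookups
  private limit: number;
  private windowMs: number;

  constructor(limit: number = 100, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the lookup if the user is under the limit, false otherwise.
  // `nowMs` is injectable so tests do not depend on the wall clock.
  tryAcquire(userId: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(userId, recent);
    return true;
  }
}
```

A rejected call records nothing, so a user who keeps retrying while limited does not extend their own lockout.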
validation:
- Run `vitest run spamshield.service.test.ts` — all tests pass
- Manual: Check reputation for known spam number (e.g., reported robocall number), verify high score
- Check cache: Redis `GET spamshield:reputation:+15551234567` returns cached result
- Monitor cost: Database shows lookup usage per user per month
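
The cache check above implies a key layout of `spamshield:reputation:<E.164 number>`. A minimal sketch of the cache-then-lookup path with graceful degradation follows. The `ReputationCache` interface, `getReputation`, and the injected `fetchConfidence` callback are all hypothetical names; a thin wrapper around a Redis client could satisfy the interface, and the callback stands in for the Twilio-backed scoring call.

```typescript
// Minimal cache contract (assumption); a Redis client wrapper could implement it.
interface ReputationCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

interface CachedReputation {
  confidence: number; // 0.0-1.0
  source: "cache" | "api";
}

const REPUTATION_TTL_SECONDS = 24 * 60 * 60; // 24h TTL from step 6

function reputationKey(phoneNumber: string): string {
  return `spamshield:reputation:${phoneNumber}`;
}

// Returns the cached or freshly fetched confidence, or null when the lookup fails.
// A null result tells the caller to fall back to the rule engine only.
async function getReputation(
  phoneNumber: string,
  cache: ReputationCache,
  fetchConfidence: (phoneNumber: string) => Promise<number>,
): Promise<CachedReputation | null> {
  const key = reputationKey(phoneNumber);
  try {
    const hit = await cache.get(key);
    if (hit !== null) {
      return { confidence: Number(hit), source: "cache" };
    }
    const confidence = await fetchConfidence(phoneNumber);
    await cache.set(key, String(confidence), REPUTATION_TTL_SECONDS);
    return { confidence, source: "api" };
  } catch {
    // API or cache failure: degrade gracefully instead of throwing.
    return null;
  }
}

// In-memory stand-in for Redis, for local experiments only (it ignores TTL expiry).
class MemoryCache implements ReputationCache {
  private store = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }
  async set(key: string, value: string, _ttlSeconds: number): Promise<void> {
    this.store.set(key, value);
  }
}
```

Returning `null` rather than a zero score keeps "no data" distinguishable from "looked up and clean", which the old stub's `{ score: 0, isSpam: false }` could not express.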
notes:
- Twilio Lookup costs $0.004 per basic lookup, $0.03 per advanced lookup (reputation, caller name)
- At 100 lookups/user/month, cost is $0.40-$3.00 per user, which is manageable at $12+/mo ARPU
- Hiya and Truecaller have proprietary APIs but require carrier partnerships — Twilio is the best consumer-accessible option
- STIR/SHAKEN requires telecom partner for full attestation data — implement if/when partnership exists
- The existing rule engine (`ruleEngine()`) is functional — reputation augments it, doesn't replace it
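
The cost figures in these notes can be checked with a small tracker. This is a sketch under assumptions: the class and function names are hypothetical, storage is in-memory rather than the database the validation section expects, and months are bucketed in UTC. Prices are held as integer mills (thousandths of a dollar) so that sub-cent amounts add up without floating-point drift.

```typescript
// Prices from the notes, in mills: $0.004 = 4, $0.03 = 30.
const PRICE_MILLS = { basic: 4, advanced: 30 } as const;
type LookupKind = keyof typeof PRICE_MILLS;

class LookupCostTracker {
  private usage = new Map<string, number>(); // "userId:YYYY-MM" -> accumulated mills

  // Record one lookup against the user's bucket for the (UTC) month of `when`.
  record(userId: string, kind: LookupKind, when: Date = new Date()): void {
    const key = `${userId}:${when.toISOString().slice(0, 7)}`;
    this.usage.set(key, (this.usage.get(key) ?? 0) + PRICE_MILLS[kind]);
  }

  // Total cost in mills for a user in a given month, e.g. month = "2026-05".
  monthlyCostMills(userId: string, month: string): number {
    return this.usage.get(`${userId}:${month}`) ?? 0;
  }
}

// Format mills as dollars, rounded to whole cents for display.
function formatUsd(mills: number): string {
  return `$${(mills / 1000).toFixed(2)}`;
}
```

With 100 basic lookups in a month the tracker reports 400 mills, and with 100 advanced lookups 3000 mills, matching the $0.40 and $3.00 bounds above.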