From f8f90502fa6ea6d1b3c3bf62c683d66df8494b10 Mon Sep 17 00:00:00 2001
From: Michael Freno <michaelt.freno@gmail.com>
Date: Tue, 28 Apr 2026 13:58:27 -0400
Subject: [PATCH] Add ShieldAI technical architecture and implementation plan
 (FRE-4459)

- System overview with 3 core services: VoicePrint, DarkWatch, SpamShield
- Tech stack: TypeScript, Fastify, Next.js, PostgreSQL, Redis, Python/FastAPI
- Build vs buy decisions for each component
- 6-phase implementation timeline (24 weeks)
- Infrastructure, security, risk mitigation, and team estimates
- API surface definitions for all services
- Child issues created: FRE-4470 through FRE-4475

Co-Authored-By: Paperclip <noreply@paperclip.ing>
---
 plans/SHIELDAI-technical-architecture.md | 448 +++++++++++++++++++++++
 1 file changed, 448 insertions(+)
 create mode 100644 plans/SHIELDAI-technical-architecture.md

diff --git a/plans/SHIELDAI-technical-architecture.md b/plans/SHIELDAI-technical-architecture.md
new file mode 100644
index 0000000..c3644a8
--- /dev/null
+++ b/plans/SHIELDAI-technical-architecture.md
@@ -0,0 +1,448 @@
+# ShieldAI Technical Architecture & Implementation Plan
+
+## 1. System Overview
+
+ShieldAI is a multi-service SaaS platform with three core engines:
+
+1. **VoicePrint** — voice cloning detection and synthetic voice analysis
+2. **DarkWatch** — dark web exposure monitoring and alerting
+3. **SpamShield** — real-time spam call/text classification and blocking
+
+All three engines share a common platform layer (auth, billing, user management, notification system, API gateway).
+
+---
+
+## 2. High-Level Architecture
+
+```
+┌──────────────────────────────────────────────────────────┐
+│                    Client Apps                            │
+│  (Web Dashboard · Mobile App · CLI · Browser Extension)  │
+└──────────────────────┬───────────────────────────────────┘
+                       │ HTTPS / WSS
+┌──────────────────────▼───────────────────────────────────┐
+│                   API Gateway                             │
+│         (Rate limiting · Auth · Routing · Logging)        │
+└──┬──────────────┬──────────────┬──────────────┬──────────┘
+   │              │              │              │
+┌──▼─────┐   ┌────▼─────┐   ┌────▼─────┐   ┌────▼──────────┐
+│Users/  │   │VoicePrint│   │DarkWatch │   │ SpamShield    │
+│Billing │   │ Service  │   │ Service  │   │   Service     │
+└────────┘   └──────────┘   └──────────┘   └───────────────┘
+   │              │              │              │
+┌──▼──────────────▼──────────────▼──────────────▼──────────┐
+│                  Shared Infrastructure                    │
+│  (Message Queue · Cache · Object Store · ML Pipeline)    │
+└──────────────────────────────────────────────────────────┘
+```
+
+### Tech Stack
+
+| Layer | Technology | Rationale |
+|-------|-----------|-----------|
+| Language | TypeScript (Node.js) | Team velocity, shared codebase, strong ecosystem |
+| Framework | Fastify (API), Next.js (dashboard) | High request throughput (Fastify), SSR (Next.js), mature ecosystems |
+| Database | PostgreSQL + Prisma | Relational data, type safety, migrations |
+| Cache | Redis | Session, rate limits, real-time alert dedup |
+| Queue | BullMQ (Redis-backed) | Dark web scan jobs, voice analysis jobs |
+| Object Store | S3 / MinIO | Audio samples, reports, scan results |
+| ML Runtime | Python microservice (FastAPI) | Voice analysis models, spam classification |
+| Container | Docker + Docker Compose (dev), ECS Fargate or K8s/EKS (prod) | Portability, scaling |
+| Infra | Terraform + AWS (ECS/Fargate or EKS) | Cloud-native, auto-scaling |
+| CI/CD | GitHub Actions | Automated build, test, deploy |
+
+---
+
+## 3. VoicePrint Service — Voice Cloning Detection
+
+### 3.1 Architecture
+
+```
+┌──────────────┐     ┌──────────────┐     ┌─────────────────┐
+│  Audio In    │────▶│ Preprocessor │────▶│ ML Classifier   │
+│  (upload/    │     │ (VAD, NR,    │     │ (Synthetic vs   │
+│   live call) │     │  normalize)  │     │  Natural voice) │
+└──────────────┘     └──────────────┘     └────────┬────────┘
+                                                   │
+┌──────────────┐     ┌──────────────┐     ┌────────▼────────┐
+│  Alert/      │◀────│  Result      │◀────│  Voice          │
+│  Dashboard   │     │  Formatter   │     │  Fingerprint    │
+└──────────────┘     └──────────────┘     │  Matcher        │
+                                          └─────────────────┘
+```
+
+### 3.2 Components
+
+**Audio Preprocessor (Python)**
+- Voice Activity Detection (VAD): Silero VAD
+- Noise reduction: RNNoise, with the WebRTC noise suppression module as an alternative
+- Sample rate normalization to 16 kHz mono
+- Chunking for real-time streaming analysis
+
+**ML Classifier — Synthetic Voice Detection**
+- Primary model: fine-tuned **ECAPA-TDNN** (a speaker-embedding architecture, adapted here as a spoof classifier)
+- Secondary: **WaveNet-based** anomaly detector for artifacts in synthetic audio
+- Training data: ASVspoof 2019/2021 corpus + internal synthetic voice samples
+- Output: confidence score (0-1) that audio is synthetic/cloned
+- Threshold: configurable per tier (Plus: 0.7, Premium: 0.6)
+
+**Voice Fingerprint Matcher**
+- Enrollments: store speaker embeddings for registered family members
+- Cosine similarity matching against enrollment vault
+- New voice detection: "unrecognized speaker" alerts for incoming calls
+- Storage: FAISS index for fast approximate nearest neighbor search
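+
+The cosine-similarity match above can be written as follows. The symbols are illustrative assumptions, not names from this plan: `e` is the embedding of the incoming audio, `e_i` the enrolled embedding of family member `i`, and `tau` a hypothetical acceptance threshold.
+
+```latex
+s_i = \frac{\mathbf{e} \cdot \mathbf{e}_i}{\lVert \mathbf{e} \rVert \, \lVert \mathbf{e}_i \rVert},
+\qquad
+\text{recognized if } \max_i s_i \ge \tau, \text{ otherwise raise an unrecognized-speaker alert}
+```
+
+If embeddings are L2-normalized before indexing, the inner product equals the cosine similarity, so a FAISS inner-product index returns the same ranking.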
+
+**Real-Time Call Analysis (Premium)**
+- WebRTC-based audio stream interception
+- Sliding window analysis (5-second chunks, 1-second overlap)
+- WebSocket push for real-time alerts to client
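+
+A worked reading of the window parameters above, assuming that "1-second overlap" means consecutive windows share 1 s of audio and that the stream is already resampled to 16 kHz:
+
+```latex
+\begin{aligned}
+H &= W - O = 5\,\text{s} - 1\,\text{s} = 4\,\text{s} \\
+W \cdot 16000 &= 80000 \ \text{samples per window}, \qquad H \cdot 16000 = 64000 \ \text{samples per hop} \\
+N(T) &= \left\lfloor \frac{T - W}{H} \right\rfloor + 1, \qquad N(60\,\text{s}) = \left\lfloor \frac{55}{4} \right\rfloor + 1 = 14
+\end{aligned}
+```
+
+Under these assumptions the first score is available no earlier than 5 s into a call, and a new score follows every 4 s after that, plus inference time.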
+
+### 3.3 Build vs Buy
+
+| Component | Decision | Rationale |
+|-----------|----------|-----------|
+| Synthetic voice detection | **Build** (fine-tune open models) | Core IP, differentiator, ASVspoof models are open |
+| Voice fingerprinting | **Build** (ECAPA-TDNN + FAISS) | Well-understood, low cost at scale |
+| Real-time audio pipeline | **Build** (WebRTC + Python) | Tight integration with blocking engine |
+| Alternative API | **Sonix** or **Rev.ai** (fallback) | Use as secondary validation if needed; both are primarily transcription services, so confirm synthetic-voice detection support first |
+
+### 3.4 API Surface
+
+```
+POST   /api/v1/voiceprint/enroll          — Enroll a voice profile
+GET    /api/v1/voiceprint/enrollments     — List enrolled profiles
+DELETE /api/v1/voiceprint/enrollments/:id — Remove enrollment
+POST   /api/v1/voiceprint/analyze         — Upload audio for analysis
+WS     /api/v1/voiceprint/stream          — Real-time streaming analysis
+GET    /api/v1/voiceprint/results/:id     — Get analysis result
+POST   /api/v1/voiceprint/batch           — Batch analyze multiple files
+```
+
+---
+
+## 4. DarkWatch Service — Dark Web Monitoring
+
+### 4.1 Architecture
+
+```
+┌──────────────────────────────────────────────────────────────┐
+│                    DarkWatch Service                          │
+│                                                               │
+│  ┌─────────────┐   ┌─────────────┐   ┌────────────────────┐ │
+│  │  Scheduler  │──▶│  Data       │──▶│  Matching &        │ │
+│  │  (Cron/     │   │  Ingestion  │   │  Alert Pipeline    │ │
+│  │   Queue)    │   │  (APIs,     │   │  (Dedup, Severity, │ │
+│  └─────────────┘   │    Scrapers)│   │   Notification)    │ │
+│                    └─────────────┘   └────────────────────┘ │
+│                                                               │
+│  ┌─────────────┐   ┌─────────────┐   ┌────────────────────┐ │
+│  │  User       │   │  Exposure   │   │  Report            │ │
+│  │  Watch List │   │  Database   │   │  Generator         │ │
+│  │  Manager    │   │  (Indexed)  │   │  (PDF, Digest)     │ │
+│  └─────────────┘   └─────────────┘   └────────────────────┘ │
+└──────────────────────────────────────────────────────────────┘
+```
+
+### 4.2 Data Sources
+
+| Source | Type | Coverage | Cost Model | Tier |
+|--------|------|----------|------------|------|
+| **Have I Been Pwned (HIBP)** | API | Email, password breaches | Free (rate limited) / Paid API | All tiers |
+| **SecurityTrails** | API | DNS, domain exposures | ~$100/month | Plus, Premium |
+| **Censys** | API | Internet-wide scan data | ~$200/month | Premium |
+| **Dark web forums** | Scrapers/API | Phone numbers, SSN, emails | ~$500/month (aggregator) | Premium |
+| **Shodan** | API | IoT, exposed services | ~$250/month | Premium |
+| **Internal honeypots** | Build | Phone number exposure | Infrastructure cost | All tiers |
+
+### 4.3 Core Components
+
+**Watch List Manager**
+- Stores user-submitted identifiers: emails, phone numbers, SSN (hashed), home addresses
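+- Deduplication: SHA-256 hash of normalized identifiers (use a keyed hash such as HMAC-SHA-256 for low-entropy values like SSNs and phone numbers, which can be brute-forced when hashed without a secret)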
+- Tier-based limits: Basic (2 identifiers), Plus (10), Premium (unlimited)
+
+**Data Ingestion Pipeline**
+- Scheduled jobs (BullMQ cron): daily for Basic, hourly for Plus, real-time for Premium
+- Multi-source aggregation with fallback
+- Normalization layer: standardize formats across sources
+- Deduplication: content hash of exposure records
+
+**Matching Engine**
+- Exact match: email, phone number, SSN (last 4 digits for Basic, full hash for Premium)
+- Fuzzy match: name + address combinations for home title monitoring
+- Severity scoring: based on data type, recency, source reliability
+
+**Alert Pipeline**
+- Dedup window: 24 hours per exposure type
+- Severity levels: INFO (email in an old breach), WARNING (phone number in a recent exposure), CRITICAL (SSN + financial data)
+- Notification channels: email, push notification, SMS (Premium)
+- Alert fatigue protection: digest mode for INFO, immediate for WARNING+
+
+**Exposure Database**
+- PostgreSQL table with GIN index on identifier arrays
+- Time-series: track exposure history per user
+- Retention: 5 years for Premium, 1 year for Plus, 30 days for Basic
+
+### 4.4 Build vs Buy
+
+| Component | Decision | Rationale |
+|-----------|----------|-----------|
+| Data aggregation | **Buy** (APIs) | Faster time-to-market, battle-tested sources |
+| Matching engine | **Build** | Core logic, tier-specific rules, dedup |
+| Alert system | **Build** | Integrates with shared notification platform |
+| Honeypot network | **Build** | Differentiator, early detection for phone numbers |
+| Full alternative | **Identity1** or **WizIQ** API | Evaluate if build cost exceeds ~$2K/month |
+
+### 4.5 API Surface
+
+```
+POST   /api/v1/darkwatch/watchlist        — Add identifier to watch
+GET    /api/v1/darkwatch/watchlist        — List watched identifiers
+DELETE /api/v1/darkwatch/watchlist/:id    — Remove identifier
+POST   /api/v1/darkwatch/scan             — Trigger manual scan
+GET    /api/v1/darkwatch/exposures        — List user's exposures
+GET    /api/v1/darkwatch/exposures/:id    — Exposure detail
+GET    /api/v1/darkwatch/reports          — List scan reports
+POST   /api/v1/darkwatch/reports/generate — Generate PDF report
+GET    /api/v1/darkwatch/alerts           — List user's alerts
+PATCH  /api/v1/darkwatch/alerts/:id/read  — Mark alert as read
+```
+
+---
+
+## 5. SpamShield Service — Spam Call/Text Blocking
+
+### 5.1 Architecture
+
+```
+┌──────────────────────────────────────────────────────────┐
+│                   SpamShield Service                      │
+│                                                           │
+│  ┌─────────────┐  ┌─────────────┐  ┌──────────────────┐ │
+│  │  Ingestion  │─▶│  Feature    │─▶│  Classifier      │ │
+│  │  (Call/Text │  │  Extractor  │  │  (ML + Rules)    │ │
+│  │   Events)   │  │  (Metadata, │  │  (Random Forest  │ │
+│  └─────────────┘  │   Content)  │  │   + Rule Engine) │ │
+│                   └─────────────┘  └────────┬─────────┘ │
+│                                              │            │
+│  ┌─────────────┐  ┌─────────────┐  ┌────────▼─────────┐ │
+│  │  Action     │◀─│  Decision   │◀─│  Score           │ │
+│  │  Executor   │  │  Engine     │  │  Aggregator      │ │
+│  │  (Block,    │  │  (Threshold,│  │  (Multi-signal   │ │
+│  │   Flag,     │  │  Confidence)│  │   combination)   │ │
+│  │   Notify)   │  │             │  │                  │ │
+│  └─────────────┘  └─────────────┘  └──────────────────┘ │
+└──────────────────────────────────────────────────────────┘
+```
+
+### 5.2 Spam Detection Layers
+
+**Layer 1: Number Reputation (Rule-Based)**
+- Carrier CNAM lookup: identify business vs. personal numbers
+- Known spam databases: integration with Hiya, Truecaller API
+- Number age: new numbers (<30 days) flagged as suspicious
+- Call pattern analysis: high volume from single number = spam
+- Geographic anomaly: unexpected country/region for user
+
+**Layer 2: Content Classification (ML)**
+- SMS text classification: fine-tuned BERT model for spam vs. ham
+- Feature extraction: URL presence, emoji density, urgency keywords, sender ID
+- Confidence thresholds: score ≥ 0.85 auto-blocks, 0.60 ≤ score < 0.85 flags, anything below 0.60 is delivered normally
+- Continuous learning: user feedback (false positive/negative) retrains model
+
+**Layer 3: Behavioral Analysis**
+- Call frequency patterns: robo-dial detection (>5 calls/minute from same pool)
+- Time-of-day anomaly: unusual hours for user's timezone
+- Session analysis: short duration calls (<10s) = likely robo-call
+- VoIP detection: identify carrier type (VoIP origin raises the spam probability)
+
+**Layer 4: Community Intelligence**
+- Aggregated user reports: crowd-sourced spam number database
+- Weighted scoring: more reports = higher spam score
+- Decay function: older reports lose weight over time
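+
+One common way to combine the weighting and decay above is exponential decay with a half-life. This is an assumed form, not one fixed by this plan: `w_j` is a per-report weight, `t_j` the time of report `j`, and `h` a hypothetical half-life.
+
+```latex
+\begin{aligned}
+S(t) &= \sum_{j} w_j \, 2^{-(t - t_j)/h} \\
+h = 30\ \text{days} &: \quad \text{a report 60 days old contributes } w_j \cdot 2^{-2} = 0.25\, w_j
+\end{aligned}
+```
+
+With this form, many recent reports push the score up quickly, while a number that stops generating reports drifts back toward zero without manual cleanup.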
+
+### 5.3 Real-Time Blocking
+
+**Call Blocking**
+- Integration: SIP trunking or carrier API (Twilio, Plivo)
+- Flow: incoming call → API lookup → decision (<200ms) → block/flag/ring
+- Block action: send to voicemail with "AI-detected spam" greeting
+- Flag action: show "Likely Spam" on caller ID before answer
+- False positive recovery: one-tap "keep call" overrides for 30 days
+
+**Text Blocking**
+- Integration: SMPP gateway or carrier API
+- Flow: incoming SMS → content analysis → decision (<500ms) → block/flag
+- Block action: move to spam folder with preview
+- Flag action: show banner "Possible Spam" with swipe to keep
+
+### 5.4 Build vs Buy
+
+| Component | Decision | Rationale |
+|-----------|----------|-----------|
+| Number reputation | **Buy** (Hiya + Truecaller) | Established databases, hard to build from scratch |
+| Content classifier | **Build** (fine-tune BERT) | Domain-specific, continuous improvement |
+| Behavioral analysis | **Build** | Proprietary data advantage |
+| Call/text routing | **Buy** (Twilio/Plivo) | Carrier relationships, global coverage |
+| Community intelligence | **Build** | Network effect, differentiator |
+| Full alternative | **Syrrex** or **TollBridge** | Evaluate if integration complexity is too high |
+
+### 5.5 API Surface
+
+```
+POST   /api/v1/spamshield/calls/analyze      — Analyze incoming call
+POST   /api/v1/spamshield/sms/analyze        — Analyze incoming SMS
+GET    /api/v1/spamshield/history            — User's blocked/flagged history
+POST   /api/v1/spamshield/feedback           — Submit false positive/negative
+POST   /api/v1/spamshield/whitelist          — Add number to whitelist
+POST   /api/v1/spamshield/blacklist          — Add number to blacklist
+GET    /api/v1/spamshield/stats              — User's spam statistics
+WS     /api/v1/spamshield/realtime           — Real-time event stream
+```
+
+---
+
+## 6. Shared Platform Services
+
+### 6.1 Auth & User Management
+- NextAuth.js with email/password + OAuth (Google, Apple)
+- RBAC: user, family_admin, family_member, support
+- Family group management: unlimited members (Premium), up to 3 (Plus)
+
+### 6.2 Billing
+- Stripe subscription management
+- Tier-based feature gating via middleware
+- Usage tracking for free tier limits
+
+### 6.3 Notification System
+- Multi-channel: email (Resend), push (FCM/APNs), SMS (Twilio)
+- Template system with localization support
+- Alert dedup and rate limiting per user
+
+### 6.4 Analytics
+- PostHog for product analytics
+- Custom dashboards: detection rates, false positive rates, conversion funnels
+- Model performance monitoring: precision, recall, drift detection
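+
+For reference, the monitored metrics are defined from true positives (TP), false positives (FP), and false negatives (FN). The numeric example is illustrative only, not a measured result:
+
+```latex
+\begin{aligned}
+P &= \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2PR}{P + R} \\
+P = 0.95,\ R = 0.86 \;&\Rightarrow\; F_1 = \frac{2 \cdot 0.95 \cdot 0.86}{0.95 + 0.86} = \frac{1.634}{1.81} \approx 0.903
+\end{aligned}
+```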
+
+---
+
+## 7. Development Timeline
+
+### Phase 1: Foundation (Weeks 1-4)
+- [ ] Project scaffolding: monorepo (Turborepo), CI/CD pipeline
+- [ ] Auth service: user registration, login, family groups
+- [ ] Billing integration: Stripe subscriptions, tier gating
+- [ ] API gateway: routing, rate limiting, authentication middleware
+- [ ] Database schema: Prisma models, migrations
+- [ ] Notification service: email, push infrastructure
+
+### Phase 2: DarkWatch MVP (Weeks 5-8)
+- [ ] Watch list manager with CRUD API
+- [ ] HIBP API integration (first data source)
+- [ ] Matching engine: exact match for email/phone
+- [ ] Alert pipeline: email notifications for exposures
+- [ ] Dashboard: exposure list, watch list management
+- [ ] Manual scan trigger with job queue
+
+### Phase 3: SpamShield MVP (Weeks 9-12)
+- [ ] Number reputation integration (Hiya API)
+- [ ] SMS content classifier: train initial BERT model
+- [ ] Call analysis API with rule engine
+- [ ] Blocking/flagging action executor
+- [ ] User feedback loop: false positive/negative collection
+- [ ] Dashboard: spam history, whitelist/blacklist
+
+### Phase 4: VoicePrint MVP (Weeks 13-16)
+- [ ] Audio preprocessing pipeline
+- [ ] ECAPA-TDNN model training on ASVspoof data
+- [ ] Voice enrollment API with FAISS index
+- [ ] Batch audio analysis endpoint
+- [ ] Dashboard: enrollment management, analysis results
+- [ ] Synthetic voice detection accuracy benchmarking
+
+### Phase 5: Real-Time Features (Weeks 17-20)
+- [ ] Real-time call analysis via WebRTC
+- [ ] Streaming WebSocket alerts
+- [ ] DarkWatch automated scheduling (tier-based frequency)
+- [ ] SpamShield real-time call/text interception
+- [ ] Cross-service alert correlation
+
+### Phase 6: Beta & Launch (Weeks 21-24)
+- [ ] Beta testing with 100 users
+- [ ] Performance optimization: P99 latency targets
+- [ ] Mobile app (React Native or Tauri)
+- [ ] Documentation, onboarding flows
+- [ ] Production deployment, monitoring, alerting
+- [ ] Launch
+
+---
+
+## 8. Infrastructure & Deployment
+
+### 8.1 Environment Strategy
+- **Dev**: Docker Compose, local PostgreSQL/Redis
+- **Staging**: AWS ECS Fargate, RDS PostgreSQL, ElastiCache Redis
+- **Prod**: AWS ECS Fargate (or EKS if scaling demands), multi-AZ, auto-scaling
+
+### 8.2 Key Services
+| Service | Provider | Notes |
+|---------|----------|-------|
+| Compute | AWS ECS/Fargate | Container-based, auto-scale |
+| Database | AWS RDS PostgreSQL | Multi-AZ, automated backups |
+| Cache | AWS ElastiCache Redis | Cluster mode for BullMQ (queue prefixes need Redis hash tags, e.g. `{bull}`, so each queue's keys share one slot) |
+| Storage | AWS S3 | Audio files, reports |
+| CDN | CloudFront | Static assets, dashboard |
+| Email | Resend | Transactional emails |
+| SMS | Twilio | Alert notifications, call routing |
+| ML Training | AWS SageMaker | Model training jobs |
+| ML Inference | AWS Lambda / ECS | Real-time inference |
+| Monitoring | Datadog + Sentry | APM, error tracking |
+
+### 8.3 Security
+- All data encrypted at rest (AES-256) and in transit (TLS 1.3)
+- PII field-level encryption for SSN, phone numbers
+- SOC 2 Type II readiness from launch
+- Mitigations for the OWASP Top 10 risk categories
+- Regular penetration testing (quarterly)
+- GDPR + CCPA compliance for data retention
+
+---
+
+## 9. Key Technical Risks & Mitigations
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| Voice model false positives | User trust erosion | Start with "flag" not "block", user feedback loop |
+| Dark web data source reliability | Stale alerts | Multi-source redundancy, health monitoring |
+| Real-time latency SLA | Missed spam calls | Edge deployment, <200ms target with fallback |
+| Scalability of voice analysis | High compute cost | Async batch for non-real-time, GPU spot instances |
+| API dependency (Hiya, Twilio) | Service outage | Circuit breakers, fallback providers |
+| Model drift over time | Accuracy degradation | Monthly retraining pipeline, performance monitoring |
+
+---
+
+## 10. Team & Resource Estimates
+
+| Role | Headcount | Phase 1 | Phase 2-3 | Phase 4-6 |
+|------|-----------|---------|-----------|-----------|
+| Backend Engineer | 2 | ✓ | ✓ | ✓ |
+| ML Engineer | 1 | — | ✓ (Phase 3: SMS classifier training) | ✓ |
+| Frontend Engineer | 1 | ✓ | ✓ | ✓ |
+| DevOps/SRE | 1 | ✓ | ✓ | ✓ |
+| QA Engineer | 1 | — | ✓ | ✓ |
+
+**Estimated monthly burn (engineering only):** ~$45K for 6-person team
+
+---
+
+## 11. Success Metrics (Technical)
+
+| Metric | Target | Measurement |
+|--------|--------|-------------|
+| Voice detection accuracy (F1) | >0.90 | ASVspoof benchmark + internal test set |
+| Spam classification precision | >0.95 | User feedback, labeled test set |
+| Dark web scan coverage | >3 major sources | Data source inventory |
+| API P99 latency | <500ms (call-blocking decision path: <200ms) | Datadog APM |
+| False positive rate (calls) | <2% | User feedback tracking |
+| System uptime | >99.9% | Uptime monitoring |
+| Dark web alert freshness | <24h | Time from a source publishing an exposure to the user alert |
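+
+As a worked check on the uptime target, 99.9% availability leaves the following downtime budget (an upper bound, since the target is stated as greater than 99.9%):
+
+```latex
+\begin{aligned}
+(1 - 0.999) \cdot 30 \cdot 24 \cdot 60\ \text{min} &= 43.2\ \text{min per 30-day month} \\
+(1 - 0.999) \cdot 365 \cdot 24\ \text{h} &= 8.76\ \text{h per year}
+\end{aligned}
+```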