shortcomings
docs/PRODUCT-GAP-ANALYSIS.md
@@ -0,0 +1,428 @@
# Kordant: Product Gap Analysis & Path to Revenue

**Date:** May 31, 2026
**Scope:** What's functional vs. scaffolding, what's needed to ship, expected customer value, pricing

---

## Executive Summary

Kordant is a **well-architected platform with mostly scaffolding implementations**. The codebase has excellent structure: tRPC routers, Drizzle ORM schemas, service layers, job handlers, mobile apps, and a Rust queueing library (Honker). However, **none of the five core services deliver real value to a paying customer today**. The ML models return stub data, external API integrations are placeholders, and data sources return mock results.

**Bottom line:** You have the platform skeleton. You need to build the muscles.

| Service | Status | Lines of Code | Real Functionality | Effort to Ship |
|---------|--------|---------------|-------------------|----------------|
| **VoicePrint** | ❌ Pure scaffolding | ~240 | None; returns `isSynthetic: false` | 6–12 months, $100K–$500K |
| **DarkWatch** | ⚠️ Architecture only | ~500+ | Circuit breakers, alert pipeline, CRUD; no real API calls | 2–4 months, $20K–$50K |
| **SpamShield** | ⚠️ Rule engine only | ~400+ | Pattern matching works; ML & reputation APIs are stubs | 2–3 months, $15K–$40K |
| **HomeTitle** | ❌ Scaffolding | ~300 | Geocoding works; county records return mock data | 3–6 months, $30K–$80K |
| **RemoveBrokers** | ⚠️ Registry only | ~1,500+ | Broker registry (100+ entries); removal engine is placeholder | 2–4 months, $20K–$50K |
| **Billing** | ⚠️ Minimal | ~100 | Stripe client; no webhooks, proration, or checkout | 1–2 months, $10K–$20K |
| **Auth** | ✅ Functional | ~200 | JWT + bcrypt working | Done |

---

## 1. Current State: What Actually Works

### ✅ Functional (Shippable Today)

- **Authentication:** JWT signing/verification (jose), password hashing (bcrypt, 10 rounds). Solid implementation.
- **Database Schema:** Complete Drizzle ORM schemas for all 5 services, alerts, billing, subscriptions, audit logs.
- **tRPC API Layer:** Router scaffolding for all services with proper Zod schemas.
- **Dashboard UI:** Web dashboard with sidebar, threat score widget, alert feed, service widgets.
- **Mobile Apps:** iOS (SwiftUI) and Android (Compose) with ViewModels, Models, and navigation. Thin clients calling tRPC.
- **Browser Extension:** Chrome Manifest V3 extension shell.
- **Honker (Rust):** Queueing library for background jobs, FFI bindings.
- **Geocoding:** Google Maps API integration in HomeTitle (works if API key provided).
- **SpamShield Rule Engine:** Regex/area code/prefix pattern matching works.
- **DarkWatch Alert Pipeline:** Severity scoring, exposure deduplication, alert creation logic.
- **RemoveBrokers Registry:** 100+ broker entries with domains, removal URLs, categories.

### ❌ Not Functional (Scaffolding/Placeholders)

| Component | What It Does | What It Should Do |
|-----------|-------------|-------------------|
| **VoicePrint ML Engine** | Returns `{ isSynthetic: false, confidence: 1.0, score: 0.0 }` | Detect AI-generated voices in real-time |
| **VoicePrint Voice Matching** | Returns `{ similarity: 0, matched: false }` | Compare voice against enrolled templates |
| **VoicePrint Embedding** | Returns a zero-filled `Float64Array(256)` + SHA256 hash | Generate voice embeddings for enrollment |
| **DarkWatch Scan Engine** | Has circuit breaker structure but makes no actual API calls to HIBP, SecurityTrails, Censys, Shodan | Query real breach databases and dark web sources |
| **SpamShield ML Engine** | `classifyTextBERT()` returns `{ isSpam: false, confidence: 1.0 }` | Classify SMS/call text as spam using ML |
| **SpamShield Reputation API** | Hiya/Truecaller lookups return `{ score: 0, isSpam: false }` | Query real phone reputation databases |
| **HomeTitle County Scanner** | Returns `{ ownerName: "Unknown Owner", address: {} }` | Fetch real county deed records |
| **HomeTitle HTML Parser** | `parseDeedRecords()` logs "not yet implemented" and returns `null` | Parse county record HTML/JSON responses |
| **RemoveBrokers Removal Engine** | Returns `{ success: true, requestId: crypto.randomUUID() }` | Actually submit opt-out requests to brokers |
| **RemoveBrokers Email** | Returns `{ success: true }` without sending anything | Send opt-out emails to broker addresses |
| **RemoveBrokers Status Tracking** | Returns `{ status: "pending" }` always | Poll brokers for actual removal status |
| **Billing Webhooks** | No webhook handler implemented | Handle Stripe webhook events (checkout, renewal, cancel) |
| **Billing Checkout** | No checkout session creation | Create Stripe Checkout sessions for subscription plans |

---

## 2. Gap Analysis by Service

### VoicePrint: Voice Clone Detection

**Current:** 56-line ML engine, all stubs. No audio processing, no model loading, no inference.

**What's needed for a working product:**

1. **API-first approach (fastest):**
   - Integrate Microsoft Azure Voice Live API (~$0.016/min) for liveness detection
   - Integrate Pindrop or Daon API for passive detection
   - Estimated cost: $60K–$230K/year at scale

2. **Build in-house (differentiating but expensive):**
   - Deploy AASIST or RawNet2 model (open-source anti-spoofing models benchmarked on the ASVspoof challenges)
   - GPU inference infrastructure (NVIDIA T4/A10, $300–$800/mo per node)
   - Audio preprocessing pipeline (VAD, resampling, normalization)
   - Enrollment system (collect voice samples, generate embeddings)
   - Estimated cost: $840K–$1.25M Year 1

3. **Mobile integration:**
   - iOS: Integrate with CallKit for real-time call analysis
   - Android: Integrate with Telecom API
   - On-device inference for low-latency screening

**Market reality:** Voice clone detection is the most technically ambitious service. Hiya and Truecaller have carrier-level integrations you can't replicate without carrier partnerships. Your differentiator should be **consumer-facing analysis** (record a suspicious call → analyze → report), not real-time PSTN interception.

**Effort:** 6–12 months to MVP, $100K–$500K
**Revenue potential:** High. This is the most novel service in your suite. Competitors don't offer this to consumers.

---

### DarkWatch: Dark Web & Breach Monitoring

**Current:** Best-implemented service. Has scan engine architecture, circuit breakers, alert pipeline, watchlist CRUD, exposure dedup. Missing: actual API calls to external data sources.

**What's needed for a working product:**

1. **API integrations (the core work):**
   - **HaveIBeenPwned (HIBP):** Free tier (1,500 req/mo) → Paid ($3.50/mo individual). Check emails against breach database.
   - **SecurityTrails:** $49/mo Pro plan. DNS/WHOIS monitoring for domain exposure.
   - **Censys:** $79/mo Pro. Internet-wide scanning for exposed services.
   - **Shodan:** $299/mo Small Business. IoT/device exposure monitoring.
   - **Optional: Breachsense:** $199/mo for deep dark web scanning.

2. **Data pipeline:**
   - Implement actual `fetchWithCircuit()` calls to each API
   - Parse and normalize responses into your exposure schema
   - Schedule periodic scans (daily/weekly depending on tier)
   - WebSocket push for real-time scan progress

3. **Alert quality:**
   - Your severity scoring logic is already implemented
   - Add alert fatigue reduction (dedup, cooldown periods)
   - Email + push notification delivery
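
To make the data pipeline item concrete, here is a minimal sketch of how a circuit breaker can gate calls to one external source. It is not your existing `fetchWithCircuit()`; the class name `CircuitBreaker`, the helper `fetchWithBreaker`, and the threshold and cooldown parameters are all assumptions for illustration. Time is passed in explicitly so the behavior is easy to unit test.

```typescript
// Hypothetical sketch; the existing DarkWatch breaker and fetchWithCircuit() may differ.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.cooldownMs = cooldownMs; // time to stay open before probing again
  }

  // True when a request may be attempted at time `now` (milliseconds).
  canRequest(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow probing
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(now: number): void {
    this.failures += 1;
    // A failed probe, or too many consecutive failures, (re)opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}

// Wraps any async source call; `call` stands in for a real HTTP request.
async function fetchWithBreaker<T>(
  breaker: CircuitBreaker,
  call: () => Promise<T>,
  now: () => number = Date.now,
): Promise<T | null> {
  if (!breaker.canRequest(now())) return null; // skip this source while open
  try {
    const result = await call();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure(now());
    return null;
  }
}
```

One breaker instance per data source keeps an outage at one provider from stalling scans against the others; returning `null` lets the scan continue with partial results.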

**Monthly API costs at scale:** ~$500–$1,000/mo for base data sources
**Per-customer API cost:** ~$0.50–$2.00/mo (amortized across user base)

**Effort:** 2–4 months, $20K–$50K
**Revenue potential:** Medium. Crowded market (Aura, LifeLock, Experian all offer this). Must differentiate on alert quality and multi-source correlation.

---

### SpamShield: Spam Call/SMS Classification

**Current:** Rule engine works (pattern matching, area code, prefix). ML engine and reputation APIs are stubs.

**What's needed for a working product:**

1. **Reputation API integrations:**
   - **Hiya API:** Phone number reputation scoring. Carrier-level integration preferred but API available.
   - **Truecaller API:** Caller ID and spam labeling.
   - **Twilio Lookup API:** $0.004–$0.03 per lookup. Caller name + line type.
   - **STIR/SHAKEN verification:** Call authentication (requires telecom partner).

2. **ML text classification:**
   - Fine-tune lightweight model (DistilBERT or TinyBERT) on SMS spam dataset
   - Deploy as ONNX model for low-latency inference
   - Training data: Enron Spam Corpus, SMS Spam Collection, custom labeled data

3. **Mobile integration:**
   - iOS: CallKit integration for real-time caller screening
   - Android: Telecom API for call filtering
   - SMS interception (requires carrier permissions or SMS app integration)
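
As a rough illustration of how the existing rule engine could be combined with a reputation score once those APIs are live, the sketch below scores a call or SMS from area-code, prefix, and text-pattern rules, then takes the higher of the rule score and an optional external reputation value. The `Rule` shape, the weights, the 0.5 threshold, and the sample rules are assumptions, not the current SpamShield code.

```typescript
// Hypothetical rule shapes; the real SpamShield rule engine may model these differently.
type Rule =
  | { kind: "areaCode"; value: string; weight: number }
  | { kind: "prefix"; value: string; weight: number }
  | { kind: "textPattern"; pattern: RegExp; weight: number };

interface Verdict {
  score: number; // 0..1, higher means more likely spam
  isSpam: boolean;
  matched: string[]; // labels of the rules that fired
}

// Keep digits only and drop a leading US country code.
function normalizeNumber(raw: string): string {
  const digits = raw.replace(/\D/g, "");
  return digits.length === 11 && digits.startsWith("1") ? digits.slice(1) : digits;
}

function evaluate(
  rules: Rule[],
  phone: string,
  text: string,
  reputation: number | null = null, // external score in 0..1, if available
  threshold = 0.5,
): Verdict {
  const digits = normalizeNumber(phone);
  const matched: string[] = [];
  let score = 0;
  for (const rule of rules) {
    let hit: boolean;
    let label: string;
    if (rule.kind === "areaCode") {
      hit = digits.slice(0, 3) === rule.value;
      label = "areaCode:" + rule.value;
    } else if (rule.kind === "prefix") {
      hit = digits.startsWith(rule.value);
      label = "prefix:" + rule.value;
    } else {
      hit = rule.pattern.test(text);
      label = "text:" + rule.pattern.source;
    }
    if (hit) {
      score += rule.weight;
      matched.push(label);
    }
  }
  // Let a strong external reputation signal override a weak rule score.
  if (reputation !== null) score = Math.max(score, reputation);
  score = Math.min(1, score);
  return { score, isSpam: score >= threshold, matched };
}

// Illustrative rules only.
const exampleRules: Rule[] = [
  { kind: "areaCode", value: "900", weight: 0.6 },
  { kind: "prefix", value: "8005550", weight: 0.4 },
  { kind: "textPattern", pattern: /free prize|verify your account/i, weight: 0.5 },
];
```

Taking the maximum rather than a weighted sum keeps the rules useful on their own when the reputation lookup is unavailable or rate-limited.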

**Monthly API costs:** Twilio Lookup ~$0.004/lookup. Hiya/Truecaller custom pricing.
**Per-customer cost:** ~$1–$5/mo depending on call volume.

**Effort:** 2–3 months, $15K–$40K
**Revenue potential:** Medium-High. Hiya/Truecaller dominate at carrier level, but consumer-facing spam classification with AI detection is underserved.

---

### HomeTitle: Property Deed Monitoring

**Current:** Geocoding works (Google Maps API). County records fetcher returns mock data. HTML parser not implemented. Change detection logic is solid.

**What's needed for a working product:**

1. **County data sources (the hard part):**
   - **US county recorder APIs:** ~3,000 counties, each with different data formats
   - **Commercial aggregators:**
     - **Attom Data Solutions:** Property records API, ~$0.05–$0.10/record
     - **CoreLogic:** Property intelligence, enterprise pricing
     - **Black Knight (now part of ICE):** Property data, enterprise pricing
   - **County-specific APIs:** Some counties offer open data (e.g., Cook County IL, Harris County TX)
   - **Web scraping fallback:** Parse county recorder websites (fragile, requires maintenance)

2. **Monitoring pipeline:**
   - Initial property snapshot (owner, deed date, liens, tax info)
   - Periodic re-scan (weekly/monthly)
   - Change detection (your logic is already implemented)
   - Alert generation (ownership transfer, lien filing, tax change)

3. **Property verification:**
   - Geocoding → parcel ID lookup → county record fetch
   - Handle counties without digital records (mail-based requests)
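
The monitoring pipeline above boils down to comparing two snapshots of the same parcel. The sketch below shows one way that comparison could look; it is not your existing change detection logic, and the `PropertySnapshot` fields and change kinds are assumed names chosen to match the alert types listed above.

```typescript
// Hypothetical snapshot shape; field names are assumptions for illustration.
interface PropertySnapshot {
  parcelId: string;
  ownerName: string;
  deedDate: string; // ISO date of the latest recorded deed
  lienIds: string[];
  assessedTax: number;
}

type ChangeKind = "ownership_transfer" | "lien_filed" | "tax_change";

interface PropertyChange {
  kind: ChangeKind;
  detail: string;
}

// County records vary in casing and spacing, so compare normalized names.
function normalizeName(name: string): string {
  return name.trim().toUpperCase().replace(/\s+/g, " ");
}

function detectChanges(prev: PropertySnapshot, next: PropertySnapshot): PropertyChange[] {
  const changes: PropertyChange[] = [];
  if (normalizeName(prev.ownerName) !== normalizeName(next.ownerName)) {
    changes.push({
      kind: "ownership_transfer",
      detail: prev.ownerName + " -> " + next.ownerName,
    });
  }
  const knownLiens = new Set(prev.lienIds);
  for (const id of next.lienIds) {
    if (!knownLiens.has(id)) {
      changes.push({ kind: "lien_filed", detail: "new lien " + id });
    }
  }
  if (prev.assessedTax !== next.assessedTax) {
    changes.push({
      kind: "tax_change",
      detail: prev.assessedTax + " -> " + next.assessedTax,
    });
  }
  return changes;
}
```

Normalizing owner names before comparing matters for alert quality: a formatting difference between two scans of the same record should not raise an ownership-transfer alert.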

**Monthly data costs:** Attom ~$500–$5,000/mo depending on volume.
**Per-customer cost:** ~$2–$10/mo depending on scan frequency.

**Effort:** 3–6 months, $30K–$80K
**Revenue potential:** Medium, and a unique differentiator. No major competitor offers this in consumer identity protection. Real estate fraud is rising (FTC reports $1B+ in property fraud annually).

---

### RemoveBrokers: Data Broker Opt-Out

**Current:** Broker registry with 100+ entries (solid). Removal engine is a placeholder that returns mock request IDs. Email sending not implemented. Form submission not implemented.

**What's needed for a working product:**

1. **Automated removal engine:**
   - **Headless browser automation:** Playwright/Puppeteer for each broker's opt-out flow
   - **Form filling:** Dynamic form field detection and population
   - **CAPTCHA handling:** 2Captcha/AntiCaptcha integration ($0.001–$0.01/solve)
   - **Email verification:** Handle opt-out confirmation emails
   - **Physical mail:** Generate and mail opt-out letters for brokers requiring it

2. **Broker-specific adapters:**
   - Each of 100+ brokers has unique opt-out flow
   - Estimated 2–5 hours per broker to implement and test
   - Ongoing maintenance: 15–25% of scripts break per quarter

3. **Re-scan pipeline:**
   - Periodic re-scans to detect re-listings
   - Status tracking and progress reporting

4. **Competitor benchmark:**
   - **DeleteMe:** 300+ brokers, $139/yr individual, $329/yr family
   - **Kanary:** 400+ brokers, $132/yr individual, $264/yr family
   - **OneRep:** 200+ brokers, $180/yr individual

**Monthly operational costs:** Proxies ($1K–$6K), CAPTCHA solving ($3–$8/customer), compute ($1K–$5K)
**Per-customer cost:** ~$13–$53/year (high margin: 60–90%)

**Effort:** 2–4 months for initial 50 brokers, then incremental
**Revenue potential:** Medium. Competitive market but high margins. Your advantage: bundling with other services.

---

### Billing & Payments

**Current:** Stripe client initialized. No checkout, webhooks, or subscription management.

**What's needed:**

1. **Stripe Checkout integration:**
   - Create checkout sessions for each plan tier
   - Handle success/cancel redirects
   - Customer portal for subscription management

2. **Webhook handlers:**
   - `checkout.session.completed` → activate subscription
   - `invoice.payment_succeeded` → renew subscription
   - `invoice.payment_failed` → grace period, retry
   - `customer.subscription.deleted` → cancel access
   - `customer.subscription.updated` → tier changes

3. **Subscription management:**
   - Trial periods (14-day free trial)
   - Tier upgrades/downgrades with proration
   - Family plan member management
   - Grace period before suspension

4. **Plan structure:**
   - See pricing recommendations below

**Effort:** 1–2 months, $10K–$20K
**Revenue potential:** N/A (enables all revenue)

---

## 3. Recommended Build Priority

Based on effort vs. market differentiation:

| Priority | Service | Why | Effort | Revenue Impact |
|----------|---------|-----|--------|----------------|
| **1** | **RemoveBrokers** | Highest margin (60–90%), existing registry, clear competitor benchmark | 2–4 mo | Direct revenue, $11–$27/mo |
| **2** | **DarkWatch** | Best architecture, API integrations needed, table-stakes feature | 2–4 mo | Core retention driver |
| **3** | **SpamShield** | Rule engine works, needs reputation APIs + ML | 2–3 mo | Differentiation vs. competitors |
| **4** | **Billing** | Enables all revenue, must ship before paid plans | 1–2 mo | Revenue enabler |
| **5** | **HomeTitle** | Unique differentiator, but data sourcing is hard | 3–6 mo | Premium tier upsell |
| **6** | **VoicePrint** | Most novel, but highest effort and cost | 6–12 mo | Brand differentiation |

**Recommended MVP scope:** RemoveBrokers + DarkWatch + SpamShield + Billing = **5–8 months to first revenue**.

---

## 4. Pricing Strategy

### Recommended Plan Structure

| Plan | Monthly Price | Annual Price | Features |
|------|--------------|--------------|----------|
| **Shield** (Entry) | $12/mo | $9/mo ($108/yr) | DarkWatch (basic), SpamShield, RemoveBrokers (50 brokers) |
| **Guard** (Core) | $22/mo | $18/mo ($216/yr) | All Shield + DarkWatch (full), RemoveBrokers (200+), HomeTitle (1 property) |
| **Fortress** (Premium) | $35/mo | $29/mo ($348/yr) | All Guard + HomeTitle (3 properties), VoicePrint, priority alerts, family (2 adults) |
| **Family Fortress** | $45/mo | $39/mo ($468/yr) | All Fortress + 5 adults + unlimited children |
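
The plan table maps naturally onto a typed config that checkout code can read. The sketch below is one possible shape, not an existing module: the `Plan` interface, the `PLANS` constant, and the helper names are assumptions, and only the prices come from the table above.

```typescript
// Hypothetical plan config; prices are taken from the table above.
type PlanId = "shield" | "guard" | "fortress" | "family_fortress";

interface Plan {
  id: PlanId;
  name: string;
  monthlyUsd: number; // price when billed monthly
  annualPerMonthUsd: number; // effective monthly price when billed annually
}

const PLANS: Plan[] = [
  { id: "shield", name: "Shield", monthlyUsd: 12, annualPerMonthUsd: 9 },
  { id: "guard", name: "Guard", monthlyUsd: 22, annualPerMonthUsd: 18 },
  { id: "fortress", name: "Fortress", monthlyUsd: 35, annualPerMonthUsd: 29 },
  { id: "family_fortress", name: "Family Fortress", monthlyUsd: 45, annualPerMonthUsd: 39 },
];

// Total charged up front for an annual subscription.
function annualTotalUsd(plan: Plan): number {
  return plan.annualPerMonthUsd * 12;
}

// Fraction saved by paying annually instead of monthly.
function annualDiscount(plan: Plan): number {
  return 1 - plan.annualPerMonthUsd / plan.monthlyUsd;
}

function getPlan(id: PlanId): Plan {
  const plan = PLANS.find((p) => p.id === id);
  if (!plan) throw new Error("unknown plan: " + id);
  return plan;
}
```

Running `annualDiscount` across the tiers shows the annual discount shrinking from 25% on Shield to roughly 13% on Family Fortress, which may or may not be intentional and is worth a deliberate decision before launch.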

### Competitive Positioning

| Your Plan | vs. Aura | vs. DeleteMe | vs. LifeLock |
|-----------|----------|-------------|--------------|
| Shield ($12) | Matches Aura Individual | Slightly above DeleteMe ($11.58/mo); cheaper on annual billing ($9/mo) | Cheaper than LifeLock Select |
| Guard ($22) | Below Aura Family | N/A (DeleteMe is removal-only) | Below LifeLock Advantage |
| Fortress ($35) | Below Aura Family | N/A | Below LifeLock Ultimate |
| Family ($45) | Above Aura Family ($37) | Above DeleteMe Family ($27.42) | Above LifeLock Family |

### Expected Unit Economics

| Metric | Estimate | Basis |
|--------|----------|-------|
| **ARPU (blended)** | $18–$25/mo | Mix of tiers, family plans raise ARPU |
| **Gross margin** | 65–75% | API costs, infrastructure, support |
| **CAC (organic)** | $50–$150 | Content marketing, word-of-mouth |
| **CAC (paid)** | $200–$400 | Google Ads, affiliate |
| **Monthly churn (individual)** | 3–5% | Industry benchmark |
| **Monthly churn (family)** | 1–2% | Higher switching costs |
| **LTV (individual)** | $600–$1,200 | 24-mo avg life, $20 ARPU |
| **LTV (family)** | $1,600–$2,400 | 48-mo avg life, $45 ARPU |
| **LTV:CAC (organic)** | 4–8x | Healthy |
| **LTV:CAC (paid)** | 2–4x | Marginal |
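
To sanity-check the LTV rows, the sketch below computes revenue-based LTV two ways: from a stated average lifetime, and from monthly churn using the common approximation that expected lifetime is 1 / churn. The function names are illustrative, and the formulas are standard simplifications rather than the method used to produce the table, so differences from the table's ranges are a prompt to revisit the inputs.

```typescript
// Expected customer lifetime in months under constant monthly churn.
function lifetimeMonths(monthlyChurn: number): number {
  if (monthlyChurn <= 0 || monthlyChurn > 1) throw new Error("churn must be in (0, 1]");
  return 1 / monthlyChurn;
}

// Revenue LTV from a stated average lifetime.
function ltvFromLifetime(arpuUsd: number, months: number): number {
  return arpuUsd * months;
}

// Revenue LTV from churn; pass grossMargin < 1 for a margin-adjusted figure.
function ltvFromChurn(arpuUsd: number, monthlyChurn: number, grossMargin = 1): number {
  return arpuUsd * grossMargin * lifetimeMonths(monthlyChurn);
}

function ltvToCac(ltvUsd: number, cacUsd: number): number {
  return ltvUsd / cacUsd;
}

// Inputs from the table: individual plans at $20 ARPU, family plans at $45 ARPU.
const individualFromLifetime = ltvFromLifetime(20, 24); // 24-month life
const familyFromLifetime = ltvFromLifetime(45, 48); // 48-month life
const individualAtHighChurn = ltvFromChurn(20, 0.05); // 5% monthly churn
const familyAtHighChurn = ltvFromChurn(45, 0.02); // 2% monthly churn
```

With the table's own basis, a 24-month life at $20 ARPU gives $480, which sits below the stated $600–$1,200 range for individual plans, while 48 months at $45 gives $2,160, inside the family range.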

---

## 5. What Customers Actually Get (When Working)

### Monthly Value Perception

| Service | Customer Perceives | Actual Value |
|---------|-------------------|--------------|
| **VoicePrint** | "They detected a scam call cloning my daughter's voice" | Highest emotional impact, brand-defining |
| **DarkWatch** | "They found my email in a breach I didn't know about" | Table-stakes, expected by all competitors |
| **SpamShield** | "They blocked 47 spam calls this month" | Daily utility, high engagement |
| **HomeTitle** | "They caught a fraudulent lien on my house" | Highest dollar impact ($10K–$100K+ saved) |
| **RemoveBrokers** | "They removed me from 127 people-search sites" | Tangible progress, visible results |

### Customer Loyalty Drivers

1. **Alert quality (not quantity):** One perfect alert > 20 noise alerts. Your correlation engine should reduce false positives.
2. **Family plan lock-in:** Once a family is enrolled, switching costs are high.
3. **Visible progress:** RemoveBrokers dashboard showing "127/300 removed" drives retention.
4. **Crisis response:** When a major breach hits (e.g., Change Healthcare 2024), proactive alerts create loyalty spikes.
5. **Mobile app quality:** Credit lock/unlock, real-time alerts, one-tap actions.

---

## 6. Infrastructure Costs at Scale

### Monthly Fixed Costs

| Component | 100 Users | 1,000 Users | 10,000 Users |
|-----------|-----------|-------------|--------------|
| **Turso (SQLite)** | $0–$25 | $25–$100 | $100–$500 |
| **Redis** | $0–$15 | $15–$50 | $50–$200 |
| **HIBP API** | $0 (free tier) | $3.50 | $50+ |
| **SecurityTrails** | $49 | $49 | $249 |
| **Censys** | $79 | $79 | $299 |
| **Shodan** | $299 | $299 | $599 |
| **Twilio (SpamShield)** | $5–$20 | $20–$100 | $100–$500 |
| **Attom (HomeTitle)** | $500 | $1,000 | $5,000 |
| **Azure Voice Live** | $0 (dev) | $100–$500 | $500–$5,000 |
| **Proxies (RemoveBrokers)** | $100 | $500 | $2,000 |
| **CAPTCHA solving** | $10 | $50 | $200 |
| **Compute (SolidStart)** | $50 | $200 | $1,000 |
| **Total Fixed** | ~$1,200 | ~$2,500 | ~$10,000–$16,000 |

### Per-User Variable Costs

| Service | Cost/User/Month | Notes |
|---------|-----------------|-------|
| DarkWatch | $0.50–$2.00 | Amortized API costs |
| SpamShield | $1.00–$5.00 | Twilio lookups, ML inference |
| HomeTitle | $2.00–$10.00 | Attom record lookups |
| RemoveBrokers | $1.00–$4.00 | Proxy + CAPTCHA + compute |
| VoicePrint | $0.50–$3.00 | Azure API or GPU inference |
| **Total** | **$5.00–$24.00** | Depends on usage |

At $18/mo average ARPU and $10/mo variable cost, **gross margin is ~44%** at early scale (($18 − $10) / $18, before fixed costs). It improves to **65–75%** as API costs amortize and you negotiate volume pricing.
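
The margin arithmetic can be checked with a few lines. This is a sketch under simple assumptions: margin is (ARPU − variable cost − fixed cost per user) / ARPU, and the function names are illustrative. The 44% figure corresponds to leaving fixed costs out; passing the fixed totals from the table shows how much they matter at small user counts.

```typescript
// Margin per user; fixed costs are spread evenly across users when provided.
function grossMargin(
  arpuUsd: number,
  variableCostUsd: number,
  fixedMonthlyUsd = 0,
  users = 1,
): number {
  const fixedPerUser = fixedMonthlyUsd / users;
  return (arpuUsd - variableCostUsd - fixedPerUser) / arpuUsd;
}

// Highest per-user variable cost that still meets a target margin (fixed costs ignored).
function maxVariableCost(arpuUsd: number, targetMargin: number): number {
  return arpuUsd * (1 - targetMargin);
}

const variableOnly = grossMargin(18, 10); // the ~44% case from the text
const withFixedAt1000Users = grossMargin(18, 10, 2500, 1000); // adds $2.50/user of fixed cost
const costCeilingFor65 = maxVariableCost(18, 0.65); // variable cost needed for a 65% margin
```

At $18 ARPU, reaching the 65% end of the target range means bringing per-user variable cost down to about $6.30, which gives a concrete goal for caching and volume-pricing work.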

---

## 7. Risks & Mitigations

| Risk | Severity | Mitigation |
|------|----------|-----------|
| **VoicePrint never reaches production accuracy** | High | Ship API-first (Azure Voice Live), defer in-house model |
| **County data sourcing blocked** | High | Start with top 100 counties, use Attom API, expand gradually |
| **Broker scripts break constantly** | Medium | Budget 20% engineering time for maintenance, use AI-assisted scraping |
| **Competitor price war (Aura at $12/mo)** | Medium | Differentiate on VoicePrint + HomeTitle (unique features) |
| **API cost overruns** | Medium | Implement rate limits per tier, cache aggressively, negotiate volume pricing |
| **Regulatory compliance (FCRA, GLBA)** | High | Legal review before launch, SOC 2 Type II certification |
| **False positive alerts destroy trust** | High | Human review queue for low-confidence alerts, user feedback loop |

---

## 8. Timeline to Revenue

### Phase 1: Foundation (Months 1–2)
- ✅ Billing integration (Stripe Checkout + webhooks)
- ✅ RemoveBrokers: Implement removal for top 20 brokers
- ✅ DarkWatch: Connect HIBP + SecurityTrails APIs
- **Revenue:** None (beta testers only)

### Phase 2: MVP Launch (Months 3–4)
- ✅ RemoveBrokers: 50+ brokers with automated removal
- ✅ DarkWatch: Full scan pipeline with HIBP, SecurityTrails, Censys
- ✅ SpamShield: Reputation API integration (Twilio Lookup + Hiya)
- ✅ Billing: Free trial + paid plans
- **Revenue:** $12/mo Shield plan, target 100 beta users

### Phase 3: Growth (Months 5–8)
- ✅ RemoveBrokers: 100+ brokers
- ✅ DarkWatch: Add Shodan, Breachsense
- ✅ SpamShield: ML text classification (fine-tuned DistilBERT)
- ✅ HomeTitle: Top 50 counties + Attom API
- **Revenue:** All tiers, target 1,000 users

### Phase 4: Differentiation (Months 9–12)
- ✅ VoicePrint: Azure Voice Live API integration
- ✅ HomeTitle: 200+ counties
- ✅ Correlation engine: Cross-service threat scoring
- ✅ Mobile: Real-time call screening (iOS CallKit, Android Telecom)
- **Revenue:** Premium tiers, target 5,000 users

---

## 9. Bottom Line

**What you have:** A well-architected platform skeleton with auth, database, API layer, dashboard UI, mobile apps, and queueing infrastructure.

**What you need:** The actual data integrations and ML models that make the services useful. Currently, every core service returns mock data or stub responses.

**Fastest path to revenue (5–8 months):** RemoveBrokers + DarkWatch + SpamShield + Billing. These three services plus billing are achievable with API integrations and automation; no custom ML training is required.

**Total investment to MVP revenue:** ~$65K–$160K (the sum of the per-service estimates for these four items: engineering + API costs for 5–8 months).

**Expected pricing:** $12–$45/mo depending on tier. Industry benchmark ARPU: $18–$25/mo.

**Expected LTV:** $600–$2,400 depending on plan tier (individual vs. family).

**Key differentiator from competitors:** VoicePrint (voice clone detection) + HomeTitle (property monitoring). These are unique in the consumer market. But they're also the hardest to build.

**Strategic recommendation:** Ship RemoveBrokers + DarkWatch first (fastest ROI, proven demand), then layer in SpamShield + HomeTitle for differentiation, then VoicePrint as the crown jewel that justifies premium pricing.
@@ -0,0 +1,57 @@
# 01. Stripe Checkout, Webhooks, and Subscription State Management

meta:
  id: core-services-01
  feature: core-services-implementation
  priority: P0
  depends_on: []
  tags: [billing, stripe, payments, foundation]

objective:
- Enable paid customer acquisition by implementing the complete Stripe payment lifecycle: checkout, webhook handling, subscription state machine, and customer portal.

deliverables:
- Stripe Checkout session creation for each plan tier (Shield, Guard, Fortress, Family Fortress)
- Webhook endpoint handling all critical Stripe events
- Subscription state machine in Drizzle ORM
- Customer portal (billing settings, plan change, cancellation)
- Trial period support (14-day free trial)

steps:
1. Add `STRIPE_WEBHOOK_SECRET` to `.env.example` and validate in `env.ts`
2. Implement `createCheckoutSession(planId, customerId?, trial?)` in `billing.service.ts`
3. Implement `POST /api/webhooks/stripe` route handler with signature verification
4. Handle events: `checkout.session.completed`, `invoice.payment_succeeded`, `invoice.payment_failed`, `customer.subscription.updated`, `customer.subscription.deleted`
5. Update subscription record in database on each event (status, tier, period end, payment method)
6. Implement `createCustomerPortalSession(customerId)` for subscription management
7. Add trial logic: create subscription with `trial_end`, handle trial-to-paid transition
8. Add proration logic for tier upgrades/downgrades using `proration_behavior: 'create_prorations'`
9. Update billing router tRPC procedures: `getCheckoutUrl`, `getPortalUrl`, `getSubscription`, `cancelSubscription`
10. Add rate limiting on checkout creation (prevent abuse)
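
Steps 4 and 5 amount to a small state machine keyed on the Stripe event type, plus an idempotency check. The sketch below shows that core logic in isolation; it does not call Stripe, the status names and the `trial` flag are assumptions rather than Stripe fields, and the in-memory `Set` stands in for a unique constraint on processed event IDs in the database.

```typescript
// Hypothetical local subscription statuses; map them to your Drizzle schema as needed.
type SubStatus = "none" | "trialing" | "active" | "past_due" | "canceled";

type StripeEventType =
  | "checkout.session.completed"
  | "invoice.payment_succeeded"
  | "invoice.payment_failed"
  | "customer.subscription.updated"
  | "customer.subscription.deleted";

// Simplified event: real Stripe events carry `id` and `type`; `trial` is an assumption here.
interface WebhookEvent {
  id: string;
  type: StripeEventType;
  trial?: boolean;
}

interface SubscriptionRecord {
  status: SubStatus;
  processedEventIds: Set<string>; // stand-in for a unique index on event ID
}

function nextStatus(current: SubStatus, event: WebhookEvent): SubStatus {
  switch (event.type) {
    case "checkout.session.completed":
      return event.trial ? "trialing" : "active";
    case "invoice.payment_succeeded":
      return "active";
    case "invoice.payment_failed":
      return "past_due"; // grace period before suspension
    case "customer.subscription.deleted":
      return "canceled";
    default:
      return current; // e.g. subscription.updated: tier changes handled separately
  }
}

// Returns false when the event was already processed (duplicate delivery).
function applyEvent(record: SubscriptionRecord, event: WebhookEvent): boolean {
  if (record.processedEventIds.has(event.id)) return false;
  record.status = nextStatus(record.status, event);
  record.processedEventIds.add(event.id);
  return true;
}
```

Keeping `nextStatus` pure makes the unit tests in this task cheap to write: each webhook event becomes a table-driven case with no Stripe mock required. The sketch ignores out-of-order delivery, which a production handler would need to address, for example by comparing event timestamps.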

tests:
- Unit: Mock Stripe API responses, verify database state transitions for each webhook event
- Integration: Create real Stripe test-mode checkout session, complete payment, verify subscription activation
- E2E: End-to-end checkout flow from dashboard → Stripe Checkout → webhook → active subscription

acceptance_criteria:
- [ ] Customer can click "Subscribe" on Shield plan and be redirected to Stripe Checkout
- [ ] After successful payment, webhook creates active subscription record in database
- [ ] Customer can access billing portal to view invoices, change plan, or cancel
- [ ] Trial subscription auto-converts to paid or suspends after trial ends
- [ ] Tier upgrade creates prorated invoice and updates subscription immediately
- [ ] `invoice.payment_failed` sets grace period status and sends retry email
- [ ] All webhook events are idempotent (duplicate events don't create duplicate records)
- [ ] Webhook handler returns 200 for handled events, 400 for invalid signatures

validation:
- Run `stripe trigger checkout.session.completed` in Stripe CLI, verify database record
- Run `stripe trigger invoice.payment_failed`, verify grace period status
- Create test checkout, pay with `4242 4242 4242 4242`, verify active subscription in dashboard
- Run test suite: `vitest run billing.test.ts`

notes:
- Stripe API version: `2026-04-22.dahlia` (already configured in `stripe.ts`)
- Webhook endpoint must be publicly accessible for Stripe to deliver; use ngrok for local dev
- Store `stripeCustomerId` and `stripeSubscriptionId` on user/subscription records
- Use `stripe-webhook` event type in database for audit trail
@@ -0,0 +1,61 @@
# 02. Automated Removal Engine for Top 20 Data Brokers

meta:
  id: core-services-02
  feature: core-services-implementation
  priority: P0
  depends_on: [core-services-01]
  tags: [removebrokers, automation, playwright, scraping, revenue]

objective:
- Replace the `submitAutomatedRemoval()` stub that returns `crypto.randomUUID()` with a real Playwright-based browser automation that submits opt-out requests to the top 20 data brokers.

deliverables:
- Playwright-based removal engine in `removebrokers/removal.engine.ts`
- Per-broker adapter modules for top 20 brokers (Spokeo, Whitepages, MyLife, BeenVerified, etc.)
- CAPTCHA detection and graceful failure (manual fallback flow)
- Removal request status tracking with actual polling
- Email notification service integration for opt-out confirmations

steps:
1. Install Playwright as a runtime dependency (`npm install playwright`), since the removal engine drives browsers in production and not only in tests; add `@playwright/test` with `-D` only if adapter tests use its runner
2. Analyze opt-out flows for top 20 brokers from existing registry data
3. Create `removebrokers/adapters/` directory with one module per broker
4. Implement base adapter interface: `scanForProfile`, `submitOptOut`, `verifyRemoval`, `getStatus`
5. Implement adapters for each top 20 broker with navigation, form filling, and submission logic
6. Add proxy rotation support (BrightData or similar) to avoid IP blocking
7. Add stealth mode (e.g., `playwright-extra` with `puppeteer-extra-plugin-stealth`) to reduce detection
8. Implement `submitAutomatedRemoval()` to select correct adapter by broker ID and execute
9. Store actual request IDs from brokers (not generated UUIDs) in database
10. Implement `trackRemovalStatus()` with periodic re-scans for submitted requests
11. Integrate with notification service to email user when removal is confirmed
12. Add job handler for batch removal processing queue
13. Handle failures gracefully: retry with backoff, escalate to manual queue after 3 failures
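
Step 13 and the CAPTCHA deliverable can be captured in a small, browser-free policy function that the job handler consults after each failed attempt. This is a sketch under assumptions: the names `backoffDelayMs` and `decideAfterFailure`, the one-minute base delay, and the reading of "after 3 failures" as "the third failure escalates" are all illustrative choices.

```typescript
// What the job handler should do after a failed removal attempt.
type FailureDecision =
  | { kind: "retry"; delayMs: number }
  | { kind: "manual_queue"; reason: string };

// Exponential backoff: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 60000, maxMs = 3600000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// failureCount is the number of failures so far, including the one just observed.
function decideAfterFailure(
  failureCount: number,
  captchaDetected = false,
  maxFailures = 3,
): FailureDecision {
  if (captchaDetected) {
    // CAPTCHAs are flagged for manual processing instead of being retried blindly.
    return { kind: "manual_queue", reason: "captcha" };
  }
  if (failureCount >= maxFailures) {
    return { kind: "manual_queue", reason: "max_failures" };
  }
  return { kind: "retry", delayMs: backoffDelayMs(failureCount - 1) };
}
```

Because the policy has no dependency on Playwright, it can be unit tested exhaustively, and each adapter only needs to report whether it failed and whether it saw a CAPTCHA.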

tests:
- Unit: Mock Playwright browser, verify adapter navigation sequences
- Integration: Run adapter against real broker site in headful mode, verify opt-out form submission
- E2E: Full flow: add broker to watchlist → trigger removal → verify status progression

acceptance_criteria:
- [ ] Top 20 broker adapters are implemented and tested against live sites
- [ ] `submitAutomatedRemoval()` no longer returns mock UUIDs; it submits real opt-out requests
- [ ] Removal status tracks actual broker state (pending → submitted → completed/failed)
- [ ] Failed removals are retried 3 times with exponential backoff, then escalated to manual queue
- [ ] CAPTCHA challenges are detected and flagged for manual processing (not silently failing)
- [ ] Job queue processes removals asynchronously without blocking API responses
- [ ] User dashboard shows real removal progress per broker
- [ ] All Playwright browsers are properly closed after each session (no resource leaks)

validation:
- Run `vitest run removebrokers.service.test.ts`; all tests pass
- Manual test: Trigger removal for Spokeo, verify opt-out email received
- Check database: `removal_requests` table has real request IDs and actual status values
- Run removal job: `bun run job:removebrokers` processes queue without errors

notes:
- Broker sites change frequently; expect 15–25% of adapters to break per quarter
- Some brokers require email verification sent to the listed email (often outdated); flag these
- Start with brokers that have simple form-based opt-outs; defer email/physical mail brokers to Phase 3
- The existing broker registry in `broker.registry.ts` already has removal URLs; use these as starting points
- Budget $1K–$3K/mo for proxy infrastructure at scale
63
tasks/core-services-implementation/03-darkwatch-hibp.md
Normal file
@@ -0,0 +1,63 @@
|
||||
# 03. HaveIBeenPwned API Integration for Email Breach Monitoring
|
||||
|
||||
meta:
|
||||
id: core-services-03
|
||||
feature: core-services-implementation
|
||||
priority: P0
|
||||
depends_on: [core-services-01]
|
||||
tags: [darkwatch, hibp, breach-monitoring, api-integration, table-stakes]
|
||||
|
||||
objective:
|
||||
- Replace the stub `scanHIBP()` function in the DarkWatch scan engine with a real HaveIBeenPwned API integration that checks user emails against known breach databases and creates exposure records.
|
||||
|
||||
deliverables:
|
||||
- HIBP API client with k-anonymity support for password checking
|
||||
- Email breach lookup with result parsing and normalization
|
||||
- Exposure record creation in database with proper severity scoring
|
||||
- Alert generation via existing alert pipeline
|
||||
- Circuit breaker integration (already exists in scan engine)
|
||||
|
||||
steps:
|
||||
1. Obtain an HIBP API key at https://haveibeenpwned.com/API/Key (email breach search requires a paid subscription; confirm current pricing and rate limits before budgeting)
|
||||
2. Add `HIBP_API_KEY` to `.env.example` and validate in `env.ts`
|
||||
3. Create `darkwatch/hibp.client.ts` with functions:
|
||||
- `checkEmail(email): BreachResult[]` — query breachedaccount endpoint
|
||||
- `checkPassword(passwordHash): PwnedPasswordResult` — query pwnedpasswords endpoint using k-anonymity
|
||||
- `getBreaches(): Breach[]` — fetch breach metadata for caching
|
||||
4. Parse HIBP response: breach name, date, compromised data types, affected accounts
|
||||
5. Map data types to internal schema: email, password, phone, address, ssn, domain
|
||||
6. Calculate severity: critical if SSN, credit card, or password; warning if email/phone; info if username only
|
||||
7. Deduplicate against existing exposures using `identifierHash` (already implemented)
|
||||
8. Create exposure records via existing `processExposure()` pipeline
|
||||
9. Cache breach metadata in Redis (update daily) to reduce API calls
|
||||
10. Handle rate limits: HIBP limits depend on the subscription tier, so make the limit configurable and route all calls through a request queue
|
||||
11. Add comprehensive error handling for 404 (no breach), 429 (rate limit), 503 (service unavailable)
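The severity rule in step 6 could look roughly like the sketch below. The data-class strings are assumed to match the labels HIBP returns in its breach metadata and should be checked against a real response; passwords are treated as critical here to match the acceptance criteria. `severityForBreach` is a hypothetical helper name.

```typescript
type Severity = "critical" | "warning" | "info";

// Assumed HIBP data-class labels; verify against live API responses.
const CRITICAL_CLASSES = ["Social security numbers", "Credit cards", "Passwords"];
const WARNING_CLASSES = ["Email addresses", "Phone numbers"];

// True when at least one breached data class appears in the wanted list.
function containsAny(dataClasses: string[], wanted: string[]): boolean {
  return dataClasses.some((c) => wanted.indexOf(c) !== -1);
}

// The highest matching severity wins; anything unrecognized falls back to "info".
function severityForBreach(dataClasses: string[]): Severity {
  if (containsAny(dataClasses, CRITICAL_CLASSES)) return "critical";
  if (containsAny(dataClasses, WARNING_CLASSES)) return "warning";
  return "info";
}
```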
|
||||
|
||||
tests:
|
||||
- Unit: Mock HIBP API responses, verify parsing and severity scoring
|
||||
- Integration: Test with real HIBP API using an address with no known breaches (prefer HIBP's documented test accounts over `test@example.com`, which may itself appear in breach data)
|
||||
- E2E: Add email to watchlist → trigger scan → verify exposure records created for breached email
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] `scanHIBP(email)` makes real HTTP request to `https://haveibeenpwned.com/api/v3/breachedaccount/{email}`
|
||||
- [ ] Breached emails create exposure records with correct breach metadata (name, date, data classes)
|
||||
- [ ] Non-breached emails return empty results without creating false exposure records
|
||||
- [ ] Rate limits are respected (limit is configurable to match the HIBP subscription tier)
|
||||
- [ ] 404 responses are handled gracefully (no breach = no exposure, not an error)
|
||||
- [ ] Circuit breaker opens after 3 consecutive failures and stays open for 60 seconds
|
||||
- [ ] Exposure deduplication prevents duplicate records for same email + breach combination
|
||||
- [ ] Alerts are generated for critical exposures (SSN, password) via existing pipeline
|
||||
- [ ] HIBP breach metadata is cached in Redis and refreshed daily
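The status-code criteria above can be expressed as one mapping function. This is a sketch under assumptions: `interpretHibpStatus` and the `HibpOutcome` shape are hypothetical, the 2-second default wait is arbitrary, and only 5xx responses are assumed to count toward the circuit breaker.

```typescript
type HibpOutcome =
  | { kind: "breaches" }
  | { kind: "no_breach" }
  | { kind: "rate_limited"; retryAfterMs: number }
  | { kind: "failure"; countsTowardBreaker: boolean };

// retryAfterHeader is the raw "retry-after" header value in seconds, or null if absent.
function interpretHibpStatus(status: number, retryAfterHeader: string | null): HibpOutcome {
  if (status === 200) return { kind: "breaches" };
  // 404 means the account is not in any known breach: not an error, no exposure.
  if (status === 404) return { kind: "no_breach" };
  if (status === 429) {
    const seconds = retryAfterHeader === null ? NaN : parseInt(retryAfterHeader, 10);
    // Fall back to an assumed 2 s wait when the header is missing or unparsable.
    const retryAfterMs = isNaN(seconds) ? 2000 : seconds * 1000;
    return { kind: "rate_limited", retryAfterMs: retryAfterMs };
  }
  // Assumption: server-side errors trip the breaker, client errors (bad key, bad input) do not.
  return { kind: "failure", countsTowardBreaker: status >= 500 };
}
```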
|
||||
|
||||
validation:
|
||||
- Run `vitest run darkwatch.test.ts` — all tests pass
|
||||
- Manual: Add known breached email to watchlist, trigger scan, verify alert received
|
||||
- Check Redis: `GET hibp:breaches` returns cached breach metadata
|
||||
- Monitor logs: No `"not yet implemented"` or `console.log("[darkwatch] stub")` messages
|
||||
|
||||
notes:
|
||||
- HIBP email breach search requires a paid API key (budget a few dollars per month at the entry tier and verify current pricing); the Pwned Passwords range endpoint is free and needs no key
|
||||
- The k-anonymity password check sends only the first 5 hex characters of the SHA-1 hash; the full hash and the password itself never leave the server
|
||||
- The existing `scan.engine.ts` has the circuit breaker infrastructure — wire HIBP client into it
|
||||
- HIBP does NOT crawl dark web — it only aggregates known public breaches. For live dark web monitoring, add Breachsense later (Phase 3)
|
||||
- Consider subscribing to HIBP domain monitoring for enterprise upsell later
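The k-anonymity note can be made concrete with a short sketch. It assumes the caller has already computed the SHA-1 digest (as `checkPassword(passwordHash)` implies) and that the range response is a list of `SUFFIX:COUNT` lines; `splitSha1` and `countFromRangeResponse` are hypothetical helper names.

```typescript
// Split a SHA-1 hex digest into the 5-character prefix that is sent to the API
// and the 35-character suffix that is only compared locally.
function splitSha1(sha1Hex: string): { prefix: string; suffix: string } {
  const hex = sha1Hex.toUpperCase();
  if (!/^[0-9A-F]{40}$/.test(hex)) {
    throw new Error("expected a 40-character SHA-1 hex digest");
  }
  return { prefix: hex.slice(0, 5), suffix: hex.slice(5) };
}

// Scan the range response for our suffix; return its count, or 0 if it is absent.
function countFromRangeResponse(body: string, suffix: string): number {
  const lines = body.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const parts = lines[i].trim().split(":");
    if (parts.length === 2 && parts[0].toUpperCase() === suffix.toUpperCase()) {
      return parseInt(parts[1], 10);
    }
  }
  return 0;
}
```

Only `prefix` would appear in the outgoing request path; the match against `suffix` happens after the response comes back.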
|
||||
@@ -0,0 +1,75 @@
|
||||
# 04. SecurityTrails, Censys, and Shodan API Integrations
|
||||
|
||||
meta:
|
||||
id: core-services-04
|
||||
feature: core-services-implementation
|
||||
priority: P1
|
||||
depends_on: [core-services-03]
|
||||
tags: [darkwatch, securitytrails, censys, shodan, attack-surface, api-integration]
|
||||
|
||||
objective:
|
||||
- Integrate SecurityTrails, Censys, and Shodan APIs into the DarkWatch scan engine to monitor domain/IP attack surface exposure, complementing HIBP's breach monitoring.
|
||||
|
||||
deliverables:
|
||||
- SecurityTrails client for DNS/WHOIS monitoring and subdomain enumeration
|
||||
- Censys client for internet-wide host scanning and certificate transparency
|
||||
- Shodan client for IoT/device exposure and Tor exit node monitoring
|
||||
- Unified exposure normalization from all three sources
|
||||
- Cost-aware scanning (respect rate limits, cache aggressively)
|
||||
|
||||
steps:
|
||||
1. Sign up for API keys:
|
||||
- SecurityTrails: https://securitytrails.com (free: 50 req/mo, Pro: $49/mo)
|
||||
- Censys: https://censys.io (free: 250 req/mo, Pro: $79/mo)
|
||||
- Shodan: https://shodan.io (free: 1,250 results/mo, Small Biz: $299/mo)
|
||||
2. Add `SECURITYTRAILS_API_KEY`, `CENSYS_API_ID`, `CENSYS_API_SECRET`, `SHODAN_API_KEY` to `.env.example`
|
||||
3. Create `darkwatch/securitytrails.client.ts`:
|
||||
- `getDomainInfo(domain)` — WHOIS, DNS records, subdomains
|
||||
- `getSubdomains(domain)` — enumerate all subdomains
|
||||
- `getHistory(domain)` — historical DNS changes
|
||||
4. Create `darkwatch/censys.client.ts`:
|
||||
- `searchHosts(query)` — find exposed hosts by IP/domain
|
||||
- `getCertificates(domain)` — certificate transparency logs
|
||||
- `viewHost(ip)` — detailed host fingerprinting
|
||||
5. Create `darkwatch/shodan.client.ts`:
|
||||
- `search(query)` — search exposed devices and services
|
||||
- `host(ip)` — detailed host information
|
||||
- `count(query)` — result counts for monitoring
|
||||
6. Implement unified `processScanResult(source, result)` that normalizes all API responses to internal exposure schema
|
||||
7. Map exposure types:
|
||||
- SecurityTrails: subdomain exposure, DNS misconfiguration, domain hijacking risk
|
||||
- Censys: exposed services, outdated TLS, certificate issues
|
||||
- Shodan: open ports, default credentials, IoT exposure, Tor association
|
||||
8. Add tier-aware scan limits: Shield = HIBP only, Guard+ = all sources
|
||||
9. Implement intelligent caching: cache SecurityTrails DNS data for 24h, Censys/Shodan for 7d
|
||||
10. Add cost-per-scan tracking in database for billing/usage analytics
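Steps 8 and 9 (tier-aware sources and per-source cache lifetimes) might be captured in a small policy module like the one below. The tier names come from these task files; reading "Guard+" as Guard, Fortress, and Family is an assumption, as is the 24h TTL for HIBP. The helper names are hypothetical.

```typescript
type Source = "hibp" | "securitytrails" | "censys" | "shodan";
type Tier = "Shield" | "Guard" | "Fortress" | "Family";

const HOUR_MS = 60 * 60 * 1000;

// Cache lifetimes from step 9; the "hibp" entry is an assumed value.
const CACHE_TTL_MS: Record<Source, number> = {
  hibp: 24 * HOUR_MS,
  securitytrails: 24 * HOUR_MS,
  censys: 7 * 24 * HOUR_MS,
  shodan: 7 * 24 * HOUR_MS,
};

// A cached response is reusable while it is younger than its source's TTL.
function isCacheFresh(source: Source, cachedAtMs: number, nowMs: number): boolean {
  return nowMs - cachedAtMs < CACHE_TTL_MS[source];
}

// Shield scans HIBP only; every higher tier gets all four sources.
function sourcesForTier(tier: Tier): Source[] {
  if (tier === "Shield") return ["hibp"];
  return ["hibp", "securitytrails", "censys", "shodan"];
}
```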
|
||||
|
||||
tests:
|
||||
- Unit: Mock all three API responses, verify normalization and exposure creation
|
||||
- Integration: Test each client against real APIs using low-risk test queries
|
||||
- E2E: Add domain to watchlist → trigger scan → verify exposures from all three sources
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] SecurityTrails client queries real API and returns parsed domain/subdomain data
|
||||
- [ ] Censys client queries real API and returns host/certificate information
|
||||
- [ ] Shodan client queries real API and returns device/service exposure data
|
||||
- [ ] Each client respects rate limits (SecurityTrails: 10 req/sec, Censys: 200 req/min, Shodan: 5 req/sec)
|
||||
- [ ] Circuit breakers open after 3 failures and reset after 60 seconds for each source
|
||||
- [ ] Exposure records are normalized regardless of source (consistent schema)
|
||||
- [ ] Alerts are generated for critical findings (open admin panels, exposed databases, certificate expiry)
|
||||
- [ ] Cache hit reduces API calls — verify Redis stores and returns cached data
|
||||
- [ ] Cost tracking records API usage per scan for later billing optimization
|
||||
- [ ] Free tier users only get HIBP; paid tiers unlock SecurityTrails, Censys, Shodan
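The circuit-breaker criterion (open after 3 failures, reset after 60 seconds) is illustrated below. This is not the breaker already present in the scan engine, only a minimal sketch of the expected behaviour; the class shape is hypothetical and time is passed in explicitly so the logic can be tested without waiting.

```typescript
class CircuitBreaker {
  private failures: number = 0;
  private openedAtMs: number | null = null;
  private threshold: number;
  private resetMs: number;

  constructor(threshold: number = 3, resetMs: number = 60000) {
    this.threshold = threshold;
    this.resetMs = resetMs;
  }

  // Returns false while the breaker is open; closes it again once resetMs has elapsed.
  canRequest(nowMs: number): boolean {
    if (this.openedAtMs === null) return true;
    if (nowMs - this.openedAtMs >= this.resetMs) {
      this.openedAtMs = null;
      this.failures = 0;
      return true;
    }
    return false;
  }

  // A success clears the consecutive-failure counter.
  recordSuccess(): void {
    this.failures = 0;
  }

  // The breaker opens when consecutive failures reach the threshold.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.openedAtMs = nowMs;
    }
  }
}
```

One instance per source (SecurityTrails, Censys, Shodan) keeps a failing provider from blocking the others.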
|
||||
|
||||
validation:
|
||||
- Run `vitest run darkwatch.test.ts` — all tests pass
|
||||
- Manual: Query `example.com` across all three APIs, verify meaningful results returned
|
||||
- Check Redis: Cached responses reduce subsequent API calls
|
||||
- Monitor cost: API call counts tracked in database
|
||||
|
||||
notes:
|
||||
- SecurityTrails is most useful for domain monitoring; Censys/Shodan for IP/host exposure
|
||||
- Shodan's dark web relevance is limited — it sees Tor exit nodes, not .onion content. Consider DarkOwl ($40K+/yr) for deep dark web later
|
||||
- The free tiers are sufficient for development but production needs paid plans ($500–$1,000/mo combined)
|
||||
- Focus on actionable findings: exposed RDP, default credentials, certificate expiry — not just raw port scans
|
||||
- The existing scan engine in `darkwatch.service.ts` already routes by watchlist item type — wire in new clients there
|
||||
72
tasks/core-services-implementation/05-darkwatch-scheduler.md
Normal file
@@ -0,0 +1,72 @@
|
||||
# 05. Periodic Scan Scheduling, WebSocket Progress, and Alert Deduplication
|
||||
|
||||
meta:
|
||||
id: core-services-05
|
||||
feature: core-services-implementation
|
||||
priority: P1
|
||||
depends_on: [core-services-03, core-services-04]
|
||||
tags: [darkwatch, scheduler, websocket, real-time, deduplication, alerts]
|
||||
|
||||
objective:
|
||||
- Make DarkWatch continuously useful by scheduling periodic scans, providing real-time progress via WebSocket, and eliminating alert fatigue through intelligent deduplication.
|
||||
|
||||
deliverables:
|
||||
- Cron-based scan scheduler with configurable frequency per tier
|
||||
- WebSocket real-time scan progress updates (already have `websocket.ts`)
|
||||
- Alert cooldown periods to prevent duplicate notifications
|
||||
- Digest mode: batch low-priority alerts into daily/weekly summaries
|
||||
- Scan history and metrics dashboard data
|
||||
|
||||
steps:
|
||||
1. Implement cron job scheduler in `jobs/handlers/darkwatch.scan.ts`:
|
||||
- Daily scans for active subscriptions
|
||||
- Respects tier limits (Shield = HIBP only daily, Guard+ = full suite weekly)
|
||||
2. Add `scanFrequency` field to subscription schema (daily, weekly, monthly)
|
||||
3. Wire WebSocket push from existing `websocket.ts` into scan engine:
|
||||
- Emit `scan:started`, `scan:progress` (completedSources/totalSources), `scan:completed` events
|
||||
- Client dashboard subscribes to user-specific scan events
|
||||
4. Enhance alert deduplication beyond existing exposure dedup:
|
||||
- Add `alertCooldownHours` per alert type (e.g., 24h for same breach, 72h for property changes)
|
||||
- Track lastAlertSentAt per (userId, alertType, source) tuple
|
||||
- Don't create new alerts during cooldown unless severity increases
|
||||
5. Implement digest mode:
|
||||
- Low-priority alerts (info) batched into daily digest email
|
||||
- Warning/critical alerts sent immediately via push + email
|
||||
- User preference: immediate vs. digest per severity level
|
||||
6. Add scan metrics:
|
||||
- Store scan duration, sources checked, exposures found, alerts generated
|
||||
- Aggregate for dashboard "threat score" calculation
|
||||
7. Implement scan failure recovery:
|
||||
- Partial scan results saved even if one source fails
|
||||
- Failed sources retried individually in next scan window
|
||||
8. Add rate limit per user: max 1 concurrent scan, queue subsequent requests
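The cooldown rule in step 4 could be sketched as follows. `shouldSendAlert` and `LastAlert` are hypothetical names; the lookup of the last alert per (userId, alertType, source) tuple is assumed to happen elsewhere.

```typescript
type Severity = "info" | "warning" | "critical";

const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

interface LastAlert {
  sentAtMs: number;
  severity: Severity;
}

// Send when there is no prior alert, the cooldown has expired,
// or the new finding is more severe than the one already sent.
function shouldSendAlert(
  last: LastAlert | null,
  severity: Severity,
  nowMs: number,
  cooldownHours: number
): boolean {
  if (last === null) return true;
  const cooldownMs = cooldownHours * 60 * 60 * 1000;
  if (nowMs - last.sentAtMs >= cooldownMs) return true;
  return SEVERITY_RANK[severity] > SEVERITY_RANK[last.severity];
}
```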
|
||||
|
||||
tests:
|
||||
- Unit: Verify cron expression parsing, cooldown logic, digest batching
|
||||
- Integration: Trigger scheduled scan, verify WebSocket events emitted in correct order
|
||||
- E2E: Start scan from dashboard → watch progress bar → receive completion notification
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] Scans run automatically on schedule without manual trigger (cron job)
|
||||
- [ ] WebSocket pushes real-time progress: `scan:progress` events with percentage complete
|
||||
- [ ] Only one scan runs per user at a time; additional requests are queued
|
||||
- [ ] Duplicate alerts are suppressed during cooldown period (configurable per type)
|
||||
- [ ] Info-level alerts are batched into daily digest; warning/critical sent immediately
|
||||
- [ ] Scan history is persisted and visible in dashboard (last scan date, sources checked, findings)
|
||||
- [ ] Failed sources don't fail entire scan — partial results are saved
|
||||
- [ ] Dashboard threat score updates automatically after each scan completion
|
||||
- [ ] Free tier gets weekly scans; paid tiers get daily scans
|
||||
- [ ] No duplicate notifications for same exposure across multiple scans
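Two of the criteria above reduce to small pure functions: turning `completedSources/totalSources` into the percentage carried by `scan:progress`, and routing alerts to digest or immediate delivery. The function names are hypothetical, and the routing shown is the default before any per-user preference is applied.

```typescript
type Severity = "info" | "warning" | "critical";

// Percentage for the scan:progress event, clamped to the 0..100 range.
function scanProgressPercent(completedSources: number, totalSources: number): number {
  if (totalSources <= 0) return 0;
  const clamped = Math.min(Math.max(completedSources, 0), totalSources);
  return Math.round((clamped / totalSources) * 100);
}

// Default routing: info goes to the daily digest, everything else is sent at once.
function deliveryFor(severity: Severity): "digest" | "immediate" {
  return severity === "info" ? "digest" : "immediate";
}
```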
|
||||
|
||||
validation:
|
||||
- Run cron job manually: `bun run job:darkwatch:scan`, verify scan completes and exposures created
|
||||
- Connect to WebSocket: `wscat -c ws://localhost:3000/ws`, subscribe to scan events
|
||||
- Check dashboard: Scan progress bar animates during active scan, threat score updates after
|
||||
- Test cooldown: Trigger same scan twice rapidly, verify second scan doesn't create duplicate alerts
|
||||
|
||||
notes:
|
||||
- The existing `scanStates` Map in `darkwatch.service.ts` is in-memory — move to Redis for multi-instance safety
|
||||
- WebSocket infrastructure exists at `websocket.ts` — extend it for scan-specific events
|
||||
- The scheduler directory (`scheduler/`) currently only has Dockerfiles — this task creates actual job logic
|
||||
- Consider using Honker (Rust queue) for scan job distribution once it's production-ready
|
||||
- Alert fatigue is a real churn driver — aggressive deduplication is a competitive advantage
|
||||
@@ -0,0 +1,70 @@
|
||||
# 06. Twilio Lookup and Phone Reputation API Integration
|
||||
|
||||
meta:
|
||||
id: core-services-06
|
||||
feature: core-services-implementation
|
||||
priority: P1
|
||||
depends_on: [core-services-01]
|
||||
tags: [spamshield, reputation, twilio, caller-id, api-integration, table-stakes]
|
||||
|
||||
objective:
|
||||
- Replace the stub Hiya/Truecaller lookup functions that return `{ score: 0, isSpam: false }` with real phone reputation API integrations (Twilio Lookup) and integrate results into the spam classification pipeline.
|
||||
|
||||
deliverables:
|
||||
- Twilio Lookup API client for caller name, line type, and carrier info
|
||||
- Phone reputation scoring system with caching
|
||||
- Integration with existing rule engine (reputation score augments rule-based decisions)
|
||||
- STIR/SHAKEN attestation verification (if carrier partnership available)
|
||||
- Rate-limited, cost-aware API usage
|
||||
|
||||
steps:
|
||||
1. Sign up for Twilio account and enable Lookup API at https://www.twilio.com/lookup
|
||||
2. Add `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN` to `.env.example`
|
||||
3. Create `spamshield/twilio.client.ts`:
|
||||
- `lookupPhone(phoneNumber, type?)` — caller name, line type (mobile/landline/VoIP), carrier
|
||||
- `lookupReputation(phoneNumber)` — spam risk score, call volume, report counts
|
||||
- `verifyStirShaken(phoneNumber)` — attestation level (A/B/C) if available
|
||||
4. Replace stub `lookupHiya()` and `lookupTruecaller()` in `reputation.api.ts` with real Twilio calls
|
||||
5. Implement reputation scoring algorithm:
|
||||
- Twilio spam risk score (0–100) mapped to internal confidence (0.0–1.0)
|
||||
- Line type weighting: VoIP = higher risk, landline = lower risk
|
||||
- Carrier reputation: known spam carriers = +20 risk
|
||||
- STIR/SHAKEN attestation: Full attestation (A) = -30 risk, Gateway attestation (C) or no attestation = +20 risk
|
||||
6. Cache results in Redis with 24h TTL (phone numbers don't change reputation rapidly)
|
||||
7. Wire into `spamshield.service.ts`:
|
||||
- Before rule engine, check reputation
|
||||
- If reputation confidence > 0.7, block immediately
|
||||
- If reputation confidence 0.4–0.7, flag for review
|
||||
- If reputation confidence < 0.4, proceed to rule engine + ML classifier
|
||||
8. Add cost tracking: $0.004–$0.03 per lookup, track monthly usage per user
|
||||
9. Implement fallback: if Twilio API fails, use internal rule engine only (graceful degradation)
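The scoring and routing rules in steps 5 and 7 can be sketched as below. The input shape is hypothetical and does not mirror any Twilio response format. The VoIP and landline weights (+10 and -10) are assumptions chosen so that VoIP ends up 20 points above landline, as the acceptance criteria ask; only attestation levels A and C adjust the score here.

```typescript
type LineType = "mobile" | "landline" | "voip";
type Attestation = "A" | "B" | "C" | null;

interface ReputationInput {
  spamRiskScore: number; // 0..100, from the reputation provider
  lineType: LineType;
  knownSpamCarrier: boolean;
  attestation: Attestation;
}

// Combine the weighted signals into a 0.0..1.0 spam confidence.
function reputationConfidence(input: ReputationInput): number {
  let risk = input.spamRiskScore;
  if (input.lineType === "voip") risk += 10; // assumed weight
  if (input.lineType === "landline") risk -= 10; // assumed weight
  if (input.knownSpamCarrier) risk += 20;
  if (input.attestation === "A") risk -= 30;
  if (input.attestation === "C") risk += 20;
  const clamped = Math.min(100, Math.max(0, risk));
  return clamped / 100;
}

type Decision = "block" | "review" | "continue";

// Thresholds from step 7: above 0.7 block, 0.4 to 0.7 review, below 0.4 continue.
function decide(confidence: number): Decision {
  if (confidence > 0.7) return "block";
  if (confidence >= 0.4) return "review";
  return "continue";
}
```

`continue` here means handing the message on to the rule engine and ML classifier rather than allowing it outright.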
|
||||
|
||||
tests:
|
||||
- Unit: Mock Twilio API responses, verify reputation scoring algorithm
|
||||
- Integration: Test with real Twilio Lookup API using known spam number
|
||||
- E2E: Submit spam check for phone number → verify reputation lookup → get classification result
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] `lookupPhone()` makes real HTTP request to Twilio Lookup API
|
||||
- [ ] Reputation scores are calculated from real Twilio data (not hardcoded zeros)
|
||||
- [ ] Numbers with high spam confidence (> 0.7) trigger an automatic block without rule/ML processing
|
||||
- [ ] Cache stores reputation results for 24 hours, reducing API costs
|
||||
- [ ] Twilio API failures gracefully fall back to rule engine (no crashes)
|
||||
- [ ] Cost tracking records each lookup for billing analytics
|
||||
- [ ] STIR/SHAKEN attestation is checked and factored into score when available
|
||||
- [ ] VoIP lines get +20 risk weighting compared to landline
|
||||
- [ ] Internal DB cache (`lookupInternalDB`) is checked before Twilio API call
|
||||
- [ ] Rate limits: max 100 lookups/minute per user to prevent abuse
|
||||
|
||||
validation:
|
||||
- Run `vitest run spamshield.service.test.ts` — all tests pass
|
||||
- Manual: Check reputation for known spam number (e.g., reported robocall number), verify high score
|
||||
- Check cache: Redis `GET spamshield:reputation:+15551234567` returns cached result
|
||||
- Monitor cost: Database shows lookup usage per user per month
|
||||
|
||||
notes:
|
||||
- Twilio Lookup costs $0.004 per basic lookup, $0.03 per advanced lookup (reputation, caller name)
|
||||
- At 100 lookups/user/month, cost is $0.40–$3.00 per user — manageable at $12+/mo ARPU
|
||||
- Hiya and Truecaller have proprietary APIs but require carrier partnerships — Twilio is the best consumer-accessible option
|
||||
- STIR/SHAKEN requires telecom partner for full attestation data — implement if/when partnership exists
|
||||
- The existing rule engine (`ruleEngine()`) is functional — reputation augments it, doesn't replace it
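The per-user cost estimate in the notes is simple arithmetic, and the 24h cache lowers it further. A rough sketch, using the prices quoted above (confirm them against current Twilio pricing) and a hypothetical `monthlyLookupCostUsd` helper:

```typescript
// Prices per lookup as quoted in the notes; treat them as estimates.
const BASIC_LOOKUP_USD = 0.004;
const ADVANCED_LOOKUP_USD = 0.03;

// cacheHitRate is the fraction of checks answered from the 24h cache (0 to 1).
function monthlyLookupCostUsd(
  checksPerMonth: number,
  pricePerLookupUsd: number,
  cacheHitRate: number
): number {
  const billableLookups = checksPerMonth * (1 - cacheHitRate);
  return billableLookups * pricePerLookupUsd;
}
```

With 100 checks per month and no cache hits, this gives about $0.40 at the basic price and about $3.00 at the advanced price, which matches the range in the notes.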
|
||||
@@ -0,0 +1,84 @@
|
||||
# 07. Fine-Tuned DistilBERT SMS Spam Classifier with ONNX Deployment
|
||||
|
||||
meta:
|
||||
id: core-services-07
|
||||
feature: core-services-implementation
|
||||
priority: P1
|
||||
depends_on: [core-services-06]
|
||||
tags: [spamshield, ml, nlp, distilbert, onnx, text-classification]
|
||||
|
||||
objective:
|
||||
- Replace the stub `classifyTextBERT()` function that returns `{ isSpam: false, confidence: 1.0 }` with a production ML pipeline: fine-tune DistilBERT on SMS spam data, export to ONNX for fast inference, and integrate into the spam classification flow.
|
||||
|
||||
deliverables:
|
||||
- Training pipeline for fine-tuning DistilBERT on SMS spam dataset
|
||||
- ONNX-exported model for low-latency CPU inference (~50ms per message)
|
||||
- Inference server with batching and caching
|
||||
- Integration with existing spam classification service
|
||||
- Model versioning and A/B testing framework
|
||||
|
||||
steps:
|
||||
1. Set up Python training environment:
|
||||
- Install `transformers`, `datasets`, `onnxruntime`, `torch`, `optimum[onnxruntime]`
|
||||
- Create `ml/spam-classifier/` directory in project root
|
||||
2. Acquire training data:
|
||||
- SMS Spam Collection Dataset (UCI ML Repository, 5,574 messages)
|
||||
- Enron Spam Dataset (email corpus, filter to SMS-like short messages)
|
||||
- Custom labeled data from user feedback (Phase 2)
|
||||
3. Fine-tune DistilBERT-base-uncased:
|
||||
- Binary classification: spam vs. ham
|
||||
- 3 epochs, batch size 32, learning rate 2e-5
|
||||
- Expected accuracy: 97–99% on SMS Spam Collection
|
||||
4. Export to ONNX:
|
||||
- Use Optimum CLI: `optimum-cli export onnx --model distilbert-spam ./onnx_model/`
|
||||
- Quantize to INT8 for 2x speedup with minimal accuracy loss
|
||||
- Target model size: ~265MB (FP32 DistilBERT base), ~65–70MB after INT8 quantization
|
||||
5. Create Node.js ONNX inference wrapper:
|
||||
- Install `onnxruntime-node`
|
||||
- Load model once at startup, reuse session
|
||||
- Preprocess: tokenize with DistilBERT tokenizer (max length 128)
|
||||
- Postprocess: softmax over the two class logits (or sigmoid for a single-logit head) → probability → binary decision
|
||||
- Target latency: <50ms per message on CPU, <10ms on GPU
|
||||
6. Integrate into `spamshield.service.ts`:
|
||||
- Replace `classifyTextBERT()` call with real ONNX inference
|
||||
- Classification flow: reputation lookup → rule engine → ML classifier (ensemble)
|
||||
- Threshold tuning: default 0.5, adjustable per user preference
|
||||
7. Implement feedback loop:
|
||||
- User can report false positive/negative
|
||||
- Store feedback in `spamFeedback` table (already exists)
|
||||
- Weekly retraining batch using accumulated feedback
|
||||
8. Add model versioning:
|
||||
- Store model artifact in S3-compatible storage
|
||||
- A/B test new models on subset of traffic
|
||||
- Rollback capability if accuracy degrades
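The postprocessing in step 5 and the threshold in step 6 might look like the sketch below. It assumes the exported model emits two logits ordered `[ham, spam]`, which should be confirmed from the model's label mapping; for a two-logit head, softmax over the pair equals a sigmoid applied to the logit difference. `verdictFromLogits` is a hypothetical name, and `confidence` is taken to mean the probability of the predicted label.

```typescript
// Numerically stable softmax: subtract the max logit before exponentiating.
function softmax(logits: number[]): number[] {
  const max = Math.max.apply(null, logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

interface SpamVerdict {
  isSpam: boolean;
  confidence: number;
}

// Assumes logits are ordered [ham, spam]; threshold defaults to 0.5 as in step 6.
function verdictFromLogits(logits: [number, number], threshold: number = 0.5): SpamVerdict {
  const pSpam = softmax(logits)[1];
  const isSpam = pSpam >= threshold;
  return { isSpam: isSpam, confidence: isSpam ? pSpam : 1 - pSpam };
}
```

A stricter user preference would pass a lower threshold, a more lenient one a higher threshold.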
|
||||
|
||||
tests:
|
||||
- Unit: Verify ONNX inference produces correct labels for known spam/ham test cases
|
||||
- Integration: End-to-end classification flow with real model loading
|
||||
- E2E: Submit SMS text → receive classification with confidence score
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] `classifyTextBERT()` runs real ONNX inference (not returning hardcoded `{ isSpam: false }`)
|
||||
- [ ] Model accuracy > 95% on held-out test set from SMS Spam Collection
|
||||
- [ ] Inference latency < 50ms per message on CPU (measured in production)
|
||||
- [ ] Model file is versioned and loadable from external storage (S3/local path)
|
||||
- [ ] False positive rate < 2% (legitimate messages incorrectly flagged as spam)
|
||||
- [ ] User feedback ("not spam" / "spam") is stored and used for model improvement
|
||||
- [ ] Classification threshold is configurable per user (strict/moderate/lenient)
|
||||
- [ ] ONNX model loads once at server startup, not per-request
|
||||
- [ ] Graceful fallback to rule engine if ONNX runtime fails
|
||||
- [ ] Model size < 100MB for reasonable cold-start time
|
||||
|
||||
validation:
|
||||
- Run `vitest run spamshield.service.test.ts` — tests use real ONNX model
|
||||
- Benchmark: `bun run benchmark:spamshield` — measure 1000 inferences, report p50/p95/p99 latency
|
||||
- Manual: Classify known spam message "Congratulations! You've won $1000...", verify `isSpam: true, confidence > 0.9`
|
||||
- Check feedback: Database `spamFeedback` table accumulates user corrections
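The p50/p95/p99 figures for the benchmark can be computed with a nearest-rank percentile. This is a sketch with hypothetical helper names, not the contents of the existing `benchmark:spamshield` script.

```typescript
// Nearest-rank percentile over an ascending-sorted array of latencies in milliseconds.
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) throw new Error("no samples");
  const rank = Math.max(1, Math.ceil((p * sortedMs.length) / 100));
  return sortedMs[rank - 1];
}

// Sorts a copy of the samples so the caller's array is left untouched.
function summarizeLatency(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}
```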
|
||||
|
||||
notes:
|
||||
- DistilBERT is chosen over BERT for 40% smaller size and 60% faster inference with minimal accuracy loss
|
||||
- ONNX Runtime Node.js has limited platform support — test on your deployment target (Linux x64, macOS ARM)
|
||||
- Training can happen in CI (GitHub Actions with GPU runner) or locally — inference happens in production
|
||||
- Consider TensorFlow Lite or ONNX Runtime Web for on-device mobile inference later
|
||||
- The SMS Spam Collection is small (5,574 messages) — augment with synthetic spam variants for robustness
|
||||
- For European languages, consider multilingual model like `distilbert-base-multilingual-cased`
|
||||
@@ -0,0 +1,79 @@
|
||||
# 08. Expand Broker Coverage to 50+ with CAPTCHA Solving and Re-Scan Pipeline
|
||||
|
||||
meta:
|
||||
id: core-services-08
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-02]
|
||||
tags: [removebrokers, automation, captcha, scaling, maintenance]
|
||||
|
||||
objective:
|
||||
- Scale from top 20 brokers to 50+ automated removals, implement CAPTCHA solving, and build the re-scan pipeline that detects re-listings.
|
||||
|
||||
deliverables:
|
||||
- 30+ additional broker adapters (total 50+)
|
||||
- CAPTCHA solving integration (2Captcha or AntiCaptcha API)
|
||||
- Re-scan scheduler that checks if removed profiles have reappeared
|
||||
- Email verification handling for opt-out confirmation emails
|
||||
- Removal success rate dashboard metric
|
||||
|
||||
steps:
|
||||
1. Select next 30 brokers from registry by opt-out complexity (medium-difficulty form-based flows)
|
||||
2. Create adapter modules for each broker in `removebrokers/adapters/`
|
||||
3. Implement CAPTCHA solving:
|
||||
- Detect reCAPTCHA v2/v3, hCaptcha, image challenges
|
||||
- Integrate 2Captcha API ($0.001–$0.01 per solve)
|
||||
- Add `CAPTCHA_SOLVER_API_KEY` to environment config
|
||||
- Fallback to manual queue if CAPTCHA solving fails 3 times
|
||||
4. Implement email verification handling:
|
||||
- Monitor mailbox for opt-out confirmation emails
|
||||
- Parse confirmation links and auto-click them
|
||||
- Store confirmation status in database
|
||||
5. Build re-scan pipeline:
|
||||
- Weekly scheduled job that re-scans all "completed" removals
|
||||
- If profile reappears, create new removal request automatically
|
||||
- Track re-listing rate per broker (some re-list every 30 days)
|
||||
6. Add success metrics:
|
||||
- Track removal success rate per broker (% of opt-outs that stick)
|
||||
- Dashboard widget showing "X of Y brokers removed"
|
||||
- Alert user when re-listing detected
|
||||
7. Implement proxy rotation pool:
|
||||
- Use residential proxy service (BrightData, IPRoyal)
|
||||
- Rotate IP per broker session to avoid blocks
|
||||
- Budget $1K–$3K/mo for proxy infrastructure
|
||||
8. Add adapter health monitoring:
|
||||
- Track adapter breakage rate
|
||||
- Alert engineering when >5% of adapters fail in 24h
|
||||
- Auto-disable broken adapters, queue for manual fix
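The health check in step 8 could be approximated as follows. An adapter is counted as failing when it had runs in the window and none succeeded; that definition, the `AdapterRun` shape, and the function name are all assumptions.

```typescript
interface AdapterRun {
  brokerId: string;
  succeeded: boolean;
}

interface AdapterHealthReport {
  failingBrokerIds: string[]; // candidates for auto-disable
  failureRate: number; // failing adapters / adapters that ran
  alertEngineering: boolean;
}

// Summarize the last 24h of runs; alert when more than 5% of adapters are failing.
function summarizeAdapterHealth(
  runsLast24h: AdapterRun[],
  alertThreshold: number = 0.05
): AdapterHealthReport {
  const hadSuccess: Record<string, boolean> = {};
  runsLast24h.forEach((run) => {
    hadSuccess[run.brokerId] = hadSuccess[run.brokerId] === true || run.succeeded;
  });
  const brokerIds = Object.keys(hadSuccess);
  const failingBrokerIds = brokerIds.filter((id) => !hadSuccess[id]).sort();
  const failureRate = brokerIds.length === 0 ? 0 : failingBrokerIds.length / brokerIds.length;
  return {
    failingBrokerIds: failingBrokerIds,
    failureRate: failureRate,
    alertEngineering: failureRate > alertThreshold,
  };
}
```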
|
||||
|
||||
tests:
|
||||
- Unit: Mock CAPTCHA solver, verify retry and fallback logic
|
||||
- Integration: Test CAPTCHA solving against real broker site
|
||||
- E2E: Complete removal for broker with CAPTCHA → verify re-scan detects re-listing
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] 50+ broker adapters implemented and tested
|
||||
- [ ] CAPTCHA challenges are detected and solved automatically (2Captcha integration)
|
||||
- [ ] Failed CAPTCHA solving escalates to manual queue after 3 attempts
|
||||
- [ ] Email confirmation links are parsed and clicked automatically
|
||||
- [ ] Re-scan job runs weekly and detects re-listings within 7 days
|
||||
- [ ] Re-listed profiles trigger automatic new removal requests
|
||||
- [ ] Dashboard shows accurate removal progress: "47 of 50 brokers completed"
|
||||
- [ ] Per-broker success rate is tracked and visible in admin panel
|
||||
- [ ] Proxy rotation prevents IP blocking on high-volume brokers
|
||||
- [ ] Adapter breakage is detected within 24 hours and auto-disabled
|
||||
- [ ] Monthly proxy + CAPTCHA cost per user < $4 (within gross margin target)
|
||||
|
||||
validation:
|
||||
- Run `vitest run removebrokers.service.test.ts` — extended tests for 50 brokers
|
||||
- Manual: Test CAPTCHA broker (e.g., MyLife), verify automatic solving works
|
||||
- Check re-scan: Run `bun run job:removebrokers:rescan`, verify re-listings detected
|
||||
- Monitor costs: Dashboard shows monthly proxy/CAPTCHA spend per customer
|
||||
|
||||
notes:
|
||||
- Broker sites change frequently — budget 20% engineering time for adapter maintenance
|
||||
- Some brokers (Acxiom, Epsilon) require physical mail — flag these for manual processing
|
||||
- Re-listing is common — data brokers rebuild databases from public records every 30–90 days
|
||||
- Consider AI-assisted form field detection (GPT-4 Vision) to reduce per-adapter development time
|
||||
- The existing `broker.registry.ts` already has 100+ entries — prioritize by traffic/popularity
|
||||
- Success rate target: 80%+ for automated removals, 90%+ with manual fallback
|
||||
74
tasks/core-services-implementation/09-hometitle-attom-api.md
Normal file
@@ -0,0 +1,74 @@
|
||||
# 09. Attom Data Solutions API for Property Record Snapshots
|
||||
|
||||
meta:
|
||||
id: core-services-09
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-01]
|
||||
tags: [hometitle, attom, property-records, api-integration, real-estate]
|
||||
|
||||
objective:
|
||||
- Replace the `fetchCountyRecords()` stub that returns `{ ownerName: "Unknown Owner" }` with a real property data API integration using Attom Data Solutions, enabling actual property snapshot and change detection.
|
||||
|
||||
deliverables:
|
||||
- Attom API client for property search, owner info, and tax/assessment data
|
||||
- Property snapshot creation and storage in database
|
||||
- Change detection pipeline wired to real data (your detector logic already works)
|
||||
- Alert generation for ownership changes, liens, and tax status changes
|
||||
|
||||
steps:
|
||||
1. Sign up for Attom Data API at https://attomdata.com (pricing: ~$0.05–$0.10/record, enterprise plans available)
|
||||
2. Add `ATTOM_API_KEY` to `.env.example` and validate in `env.ts`
|
||||
3. Create `hometitle/attom.client.ts`:
|
||||
- `searchProperty(address)` — find property by address, return parcel ID and metadata
|
||||
- `getPropertyProfile(parcelId)` — full property record: owner, deed date, tax info, liens
|
||||
- `getPropertyHistory(parcelId)` — historical ownership and transaction records
|
||||
- `getTaxInfo(parcelId)` — tax amount, delinquency status, exemptions
|
||||
4. Replace `fetchCountyRecords()` in `scanner.ts` with Attom API call:
|
||||
- Use geocoding result (Google Maps API, already works) to get normalized address
|
||||
- Query Attom by address → get parcel ID → fetch full property profile
|
||||
- Parse response into `CountyRecord` / `SnapshotData` schema
|
||||
5. Implement snapshot storage:
|
||||
- Store initial snapshot in `propertySnapshots` table
|
||||
- On re-scan, fetch new snapshot → compare with last → detect changes
|
||||
6. Wire change detection (your `change.detector.ts` is already implemented):
|
||||
- `ownership_transfer`: owner name changed → critical alert
|
||||
- `lien_filing`: lien count increased → warning/critical alert
|
||||
- `tax_change`: tax amount changed → info alert
|
||||
- `deed_change`: deed date changed → critical alert
|
||||
7. Implement tier limits:
|
||||
- Guard: 1 property monitored
|
||||
- Fortress: 3 properties monitored
|
||||
- Family: 5 properties monitored
|
||||
8. Add cost tracking: ~$0.05–$0.10 per property lookup, track per-user usage
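The change types in step 6 and the tier limits in step 7 are illustrated below. This is not the existing `change.detector.ts`; the snapshot fields are taken from the acceptance criteria, the helper names are hypothetical, and lien filings are always reported as warnings because lien amounts are not modelled in this sketch.

```typescript
interface PropertySnapshot {
  ownerName: string;
  deedDate: string;
  taxAmount: number;
  lienCount: number;
}

type ChangeType = "ownership_transfer" | "lien_filing" | "tax_change" | "deed_change";
type Severity = "critical" | "warning" | "info";

interface DetectedChange {
  type: ChangeType;
  severity: Severity;
}

// Compare the previous and the new snapshot field by field.
function diffSnapshots(prev: PropertySnapshot, next: PropertySnapshot): DetectedChange[] {
  const changes: DetectedChange[] = [];
  if (prev.ownerName !== next.ownerName) changes.push({ type: "ownership_transfer", severity: "critical" });
  if (next.lienCount > prev.lienCount) changes.push({ type: "lien_filing", severity: "warning" });
  if (prev.taxAmount !== next.taxAmount) changes.push({ type: "tax_change", severity: "info" });
  if (prev.deedDate !== next.deedDate) changes.push({ type: "deed_change", severity: "critical" });
  return changes;
}

// Monitored-property limits per tier, from step 7.
const PROPERTY_LIMITS: Record<"Guard" | "Fortress" | "Family", number> = {
  Guard: 1,
  Fortress: 3,
  Family: 5,
};

function canAddProperty(tier: "Guard" | "Fortress" | "Family", currentCount: number): boolean {
  return currentCount < PROPERTY_LIMITS[tier];
}
```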
|
||||
|
||||
tests:
|
||||
- Unit: Mock Attom API responses, verify parsing and snapshot creation
|
||||
- Integration: Test with real Attom API using known property address
|
||||
- E2E: Add property to watchlist → trigger scan → verify snapshot created → simulate change → verify alert
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] `fetchCountyRecords()` makes real HTTP request to Attom API (not returning mock data)
|
||||
- [ ] Property snapshots contain real owner name, deed date, tax amount, lien count
|
||||
- [ ] Change detection compares real snapshots and identifies actual changes
|
||||
- [ ] Ownership transfer creates critical alert with property address in message
|
||||
- [ ] Lien filing creates warning or critical alert depending on lien amount
|
||||
- [ ] Alert severity matches existing `severityForChange()` logic
|
||||
- [ ] Geocoding → Attom search → snapshot pipeline works end-to-end
|
||||
- [ ] Cost tracking records each Attom API call for billing analytics
|
||||
- [ ] Tier limits enforced: Guard = 1 property, Fortress = 3, Family = 5
|
||||
- [ ] Graceful fallback: if Attom API fails, retry once, then alert user of monitoring gap
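
The last criterion (retry once, then surface a monitoring gap) might look something like the sketch below. `fetchWithOneRetry` and the `FetchResult` shape are hypothetical; the caller is assumed to turn a `monitoring_gap` result into a user-facing alert.

```typescript
// Hypothetical result type: either data, or a gap the caller should alert on.
type FetchResult<T> =
  | { status: "ok"; data: T }
  | { status: "monitoring_gap"; attempts: number; lastError: string };

// Try the fetcher, retry exactly once on failure, then report a gap.
async function fetchWithOneRetry<T>(fetcher: () => Promise<T>): Promise<FetchResult<T>> {
  const maxAttempts = 2; // initial attempt + one retry
  let lastError = "unknown error";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { status: "ok", data: await fetcher() };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { status: "monitoring_gap", attempts: maxAttempts, lastError };
}
```

Returning a tagged result instead of throwing keeps "the API is down" distinct from "nothing changed", so a failed scan is never mistaken for a clean one.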
|
||||
|
||||
validation:
|
||||
- Run `vitest run hometitle.test.ts` (all tests pass against mocked Attom API responses)
|
||||
- Manual: Add real property address, trigger scan, verify snapshot in database
|
||||
- Simulate change: Update snapshot in database with different owner, trigger detector, verify alert
|
||||
- Check cost: Database shows Attom API usage per user per month
|
||||
|
||||
notes:
|
||||
- Attom covers ~150M US properties but not all counties equally — some rural areas may have gaps
|
||||
- For counties not covered by Attom, Phase 3 (task 10) implements county recorder web scrapers
|
||||
- Property fraud is a real and growing problem; confirm the $1B+ annual loss figure against current FTC/FBI data before quoting it externally
|
||||
- This is a potential differentiator, but verify competitor coverage first: some identity protection products (e.g., LifeLock and Aura) advertise home title monitoring in their higher tiers
|
||||
- Consider partnership with title insurance companies for added credibility
|
||||
- The existing Google Maps geocoding already works — verify `GEOCODING_API_KEY` is set
|
||||
@@ -0,0 +1,83 @@
|
||||
# 10. County Recorder Web Scrapers for Top 100 US Counties
|
||||
|
||||
meta:
|
||||
id: core-services-10
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-09]
|
||||
tags: [hometitle, scraping, county-records, fallback, coverage]
|
||||
|
||||
objective:
|
||||
- Build Playwright-based web scrapers for county recorder websites in the top 100 US counties by population, providing a fallback for counties not covered by Attom API and reducing API costs.
|
||||
|
||||
deliverables:
|
||||
- Scrapers for 100 US county recorder websites (starting with top 50)
|
||||
- Unified property record parser that normalizes disparate HTML formats
|
||||
- Fallback logic: Attom API → county scraper → manual request (in order)
|
||||
- Scraper health monitoring and breakage detection
|
||||
|
||||
steps:
|
||||
1. Identify top 100 US counties by population (start with top 50):
|
||||
- Los Angeles County, CA; Cook County, IL; Harris County, TX; Maricopa County, AZ; etc.
|
||||
2. Research each county's recorder website:
|
||||
- Search URL pattern (varies widely by county; record the actual URL for each rather than assuming a `https://{county}.gov/recorder` form)
|
||||
- Record search interface (by owner name, parcel ID, or address)
|
||||
- Result format (HTML table, PDF, JSON API, proprietary system)
|
||||
3. Create `hometitle/county-scrapers/` directory with one module per county
|
||||
4. Implement base scraper interface:
|
||||
- `searchByAddress(address): Promise<CountyRecord[]>`
|
||||
- `searchByParcelId(parcelId): Promise<CountyRecord | null>`
|
||||
- `parseResults(html): CountyRecord[]`
|
||||
5. Implement scrapers for each county using Playwright:
|
||||
- Navigate to recorder website
|
||||
- Fill search form (address or parcel ID)
|
||||
- Submit and wait for results
|
||||
- Parse HTML table or detail page
|
||||
- Extract: owner name, deed date, tax info, lien status
|
||||
6. Implement unified `parseDeedRecords(html)` that handles common formats:
|
||||
- HTML tables with standard columns
|
||||
- Detail pages with labeled fields
|
||||
- PDF records (download + text extraction)
|
||||
7. Add fallback chain in `scanner.ts`:
|
||||
- Try Attom API first (fastest, most reliable)
|
||||
- If Attom returns null/empty, try county scraper
|
||||
- If scraper fails, queue for manual request (email to user)
|
||||
8. Add scraper monitoring:
|
||||
- Track success/failure rate per county
|
||||
- Alert when >20% of scrapers fail in 24h (site changes)
|
||||
- Auto-disable broken scrapers, fall back to Attom/manual
|
||||
9. Handle rate limiting:
|
||||
- Throttle requests to county sites (max 1 req/5 sec per county)
|
||||
- Use residential proxies if county blocks datacenter IPs
|
||||
- Respect robots.txt and terms of service
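
The fallback chain (step 7) and the per-county throttle (step 9) could be sketched as below. All names here (`Source`, `lookupWithFallback`, `CountyThrottle`) are hypothetical, and the `CountyRecord` shape is a simplified stand-in for the repo's real type. The sources are injected so the order Attom, county scraper, manual request stays a configuration decision.

```typescript
// Simplified stand-in for the repo's CountyRecord type.
interface CountyRecord {
  ownerName: string;
  deedDate: string;
  lienCount: number;
}

interface Source {
  name: string;
  lookup: (address: string) => Promise<CountyRecord | null>;
}

type LookupOutcome =
  | { source: string; record: CountyRecord }
  | { source: "manual_request"; record: null };

// Try each source in order; an error is treated like an empty result.
async function lookupWithFallback(address: string, sources: Source[]): Promise<LookupOutcome> {
  for (const source of sources) {
    try {
      const record = await source.lookup(address);
      if (record !== null) return { source: source.name, record };
    } catch {
      // Fall through to the next source.
    }
  }
  return { source: "manual_request", record: null };
}

// Spaces out requests per county (default: one every 5 seconds).
class CountyThrottle {
  private readonly minIntervalMs: number;
  private readonly nextSlotMs = new Map<string, number>();

  constructor(minIntervalMs: number = 5000) {
    this.minIntervalMs = minIntervalMs;
  }

  // Reserves the next slot for a county and returns how long to wait (ms).
  reserve(county: string, nowMs: number): number {
    const next = this.nextSlotMs.get(county);
    const slot = next === undefined ? nowMs : Math.max(nowMs, next);
    this.nextSlotMs.set(county, slot + this.minIntervalMs);
    return slot - nowMs;
  }
}
```

Passing the clock value into `reserve` instead of reading `Date.now()` inside it keeps the throttle deterministic in unit tests.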
|
||||
|
||||
tests:
|
||||
- Unit: Mock HTML responses for common county formats, verify parser normalization
|
||||
- Integration: Test 5 representative county scrapers against live sites
|
||||
- E2E: Property in county without Attom coverage → scraper fetches real data → snapshot created
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] 50+ county recorder scrapers implemented and tested against live sites
|
||||
- [ ] `parseDeedRecords()` parses real HTML and returns structured CountyRecord objects
|
||||
- [ ] Fallback chain works: Attom → county scraper → manual request
|
||||
- [ ] Each scraper handles the county's specific search interface and result format
|
||||
- [ ] Rate limiting respects county sites (max 1 request per 5 seconds)
|
||||
- [ ] Broken scrapers are auto-detected within 24 hours and disabled
|
||||
- [ ] Scraper success rate > 70% across all implemented counties
|
||||
- [ ] Property records from scrapers match Attom data quality (owner name, deed date, liens)
|
||||
- [ ] Failed scraper attempts fall back to manual queue with user notification
|
||||
- [ ] No county site is overwhelmed by scraping (responsible rate limits)
|
||||
|
||||
validation:
|
||||
- Run `vitest run hometitle.test.ts` — extended tests for county scrapers
|
||||
- Manual: Search property in Cook County IL, verify scraper returns real owner data
|
||||
- Check fallback: Disable Attom API key, trigger scan, verify county scraper activates
|
||||
- Monitor health: Dashboard shows per-county scraper success rate
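
The per-county health check behind that dashboard could be computed along these lines. `summarizeHealth` and its types are hypothetical. The 24-hour window and the 20% alert threshold come from the steps above, while using 70% as the per-county disable cutoff is an assumption (the task only states 70% as a fleet-wide target).

```typescript
interface ScrapeAttempt {
  county: string;
  ok: boolean;
  atMs: number; // timestamp of the attempt
}

interface HealthSummary {
  disabled: string[];   // counties whose scraper should be switched off
  failingShare: number; // fraction of active counties below the cutoff
  alert: boolean;       // true when more than 20% of scrapers are failing
}

const DAY_MS = 24 * 60 * 60 * 1000;

function summarizeHealth(
  attempts: ScrapeAttempt[],
  nowMs: number,
  minSuccessRate: number = 0.7, // assumption: per-county cutoff
): HealthSummary {
  const stats = new Map<string, { ok: number; total: number }>();
  attempts.forEach((a) => {
    if (nowMs - a.atMs > DAY_MS) return; // only the last 24 hours count
    const s = stats.get(a.county) ?? { ok: 0, total: 0 };
    s.total += 1;
    if (a.ok) s.ok += 1;
    stats.set(a.county, s);
  });

  const disabled: string[] = [];
  stats.forEach((s, county) => {
    if (s.ok / s.total < minSuccessRate) disabled.push(county);
  });

  const failingShare = stats.size === 0 ? 0 : disabled.length / stats.size;
  return { disabled, failingShare, alert: failingShare > 0.2 };
}
```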
|
||||
|
||||
notes:
|
||||
- County recorder sites are notoriously fragile — expect 30–40% of scrapers to break per quarter
|
||||
- Many counties use proprietary systems (e.g., Tyler Technologies, Fidlar Technologies) with complex JavaScript
|
||||
- Some counties require payment per record ($1–$5) — flag these for manual processing
|
||||
- Consider partnering with Attom for counties they don't cover rather than building scrapers
|
||||
- Legal: Ensure scraping complies with each county's terms of service and state public records laws
|
||||
- The existing `parseDeedRecords()` currently logs "not yet implemented" — replace with real parsing
|
||||
@@ -0,0 +1,84 @@
|
||||
# 11. Azure Voice Live API for Synthetic Voice Detection
|
||||
|
||||
meta:
|
||||
id: core-services-11
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-01]
|
||||
tags: [voiceprint, azure, voice-clone-detection, liveness, api-integration]
|
||||
|
||||
objective:
|
||||
- Replace the stub `detectSynthetic()` that returns `{ isSynthetic: false, confidence: 1.0 }` with a real Azure Voice Live API integration, enabling consumer-facing voice clone detection via uploaded call recordings or live microphone capture.
|
||||
|
||||
deliverables:
|
||||
- Azure Speech Services client with Voice Live API endpoint
|
||||
- Audio preprocessing pipeline (resampling, normalization, VAD)
|
||||
- Voice enrollment system for trusted contacts (family member voice templates)
|
||||
- Synthetic detection endpoint that returns real confidence scores
|
||||
- Call recording upload and analysis workflow
|
||||
|
||||
steps:
|
||||
1. Sign up for Azure Speech Services at https://azure.microsoft.com/services/cognitive-services/speech-services/
|
||||
2. Add `AZURE_SPEECH_KEY` and `AZURE_SPEECH_REGION` to `.env.example`
|
||||
3. Create `voiceprint/azure.client.ts`:
|
||||
- `detectLiveness(audioBuffer, referenceText?)` — Voice Live API for challenge-response liveness
|
||||
- `verifySpeaker(audioBuffer, enrollmentId)` — speaker verification against enrolled voice
|
||||
- `enrollSpeaker(audioSamples): Promise<enrollmentId>` — create voice template from samples
|
||||
4. Implement audio preprocessing:
|
||||
- Convert to 16kHz mono PCM (Azure requirement)
|
||||
- Normalize amplitude to -3 dBFS
|
||||
- Trim silence using VAD (WebRTC or Silero)
|
||||
- Max duration: 30 seconds per analysis
|
||||
5. Implement enrollment flow:
|
||||
- User records 3–5 samples of family member saying phrases
|
||||
- Store enrollment in database with `voiceEnrollments` schema (already exists)
|
||||
- Generate enrollment ID, link to user account
|
||||
6. Implement detection flow:
|
||||
- User uploads suspicious call recording or captures live audio
|
||||
- Preprocess audio → Azure Voice Live API → get liveness score
|
||||
- If enrollment exists, also run speaker verification → similarity score
|
||||
- Combine scores: synthetic = low liveness AND low speaker match
|
||||
7. Implement `detectSynthetic()` to return real analysis:
|
||||
- Score: 0.0–1.0 (synthetic likelihood)
|
||||
- Confidence: based on audio quality and API response certainty
|
||||
- Decision: synthetic if score > 0.7, suspicious if 0.4–0.7, genuine if < 0.4
|
||||
8. Add analysis history:
|
||||
- Store every analysis in database (audio hash, score, decision)
|
||||
- Dashboard shows history of analyzed calls
|
||||
- User can report false positive/negative for model improvement
|
||||
9. Implement tier limits:
|
||||
- Fortress+: VoicePrint included
|
||||
- Lower tiers: not available or limited to 5 analyses/month
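
The decision thresholds in step 7 and the "analysis unavailable" fallback can be sketched without touching any Azure API. The functions below are hypothetical and take already-computed scores as input; combining liveness and speaker match as `1 - max(liveness, speakerMatch)` is just one assumed reading of "low liveness AND low speaker match".

```typescript
type Decision = "genuine" | "suspicious" | "synthetic";

type AnalysisResult =
  | { status: "ok"; score: number; decision: Decision }
  | { status: "unavailable" }; // upstream API failed: never report "genuine" here

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Assumption: the synthetic likelihood is high only when BOTH inputs are low.
function syntheticScore(liveness: number, speakerMatch?: number): number {
  const strongestEvidence =
    speakerMatch === undefined
      ? clamp01(liveness)
      : Math.max(clamp01(liveness), clamp01(speakerMatch));
  return 1 - strongestEvidence;
}

// Thresholds from step 7: > 0.7 synthetic, 0.4 to 0.7 suspicious, < 0.4 genuine.
function decide(score: number): Decision {
  if (score > 0.7) return "synthetic";
  if (score >= 0.4) return "suspicious";
  return "genuine";
}

function analyze(liveness: number | null, speakerMatch?: number): AnalysisResult {
  if (liveness === null) return { status: "unavailable" };
  const score = syntheticScore(liveness, speakerMatch);
  return { status: "ok", score, decision: decide(score) };
}
```

Keeping `unavailable` as its own variant means the type checker forces callers to handle the outage case separately from a real verdict.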
|
||||
|
||||
tests:
|
||||
- Unit: Mock Azure API responses, verify score calculation and decision logic
|
||||
- Integration: Test with real Azure Voice Live API using synthetic and genuine audio samples
|
||||
- E2E: Upload suspicious call recording → receive analysis result with confidence score
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] `detectSynthetic()` calls real Azure Voice Live API (not returning hardcoded `isSynthetic: false`)
|
||||
- [ ] Audio preprocessing converts to 16kHz mono PCM and normalizes amplitude
|
||||
- [ ] Voice enrollment creates usable template from 3–5 user-provided samples
|
||||
- [ ] Speaker verification returns similarity score between 0.0 and 1.0
|
||||
- [ ] Liveness detection returns pass/fail with confidence for challenge-response mode
|
||||
- [ ] Combined score correctly flags known synthetic voice samples (>0.7 threshold)
|
||||
- [ ] Analysis results are stored in database with audio hash and metadata
|
||||
- [ ] Dashboard shows analysis history with play button for uploaded audio
|
||||
- [ ] Tier enforcement: VoicePrint only available on Fortress+ plans
|
||||
- [ ] Graceful fallback: if Azure API fails, return "analysis unavailable" (not false negative)
|
||||
- [ ] False positive rate < 5% on genuine voice samples (tested with 100+ samples)
|
||||
|
||||
validation:
|
||||
- Run `vitest run voiceprint.test.ts` — all tests pass with Azure mock
|
||||
- Manual: Upload genuine voice sample, verify `isSynthetic: false` with confidence > 0.9
|
||||
- Manual: Upload synthetic voice (e.g., from ElevenLabs), verify `isSynthetic: true` with score > 0.7
|
||||
- Check enrollment: Database `voiceEnrollments` table has real templates with Azure enrollment IDs
|
||||
|
||||
notes:
|
||||
- Azure Voice Live API costs ~$0.016/minute of audio analyzed
|
||||
- At 50 analyses/user/month (1–2 min each), cost is ~$0.80–$1.60/user/month
|
||||
- This is the ONLY practical path for a startup — building in-house costs $840K–$1.25M Year 1
|
||||
- The differentiator isn't the detection tech (everyone uses Azure/Daon/Pindrop) — it's the consumer UX and integration
|
||||
- Consider adding forensic analysis mode: detailed spectrogram visualization for user education
|
||||
- Mobile integration (iOS CallKit, Android Telecom) is Phase 4 (task 12) — this task is server-side only
|
||||
- Store audio samples securely (encrypted at rest) and allow user deletion (privacy compliance)
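
The per-user cost estimate in the notes is simple arithmetic, which a small helper makes reusable for other usage assumptions. `monthlyAudioCostUsd` is a hypothetical name, and the rate passed in should be checked against current pricing rather than taken from this sketch.

```typescript
// Cost = analyses per month x minutes per analysis x price per minute.
function monthlyAudioCostUsd(
  analysesPerMonth: number,
  minutesPerAnalysis: number,
  usdPerMinute: number,
): number {
  return analysesPerMonth * minutesPerAnalysis * usdPerMinute;
}

// Range for the notes' scenario: 50 analyses of 1 to 2 minutes at $0.016/minute.
function costRangeUsd(analysesPerMonth: number, usdPerMinute: number): [number, number] {
  return [
    monthlyAudioCostUsd(analysesPerMonth, 1, usdPerMinute),
    monthlyAudioCostUsd(analysesPerMonth, 2, usdPerMinute),
  ];
}
```

With 50 analyses and a rate of 0.016, this gives roughly $0.80 to $1.60 per user per month, matching the figure in the notes.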
|
||||
@@ -0,0 +1,84 @@
|
||||
# 12. iOS CallKit and Android Telecom API for Real-Time Call Analysis
|
||||
|
||||
meta:
|
||||
id: core-services-12
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-11]
|
||||
tags: [voiceprint, ios, android, callkit, telecom-api, real-time, mobile]
|
||||
|
||||
objective:
|
||||
- Integrate VoicePrint into the iOS and Android mobile apps via CallKit and Telecom API, enabling real-time call recording, analysis, and synthetic voice alerts during active phone calls.
|
||||
|
||||
deliverables:
|
||||
- iOS CallKit extension for call interception and recording
|
||||
- Android Telecom API integration for call screening and recording
|
||||
- Real-time audio streaming to server for analysis
|
||||
- Push notification alert when synthetic voice detected during call
|
||||
- On-device audio capture and upload pipeline
|
||||
|
||||
steps:
|
||||
1. **iOS Implementation:**
|
||||
- Create CallKit extension (`CallDirectoryExtension`) for caller identification
|
||||
- Implement `CXCallObserver` delegate for call state monitoring (`CXProvider` is for apps that place or receive their own VoIP calls)
|
||||
- Add audio recording permission (NSMicrophoneUsageDescription in Info.plist)
|
||||
- Stream call audio to server via WebSocket or upload after call ends
|
||||
- Show in-call alert overlay when synthetic voice detected
|
||||
- Handle app backgrounding and call recording continuity
|
||||
2. **Android Implementation:**
|
||||
- Implement `TelecomManager` with `ConnectionService` for call monitoring
|
||||
- Add `READ_PHONE_STATE`, `RECORD_AUDIO`, `FOREGROUND_SERVICE` permissions
|
||||
- Create call screening service that triggers on incoming/outgoing calls
|
||||
- Record call audio using `MediaRecorder` or `AudioRecord`
|
||||
- Upload audio to server for analysis after call ends
|
||||
- Show heads-up notification when synthetic voice detected
|
||||
3. **Server-side integration:**
|
||||
- Extend VoicePrint tRPC router with `analyzeCallRecording` endpoint
|
||||
- Handle multipart audio upload (WAV/MP3 format)
|
||||
- Queue analysis job, push result via WebSocket or push notification
|
||||
- Store analysis result linked to call metadata (number, duration, timestamp)
|
||||
4. **Real-time vs. post-call analysis:**
|
||||
- Phase 1: Post-call upload + analysis (simpler, no strict latency requirement)
|
||||
- Phase 2: Real-time streaming chunks during call (requires <500ms analysis)
|
||||
5. **User experience:**
|
||||
- Settings toggle: "Analyze calls for voice cloning"
|
||||
- After each analyzed call: summary card in app (genuine/suspicious/synthetic)
|
||||
- Emergency override: one-tap hangup + block number when synthetic detected
|
||||
6. **Privacy and compliance:**
|
||||
- Two-party consent state detection (disable recording in 2-party consent states)
|
||||
- User must explicitly opt-in before any call recording
|
||||
- Audio data encrypted in transit and at rest
|
||||
- Auto-delete audio after analysis (configurable retention: 0–30 days)
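
The opt-in, consent, and retention rules in step 6 could be enforced by a small server-side gate like the sketch below. Everything here is hypothetical, and the list of all-party consent states is deliberately not hardcoded: it is assumed to be supplied from the legal review, which should also settle how calls across state lines are treated.

```typescript
interface RecordingContext {
  userOptedIn: boolean;
  userState: string; // two-letter state code for the user's location
  allPartyConsentStates: ReadonlySet<string>; // supplied by legal review, not by this sketch
}

// Recording is allowed only with explicit opt-in and outside all-party consent states.
function mayRecordCall(ctx: RecordingContext): boolean {
  if (!ctx.userOptedIn) return false;
  return !ctx.allPartyConsentStates.has(ctx.userState.toUpperCase());
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Retention is configurable from 0 to 30 days; 0 means delete right after analysis.
function deleteAfterMs(analysisDoneAtMs: number, retentionDays: number): number {
  const days = Math.min(30, Math.max(0, Math.floor(retentionDays)));
  return analysisDoneAtMs + days * MS_PER_DAY;
}
```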
|
||||
|
||||
tests:
|
||||
- Unit: Mock CallKit/Telecom callbacks, verify audio capture and upload logic
|
||||
- Integration: Test audio upload and analysis flow on device simulator
|
||||
- E2E: Receive call on device → record audio → upload → receive analysis notification
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] iOS app can record incoming call audio and upload to server for analysis
|
||||
- [ ] Android app can record incoming call audio and upload to server for analysis
|
||||
- [ ] Call recording only happens after explicit user opt-in
|
||||
- [ ] Two-party consent states are detected and recording is disabled (legal compliance)
|
||||
- [ ] Uploaded audio is analyzed by Azure Voice Live API and result pushed to device
|
||||
- [ ] Push notification sent within 30 seconds of analysis completion
|
||||
- [ ] In-app call summary shows: caller number, duration, analysis result, confidence score
|
||||
- [ ] Emergency hangup button available when synthetic voice detected
|
||||
- [ ] Audio data is encrypted in transit (TLS) and deleted after analysis (0-day retention default)
|
||||
- [ ] App handles backgrounding without losing call recording session
|
||||
- [ ] Recording doesn't interfere with normal call audio quality
|
||||
|
||||
validation:
|
||||
- iOS: Test on physical device (simulator doesn't support CallKit), verify recording and upload
|
||||
- Android: Test on physical device, verify Telecom API integration and notification delivery
|
||||
- Server: Verify `analyzeCallRecording` endpoint accepts multipart upload and returns analysis
|
||||
- Legal review: Confirm 2-party consent logic covers all US states correctly
|
||||
|
||||
notes:
|
||||
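- iOS CallKit extensions run in a separate process; share data via App Groups. Note that iOS does not expose cellular call audio to third-party apps, so the iOS recording path should be validated before committing to it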
|
||||
- Android Telecom API requires phone app to be default dialer (limited market penetration)
|
||||
- Alternative: Use accessibility service on Android for broader call recording (more invasive UX, and Google Play policy has restricted Accessibility API use for call recording since 2022)
|
||||
- Real-time analysis requires chunking audio into 3–5 second segments and streaming — much harder than post-call
|
||||
- Consider starting with post-call analysis and adding real-time as Phase 2
|
||||
- Audio file sizes: 1 minute of WAV at 16kHz mono = ~1.9MB; compress to AAC/MP3 for upload
|
||||
- The existing iOS `VoicePrintViewModel.swift` and Android `VoicePrintViewModel.kt` need updating
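
The file-size figure in the notes follows from the raw PCM data rate. The helper names below are hypothetical, and 16-bit samples are an assumption (the notes do not state the bit depth, but 16-bit is what yields roughly 1.9MB per minute).

```typescript
// Raw PCM payload size in bytes (WAV adds only a small header on top).
function pcmBytes(
  sampleRateHz: number,
  bitsPerSample: number,
  channels: number,
  seconds: number,
): number {
  return sampleRateHz * (bitsPerSample / 8) * channels * seconds;
}

// Number of streaming chunks needed for a call, e.g. 3 to 5 second segments.
function chunkCount(callSeconds: number, chunkSeconds: number): number {
  return Math.ceil(callSeconds / chunkSeconds);
}
```

For one minute at 16 kHz, 16-bit mono, `pcmBytes(16000, 16, 1, 60)` gives 1,920,000 bytes, and a single 5-second chunk at the same settings is 160,000 bytes.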
|
||||
81
tasks/core-services-implementation/13-correlation-engine.md
Normal file
@@ -0,0 +1,81 @@
|
||||
# 13. Cross-Service Threat Correlation Scoring and Unified Alert Feed
|
||||
|
||||
meta:
|
||||
id: core-services-13
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-05, core-services-07, core-services-08]
|
||||
tags: [correlation, threat-scoring, unified-alerts, intelligence, dashboard]
|
||||
|
||||
objective:
|
||||
- Activate the correlation service to cross-reference findings across VoicePrint, DarkWatch, SpamShield, HomeTitle, and RemoveBrokers, generating unified threat scores and correlated alert narratives that explain multi-vector attacks.
|
||||
|
||||
deliverables:
|
||||
- Cross-service correlation rules (e.g., breached email + spam call from same source = coordinated attack)
|
||||
- Unified threat score algorithm (0–100) per user and per family member
|
||||
- Correlated alert narratives: "Your email was breached on Monday, and today you received a spam call to that number — this may be a targeted attack"
|
||||
- Dashboard threat score widget with historical trend
|
||||
|
||||
steps:
|
||||
1. Analyze existing correlation service (`services/correlation/`):
|
||||
- Review current schema and logic in `correlation.service.ts`
|
||||
- Identify data sources available from each service
|
||||
2. Define correlation rules:
|
||||
- Rule 1: Same email found in HIBP breach AND receiving spam calls → coordinated attack (+30 threat score)
|
||||
- Rule 2: Property lien filed AND data broker listing active → identity theft in progress (+40 threat score)
|
||||
- Rule 3: Voice clone detected AND family member SSN on dark web → targeted family scam (+50 threat score)
|
||||
- Rule 4: Multiple breaches in 30 days → compromised identity (+20 threat score)
|
||||
- Rule 5: Spam call from number associated with known scam campaign → high risk (+25 threat score)
|
||||
3. Implement correlation detection pipeline:
|
||||
- Subscribe to alert creation events from all 5 services
|
||||
- Window function: look back 30 days for related findings
|
||||
- Match on shared entities (email, phone, SSN, address, name)
|
||||
4. Implement threat scoring algorithm:
|
||||
- Base score: sum of individual alert severities (info=1, warning=3, critical=5)
|
||||
- Correlation bonus: +20–50 per matched rule (matching the rule values in step 2)
|
||||
- Time decay: scores decrease by 10% per week (old alerts matter less)
|
||||
- Family aggregation: highest individual score + (average of the other members' scores) / 2
|
||||
- Cap at 100, floor at 0
|
||||
5. Implement unified alert feed:
|
||||
- Merge individual service alerts into chronological feed
|
||||
- Group correlated alerts into "attack narratives"
|
||||
- Show narrative summary: "3 related events detected — possible coordinated attack"
|
||||
6. Update dashboard widgets:
|
||||
- Threat Score widget: current score with color coding (green <30, yellow 30–60, red >60)
|
||||
- Trend graph: score over last 90 days
|
||||
- Alert Feed widget: unified feed with narrative grouping
|
||||
7. Add proactive recommendations:
|
||||
- If score > 60: recommend password changes, credit freeze, family notification
|
||||
- If HomeTitle + RemoveBrokers correlated: recommend title insurance review
|
||||
- If VoicePrint detected: recommend warning family members, filing FTC report
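
The scoring rules in step 4 and the color bands in step 6 could be expressed as pure functions, roughly as below. All names are hypothetical. Two points are assumptions because the steps leave them open: decay is applied per alert as a compounding 10% per week, and correlation bonuses are not decayed.

```typescript
type Severity = "info" | "warning" | "critical";

const SEVERITY_POINTS: Record<Severity, number> = { info: 1, warning: 3, critical: 5 };

interface ScoredAlert {
  severity: Severity;
  ageWeeks: number; // age of the alert in weeks
}

const clampScore = (x: number): number => Math.min(100, Math.max(0, x));

// Base = decayed severity points; bonuses are added undecayed (assumption).
function threatScore(alerts: ScoredAlert[], correlationBonuses: number[]): number {
  const base = alerts.reduce(
    (sum, a) => sum + SEVERITY_POINTS[a.severity] * Math.pow(0.9, a.ageWeeks),
    0,
  );
  const bonus = correlationBonuses.reduce((sum, b) => sum + b, 0);
  return clampScore(base + bonus);
}

// Family score = highest member score + (average of the others) / 2, capped at 100.
function familyScore(memberScores: number[]): number {
  if (memberScores.length === 0) return 0;
  const highest = memberScores.reduce((m, x) => Math.max(m, x), 0);
  const total = memberScores.reduce((sum, x) => sum + x, 0);
  const others = memberScores.length - 1;
  const avgOthers = others > 0 ? (total - highest) / others : 0;
  return clampScore(highest + avgOthers / 2);
}

// Dashboard bands: green < 30, yellow 30 to 60, red > 60.
function scoreBand(score: number): "green" | "yellow" | "red" {
  if (score > 60) return "red";
  if (score >= 30) return "yellow";
  return "green";
}
```

For example, one fresh critical alert plus one fresh warning with a single +30 correlation bonus scores 5 + 3 + 30 = 38, which lands in the yellow band.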
|
||||
|
||||
tests:
|
||||
- Unit: Mock alerts from multiple services, verify correlation rules fire correctly
|
||||
- Integration: Create correlated alerts in database, verify threat score calculation
|
||||
- E2E: Trigger breach alert + spam alert for same email → verify unified narrative created
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] Correlation rules detect cross-service relationships within 30-day window
|
||||
- [ ] Threat score is calculated from individual alert severities + correlation bonuses
|
||||
- [ ] Score decays by 10% per week (time-weighted relevance)
|
||||
- [ ] Family plan aggregates scores across members
|
||||
- [ ] Unified alert feed groups correlated events into narrative summaries
|
||||
- [ ] Dashboard threat score widget updates in real-time as new alerts arrive
|
||||
- [ ] Proactive recommendations appear based on current threat score and active correlations
|
||||
- [ ] Correlation engine keeps false positives low (test with 100 random alerts, <5% false correlation rate)
|
||||
- [ ] Historical trend graph shows score changes over 90 days
|
||||
- [ ] Each correlated narrative links to individual alert details
|
||||
|
||||
validation:
|
||||
- Run `vitest run correlation.test.ts` — all tests pass
|
||||
- Manual: Create test alerts (breached email + spam call), verify correlation detected
|
||||
- Dashboard: Threat score updates from 15 to 45 after the +30 Rule 1 correlation bonus is applied
|
||||
- Trend: 90-day graph shows spike during test period
|
||||
|
||||
notes:
|
||||
- The existing `correlation.service.ts` and `correlation.ts` router need activation — not just stubs
|
||||
- Correlation is the key differentiator from point-solution competitors (Aura, LifeLock)
|
||||
- False positive rate must be low — users will ignore alerts if too many are irrelevant
|
||||
- Consider using graph database (Neo4j) for complex relationship queries at scale
|
||||
- The existing `normalizedAlerts` table already stores cross-service alerts — use this as correlation source
|
||||
- Mobile apps should show simplified threat score and latest narrative, not full correlation graph
|
||||
91
tasks/core-services-implementation/14-family-plans.md
Normal file
@@ -0,0 +1,91 @@
|
||||
# 14. Family Plan Member Management, Billing Proration, and Multi-User Dashboard
|
||||
|
||||
meta:
|
||||
id: core-services-14
|
||||
feature: core-services-implementation
|
||||
priority: P2
|
||||
depends_on: [core-services-01]
|
||||
tags: [billing, family-plans, multi-user, proration, dashboard, member-management]
|
||||
|
||||
objective:
|
||||
- Implement family plan support: invite family members, manage their access, prorate billing on member changes, and provide a multi-user dashboard showing consolidated family security status.
|
||||
|
||||
deliverables:
|
||||
- Family member invitation system (email invites with acceptance flow)
|
||||
- Role-based access control (primary account holder vs. member)
|
||||
- Billing proration for adding/removing family members mid-cycle
|
||||
- Family dashboard showing all members' threat scores and alerts
|
||||
- Per-member service configuration (what each member monitors)
|
||||
|
||||
steps:
|
||||
1. Extend database schema:
|
||||
- Add `familyGroups` table: id, primaryUserId, planTier, maxMembers, createdAt
|
||||
- Add `familyMembers` table: id, familyGroupId, userId, role (primary/member), status (pending/active/removed), invitedAt, joinedAt
|
||||
- Add `familyInvitations` table: id, familyGroupId, email, token, expiresAt, acceptedAt
|
||||
2. Implement invitation flow:
|
||||
- Primary user sends invite by email → generates signed token
|
||||
- Invitee clicks link → creates account (if new) or links existing account
|
||||
- Invitation expires after 7 days
|
||||
- Send reminder email after 3 days if not accepted
|
||||
3. Implement member management:
|
||||
- Primary user can view all members, their active services, and threat scores
|
||||
- Primary user can remove members (prorated refund or credit)
|
||||
- Members can leave family group voluntarily
|
||||
- Members cannot see other members' sensitive data (SSN, specific breach details)
|
||||
4. Implement billing proration:
|
||||
- Add member mid-cycle: charge prorated amount for remaining days via Stripe
|
||||
- Remove member mid-cycle: credit prorated amount to account balance
|
||||
- Change plan tier: prorate difference, apply to next invoice
|
||||
- Use Stripe's `proration_behavior: 'create_prorations'` for all changes
|
||||
5. Implement family dashboard:
|
||||
- Sidebar shows family group name and member count
|
||||
- Main view: cards for each member with photo, name, threat score, recent alert count
|
||||
- Click member → detailed view with their services, alerts, and settings
|
||||
- Consolidated family threat score (from correlation engine)
|
||||
6. Implement per-member service configuration:
|
||||
- Primary user assigns which services each member gets
|
||||
- Default: all members get DarkWatch + SpamShield + RemoveBrokers
|
||||
- HomeTitle and VoicePrint limited by property/voice enrollment slots
|
||||
- Members can configure their own watchlist items within assigned services
|
||||
7. Implement notification routing:
|
||||
- Critical alerts notify primary user AND affected member
|
||||
- Billing notifications go to primary user only
|
||||
- Member can opt in to or out of specific alert types
|
||||
8. Add family plan tiers:
|
||||
- Family Fortress: 5 adults + unlimited children, $45/mo
|
||||
- Family Guard: 3 adults + unlimited children, $35/mo
|
||||
- Enforce max member limits at invitation time
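
The member limits, invitation timing, and a rough proration estimate for unit-test expectations could be sketched as follows. These helpers are hypothetical. Counting pending invitations and the primary user toward the adult limit is an assumption. The day-based proration is only an approximation for sanity checks, since Stripe computes the actual proration itself.

```typescript
// Adult limits from step 8 (tier keys are illustrative).
const MAX_ADULTS = { familyFortress: 5, familyGuard: 3 } as const;
type FamilyTier = keyof typeof MAX_ADULTS;

// Assumption: pending invitations reserve a seat, and the primary user counts as an adult.
function canInviteAdult(tier: FamilyTier, activeAdults: number, pendingInvites: number): boolean {
  return activeAdults + pendingInvites < MAX_ADULTS[tier];
}

const DAY_MS = 24 * 60 * 60 * 1000;
type InvitationState = "accepted" | "expired" | "needs_reminder" | "pending";

// Invitations expire after 7 days; a reminder is due from day 3 onward.
// (Tracking whether the reminder was already sent is left to the caller.)
function invitationState(
  sentAtMs: number,
  nowMs: number,
  acceptedAtMs: number | null,
): InvitationState {
  if (acceptedAtMs !== null) return "accepted";
  const ageMs = nowMs - sentAtMs;
  if (ageMs >= 7 * DAY_MS) return "expired";
  if (ageMs >= 3 * DAY_MS) return "needs_reminder";
  return "pending";
}

// Rough day-based estimate in cents, useful only as a test expectation.
function estimateProrationCents(
  monthlyCents: number,
  remainingDays: number,
  daysInCycle: number,
): number {
  return Math.round((monthlyCents * remainingDays) / daysInCycle);
}
```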
|
||||
|
||||
tests:
|
||||
- Unit: Proration calculation for add/remove/upgrade scenarios
|
||||
- Integration: Full invitation flow from email to account linking
|
||||
- E2E: Create family plan → invite 2 members → verify billing → remove member → verify prorated credit
|
||||
|
||||
acceptance_criteria:
|
||||
- [ ] Primary user can send email invitations to family members
|
||||
- [ ] Invitations expire after 7 days and can be resent
|
||||
- [ ] Members can accept invitations and join family group
|
||||
- [ ] Adding member mid-cycle creates prorated charge on next invoice
|
||||
- [ ] Removing member mid-cycle creates prorated credit on next invoice
|
||||
- [ ] Family dashboard shows all members with threat scores and alert counts
|
||||
- [ ] Primary user can configure which services each member has access to
|
||||
- [ ] Members cannot see other members' sensitive breach details (only score + summary)
|
||||
- [ ] Billing notifications route to primary user; security alerts route to affected member (critical alerts also notify the primary user)
|
||||
- [ ] Max member limits enforced at invitation (5 for Fortress, 3 for Guard)
|
||||
- [ ] Plan downgrade prevents inviting beyond new tier's member limit
|
||||
- [ ] All family plan changes handled via Stripe proration (no manual calculations)
|
||||
|
||||
validation:
|
||||
- Run `vitest run billing.test.ts` — extended tests for family proration
|
||||
- Manual: Send invitation to test email, click link, verify member joins family
|
||||
- Stripe Dashboard: Verify proration items appear on invoices after member changes
|
||||
- Dashboard: Family view shows 3 member cards with individual threat scores
|
||||
|
||||
notes:
|
||||
- Family plans have 30–50% lower churn than individual plans — this is a critical retention driver
|
||||
- Stripe's `proration_behavior` handles most math automatically — trust it
|
||||
- Children's accounts should be restricted: no dark web monitoring for minors, only spam/basic alerts
|
||||
- Consider adding "family safety alerts" — notify primary user if child receives suspicious contact
|
||||
- The existing `invitation.ts` schema may need extension for family-specific invitation tokens
|
||||
- Member removal should not delete their account — just unlink from family group
|
||||
- Children (under 18) should have simplified dashboard — no breach details, only "safe/attention needed"
|
||||
45
tasks/core-services-implementation/README.md
Normal file
@@ -0,0 +1,45 @@
|
||||
# Core Services Implementation
|
||||
|
||||
**Objective:** Convert all stub/placeholder services into production-ready implementations with real API integrations, enabling paid customer subscriptions and revenue.
|
||||
|
||||
**Status legend:** [ ] todo, [~] in-progress, [x] done
|
||||
|
||||
## Tasks
|
||||
|
||||
### Phase 1 — Foundation (Revenue Enabler)
|
||||
- [ ] 01 — Stripe Checkout, webhooks, and subscription state management → `01-stripe-checkout-webhooks.md`
|
||||
- [ ] 02 — Automated removal engine for top 20 data brokers → `02-removebrokers-top-20.md`
|
||||
|
||||
### Phase 2 — Core Services (Table Stakes)
|
||||
- [ ] 03 — HIBP API integration for email breach monitoring → `03-darkwatch-hibp.md`
|
||||
- [ ] 04 — SecurityTrails, Censys, Shodan API integrations → `04-darkwatch-attack-surface.md`
|
||||
- [ ] 05 — Periodic scan scheduling, WebSocket progress, alert deduplication → `05-darkwatch-scheduler.md`
|
||||
- [ ] 06 — Twilio Lookup and phone reputation API integration → `06-spamshield-reputation.md`
|
||||
- [ ] 07 — Fine-tuned DistilBERT SMS spam classifier with ONNX deployment → `07-spamshield-ml-classifier.md`
|
||||
|
||||
### Phase 3 — Scale & Expand
|
||||
- [ ] 08 — Expand broker coverage to 50+ with CAPTCHA solving → `08-removebrokers-50-plus.md`
|
||||
- [ ] 09 — Attom Data Solutions API for property record snapshots → `09-hometitle-attom-api.md`
|
||||
- [ ] 10 — County recorder web scrapers for top 100 US counties → `10-hometitle-county-scrapers.md`
|
||||
- [ ] 11 — Azure Voice Live API for synthetic voice detection → `11-voiceprint-azure-api.md`
|
||||
|
||||
### Phase 4 — Differentiation & Polish
|
||||
- [ ] 12 — iOS CallKit and Android Telecom API for real-time call analysis → `12-voiceprint-mobile-integration.md`
|
||||
- [ ] 13 — Cross-service threat correlation scoring and unified alert feed → `13-correlation-engine.md`
|
||||
- [ ] 14 — Family plan member management, billing proration, multi-user dashboard → `14-family-plans.md`
|
||||
|
||||
## Dependencies
|
||||
- 02 → 08 (expand broker automation after initial 20 work)
|
||||
- 03 → 04 → 05 (HIBP before attack surface APIs before scheduling)
|
||||
- 06 → 07 (reputation APIs before ML classifier)
|
||||
- 09 → 10 (Attom API before county scraping fallback)
|
||||
- 11 → 12 (Azure API before mobile integration)
|
||||
- 01 → 14 (billing before family plan management)
|
||||
- 05, 07, 08 → 13 (core services feed into correlation engine)
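
The dependency list above can be checked mechanically with a topological sort, which also yields one valid build order. The sketch below uses Kahn's algorithm. `TASK_IDS` and `README_EDGES` simply restate the arrows listed above, and `topoOrder` is a hypothetical helper rather than anything in the repo.

```typescript
const TASK_IDS: string[] = [
  "01", "02", "03", "04", "05", "06", "07",
  "08", "09", "10", "11", "12", "13", "14",
];

// [prerequisite, dependent] pairs, copied from the Dependencies list.
const README_EDGES: Array<[string, string]> = [
  ["02", "08"], ["03", "04"], ["04", "05"], ["06", "07"], ["09", "10"],
  ["11", "12"], ["01", "14"], ["05", "13"], ["07", "13"], ["08", "13"],
];

// Kahn's algorithm: repeatedly emit tasks whose prerequisites are all done.
function topoOrder(tasks: string[], edges: Array<[string, string]>): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  tasks.forEach((t) => {
    indegree.set(t, 0);
    dependents.set(t, []);
  });
  edges.forEach(([from, to]) => {
    const out = dependents.get(from);
    if (out === undefined || !indegree.has(to)) {
      throw new Error("unknown task in edge " + from + " -> " + to);
    }
    out.push(to);
    indegree.set(to, (indegree.get(to) ?? 0) + 1);
  });

  const ready = tasks.filter((t) => indegree.get(t) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const task = ready.shift() as string;
    order.push(task);
    (dependents.get(task) ?? []).forEach((next) => {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    });
  }
  if (order.length !== tasks.length) throw new Error("dependency cycle detected");
  return order;
}
```

Calling `topoOrder(TASK_IDS, README_EDGES)` returns all 14 tasks in an order where every prerequisite comes before its dependent, and throws if a cycle is ever introduced.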
|
||||
|
||||
## Exit Criteria
|
||||
- All 5 core services make real API calls or run real ML inference — no stub responses remain in production code
|
||||
- Billing supports Stripe Checkout, webhooks, tier upgrades/downgrades, and trial periods
|
||||
- A paying customer can sign up, receive real alerts, and see tangible value within 48 hours
|
||||
- Mobile apps display real data from all working services
|
||||
- No `crypto.randomUUID()`, `isSynthetic: false`, `isSpam: false`, or `Unknown Owner` mock responses in production paths
|
||||
@@ -7,10 +7,10 @@ Status legend: [ ] todo, [~] in-progress, [x] done
|
||||
## Tasks
|
||||
|
||||
### App Store Preparation
|
||||
- [ ] 01 — App Store Screenshots & Metadata → `01-app-store-screenshots.md`
|
||||
- [ ] 02 — App Preview Video → `02-app-preview-video.md`
|
||||
- [ ] 03 — App Store Connect Configuration → `03-app-store-connect.md`
|
||||
- [ ] 04 — TestFlight Beta Distribution → `04-testflight-beta.md`
|
||||
- [x] 01 — App Store Screenshots & Metadata → `01-app-store-screenshots.md`
|
||||
- [x] 02 — App Preview Video → `02-app-preview-video.md`
|
||||
- [x] 03 — App Store Connect Configuration → `03-app-store-connect.md`
|
||||
- [x] 04 — TestFlight Beta Distribution → `04-testflight-beta.md`
|
||||
|
||||
### Security Hardening
|
||||
- [ ] 05 — Certificate Pinning & TLS Validation → `05-certificate-pinning.md`
|
||||
|
||||