Kordant/docs/PRODUCT-GAP-ANALYSIS.md
2026-05-31 22:03:18 -04:00

# Kordant: Product Gap Analysis & Path to Revenue
**Date:** May 31, 2026
**Scope:** What's functional vs. scaffolding, what's needed to ship, expected customer value, pricing
---
## Executive Summary
Kordant is a **well-architected platform with mostly scaffolding implementations**. The codebase has excellent structure — tRPC routers, Drizzle ORM schemas, service layers, job handlers, mobile apps, and a Rust queueing library (Honker). However, **none of the five core services deliver real value to a paying customer today**. The ML models return stub data, external API integrations are placeholders, and data sources return mock results.
**Bottom line:** You have the platform skeleton. You need to build the muscles.
| Service | Status | Lines of Code | Real Functionality | Effort to Ship |
|---------|--------|---------------|-------------------|----------------|
| **VoicePrint** | ❌ Pure scaffolding | ~240 | None; returns `isSynthetic: false` | 6-12 months, $100K-$500K |
| **DarkWatch** | ⚠️ Architecture only | ~500+ | Circuit breakers, alert pipeline, CRUD; no real API calls | 2-4 months, $20K-$50K |
| **SpamShield** | ⚠️ Rule engine only | ~400+ | Pattern matching works; ML & reputation APIs are stubs | 2-3 months, $15K-$40K |
| **HomeTitle** | ❌ Scaffolding | ~300 | Geocoding works; county records return mock data | 3-6 months, $30K-$80K |
| **RemoveBrokers** | ⚠️ Registry only | ~1,500+ | Broker registry (100+ entries); removal engine is placeholder | 2-4 months, $20K-$50K |
| **Billing** | ⚠️ Minimal | ~100 | Stripe client; no webhooks, proration, or checkout | 1-2 months, $10K-$20K |
| **Auth** | ✅ Functional | ~200 | JWT + bcrypt working | Done |
---
## 1. Current State: What Actually Works
### ✅ Functional (Shippable Today)
- **Authentication:** JWT signing/verification (jose), password hashing (bcrypt, 10 rounds). Solid implementation.
- **Database Schema:** Complete Drizzle ORM schemas for all 5 services, alerts, billing, subscriptions, audit logs.
- **tRPC API Layer:** Router scaffolding for all services with proper Zod schemas.
- **Dashboard UI:** Web dashboard with sidebar, threat score widget, alert feed, service widgets.
- **Mobile Apps:** iOS (SwiftUI) and Android (Compose) with ViewModels, Models, and navigation. Thin clients calling tRPC.
- **Browser Extension:** Chrome Manifest V3 extension shell.
- **Honker (Rust):** Queueing library for background jobs, FFI bindings.
- **Geocoding:** Google Maps API integration in HomeTitle (works if API key provided).
- **SpamShield Rule Engine:** Regex/area code/prefix pattern matching works.
- **DarkWatch Alert Pipeline:** Severity scoring, exposure deduplication, alert creation logic.
- **RemoveBrokers Registry:** 100+ broker entries with domains, removal URLs, categories.
### ❌ Not Functional (Scaffolding/Placeholders)
| Component | What It Does | What It Should Do |
|-----------|-------------|-------------------|
| **VoicePrint ML Engine** | Returns `{ isSynthetic: false, confidence: 1.0, score: 0.0 }` | Detect AI-generated voices in real-time |
| **VoicePrint Voice Matching** | Returns `{ similarity: 0, matched: false }` | Compare voice against enrolled templates |
| **VoicePrint Embedding** | Returns a zero-filled `Float64Array(256)` + SHA-256 hash | Generate voice embeddings for enrollment |
| **DarkWatch Scan Engine** | Has circuit breaker structure — no actual API calls to HIBP, SecurityTrails, Censys, Shodan | Query real breach databases and dark web sources |
| **SpamShield ML Engine** | `classifyTextBERT()` returns `{ isSpam: false, confidence: 1.0 }` | Classify SMS/call text as spam using ML |
| **SpamShield Reputation API** | Hiya/Truecaller lookups return `{ score: 0, isSpam: false }` | Query real phone reputation databases |
| **HomeTitle County Scanner** | Returns `{ ownerName: "Unknown Owner", address: {} }` | Fetch real county deed records |
| **HomeTitle HTML Parser** | `parseDeedRecords()` logs "not yet implemented" and returns null | Parse county record HTML/JSON responses |
| **RemoveBrokers Removal Engine** | Returns `{ success: true, requestId: crypto.randomUUID() }` | Actually submit opt-out requests to brokers |
| **RemoveBrokers Email** | Returns `{ success: true }` without sending anything | Send opt-out emails to broker addresses |
| **RemoveBrokers Status Tracking** | Returns `{ status: "pending" }` always | Poll brokers for actual removal status |
| **Billing Webhooks** | No webhook handler implemented | Handle Stripe webhook events (checkout, renewal, cancel) |
| **Billing Checkout** | No checkout session creation | Create Stripe Checkout sessions for subscription plans |
---
## 2. Gap Analysis by Service
### VoicePrint — Voice Clone Detection
**Current:** 56-line ML engine, all stubs. No audio processing, no model loading, no inference.
**What's needed for a working product:**
1. **API-first approach (fastest):**
- Integrate Microsoft Azure Voice Live API (~$0.016/min) for liveness detection
- Integrate Pindrop or Daon API for passive detection
- Estimated cost: $60K-$230K/year at scale
2. **Build in-house (differentiating but expensive):**
- Deploy AASIST or RawNet2 model (open-source from ASVspoof 2021)
- GPU inference infrastructure (NVIDIA T4/A10, $300-$800/mo per node)
- Audio preprocessing pipeline (VAD, resampling, normalization)
- Enrollment system (collect voice samples, generate embeddings)
- Estimated cost: $840K-$1.25M Year 1
3. **Mobile integration:**
- iOS: Integrate with CallKit for real-time call analysis
- Android: Integrate with Telecom API
- On-device inference for low-latency screening
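
The audio preprocessing step listed above (VAD, normalization) can be sketched as follows. This is a minimal illustration, not the VoicePrint pipeline: the function names, the fixed frame size, and the RMS threshold are all assumptions, and a production system would more likely use a trained VAD than a raw energy gate.

```typescript
// Hypothetical preprocessing helpers; names and thresholds are illustrative only.

/** Scale samples so the loudest one has absolute value `targetPeak`. */
function peakNormalize(samples: Float32Array, targetPeak: number = 1.0): Float32Array {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    peak = Math.max(peak, Math.abs(samples[i] ?? 0));
  }
  if (peak === 0) return samples.slice(); // all-zero input: return an unscaled copy
  const gain = targetPeak / peak;
  return samples.map((s) => s * gain);
}

/** Flag each full frame as speech when its RMS energy exceeds `rmsThreshold`. */
function energyVad(samples: Float32Array, frameSize: number, rmsThreshold: number): boolean[] {
  const flags: boolean[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    let sumSquares = 0;
    for (let i = start; i < start + frameSize; i++) {
      const v = samples[i] ?? 0;
      sumSquares += v * v;
    }
    flags.push(Math.sqrt(sumSquares / frameSize) > rmsThreshold);
  }
  return flags;
}
```

Only frames flagged as speech would be forwarded to the detection model or API, which reduces per-minute inference cost on calls with long silences.
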
**Market reality:** Voice clone detection is the most technically ambitious service. Hiya and Truecaller have carrier-level integrations you can't replicate without carrier partnerships. Your differentiator should be **consumer-facing analysis** (record a suspicious call → analyze → report), not real-time PSTN interception.
**Effort:** 6-12 months to MVP, $100K-$500K
**Revenue potential:** High — this is the most novel service in your suite. Competitors don't offer this to consumers.
---
### DarkWatch — Dark Web & Breach Monitoring
**Current:** Best-implemented service. Has scan engine architecture, circuit breakers, alert pipeline, watchlist CRUD, exposure dedup. Missing: actual API calls to external data sources.
**What's needed for a working product:**
1. **API integrations (the core work):**
- **HaveIBeenPwned (HIBP):** Free tier (1,500 req/mo) → Paid ($3.50/mo individual). Check emails against breach database.
- **SecurityTrails:** $49/mo Pro plan. DNS/WHOIS monitoring for domain exposure.
- **Censys:** $79/mo Pro. Internet-wide scanning for exposed services.
- **Shodan:** $299/mo Small Business. IoT/device exposure monitoring.
- **Optional — Breachsense:** $199/mo for deep dark web scanning.
2. **Data pipeline:**
- Implement actual `fetchWithCircuit()` calls to each API
- Parse and normalize responses into your exposure schema
- Schedule periodic scans (daily/weekly depending on tier)
- WebSocket push for real-time scan progress
3. **Alert quality:**
- Your severity scoring logic is already implemented
- Add alert fatigue reduction (dedup, cooldown periods)
- Email + push notification delivery
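
For readers unfamiliar with the pattern, the circuit breaker that a wrapper such as `fetchWithCircuit()` consults could look roughly like this. It is a sketch under assumptions, not the existing DarkWatch code: the class name, thresholds, and the caller-supplied clock are hypothetical, and the half-open state is simplified to allow any request rather than a single probe.

```typescript
// Hypothetical breaker; the real DarkWatch implementation may differ.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  /** `now` is a millisecond timestamp supplied by the caller (e.g. Date.now()). */
  state(now: number): BreakerState {
    if (this.openedAt === null) return "closed";
    return now - this.openedAt >= this.cooldownMs ? "half-open" : "open";
  }

  /** Requests are blocked only while the breaker is fully open. */
  canRequest(now: number): boolean {
    return this.state(now) !== "open";
  }

  /** A successful call closes the breaker and clears the failure count. */
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  /** Consecutive failures at or above the threshold (re)open the breaker. */
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }
}
```

One breaker instance per upstream source (HIBP, SecurityTrails, Censys, Shodan) keeps an outage at one provider from stalling scans against the others.
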
**Monthly API costs at scale:** ~$500-$1,000/mo for base data sources
**Per-customer API cost:** ~$0.50-$2.00/mo (amortized across user base)
**Effort:** 2-4 months, $20K-$50K
**Revenue potential:** Medium — crowded market (Aura, LifeLock, Experian all offer this). Must differentiate on alert quality and multi-source correlation.
---
### SpamShield — Spam Call/SMS Classification
**Current:** Rule engine works (pattern matching, area code, prefix). ML engine and reputation APIs are stubs.
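
As a rough picture of what a rule engine of this kind does, the sketch below matches a phone number against area-code, prefix, and regex rules. The rule shapes and function names are assumptions for illustration and are not taken from the SpamShield source; the normalization assumes North American numbers.

```typescript
// Hypothetical rule shapes; not the actual SpamShield types.
type SpamRule =
  | { kind: "areaCode"; areaCode: string }
  | { kind: "prefix"; prefix: string }
  | { kind: "regex"; pattern: RegExp }; // avoid the /g flag: test() would become stateful

/** Strip formatting and a leading NANP country code, leaving national digits. */
function nationalDigits(phone: string): string {
  const digits = phone.replace(/\D/g, "");
  return digits.length === 11 && digits.charAt(0) === "1" ? digits.slice(1) : digits;
}

function matchesRule(phone: string, rule: SpamRule): boolean {
  const national = nationalDigits(phone);
  switch (rule.kind) {
    case "areaCode":
      return national.length === 10 && national.slice(0, 3) === rule.areaCode;
    case "prefix":
      return national.slice(0, rule.prefix.length) === rule.prefix;
    case "regex":
      return rule.pattern.test(national);
  }
}

/** Return the first rule that matches, or undefined when none do. */
function firstMatchingRule(phone: string, rules: SpamRule[]): SpamRule | undefined {
  for (const rule of rules) {
    if (matchesRule(phone, rule)) return rule;
  }
  return undefined;
}
```

Because rules are evaluated in order and the first hit wins, cheap exact checks can be listed ahead of regexes, with reputation and ML lookups reserved for numbers that no rule matches.
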
**What's needed for a working product:**
1. **Reputation API integrations:**
- **Hiya API:** Phone number reputation scoring. Carrier-level integration preferred but API available.
- **Truecaller API:** Caller ID and spam labeling.
- **Twilio Lookup API:** $0.004-$0.03 per lookup. Caller name + line type.
- **STIR/SHAKEN verification:** Call authentication (requires telecom partner).
2. **ML text classification:**
- Fine-tune lightweight model (DistilBERT or TinyBERT) on SMS spam dataset
- Deploy as ONNX model for low-latency inference
- Training data: Enron Spam Corpus, SMS Spam Collection, custom labeled data
3. **Mobile integration:**
- iOS: CallKit integration for real-time caller screening
- Android: Telecom API for call filtering
- SMS interception (requires carrier permissions or SMS app integration)
**Monthly API costs:** Twilio Lookup ~$0.004/lookup. Hiya/Truecaller custom pricing.
**Per-customer cost:** ~$1-$5/mo depending on call volume.
**Effort:** 2-3 months, $15K-$40K
**Revenue potential:** Medium-High — Hiya/Truecaller dominate at carrier level, but consumer-facing spam classification with AI detection is underserved.
---
### HomeTitle — Property Deed Monitoring
**Current:** Geocoding works (Google Maps API). County records fetcher returns mock data. HTML parser not implemented. Change detection logic is solid.
**What's needed for a working product:**
1. **County data sources (the hard part):**
- **US county recorder APIs:** ~3,000 counties, each with different data formats
- **Commercial aggregators:**
- **Attom Data Solutions:** Property records API, ~$0.05-$0.10/record
- **CoreLogic:** Property intelligence, enterprise pricing
- **Black Knight (now part of ICE):** Property data, enterprise pricing
- **County-specific APIs:** Some counties offer open data (e.g., Cook County IL, Harris County TX)
- **Web scraping fallback:** Parse county recorder websites (fragile, requires maintenance)
2. **Monitoring pipeline:**
- Initial property snapshot (owner, deed date, liens, tax info)
- Periodic re-scan (weekly/monthly)
- Change detection (your logic is already implemented)
- Alert generation (ownership transfer, lien filing, tax change)
3. **Property verification:**
- Geocoding → parcel ID lookup → county record fetch
- Handle counties without digital records (mail-based requests)
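
The change-detection step in the monitoring pipeline compares two snapshots and emits the alert types listed above. The sketch below is an illustration only and is not the existing HomeTitle logic: the snapshot fields, the alert names, and the name-normalization rule are all assumptions.

```typescript
// Hypothetical snapshot shape; real county records carry many more fields.
interface PropertySnapshot {
  ownerName: string;
  lienCount: number;
  assessedTaxCents: number;
}

type TitleChange = "ownership_transfer" | "lien_filing" | "tax_change";

/** Collapse case and whitespace so cosmetic differences do not trigger alerts. */
function normalizeOwnerName(name: string): string {
  return name.trim().replace(/\s+/g, " ").toUpperCase();
}

/** Compare the previous and latest snapshot and list the changes worth alerting on. */
function detectTitleChanges(prev: PropertySnapshot, next: PropertySnapshot): TitleChange[] {
  const changes: TitleChange[] = [];
  if (normalizeOwnerName(prev.ownerName) !== normalizeOwnerName(next.ownerName)) {
    changes.push("ownership_transfer");
  }
  if (next.lienCount > prev.lienCount) changes.push("lien_filing");
  if (next.assessedTaxCents !== prev.assessedTaxCents) changes.push("tax_change");
  return changes;
}
```

Normalizing before comparing matters here because county sources format the same owner name inconsistently, and a false ownership-transfer alert is the most alarming one this service can send.
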
**Monthly data costs:** Attom ~$500-$5,000/mo depending on volume.
**Per-customer cost:** ~$2-$10/mo depending on scan frequency.
**Effort:** 3-6 months, $30K-$80K
**Revenue potential:** Medium — unique differentiator. No major competitor offers this in consumer identity protection. Real estate fraud is rising (FTC reports $1B+ in property fraud annually).
---
### RemoveBrokers — Data Broker Opt-Out
**Current:** Broker registry with 100+ entries (solid). Removal engine is a placeholder that returns mock request IDs. Email sending not implemented. Form submission not implemented.
**What's needed for a working product:**
1. **Automated removal engine:**
- **Headless browser automation:** Playwright/Puppeteer for each broker's opt-out flow
- **Form filling:** Dynamic form field detection and population
- **CAPTCHA handling:** 2Captcha/AntiCaptcha integration ($0.001-$0.01/solve)
- **Email verification:** Handle opt-out confirmation emails
- **Physical mail:** Generate and mail opt-out letters for brokers requiring it
2. **Broker-specific adapters:**
- Each of 100+ brokers has unique opt-out flow
- Estimated 2-5 hours per broker to implement and test
- Ongoing maintenance: 15-25% of scripts break per quarter
3. **Re-scan pipeline:**
- Periodic re-scans to detect re-listings
- Status tracking and progress reporting
4. **Competitor benchmark:**
- **DeleteMe:** 300+ brokers, $139/yr individual, $329/yr family
- **Kanary:** 400+ brokers, $132/yr individual, $264/yr family
- **OneRep:** 200+ brokers, $180/yr individual
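
The adapter estimates above translate into a build and maintenance budget with simple arithmetic. The helper names below are hypothetical, and the calculation assumes every broker falls inside the 2-5 hour and 15-25% ranges.

```typescript
// Hypothetical helpers for sizing adapter work from the per-broker estimates.
interface NumRange {
  low: number;
  high: number;
}

/** Total hours to implement and test adapters for `brokerCount` brokers. */
function adapterBuildHours(brokerCount: number, hoursPerBroker: NumRange): NumRange {
  return { low: brokerCount * hoursPerBroker.low, high: brokerCount * hoursPerBroker.high };
}

/** Expected number of scripts needing repair each quarter, rounded to whole scripts. */
function scriptsBrokenPerQuarter(brokerCount: number, breakRate: NumRange): NumRange {
  return {
    low: Math.round(brokerCount * breakRate.low),
    high: Math.round(brokerCount * breakRate.high),
  };
}

const perBroker: NumRange = { low: 2, high: 5 };
const quarterlyBreakRate: NumRange = { low: 0.15, high: 0.25 };

console.log(adapterBuildHours(100, perBroker)); // { low: 200, high: 500 }
console.log(scriptsBrokenPerQuarter(100, quarterlyBreakRate)); // { low: 15, high: 25 }
```

At 100 brokers that is 200-500 hours of initial build and 15-25 adapters to repair every quarter, so maintenance is a standing cost rather than a one-time cleanup.
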
**Monthly operational costs:** Proxies ($1K-$6K), CAPTCHA solving ($3-$8/customer), compute ($1K-$5K)
**Per-customer cost:** ~$13-$53/year (high margin: 60-90%)
**Effort:** 2-4 months for initial 50 brokers, then incremental
**Revenue potential:** Medium — competitive market but high margins. Your advantage: bundling with other services.
---
### Billing & Payments
**Current:** Stripe client initialized. No checkout, webhooks, or subscription management.
**What's needed:**
1. **Stripe Checkout integration:**
- Create checkout sessions for each plan tier
- Handle success/cancel redirects
- Customer portal for subscription management
2. **Webhook handlers:**
- `checkout.session.completed` → activate subscription
- `invoice.payment_succeeded` → renew subscription
- `invoice.payment_failed` → grace period, retry
- `customer.subscription.deleted` → cancel access
- `customer.subscription.updated` → tier changes
3. **Subscription management:**
- Trial periods (14-day free trial)
- Tier upgrades/downgrades with proration
- Family plan member management
- Grace period before suspension
4. **Plan structure:**
- See pricing recommendations below
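
The webhook mapping above reduces to a dispatch on the Stripe event type plus a guard against duplicate deliveries. The sketch below shows only that routing logic. The action names and the in-memory dedupe store are assumptions; signature verification and persistence are deliberately left out and would be required in a real handler.

```typescript
// Hypothetical action names; the event type strings are the ones listed above.
type SubscriptionAction =
  | "activate"
  | "renew"
  | "start_grace_period"
  | "revoke_access"
  | "sync_tier"
  | "ignore";

/** Map a Stripe event type to the subscription action the handler should take. */
function actionForStripeEvent(eventType: string): SubscriptionAction {
  switch (eventType) {
    case "checkout.session.completed":
      return "activate";
    case "invoice.payment_succeeded":
      return "renew";
    case "invoice.payment_failed":
      return "start_grace_period";
    case "customer.subscription.deleted":
      return "revoke_access";
    case "customer.subscription.updated":
      return "sync_tier";
    default:
      return "ignore"; // unhandled event types are acknowledged and skipped
  }
}

/** In-memory idempotency guard; a real deployment would persist processed IDs. */
class ProcessedEvents {
  private readonly seen: Record<string, boolean> = {};

  /** Returns true the first time an event ID is offered, false on repeats. */
  markIfNew(eventId: string): boolean {
    if (this.seen[eventId] === true) return false;
    this.seen[eventId] = true;
    return true;
  }
}
```

Webhook deliveries can be retried, so checking `markIfNew` before applying an action keeps a repeated `checkout.session.completed` from activating the same subscription twice.
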
**Effort:** 1-2 months, $10K-$20K
**Revenue potential:** N/A (enables all revenue)
---
## 3. Recommended Build Priority
Based on effort vs. market differentiation:
| Priority | Service | Why | Effort | Revenue Impact |
|----------|---------|-----|--------|----------------|
| **1** | **RemoveBrokers** | Highest margin (60-90%), existing registry, clear competitor benchmark | 2-4 mo | Direct revenue, $11-$27/mo |
| **2** | **DarkWatch** | Best architecture, API integrations needed, table-stakes feature | 2-4 mo | Core retention driver |
| **3** | **SpamShield** | Rule engine works, needs reputation APIs + ML | 2-3 mo | Differentiation vs. competitors |
| **4** | **Billing** | Enables all revenue, must ship before paid plans | 1-2 mo | Revenue enabler |
| **5** | **HomeTitle** | Unique differentiator, but data sourcing is hard | 3-6 mo | Premium tier upsell |
| **6** | **VoicePrint** | Most novel, but highest effort and cost | 6-12 mo | Brand differentiation |

**Recommended MVP scope:** RemoveBrokers + DarkWatch + SpamShield + Billing = **5-8 months to first revenue**.
---
## 4. Pricing Strategy
### Recommended Plan Structure
| Plan | Monthly Price | Annual Price | Features |
|------|--------------|--------------|----------|
| **Shield** (Entry) | $12/mo | $9/mo ($108/yr) | DarkWatch (basic), SpamShield, RemoveBrokers (50 brokers) |
| **Guard** (Core) | $22/mo | $18/mo ($216/yr) | All Shield + DarkWatch (full), RemoveBrokers (200+), HomeTitle (1 property) |
| **Fortress** (Premium) | $35/mo | $29/mo ($348/yr) | All Guard + HomeTitle (3 properties), VoicePrint, priority alerts, family (2 adults) |
| **Family Fortress** | $45/mo | $39/mo ($468/yr) | All Fortress + 5 adults + unlimited children |
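
The plan table can be expressed as a small config that the billing layer reads, which also makes the annual totals and discounts checkable. The `PricingPlan` shape and plan IDs below are hypothetical; the prices are the ones in the table.

```typescript
// Hypothetical plan config; IDs and field names are illustrative.
interface PricingPlan {
  id: string;
  monthlyUsd: number; // price when billed monthly
  annualPerMonthUsd: number; // effective monthly price when billed annually
}

const PRICING_PLANS: PricingPlan[] = [
  { id: "shield", monthlyUsd: 12, annualPerMonthUsd: 9 },
  { id: "guard", monthlyUsd: 22, annualPerMonthUsd: 18 },
  { id: "fortress", monthlyUsd: 35, annualPerMonthUsd: 29 },
  { id: "family_fortress", monthlyUsd: 45, annualPerMonthUsd: 39 },
];

/** Amount charged once per year on the annual plan. */
function annualTotalUsd(plan: PricingPlan): number {
  return plan.annualPerMonthUsd * 12;
}

/** Annual-billing discount relative to monthly billing, as a whole percent. */
function annualDiscountPct(plan: PricingPlan): number {
  return Math.round((1 - plan.annualPerMonthUsd / plan.monthlyUsd) * 100);
}

for (const plan of PRICING_PLANS) {
  console.log(plan.id, annualTotalUsd(plan), annualDiscountPct(plan) + "%");
}
```

Under these prices the annual discount shrinks from 25% on Shield to about 13% on Family Fortress, so the incentive to prepay is weakest on the highest-value plan.
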
### Competitive Positioning
| Your Plan | vs. Aura | vs. DeleteMe | vs. LifeLock |
|-----------|----------|-------------|--------------|
| Shield ($12) | Matches Aura Individual | Cheaper than DeleteMe ($11.58/mo) only on annual billing ($9/mo) | Cheaper than LifeLock Select |
| Guard ($22) | Below Aura Family | N/A (DeleteMe is removal-only) | Below LifeLock Advantage |
| Fortress ($35) | Below Aura Family | N/A | Below LifeLock Ultimate |
| Family ($45) | Above Aura Family ($37) | Above DeleteMe Family ($27.42) | Above LifeLock Family |
### Expected Unit Economics
| Metric | Estimate | Basis |
|--------|----------|-------|
| **ARPU (blended)** | $18-$25/mo | Mix of tiers, family plans raise ARPU |
| **Gross margin** | 65-75% | API costs, infrastructure, support |
| **CAC (organic)** | $50-$150 | Content marketing, word-of-mouth |
| **CAC (paid)** | $200-$400 | Google Ads, affiliate |
| **Monthly churn (individual)** | 3-5% | Industry benchmark |
| **Monthly churn (family)** | 1-2% | Higher switching costs |
| **LTV (individual)** | $600-$1,200 | 24-mo avg life, $20 ARPU |
| **LTV (family)** | $1,600-$2,400 | 48-mo avg life, $45 ARPU |
| **LTV:CAC (organic)** | 4-8x | Healthy |
| **LTV:CAC (paid)** | 2-4x | Marginal |
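
The LTV rows follow from ARPU multiplied by average customer lifetime, and lifetime can be approximated as the inverse of monthly churn. The helpers below are a sketch with hypothetical names; they compute revenue LTV and leave margin as an optional factor.

```typescript
// Hypothetical helpers for checking the unit-economics table.

/** Approximate average customer lifetime in months from a monthly churn rate. */
function lifetimeMonthsFromChurn(monthlyChurn: number): number {
  return 1 / monthlyChurn;
}

/** Lifetime value: ARPU times lifetime, optionally scaled by gross margin. */
function ltvUsd(arpuUsd: number, lifetimeMonths: number, grossMargin: number = 1): number {
  return arpuUsd * lifetimeMonths * grossMargin;
}

/** LTV to CAC ratio. */
function ltvToCac(ltv: number, cacUsd: number): number {
  return ltv / cacUsd;
}

console.log(ltvUsd(20, 24)); // 480
console.log(ltvUsd(45, 48)); // 2160
```

With the stated basis, the family figure ($2,160) sits inside the table's range, while the individual figure (24 months at $20 is $480) lands below the $600 lower bound, so that range presumably assumes a longer lifetime or a higher ARPU than the basis column states. The 24-month lifetime itself is consistent with 3-5% monthly churn, which implies roughly 20-33 months.
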
---
## 5. What Customers Actually Get (When Working)
### Monthly Value Perception
| Service | Customer Perceives | Actual Value |
|---------|-------------------|--------------|
| **VoicePrint** | "They detected a scam call cloning my daughter's voice" | Highest emotional impact, brand-defining |
| **DarkWatch** | "They found my email in a breach I didn't know about" | Table-stakes, expected by all competitors |
| **SpamShield** | "They blocked 47 spam calls this month" | Daily utility, high engagement |
| **HomeTitle** | "They caught a fraudulent lien on my house" | Highest dollar impact ($10K-$100K+ saved) |
| **RemoveBrokers** | "They removed me from 127 people-search sites" | Tangible progress, visible results |
### Customer Loyalty Drivers
1. **Alert quality (not quantity):** One perfect alert > 20 noise alerts. Your correlation engine should reduce false positives.
2. **Family plan lock-in:** Once a family is enrolled, switching costs are high.
3. **Visible progress:** RemoveBrokers dashboard showing "127/300 removed" drives retention.
4. **Crisis response:** When a major breach hits (e.g., Change Healthcare 2024), proactive alerts create loyalty spikes.
5. **Mobile app quality:** Credit lock/unlock, real-time alerts, one-tap actions.
---
## 6. Infrastructure Costs at Scale
### Monthly Fixed Costs
| Component | 100 Users | 1,000 Users | 10,000 Users |
|-----------|-----------|-------------|--------------|
| **Turso (SQLite)** | $0-$25 | $25-$100 | $100-$500 |
| **Redis** | $0-$15 | $15-$50 | $50-$200 |
| **HIBP API** | $0 (free tier) | $3.50 | $50+ |
| **SecurityTrails** | $49 | $49 | $249 |
| **Censys** | $79 | $79 | $299 |
| **Shodan** | $299 | $299 | $599 |
| **Twilio (SpamShield)** | $5-$20 | $20-$100 | $100-$500 |
| **Attom (HomeTitle)** | $500 | $1,000 | $5,000 |
| **Azure Voice Live** | $0 (dev) | $100-$500 | $500-$5,000 |
| **Proxies (RemoveBrokers)** | $100 | $500 | $2,000 |
| **CAPTCHA solving** | $10 | $50 | $200 |
| **Compute (SolidStart)** | $50 | $200 | $1,000 |
| **Total Fixed** | ~$1,200 | ~$2,500 | ~$16,000 |
### Per-User Variable Costs
| Service | Cost/User/Month | Notes |
|---------|-----------------|-------|
| DarkWatch | $0.50-$2.00 | Amortized API costs |
| SpamShield | $1.00-$5.00 | Twilio lookups, ML inference |
| HomeTitle | $2.00-$10.00 | Attom record lookups |
| RemoveBrokers | $1.00-$4.00 | Proxy + CAPTCHA + compute |
| VoicePrint | $0.50-$3.00 | Azure API or GPU inference |
| **Total** | **$5.00-$24.00** | Depends on usage |

At $18/mo average ARPU and $10/mo variable cost, **gross margin is ~44%** at early scale. Improves to **65-75%** as API costs amortize and you negotiate volume pricing.
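
The margin figure is (ARPU - variable cost) / ARPU, which the sketch below reproduces along with the variable cost a target margin allows. Function names are hypothetical, and fixed costs are excluded, as in the sentence above.

```typescript
// Hypothetical helpers; fixed infrastructure costs are not included.

/** Gross margin as a fraction of revenue. */
function grossMarginFraction(arpuUsd: number, variableCostUsd: number): number {
  return (arpuUsd - variableCostUsd) / arpuUsd;
}

/** Highest per-user variable cost that still meets `targetMargin`. */
function maxVariableCostUsd(arpuUsd: number, targetMargin: number): number {
  return arpuUsd * (1 - targetMargin);
}

console.log(grossMarginFraction(18, 10).toFixed(3)); // "0.444"
console.log(grossMarginFraction(18, 5).toFixed(3)); // "0.722"
console.log(maxVariableCostUsd(18, 0.65).toFixed(2)); // "6.30"
```

At $18 ARPU, reaching the 65-75% target means holding variable cost to roughly $4.50-$6.30 per user, which is near the bottom of the $5-$24 range; a user at the top of that range costs more to serve than they pay.
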
---
## 7. Risks & Mitigations
| Risk | Severity | Mitigation |
|------|----------|-----------|
| **VoicePrint never reaches production accuracy** | High | Ship API-first (Azure Voice Live), defer in-house model |
| **County data sourcing blocked** | High | Start with top 100 counties, use Attom API, expand gradually |
| **Broker scripts break constantly** | Medium | Budget 20% engineering time for maintenance, use AI-assisted scraping |
| **Competitor price war (Aura at $12/mo)** | Medium | Differentiate on VoicePrint + HomeTitle (unique features) |
| **API cost overruns** | Medium | Implement rate limits per tier, cache aggressively, negotiate volume pricing |
| **Regulatory compliance (FCRA, GLBA)** | High | Legal review before launch, SOC 2 Type II certification |
| **False positive alerts destroy trust** | High | Human review queue for low-confidence alerts, user feedback loop |
---
## 8. Timeline to Revenue
### Phase 1: Foundation (Months 1-2)
- ✅ Billing integration (Stripe Checkout + webhooks)
- ✅ RemoveBrokers: Implement removal for top 20 brokers
- ✅ DarkWatch: Connect HIBP + SecurityTrails APIs
- **Revenue:** None (beta testers only)
### Phase 2: MVP Launch (Months 3-4)
- ✅ RemoveBrokers: 50+ brokers with automated removal
- ✅ DarkWatch: Full scan pipeline with HIBP, SecurityTrails, Censys
- ✅ SpamShield: Reputation API integration (Twilio Lookup + Hiya)
- ✅ Billing: Free trial + paid plans
- **Revenue:** $12/mo Shield plan, target 100 beta users
### Phase 3: Growth (Months 5-8)
- ✅ RemoveBrokers: 100+ brokers
- ✅ DarkWatch: Add Shodan, Breachsense
- ✅ SpamShield: ML text classification (fine-tuned DistilBERT)
- ✅ HomeTitle: Top 50 counties + Attom API
- **Revenue:** All tiers, target 1,000 users
### Phase 4: Differentiation (Months 9-12)
- ✅ VoicePrint: Azure Voice Live API integration
- ✅ HomeTitle: 200+ counties
- ✅ Correlation engine: Cross-service threat scoring
- ✅ Mobile: Real-time call screening (iOS CallKit, Android Telecom)
- **Revenue:** Premium tiers, target 5,000 users
---
## 9. Bottom Line
**What you have:** A well-architected platform skeleton with auth, database, API layer, dashboard UI, mobile apps, and queueing infrastructure.
**What you need:** The actual data integrations and ML models that make the services useful. Currently, every core service returns mock data or stub responses.
**Fastest path to revenue (5-8 months):** RemoveBrokers + DarkWatch + SpamShield + Billing. These three services are achievable with API integrations and automation; no custom ML training is required.
**Total investment to MVP revenue:** ~$65K-$140K (engineering + API costs for 5-8 months).
**Expected pricing:** $12-$45/mo depending on tier. Industry benchmark ARPU: $18-$25/mo.
**Expected LTV:** $600-$2,400 depending on plan tier (individual vs. family).
**Key differentiator from competitors:** VoicePrint (voice clone detection) + HomeTitle (property monitoring). These are unique in the consumer market. But they're also the hardest to build.
**Strategic recommendation:** Ship RemoveBrokers + DarkWatch first (fastest ROI, proven demand), then layer in SpamShield + HomeTitle for differentiation, then VoicePrint as the crown jewel that justifies premium pricing.