# Kordant: Product Gap Analysis & Path to Revenue

**Date:** May 31, 2026
**Scope:** What's functional vs. scaffolding, what's needed to ship, expected customer value, pricing

---

## Executive Summary

Kordant is a **well-architected platform whose service implementations are mostly scaffolding**. The codebase is well structured, with tRPC routers, Drizzle ORM schemas, service layers, job handlers, mobile apps, and a Rust queueing library (Honker). However, **none of the five core services delivers real value to a paying customer today**: the ML models return stub data, external API integrations are placeholders, and data sources return mock results.

**Bottom line:** You have the platform skeleton. You need to build the muscles.

| Service | Status | Lines of Code | Real Functionality | Effort to Ship |
|---------|--------|---------------|-------------------|----------------|
| **VoicePrint** | ❌ Pure scaffolding | ~240 | None — returns `isSynthetic: false` | 6–12 months, $100K–$500K |
| **DarkWatch** | ⚠️ Architecture only | ~500+ | Circuit breakers, alert pipeline, CRUD — no real API calls | 2–4 months, $20K–$50K |
| **SpamShield** | ⚠️ Rule engine only | ~400+ | Pattern matching works — ML & reputation APIs are stubs | 2–3 months, $15K–$40K |
| **HomeTitle** | ❌ Scaffolding | ~300 | Geocoding works — county records return mock data | 3–6 months, $30K–$80K |
| **RemoveBrokers** | ⚠️ Registry only | ~1,500+ | Broker registry (100+ entries) — removal engine is placeholder | 2–4 months, $20K–$50K |
| **Billing** | ⚠️ Minimal | ~100 | Stripe client — no webhooks, proration, or checkout | 1–2 months, $10K–$20K |
| **Auth** | ✅ Functional | ~200 | JWT + bcrypt working | Done |

---

## 1. Current State: What Actually Works

### ✅ Functional (Shippable Today)

- **Authentication:** JWT signing/verification (jose), password hashing (bcrypt, 10 rounds). Solid implementation.
- **Database Schema:** Complete Drizzle ORM schemas for all 5 services, alerts, billing, subscriptions, audit logs.
- **tRPC API Layer:** Router scaffolding for all services with proper Zod schemas.
- **Dashboard UI:** Web dashboard with sidebar, threat score widget, alert feed, service widgets.
- **Mobile Apps:** iOS (SwiftUI) and Android (Compose) with ViewModels, Models, and navigation. Thin clients calling tRPC.
- **Browser Extension:** Chrome Manifest V3 extension shell.
- **Honker (Rust):** Queueing library for background jobs, FFI bindings.
- **Geocoding:** Google Maps API integration in HomeTitle (works if API key provided).
- **SpamShield Rule Engine:** Regex/area code/prefix pattern matching works.
- **DarkWatch Alert Pipeline:** Severity scoring, exposure deduplication, alert creation logic.
- **RemoveBrokers Registry:** 100+ broker entries with domains, removal URLs, categories.

### ❌ Not Functional (Scaffolding/Placeholders)

| Component | What It Does | What It Should Do |
|-----------|-------------|-------------------|
| **VoicePrint ML Engine** | Returns `{ isSynthetic: false, confidence: 1.0, score: 0.0 }` | Detect AI-generated voices in real-time |
| **VoicePrint Voice Matching** | Returns `{ similarity: 0, matched: false }` | Compare voice against enrolled templates |
| **VoicePrint Embedding** | Returns empty `Float64Array(256)` + SHA256 hash | Generate voice embeddings for enrollment |
| **DarkWatch Scan Engine** | Has circuit breaker structure — no actual API calls to HIBP, SecurityTrails, Censys, Shodan | Query real breach databases and dark web sources |
| **SpamShield ML Engine** | `classifyTextBERT()` returns `{ isSpam: false, confidence: 1.0 }` | Classify SMS/call text as spam using ML |
| **SpamShield Reputation API** | Hiya/Truecaller lookups return `{ score: 0, isSpam: false }` | Query real phone reputation databases |
| **HomeTitle County Scanner** | Returns `{ ownerName: "Unknown Owner", address: {} }` | Fetch real county deed records |
| **HomeTitle HTML Parser** | `parseDeedRecords()` logs "not yet implemented" and returns null | Parse county record HTML/JSON responses |
| **RemoveBrokers Removal Engine** | Returns `{ success: true, requestId: crypto.randomUUID() }` | Actually submit opt-out requests to brokers |
| **RemoveBrokers Email** | Returns `{ success: true }` without sending anything | Send opt-out emails to broker addresses |
| **RemoveBrokers Status Tracking** | Returns `{ status: "pending" }` always | Poll brokers for actual removal status |
| **Billing Webhooks** | No webhook handler implemented | Handle Stripe webhook events (checkout, renewal, cancel) |
| **Billing Checkout** | No checkout session creation | Create Stripe Checkout sessions for subscription plans |

---

## 2. Gap Analysis by Service

### VoicePrint — Voice Clone Detection

**Current:** 56-line ML engine, all stubs. No audio processing, no model loading, no inference.

**What's needed for a working product:**

1. **API-first approach (fastest):**
   - Integrate Microsoft Azure Voice Live API (~$0.016/min) for liveness detection
   - Integrate Pindrop or Daon API for passive detection
   - Estimated cost: $60K–$230K/year at scale

2. **Build in-house (differentiating but expensive):**
   - Deploy AASIST or RawNet2 model (open-source from ASVspoof 2021)
   - GPU inference infrastructure (NVIDIA T4/A10, $300–$800/mo per node)
   - Audio preprocessing pipeline (VAD, resampling, normalization)
   - Enrollment system (collect voice samples, generate embeddings)
   - Estimated cost: $840K–$1.25M Year 1

3. **Mobile integration:**
   - iOS: Integrate with CallKit for real-time call analysis
   - Android: Integrate with Telecom API
   - On-device inference for low-latency screening
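
The normalization and voice activity detection (VAD) steps named in option 2 could look roughly like the sketch below. It is a naive, dependency-free illustration: the function names, the 20 ms frame size, and the RMS threshold are assumptions, and a production pipeline would more likely use a trained VAD and a proper resampler.

```typescript
// Naive, dependency-free sketch. Function names, the 20 ms frame size, and
// the RMS threshold are illustrative assumptions, not values from the codebase.

/** Peak normalization: scale so the loudest sample has magnitude `target`. */
function peakNormalize(samples: Float32Array, target = 0.95): Float32Array {
  const peak = samples.reduce((max, s) => Math.max(max, Math.abs(s)), 0);
  if (peak === 0) return samples.slice(); // pure silence: nothing to scale
  const gain = target / peak;
  return samples.map((s) => s * gain);
}

/** Energy-based VAD: one flag per full frame, true when frame RMS > threshold. */
function energyVad(
  samples: Float32Array,
  sampleRate: number,
  frameMs = 20,
  rmsThreshold = 0.02,
): boolean[] {
  const frameLen = Math.max(1, Math.floor((sampleRate * frameMs) / 1000));
  const flags: boolean[] = [];
  for (let start = 0; start + frameLen <= samples.length; start += frameLen) {
    const frame = samples.subarray(start, start + frameLen);
    const sumSquares = frame.reduce((acc, s) => acc + s * s, 0);
    flags.push(Math.sqrt(sumSquares / frameLen) > rmsThreshold);
  }
  return flags;
}
```

Frames flagged `false` can be dropped before inference, which also reduces per-minute API or GPU cost.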

**Market reality:** Voice clone detection is the most technically ambitious service. Hiya and Truecaller have carrier-level integrations you can't replicate without carrier partnerships. Your differentiator should be **consumer-facing analysis** (record a suspicious call → analyze → report), not real-time PSTN interception.

**Effort:** 6–12 months to MVP, $100K–$500K (assuming the API-first path; the in-house path above is estimated at $840K–$1.25M in Year 1)
**Revenue potential:** High — this is the most novel service in your suite. Competitors don't offer this to consumers.

---

### DarkWatch — Dark Web & Breach Monitoring

**Current:** Best-implemented service. Has scan engine architecture, circuit breakers, alert pipeline, watchlist CRUD, exposure dedup. Missing: actual API calls to external data sources.

**What's needed for a working product:**

1. **API integrations (the core work):**
   - **HaveIBeenPwned (HIBP):** Free tier (1,500 req/mo) → Paid ($3.50/mo individual). Check emails against breach database.
   - **SecurityTrails:** $49/mo Pro plan. DNS/WHOIS monitoring for domain exposure.
   - **Censys:** $79/mo Pro. Internet-wide scanning for exposed services.
   - **Shodan:** $299/mo Small Business. IoT/device exposure monitoring.
   - **Optional — Breachsense:** $199/mo for deep dark web scanning.

2. **Data pipeline:**
   - Implement actual `fetchWithCircuit()` calls to each API
   - Parse and normalize responses into your exposure schema
   - Schedule periodic scans (daily/weekly depending on tier)
   - WebSocket push for real-time scan progress

3. **Alert quality:**
   - Your severity scoring logic is already implemented
   - Add alert fatigue reduction (dedup, cooldown periods)
   - Email + push notification delivery
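
As one way to picture the `fetchWithCircuit()` work in item 2, here is a minimal circuit breaker around an injected async call. The class name, default thresholds, and signature are assumptions for illustration; the existing DarkWatch implementation may differ.

```typescript
// Minimal circuit breaker sketch. Names and thresholds are assumptions,
// not the signature of the existing DarkWatch `fetchWithCircuit()`.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(maxFailures = 3, resetAfterMs = 30000, now: () => number = () => Date.now()) {
    this.maxFailures = maxFailures;
    this.resetAfterMs = resetAfterMs;
    this.now = now; // injectable clock keeps the breaker testable
  }

  get isOpen(): boolean {
    if (this.openedAt === null) return false;
    // Once the cool-off period has passed, allow a trial call (half-open).
    return this.now() - this.openedAt < this.resetAfterMs;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.isOpen) throw new Error("circuit open: skipping upstream call");
    try {
      const result = await fn();
      this.failures = 0; // success closes the circuit again
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.maxFailures) this.openedAt = this.now();
      throw err;
    }
  }
}
```

Each data source (HIBP, SecurityTrails, Censys, Shodan) would get its own breaker instance so that one failing provider does not block scans against the others.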

**Monthly API costs at scale:** ~$500–$1,000/mo for base data sources
**Per-customer API cost:** ~$0.50–$2.00/mo (amortized across user base)

**Effort:** 2–4 months, $20K–$50K
**Revenue potential:** Medium — crowded market (Aura, LifeLock, Experian all offer this). Must differentiate on alert quality and multi-source correlation.
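
The cooldown idea from the alert-quality item could be sketched as a small filter that suppresses repeat alerts for the same exposure within a time window. The key format and the in-memory `Map` are assumptions; a real deployment would presumably persist this state in the database or Redis.

```typescript
// Cooldown filter sketch: suppress repeat alerts for the same exposure key
// within a window. Key format and storage are illustrative assumptions.
class AlertCooldown {
  private readonly lastSent = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  /** Returns true if the alert should be delivered, false if suppressed. */
  shouldSend(key: string, nowMs: number): boolean {
    const previous = this.lastSent.get(key);
    if (previous !== undefined && nowMs - previous < this.windowMs) return false;
    this.lastSent.set(key, nowMs);
    return true;
  }
}
```

A key such as `userId:breachId` (hypothetical) would let the same breach re-alert after the window expires while staying quiet during repeated scans.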

---

### SpamShield — Spam Call/SMS Classification

**Current:** Rule engine works (pattern matching, area code, prefix). ML engine and reputation APIs are stubs.

**What's needed for a working product:**

1. **Reputation API integrations:**
   - **Hiya API:** Phone number reputation scoring. Carrier-level integration preferred but API available.
   - **Truecaller API:** Caller ID and spam labeling.
   - **Twilio Lookup API:** $0.004–$0.03 per lookup. Caller name + line type.
   - **STIR/SHAKEN verification:** Call authentication (requires telecom partner).

2. **ML text classification:**
   - Fine-tune lightweight model (DistilBERT or TinyBERT) on SMS spam dataset
   - Deploy as ONNX model for low-latency inference
   - Training data: SMS Spam Collection, custom labeled data, and optionally the Enron Spam Corpus (an email corpus, so less representative of SMS)

3. **Mobile integration:**
   - iOS: CallKit integration for real-time caller screening
   - Android: Telecom API for call filtering
   - SMS interception (requires carrier permissions or SMS app integration)
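
One way the signal sources above (the existing rule engine, reputation lookups, and ML text classification) might be combined into a single verdict is a weighted score. The weights, the 0.5 threshold, and all type names below are hypothetical and would need tuning against labeled data.

```typescript
// Weighted-score sketch. Weights, threshold, and field names are assumptions.
interface SpamSignals {
  ruleMatch: boolean;          // output of the existing rule engine
  reputationScore?: number;    // 0..1 from a reputation API, if available
  mlSpamProbability?: number;  // 0..1 from a text classifier, if available
}

function spamScore(signals: SpamSignals): number {
  // Each entry is [value, weight].
  const parts: Array<[number, number]> = [[signals.ruleMatch ? 1 : 0, 0.4]];
  if (signals.reputationScore !== undefined) parts.push([signals.reputationScore, 0.35]);
  if (signals.mlSpamProbability !== undefined) parts.push([signals.mlSpamProbability, 0.25]);
  // Renormalize so that missing signals do not drag the score toward zero.
  const totalWeight = parts.reduce((acc, [, weight]) => acc + weight, 0);
  return parts.reduce((acc, [value, weight]) => acc + value * weight, 0) / totalWeight;
}

function isSpam(signals: SpamSignals, threshold = 0.5): boolean {
  return spamScore(signals) >= threshold;
}
```

Because missing signals are renormalized away, the rule engine alone can still produce a verdict before the reputation and ML integrations ship.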

**Monthly API costs:** Twilio Lookup ~$0.004/lookup. Hiya/Truecaller custom pricing.
**Per-customer cost:** ~$1–$5/mo depending on call volume.

**Effort:** 2–3 months, $15K–$40K
**Revenue potential:** Medium-High — Hiya/Truecaller dominate at carrier level, but consumer-facing spam classification with AI detection is underserved.

---

### HomeTitle — Property Deed Monitoring

**Current:** Geocoding works (Google Maps API). County records fetcher returns mock data. HTML parser not implemented. Change detection logic is solid.

**What's needed for a working product:**

1. **County data sources (the hard part):**
   - **US county recorder APIs:** ~3,000 counties, each with different data formats
   - **Commercial aggregators:**
     - **Attom Data Solutions:** Property records API, ~$0.05–$0.10/record
     - **CoreLogic:** Property intelligence, enterprise pricing
     - **Black Knight (now part of Intercontinental Exchange):** Property data, enterprise pricing
     - **County-specific APIs:** Some counties offer open data (e.g., Cook County IL, Harris County TX)
   - **Web scraping fallback:** Parse county recorder websites (fragile, requires maintenance)

2. **Monitoring pipeline:**
   - Initial property snapshot (owner, deed date, liens, tax info)
   - Periodic re-scan (weekly/monthly)
   - Change detection (your logic is already implemented)
   - Alert generation (ownership transfer, lien filing, tax change)

3. **Property verification:**
   - Geocoding → parcel ID lookup → county record fetch
   - Handle counties without digital records (mail-based requests)
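
The monitoring pipeline in item 2 comes down to diffing two property snapshots and mapping the differences to alert types. Change detection reportedly already exists in the codebase, so the following is only an illustrative sketch with assumed field names, not the current implementation.

```typescript
// Snapshot diff sketch. Field names and alert labels are assumptions.
interface PropertySnapshot {
  ownerName: string;
  lienCount: number;
  assessedTax: number;
}

type TitleAlert = "ownership_transfer" | "lien_filing" | "tax_change";

function detectChanges(prev: PropertySnapshot, next: PropertySnapshot): TitleAlert[] {
  const alerts: TitleAlert[] = [];
  // Compare names case-insensitively so formatting noise does not raise alerts.
  const normalize = (name: string): string => name.trim().toLowerCase();
  if (normalize(prev.ownerName) !== normalize(next.ownerName)) {
    alerts.push("ownership_transfer");
  }
  if (next.lienCount > prev.lienCount) alerts.push("lien_filing");
  if (next.assessedTax !== prev.assessedTax) alerts.push("tax_change");
  return alerts;
}
```

County sources format owner names inconsistently, so real matching would likely need more normalization than the lowercase comparison shown here.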

**Monthly data costs:** Attom ~$500–$5,000/mo depending on volume.
**Per-customer cost:** ~$2–$10/mo depending on scan frequency.

**Effort:** 3–6 months, $30K–$80K
**Revenue potential:** Medium. A useful differentiator inside a bundle, though not unique: Aura and LifeLock include home title monitoring in some plans, and standalone title-monitoring services exist. Real estate fraud is rising (FTC reports $1B+ in property fraud annually).

---

### RemoveBrokers — Data Broker Opt-Out

**Current:** Broker registry with 100+ entries (solid). Removal engine is a placeholder that returns mock request IDs. Email sending not implemented. Form submission not implemented.

**What's needed for a working product:**

1. **Automated removal engine:**
   - **Headless browser automation:** Playwright/Puppeteer for each broker's opt-out flow
   - **Form filling:** Dynamic form field detection and population
   - **CAPTCHA handling:** 2Captcha/AntiCaptcha integration ($0.001–$0.01/solve)
   - **Email verification:** Handle opt-out confirmation emails
   - **Physical mail:** Generate and mail opt-out letters for brokers requiring it

2. **Broker-specific adapters:**
   - Each of 100+ brokers has unique opt-out flow
   - Estimated 2–5 hours per broker to implement and test
   - Ongoing maintenance: 15–25% of scripts break per quarter

3. **Re-scan pipeline:**
   - Periodic re-scans to detect re-listings
   - Status tracking and progress reporting

4. **Competitor benchmark:**
   - **DeleteMe:** 300+ brokers, $139/yr individual, $329/yr family
   - **Kanary:** 400+ brokers, $132/yr individual, $264/yr family
   - **OneRep:** 200+ brokers, $180/yr individual
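
The broker-specific adapters in item 2 suggest a common contract that each broker implements, plus a registry that runs them and records real outcomes instead of always returning `success: true`. All names below are hypothetical; the Playwright or email logic would live inside each adapter's `submit`.

```typescript
// Hypothetical adapter contract; none of these names come from the codebase.
interface OptOutRequest { fullName: string; email: string }
interface OptOutResult { brokerId: string; submitted: boolean; detail: string }

interface BrokerAdapter {
  readonly brokerId: string;
  submit(request: OptOutRequest): Promise<OptOutResult>;
}

class AdapterRegistry {
  private readonly adapters = new Map<string, BrokerAdapter>();

  register(adapter: BrokerAdapter): void {
    this.adapters.set(adapter.brokerId, adapter);
  }

  /** Runs every registered adapter; one broker failing does not stop the rest. */
  async submitAll(request: OptOutRequest): Promise<OptOutResult[]> {
    const results: OptOutResult[] = [];
    for (const adapter of Array.from(this.adapters.values())) {
      try {
        results.push(await adapter.submit(request));
      } catch (err) {
        const detail = err instanceof Error ? err.message : String(err);
        results.push({ brokerId: adapter.brokerId, submitted: false, detail });
      }
    }
    return results;
  }
}
```

Recording failures per broker also gives a direct measure of the quarterly breakage rate estimated above.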

**Monthly operational costs:** Proxies ($1K–$6K), CAPTCHA solving ($3–$8/customer), compute ($1K–$5K)
**Per-customer cost:** ~$13–$53/year (high margin: 60–90%)

**Effort:** 2–4 months for initial 50 brokers, then incremental
**Revenue potential:** Medium — competitive market but high margins. Your advantage: bundling with other services.

---

### Billing & Payments

**Current:** Stripe client initialized. No checkout, webhooks, or subscription management.

**What's needed:**

1. **Stripe Checkout integration:**
   - Create checkout sessions for each plan tier
   - Handle success/cancel redirects
   - Customer portal for subscription management

2. **Webhook handlers:**
   - `checkout.session.completed` → activate subscription
   - `invoice.payment_succeeded` → renew subscription
   - `invoice.payment_failed` → grace period, retry
   - `customer.subscription.deleted` → cancel access
   - `customer.subscription.updated` → tier changes

3. **Subscription management:**
   - Trial periods (14-day free trial)
   - Tier upgrades/downgrades with proration
   - Family plan member management
   - Grace period before suspension

4. **Plan structure:**
   - See pricing recommendations below
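
The webhook mapping in item 2 can be expressed as a small lookup plus duplicate-event protection, since Stripe may deliver the same event more than once. The action names and the in-memory `Set` are assumptions; a real handler would first verify the signature (the Stripe Node library provides `stripe.webhooks.constructEvent` for this) and would persist processed event IDs.

```typescript
// Event-to-action routing sketch. Action names and storage are assumptions.
type SubscriptionAction =
  | "activate"
  | "renew"
  | "start_grace_period"
  | "cancel_access"
  | "sync_tier"
  | "ignore";

const ACTIONS = new Map<string, SubscriptionAction>([
  ["checkout.session.completed", "activate"],
  ["invoice.payment_succeeded", "renew"],
  ["invoice.payment_failed", "start_grace_period"],
  ["customer.subscription.deleted", "cancel_access"],
  ["customer.subscription.updated", "sync_tier"],
]);

class WebhookRouter {
  private readonly seenEventIds = new Set<string>();

  /** Returns the action to take, or "ignore" for unknown or repeated events. */
  route(event: { id: string; type: string }): SubscriptionAction {
    if (this.seenEventIds.has(event.id)) return "ignore";
    this.seenEventIds.add(event.id);
    return ACTIONS.get(event.type) ?? "ignore";
  }
}
```

Keeping the routing pure makes it easy to unit test without calling Stripe.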

**Effort:** 1–2 months, $10K–$20K
**Revenue potential:** N/A (enables all revenue)

---

## 3. Recommended Build Priority

Based on effort vs. market differentiation:

| Priority | Service | Why | Effort | Revenue Impact |
|----------|---------|-----|--------|----------------|
| **1** | **RemoveBrokers** | Highest margin (60–90%), existing registry, clear competitor benchmark | 2–4 mo | Direct revenue; competitors charge ~$11–$27/mo |
| **2** | **DarkWatch** | Best architecture, API integrations needed, table-stakes feature | 2–4 mo | Core retention driver |
| **3** | **SpamShield** | Rule engine works, needs reputation APIs + ML | 2–3 mo | Differentiation vs. competitors |
| **4** | **Billing** | Enables all revenue, must ship before paid plans | 1–2 mo | Revenue enabler |
| **5** | **HomeTitle** | Unique differentiator, but data sourcing is hard | 3–6 mo | Premium tier upsell |
| **6** | **VoicePrint** | Most novel, but highest effort and cost | 6–12 mo | Brand differentiation |

**Recommended MVP scope:** RemoveBrokers + DarkWatch + SpamShield + Billing = **5–8 months to first revenue**.

---

## 4. Pricing Strategy

### Recommended Plan Structure

| Plan | Monthly Price | Annual Price | Features |
|------|--------------|--------------|----------|
| **Shield** (Entry) | $12/mo | $9/mo ($108/yr) | DarkWatch (basic), SpamShield, RemoveBrokers (50 brokers) |
| **Guard** (Core) | $22/mo | $18/mo ($216/yr) | All Shield + DarkWatch (full), RemoveBrokers (200+), HomeTitle (1 property) |
| **Fortress** (Premium) | $35/mo | $29/mo ($348/yr) | All Guard + HomeTitle (3 properties), VoicePrint, priority alerts, family (2 adults) |
| **Family Fortress** | $45/mo | $39/mo ($468/yr) | All Fortress + 5 adults + unlimited children |
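
The plan table can be captured as typed configuration so that annual totals and discounts are derived rather than hand-maintained. The `Plan` shape and helper names are assumptions; the prices are the ones in the table.

```typescript
// Plan config sketch. Type and helper names are assumptions; prices are
// taken from the plan table.
interface Plan {
  name: string;
  monthly: number;         // USD per month on monthly billing
  annualPerMonth: number;  // USD per month on annual billing
}

const PLANS: Plan[] = [
  { name: "Shield", monthly: 12, annualPerMonth: 9 },
  { name: "Guard", monthly: 22, annualPerMonth: 18 },
  { name: "Fortress", monthly: 35, annualPerMonth: 29 },
  { name: "Family Fortress", monthly: 45, annualPerMonth: 39 },
];

/** Total charged per year on annual billing. */
function annualTotal(plan: Plan): number {
  return plan.annualPerMonth * 12;
}

/** Discount for annual billing, as a whole-number percentage. */
function annualDiscountPct(plan: Plan): number {
  return Math.round(((plan.monthly - plan.annualPerMonth) / plan.monthly) * 100);
}
```

Run against these prices, the annual discount works out to 25%, 18%, 17%, and 13% from Shield to Family Fortress, so the incentive to commit annually shrinks at higher tiers.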

### Competitive Positioning

| Your Plan | vs. Aura | vs. DeleteMe | vs. LifeLock |
|-----------|----------|-------------|--------------|
| Shield ($12) | Matches Aura Individual | Roughly matches DeleteMe ($11.58/mo); cheaper only on annual billing ($9/mo) | Cheaper than LifeLock Select |
| Guard ($22) | Below Aura Family | N/A (DeleteMe is removal-only) | Below LifeLock Advantage |
| Fortress ($35) | Below Aura Family | N/A | Below LifeLock Ultimate |
| Family ($45) | Above Aura Family ($37) | Above DeleteMe Family ($27.42) | Above LifeLock Family |

### Expected Unit Economics

| Metric | Estimate | Basis |
|--------|----------|-------|
| **ARPU (blended)** | $18–$25/mo | Mix of tiers, family plans raise ARPU |
| **Gross margin** | 65–75% | API costs, infrastructure, support |
| **CAC (organic)** | $50–$150 | Content marketing, word-of-mouth |
| **CAC (paid)** | $200–$400 | Google Ads, affiliate |
| **Monthly churn (individual)** | 3–5% | Industry benchmark |
| **Monthly churn (family)** | 1–2% | Higher switching costs |
| **LTV (individual)** | $600–$1,200 | 24-mo avg life, $20 ARPU |
| **LTV (family)** | $1,600–$2,400 | 48-mo avg life, $45 ARPU |
| **LTV:CAC (organic)** | 4–8x | Healthy |
| **LTV:CAC (paid)** | 2–4x | Marginal |
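
The LTV and LTV:CAC rows follow from simple revenue arithmetic, sketched below. This is a revenue-based model (no margin or discounting), and the helper names are illustrative.

```typescript
// Revenue-based unit economics sketch; ignores gross margin and discounting.

/** Expected customer lifetime in months for a constant monthly churn rate. */
function lifetimeMonthsFromChurn(monthlyChurn: number): number {
  return 1 / monthlyChurn;
}

/** Lifetime value as ARPU times lifetime. */
function ltv(arpuPerMonth: number, lifetimeMonths: number): number {
  return arpuPerMonth * lifetimeMonths;
}

function ltvToCac(ltvValue: number, cac: number): number {
  return ltvValue / cac;
}
```

For the family row, `ltv(45, 48)` gives $2,160, inside the tabled $1,600–$2,400. The 4–8x and 2–4x ratios are reproduced by dividing the $600–$1,200 individual LTV by $150 (upper organic CAC) and $300 (midpoint paid CAC); that pairing is inferred, not stated. Applying the same formula to the individual row's stated basis gives `ltv(20, 24)` = $480, so the tabled range presumably assumes a longer lifetime or higher ARPU than the basis column lists.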

---

## 5. What Customers Actually Get (When Working)

### Monthly Value Perception

| Service | Customer Perceives | Actual Value |
|---------|-------------------|--------------|
| **VoicePrint** | "They detected a scam call cloning my daughter's voice" | Highest emotional impact, brand-defining |
| **DarkWatch** | "They found my email in a breach I didn't know about" | Table-stakes, expected by all competitors |
| **SpamShield** | "They blocked 47 spam calls this month" | Daily utility, high engagement |
| **HomeTitle** | "They caught a fraudulent lien on my house" | Highest dollar impact ($10K–$100K+ saved) |
| **RemoveBrokers** | "They removed me from 127 people-search sites" | Tangible progress, visible results |

### Customer Loyalty Drivers

1. **Alert quality (not quantity):** One perfect alert > 20 noise alerts. Your correlation engine should reduce false positives.
2. **Family plan lock-in:** Once a family is enrolled, switching costs are high.
3. **Visible progress:** RemoveBrokers dashboard showing "127/300 removed" drives retention.
4. **Crisis response:** When a major breach hits (e.g., Change Healthcare 2024), proactive alerts create loyalty spikes.
5. **Mobile app quality:** Credit lock/unlock, real-time alerts, one-tap actions.

---

## 6. Infrastructure Costs at Scale

### Monthly Fixed Costs

| Component | 100 Users | 1,000 Users | 10,000 Users |
|-----------|-----------|-------------|--------------|
| **Turso (SQLite)** | $0–$25 | $25–$100 | $100–$500 |
| **Redis** | $0–$15 | $15–$50 | $50–$200 |
| **HIBP API** | $0 (free tier) | $3.50 | $50+ |
| **SecurityTrails** | $49 | $49 | $249 |
| **Censys** | $79 | $79 | $299 |
| **Shodan** | $299 | $299 | $599 |
| **Twilio (SpamShield)** | $5–$20 | $20–$100 | $100–$500 |
| **Attom (HomeTitle)** | $500 | $1,000 | $5,000 |
| **Azure Voice Live** | $0 (dev) | $100–$500 | $500–$5,000 |
| **Proxies (RemoveBrokers)** | $100 | $500 | $2,000 |
| **CAPTCHA solving** | $10 | $50 | $200 |
| **Compute (SolidStart)** | $50 | $200 | $1,000 |
| **Total Fixed** | ~$1,100 | ~$2,300–$2,900 | ~$10,000–$16,000 |

### Per-User Variable Costs

| Service | Cost/User/Month | Notes |
|---------|-----------------|-------|
| DarkWatch | $0.50–$2.00 | Amortized API costs |
| SpamShield | $1.00–$5.00 | Twilio lookups, ML inference |
| HomeTitle | $2.00–$10.00 | Attom record lookups |
| RemoveBrokers | $1.00–$4.00 | Proxy + CAPTCHA + compute |
| VoicePrint | $0.50–$3.00 | Azure API or GPU inference |
| **Total** | **$5.00–$24.00** | Depends on usage |

At $18/mo average ARPU and $10/mo variable cost, **gross margin is ~44%** at early scale. It should improve toward **65–75%** as API costs amortize and you negotiate volume pricing.
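
The margin arithmetic can be checked, and inverted to see what variable cost the 65–75% target implies, with two small helpers. The function names are illustrative, and the model covers variable cost only (fixed infrastructure is excluded).

```typescript
// Gross margin sketch on variable cost only; fixed costs are excluded.

/** Gross margin as a fraction of revenue. */
function grossMargin(arpu: number, variableCost: number): number {
  return (arpu - variableCost) / arpu;
}

/** Highest per-user variable cost that still achieves `targetMargin`. */
function maxVariableCostForMargin(arpu: number, targetMargin: number): number {
  return arpu * (1 - targetMargin);
}
```

With $18 ARPU, `grossMargin(18, 10)` is about 0.44, and reaching 65–75% requires variable cost of roughly $4.50–$6.30 per user per month, which sits at or just below the low end of the $5–$24 range in the per-user table.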

---

## 7. Risks & Mitigations

| Risk | Severity | Mitigation |
|------|----------|-----------|
| **VoicePrint never reaches production accuracy** | High | Ship API-first (Azure Voice Live), defer in-house model |
| **County data sourcing blocked** | High | Start with top 100 counties, use Attom API, expand gradually |
| **Broker scripts break constantly** | Medium | Budget 20% engineering time for maintenance, use AI-assisted scraping |
| **Competitor price war (Aura at $12/mo)** | Medium | Differentiate on VoicePrint + HomeTitle (unique features) |
| **API cost overruns** | Medium | Implement rate limits per tier, cache aggressively, negotiate volume pricing |
| **Regulatory compliance (FCRA, GLBA)** | High | Legal review before launch, SOC 2 Type II certification |
| **False positive alerts destroy trust** | High | Human review queue for low-confidence alerts, user feedback loop |

---

## 8. Timeline to Revenue

### Phase 1: Foundation (Months 1–2)
- Billing integration (Stripe Checkout + webhooks)
- RemoveBrokers: Implement removal for top 20 brokers
- DarkWatch: Connect HIBP + SecurityTrails APIs
- **Revenue:** None (beta testers only)

### Phase 2: MVP Launch (Months 3–4)
- RemoveBrokers: 50+ brokers with automated removal
- DarkWatch: Full scan pipeline with HIBP, SecurityTrails, Censys
- SpamShield: Reputation API integration (Twilio Lookup + Hiya)
- Billing: Free trial + paid plans
- **Revenue:** $12/mo Shield plan, target 100 beta users

### Phase 3: Growth (Months 5–8)
- RemoveBrokers: 100+ brokers
- DarkWatch: Add Shodan, Breachsense
- SpamShield: ML text classification (fine-tuned DistilBERT)
- HomeTitle: Top 50 counties + Attom API
- **Revenue:** All tiers, target 1,000 users

### Phase 4: Differentiation (Months 9–12)
- VoicePrint: Azure Voice Live API integration
- HomeTitle: 200+ counties
- Correlation engine: Cross-service threat scoring
- Mobile: Real-time call screening (iOS CallKit, Android Telecom)
- **Revenue:** Premium tiers, target 5,000 users

---

## 9. Bottom Line

**What you have:** A well-architected platform skeleton with auth, database, API layer, dashboard UI, mobile apps, and queueing infrastructure.

**What you need:** The actual data integrations and ML models that make the services useful. Currently, every core service returns mock data or stub responses.

**Fastest path to revenue (5–8 months):** RemoveBrokers + DarkWatch + SpamShield + Billing. These three services plus billing are achievable with API integrations and automation. SpamShield's ML classifier can follow after launch, so no custom ML training is required for the MVP.

**Total investment to MVP revenue:** ~$65K–$160K (engineering + API costs for 5–8 months; the sum of the per-service estimates for RemoveBrokers, DarkWatch, SpamShield, and Billing).

**Expected pricing:** $12–$45/mo depending on tier. Industry benchmark ARPU: $18–$25/mo.

**Expected LTV:** $600–$2,400 depending on plan tier (individual vs. family).

**Key differentiator from competitors:** VoicePrint (voice clone detection) + HomeTitle (property monitoring). These are unique in the consumer market. But they're also the hardest to build.

**Strategic recommendation:** Ship RemoveBrokers + DarkWatch first (fastest ROI, proven demand), then layer in SpamShield + HomeTitle for differentiation, then VoicePrint as the crown jewel that justifies premium pricing.