# 02. Automated Removal Engine for Top 20 Data Brokers

meta:
  id: core-services-02
  feature: core-services-implementation
  priority: P0
  depends_on: [core-services-01]
  tags: [removebrokers, automation, playwright, scraping, revenue]

objective:
  - Replace the `submitAutomatedRemoval()` stub that returns `crypto.randomUUID()` with a real Playwright-based browser automation that submits opt-out requests to the top 20 data brokers.

deliverables:
  - Playwright-based removal engine in `removebrokers/removal.engine.ts`
  - Per-broker adapter modules for top 20 brokers (Spokeo, Whitepages, MyLife, BeenVerified, etc.)
  - CAPTCHA detection and graceful failure (manual fallback flow)
  - Removal request status tracking with actual polling
  - Email notification service integration for opt-out confirmations

steps:
  1. Install Playwright: `npm install -D playwright @playwright/test`
  2. Analyze opt-out flows for top 20 brokers from existing registry data
  3. Create `removebrokers/adapters/` directory with one module per broker
  4. Implement base adapter interface: `scanForProfile`, `submitOptOut`, `verifyRemoval`, `getStatus`
  5. Implement adapters for each top 20 broker with navigation, form filling, and submission logic
  6. Add proxy rotation support (BrightData or similar) to avoid IP blocking
  7. Add stealth mode (playwright-stealth) to reduce detection
  8. Implement `submitAutomatedRemoval()` to select correct adapter by broker ID and execute
  9. Store actual request IDs from brokers (not generated UUIDs) in database
  10. Implement `trackRemovalStatus()` with periodic re-scans for submitted requests
  11. Integrate with notification service to email user when removal is confirmed
  12. Add job handler for batch removal processing queue
  13. Handle failures gracefully: retry with backoff, escalate to manual queue after 3 failures

tests:
  - Unit: Mock Playwright browser, verify adapter navigation sequences
  - Integration: Run adapter against real broker site in headful mode, verify opt-out form submission
  - E2E: Full flow (add broker to watchlist → trigger removal → verify status progression)

acceptance_criteria:
  - [ ] Top 20 broker adapters are implemented and tested against live sites
  - [ ] `submitAutomatedRemoval()` no longer returns mock UUIDs; it submits real opt-out requests
  - [ ] Removal status tracks actual broker state (pending → submitted → completed/failed)
  - [ ] Failed removals are retried 3 times with exponential backoff, then escalated to manual queue
  - [ ] CAPTCHA challenges are detected and flagged for manual processing (not silently failing)
  - [ ] Job queue processes removals asynchronously without blocking API responses
  - [ ] User dashboard shows real removal progress per broker
  - [ ] All Playwright browsers are properly closed after each session (no resource leaks)

validation:
  - Run `vitest run removebrokers.service.test.ts`: all tests pass
  - Manual test: Trigger removal for Spokeo, verify opt-out email received
  - Check database: `removal_requests` table has real request IDs and actual status values
  - Run removal job: `bun run job:removebrokers` processes queue without errors

notes:
  - Broker sites change frequently; expect 15–25% of adapters to break per quarter
  - Some brokers require email verification sent to the listed email (often outdated); flag these
  - Start with brokers that have simple form-based opt-outs; defer email/physical mail brokers to Phase 3
  - The existing broker registry in `broker.registry.ts` already has removal URLs; use these as starting points
  - Budget $1K–$3K/mo for proxy infrastructure at scale
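
The base adapter interface (step 4) and adapter selection by broker ID (step 8) could be sketched roughly as follows. Only the four method names come from this spec. The parameter and return types, `Subject`, `OptOutResult`, `AdapterRegistry`, and the `submitAutomatedRemoval` signature shown here are hypothetical and would need to match the existing service code. Playwright is deliberately left out so the shape of the contract stays visible; a real adapter would drive a browser page inside these methods.

```typescript
// Status values taken from the acceptance criteria.
type RemovalStatus = "pending" | "submitted" | "completed" | "failed";

// Hypothetical: the minimum personal data an adapter needs to search and opt out.
interface Subject {
  fullName: string;
  email: string;
}

// Hypothetical: what a successful submission hands back for storage.
interface OptOutResult {
  brokerRequestId: string; // ID issued by the broker, not a generated UUID
  status: RemovalStatus;
}

// Method names are from step 4; the signatures are assumptions.
interface BrokerAdapter {
  readonly brokerId: string;
  scanForProfile(subject: Subject): Promise<string | null>; // profile URL, or null if not listed
  submitOptOut(profileUrl: string, subject: Subject): Promise<OptOutResult>;
  verifyRemoval(profileUrl: string): Promise<boolean>;
  getStatus(brokerRequestId: string): Promise<RemovalStatus>;
}

// Hypothetical registry mapping broker IDs to adapter instances.
class AdapterRegistry {
  private readonly adapters = new Map<string, BrokerAdapter>();

  register(adapter: BrokerAdapter): void {
    this.adapters.set(adapter.brokerId, adapter);
  }

  get(brokerId: string): BrokerAdapter {
    const adapter = this.adapters.get(brokerId);
    if (!adapter) {
      throw new Error(`No adapter registered for broker "${brokerId}"`);
    }
    return adapter;
  }
}

// Selects the adapter by broker ID and runs scan, then opt-out.
// Returns null when the broker has no profile for the subject.
async function submitAutomatedRemoval(
  registry: AdapterRegistry,
  brokerId: string,
  subject: Subject,
): Promise<OptOutResult | null> {
  const adapter = registry.get(brokerId);
  const profileUrl = await adapter.scanForProfile(subject);
  if (profileUrl === null) {
    return null; // nothing listed, so nothing to remove
  }
  return adapter.submitOptOut(profileUrl, subject);
}
```

Keeping the registry separate from the engine means a broken adapter can be unregistered (and its broker routed to the manual queue) without touching the others, which fits the expectation that a share of adapters will break each quarter.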
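
The retry-then-escalate behavior in step 13 and the acceptance criteria might look something like the sketch below. It assumes "3 failures" means three attempts in total, and it invents `withRetry`, `RetryOptions`, `Outcome`, the `escalate` callback, and the `isRetryable` hook for illustration; none of these names exist in the spec. The base delay is also an assumption, since the spec gives no timing. The `isRetryable` hook is one way to send CAPTCHA failures straight to the manual queue instead of retrying them.

```typescript
// Result of running a task under the retry policy (hypothetical type).
type Outcome<T> =
  | { kind: "ok"; value: T; attempts: number }
  | { kind: "escalated"; attempts: number; lastError: unknown };

interface RetryOptions {
  maxAttempts: number; // spec: escalate after 3 failures
  baseDelayMs: number; // assumed; the spec does not specify delays
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
  isRetryable?: (error: unknown) => boolean; // e.g. return false for CAPTCHA errors
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task`, retrying with exponential backoff. When attempts are exhausted
// or the error is not retryable, calls `escalate` (e.g. enqueue for manual
// processing) and reports the failure instead of throwing.
async function withRetry<T>(
  task: () => Promise<T>,
  escalate: (error: unknown) => Promise<void>,
  options: RetryOptions,
): Promise<Outcome<T>> {
  const { maxAttempts, baseDelayMs } = options;
  const sleep = options.sleep ?? defaultSleep;
  const isRetryable = options.isRetryable ?? (() => true);

  let attempts = 0;
  let lastError: unknown;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    attempts = attempt;
    try {
      const value = await task();
      return { kind: "ok", value, attempts };
    } catch (error) {
      lastError = error;
      if (!isRetryable(error)) {
        break; // skip remaining attempts and escalate immediately
      }
      if (attempt < maxAttempts) {
        // Delay doubles on each failure: base, 2x base, 4x base, ...
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }

  await escalate(lastError);
  return { kind: "escalated", attempts, lastError };
}
```

Returning an `Outcome` rather than rethrowing lets the job handler record the final state (`failed` plus a manual-queue entry) in one place, so an escalated request is never silently dropped.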