# 08. Expand Broker Coverage to 50+ with CAPTCHA Solving and Re-Scan Pipeline

meta:
  id: core-services-08
  feature: core-services-implementation
  priority: P2
  depends_on: [core-services-02]
  tags: [removebrokers, automation, captcha, scaling, maintenance]

objective:
- Scale from top 20 brokers to 50+ automated removals, implement CAPTCHA solving, and build the re-scan pipeline that detects re-listings.

deliverables:
- 30+ additional broker adapters (total 50+)
- CAPTCHA solving integration (2Captcha or AntiCaptcha API)
- Re-scan scheduler that checks if removed profiles have reappeared
- Email verification handling for opt-out confirmation emails
- Removal success rate dashboard metric

steps:
1. Select next 30 brokers from registry by opt-out complexity (medium-difficulty form-based flows)
2. Create adapter modules for each broker in `removebrokers/adapters/`
3. Implement CAPTCHA solving:
   - Detect reCAPTCHA v2/v3, hCaptcha, image challenges
   - Integrate 2Captcha API ($0.001–$0.01 per solve)
   - Add `CAPTCHA_SOLVER_API_KEY` to environment config
   - Fall back to manual queue if CAPTCHA solving fails 3 times
4. Implement email verification handling:
   - Monitor mailbox for opt-out confirmation emails
   - Parse confirmation links and auto-click them
   - Store confirmation status in database
5. Build re-scan pipeline:
   - Weekly scheduled job that re-scans all "completed" removals
   - If profile reappears, create new removal request automatically
   - Track re-listing rate per broker (some re-list every 30 days)
6. Add success metrics:
   - Track removal success rate per broker (% of opt-outs that stick)
   - Dashboard widget showing "X of Y brokers removed"
   - Alert user when re-listing detected
7. Implement proxy rotation pool:
   - Use residential proxy service (BrightData, IPRoyal)
   - Rotate IP per broker session to avoid blocks
   - Budget $1K–$3K/mo for proxy infrastructure
8. Add adapter health monitoring:
   - Track adapter breakage rate
   - Alert engineering when >5% of adapters fail in 24h
   - Auto-disable broken adapters, queue for manual fix

tests:
- Unit: Mock CAPTCHA solver, verify retry and fallback logic
- Integration: Test CAPTCHA solving against real broker site
- E2E: Complete removal for broker with CAPTCHA → verify re-scan detects re-listing

acceptance_criteria:
- [ ] 50+ broker adapters implemented and tested
- [ ] CAPTCHA challenges are detected and solved automatically (2Captcha integration)
- [ ] Failed CAPTCHA solving escalates to manual queue after 3 attempts
- [ ] Email confirmation links are parsed and clicked automatically
- [ ] Re-scan job runs weekly and detects re-listings within 7 days
- [ ] Re-listed profiles trigger automatic new removal requests
- [ ] Dashboard shows accurate removal progress: "47 of 50 brokers completed"
- [ ] Per-broker success rate is tracked and visible in admin panel
- [ ] Proxy rotation prevents IP blocking on high-volume brokers
- [ ] Adapter breakage is detected within 24 hours and auto-disabled
- [ ] Monthly proxy + CAPTCHA cost per user < $4 (within gross margin target)

validation:
- Run `vitest run removebrokers.service.test.ts` (extended tests for 50 brokers)
- Manual: Test CAPTCHA broker (e.g., MyLife), verify automatic solving works
- Check re-scan: Run `bun run job:removebrokers:rescan`, verify re-listings detected
- Monitor costs: Dashboard shows monthly proxy/CAPTCHA spend per customer

notes:
- Broker sites change frequently; budget 20% engineering time for adapter maintenance
- Some brokers (Acxiom, Epsilon) require physical mail; flag these for manual processing
- Re-listing is common: data brokers rebuild databases from public records every 30–90 days
- Consider AI-assisted form field detection (GPT-4 Vision) to reduce per-adapter development time
- The existing `broker.registry.ts` already has 100+ entries; prioritize by traffic/popularity
- Success rate target: 80%+ for automated removals, 90%+ with manual fallback
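
example:

The core decision logic of the re-scan pipeline (step 5) and the per-broker success rate (step 6) might be sketched as a pure function, which keeps it easy to unit test without touching real broker sites. This is a minimal sketch under stated assumptions: the `Removal` and `RescanPlan` types, the `planRescan` function, and the injected `isListed` callback are all hypothetical names for illustration and do not come from the existing codebase.

```typescript
// Hypothetical types for illustration; not taken from the existing codebase.
type RemovalStatus = "pending" | "completed" | "manual";

interface Removal {
  id: string;
  brokerId: string;
  profileId: string;
  status: RemovalStatus;
}

interface RescanPlan {
  // New removal requests to create for profiles that reappeared.
  newRequests: { brokerId: string; profileId: string; relistedFrom: string }[];
  // Fraction (0..1) of completed removals that stuck, keyed by broker.
  successRateByBroker: Record<string, number>;
}

// `isListed` stands in for a broker adapter's scan call (an assumption);
// injecting it lets tests substitute a mock for the real network check.
function planRescan(
  removals: Removal[],
  isListed: (brokerId: string, profileId: string) => boolean,
): RescanPlan {
  const newRequests: RescanPlan["newRequests"] = [];
  const totals: Record<string, { completed: number; relisted: number }> = {};

  for (const r of removals) {
    if (r.status !== "completed") continue; // only completed removals are re-scanned
    let t = totals[r.brokerId];
    if (!t) {
      t = { completed: 0, relisted: 0 };
      totals[r.brokerId] = t;
    }
    t.completed += 1;
    if (isListed(r.brokerId, r.profileId)) {
      // Profile reappeared: count it and queue a fresh removal request.
      t.relisted += 1;
      newRequests.push({
        brokerId: r.brokerId,
        profileId: r.profileId,
        relistedFrom: r.id,
      });
    }
  }

  const successRateByBroker: Record<string, number> = {};
  for (const brokerId of Object.keys(totals)) {
    const t = totals[brokerId];
    successRateByBroker[brokerId] = (t.completed - t.relisted) / t.completed;
  }
  return { newRequests, successRateByBroker };
}

// Usage: two completed removals on broker-a, one of which has reappeared.
const removals: Removal[] = [
  { id: "r1", brokerId: "broker-a", profileId: "p1", status: "completed" },
  { id: "r2", brokerId: "broker-a", profileId: "p2", status: "completed" },
  { id: "r3", brokerId: "broker-b", profileId: "p1", status: "pending" },
];
const plan = planRescan(removals, (b, p) => b === "broker-a" && p === "p2");
console.log(plan.newRequests.length);             // 1
console.log(plan.successRateByBroker["broker-a"]); // 0.5
```

In this sketch the weekly job would only need to load completed removals, call `planRescan` with the real adapter scan, persist `newRequests`, and publish `successRateByBroker` to the admin panel. Brokers with no completed removals are left out of the rate map, so the function never divides by zero.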