shortcomings

2026-05-31 22:03:18 -04:00
parent 3b29de3234
commit c159f07322
17 changed files with 1535 additions and 4 deletions


@@ -0,0 +1,79 @@
# 08. Expand Broker Coverage to 50+ with CAPTCHA Solving and Re-Scan Pipeline
meta:
id: core-services-08
feature: core-services-implementation
priority: P2
depends_on: [core-services-02]
tags: [removebrokers, automation, captcha, scaling, maintenance]
objective:
- Scale automated removals from the top 20 brokers to 50+, implement CAPTCHA solving, and build the re-scan pipeline that detects re-listings.
deliverables:
- 30+ additional broker adapters (total 50+)
- CAPTCHA solving integration (2Captcha or AntiCaptcha API)
- Re-scan scheduler that checks if removed profiles have reappeared
- Email verification handling for opt-out confirmation emails
- Removal success rate dashboard metric
steps:
1. Select next 30 brokers from registry by opt-out complexity (medium-difficulty form-based flows)
2. Create adapter modules for each broker in `removebrokers/adapters/`
3. Implement CAPTCHA solving:
- Detect reCAPTCHA v2/v3, hCaptcha, image challenges
- Integrate 2Captcha API ($0.001–$0.01 per solve)
- Add `CAPTCHA_SOLVER_API_KEY` to environment config
- Fall back to the manual queue if CAPTCHA solving fails 3 times
4. Implement email verification handling:
- Monitor mailbox for opt-out confirmation emails
- Parse confirmation links and auto-click them
- Store confirmation status in database
5. Build re-scan pipeline:
- Weekly scheduled job that re-scans all "completed" removals
- If profile reappears, create new removal request automatically
- Track re-listing rate per broker (some re-list every 30 days)
6. Add success metrics:
- Track removal success rate per broker (% of opt-outs that stick)
- Dashboard widget showing "X of Y brokers removed"
- Alert user when re-listing detected
7. Implement proxy rotation pool:
- Use residential proxy service (BrightData, IPRoyal)
- Rotate IP per broker session to avoid blocks
- Budget $1K–$3K/mo for proxy infrastructure
8. Add adapter health monitoring:
- Track adapter breakage rate
- Alert engineering when >5% of adapters fail in 24h
- Auto-disable broken adapters, queue for manual fix
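
As a rough illustration of step 5, the sketch below re-checks completed removals, collects re-listed profiles so that new removal requests can be created, and computes a per-broker re-listing rate. The names `CompletedRemoval`, `RescanReport`, `rescan`, and the injected `isListed` lookup are hypothetical and not taken from the codebase; a real job would call the broker adapters and persist the results.

```typescript
// All names here are hypothetical; they are not taken from the codebase.
interface CompletedRemoval {
  profileId: string;
  brokerId: string;
}

interface RescanReport {
  // Profiles found again; each should get a fresh removal request.
  relisted: CompletedRemoval[];
  // Fraction of re-scanned removals that reappeared, keyed by broker id.
  relistRateByBroker: Record<string, number>;
}

// `isListed` stands in for a broker adapter lookup (an assumption). It is
// injected so the logic can be exercised without network access.
function rescan(
  completed: CompletedRemoval[],
  isListed: (removal: CompletedRemoval) => boolean,
): RescanReport {
  const relisted: CompletedRemoval[] = [];
  const totals: Record<string, { scanned: number; relisted: number }> = {};

  for (const removal of completed) {
    const entry = totals[removal.brokerId] ?? { scanned: 0, relisted: 0 };
    entry.scanned += 1;
    if (isListed(removal)) {
      entry.relisted += 1;
      relisted.push(removal);
    }
    totals[removal.brokerId] = entry;
  }

  const relistRateByBroker: Record<string, number> = {};
  for (const brokerId of Object.keys(totals)) {
    const t = totals[brokerId];
    relistRateByBroker[brokerId] = t.relisted / t.scanned;
  }
  return { relisted, relistRateByBroker };
}
```

Each entry in `relisted` would feed the automatic removal request, and `relistRateByBroker` gives the per-broker re-listing figure that step 5 asks to track.
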
tests:
- Unit: Mock CAPTCHA solver, verify retry and fallback logic
- Integration: Test CAPTCHA solving against real broker site
- E2E: Complete removal for broker with CAPTCHA → verify re-scan detects re-listing
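
The retry-and-fallback behavior that the unit test targets might look like the following minimal sketch. `CaptchaSolver`, `solveWithFallback`, and the result shapes are assumed names for illustration only; the solver is injected so a test can pass a mock instead of calling a solving service.

```typescript
// Hypothetical types; none of these names come from the codebase.
type SolveResult = { ok: true; token: string } | { ok: false; reason: string };
type CaptchaSolver = (siteKey: string, pageUrl: string) => Promise<SolveResult>;

type CaptchaOutcome =
  | { status: "solved"; token: string; attempts: number }
  | { status: "manual_queue"; attempts: number; lastReason: string };

// The spec escalates to the manual queue after 3 failed attempts.
const MAX_CAPTCHA_ATTEMPTS = 3;

async function solveWithFallback(
  solver: CaptchaSolver,
  siteKey: string,
  pageUrl: string,
  maxAttempts: number = MAX_CAPTCHA_ATTEMPTS,
): Promise<CaptchaOutcome> {
  let lastReason = "unknown";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = await solver(siteKey, pageUrl);
      if (result.ok) {
        return { status: "solved", token: result.token, attempts: attempt };
      }
      lastReason = result.reason;
    } catch (err) {
      // A thrown error (e.g. a network failure) counts as a failed attempt.
      lastReason = err instanceof Error ? err.message : String(err);
    }
  }
  // Every attempt failed: hand the request to the manual queue.
  return { status: "manual_queue", attempts: maxAttempts, lastReason };
}
```

Injecting the solver keeps the three-attempt rule testable without network access: a mock that always fails should yield `manual_queue` after exactly three calls.
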
acceptance_criteria:
- [ ] 50+ broker adapters implemented and tested
- [ ] CAPTCHA challenges are detected and solved automatically (2Captcha integration)
- [ ] Failed CAPTCHA solving escalates to manual queue after 3 attempts
- [ ] Email confirmation links are parsed and clicked automatically
- [ ] Re-scan job runs weekly and detects re-listings within 7 days
- [ ] Re-listed profiles trigger automatic new removal requests
- [ ] Dashboard shows accurate removal progress: "47 of 50 brokers completed"
- [ ] Per-broker success rate is tracked and visible in admin panel
- [ ] Proxy rotation prevents IP blocking on high-volume brokers
- [ ] Adapter breakage is detected within 24 hours and broken adapters are auto-disabled
- [ ] Monthly proxy + CAPTCHA cost per user < $4 (within gross margin target)
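
One way to read the adapter-breakage criterion is sketched below. It assumes an adapter counts as failing when its most recent run in the last 24 hours failed, and that the 5% alert threshold is measured over adapters that ran in that window. Both are interpretations, and `AdapterRun`, `HealthReport`, and `checkAdapterHealth` are hypothetical names.

```typescript
// Hypothetical shape of one adapter run record; not from the codebase.
interface AdapterRun {
  adapterId: string;
  finishedAt: number; // epoch milliseconds
  succeeded: boolean;
}

interface HealthReport {
  failingAdapters: string[]; // candidates for auto-disable
  failureRate: number; // failing adapters / adapters seen in the window
  alertEngineering: boolean; // true when failureRate exceeds the threshold
}

const DAY_MS = 24 * 60 * 60 * 1000;

function checkAdapterHealth(
  runs: AdapterRun[],
  now: number,
  threshold: number = 0.05,
): HealthReport {
  // Keep only the most recent run per adapter inside the 24h window.
  const latest: Record<string, AdapterRun> = {};
  for (const run of runs) {
    if (run.finishedAt > now || now - run.finishedAt > DAY_MS) continue;
    const prev = latest[run.adapterId];
    if (!prev || run.finishedAt > prev.finishedAt) {
      latest[run.adapterId] = run;
    }
  }

  const ids = Object.keys(latest);
  const failingAdapters = ids.filter((id) => !latest[id].succeeded).sort();
  const failureRate = ids.length === 0 ? 0 : failingAdapters.length / ids.length;
  return {
    failingAdapters,
    failureRate,
    alertEngineering: failureRate > threshold,
  };
}
```

With 50 adapters, the strict `>` comparison means the alert fires at three failing adapters (6%), while two (4%) only triggers the per-adapter auto-disable.
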
validation:
- Run `vitest run removebrokers.service.test.ts` — extended tests for 50 brokers
- Manual: Test a CAPTCHA-protected broker (e.g., MyLife) and verify that automatic solving works
- Check re-scan: Run `bun run job:removebrokers:rescan`, verify re-listings detected
- Monitor costs: Dashboard shows monthly proxy/CAPTCHA spend per customer
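
The per-customer spend figure behind the cost check could be computed roughly as follows. This is a sketch under stated assumptions: `MonthlyUsage`, `costPerCustomerUsd`, and `withinBudget` are hypothetical names, and the customer and solve counts in the usage note are illustrative inputs, not measurements.

```typescript
// Hypothetical inputs for one billing month; not from the codebase.
interface MonthlyUsage {
  proxySpendUsd: number; // total proxy bill for the month
  captchaSolves: number; // total CAPTCHA solves for the month
  costPerSolveUsd: number; // price per solve
  activeCustomers: number;
}

function costPerCustomerUsd(u: MonthlyUsage): number {
  if (u.activeCustomers <= 0) {
    throw new RangeError("activeCustomers must be positive");
  }
  const total = u.proxySpendUsd + u.captchaSolves * u.costPerSolveUsd;
  return total / u.activeCustomers;
}

// The acceptance criteria cap proxy + CAPTCHA cost at under $4 per user.
function withinBudget(u: MonthlyUsage, budgetUsd: number = 4): boolean {
  return costPerCustomerUsd(u) < budgetUsd;
}
```

At the upper-end prices in this task ($3K/mo for proxies, $0.01 per solve), an assumed 1,000 customers with 50 solves each works out to (3000 + 500) / 1000 = $3.50 per customer. With only 500 customers at the same usage, it is (3000 + 250) / 500 = $6.50, which suggests the fixed proxy cost dominates at low customer counts.
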
notes:
- Broker sites change frequently — budget 20% engineering time for adapter maintenance
- Some brokers (Acxiom, Epsilon) require physical mail — flag these for manual processing
- Re-listing is common: data brokers rebuild databases from public records every 30–90 days
- Consider AI-assisted form field detection (GPT-4 Vision) to reduce per-adapter development time
- The existing `broker.registry.ts` already has 100+ entries — prioritize by traffic/popularity
- Success rate target: 80%+ for automated removals, 90%+ with manual fallback