shortcomings
@@ -0,0 +1,61 @@
# 02. Automated Removal Engine for Top 20 Data Brokers
meta:
id: core-services-02
feature: core-services-implementation
priority: P0
depends_on: [core-services-01]
tags: [removebrokers, automation, playwright, scraping, revenue]
objective:
- Replace the `submitAutomatedRemoval()` stub, which currently returns `crypto.randomUUID()`, with real Playwright-based browser automation that submits opt-out requests to the top 20 data brokers.
deliverables:
- Playwright-based removal engine in `removebrokers/removal.engine.ts`
- Per-broker adapter modules for the top 20 brokers (Spokeo, Whitepages, MyLife, BeenVerified, etc.)
- CAPTCHA detection and graceful failure (manual fallback flow)
- Removal request status tracking with actual polling
- Email notification service integration for opt-out confirmations
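
The CAPTCHA detection deliverable could start from a simple marker check on the page HTML. The sketch below is illustrative only: the marker list, the `detectCaptcha` and `routeRequest` names, and the `manual_review` route are assumptions rather than existing code, and real detection would likely need per-broker tuning.

```typescript
// Widget class names commonly embedded by CAPTCHA providers.
// This list is an assumption; broker pages may use other challenges.
const CAPTCHA_MARKERS: readonly string[] = [
  "g-recaptcha",  // Google reCAPTCHA
  "h-captcha",    // hCaptcha
  "cf-turnstile", // Cloudflare Turnstile
];

type Route = "automated" | "manual_review";

/** Returns the first known marker found in the HTML, or null if none matches. */
function detectCaptcha(html: string): string | null {
  const lower = html.toLowerCase();
  for (const marker of CAPTCHA_MARKERS) {
    if (lower.indexOf(marker) !== -1) return marker;
  }
  return null;
}

/** Decides whether a request may proceed automatically or must be flagged. */
function routeRequest(html: string): { route: Route; reason?: string } {
  const marker = detectCaptcha(html);
  return marker !== null
    ? { route: "manual_review", reason: `captcha marker: ${marker}` }
    : { route: "automated" };
}
```

In an adapter, the HTML would typically come from Playwright's `page.content()`, and a `manual_review` result would be recorded on the request instead of being dropped silently.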
steps:
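1. Install Playwright as a runtime dependency (the engine uses it outside of tests): `npm install playwright`, then `npx playwright install` to download browsers; add `npm install -D @playwright/test` only if the Playwright test runner is needed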
2. Analyze opt-out flows for top 20 brokers from existing registry data
3. Create `removebrokers/adapters/` directory with one module per broker
4. Implement base adapter interface: `scanForProfile`, `submitOptOut`, `verifyRemoval`, `getStatus`
5. Implement adapters for each top 20 broker with navigation, form filling, and submission logic
6. Add proxy rotation support (BrightData or similar) to avoid IP blocking
7. Add stealth mode (playwright-stealth) to reduce detection
8. Implement `submitAutomatedRemoval()` to select correct adapter by broker ID and execute
9. Store actual request IDs from brokers (not generated UUIDs) in database
10. Implement `trackRemovalStatus()` with periodic re-scans for submitted requests
11. Integrate with notification service to email user when removal is confirmed
12. Add job handler for batch removal processing queue
13. Handle failures gracefully: retry with backoff, escalate to manual queue after 3 failures
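
Steps 4, 8, and 9 might fit together roughly as sketched below. Only the four adapter method names and `submitAutomatedRemoval` come from this spec; the `Profile` shape, the method signatures, the registry `Map`, and the decision to report "nothing found" as `completed` are assumptions made for illustration.

```typescript
type RemovalStatus = "pending" | "submitted" | "completed" | "failed";

// Hypothetical input shape; the real profile model may differ.
interface Profile {
  fullName: string;
  email: string;
}

// Base adapter interface with the four methods named in step 4.
// Parameter and return types are assumptions.
interface BrokerAdapter {
  readonly brokerId: string;
  /** Resolves to the listing URL, or null when no profile is found. */
  scanForProfile(profile: Profile): Promise<string | null>;
  /** Resolves to the request ID issued by the broker. */
  submitOptOut(listingUrl: string, profile: Profile): Promise<string>;
  verifyRemoval(listingUrl: string): Promise<boolean>;
  getStatus(requestId: string): Promise<RemovalStatus>;
}

// Registry keyed by broker ID, so dispatch is a single lookup.
const adapters = new Map<string, BrokerAdapter>();

function registerAdapter(adapter: BrokerAdapter): void {
  adapters.set(adapter.brokerId, adapter);
}

async function submitAutomatedRemoval(
  brokerId: string,
  profile: Profile,
): Promise<{ requestId: string | null; status: RemovalStatus }> {
  const adapter = adapters.get(brokerId);
  if (!adapter) {
    throw new Error(`No adapter registered for broker "${brokerId}"`);
  }

  const listingUrl = await adapter.scanForProfile(profile);
  if (listingUrl === null) {
    // Assumption: with no listing there is nothing to remove.
    return { requestId: null, status: "completed" };
  }

  // Store the broker's own request ID rather than a generated UUID.
  const requestId = await adapter.submitOptOut(listingUrl, profile);
  return { requestId, status: "submitted" };
}
```

Keeping the registry separate from the adapters means a broken adapter can be replaced or disabled without touching the dispatch code.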
tests:
- Unit: Mock Playwright browser, verify adapter navigation sequences
- Integration: Run adapter against real broker site in headful mode, verify opt-out form submission
- E2E: Full flow — add broker to watchlist → trigger removal → verify status progression
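
For the unit tier, one way to verify a navigation sequence without launching a browser is a hand-written fake that records calls. In this sketch, `PageLike` mirrors only the `goto`, `fill`, and `click` methods of Playwright's `Page`; the `submitOptOut` function, the URL, and the selectors are placeholders, not a real broker flow.

```typescript
// Structural stand-in for the subset of Playwright's Page used by the adapter.
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
}

// Hypothetical adapter step; URL and selectors are placeholders.
async function submitOptOut(
  page: PageLike,
  listingUrl: string,
  email: string,
): Promise<void> {
  await page.goto("https://broker.example/optout");
  await page.fill("#listing-url", listingUrl);
  await page.fill("#email", email);
  await page.click("button[type=submit]");
}

// Fake page that records every call in order instead of driving a browser.
function createRecordingPage(): { page: PageLike; calls: string[] } {
  const calls: string[] = [];
  const page: PageLike = {
    goto: async (url) => {
      calls.push(`goto ${url}`);
      return null;
    },
    fill: async (selector, value) => {
      calls.push(`fill ${selector}=${value}`);
    },
    click: async (selector) => {
      calls.push(`click ${selector}`);
    },
  };
  return { page, calls };
}
```

Inside the Vitest suite, a test would call `submitOptOut` with the recording page and compare `calls` against the expected sequence, for example with `expect(calls).toEqual([...])`.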
acceptance_criteria:
- [ ] Top 20 broker adapters are implemented and tested against live sites
- [ ] `submitAutomatedRemoval()` no longer returns mock UUIDs — it submits real opt-out requests
- [ ] Removal status tracks actual broker state (pending → submitted → completed/failed)
- [ ] Failed removals are retried 3 times with exponential backoff, then escalated to manual queue
- [ ] CAPTCHA challenges are detected and flagged for manual processing (not silently failing)
- [ ] Job queue processes removals asynchronously without blocking API responses
- [ ] User dashboard shows real removal progress per broker
- [ ] All Playwright browsers are properly closed after each session (no resource leaks)
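
The retry and cleanup criteria could be met with two small helpers along these lines. This is a sketch under assumptions: it reads "3 failures" as three attempts in total, uses a 1-second base delay that doubles each time, and every name (`retryThenEscalate`, `withSession`, `Closable`) is hypothetical.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Delay after failed attempt n (1-based): base, 2*base, 4*base, ... */
function backoffDelay(attempt: number, baseMs: number): number {
  return baseMs * 2 ** (attempt - 1);
}

type Outcome<T> =
  | { kind: "ok"; value: T; attempts: number }
  | { kind: "escalated"; lastError: unknown; attempts: number };

/** Runs the task up to maxAttempts times, then reports escalation. */
async function retryThenEscalate<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  sleep: Sleep = realSleep, // injectable so tests do not wait
  baseMs = 1000,
): Promise<Outcome<T>> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { kind: "ok", value: await task(), attempts: attempt };
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < maxAttempts) await sleep(backoffDelay(attempt, baseMs));
    }
  }
  // The caller is expected to move the request to the manual queue.
  return { kind: "escalated", lastError, attempts: maxAttempts };
}

// Anything with an async close(), such as a Playwright Browser.
interface Closable {
  close(): Promise<void>;
}

/** Guarantees close() runs whether the work succeeds or throws. */
async function withSession<B extends Closable, T>(
  launch: () => Promise<B>,
  work: (browser: B) => Promise<T>,
): Promise<T> {
  const browser = await launch();
  try {
    return await work(browser);
  } finally {
    await browser.close();
  }
}
```

With Playwright, `launch` would be something like `() => chromium.launch()`, and the adapter call would be wrapped as `retryThenEscalate(() => withSession(launch, work))` so that each attempt gets a fresh browser that is always closed.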
validation:
- Run `vitest run removebrokers.service.test.ts` — all tests pass
- Manual test: Trigger removal for Spokeo, verify opt-out email received
- Check database: `removal_requests` table has real request IDs and actual status values
- Run removal job: `bun run job:removebrokers` processes queue without errors
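
When checking status values in `removal_requests`, a small transition table can make "actual status values" testable. The four status names come from the acceptance criteria; the allowed transitions below, including `pending` going straight to `failed`, are an assumption, as are the helper names.

```typescript
type RemovalStatus = "pending" | "submitted" | "completed" | "failed";

// Assumed transition table: terminal states allow no further moves.
const ALLOWED: Record<RemovalStatus, readonly RemovalStatus[]> = {
  pending: ["submitted", "failed"],
  submitted: ["completed", "failed"],
  completed: [],
  failed: [],
};

/** Type guard for raw status strings read from the database. */
function isKnownStatus(value: string): value is RemovalStatus {
  return Object.prototype.hasOwnProperty.call(ALLOWED, value);
}

/** True when moving from one status to another follows the table. */
function canTransition(from: RemovalStatus, to: RemovalStatus): boolean {
  return ALLOWED[from].indexOf(to) !== -1;
}
```

A validation script could read each row's status, reject anything `isKnownStatus` does not accept, and use `canTransition` when replaying a request's status history.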
notes:
- Broker sites change frequently; expect 15–25% of adapters (roughly 3–5 of the 20) to break per quarter
- Some brokers send a verification email to the address on the listing, which is often outdated; flag these for manual follow-up
- Start with brokers that have simple form-based opt-outs; defer email/physical mail brokers to Phase 3
- The existing broker registry in `broker.registry.ts` already has removal URLs — use these as starting points
- Budget $1K–$3K/mo for proxy infrastructure at scale