05. Periodic Scan Scheduling, WebSocket Progress, and Alert Deduplication
meta:
- id: core-services-05
- feature: core-services-implementation
- priority: P1
- depends_on: [core-services-03, core-services-04]
- tags: [darkwatch, scheduler, websocket, real-time, deduplication, alerts]
objective:
- Make DarkWatch continuously useful by scheduling periodic scans, providing real-time progress via WebSocket, and eliminating alert fatigue through intelligent deduplication.
deliverables:
- Cron-based scan scheduler with configurable frequency per tier
- WebSocket real-time scan progress updates (already have websocket.ts)
- Alert cooldown periods to prevent duplicate notifications
- Digest mode: batch low-priority alerts into daily/weekly summaries
- Scan history and metrics dashboard data
steps:
- Implement cron job scheduler in jobs/handlers/darkwatch.scan.ts:
- Daily scans for active subscriptions
- Respects tier limits (Shield = HIBP only daily, Guard+ = full suite weekly)
- Add scanFrequency field to subscription schema (daily, weekly, monthly)
- Wire WebSocket push from existing websocket.ts into scan engine:
- Emit scan:started, scan:progress (completedSources/totalSources), scan:completed events
- Client dashboard subscribes to user-specific scan events
- Enhance alert deduplication beyond existing exposure dedup:
- Add alertCooldownHours per alert type (e.g., 24h for same breach, 72h for property changes)
- Track lastAlertSentAt per (userId, alertType, source) tuple
- Don't create new alerts during cooldown unless severity increases
- Implement digest mode:
- Low-priority alerts (info) batched into daily digest email
- Warning/critical alerts sent immediately via push + email
- User preference: immediate vs. digest per severity level
- Add scan metrics:
- Store scan duration, sources checked, exposures found, alerts generated
- Aggregate for dashboard "threat score" calculation
- Implement scan failure recovery:
- Partial scan results saved even if one source fails
- Failed sources retried individually in next scan window
- Add rate limit per user: max 1 concurrent scan, queue subsequent requests
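The per-user rate limit in the last step (max 1 concurrent scan, queue subsequent requests) could be sketched as a small state machine. This is a minimal illustration, not the existing implementation: UserScanQueue, request, and complete are hypothetical names, and the state lives in memory here, so a multi-instance deployment would need the same bookkeeping in a shared store.

```typescript
// Hypothetical sketch: one running scan per user, later requests wait in FIFO order.
type RequestOutcome = "started" | "queued";

class UserScanQueue {
  // userId -> id of the scan currently running for that user
  private running = new Map<string, string>();
  // userId -> ids of scans waiting for the running one to finish
  private waiting = new Map<string, string[]>();

  /** Start the scan if the user has none running; otherwise queue it. */
  request(userId: string, scanId: string): RequestOutcome {
    if (!this.running.has(userId)) {
      this.running.set(userId, scanId);
      return "started";
    }
    const queue = this.waiting.get(userId) ?? [];
    queue.push(scanId);
    this.waiting.set(userId, queue);
    return "queued";
  }

  /** Mark the user's running scan as finished and promote the next queued scan, if any. */
  complete(userId: string): string | undefined {
    this.running.delete(userId);
    const queue = this.waiting.get(userId);
    const next = queue?.shift();
    if (queue !== undefined && queue.length === 0) {
      this.waiting.delete(userId);
    }
    if (next !== undefined) {
      this.running.set(userId, next);
    }
    return next;
  }
}
```

A caller would invoke request when a scan is triggered (by cron or from the dashboard) and complete when the scan engine finishes, starting whichever scan id complete returns.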
tests:
- Unit: Verify cron expression parsing, cooldown logic, digest batching
- Integration: Trigger scheduled scan, verify WebSocket events emitted in correct order
- E2E: Start scan from dashboard → watch progress bar → receive completion notification
acceptance_criteria:
- Scans run automatically on schedule without manual trigger (cron job)
- WebSocket pushes real-time progress: scan:progress events with percentage complete
- Only one scan runs per user at a time; additional requests are queued
- Duplicate alerts are suppressed during cooldown period (configurable per type)
- Info-level alerts are batched into daily digest; warning/critical sent immediately
- Scan history is persisted and visible in dashboard (last scan date, sources checked, findings)
- Failed sources don't fail entire scan — partial results are saved
- Dashboard threat score updates automatically after each scan completion
- Free tier gets weekly scans; paid tiers get daily scans
- No duplicate notifications for same exposure across multiple scans
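The digest criterion above (info batched, warning/critical sent immediately, with a per-severity user preference) can be illustrated with a small routing function. This is a sketch under assumptions: ScanAlert, Delivery, DEFAULT_DELIVERY, and routeAlerts are hypothetical names, and the three severity levels are the ones named in this document.

```typescript
type Severity = "info" | "warning" | "critical";
type Delivery = "immediate" | "digest";

// Hypothetical alert shape; a real alert would carry more fields.
interface ScanAlert {
  id: string;
  severity: Severity;
}

// Defaults mirror the criteria: info goes to the digest, warning and critical go out at once.
const DEFAULT_DELIVERY: Record<Severity, Delivery> = {
  info: "digest",
  warning: "immediate",
  critical: "immediate",
};

/**
 * Split alerts into those to send now and those to hold for the next digest.
 * `prefs` holds optional per-severity overrides chosen by the user.
 */
function routeAlerts(
  alerts: ScanAlert[],
  prefs: Partial<Record<Severity, Delivery>> = {},
): Record<Delivery, ScanAlert[]> {
  const routed: Record<Delivery, ScanAlert[]> = { immediate: [], digest: [] };
  for (const alert of alerts) {
    const delivery = prefs[alert.severity] ?? DEFAULT_DELIVERY[alert.severity];
    routed[delivery].push(alert);
  }
  return routed;
}
```

Keeping the defaults in one table means a user preference only has to store the severities it overrides.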
validation:
- Run cron job manually: bun run job:darkwatch:scan, verify scan completes and exposures created
- Connect to WebSocket: wscat -c ws://localhost:3000/ws, subscribe to scan events
- Check dashboard: Scan progress bar animates during active scan, threat score updates after
- Test cooldown: Trigger same scan twice rapidly, verify second scan doesn't create duplicate alerts
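The cooldown check being validated here can be expressed as a pure function, which also makes it easy to unit test. The sketch below assumes a severity ordering of info < warning < critical; LastAlert, cooldownKey, and shouldCreateAlert are hypothetical names, while the (userId, alertType, source) key and the "unless severity increases" exception come from the steps in this document.

```typescript
type Severity = "info" | "warning" | "critical";

// Assumed ordering, used to decide whether severity has increased.
const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

const HOUR_MS = 60 * 60 * 1000;

// What is remembered about the last alert sent for a given key.
interface LastAlert {
  sentAt: Date;
  severity: Severity;
}

/** Build a lookup key for the (userId, alertType, source) tuple. */
function cooldownKey(userId: string, alertType: string, source: string): string {
  return JSON.stringify([userId, alertType, source]);
}

/**
 * Decide whether a new alert may be created.
 * Returns true when no alert was sent before, when the cooldown has elapsed,
 * or when the new severity is higher than the last one sent.
 */
function shouldCreateAlert(
  last: LastAlert | undefined,
  severity: Severity,
  cooldownHours: number,
  now: Date,
): boolean {
  if (last === undefined) return true;
  const elapsedMs = now.getTime() - last.sentAt.getTime();
  if (elapsedMs >= cooldownHours * HOUR_MS) return true;
  return SEVERITY_RANK[severity] > SEVERITY_RANK[last.severity];
}
```

For the "trigger the same scan twice rapidly" check, the second scan would look up the stored LastAlert by cooldownKey and find shouldCreateAlert returning false for an alert of the same severity.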
notes:
- The existing scanStates Map in darkwatch.service.ts is in-memory; move to Redis for multi-instance safety
- WebSocket infrastructure exists at websocket.ts; extend it for scan-specific events
- The scheduler directory (scheduler/) currently only has Dockerfiles; this task creates actual job logic
- Consider using Honker (Rust queue) for scan job distribution once it's production-ready
- Alert fatigue is a real churn driver — aggressive deduplication is a competitive advantage
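As an illustration of the scan-specific events mentioned in the notes, the following sketch shows one possible payload shape and the order in which a client would see the events. Only the event names and the completedSources/totalSources pair come from this document; scanId, percent, progressEvent, and scanEventSequence are assumptions, and the real websocket.ts API is not shown here.

```typescript
// Hypothetical payloads for the three scan events.
type ScanEvent =
  | { type: "scan:started"; scanId: string; totalSources: number }
  | {
      type: "scan:progress";
      scanId: string;
      completedSources: number;
      totalSources: number;
      percent: number;
    }
  | { type: "scan:completed"; scanId: string };

/** Build a progress event, deriving the percentage from the source counts. */
function progressEvent(scanId: string, completedSources: number, totalSources: number): ScanEvent {
  // A scan with no sources is treated as fully complete to avoid dividing by zero.
  const percent =
    totalSources === 0 ? 100 : Math.round((completedSources / totalSources) * 100);
  return { type: "scan:progress", scanId, completedSources, totalSources, percent };
}

/**
 * Produce the ordered event sequence for a scan over the given sources:
 * one started event, one progress event per finished source, one completed event.
 * A real scan engine would emit each event as it happens rather than build a list.
 */
function scanEventSequence(scanId: string, sources: string[]): ScanEvent[] {
  const events: ScanEvent[] = [{ type: "scan:started", scanId, totalSources: sources.length }];
  for (let done = 1; done <= sources.length; done++) {
    events.push(progressEvent(scanId, done, sources.length));
  }
  events.push({ type: "scan:completed", scanId });
  return events;
}
```

Modelling the events as a discriminated union lets the dashboard client switch on type and get the right payload fields checked at compile time.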