# 05. Periodic Scan Scheduling, WebSocket Progress, and Alert Deduplication

meta:
  id: core-services-05
  feature: core-services-implementation
  priority: P1
  depends_on: [core-services-03, core-services-04]
  tags: [darkwatch, scheduler, websocket, real-time, deduplication, alerts]

objective:
  - Keep DarkWatch useful between manual checks by scheduling periodic scans, streaming real-time scan progress over WebSocket, and reducing alert fatigue through deduplication.

deliverables:
  - Cron-based scan scheduler with configurable frequency per tier
  - Real-time scan progress updates over WebSocket (building on the existing `websocket.ts`)
  - Alert cooldown periods to prevent duplicate notifications
  - Digest mode: batch low-priority alerts into daily/weekly summaries
  - Scan history and metrics data for the dashboard

steps:
  1. Implement the cron job scheduler in `jobs/handlers/darkwatch.scan.ts`:
     - Run daily scans for active subscriptions
     - Respect tier limits (Shield: HIBP only, daily; Guard+: full suite, weekly)
  2. Add a `scanFrequency` field to the subscription schema (daily, weekly, monthly)
  3. Wire WebSocket push from the existing `websocket.ts` into the scan engine:
     - Emit `scan:started`, `scan:progress` (completedSources/totalSources), and `scan:completed` events
     - Client dashboard subscribes to user-specific scan events
  4. Enhance alert deduplication beyond the existing exposure dedup:
     - Add `alertCooldownHours` per alert type (e.g., 24h for same breach, 72h for property changes)
     - Track `lastAlertSentAt` per (userId, alertType, source) tuple
     - Don't create new alerts during cooldown unless severity increases
  5. Implement digest mode:
     - Low-priority alerts (info) are batched into a daily digest email
     - Warning/critical alerts are sent immediately via push + email
     - User preference: immediate vs. digest per severity level
  6. Add scan metrics:
     - Store scan duration, sources checked, exposures found, alerts generated
     - Aggregate for the dashboard "threat score" calculation
  7. Implement scan failure recovery:
     - Partial scan results are saved even if one source fails
     - Failed sources are retried individually in the next scan window
  8. Add a per-user concurrency limit: max 1 concurrent scan; queue subsequent requests
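
The one-scan-per-user limit in step 8 could be sketched as a promise chain per user. This is a minimal in-memory illustration only: `PerUserScanQueue` and `enqueue` are hypothetical names, and a multi-instance deployment would need shared state (for example Redis) rather than a local `Map`.

```typescript
// Hypothetical helper: serializes scans per user by chaining promises.
type ScanTask<T> = () => Promise<T>;

class PerUserScanQueue {
  // Tail of the promise chain for each user with a running or queued scan.
  private tails = new Map<string, Promise<void>>();

  enqueue<T>(userId: string, task: ScanTask<T>): Promise<T> {
    // Tails never reject, so chaining with then() always starts the next task.
    const previous = this.tails.get(userId) ?? Promise.resolve();
    const result = previous.then(task);

    // Swallow the outcome so one failed scan does not block later ones.
    const tail: Promise<void> = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(userId, tail);

    // Drop the entry once this user's queue has drained.
    void tail.then(() => {
      if (this.tails.get(userId) === tail) this.tails.delete(userId);
    });

    // The caller still sees the task's real result or error.
    return result;
  }
}
```

Scans for different users run independently; a second `enqueue` call for the same user starts only after the first scan settles, whether it succeeded or failed.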

tests:
  - Unit: Verify cron expression parsing, cooldown logic, digest batching
  - Integration: Trigger scheduled scan, verify WebSocket events emitted in correct order
  - E2E: Start scan from dashboard → watch progress bar → receive completion notification
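
For the cooldown unit tests, it helps to have the rule from step 4 as a function that takes the current time as a parameter. The sketch below is one possible shape, not the existing implementation: `shouldCreateAlert`, the `breach` and `property_change` type names, and the in-memory `lastAlerts` map are all assumptions, and the cooldown hours are the example values given in step 4.

```typescript
type Severity = "info" | "warning" | "critical";

const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, critical: 2 };

// Hypothetical per-type cooldowns, using the example values from step 4.
const alertCooldownHours: Record<string, number> = {
  breach: 24,
  property_change: 72,
};

interface LastAlert {
  sentAt: Date;
  severity: Severity;
}

// In-memory stand-in for a persistent store keyed by (userId, alertType, source).
const lastAlerts = new Map<string, LastAlert>();

function cooldownKey(userId: string, alertType: string, source: string): string {
  return JSON.stringify([userId, alertType, source]);
}

// Returns true if a new alert should be created, and records it as sent.
// During cooldown an alert is allowed only if its severity is higher than
// the last one sent. Alert types without a configured cooldown always pass.
function shouldCreateAlert(
  userId: string,
  alertType: string,
  source: string,
  severity: Severity,
  now: Date,
): boolean {
  const key = cooldownKey(userId, alertType, source);
  const last = lastAlerts.get(key);
  const cooldownMs = (alertCooldownHours[alertType] ?? 0) * 3_600_000;

  if (last !== undefined) {
    const inCooldown = now.getTime() - last.sentAt.getTime() < cooldownMs;
    const escalated = SEVERITY_RANK[severity] > SEVERITY_RANK[last.severity];
    if (inCooldown && !escalated) return false;
  }

  lastAlerts.set(key, { sentAt: now, severity });
  return true;
}
```

Passing `now` explicitly lets a test step through the cooldown window without mocking the system clock.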

acceptance_criteria:
  - [ ] Scans run automatically on schedule without manual trigger (cron job)
  - [ ] WebSocket pushes real-time progress: `scan:progress` events with percentage complete (from completedSources/totalSources)
  - [ ] Only one scan runs per user at a time; additional requests are queued
  - [ ] Duplicate alerts are suppressed during the cooldown period (configurable per type)
  - [ ] Info-level alerts are batched into a daily digest; warning/critical alerts are sent immediately
  - [ ] Scan history is persisted and visible in the dashboard (last scan date, sources checked, findings)
  - [ ] Failed sources don't fail the entire scan; partial results are saved
  - [ ] Dashboard threat score updates automatically after each scan completion
  - [ ] Free tier gets weekly scans; paid tiers get daily scans
  - [ ] No duplicate notifications for the same exposure across multiple scans
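
The digest criterion above, combined with the per-severity user preference from step 5, amounts to a small routing decision. A minimal sketch follows, assuming hypothetical `Alert` and `DeliveryPreferences` shapes and hypothetical function names; the defaults mirror the spec (info to digest, warning/critical immediate).

```typescript
type Severity = "info" | "warning" | "critical";
type DeliveryMode = "immediate" | "digest";

// Hypothetical alert shape for illustration.
interface Alert {
  id: string;
  userId: string;
  severity: Severity;
  message: string;
}

// Defaults from the spec; a user may override any severity level.
const DEFAULT_DELIVERY: Record<Severity, DeliveryMode> = {
  info: "digest",
  warning: "immediate",
  critical: "immediate",
};

type DeliveryPreferences = Partial<Record<Severity, DeliveryMode>>;

// Splits alerts into those to send now and those to hold for the next digest.
function routeAlerts(
  alerts: Alert[],
  prefs: DeliveryPreferences = {},
): Record<DeliveryMode, Alert[]> {
  const routed: Record<DeliveryMode, Alert[]> = { immediate: [], digest: [] };
  for (const alert of alerts) {
    const mode = prefs[alert.severity] ?? DEFAULT_DELIVERY[alert.severity];
    routed[mode].push(alert);
  }
  return routed;
}

// Groups held alerts by user so each user receives a single digest email.
function buildDigests(held: Alert[]): Map<string, Alert[]> {
  const digests = new Map<string, Alert[]>();
  for (const alert of held) {
    const batch = digests.get(alert.userId) ?? [];
    batch.push(alert);
    digests.set(alert.userId, batch);
  }
  return digests;
}
```

A daily job would call `buildDigests` on everything held since the previous run and send one email per map entry.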

validation:
  - Run the cron job manually: `bun run job:darkwatch:scan`, verify the scan completes and exposures are created
  - Connect to WebSocket: `wscat -c ws://localhost:3000/ws`, subscribe to scan events
  - Check dashboard: the scan progress bar animates during an active scan, and the threat score updates afterward
  - Test cooldown: trigger the same scan twice in quick succession, verify the second scan doesn't create duplicate alerts
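
The WebSocket and dashboard checks above depend on events arriving in a fixed order and on a failed source not aborting the scan. The sketch below shows one way a scan loop could provide both. `runScan`, `Emit`, `SourceCheck`, and the event payload fields beyond `completedSources`/`totalSources` are assumptions; the real `websocket.ts` API is not shown in this document, so the emitter is passed in as a plain callback.

```typescript
type ScanEvent =
  | { type: "scan:started"; scanId: string; totalSources: number }
  | {
      type: "scan:progress";
      scanId: string;
      completedSources: number;
      totalSources: number;
      percent: number;
    }
  | { type: "scan:completed"; scanId: string; failedSources: string[] };

// Hypothetical emitter; in practice this would push to the user's WebSocket channel.
type Emit = (userId: string, event: ScanEvent) => void;

// Hypothetical source check that resolves to the number of exposures found.
type SourceCheck = () => Promise<number>;

interface ScanSummary {
  sourcesChecked: number;
  exposuresFound: number;
  failedSources: string[];
  durationMs: number;
}

async function runScan(
  userId: string,
  scanId: string,
  sources: Record<string, SourceCheck>,
  emit: Emit,
): Promise<ScanSummary> {
  const entries = Object.entries(sources);
  const totalSources = entries.length;
  const startedAt = Date.now();
  const failedSources: string[] = [];
  let exposuresFound = 0;
  let completedSources = 0;

  emit(userId, { type: "scan:started", scanId, totalSources });

  for (const [name, check] of entries) {
    try {
      exposuresFound += await check();
    } catch {
      // Keep partial results; the failed source can be retried in the next window.
      failedSources.push(name);
    }
    // Assumption: a failed source still counts as processed so progress reaches 100%.
    completedSources += 1;
    const percent = Math.round((completedSources / totalSources) * 100);
    emit(userId, { type: "scan:progress", scanId, completedSources, totalSources, percent });
  }

  emit(userId, { type: "scan:completed", scanId, failedSources });

  return {
    sourcesChecked: totalSources - failedSources.length,
    exposuresFound,
    failedSources,
    durationMs: Date.now() - startedAt,
  };
}
```

The returned summary carries the fields listed under scan metrics in step 6, so the same call can feed both the progress bar and the scan history.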

notes:
  - The existing `scanStates` Map in `darkwatch.service.ts` is in-memory; move it to Redis for multi-instance safety
  - WebSocket infrastructure exists at `websocket.ts`; extend it for scan-specific events
  - The scheduler directory (`scheduler/`) currently only has Dockerfiles; this task creates the actual job logic
  - Consider using Honker (Rust queue) for scan job distribution once it's production-ready
  - Alert fatigue is a real churn driver, so aggressive deduplication is a competitive advantage