03. HaveIBeenPwned API Integration for Email Breach Monitoring
meta:
- id: core-services-03
- feature: core-services-implementation
- priority: P0
- depends_on: [core-services-01]
- tags: [darkwatch, hibp, breach-monitoring, api-integration, table-stakes]
objective:
- Replace the stub `scanHIBP()` function in the DarkWatch scan engine with a real HaveIBeenPwned API integration that checks user emails against known breach databases and creates exposure records.
deliverables:
- HIBP API client with k-anonymity support for password checking
- Email breach lookup with result parsing and normalization
- Exposure record creation in database with proper severity scoring
- Alert generation via existing alert pipeline
- Circuit breaker integration (already exists in scan engine)
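
A minimal sketch of the parsing, normalization, and severity scoring deliverables. The `HibpBreach` fields are a subset of the HIBP breach model; `DATA_CLASS_MAP`, `normalizeBreach`, `scoreSeverity`, and the extra `credit_card` and `username` types are hypothetical names for illustration. The sketch treats a password exposure as critical, following the alert criterion in this spec; adjust if that is not the intent.

```typescript
// Subset of the breach model HIBP returns for a breached account.
interface HibpBreach {
  Name: string;
  BreachDate: string; // ISO date string
  PwnCount: number;
  DataClasses: string[];
}

// Internal types from this spec, plus two assumed extras needed for scoring.
type DataType =
  | "email" | "password" | "phone" | "address" | "ssn" | "domain"
  | "credit_card" | "username";

type Severity = "critical" | "warning" | "info";

interface BreachResult {
  name: string;
  date: string;
  affectedAccounts: number;
  dataTypes: DataType[];
  severity: Severity;
}

// Assumed mapping from HIBP data class labels to internal types.
// Labels missing from this map are dropped during normalization.
const DATA_CLASS_MAP = new Map<string, DataType>([
  ["Email addresses", "email"],
  ["Passwords", "password"],
  ["Phone numbers", "phone"],
  ["Physical addresses", "address"],
  ["Social security numbers", "ssn"],
  ["Credit cards", "credit_card"],
  ["Usernames", "username"],
]);

function scoreSeverity(types: DataType[]): Severity {
  const has = (t: DataType): boolean => types.includes(t);
  // Assumption: password counts as critical alongside SSN and credit card.
  if (has("ssn") || has("credit_card") || has("password")) return "critical";
  if (has("email") || has("phone")) return "warning";
  return "info";
}

function normalizeBreach(raw: HibpBreach): BreachResult {
  const dataTypes = raw.DataClasses
    .map((label) => DATA_CLASS_MAP.get(label))
    .filter((t): t is DataType => t !== undefined);
  return {
    name: raw.Name,
    date: raw.BreachDate,
    affectedAccounts: raw.PwnCount,
    dataTypes,
    severity: scoreSeverity(dataTypes),
  };
}
```

Keeping the mapping in one table makes it easy to review which HIBP labels are silently ignored.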
steps:
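- Sign up for a HIBP API key at https://haveibeenpwned.com/API/Key (this spec assumes a 1,500 req/mo allowance; confirm the quota and price for the chosen plan, since breachedaccount lookups need a subscription key)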
- Add `HIBP_API_KEY` to `.env.example` and validate in `env.ts`
- Create `darkwatch/hibp.client.ts` with functions:
  - `checkEmail(email): BreachResult[]`: query the breachedaccount endpoint
  - `checkPassword(passwordHash): PwnedPasswordResult`: query the pwnedpasswords endpoint using k-anonymity
  - `getBreaches(): Breach[]`: fetch breach metadata for caching
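- Parse HIBP response: breach name, date, compromised data types, affected accounts (request with `truncateResponse=false`; by default the breachedaccount endpoint returns only breach names)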
- Map data types to internal schema: email, password, phone, address, ssn, domain
- Calculate severity: critical if SSN, credit card, or password (the acceptance criteria treat password exposure as critical), warning if email/phone, info if username only
- Deduplicate against existing exposures using `identifierHash` (already implemented)
- Create exposure records via existing `processExposure()` pipeline
- Cache breach metadata in Redis (update daily) to reduce API calls
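- Handle rate limits with a request queue: default to 1 req/sec and make the rate configurable (this spec assumes up to 10 req/sec on a higher plan; confirm the actual limit for the subscribed plan), and honor the `retry-after` header on 429 responses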
- Add comprehensive error handling for 404 (no breach), 429 (rate limit), 503 (service unavailable)
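
One way to sketch the request queue from the rate-limit step: a serial queue that spaces task starts by a minimum interval. `RateLimitedQueue` and its injectable `now` and `sleep` parameters are hypothetical names, not part of the existing scan engine; the injection is only there so the spacing can be unit tested without real timers.

```typescript
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

class RateLimitedQueue {
  private readonly minIntervalMs: number;
  private readonly now: () => number;
  private readonly sleep: Sleep;
  private tail: Promise<void> = Promise.resolve();
  private lastStart = Number.NEGATIVE_INFINITY;

  constructor(
    requestsPerSecond: number,
    now: () => number = Date.now,
    sleep: Sleep = defaultSleep,
  ) {
    this.minIntervalMs = 1000 / requestsPerSecond;
    this.now = now;
    this.sleep = sleep;
  }

  // Runs tasks one at a time, starting each at least minIntervalMs
  // after the previous one started.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = async (): Promise<T> => {
      const wait = this.lastStart + this.minIntervalMs - this.now();
      if (wait > 0) await this.sleep(wait);
      this.lastStart = this.now();
      return task();
    };
    const result = this.tail.then(run);
    // Swallow failures on the chain so one rejected task does not block the rest.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}
```

A single shared instance, for example `new RateLimitedQueue(1)`, would wrap every outgoing HIBP call; the rate would come from configuration so it can be raised for a higher plan.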
tests:
- Unit: Mock HIBP API responses, verify parsing and severity scoring
- Integration: Test with real HIBP API using test email `test@example.com` (no breaches expected)
- E2E: Add email to watchlist → trigger scan → verify exposure records created for breached email
acceptance_criteria:
- `scanHIBP(email)` makes a real HTTP request to `https://haveibeenpwned.com/api/v3/breachedaccount/{email}`
- Breached emails create exposure records with correct breach metadata (name, date, data classes)
- Non-breached emails return empty results without creating false exposure records
- Rate limits are respected (default 1 req/sec, configurable to match the subscription plan)
- 404 responses are handled gracefully (no breach = no exposure, not an error)
- Circuit breaker opens after 3 consecutive failures and stays open for 60 seconds
- Exposure deduplication prevents duplicate records for same email + breach combination
- Alerts are generated for critical exposures (SSN, password) via existing pipeline
- HIBP breach metadata is cached in Redis and refreshed daily
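
The request and status-code criteria above can be sketched as follows. The endpoint URL, the `hibp-api-key` and `user-agent` headers, and the `truncateResponse` parameter come from the HIBP v3 API; `HttpGet`, `HttpResponse`, `RateLimitedError`, the `"DarkWatch"` user agent value, and the three-argument `checkEmail` signature are assumptions made so the sketch can run against a stub instead of the network.

```typescript
// Minimal response shape the sketch relies on (hypothetical type).
interface HttpResponse {
  status: number;
  headers: { get(name: string): string | null };
  json(): Promise<unknown>;
}

// Hypothetical injectable HTTP function; lets tests pass a stub.
type HttpGet = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<HttpResponse>;

class RateLimitedError extends Error {
  retryAfterSeconds: number;
  constructor(retryAfterSeconds: number) {
    super(`HIBP rate limit hit, retry after ${retryAfterSeconds}s`);
    this.name = "RateLimitedError";
    this.retryAfterSeconds = retryAfterSeconds;
  }
}

async function checkEmail(
  email: string,
  apiKey: string,
  httpGet: HttpGet,
): Promise<unknown[]> {
  const url =
    "https://haveibeenpwned.com/api/v3/breachedaccount/" +
    encodeURIComponent(email) +
    "?truncateResponse=false";
  const res = await httpGet(url, {
    headers: { "hibp-api-key": apiKey, "user-agent": "DarkWatch" },
  });

  if (res.status === 200) return (await res.json()) as unknown[];
  // 404 means the account is in no known breach: an empty result, not an error.
  if (res.status === 404) return [];
  if (res.status === 429) {
    // Fall back to 1 second if the header is missing (assumed default).
    throw new RateLimitedError(Number(res.headers.get("retry-after") ?? "1"));
  }
  // 503 and anything else is surfaced so the circuit breaker can count it.
  throw new Error(`HIBP request failed with status ${res.status}`);
}
```

The global `fetch` in recent Node versions should fit the `HttpGet` shape, but that is an assumption to verify in the project's runtime.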
validation:
- Run `vitest run darkwatch.test.ts`: all tests pass
- Manual: Add known breached email to watchlist, trigger scan, verify alert received
- Check Redis: `GET hibp:breaches` returns cached breach metadata
- Monitor logs: No `"not yet implemented"` or `console.log("[darkwatch] stub")` messages
notes:
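- HIBP quota: this spec assumes 1,500 requests/month is enough for development and budgets $3.50/mo for a production plan. Check both figures against HIBP's current pricing, since breachedaccount lookups require a subscription key and limits vary by plan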
- The k-anonymity password check sends only first 5 chars of SHA-1 hash — already privacy-safe
- The existing `scan.engine.ts` has the circuit breaker infrastructure: wire the HIBP client into it
- HIBP does NOT crawl the dark web; it only aggregates known public breaches. For live dark web monitoring, add Breachsense later (Phase 3)
- Consider subscribing to HIBP domain monitoring for enterprise upsell later
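
For the k-anonymity note above, a minimal sketch of the hashing and matching steps. Only the 5-character prefix would be sent to `https://api.pwnedpasswords.com/range/{prefix}`; the response is a list of `SUFFIX:COUNT` lines that is matched locally. `sha1Hex`, `splitHash`, and `countInRange` are hypothetical helper names, and the network call itself is left out.

```typescript
import { createHash } from "node:crypto";

// Uppercase hex SHA-1 of the password, the form used by the range API.
function sha1Hex(password: string): string {
  return createHash("sha1").update(password, "utf8").digest("hex").toUpperCase();
}

// Only `prefix` ever leaves the machine; `suffix` is compared locally.
function splitHash(hash: string): { prefix: string; suffix: string } {
  return { prefix: hash.slice(0, 5), suffix: hash.slice(5) };
}

// Scans a range response body ("SUFFIX:COUNT" per line) for our suffix.
// Returns the reported count, or 0 when the suffix is absent.
function countInRange(rangeBody: string, suffix: string): number {
  const wanted = suffix.toUpperCase();
  for (const line of rangeBody.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(":");
    if (candidate && candidate.toUpperCase() === wanted) return Number(count);
  }
  return 0;
}
```

Since `checkPassword(passwordHash)` in this spec takes a hash, the caller would pass the output of `sha1Hex` rather than the plaintext password.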