# 03. HaveIBeenPwned API Integration for Email Breach Monitoring

meta:
  id: core-services-03
  feature: core-services-implementation
  priority: P0
  depends_on: [core-services-01]
  tags: [darkwatch, hibp, breach-monitoring, api-integration, table-stakes]

objective:
- Replace the stub `scanHIBP()` function in the DarkWatch scan engine with a real HaveIBeenPwned API integration that checks user emails against known breach databases and creates exposure records.

deliverables:
- HIBP API client with k-anonymity support for password checking
- Email breach lookup with result parsing and normalization
- Exposure record creation in database with proper severity scoring
- Alert generation via existing alert pipeline
- Circuit breaker integration (already exists in scan engine)

steps:
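1. Sign up for an HIBP API key at https://haveibeenpwned.com/API/Key. Email search via `breachedaccount` requires a paid subscription; HIBP does not offer a free tier for it.
2. Add `HIBP_API_KEY` to `.env.example` and validate it in `env.ts`
3. Create `darkwatch/hibp.client.ts` with functions:
   - `checkEmail(email): Promise<BreachResult[]>`: query the `breachedaccount` endpoint, sending the key in the `hibp-api-key` header along with a `user-agent` header
   - `checkPassword(passwordHash): Promise<PwnedPasswordResult>`: query the Pwned Passwords range endpoint using k-anonymity (no API key needed)
   - `getBreaches(): Promise<Breach[]>`: fetch breach metadata from the `breaches` endpoint for caching
4. Parse the HIBP response: breach name, date, compromised data classes, affected account count. By default `breachedaccount` returns only breach names, so either pass `truncateResponse=false` or join the names against the cached metadata from `getBreaches()`.
5. Map HIBP data classes to the internal schema: email, password, phone, address, ssn, domain
6. Calculate severity: critical if SSN, credit card, or password (the acceptance criteria treat password as critical); warning if email or phone; info if username only
7. Deduplicate against existing exposures using `identifierHash` (already implemented)
8. Create exposure records via the existing `processExposure()` pipeline
9. Cache breach metadata in Redis (refresh daily) to reduce API calls
10. Handle rate limits with a request queue. HIBP enforces limits in requests per minute according to the subscription tier, so make the rate configurable and honor the `retry-after` header on 429 responses.
11. Add error handling for 404 (no breach found), 429 (rate limit exceeded), and 503 (service unavailable), plus 401 (missing or invalid API key) and 403 (missing `user-agent` header)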
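Steps 5 and 6 can be sketched as two pure functions, which keeps them easy to unit test. This is a minimal sketch under assumptions: the `DataType` union, the `DATA_CLASS_MAP` table, and both function names are hypothetical, the HIBP label strings should be checked against the HIBP `dataclasses` endpoint, and treating `password` as critical follows the acceptance criteria rather than a confirmed product decision.

```typescript
// Internal exposure types from step 5, extended with "credit_card" and
// "username" because step 6 scores on them. The exact union is an assumption.
type DataType =
  | "email" | "password" | "phone" | "address"
  | "ssn" | "domain" | "credit_card" | "username";

type Severity = "critical" | "warning" | "info";

// Hypothetical lookup from HIBP data class labels to internal types.
// Verify the label strings against the HIBP `dataclasses` endpoint.
const DATA_CLASS_MAP: Record<string, DataType> = {
  "Email addresses": "email",
  "Passwords": "password",
  "Phone numbers": "phone",
  "Physical addresses": "address",
  "Social security numbers": "ssn",
  "Credit cards": "credit_card",
  "Usernames": "username",
};

// Translate HIBP labels to internal types, dropping unknown labels and duplicates.
function mapDataClasses(hibpClasses: string[]): DataType[] {
  const mapped = new Set<DataType>();
  for (const label of hibpClasses) {
    const internal = DATA_CLASS_MAP[label];
    if (internal !== undefined) mapped.add(internal);
  }
  return Array.from(mapped);
}

const CRITICAL_TYPES = new Set<DataType>(["ssn", "credit_card", "password"]);
const WARNING_TYPES = new Set<DataType>(["email", "phone"]);

// The highest matching tier wins; anything else falls through to "info".
function scoreSeverity(types: DataType[]): Severity {
  if (types.some((t) => CRITICAL_TYPES.has(t))) return "critical";
  if (types.some((t) => WARNING_TYPES.has(t))) return "warning";
  return "info";
}
```

In this sketch, types the spec does not rank (address, domain) score as `info`, and unknown HIBP labels are silently dropped; logging them instead may be preferable so new data classes are noticed.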

tests:
- Unit: Mock HIBP API responses, verify parsing and severity scoring
- Integration: Call the real HIBP API with an address expected to have no breaches. The spec assumes `test@example.com` is clean; confirm that before relying on it, or use the dedicated test accounts listed in the HIBP API documentation.
- E2E: Add email to watchlist → trigger scan → verify exposure records created for breached email
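One way to make the unit tests straightforward is to keep response interpretation separate from the HTTP call, so mocked status codes and bodies can be fed in directly. The sketch below assumes that split. `LookupOutcome`, `interpretResponse`, `lookupEmail`, `HttpGet`, the `user-agent` value, and the 60-second fallback are all hypothetical names and values; only the endpoint URL, the `hibp-api-key` header, and the status codes come from the spec and the HIBP API.

```typescript
interface BreachResult { name: string; breachDate: string; dataClasses: string[]; pwnCount: number }

// Subset of one entry in an untruncated breachedaccount response.
interface HibpBreach { Name: string; BreachDate: string; DataClasses: string[]; PwnCount: number }

type LookupOutcome =
  | { kind: "ok"; breaches: BreachResult[] }
  | { kind: "retry"; afterSec: number }
  | { kind: "failure"; status: number };

// Minimal HTTP abstraction so tests can inject a fake transport (hypothetical).
interface HttpResponse {
  status: number;
  headers: { get(name: string): string | null };
  json(): Promise<unknown>;
}
type HttpGet = (url: string, headers: Record<string, string>) => Promise<HttpResponse>;

const DEFAULT_RETRY_SEC = 60; // assumed fallback when retry-after is absent or unparsable

// Pure function: turns a status code, body, and retry-after header into an outcome.
function interpretResponse(status: number, body: unknown, retryAfter: string | null): LookupOutcome {
  if (status === 404) return { kind: "ok", breaches: [] }; // no breach is not an error
  if (status === 200) {
    const entries = body as HibpBreach[];
    return {
      kind: "ok",
      breaches: entries.map((b) => ({
        name: b.Name,
        breachDate: b.BreachDate,
        dataClasses: b.DataClasses,
        pwnCount: b.PwnCount,
      })),
    };
  }
  if (status === 429) {
    const parsed = Number(retryAfter);
    return { kind: "retry", afterSec: Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_RETRY_SEC };
  }
  return { kind: "failure", status }; // 401, 403, 503, and anything unexpected
}

// Thin async wrapper: builds the request and delegates to interpretResponse.
async function lookupEmail(email: string, apiKey: string, get: HttpGet): Promise<LookupOutcome> {
  const url =
    "https://haveibeenpwned.com/api/v3/breachedaccount/" +
    encodeURIComponent(email) +
    "?truncateResponse=false";
  const res = await get(url, { "hibp-api-key": apiKey, "user-agent": "darkwatch-scan-engine" });
  const body = res.status === 200 ? await res.json() : null;
  return interpretResponse(res.status, body, res.headers.get("retry-after"));
}
```

With this shape, a unit test passes canned values to `interpretResponse`, and only `failure` outcomes would need to count toward the circuit breaker.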

acceptance_criteria:
- [ ] `scanHIBP(email)` makes a real HTTP request to `https://haveibeenpwned.com/api/v3/breachedaccount/{email}`
- [ ] Breached emails create exposure records with correct breach metadata (name, date, data classes)
- [ ] Non-breached emails return empty results without creating false exposure records
- [ ] Rate limits are respected (requests per minute configurable to match the subscription tier; `retry-after` honored on 429)
- [ ] 404 responses are handled gracefully (no breach = no exposure, not an error)
- [ ] Circuit breaker opens after 3 consecutive failures and stays open for 60 seconds
- [ ] Exposure deduplication prevents duplicate records for same email + breach combination
- [ ] Alerts are generated for critical exposures (SSN, password) via existing pipeline
- [ ] HIBP breach metadata is cached in Redis and refreshed daily
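The rate-limit criterion can be met with a small pacing helper in front of the HTTP call. The following is a sketch only: `RequestPacer` and its methods are hypothetical, and the caller passes in the clock so the logic can be tested without real delays. It spaces requests evenly for a configured requests-per-minute budget and pushes the schedule back when a 429 supplies a `retry-after` value.

```typescript
// Spaces requests evenly to stay within a requests-per-minute budget.
class RequestPacer {
  private nextAllowedAt = 0;
  private readonly minIntervalMs: number;

  constructor(requestsPerMinute: number) {
    this.minIntervalMs = Math.ceil(60_000 / requestsPerMinute);
  }

  // Reserve the next send slot and return how many ms the caller should wait.
  reserve(nowMs: number): number {
    const sendAt = Math.max(nowMs, this.nextAllowedAt);
    this.nextAllowedAt = sendAt + this.minIntervalMs;
    return sendAt - nowMs;
  }

  // After a 429, delay every future slot until the retry-after window has passed.
  backOff(nowMs: number, retryAfterSec: number): void {
    this.nextAllowedAt = Math.max(this.nextAllowedAt, nowMs + retryAfterSec * 1000);
  }
}
```

A caller would wait for `pacer.reserve(Date.now())` milliseconds before each request and call `pacer.backOff(Date.now(), seconds)` on a 429. Reading the requests-per-minute value from configuration keeps the same code usable across subscription tiers.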

validation:
- Run `vitest run darkwatch.test.ts`: all tests pass
- Manual: Add known breached email to watchlist, trigger scan, verify alert received
- Check Redis: `GET hibp:breaches` returns cached breach metadata
- Monitor logs: No `"not yet implemented"` or `console.log("[darkwatch] stub")` messages
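The Redis check above implies a read-through cache keyed at `hibp:breaches` with a one-day expiry. A minimal sketch of that pattern follows. The `KeyValueStore` interface and `getBreachesCached` are hypothetical; a real adapter would wrap whichever Redis client the project already uses and set the expiry with the Redis `SET key value EX seconds` form.

```typescript
// Subset of the breach metadata returned by the HIBP `breaches` endpoint.
interface Breach { Name: string; BreachDate: string; DataClasses: string[]; PwnCount: number }

// Hypothetical storage abstraction; an adapter over the Redis client implements it.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

const CACHE_KEY = "hibp:breaches";
const TTL_SECONDS = 24 * 60 * 60; // refresh daily

// Read-through cache: serve from the store when present, otherwise fetch and store.
async function getBreachesCached(
  store: KeyValueStore,
  fetchAll: () => Promise<Breach[]>,
): Promise<Breach[]> {
  const cached = await store.get(CACHE_KEY);
  if (cached !== null) return JSON.parse(cached) as Breach[];

  const fresh = await fetchAll();
  await store.set(CACHE_KEY, JSON.stringify(fresh), TTL_SECONDS);
  return fresh;
}
```

Letting the key expire is the simplest way to get a daily refresh. A scheduled job that overwrites the key would avoid the occasional slow request after expiry, at the cost of one more moving part.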

notes:
- Email lookups against `breachedaccount` need a paid HIBP API key; there is no free tier for that endpoint, in development or production. The $3.50/mo figure this plan was budgeted against may be out of date, so confirm current pricing and the per-minute rate limit of the chosen tier on the key page.
- The k-anonymity password check sends only the first 5 hex characters of the SHA-1 hash, so the full hash never leaves the service
- The existing `scan.engine.ts` has the circuit breaker infrastructure; wire the HIBP client into it
- HIBP does NOT crawl the dark web; it only aggregates known public breaches. For live dark web monitoring, add Breachsense later (Phase 3)
- Consider subscribing to HIBP domain monitoring for enterprise upsell later
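The k-anonymity note can be made concrete with a short sketch of the client-side half of the Pwned Passwords range lookup. The function names here are hypothetical. The flow itself (SHA-1, uppercase hex, a 5-character prefix sent to `https://api.pwnedpasswords.com/range/{prefix}`, and `SUFFIX:COUNT` lines in the response) follows the public Pwned Passwords API.

```typescript
import { createHash } from "node:crypto";

// Uppercase hex SHA-1 of the password, matching the format used in range responses.
function sha1Hex(password: string): string {
  return createHash("sha1").update(password, "utf8").digest("hex").toUpperCase();
}

// Only the prefix is sent to HIBP; the suffix is compared locally.
function splitForRange(hashHex: string): { prefix: string; suffix: string } {
  const upper = hashHex.toUpperCase();
  return { prefix: upper.slice(0, 5), suffix: upper.slice(5) };
}

function rangeUrl(prefix: string): string {
  return `https://api.pwnedpasswords.com/range/${prefix}`;
}

// Scan the response body (one "SUFFIX:COUNT" per line) for our suffix.
// Returns 0 when the suffix is absent, meaning the password was not found.
function countInRange(rangeBody: string, suffix: string): number {
  for (const line of rangeBody.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(":");
    if (candidate === suffix) return Number.parseInt(count ?? "0", 10);
  }
  return 0;
}
```

A `checkPassword(passwordHash)` implementation would call `splitForRange`, fetch `rangeUrl(prefix)` as plain text, and pass the body to `countInRange`. No API key is needed for this endpoint.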