# 20. Backend Router: Alert Correlation & Normalization Engine

meta:
  id: kordant-unified-restructure-20
  feature: kordant-unified-restructure
  priority: P1
  depends_on: [kordant-unified-restructure-15, kordant-unified-restructure-16, kordant-unified-restructure-17, kordant-unified-restructure-18, kordant-unified-restructure-19]
  tags: [backend, trpc, correlation, alerts, api]

objective:
  - Build the tRPC router and service layer for the cross-service alert correlation and normalization engine. Port logic from `packages/correlation/` into a single `correlation` router that aggregates alerts from all services into a unified threat view.

deliverables:
  - `web/src/server/api/routers/correlation.ts` (correlation router):
    - `correlation.getAlerts`: `protectedProcedure` returning normalized alerts for the user
    - `correlation.getAlertDetails`: `protectedProcedure` returning a single alert with correlated data
    - `correlation.getGroups`: `protectedProcedure` returning correlation groups
    - `correlation.getGroupDetails`: `protectedProcedure` returning a group with all member alerts
    - `correlation.resolveAlert`: `protectedProcedure` marking an alert as resolved or false positive
    - `correlation.getStats`: `protectedProcedure` returning alert statistics
  - `web/src/server/services/correlation.service.ts` (core business logic):
    - `normalizeAlert(source, sourceAlertId, category, severity, userId, title, description, entities)`: create a NormalizedAlert
    - `correlateAlerts(userId)`: group related alerts by shared entities (email, phone, SSN)
    - `getAlertTimeline(userId, filters?)`: chronological view of all alerts
    - `resolveAlert(alertId, resolution)`: mark as resolved or false positive
    - `getThreatScore(userId)`: calculate overall threat score based on alert severity and frequency
  - `web/src/server/services/correlation/engine.ts` (correlation engine):
    - `findRelatedAlerts(alert)`: find alerts sharing entities with the given alert
    - `createCorrelationGroup(alerts)`: group related alerts into a CorrelationGroup
    - `updateGroupSeverity(group)`: recalculate the highest severity for the group
    - `deduplicateAlerts(alerts)`: remove duplicate alerts based on sourceAlertId
  - `web/src/server/services/correlation/normalizer.ts` (alert normalization):
    - `normalizeDarkWatchAlert(exposure)`: convert an exposure to a NormalizedAlert
    - `normalizeSpamShieldAlert(detection)`: convert a spam detection to a NormalizedAlert
    - `normalizeVoicePrintAlert(analysis)`: convert a voice analysis to a NormalizedAlert
    - `normalizeHomeTitleAlert(change)`: convert a property change to a NormalizedAlert
    - `normalizeRemoveBrokersAlert(listing)`: convert a broker listing to a NormalizedAlert

steps:
  1. Create `web/src/server/api/routers/correlation.ts`.
  2. Define Zod schemas:
     - `alertFilterSchema`: `source: z.enum([...]).optional()`, `severity: z.enum([...]).optional()`, `status: z.enum([...]).optional()`, `page`, `limit`
     - `resolveSchema`: `alertId: z.string().uuid()`, `resolution: z.enum(['RESOLVED', 'FALSE_POSITIVE'])`
  3. Implement router procedures:
     - Alert listing with filtering and pagination
     - Group listing and details
     - Alert resolution with audit logging
     - Stats aggregation
  4. Create `web/src/server/services/correlation.service.ts`:
     - Port from `packages/correlation/src/`
     - Implement normalization pipeline
     - Implement correlation grouping
  5. Create correlation engine:
     - `findRelatedAlerts`: query alerts by shared entities (email, phone, SSN) within the time window
     - `createCorrelationGroup`: create the group record and link its alerts
     - `updateGroupSeverity`: aggregate the severity of all alerts in the group
     - `deduplicateAlerts`: ensure no duplicate sourceAlertId in the normalized table
  6. Create normalizer module:
     - One function per service domain that converts a domain-specific alert to a NormalizedAlert
     - Map domain severity to the NormalizedAlertSeverity enum
     - Extract entities (emails, phones, SSNs) from the payload for correlation
  7. Implement threat scoring:
     - Formula: weighted sum of alert severities over a 30-day window
     - Decay older alerts
     - Return a score from 0 to 100
  8. Integrate with other services:
     - Call `correlationService.normalizeAlert()` from each service's alert pipeline
     - Sources: DarkWatch (task 15), VoicePrint (task 16), SpamShield (task 17), HomeTitle (task 18), RemoveBrokers (task 19)
  9. Wire router into `web/src/server/api/root.ts`.
  10. Write unit tests for engine functions.

testing:
  - Unit: `normalizeAlert` creates the correct NormalizedAlert for each source type
  - Unit: `findRelatedAlerts` groups alerts sharing an email address
  - Unit: `createCorrelationGroup` creates a group with the correct highest severity
  - Unit: `deduplicateAlerts` prevents duplicate sourceAlertId
  - Unit: `getThreatScore` returns a higher score for more severe or more recent alerts
  - Integration: tRPC `getAlerts` returns normalized alerts for the authenticated user

acceptance_criteria:
  - [ ] Alerts from all 5 services are normalized into a unified schema
  - [ ] Related alerts are grouped by shared entities (email, phone, SSN)
  - [ ] Correlation groups update their severity when new alerts are added
  - [ ] Users can resolve alerts or mark them as false positives
  - [ ] Alert timeline provides a chronological view across all services
  - [ ] Threat score accurately reflects the user's current risk level
  - [ ] Deduplication prevents duplicate alerts from the same source

validation:
  - Create normalized alerts from different services with a shared email
  - Verify the correlation engine groups them into a single group
  - Resolve an alert and verify the status is updated in the DB
  - Calculate the threat score for a user with mixed alert severities
  - Run `cd web && pnpm test` for correlation unit tests

notes:
  - Reference legacy: `packages/correlation/src/`, `packages/api/src/routes/correlation.routes.ts`
  - The correlation engine is the "brain" that makes Kordant feel unified. Invest time in getting the entity extraction and grouping logic right.
  - Entity extraction should use regex patterns for emails, phones, and SSNs. Consider using a library like `compromise` for NLP extraction if payloads are unstructured.
  - Time window for correlation: alerts within 30 days sharing an entity should be grouped. Adjust based on testing.
  - The threat score algorithm should be transparent to users. Consider showing the breakdown (e.g., "+20 from DarkWatch exposure, +15 from SpamShield detection").
  - False positive tracking is important for ML model improvement. Log all false positive marks with context.
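
example (entity extraction and grouping):

The regex-based entity extraction from the notes and the shared-entity grouping from step 5 might be sketched as follows. This is a minimal illustration, not the legacy implementation: `AlertLike`, `extractEntities`, `groupBySharedEntities`, and the three patterns are hypothetical names and assumptions, the patterns only cover US-style phone numbers and dashed SSNs, and the 30-day time window is left out for brevity.

```typescript
// Hypothetical minimal alert shape for illustration; the real NormalizedAlert
// type is defined elsewhere and is not reproduced here.
interface AlertLike {
  id: string;
  text: string; // free-form payload text to scan for entities
}

// Illustrative patterns only (assumptions): simple emails, US-style phones, dashed SSNs.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_RE = /(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b/g;
const SSN_RE = /\b\d{3}-\d{2}-\d{4}\b/g;

// Returns typed, normalized entity keys such as "email:a@b.com".
// A production system would likely hash SSNs instead of keeping them in clear text.
function extractEntities(text: string): string[] {
  const emails = (text.match(EMAIL_RE) ?? []).map((m) => `email:${m.toLowerCase()}`);
  const phones = (text.match(PHONE_RE) ?? []).map(
    (m) => `phone:${m.replace(/\D/g, "").slice(-10)}`, // keep the last 10 digits
  );
  const ssns = (text.match(SSN_RE) ?? []).map((m) => `ssn:${m}`);
  return Array.from(new Set([...emails, ...phones, ...ssns]));
}

// Groups alert ids that share at least one entity, transitively, using union-find.
function groupBySharedEntities(alerts: AlertLike[]): string[][] {
  const parent = new Map<string, string>();
  const find = (x: string): string => {
    let root = x;
    while (true) {
      const p = parent.get(root);
      if (p === undefined || p === root) return root;
      root = p;
    }
  };
  const union = (a: string, b: string): void => {
    const ra = find(a);
    const rb = find(b);
    if (ra !== rb) parent.set(ra, rb);
  };

  const firstSeen = new Map<string, string>(); // entity key -> first alert id carrying it
  for (const alert of alerts) {
    parent.set(alert.id, alert.id);
    for (const entity of extractEntities(alert.text)) {
      const owner = firstSeen.get(entity);
      if (owner === undefined) firstSeen.set(entity, alert.id);
      else union(alert.id, owner);
    }
  }

  const groups = new Map<string, string[]>();
  for (const alert of alerts) {
    const root = find(alert.id);
    const members = groups.get(root) ?? [];
    members.push(alert.id);
    groups.set(root, members);
  }
  return Array.from(groups.values());
}
```

Union-find makes the grouping transitive: if alert A shares an email with B and B shares a phone number with C, all three land in one group even though A and C have nothing in common directly. Whether that is the desired behavior for CorrelationGroup is a design decision to confirm against the legacy code.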
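
example (deduplication and group severity):

The `deduplicateAlerts` and `updateGroupSeverity` behavior from step 5 could look roughly like the sketch below. It operates on in-memory arrays rather than the normalized table. The `SourceAlert` shape, the `AlertSeverity` values and their ordering, and the choice to key duplicates on source plus sourceAlertId (on the assumption that ids are only unique within one source) are all illustrative and not taken from the legacy package.

```typescript
// Hypothetical severity levels; the real NormalizedAlertSeverity enum may differ.
type AlertSeverity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

// Hypothetical subset of NormalizedAlert fields needed for this sketch.
interface SourceAlert {
  source: string;
  sourceAlertId: string;
  severity: AlertSeverity;
}

// Assumed ordering used to compare severities.
const SEVERITY_RANK: Record<AlertSeverity, number> = {
  LOW: 0,
  MEDIUM: 1,
  HIGH: 2,
  CRITICAL: 3,
};

// Keeps the first alert seen for each (source, sourceAlertId) pair.
function dedupeBySourceId<T extends SourceAlert>(alerts: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const alert of alerts) {
    const key = `${alert.source}:${alert.sourceAlertId}`;
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(alert);
  }
  return unique;
}

// Returns the highest severity among the alerts, or null for an empty group.
function highestSeverity(alerts: SourceAlert[]): AlertSeverity | null {
  let best: AlertSeverity | null = null;
  for (const alert of alerts) {
    if (best === null || SEVERITY_RANK[alert.severity] > SEVERITY_RANK[best]) {
      best = alert.severity;
    }
  }
  return best;
}
```

In the database-backed version, a unique constraint on the same pair of columns would give the same guarantee at write time; that is a suggestion, not something the legacy schema is known to have.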
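
example (threat scoring):

Step 7 describes the threat score as a weighted sum of severities over a 30-day window, with decay for older alerts and a result between 0 and 100. One possible reading is sketched below. The severity weights, the linear decay, and the names `ScoredAlert` and `computeThreatScore` are assumptions chosen for illustration; the spec does not fix any of them.

```typescript
// Hypothetical severity levels and weights; tune these against real data.
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface ScoredAlert {
  severity: Severity;
  createdAt: Date;
}

const WEIGHTS: Record<Severity, number> = {
  LOW: 5,
  MEDIUM: 15,
  HIGH: 30,
  CRITICAL: 50,
};

const WINDOW_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Sums severity weights with linear decay: an alert created now counts fully,
// an alert at the edge of the window counts as zero, and anything older
// (or dated in the future) is ignored. The result is rounded and capped at 100.
function computeThreatScore(alerts: ScoredAlert[], now: Date = new Date()): number {
  let total = 0;
  for (const alert of alerts) {
    const ageDays = (now.getTime() - alert.createdAt.getTime()) / MS_PER_DAY;
    if (ageDays < 0 || ageDays > WINDOW_DAYS) continue;
    total += WEIGHTS[alert.severity] * (1 - ageDays / WINDOW_DAYS);
  }
  return Math.min(100, Math.round(total));
}
```

Because each alert contributes an independent term, the per-source breakdown suggested in the notes could be produced by accumulating the same terms into a map keyed by source instead of a single total. Exponential decay would be an equally valid choice if recent alerts should dominate more strongly.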