3.6 KiB
Security Review: FRE-4806 — Datadog APM + Sentry Error Tracking Integration
Reviewer: Security Reviewer Date: 2026-05-12 Scope: Runtime security — env var secrets, network egress, API key handling, data exposure Files Reviewed:
packages/monitoring/src/config.tspackages/monitoring/src/sentry.tspackages/monitoring/src/datadog.tspackages/monitoring/src/datadog-init.tspackages/monitoring/src/datadog-logs.tspackages/monitoring/src/cloudwatch.tspackages/api/src/middleware/auth.middleware.tspackages/api/src/middleware/monitoring.middleware.tspackages/api/src/middleware/error-handling.middleware.tspackages/api/src/server.tspackages/api/src/index.ts
P1 Findings (Resolved)
P1 #1 — API key leaked to Sentry as user ID ✅ FIXED
File: auth.middleware.ts:49-52
Finding: Raw API key stored in user.id, passed to setSentryUser() on 5xx errors.
Fix: Key truncated to api-{first-8-chars}... before assignment. Verified error-handling.middleware.ts:29-30 reads from user.id (truncated), not authReq.apiKey (raw).
P1 #2 — DD_API_KEY missing from Zod schema ✅ FIXED
File: config.ts:10-11
Finding: DD_API_KEY consumed in datadog-logs.ts but not validated in schema.
Fix: Added DD_API_KEY: z.string().default('') and DD_SITE: z.string().default('datadoghq.com') to schema and parse call.
P2 Findings (Hardening Recommendations — non-blocking)
P2 #1 — No circuit breaker on Datadog log forwarding
File: datadog-logs.ts:17-44
Risk: Every forwardLog() call does async fetch() with no timeout. Slow intake API holds open connections.
Recommendation: Add AbortSignal.timeout(5000) to fetch; consider simple circuit breaker.
P2 #2 — dd-trace sample rate defaults to 100%
File: config.ts:8
Risk: DD_TRACE_SAMPLE_RATE defaults to 1.0. High production traffic = full span volume to Datadog.
Recommendation: Default to 0.1; override via env for development.
P2 #3 — CloudWatch rate limit not enforced
File: cloudwatch.ts:46-56, monitoring.middleware.ts:46
Risk: CloudWatch allows 5 TPS per metric/region. emitError on 5xx adds second call per request.
Recommendation: Add in-memory batching or token bucket rate limiter.
P2 #4 — Sentry beforeSend: pathname exposes resource IDs
File: sentry.ts:28-33
Risk: Query strings stripped, but path segments like /api/v1/users/42/orders expose resource IDs.
Recommendation: Regex-based path masking for sensitive routes.
P3 Findings (Low Priority — non-blocking)
P3 #1 — Error response leaks internal error name/message to client
File: error-handling.middleware.ts:18-25
Risk: err.name and err.message returned directly in JSON response.
Recommendation: Generic messages for 5xx; keep details in logs only.
P3 #2 — AWS credential chain not explicit
File: cloudwatch.ts:10
Risk: CloudWatchClient uses default credential chain; may pick up ~/.aws/credentials locally.
Recommendation: Document expected credential source per environment.
P3 #3 — Sentry DSN empty default fails open in production
File: config.ts:14
Risk: Empty SENTRY_DSN silently disables error tracking in production.
Recommendation: Startup health check warning when DD_ENV === "production" and DSN is empty.
Verdict: ✅ SECURITY PASS
Both P1 findings remediated and verified. The 4 P2 and 3 P3 findings are hardening recommendations suitable for follow-up child issues if the team desires. No blocking security vulnerabilities remain.
Disposition: Issue approved for merge.