Security Review: FRE-4806 — Datadog APM + Sentry Error Tracking Integration

Reviewer: Security Reviewer Date: 2026-05-12 Scope: Runtime security — env var secrets, network egress, API key handling, data exposure Files Reviewed:

packages/monitoring/src/config.ts
packages/monitoring/src/sentry.ts
packages/monitoring/src/datadog.ts
packages/monitoring/src/datadog-init.ts
packages/monitoring/src/datadog-logs.ts
packages/monitoring/src/cloudwatch.ts
packages/api/src/middleware/auth.middleware.ts
packages/api/src/middleware/monitoring.middleware.ts
packages/api/src/middleware/error-handling.middleware.ts
packages/api/src/server.ts
packages/api/src/index.ts

P1 Findings (Resolved)

P1 #1 — API key leaked to Sentry as user ID ✅ FIXED

File: auth.middleware.ts:49-52 Finding: Raw API key stored in user.id, passed to setSentryUser() on 5xx errors. Fix: Key truncated to api-{first-8-chars}... before assignment. Verified error-handling.middleware.ts:29-30 reads from user.id (truncated), not authReq.apiKey (raw).

P1 #2 — DD_API_KEY missing from Zod schema ✅ FIXED

File: config.ts:10-11 Finding: DD_API_KEY consumed in datadog-logs.ts but not validated in schema. Fix: Added DD_API_KEY: z.string().default('') and DD_SITE: z.string().default('datadoghq.com') to schema and parse call.

P2 Findings (Hardening Recommendations — non-blocking)

P2 #1 — No circuit breaker on Datadog log forwarding

File: datadog-logs.ts:17-44 Risk: Every forwardLog() call does async fetch() with no timeout. Slow intake API holds open connections. Recommendation: Add AbortSignal.timeout(5000) to fetch; consider simple circuit breaker.

P2 #2 — dd-trace sample rate defaults to 100%

File: config.ts:8 Risk: DD_TRACE_SAMPLE_RATE defaults to 1.0. High production traffic = full span volume to Datadog. Recommendation: Default to 0.1; override via env for development.

P2 #3 — CloudWatch rate limit not enforced

File: cloudwatch.ts:46-56, monitoring.middleware.ts:46 Risk: CloudWatch allows 5 TPS per metric/region. emitError on 5xx adds second call per request. Recommendation: Add in-memory batching or token bucket rate limiter.

P2 #4 — Sentry beforeSend: pathname exposes resource IDs

File: sentry.ts:28-33 Risk: Query strings stripped, but path segments like /api/v1/users/42/orders expose resource IDs. Recommendation: Regex-based path masking for sensitive routes.

P3 Findings (Low Priority — non-blocking)

P3 #1 — Error response leaks internal error name/message to client

File: error-handling.middleware.ts:18-25 Risk: err.name and err.message returned directly in JSON response. Recommendation: Generic messages for 5xx; keep details in logs only.

P3 #2 — AWS credential chain not explicit

File: cloudwatch.ts:10 Risk: CloudWatchClient uses default credential chain; may pick up ~/.aws/credentials locally. Recommendation: Document expected credential source per environment.

P3 #3 — Sentry DSN empty default fails open in production

File: config.ts:14 Risk: Empty SENTRY_DSN silently disables error tracking in production. Recommendation: Startup health check warning when DD_ENV === "production" and DSN is empty.

Verdict: ✅ SECURITY PASS

Both P1 findings remediated and verified. The 4 P2 and 3 P3 findings are hardening recommendations suitable for follow-up child issues if the team desires. No blocking security vulnerabilities remain.

Disposition: Issue approved for merge.

3.6 KiB Raw Blame History