Kordant/tasks/security-fixes/10-fix-voiceprint-resource-exhaustion.md

# 10. Fix VoicePrint resource exhaustion via unbounded audio upload

meta:
  id: security-fixes-10
  feature: security-fixes
  priority: P1
  depends_on: []
  tags: [implementation, tests-required, medium-severity]

objective:
- Prevent memory exhaustion by enforcing a maximum payload size on VoicePrint audio endpoints

deliverables:
- `maxLength` constraint on `AnalyzeAudioSchema` in `web/src/server/api/schemas/voiceprint.ts`
- Request body size limit middleware for audio endpoints
- Size validation in `voiceprint.service.ts` before base64 decoding
- Unit tests for size limits

steps:
1. Examine `AnalyzeAudioSchema` at `web/src/server/api/schemas/voiceprint.ts:8-10` and `analyzeAudio()` at `web/src/server/services/voiceprint.service.ts:135-140`
2. Add `maxLength` to the audio schema:
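   - Calculate a reasonable limit: assuming 16-bit PCM, a 30-second mono 16kHz WAV is ~0.96MB raw (~1.28MB base64) and a 60-second one is ~1.92MB raw (~2.56MB base64)
   - Set `maxLength` to ~2MB base64 (~1.5MB raw, roughly 46 seconds at that format) as a safe default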
   - Consider making it configurable via an environment variable
3. Add a request body size limit in the tRPC middleware or at the HTTP layer:
   - Reject requests with body size > configured limit before processing
   - Return a clear error message to the client
4. Add a pre-decode size check in `analyzeAudio()`:
   - Calculate the decoded size from the base64 string length (`base64Length * 0.75`, minus one byte per trailing `=` padding character)
   - Reject if the decoded size exceeds the configured memory limit
5. Update `protectedProcedure` rate limit for voiceprint endpoints if not already covered by task 04
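
As a rough illustration of step 4, the sketch below computes the decoded size from the string length alone and rejects oversized input before any buffer is allocated. All names (`resolveMaxDecodedBytes`, `decodedByteLength`, `assertAudioWithinLimit`), the default value, and the idea of an environment variable such as `VOICEPRINT_MAX_AUDIO_BYTES` are assumptions made for illustration, not existing code in `voiceprint.service.ts`.

```typescript
// Sketch only: names and the default limit are illustrative assumptions.

// ~1.5 MB decoded, matching the ~2 MB base64 default suggested in step 2.
const DEFAULT_MAX_DECODED_BYTES = 1_500_000;

// Parse a limit from an env-style string (e.g. a hypothetical
// VOICEPRINT_MAX_AUDIO_BYTES); fall back to the default on bad input.
function resolveMaxDecodedBytes(raw: string | undefined): number {
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_DECODED_BYTES;
}

// Decoded size of a base64 string, computed from its length alone.
// Every 4 characters carry 3 bytes; each trailing "=" removes one byte.
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return Math.max(0, Math.floor((base64.length * 3) / 4) - padding);
}

// Throws before any decoding happens, so no large buffer is allocated.
function assertAudioWithinLimit(base64: string, maxDecodedBytes: number): void {
  const size = decodedByteLength(base64);
  if (size > maxDecodedBytes) {
    throw new RangeError(
      `Audio payload decodes to ${size} bytes; the limit is ${maxDecodedBytes} bytes`,
    );
  }
}
```

In `analyzeAudio()`, a call like `assertAudioWithinLimit(audio, resolveMaxDecodedBytes(process.env.VOICEPRINT_MAX_AUDIO_BYTES))` would sit immediately before the decode call the service currently uses.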

tests:
- Unit: `AnalyzeAudioSchema` rejects payloads exceeding `maxLength`
- Unit: `analyzeAudio()` rejects base64 strings that would decode to > configured memory limit
- Unit: Valid audio payloads within the limit are accepted
- Integration: Sending a 100MB base64 payload to the audio endpoint is rejected with a size error
- Integration: Sending a valid 30-second audio recording succeeds
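
For the schema-level unit tests above, fixtures at the boundary can be generated rather than stored. The helper below is a hypothetical sketch: it builds syntactically valid base64 strings of an exact length out of `A` characters (which decode to zero bytes), so it suits the `maxLength` tests but not the integration test that needs a real recording.

```typescript
// Sketch only: hypothetical test helpers, not part of the existing suite.

// A padded base64 string always has a length that is a multiple of 4.
// "AAAA" decodes to three zero bytes, so repeating "A" stays valid base64.
function base64OfLength(length: number): string {
  if (length < 0 || length % 4 !== 0) {
    throw new RangeError("Padded base64 length must be a non-negative multiple of 4");
  }
  return "A".repeat(length);
}

// Builds one payload at the largest valid length <= maxLength and one
// payload that is the next valid length above it.
function boundaryFixtures(maxLength: number): { withinLimit: string; overLimit: string } {
  const atLimit = maxLength - (maxLength % 4);
  return {
    withinLimit: base64OfLength(atLimit),
    overLimit: base64OfLength(atLimit + 4),
  };
}
```

A unit test could then assert that the schema accepts `withinLimit` and rejects `overLimit` for whatever `maxLength` is configured.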

acceptance_criteria:
- Audio schema enforces `maxLength` on the base64 payload
- Request body size limit middleware rejects oversized requests before processing
- Pre-decode size check rejects any payload whose decoded size would exceed the configured memory limit, as defense in depth for callers that reach the service without schema validation
- Clear error messages are returned when size limits are exceeded
- Valid audio recordings within the size limit are processed normally
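
One way the body-size criterion might be met at the HTTP layer is a check on the declared `Content-Length` before the body is read. The function below is a framework-agnostic sketch; `checkContentLength`, the `BodyLimitResult` type, and the choice of 411 for a missing header are assumptions, and the actual tRPC adapter or server may expose its own body-limit option that should be preferred.

```typescript
// Sketch only: a hypothetical pre-read check, independent of any framework.

type BodyLimitResult =
  | { ok: true }
  | { ok: false; status: 411 | 413; message: string };

// Rejects on the declared size so the body never has to be buffered.
// Content-Length can be absent or wrong, so the server should still cap
// the number of bytes it actually reads.
function checkContentLength(header: string | null, maxBodyBytes: number): BodyLimitResult {
  if (header === null || !/^\d+$/.test(header)) {
    return {
      ok: false,
      status: 411, // Length Required
      message: "A valid Content-Length header is required",
    };
  }
  const declared = Number(header);
  if (declared > maxBodyBytes) {
    return {
      ok: false,
      status: 413, // Content Too Large
      message: `Request body of ${declared} bytes exceeds the ${maxBodyBytes}-byte limit`,
    };
  }
  return { ok: true };
}
```

The body limit would normally be set slightly above the schema `maxLength` to leave room for the JSON envelope around the base64 string.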

validation:
- `cd web && bun test` — all tests pass
- Send a base64 payload exceeding `maxLength` and verify it is rejected
- Send a valid audio recording and verify it is processed correctly
- Verify the rate limit for voiceprint endpoints is appropriate (task 04)

notes:
- Finding p8-010: A 100MB base64 payload consumes 300MB+ memory per request
- The `protectedProcedure` rate limit (100/min) is insufficient on its own: 100 requests/min with 100MB payloads is 10GB/min of request data, or 30GB+/min of allocations at 300MB+ per request
- Consider streaming or chunked upload for large audio files instead of base64 in the request body
- The `maxLength` should account for realistic use cases: voice biometrics typically need 3-30 seconds of audio
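
The sizing arithmetic behind these notes can be checked with a few lines. This is a sketch under stated assumptions: uncompressed 16-bit PCM (the bit depth is not given in the finding), the 44-byte WAV header ignored, and decimal megabytes; the function names are illustrative.

```typescript
// Sketch only: helper names are illustrative; 16-bit PCM is an assumption.

// Raw PCM size: one sample per channel per tick, bytesPerSample bytes each.
function pcmBytes(
  seconds: number,
  sampleRateHz: number,
  channels: number,
  bytesPerSample: number,
): number {
  return seconds * sampleRateHz * channels * bytesPerSample;
}

// Padded base64 emits 4 characters for every 3 input bytes, rounded up.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Longest clip (in seconds) whose base64 form fits within maxLength.
function maxSecondsFor(maxLength: number, bytesPerSecond: number): number {
  const rawBudget = Math.floor(maxLength / 4) * 3;
  return rawBudget / bytesPerSecond;
}

const bytesPerSecond = pcmBytes(1, 16_000, 1, 2); // 32,000 bytes/s
const thirtySeconds = pcmBytes(30, 16_000, 1, 2); // 960,000 bytes
const sixtySeconds = pcmBytes(60, 16_000, 1, 2); // 1,920,000 bytes

console.log(base64Length(thirtySeconds)); // 1280000
console.log(base64Length(sixtySeconds)); // 2560000
console.log(maxSecondsFor(2_000_000, bytesPerSecond)); // 46.875

// Worst case at the current rate limit: 100 requests/min at 300 MB each.
const memoryPerMinuteBytes = 100 * 300_000_000; // 30 GB
console.log(memoryPerMinuteBytes / 1_000_000_000); // 30
```

Under these assumptions a ~2MB base64 limit comfortably covers the 3-30 second range while excluding a full 60-second clip.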