Kordant/piolium/findings/p8-010-voiceprint-resource-exhaustion/report.md

Phase: 8
Sequence: 010
Slug: voiceprint-resource-exhaustion
Verdict: VALID
Rationale: VoicePrint audio endpoints accept unbounded base64 payloads with no maximum length; 100/min rate limit allows rapid large uploads that can exhaust server memory and disk
Severity-Original: medium
Severity: medium
PoC-Status: pending
Pre-FP-Flag: none
Debate: piolium/attack-surface/balanced-chamber-summary.md

## Summary
The `voiceprintRouter.analyzeAudio` and `voiceprintRouter.createEnrollment` procedures accept `audioBase64` with only a `minLength(1)` validation. There is no maximum length, no content-type validation, and no size check before decoding. An authenticated attacker can send extremely large base64-encoded payloads that, when decoded, consume significant server memory during base64 decoding, ML preprocessing, and ML inference. The procedures use `protectedProcedure` (100/min default rate limit), providing weak protection against sustained attacks.

## Location
- `web/src/server/api/schemas/voiceprint.ts` lines 8–10 (schemas)
- `web/src/server/services/voiceprint.service.ts` lines 135–140 (service)
- `web/src/server/api/utils.ts` lines 23–28 (protectedProcedure)

## Attacker Control
An authenticated user fully controls the length and content of `audioBase64`. A 100MB base64 payload (representing ~75MB of audio data) is estimated to consume ~300MB+ memory per request (base64 string + decoded buffer + ML features + model inference), in addition to a ~75MB write to disk.
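The size relationship behind these figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical and not part of the codebase, and the resident-memory figure assumes the runtime stores the base64 string at one byte per character. It covers only the string and the decoded buffer; the ML features and inference share of the ~300MB estimate is not modelled.

```typescript
const MB = 1_000_000;

// Base64 maps every 3 input bytes to 4 output characters (with "=" padding).
function encodedLength(rawBytes: number): number {
  return Math.ceil(rawBytes / 3) * 4;
}

// Decoded size of a base64 string, computed from its length alone,
// without allocating the decoded buffer.
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return Math.floor((base64.length * 3) / 4) - padding;
}

// Lower bound on memory held once the payload is decoded: the base64 string
// plus the decoded buffer. ASSUMPTION: one byte per character for the string.
function minResidentBytes(base64Chars: number): number {
  const decoded = Math.floor((base64Chars * 3) / 4);
  return base64Chars + decoded;
}

console.log(encodedLength(75 * MB));          // 100000000 (75MB of audio -> 100MB of base64)
console.log(minResidentBytes(100 * MB) / MB); // 175 (MB held before any ML work starts)
```

Under these assumptions, roughly 175MB is held per request before preprocessing begins, which is consistent with the ~300MB+ estimate once ML features and inference buffers are added.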

## Trust Boundary Crossed
Resource boundary. Unbounded input exceeds expected resource allocation, affecting all users on the same server.

## Impact
- **Memory exhaustion**: A single request can consume 300MB+; 100 rapid requests, all within the rate limit, can exhaust server memory (OOM kill)
- **Disk exhaustion**: Each request writes a ~75MB audio file to disk; rapid uploads fill disk
- **ML model resource exhaustion**: ML preprocessing and inference are CPU-intensive; large inputs increase processing time
- **Service disruption**: Memory exhaustion affects all users on the same server

## Evidence
```typescript
// Schema — no maximum length
export const AnalyzeAudioSchema = object({
  audioBase64: string([minLength(1)]),  // No maxLength
});

// Service — no size check before decoding
export async function analyzeAudio(userId: string, audioBase64: string) {
  const audioBuffer = Buffer.from(audioBase64, "base64");  // No size check
  // ...
  const features = await preprocessAudio(audioBuffer);     // ML preprocessing
  const detection = await detectSynthetic(features);        // ML inference
}

// Rate limit — 100/min for authenticated users
const rateLimitTiers = {
  authenticated: { limit: 100, windowMs: 60_000 },
};
```
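For contrast, the missing check could look roughly like the sketch below, which rejects an oversized string from its length alone, before `Buffer.from` allocates anything. This is a sketch under stated assumptions, not the project's code: `MAX_AUDIO_BYTES`, `maxBase64Length`, and `assertAudioWithinLimit` are hypothetical names, and the 10 MiB limit is a placeholder because the source does not specify an acceptable audio size. The same bound could presumably also be expressed in the schema by adding valibot's `maxLength` next to `minLength(1)`.

```typescript
// HYPOTHETICAL limit: the report does not state a maximum audio size.
const MAX_AUDIO_BYTES = 10 * 1024 * 1024;

// Longest base64 string that can decode to at most maxBytes
// (4 output characters for every 3 input bytes, rounded up for padding).
function maxBase64Length(maxBytes: number): number {
  return Math.ceil(maxBytes / 3) * 4;
}

// Reject oversized input using only the string length, so no decoded
// buffer is allocated for a payload that will be refused anyway.
function assertAudioWithinLimit(
  audioBase64: string,
  maxBytes: number = MAX_AUDIO_BYTES,
): void {
  const limit = maxBase64Length(maxBytes);
  if (audioBase64.length > limit) {
    throw new RangeError(
      `audioBase64 is ${audioBase64.length} chars; limit is ${limit}`,
    );
  }
}

// Usage: call before decoding. "TWFu" decodes to 3 bytes, so it passes a 3-byte limit.
assertAudioWithinLimit("TWFu", 3);
```

A length check in the service only bounds what is decoded. The full request body has already been read into memory by that point, so a body-size limit at the HTTP layer would still be needed to cover that part of the exposure.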

## Reproduction Steps
1. Authenticated user sends `voiceprintRouter.analyzeAudio` with 100MB base64 payload
2. Server decodes base64 → 75MB buffer
3. ML preprocessing and inference consume additional memory
4. Audio file written to disk (~75MB)
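5. Repeat 100 times in 1 minute (within the rate limit) → up to ~30GB+ memory usage if the requests are processed concurrently, plus ~7.5GB written to disk → OOM kill or service disruption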
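The aggregate figures in the final step follow from simple multiplication, sketched below. The function and field names are hypothetical. The ~300MB per-request figure is this report's estimate rather than a measurement, and the peak-memory result holds only if all requests are in flight at the same time.

```typescript
interface WindowLoad {
  uploadBytes: number;     // total bytes the attacker must send in the window
  diskBytes: number;       // total decoded audio written to disk
  requiredMbps: number;    // sustained upload bandwidth needed, in Mbit/s
  peakMemoryBytes: number; // memory held if `inFlight` requests overlap
}

function estimateWindowLoad(
  requests: number,
  payloadBytes: number,
  decodedBytes: number,
  perRequestMemoryBytes: number,
  windowSeconds: number,
  inFlight: number,
): WindowLoad {
  const uploadBytes = requests * payloadBytes;
  return {
    uploadBytes,
    diskBytes: requests * decodedBytes,
    requiredMbps: (uploadBytes * 8) / windowSeconds / 1_000_000,
    // Only requests that are processed concurrently add to peak memory.
    peakMemoryBytes: Math.min(inFlight, requests) * perRequestMemoryBytes,
  };
}

const MEGABYTE = 1_000_000;
const load = estimateWindowLoad(
  100,            // requests allowed per window by the authenticated tier
  100 * MEGABYTE, // base64 payload size
  75 * MEGABYTE,  // decoded audio size
  300 * MEGABYTE, // ASSUMED per-request memory, taken from this report
  60,             // window length in seconds
  100,            // worst case: every request overlaps
);
console.log(load.peakMemoryBytes / 1e9);    // 30 (GB)
console.log(load.diskBytes / 1e9);          // 7.5 (GB)
console.log(Math.round(load.requiredMbps)); // 1333 (Mbit/s)
```

Reaching the full 30GB therefore requires about 1.3 Gbit/s of sustained upload bandwidth and full overlap between requests. With less overlap the memory peak is proportionally lower, while the disk growth of ~7.5GB per minute is unaffected.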

## Defense Search Results
- valibot `minLength(1)` only sets minimum, no maximum
- `protectedProcedure` requires authentication but applies no payload size check
- Rate limit (authenticated tier) allows 100/min — insufficient for large payloads
- No content-type validation (no MIME type check)
- No payload size limit on the HTTP request body
- No streaming upload support (entire payload loaded into memory)