Kordant/piolium/findings/p8-002-puppeteer-ssrf/draft.md

Phase: 8
Sequence: 002
Slug: puppeteer-ssrf
Verdict: VALID
Rationale: Puppeteer is launched with --no-sandbox and page.setContent() renders HTML built from unescaped database values; URLs embedded in that report data are fetched by the headless browser
Severity-Original: high
Severity: medium
PoC-Status: pending
Pre-FP-Flag: none
Debate: piolium/attack-surface/balanced-chamber-summary.md

## Summary
The report PDF generator in `web/src/server/services/reports/generator.ts` launches headless Puppeteer with the `--no-sandbox` flag and calls `page.setContent()` to render HTML templates to PDF. The `compileData()` function fills the report with database values (alert breakdowns, threat scores, recommendations) that are inserted into the HTML as strings. If any of those values carries markup that references a URL (for example with an `http://` or `file://` scheme), the headless browser attempts to load it while rendering, which enables SSRF.

## Location
- `web/src/server/services/reports/generator.ts` lines 141–150 (generatePDF function)
- `web/src/server/services/reports/generator.ts` lines 53–137 (compileData function)

## Attacker Control
Two paths give an attacker control over the rendered HTML. An attacker with admin access can modify the report template files in `web/src/server/services/reports/templates/`. An attacker with SQL injection access (DFD-1) can write URL-bearing markup into the `normalizedAlerts` table, whose rows are rendered in reports. In the second case, `compileData()` reads the `source` values from the database and builds HTML from them.

## Trust Boundary Crossed
Database-stored data → Browser rendering context (Puppeteer). The browser runs on the server, so stored data that reaches the rendering context can trigger requests to arbitrary URLs from the server's network position.

## Impact
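SSRF to internal services (cloud metadata endpoints, internal APIs) and, pending PoC confirmation, local file read via `file://` URLs. The `--no-sandbox` flag disables Chrome's sandbox, so a renderer compromise reached through attacker-controlled HTML is not contained, which significantly expands the attack surface.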

## Evidence
```typescript
// generatePDF() — no-sandbox + arbitrary HTML
export async function generatePDF(html: string): Promise<Buffer> {
  const browser = await puppeteer.launch({ headless: true, args: ["--no-sandbox"] });
  const page = await browser.newPage();
  await page.setContent(html, { waitUntil: "load" });  // Arbitrary HTML
  // ...
}

// compileData() — populates report with database data
// alertBreakdownRows contains source values from normalizedAlerts table
// recommendations generates HTML with emoji and markdown-like content
```
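To illustrate why the unescaped `source` values matter, the sketch below models `{{key}}` substitution with no escaping. `renderTemplateUnsafe`, the one-cell template, and the row shape are hypothetical stand-ins for illustration, not the code in `generator.ts`.

```typescript
// Hypothetical stand-in for the template engine: replaces {{key}} with the raw value.
function renderTemplateUnsafe(
  template: string,
  data: Record<string, string>,
): string {
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (_match: string, key: string) => data[key] ?? "",
  );
}

// A `source` value as an attacker might store it in normalizedAlerts (assumed shape).
const maliciousSource = '<img src="http://169.254.169.254/latest/meta-data/">';

// The value is inserted as markup, so the browser sees a real <img> element.
const html = renderTemplateUnsafe("<td>{{source}}</td>", {
  source: maliciousSource,
});
```

Passing a string like `html` to `generatePDF()` is the point at which the browser would try to load the embedded URL.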

## Reproduction Steps
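1. An admin, or an attacker with SQL injection, controls report data or template files
2. The data contains `<img src="file:///etc/passwd">` or `<img src="http://169.254.169.254/latest/meta-data/">`
3. `generatePDF()` renders the report via Puppeteer
4. Puppeteer attempts to load each URL during rendering, which reaches the cloud metadata endpoint or, if Chromium permits `file://` loads from a `setContent()` document (to be confirmed by the PoC), local files
5. Nothing intercepts or filters these requests. `--no-sandbox` does not gate them, but it removes the renderer sandbox, so any renderer compromise reached through the injected HTML is not contained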
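The request issued in step 4 is where Puppeteer's request interception could apply. The following is a minimal sketch of such a filter, assuming reports only need inline `data:` resources. `InterceptedRequest`, `isAllowedResourceUrl`, `filterRequest`, and the empty `ALLOWED_HOSTS` set are hypothetical names introduced here; only `page.setRequestInterception()`, the `request` event, and the `url()`, `abort()`, and `continue()` methods come from Puppeteer.

```typescript
// Structural view of the three Puppeteer HTTPRequest methods this sketch uses.
interface InterceptedRequest {
  url(): string;
  abort(): Promise<void>;
  continue(): Promise<void>;
}

// Assumption: report templates embed their assets inline as data: URLs.
const ALLOWED_PROTOCOLS = new Set<string>(["data:"]);
// Hypothetical: HTTPS hosts the report may load assets from (none by default).
const ALLOWED_HOSTS = new Set<string>([]);

function isAllowedResourceUrl(rawUrl: string): boolean {
  if (rawUrl === "about:blank") return true; // the document setContent() writes into
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are rejected
  }
  if (ALLOWED_PROTOCOLS.has(parsed.protocol)) return true;
  return parsed.protocol === "https:" && ALLOWED_HOSTS.has(parsed.hostname);
}

// Continue allowed requests, abort everything else (file:, http:, unknown hosts).
async function filterRequest(request: InterceptedRequest): Promise<void> {
  if (isAllowedResourceUrl(request.url())) {
    await request.continue();
  } else {
    await request.abort();
  }
}

// Wiring inside generatePDF(), before page.setContent():
//   await page.setRequestInterception(true);
//   page.on("request", (request) => void filterRequest(request));
```

A hostname allowlist of this kind does not by itself account for DNS answers that point at internal addresses, so it is a starting point rather than a complete control.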

## Defense Search Results
- `--no-sandbox` flag is present — disables Chrome sandboxing
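- No URL allowlist or blocklist is applied to requests made by the Puppeteer page
- No `page.setRequestInterception(true)` handler to abort requests to non-allowed URLs
- CSP response headers offer no protection here: `page.setContent()` loads the HTML directly, so there is no HTTP response to carry a `Content-Security-Policy` header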
- HTML template system uses `{{key}}` substitution without escaping
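
The missing escaping noted in the last item could be addressed along the following lines. This is a sketch under assumptions: `escapeHtml` and `renderTemplate` are hypothetical helpers, and the real template engine in `generator.ts` may be structured differently.

```typescript
// Characters that must be encoded so a value is rendered as text, not markup.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (ch: string) => HTML_ESCAPES[ch] ?? ch);
}

// Hypothetical {{key}} renderer that escapes every substituted value.
function renderTemplate(
  template: string,
  data: Record<string, string>,
): string {
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (_match: string, key: string) => escapeHtml(data[key] ?? ""),
  );
}

// The payload from the reproduction steps now renders as inert text.
const rendered = renderTemplate("<td>{{source}}</td>", {
  source: '<img src="file:///etc/passwd">',
});
```

Fields that intentionally carry HTML, such as the generated recommendations, would need sanitizing against an allowlist of tags rather than blanket escaping.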