# 08. Graceful Shutdown & Health Check Endpoints

meta:
  id: web-production-08
  feature: web-production
  priority: P1
  depends_on: []
  tags: [reliability, infrastructure, production]

objective:
  - Implement health checks and graceful shutdown to ensure zero-downtime deployments and reliable operations

deliverables:
  - Health check endpoint (/health)
  - Readiness probe endpoint (/ready)
  - Graceful shutdown handler
  - Dependency health checks (DB, Redis, Stripe)

steps:
  1. Create health check endpoints:
     - GET /health → basic liveness (HTTP 200 if process running)
     - GET /ready → readiness check (DB, Redis, Stripe connectivity)
     - GET /health/deep → comprehensive check with dependency status
  2. Implement dependency health checks:
     - Database: simple SELECT 1 query
     - Redis: PING command
     - Stripe: retrieve account info (cached)
     - WebSocket server: connection count
  3. Add graceful shutdown:
     - Handle SIGTERM/SIGINT signals
     - Stop accepting new connections
     - Wait for active requests to complete (30s timeout)
     - Close database connections
     - Close Redis connections
     - Exit process cleanly
  4. Add startup probe:
     - Delay readiness until all services initialized
     - Retry logic for DB connection on startup
  5. Add metrics endpoint (/metrics) for Prometheus:
     - Request count and duration
     - Error rates
     - Active connections
     - Dependency health status

tests:
  - Unit: Test health check responses
  - Integration: Test graceful shutdown with active requests
  - Load: Verify zero failed requests during rolling restart

acceptance_criteria:
  - /health returns 200 within 100ms
  - /ready returns 200 only when all dependencies healthy
  - /ready returns 503 with detailed error when dependency down
  - Graceful shutdown completes within 30 seconds
  - Zero failed requests during rolling deployment
  - Prometheus metrics endpoint available

validation:
  - `curl /health` → {"status":"ok"}
  - `curl /ready` → {"status":"ok","dependencies":{"db":"ok","redis":"ok","stripe":"ok"}}
  - Stop container with active requests → all complete before exit
  - Block DB port → /ready returns 503

notes:
  - Nitro/SolidStart may need custom server plugin for signal handling
  - Use node-graceful-shutdown or similar library
  - Kubernetes/Docker health checks rely on these endpoints
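
The /ready behaviour described in steps 1 and 2 could be sketched as a framework-agnostic function that runs every dependency check in parallel and maps the results to a 200 or 503 response. This is a minimal sketch, not the project's implementation: `checkReadiness`, `withTimeout`, `CheckFn`, and the 1000 ms per-check timeout are hypothetical names and values, and the real checks (SELECT 1, PING, cached Stripe account lookup) would be passed in by the caller.

```typescript
// Hypothetical types and names for illustration; none come from the spec itself.
type CheckFn = () => Promise<void>;

interface ReadinessResult {
  httpStatus: 200 | 503;
  body: {
    status: "ok" | "error";
    dependencies: Record<string, string>;
  };
}

// Reject if a check takes longer than `ms`, so a hung dependency cannot stall the probe.
function withTimeout(check: CheckFn, ms: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    Promise.resolve()
      .then(check)
      .then(
        () => {
          clearTimeout(timer);
          resolve();
        },
        (err) => {
          clearTimeout(timer);
          reject(err);
        },
      );
  });
}

// Run all checks in parallel and report each dependency as "ok" or "error: <reason>".
async function checkReadiness(
  checks: Record<string, CheckFn>,
  timeoutMs = 1000, // assumed per-check budget, not specified in the task
): Promise<ReadinessResult> {
  const entries = Object.entries(checks);
  const settled = await Promise.allSettled(
    entries.map(([, check]) => withTimeout(check, timeoutMs)),
  );

  const dependencies: Record<string, string> = {};
  let healthy = true;
  settled.forEach((result, i) => {
    const name = entries[i][0];
    if (result.status === "fulfilled") {
      dependencies[name] = "ok";
    } else {
      healthy = false;
      const reason =
        result.reason instanceof Error ? result.reason.message : String(result.reason);
      dependencies[name] = `error: ${reason}`;
    }
  });

  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "error", dependencies },
  };
}
```

A route handler for /ready would call `checkReadiness({ db, redis, stripe })`, set the response status from `httpStatus`, and return `body` as JSON. Using `Promise.allSettled` rather than `Promise.all` means one failing dependency does not hide the status of the others.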
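
The shutdown sequence in step 3 could look roughly like the sketch below. It assumes a plain Node `http.Server` (or anything with a compatible `close` method) rather than Nitro's internals, and `createShutdownHandler`, `StoppableServer`, and `Closable` are hypothetical names. The exit function is injected so the handler can be exercised in tests without killing the process.

```typescript
// Minimal structural view of a server; Node's http.Server satisfies it.
interface StoppableServer {
  close(callback: (err?: Error) => void): unknown;
  closeIdleConnections?: () => void; // present on http.Server since Node 18.2
}

// Hypothetical wrapper for anything that needs closing (DB pool, Redis client, ...).
interface Closable {
  name: string;
  close: () => Promise<void>;
}

function createShutdownHandler(
  server: StoppableServer,
  resources: Closable[],
  exit: (code: number) => void,
  timeoutMs = 30_000, // the 30s drain timeout from step 3
): (signal: string) => Promise<void> {
  let shuttingDown = false;

  return async function shutdown(signal: string): Promise<void> {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}, shutting down`);

    // Force a non-zero exit if draining exceeds the deadline.
    const deadline = setTimeout(() => exit(1), timeoutMs);

    // Stop accepting new connections; the callback fires once open connections have ended.
    await new Promise<void>((resolve) => {
      server.close(() => resolve());
      // Drop idle keep-alive sockets so close() is not held open by them.
      server.closeIdleConnections?.();
    });

    // Close dependencies in order; log failures but keep going.
    for (const resource of resources) {
      try {
        await resource.close();
      } catch (err) {
        console.error(`Failed to close ${resource.name}:`, err);
      }
    }

    clearTimeout(deadline);
    exit(0);
  };
}
```

In a Node entry point the handler would be registered once per signal, for example `process.on("SIGTERM", () => void shutdown("SIGTERM"))` and the same for `SIGINT`, with `(code) => process.exit(code)` passed as `exit`. Flipping /ready to 503 as soon as shutdown begins is a common companion step so the load balancer stops routing traffic before the server closes, though the spec does not require it.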