08. Graceful Shutdown & Health Check Endpoints
meta:
  id: web-production-08
  feature: web-production
  priority: P1
  depends_on: []
  tags: [reliability, infrastructure, production]
objective:
- Implement health checks and graceful shutdown to ensure zero-downtime deployments and reliable operations
deliverables:
- Health check endpoint (/health)
- Readiness probe endpoint (/ready)
- Graceful shutdown handler
- Dependency health checks (DB, Redis, Stripe)
steps:
- Create health check endpoints:
  - GET /health → basic liveness (HTTP 200 if process running)
  - GET /ready → readiness check (DB, Redis, Stripe connectivity)
  - GET /health/deep → comprehensive check with dependency status
- Implement dependency health checks:
  - Database: simple SELECT 1 query
  - Redis: PING command
  - Stripe: retrieve account info (cached)
  - WebSocket server: connection count
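
The endpoint and dependency-check steps above could be sketched roughly as follows. This is a minimal sketch under assumptions, not the project's implementation: `liveness`, `readiness`, `withTimeout`, `DependencyCheck`, and the 2-second per-check timeout are hypothetical names and values, and the actual DB, Redis, and Stripe calls are assumed to be supplied by the caller as async functions that reject on failure.

```typescript
// Hypothetical shapes; none of these names come from the task itself.
type DependencyCheck = () => Promise<unknown>;

interface ProbeResponse {
  httpStatus: 200 | 503;
  body: { status: "ok" | "error"; dependencies?: Record<string, string> };
}

// Reject if a check takes longer than `ms`, so one hung dependency
// cannot stall the whole probe.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// GET /health: liveness only, so it makes no dependency calls.
function liveness(): ProbeResponse {
  return { httpStatus: 200, body: { status: "ok" } };
}

// GET /ready: run every check concurrently; 200 only if all of them pass,
// otherwise 503 with the failure message recorded per dependency.
async function readiness(
  checks: Record<string, DependencyCheck>,
  timeoutMs = 2000, // assumed per-check budget
): Promise<ProbeResponse> {
  const results = await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        await withTimeout(check(), timeoutMs);
        return { name, outcome: "ok" };
      } catch (err) {
        return { name, outcome: err instanceof Error ? err.message : String(err) };
      }
    }),
  );

  const dependencies: Record<string, string> = {};
  let healthy = true;
  for (const { name, outcome } of results) {
    dependencies[name] = outcome;
    if (outcome !== "ok") healthy = false;
  }
  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "error", dependencies },
  };
}
```

Each entry passed to `readiness` would wrap one real call, for example a `SELECT 1` query or a Redis `PING`. Keeping `liveness` free of dependency calls means a slow database cannot cause the process itself to be reported as dead.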
- Add graceful shutdown:
  - Handle SIGTERM/SIGINT signals
  - Stop accepting new connections
  - Wait for active requests to complete (30s timeout)
  - Close database connections
  - Close Redis connections
  - Exit process cleanly
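
The shutdown sequence above might look something like the sketch below. It is written against small hypothetical interfaces (`ClosableServer`, `SignalSource`, `ShutdownDeps`) rather than a specific framework, since the notes suggest Nitro/SolidStart may need its own plugin for signal handling. `ClosableServer` mirrors the callback form of Node's `http.Server#close`, and `closeDatabase` / `closeRedis` are assumed to be provided by the application.

```typescript
// Hypothetical interfaces; a Node http.Server and the global `process`
// are expected to satisfy them structurally.
interface ClosableServer {
  close(callback: (err?: Error) => void): unknown;
}

interface SignalSource {
  once(signal: string, handler: () => void): unknown;
  exit(code: number): void;
}

interface ShutdownDeps {
  server: ClosableServer;
  closeDatabase: () => Promise<void>;
  closeRedis: () => Promise<void>;
  timeoutMs?: number; // defaults to the 30s budget from the steps above
}

// Runs the shutdown sequence and resolves with the exit code to use:
// 0 if everything drained and closed cleanly, 1 otherwise.
async function shutdown(deps: ShutdownDeps): Promise<number> {
  const timeoutMs = deps.timeoutMs ?? 30_000;

  // close() stops accepting new connections and calls back once the
  // existing ones have finished; give up waiting after the timeout.
  const drained = await new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    deps.server.close((err) => {
      clearTimeout(timer);
      resolve(!err);
    });
  });

  // Close data stores only after requests have drained (or timed out),
  // so in-flight handlers can still use them.
  let closedCleanly = true;
  for (const close of [deps.closeDatabase, deps.closeRedis]) {
    try {
      await close();
    } catch {
      closedCleanly = false;
    }
  }
  return drained && closedCleanly ? 0 : 1;
}

// Wire SIGTERM/SIGINT to the sequence; repeated signals are ignored.
function installSignalHandlers(proc: SignalSource, deps: ShutdownDeps): void {
  let shuttingDown = false;
  for (const signal of ["SIGTERM", "SIGINT"]) {
    proc.once(signal, () => {
      if (shuttingDown) return;
      shuttingDown = true;
      void shutdown(deps).then((code) => proc.exit(code));
    });
  }
}
```

In a Node.js process this would presumably be called once at startup as `installSignalHandlers(process, { server, closeDatabase, closeRedis })`. Note that keep-alive connections can hold `close` open until the timeout, which is one reason a library such as the one mentioned in the notes may be preferable to hand-rolled code.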
- Add startup probe:
  - Delay readiness until all services initialized
  - Retry logic for DB connection on startup
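
One possible shape for the startup probe is a retry loop with capped exponential backoff plus a flag that the readiness endpoint consults. Everything named here (`StartupGate`, `connectWithRetry`, `startServices`, `RetryOptions`, and the default delays) is an illustrative assumption; the task does not prescribe a backoff policy.

```typescript
// Hypothetical retry settings; tune to the real database's startup time.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

const defaultRetry: RetryOptions = { maxAttempts: 5, baseDelayMs: 200, maxDelayMs: 5000 };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `connect` until it succeeds or attempts run out, doubling the
// delay after each failure up to `maxDelayMs`.
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  options: RetryOptions = defaultRetry,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt < options.maxAttempts) {
        const delay = Math.min(options.baseDelayMs * 2 ** (attempt - 1), options.maxDelayMs);
        await sleep(delay);
      }
    }
  }
  throw lastError;
}

// Readiness stays false until initialization has finished.
class StartupGate {
  private ready = false;
  markReady(): void {
    this.ready = true;
  }
  isReady(): boolean {
    return this.ready;
  }
}

// Connect to the database (with retries), then open the gate.
async function startServices(
  gate: StartupGate,
  connectDb: () => Promise<unknown>,
  options: RetryOptions = defaultRetry,
): Promise<void> {
  await connectWithRetry(connectDb, options);
  gate.markReady();
}
```

A `/ready` handler could return 503 while `gate.isReady()` is false and only then fall through to the dependency checks, which keeps traffic away from an instance that is still initializing.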
- Add metrics endpoint (/metrics) for Prometheus:
  - Request count and duration
  - Error rates
  - Active connections
  - Dependency health status
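
To make the `/metrics` output concrete, the sketch below renders counters and gauges in the Prometheus text exposition format. In practice a client library (for Node.js, `prom-client` is a common choice) would likely do this and also provide histograms for request duration; the hand-rolled `renderMetrics` function, the `Metric` type, and the metric names in the usage note are hypothetical and shown only to illustrate the format.

```typescript
// Hypothetical in-memory representation of metrics to expose.
type MetricType = "counter" | "gauge";

interface Sample {
  labels: Record<string, string>;
  value: number;
}

interface Metric {
  name: string;
  help: string;
  type: MetricType;
  samples: Sample[];
}

// Label values must escape backslash, double quote, and newline.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Produce the text format: a HELP line, a TYPE line, then one line
// per sample, with labels in braces when present.
function renderMetrics(metrics: Metric[]): string {
  const lines: string[] = [];
  for (const metric of metrics) {
    lines.push(`# HELP ${metric.name} ${metric.help}`);
    lines.push(`# TYPE ${metric.name} ${metric.type}`);
    for (const sample of metric.samples) {
      const labels = Object.entries(sample.labels)
        .map(([key, value]) => `${key}="${escapeLabelValue(value)}"`)
        .join(",");
      lines.push(
        labels
          ? `${metric.name}{${labels}} ${sample.value}`
          : `${metric.name} ${sample.value}`,
      );
    }
  }
  return lines.join("\n") + "\n";
}
```

For example, a gauge named `dependency_up` with a `dependency="db"` label and value 1 renders as the sample line `dependency_up{dependency="db"} 1`. Error rates are usually not exported directly; a request counter labelled by status code lets Prometheus derive them at query time. The response would typically be served with the content type `text/plain; version=0.0.4`.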
tests:
- Unit: Test health check responses
- Integration: Test graceful shutdown with active requests
- Load: Verify zero failed requests during rolling restart
acceptance_criteria:
- /health returns 200 within 100ms
- /ready returns 200 only when all dependencies healthy
- /ready returns 503 with detailed error when dependency down
- Graceful shutdown completes within 30 seconds
- Zero failed requests during rolling deployment
- Prometheus metrics endpoint available
validation:
- curl /health → {"status":"ok"}
- curl /ready → {"status":"ok","dependencies":{"db":"ok","redis":"ok","stripe":"ok"}}
- Stop container with active requests → all complete before exit
- Block DB port → /ready returns 503
notes:
- Nitro/SolidStart may need custom server plugin for signal handling
- Use node-graceful-shutdown or similar library
- Kubernetes/Docker health checks rely on these endpoints