get to prod tasks
tasks/web-production/01-security-headers-cors.md
@@ -0,0 +1,61 @@
# 01. Security Headers & CORS Configuration

meta:
  id: web-production-01
  feature: web-production
  priority: P1
  depends_on: []
  tags: [security, infrastructure, production]

objective:
- Implement comprehensive security headers and CORS configuration to protect against common web vulnerabilities

deliverables:
- Security headers middleware in web/src/middleware.ts or Nitro config
- CORS configuration for API endpoints
- Content Security Policy (CSP) headers
- Remove X-Powered-By and other identifying headers

steps:
1. Add helmet-like security headers via Nitro hooks or Vite plugin:
   - Strict-Transport-Security (HSTS)
   - X-Content-Type-Options: nosniff
   - X-Frame-Options: DENY
   - X-XSS-Protection: 0 (helmet's default; the legacy `1; mode=block` filter is deprecated and can introduce vulnerabilities in older browsers)
   - Referrer-Policy: strict-origin-when-cross-origin
   - Permissions-Policy for camera, microphone, geolocation
2. Implement CSP header allowing only necessary sources:
   - script-src: 'self', stripe.com, clerk.dev
   - style-src: 'self', 'unsafe-inline' (needed for Tailwind)
   - img-src: 'self', data:, blob:, gravatar.com
   - connect-src: 'self', api endpoints, websocket URL
   - frame-src: 'self', stripe.com (for Checkout)
3. Configure CORS for /api/trpc endpoints:
   - Allow origins: production domain, mobile app origins
   - Allow methods: GET, POST
   - Allow headers: Content-Type, Authorization, x-api-key
   - Credentials: true
4. Remove server-identifying headers (X-Powered-By, Server)
5. Add tests verifying headers are present on all responses

tests:
- Unit: Test that each header is present with the correct value
- Integration: Test API endpoints return correct CORS headers
- Security scan: Use securityheaders.com or similar to verify A+ rating

acceptance_criteria:
- All 8 security headers present on every HTTP response
- CSP blocks inline scripts except those approved by nonce or hash
- CORS preflight requests handled correctly for API endpoints
- SecurityHeaders.com scan returns A+ rating
- No server version information leaked in headers

validation:
- Run `curl -I http://localhost:3000` (or `curl -kI https://localhost:3000` with a self-signed certificate) and verify headers
- Run automated security header scanner
- Check browser dev tools Network tab for all response headers

notes:
- SolidStart/Nitro may require custom plugin for headers
- CSP 'unsafe-inline' for styles is acceptable with Tailwind v4 but document the trade-off
- Consider using nonce-based CSP once Tailwind supports it fully
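
Steps 2 and 3 above can be sketched as two pure functions that a Nitro hook or middleware could call. This is a minimal sketch, not the project's code: `buildCsp`, `corsHeaders`, and the `example.com` origins are hypothetical names introduced for illustration. Because credentials are allowed, `Access-Control-Allow-Origin` cannot be `*`, so the sketch echoes back only an allow-listed origin.

```typescript
// Hypothetical helpers; names and example hosts are assumptions, not project code.

// CSP directives from step 2, kept as ordered [name, sources] pairs.
const cspDirectives: Array<[string, string[]]> = [
  ["script-src", ["'self'", "stripe.com", "clerk.dev"]],
  ["style-src", ["'self'", "'unsafe-inline'"]],
  ["img-src", ["'self'", "data:", "blob:", "gravatar.com"]],
  // Placeholder API and WebSocket origins; replace with the real ones.
  ["connect-src", ["'self'", "https://api.example.com", "wss://ws.example.com"]],
  ["frame-src", ["'self'", "stripe.com"]],
];

// Serialize the directives into one Content-Security-Policy header value.
function buildCsp(directives: Array<[string, string[]]>): string {
  return directives
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Return CORS headers only when the request origin is on the allow list.
// An empty object means "send no CORS headers", so the browser blocks the call.
function corsHeaders(
  origin: string | undefined,
  allowed: ReadonlySet<string>,
): Record<string, string> {
  if (origin === undefined || !allowed.has(origin)) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "GET, POST",
    "Access-Control-Allow-Headers": "Content-Type, Authorization, x-api-key",
    "Access-Control-Allow-Credentials": "true",
    // The response differs per origin, so caches must key on it.
    Vary: "Origin",
  };
}
```

Keeping both functions free of framework types makes the unit tests in this task straightforward: they can assert on plain strings and objects without starting a server.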
tasks/web-production/02-rate-limiting-ddos.md
@@ -0,0 +1,58 @@
# 02. Rate Limiting & DDoS Protection

meta:
  id: web-production-02
  feature: web-production
  priority: P1
  depends_on: []
  tags: [security, infrastructure, production]

objective:
- Implement robust rate limiting and DDoS protection beyond the basic in-memory tRPC middleware

deliverables:
- Redis-backed rate limiting for distributed deployment
- Per-endpoint rate limit tiers
- IP-based and user-based limiting
- DDoS protection via Cloudflare or similar

steps:
1. Replace in-memory rate limit map with Redis-backed solution:
   - Use ioredis or @upstash/ratelimit for distributed rate limiting
   - Create web/src/server/lib/ratelimit.ts with configurable tiers
2. Define rate limit tiers:
   - Public endpoints (login, signup): 5 req/min per IP
   - Authenticated API: 100 req/min per user
   - Sensitive operations (password reset): 3 req/hour per email
   - WebSocket connections: 1 per user, reconnect max 5/min
   - Admin endpoints: 50 req/min per admin
3. Add IP-based rate limiting at edge/Nitro level for anonymous traffic
4. Configure Cloudflare (or alternative) for:
   - DDoS protection
   - Bot management
   - Challenge pages for suspicious traffic
5. Add rate limit response headers (X-RateLimit-Remaining, X-RateLimit-Reset)
6. Implement sliding window algorithm for fairer limiting

tests:
- Unit: Test rate limiter correctly counts and resets
- Integration: Flood endpoint with requests, verify 429 responses
- Load: Use k6 or artillery to test limits under load

acceptance_criteria:
- Redis-backed rate limiting active on all endpoints
- 429 responses include Retry-After header
- Rate limits enforced per-IP, per-user, and per-endpoint
- DDoS protection layer active at edge
- No single IP can exceed 1000 req/min to any endpoint
- Rate limit headers present on all API responses

validation:
- `ab -n 1000 -c 10` against login endpoint → 429s after limit
- Verify Redis keys exist for rate limit counters
- Check Cloudflare dashboard for blocked threats

notes:
- Current in-memory rate limit in web/src/server/api/utils.ts will not work across multiple server instances
- Upstash Redis recommended for serverless deployments
- Consider implementing token bucket for burst tolerance
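
The sliding window algorithm from step 6 can be illustrated with an in-memory sliding-window log. This is only a sketch of the counting logic under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, and a production version would keep the timestamps in Redis (for example in a sorted set) rather than in a process-local `Map`, for the multi-instance reason given in the notes.

```typescript
// Hypothetical in-memory sliding-window log; a real deployment would back this with Redis.
interface LimitResult {
  allowed: boolean;
  remaining: number; // candidate value for X-RateLimit-Remaining
  retryAfterSec: number; // candidate value for Retry-After on a 429
}

class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `key` is e.g. "ip:<addr>" or "user:<id>"; `nowMs` is injected so tests are deterministic.
  check(key: string, nowMs: number): LimitResult {
    const windowStart = nowMs - this.windowMs;
    // Keep only the requests that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);
    this.hits.set(key, recent);

    if (recent.length >= this.limit) {
      // The oldest retained hit decides when a slot frees up again.
      const oldest = recent[0] ?? nowMs;
      const retryAfterSec = Math.max(
        1,
        Math.ceil((oldest + this.windowMs - nowMs) / 1000),
      );
      return { allowed: false, remaining: 0, retryAfterSec };
    }

    recent.push(nowMs);
    return {
      allowed: true,
      remaining: this.limit - recent.length,
      retryAfterSec: 0,
    };
  }
}
```

For the public tier in step 2, `new SlidingWindowLimiter(5, 60000)` keyed by IP would model "5 req/min per IP". Unlike a fixed window, a client cannot double its allowance by straddling a window boundary.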
tasks/web-production/03-input-validation-xss.md
@@ -0,0 +1,62 @@
# 03. Input Validation & XSS Prevention Audit

meta:
  id: web-production-03
  feature: web-production
  priority: P1
  depends_on: []
  tags: [security, validation, production]

objective:
- Audit and harden all input validation to prevent XSS, injection attacks, and malformed data

deliverables:
- XSS prevention audit report
- Input sanitization layer
- HTML escaping on all user-generated content
- SQL injection protection verification

steps:
1. Audit all tRPC routers for input validation gaps:
   - Check web/src/server/api/routers/*.ts for missing valibot schemas
   - Ensure all user inputs have strict type validation
   - Add maxLength constraints to all string inputs
2. Implement output escaping for user-generated content:
   - Blog posts, user names, alert messages
   - Use DOMPurify or similar on client-side rendering
   - Escape HTML entities server-side before DB storage
3. Audit database queries for SQL injection:
   - Verify all queries use Drizzle parameterized queries
   - Check raw SQL usage in jobs and services
   - Ensure no string concatenation in SQL
4. Add content validation for file uploads (if any):
   - MIME type verification
   - File size limits
   - Scan for malware
5. Implement request body size limits:
   - 1MB max for JSON payloads
   - 10MB max for file uploads
6. Add tests for malformed input handling

tests:
- Unit: Test each router with XSS payloads, SQL injection attempts
- Integration: Submit malicious inputs via API, verify safe handling
- Security: Run OWASP ZAP or Burp Suite against app

acceptance_criteria:
- All tRPC inputs have strict valibot validation with bounds
- User-generated content escaped before rendering
- No SQL injection vectors in any query
- XSS payloads rendered as plain text, not executed
- Request body size limits enforced
- OWASP ZAP scan shows no high/critical vulnerabilities

validation:
- Submit `<script>alert('xss')</script>` in all text fields → rendered safely
- Submit SQL injection in search fields → no database errors
- Run `npm audit` and address all high severity issues

notes:
- Valibot schemas are already in use; expand them with stricter bounds
- Consider using zod for more complex validation if valibot is limiting
- Sanitize inputs at API boundary, not just client-side
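
The HTML entity escaping mentioned in step 2 can be sketched as a small pure function. `escapeHtml` is a hypothetical helper, not something the task files confirm exists in the codebase. It covers text placed in HTML element content or quoted attributes only, and it does not replace DOMPurify when rich HTML has to be allowed through.

```typescript
// Hypothetical helper: escapes the five characters that are significant in
// HTML text and quoted attribute values.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(input: string): string {
  // Single pass over the string, so "&" in the replacements is never re-escaped.
  return input.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

Applied to the payload from the validation section, `escapeHtml("<script>alert('xss')</script>")` yields `&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;`, which a browser renders as plain text.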
tasks/web-production/04-auth-session-hardening.md
@@ -0,0 +1,71 @@
# 04. Authentication & Session Security Hardening

meta:
  id: web-production-04
  feature: web-production
  priority: P1
  depends_on: []
  tags: [security, auth, production]

objective:
- Harden authentication and session management to prevent session hijacking, fixation, and brute force attacks

deliverables:
- Secure session configuration
- JWT hardening
- Brute force protection
- Session invalidation on logout
- Multi-factor authentication foundation

steps:
1. Harden JWT implementation in web/src/server/auth/jwt.ts:
   - Remove fallback secret (currently uses dev secret if env missing)
   - Add JWT issuer and audience claims
   - Implement token blacklisting for logout
   - Add refresh token rotation
2. Harden session management in web/src/server/auth/session.ts:
   - Use httpOnly, secure, sameSite=strict cookies
   - Add session fingerprinting (user agent hash)
   - Implement concurrent session limits (max 5 per user)
   - Add automatic session expiry refresh on activity
3. Add brute force protection:
   - Track failed login attempts per IP/email
   - Progressive delays: 1s, 2s, 4s, 8s, 16s
   - Lock account after 10 failed attempts (1 hour)
4. Implement secure logout:
   - Invalidate session in database
   - Clear all cookies
   - Blacklist JWT token
   - Revoke refresh token
5. Add MFA foundation:
   - TOTP secret generation
   - QR code for authenticator apps
   - Backup codes
6. Audit Clerk integration for security:
   - Verify webhook signature validation
   - Check Clerk session sync with custom sessions

tests:
- Unit: Test JWT signing/verification with invalid tokens
- Integration: Test brute force lockout, session expiry
- Security: Test session hijacking resistance

acceptance_criteria:
- No hardcoded or fallback secrets in auth code
- All cookies have httpOnly, secure, sameSite=strict
- Brute force protection active on login endpoints
- Logout invalidates session completely
- JWT tokens include iss, aud, iat, exp claims
- Session fingerprinting prevents cookie theft reuse
- MFA TOTP generation working with Google Authenticator

validation:
- Attempt 10 failed logins → account locked
- Steal session cookie from one browser → invalid in another (fingerprinting)
- Logout → session token rejected on subsequent requests
- Check JWT with jwt.io → valid iss and aud claims

notes:
- Current JWT has fallback secret; this is critical to fix before production
- Clerk handles frontend auth but backend needs its own hardening
- Consider using Lucia Auth or NextAuth patterns for session management
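
The brute force rules in step 3 reduce to two small calculations. The sketch below assumes the delay stays capped at 16s once the listed sequence is exhausted, and that the one-hour lock is measured from the most recent failure. Neither detail is stated in the task, and `loginDelayMs` and `isLocked` are hypothetical names.

```typescript
// Hypothetical brute-force helpers; the cap and lock semantics are assumptions.
const MAX_FAILED_ATTEMPTS = 10;
const LOCK_DURATION_MS = 60 * 60 * 1000; // 1 hour

// Delay to impose before processing the next attempt: 1s, 2s, 4s, 8s, 16s, then 16s.
function loginDelayMs(failedAttempts: number): number {
  if (failedAttempts <= 0) {
    return 0;
  }
  const exponent = Math.min(failedAttempts - 1, 4);
  return 1000 * Math.pow(2, exponent);
}

// The account is locked once the threshold is reached and the lock has not expired.
function isLocked(
  failedAttempts: number,
  lastFailureMs: number,
  nowMs: number,
): boolean {
  return (
    failedAttempts >= MAX_FAILED_ATTEMPTS &&
    nowMs - lastFailureMs < LOCK_DURATION_MS
  );
}
```

Passing the clock in as `nowMs` keeps the lockout test in this task deterministic, since no real hour has to elapse.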
tasks/web-production/05-cdn-asset-optimization.md
@@ -0,0 +1,61 @@
# 05. CDN & Asset Optimization

meta:
  id: web-production-05
  feature: web-production
  priority: P2
  depends_on: []
  tags: [performance, infrastructure, production]

objective:
- Configure CDN for static assets and optimize frontend bundle delivery

deliverables:
- CDN configuration (Cloudflare, Vercel Edge, or AWS CloudFront)
- Asset optimization (images, fonts, JS/CSS)
- Brotli/Gzip compression
- Cache-Control headers for static assets

steps:
1. Configure CDN for static assets:
   - Set up Cloudflare or Vercel Edge Network
   - Point CDN to web/dist/client or .output/public
   - Configure cache rules for static files
2. Optimize image delivery:
   - Convert landing page SVGs to optimized formats where appropriate
   - Add responsive image srcset for photos
   - Implement lazy loading for below-fold images
3. Configure compression:
   - Enable Brotli compression (typically compresses text assets better than gzip)
   - Ensure Nitro/Vite build outputs compressed assets
4. Set Cache-Control headers:
   - Immutable assets (hashed filenames): 1 year
   - HTML pages: no-cache (for SSR)
   - API responses: no-store or short cache
5. Implement resource hints:
   - Preconnect to API domain, Stripe, Clerk
   - Prefetch critical routes
6. Add tests verifying asset optimization

tests:
- Unit: Test asset hashing and cache headers
- Integration: Test CDN cache hit rates
- Performance: Lighthouse performance audit >90

acceptance_criteria:
- Static assets served from CDN with <50ms TTFB
- Brotli compression active on all text assets
- Cache-Control headers correct per asset type
- Image optimization reducing total page weight by >30%
- Lighthouse Performance score ≥ 90
- Preconnect hints present on critical pages

validation:
- `curl -I https://cdn.example.com/assets/main.js` → Cache-Control: public, max-age=31536000, immutable
- Lighthouse CI run shows Performance ≥ 90
- PageSpeed Insights shows <2s LCP on mobile

notes:
- SolidStart with Nitro should handle asset hashing automatically
- Vercel deployment may include CDN automatically
- Consider using @solidjs/start image optimization if available
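
The Cache-Control rules in step 4 can be expressed as one path-based lookup. This is a sketch under assumptions: `cacheControlFor` is a hypothetical helper, and the regular expression assumes hashed filenames of the form `name-<hex hash>.ext` or `name.<hex hash>.ext`, which has to be adjusted to whatever the Nitro/Vite build actually emits.

```typescript
// Hypothetical helper mapping a request path to a Cache-Control value.
const ONE_YEAR_SEC = 365 * 24 * 60 * 60; // 31536000

// Assumed shape of a content-hashed asset name, e.g. /assets/main-4f9a2c1b.js
const HASHED_ASSET = /[-.][0-9a-f]{8,}\.(js|css|woff2|png|jpg|svg|webp)$/;

function cacheControlFor(path: string): string {
  if (path.startsWith("/api/")) {
    // API responses must not be stored by shared caches.
    return "no-store";
  }
  if (HASHED_ASSET.test(path)) {
    // The hash changes whenever the content changes, so the URL is safe to cache "forever".
    return `public, max-age=${ONE_YEAR_SEC}, immutable`;
  }
  // SSR HTML and unhashed files: cache, but revalidate on every use.
  return "no-cache";
}
```

Unhashed files deliberately fall through to `no-cache`, because a one-year lifetime on a URL whose content can change would pin stale assets in browsers.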
tasks/web-production/06-db-connection-pooling.md
@@ -0,0 +1,62 @@
# 06. Database Connection Pooling & Query Optimization

meta:
  id: web-production-06
  feature: web-production
  priority: P1
  depends_on: []
  tags: [performance, database, production]

objective:
- Optimize database connections and queries for production load

deliverables:
- Connection pooling configuration
- Query performance audit
- Index optimization
- Slow query logging

steps:
1. Configure connection pooling:
   - If using PostgreSQL: configure PgBouncer or the driver's built-in pool (@libsql/client applies only if the database is libSQL/Turso, not PostgreSQL)
   - Set max connections based on server instances (e.g., 20 per instance)
   - Add connection timeout and idle timeout settings
2. Audit all Drizzle queries for performance:
   - Check web/src/server/db/schema/*.ts for missing indexes
   - Review web/src/server/api/routers/*.ts for N+1 queries
   - Add pagination to all list endpoints (default 50, max 100)
3. Add database indexes:
   - createdAt indexes for time-range queries (alerts, exposures)
   - Composite indexes for common filter combinations
   - userId indexes on all user-scoped tables
4. Implement query result caching:
   - Cache user profile lookups (5 min TTL)
   - Cache subscription status (1 min TTL)
   - Cache dashboard summary (30 sec TTL)
5. Add slow query logging:
   - Log queries taking >500ms
   - Alert on >1s queries
6. Set up database performance monitoring

tests:
- Unit: Test query execution plans for major endpoints
- Load: Run 1000 concurrent dashboard loads, verify <200ms p95
- Integration: Test pagination boundaries

acceptance_criteria:
- Database connection pool configured with max 20 connections
- No N+1 queries in any API endpoint
- All list endpoints paginated with cursor or offset
- Slow query logging active
- Dashboard load query <100ms p95
- Alert endpoint query <50ms p95

validation:
- EXPLAIN ANALYZE on major queries shows index usage
- Load test with k6: 1000 concurrent users, p95 < 200ms
- Database CPU <50% under normal load

notes:
- Current schema has some indexes but may need more for production scale
- Drizzle ORM doesn't automatically handle connection pooling; configure at driver level
- Consider read replicas if dashboard load is heavy
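
Two of the numeric rules above, the pagination bounds in step 2 and the slow query thresholds in step 5, are easy to pin down in code. The helpers below are a hypothetical sketch. `resolvePageSize` and `classifyQueryDuration` are illustrative names, and treating a non-positive or fractional page size as invalid is an assumption.

```typescript
// Hypothetical helpers for the pagination and slow-query rules.
const DEFAULT_PAGE_SIZE = 50;
const MAX_PAGE_SIZE = 100;

// Clamp a client-supplied page size into [1, 100], defaulting to 50.
function resolvePageSize(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_PAGE_SIZE;
  }
  const whole = Math.floor(requested);
  return Math.min(Math.max(whole, 1), MAX_PAGE_SIZE);
}

type QuerySeverity = "ok" | "slow" | "alert";

// >500ms is logged as slow; >1s additionally raises an alert.
function classifyQueryDuration(durationMs: number): QuerySeverity {
  if (durationMs > 1000) {
    return "alert";
  }
  if (durationMs > 500) {
    return "slow";
  }
  return "ok";
}
```

Clamping rather than rejecting oversized page sizes keeps list endpoints tolerant of older clients, though returning a validation error is an equally defensible choice.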
tasks/web-production/07-caching-strategy.md
@@ -0,0 +1,61 @@
# 07. Caching Strategy (Redis + HTTP Cache)

meta:
  id: web-production-07
  feature: web-production
  priority: P2
  depends_on: []
  tags: [performance, caching, production]

objective:
- Implement multi-layer caching to reduce database load and improve response times

deliverables:
- Redis caching layer for API responses
- HTTP cache headers for client-side caching
- Cache invalidation strategy
- Stale-while-revalidate pattern

steps:
1. Implement Redis caching for API responses:
   - Create web/src/server/lib/cache.ts with Redis-backed cache
   - Cache user profile: key `user:{id}`, TTL 5 minutes
   - Cache subscription: key `sub:{userId}`, TTL 1 minute
   - Cache dashboard summary: key `dash:{userId}`, TTL 30 seconds
   - Cache blog posts: key `blog:{slug}`, TTL 1 hour
2. Add cache decorators/procedures:
   - Create cachedProcedure wrapper for tRPC
   - Support cache tags for invalidation
3. Implement HTTP caching headers:
   - Static assets: Cache-Control: public, max-age=31536000, immutable
   - API responses: Cache-Control: private, max-age=30
   - HTML pages: Cache-Control: no-cache (SSR)
4. Add cache invalidation:
   - Invalidate user cache on profile update
   - Invalidate subscription cache on billing event
   - Invalidate blog cache on publish/edit
5. Implement stale-while-revalidate for dashboard data
6. Add cache hit/miss metrics

tests:
- Unit: Test cache set/get/delete operations
- Integration: Test cache invalidation on mutations
- Performance: Compare cached vs uncached response times

acceptance_criteria:
- Redis cache layer active on all read-heavy endpoints
- Cache hit rate >80% for user profile and subscription endpoints
- Cache invalidation working on all mutations
- HTTP cache headers correct per endpoint type
- Stale-while-revalidate pattern on dashboard widgets
- Cache metrics visible in monitoring dashboard

validation:
- Load test: cached endpoint p95 < 20ms
- Verify Redis keys created for cached data
- Update profile → cache invalidated, next request hits DB

notes:
- Redis already used for BullMQ jobs; share connection or use separate DB index
- Be careful caching authenticated data; always include userId in key
- Consider using Vercel KV or Upstash Redis for serverless
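
The key scheme, TTLs, and invalidation from steps 1 and 4 can be sketched with an in-memory stand-in for Redis. `TtlCache` and `CACHE_TTL_MS` are hypothetical names. A real web/src/server/lib/cache.ts would delegate expiry to Redis, but the observable behavior (hit before the TTL, miss after it or after invalidation) should match.

```typescript
// Hypothetical in-memory stand-in for the Redis-backed cache.
const CACHE_TTL_MS = {
  user: 5 * 60 * 1000, // user:{id}
  sub: 60 * 1000, // sub:{userId}
  dash: 30 * 1000, // dash:{userId}
  blog: 60 * 60 * 1000, // blog:{slug}
};

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();

  // `nowMs` is injected so expiry can be tested without waiting.
  set(key: string, value: V, ttlMs: number, nowMs: number): void {
    this.entries.set(key, { value, expiresAt: nowMs + ttlMs });
  }

  get(key: string, nowMs: number): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined; // miss
    }
    if (nowMs >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop and report a miss
      return undefined;
    }
    return entry.value; // hit
  }

  // Called from mutations, e.g. after a profile update.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

Because every key embeds the user id or slug, invalidating `user:42` on a profile update cannot affect another user's cached data, which is the concern raised in the notes.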
tasks/web-production/08-health-checks-shutdown.md
@@ -0,0 +1,67 @@
# 08. Graceful Shutdown & Health Check Endpoints

meta:
  id: web-production-08
  feature: web-production
  priority: P1
  depends_on: []
  tags: [reliability, infrastructure, production]

objective:
- Implement health checks and graceful shutdown to ensure zero-downtime deployments and reliable operations

deliverables:
- Health check endpoint (/health)
- Readiness probe endpoint (/ready)
- Graceful shutdown handler
- Dependency health checks (DB, Redis, Stripe)

steps:
1. Create health check endpoints:
   - GET /health → basic liveness (HTTP 200 if process running)
   - GET /ready → readiness check (DB, Redis, Stripe connectivity)
   - GET /health/deep → comprehensive check with dependency status
2. Implement dependency health checks:
   - Database: simple SELECT 1 query
   - Redis: PING command
   - Stripe: retrieve account info (cached)
   - WebSocket server: connection count
3. Add graceful shutdown:
   - Handle SIGTERM/SIGINT signals
   - Stop accepting new connections
   - Wait for active requests to complete (30s timeout)
   - Close database connections
   - Close Redis connections
   - Exit process cleanly
4. Add startup probe:
   - Delay readiness until all services initialized
   - Retry logic for DB connection on startup
5. Add metrics endpoint (/metrics) for Prometheus:
   - Request count and duration
   - Error rates
   - Active connections
   - Dependency health status

tests:
- Unit: Test health check responses
- Integration: Test graceful shutdown with active requests
- Load: Verify zero failed requests during rolling restart

acceptance_criteria:
- /health returns 200 within 100ms
- /ready returns 200 only when all dependencies healthy
- /ready returns 503 with detailed error when dependency down
- Graceful shutdown completes within 30 seconds
- Zero failed requests during rolling deployment
- Prometheus metrics endpoint available

validation:
- `curl /health` → {"status":"ok"}
- `curl /ready` → {"status":"ok","dependencies":{"db":"ok","redis":"ok","stripe":"ok"}}
- Stop container with active requests → all complete before exit
- Block DB port → /ready returns 503

notes:
- Nitro/SolidStart may need custom server plugin for signal handling
- Use node-graceful-shutdown or similar library
- Kubernetes/Docker health checks rely on these endpoints
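
The /ready contract from the acceptance criteria (200 only when every dependency is healthy, 503 otherwise) can be sketched as a pure aggregation step. `summarizeReadiness` is a hypothetical function that takes already-computed check results. Running the actual `SELECT 1`, `PING`, and Stripe calls is left out, and the `"error"` label for a failed dependency is an assumption.

```typescript
// Hypothetical aggregation of dependency check results into a /ready response.
type DependencyState = "ok" | "error";

interface ReadinessReport {
  httpStatus: 200 | 503;
  body: {
    status: DependencyState;
    dependencies: Record<string, DependencyState>;
  };
}

// `checks` holds [dependency name, healthy?] pairs in the order they should be reported.
function summarizeReadiness(checks: Array<[string, boolean]>): ReadinessReport {
  const dependencies: Record<string, DependencyState> = {};
  let allHealthy = true;
  for (const [name, healthy] of checks) {
    dependencies[name] = healthy ? "ok" : "error";
    if (!healthy) {
      allHealthy = false;
    }
  }
  return {
    httpStatus: allHealthy ? 200 : 503,
    body: { status: allHealthy ? "ok" : "error", dependencies },
  };
}
```

Separating aggregation from the checks themselves lets the unit test cover the 200/503 decision without a database or Redis instance.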
tasks/web-production/09-structured-logging.md
@@ -0,0 +1,66 @@
# 09. Structured Logging & Log Aggregation

meta:
  id: web-production-09
  feature: web-production
  priority: P2
  depends_on: []
  tags: [observability, logging, production]

objective:
- Replace ad-hoc logging with structured, aggregated logging for production debugging and auditing

deliverables:
- Structured logging library integration (Pino or Winston)
- Log aggregation pipeline (Datadog, Logtail, or CloudWatch)
- Request ID propagation across all logs
- Log rotation and retention policy

steps:
1. Add structured logging library:
   - Install pino or winston in web/package.json
   - Create web/src/server/lib/logger.ts with configured logger
   - Replace all console.log/console.error with logger
2. Implement request context logging:
   - Generate request ID for each incoming request
   - Attach user ID, session ID to log context
   - Propagate request ID through tRPC context
3. Configure log levels:
   - ERROR: unhandled exceptions, auth failures, DB errors
   - WARN: rate limit hits, slow queries, deprecated API usage
   - INFO: requests, logins, signups, billing events
   - DEBUG: query details, cache hits/misses (dev only)
4. Set up log aggregation:
   - Configure log shipping to aggregation service
   - Set up log parsing and indexing
   - Create saved searches for common issues
5. Implement log rotation:
   - 100MB max per file
   - 7 days retention for production
   - 30 days retention for audit logs
6. Add sensitive data redaction:
   - Mask credit card numbers, SSNs, passwords in logs
   - Redact JWT tokens (show only first 10 chars)

tests:
- Unit: Test logger outputs valid JSON
- Integration: Test request ID propagation
- Security: Verify no sensitive data in logs

acceptance_criteria:
- All logs output as structured JSON
- Request ID present on every log line for a given request
- Log aggregation service receiving logs in real-time
- Sensitive data redacted from all log output
- Log rotation preventing disk fill
- Searchable logs by user ID, request ID, endpoint

validation:
- Trigger error → log appears in aggregation with stack trace, request ID, user ID
- Search logs by request ID → all related logs returned
- Check log files → no credit card numbers, passwords, full JWTs

notes:
- Pino is among the fastest Node.js loggers and is the recommended choice here
- Use pino-pretty for local development, JSON for production
- Consider OpenTelemetry for unified tracing + logging
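
The redaction rules in step 6 can be sketched as two string transforms applied before a message reaches the logger. Both helpers are hypothetical, and the card pattern (13 to 19 digits, optionally separated by spaces or hyphens) is an assumption that will also match other long digit runs. Pino's own `redact` option works on object paths and would complement, not replace, free-text masking like this.

```typescript
// Hypothetical redaction helpers; patterns are illustrative, not exhaustive.

// Keep only the first 10 characters of a token, as the task specifies.
function redactJwt(token: string): string {
  if (token.length <= 10) {
    return "[redacted]";
  }
  return token.slice(0, 10) + "[redacted]";
}

// 13-19 digits, each optionally preceded by one space or hyphen.
const CARD_NUMBER = /\b\d(?:[ -]?\d){12,18}\b/g;

function redactCardNumbers(text: string): string {
  return text.replace(CARD_NUMBER, "[card redacted]");
}
```

The pattern is anchored so that it ends on a digit, which keeps the separator after a card number intact in the surrounding message.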
tasks/web-production/10-error-tracking.md
@@ -0,0 +1,69 @@
# 10. Error Tracking & Alerting (Sentry Integration)

meta:
  id: web-production-10
  feature: web-production
  priority: P1
  depends_on: []
  tags: [observability, error-tracking, production]

objective:
- Implement comprehensive error tracking with Sentry to catch and alert on production errors in real-time

deliverables:
- Sentry integration for backend and frontend
- Error alerting rules
- Source maps upload for production builds
- Breadcrumbs for error context

steps:
1. Add Sentry SDK:
   - Install @sentry/node for backend
   - Install @sentry/solid or @sentry/browser for frontend
   - Configure DSN from environment variable
2. Initialize Sentry in backend:
   - Add to web/src/entry-server.tsx or Nitro plugin
   - Capture unhandled exceptions
   - Capture unhandled promise rejections
   - Attach user context (ID, email) when available
3. Initialize Sentry in frontend:
   - Add to web/src/entry-client.tsx
   - Capture JavaScript errors
   - Capture SolidJS component errors via ErrorBoundary
   - Attach release version and environment
4. Configure error alerting:
   - Slack/Discord/PagerDuty integration for P1 errors
   - Email alerts for new error types
   - Digest emails for recurring errors
   - Alert thresholds: >10 errors/minute or >1 unhandled exception
5. Upload source maps:
   - Configure Vite plugin for source map generation
   - Upload maps to Sentry during build
   - Verify error stack traces show original source
6. Add breadcrumbs:
   - Log navigation changes
   - Log API calls with response status
   - Log user actions (clicks, form submissions)

tests:
- Unit: Test Sentry capture in error scenarios
- Integration: Trigger error, verify appears in Sentry
- Alert: Verify alert fires within 1 minute of error

acceptance_criteria:
- 100% of unhandled exceptions captured in Sentry
- All errors include user context, request URL, and environment
- Source maps working → stack traces show original TypeScript
- Alert fired within 60 seconds of first occurrence
- No duplicate alerts for same error (grouping working)
- Error rate dashboard showing trends over time

validation:
- Deploy with intentional bug → error appears in Sentry within 30s
- Check alert channel → notification received
- View error detail → correct file, line number, user context

notes:
- Sentry free tier: 5k errors/month; may need paid plan for scale
- Use Sentry releases to track which deploy introduced errors
- Consider integrating with GitHub for suspect commits
tasks/web-production/11-metrics-dashboards.md
@@ -0,0 +1,70 @@
# 11. Application Metrics & Dashboards

meta:
  id: web-production-11
  feature: web-production
  priority: P2
  depends_on: []
  tags: [observability, metrics, production]

objective:
- Collect and visualize application metrics for performance monitoring and capacity planning

deliverables:
- Prometheus metrics endpoint
- Custom business metrics
- Grafana or Datadog dashboards
- Alerting on metric thresholds

steps:
1. Add metrics collection:
   - Install prom-client for Node.js metrics
   - Create web/src/server/lib/metrics.ts
   - Expose /metrics endpoint for Prometheus scraping
2. Collect standard metrics:
   - HTTP request duration (histogram)
   - HTTP request count (counter, by status code, endpoint)
   - Active connections (gauge)
   - Memory usage (gauge)
   - Event loop lag (gauge)
3. Collect business metrics:
   - Signup rate (counter)
   - Login success/failure rate (counter)
   - Subscription conversions (counter)
   - DarkWatch scan completions (counter)
   - Alert generation rate (counter)
   - Average threat score (gauge)
4. Set up dashboards:
   - Grafana dashboard or Datadog dashboard
   - Request latency percentiles (p50, p95, p99)
   - Error rate over time
   - Business funnel (landing → signup → subscribe)
   - Infrastructure health (CPU, memory, DB connections)
5. Configure alerts:
   - p99 latency > 500ms for 5 minutes
   - Error rate > 1% for 2 minutes
   - Memory usage > 80% for 10 minutes
   - DB connection pool > 90% for 5 minutes

tests:
- Unit: Test metrics increment correctly
- Integration: Verify /metrics endpoint returns valid Prometheus format
- Dashboard: Confirm all panels show data

acceptance_criteria:
- /metrics endpoint serving valid Prometheus exposition format
- Request duration histogram with 0.1, 0.5, 1, 2, 5 second buckets
- Business metrics visible in dashboard
- Alert fires when p99 latency exceeds 500ms
- Dashboard refreshes every 10 seconds with live data
- Metrics retention for 30 days

validation:
- `curl /metrics` → valid Prometheus output
- Grafana dashboard shows request latency graph
- Trigger slow endpoint → alert fires within 5 minutes

notes:
- Prometheus + Grafana is open source and cost-effective
- Datadog is easier but more expensive
- Consider using Vercel Analytics if deployed on Vercel
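
The request duration histogram in the acceptance criteria uses cumulative buckets, which is easy to get wrong when asserting on /metrics output. The class below is a hand-rolled illustration of that bucket semantics only. It is not prom-client's API, and `DurationHistogram` is a hypothetical name. In the real metrics module prom-client's histogram type would do this bookkeeping.

```typescript
// Illustration of Prometheus-style cumulative buckets; not prom-client code.
const BUCKETS_SEC: number[] = [0.1, 0.5, 1, 2, 5];

class DurationHistogram {
  private readonly bucketCounts: number[] = BUCKETS_SEC.map(() => 0);
  private count = 0; // equals the implicit +Inf bucket
  private sum = 0;

  observe(seconds: number): void {
    this.count += 1;
    this.sum += seconds;
    // Cumulative: an observation increments every bucket whose bound is >= the value.
    BUCKETS_SEC.forEach((upperBound, i) => {
      if (seconds <= upperBound) {
        this.bucketCounts[i] = (this.bucketCounts[i] ?? 0) + 1;
      }
    });
  }

  snapshot(): { buckets: Array<[number, number]>; count: number; sum: number } {
    const buckets = BUCKETS_SEC.map(
      (upperBound, i): [number, number] => [upperBound, this.bucketCounts[i] ?? 0],
    );
    return { buckets, count: this.count, sum: this.sum };
  }
}
```

With 0.5 as the closest bucket boundary, an alert on "p99 latency > 500ms" can only be estimated at bucket resolution, so adding finer buckets around the alert threshold may be worth considering.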
tasks/web-production/12-uptime-monitoring.md
@@ -0,0 +1,69 @@
|
||||
# 12. Uptime & Performance Monitoring
|
||||
|
||||
meta:
|
||||
id: web-production-12
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [observability, uptime, production]
|
||||
|
||||
objective:
|
||||
- Monitor application uptime and performance from external vantage points to ensure reliability
|
||||
|
||||
deliverables:
|
||||
- External uptime monitoring (Pingdom, UptimeRobot, or Datadog Synthetics)
|
||||
- Synthetic monitoring for critical user journeys
|
||||
- Performance budget enforcement
|
||||
- Status page for incident communication
|
||||
|
||||
steps:
|
||||
1. Set up uptime monitoring:
|
||||
- Configure checks for homepage, API health, dashboard
|
||||
- Check from multiple regions (US East, US West, EU)
|
||||
- 1-minute interval checks
|
||||
- Alert on 2 consecutive failures
|
||||
2. Implement synthetic monitoring:
|
||||
- Signup flow: homepage → signup → verify email
|
||||
- Login flow: login → dashboard → view alerts
|
||||
- Billing flow: dashboard → pricing → checkout (test mode)
|
||||
- DarkWatch flow: dashboard → darkwatch → add watchlist item
|
||||
3. Set performance budgets:
|
||||
- LCP (Largest Contentful Paint) < 2.5s mobile, < 1.5s desktop
|
||||
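- FID (First Input Delay) < 100ms; since INP (Interaction to Next Paint) replaced FID as a Core Web Vital in March 2024, also budget INP < 200ms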
|
||||
- CLS (Cumulative Layout Shift) < 0.1
|
||||
- TTFB (Time to First Byte) < 200ms
|
||||
- API response p95 < 200ms
|
||||
4. Configure alerting:
|
||||
- Downtime alert via Slack/SMS
|
||||
- Performance degradation alert (LCP > 3s)
|
||||
- SSL certificate expiry alert (30 days before)
|
||||
- Domain expiry alert (30 days before)
|
||||
5. Set up status page:
|
||||
- Use statuspage.io or instatus.com
|
||||
- Auto-update from monitoring checks
|
||||
- Subscribe users for incident notifications
|
||||
- Post incident updates and post-mortems
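Hosted monitors implement the "alert on 2 consecutive failures" rule themselves; the sketch below only illustrates the behaviour to expect, so that a single dropped check does not page anyone. `FailureTracker` is a hypothetical helper, not an API of Pingdom, UptimeRobot, or Datadog.

```typescript
// Hypothetical helper: tracks consecutive failed checks per monitored target.
class FailureTracker {
  private streaks: Record<string, number> = {};
  private alertAfter: number;

  constructor(alertAfter: number = 2) {
    this.alertAfter = alertAfter;
  }

  // Records one check result. Returns true exactly when the failure streak
  // first reaches the threshold, so one outage produces one alert.
  record(target: string, ok: boolean): boolean {
    if (ok) {
      this.streaks[target] = 0;
      return false;
    }
    const streak = (this.streaks[target] || 0) + 1;
    this.streaks[target] = streak;
    return streak === this.alertAfter;
  }
}
```

With 1-minute checks, this fires on the second failed check, which is consistent with receiving an alert roughly two minutes into an outage.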
|
||||
|
||||
tests:
|
||||
- Integration: Verify monitoring catches simulated outage
|
||||
- Performance: Confirm synthetic tests complete successfully
|
||||
- Alert: Test alert channels with deliberate failure
|
||||
|
||||
acceptance_criteria:
|
||||
- Uptime monitoring checking every 60 seconds from 3+ regions
|
||||
- 99.9% uptime SLA measured over 30 days
|
||||
- Synthetic tests covering signup, login, and core flows
|
||||
- Performance budget alerts for LCP > 2.5s
|
||||
- Status page accessible and auto-updating
|
||||
- SSL certificate expiry alert 30 days in advance
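To make the 99.9% target concrete: 30 days is 43,200 minutes, so a 99.9% target leaves about 43.2 minutes of downtime per window. A minimal sketch of that arithmetic follows; the function names are hypothetical.

```typescript
// Allowed downtime, in minutes, for an availability target over a window.
function downtimeBudgetMinutes(slaPercent: number, windowDays: number): number {
  const windowMinutes = windowDays * 24 * 60;
  return (windowMinutes * (100 - slaPercent)) / 100;
}

// Availability actually achieved, given observed downtime in the same window.
function achievedUptimePercent(downtimeMinutes: number, windowDays: number): number {
  const windowMinutes = windowDays * 24 * 60;
  return (100 * (windowMinutes - downtimeMinutes)) / windowMinutes;
}
```

Comparing `achievedUptimePercent` against the target each month gives a simple pass/fail for this criterion.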
|
||||
|
||||
validation:
|
||||
- Simulate outage → alert received within 2 minutes
|
||||
- Check status page → shows incident with timeline
|
||||
- Run synthetic test → completes in <30 seconds
|
||||
- Lighthouse CI shows all metrics within budget
|
||||
|
||||
notes:
|
||||
- UptimeRobot free tier: 50 monitors, 5-minute intervals (the 1-minute checks above need a paid plan)
|
||||
- Pingdom more reliable but paid
|
||||
- Consider using Checkly for synthetic monitoring with JS
|
||||
72
tasks/web-production/13-github-actions-ci.md
Normal file
@@ -0,0 +1,72 @@
|
||||
# 13. GitHub Actions CI Pipeline
|
||||
|
||||
meta:
|
||||
id: web-production-13
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: [web-production-17, web-production-18, web-production-19, web-production-20]
|
||||
tags: [cicd, automation, production]
|
||||
|
||||
objective:
|
||||
- Build a comprehensive CI pipeline that runs tests, linting, type checking, and security scans on every pull request
|
||||
|
||||
deliverables:
|
||||
- GitHub Actions workflow files
|
||||
- PR checks for web and browser-ext
|
||||
- Test reporting and coverage
|
||||
- Dependency vulnerability scanning
|
||||
|
||||
steps:
|
||||
1. Create .github/workflows/ci.yml:
|
||||
- Trigger on pull_request and push to main
|
||||
- Set up Node.js 22 with pnpm
|
||||
- Install dependencies with frozen lockfile
|
||||
2. Add job: lint-and-typecheck:
|
||||
- Run `pnpm lint` (tsc --noEmit)
|
||||
- Run `pnpm lint:ext`
|
||||
- Fail on any TypeScript errors
|
||||
3. Add job: test:
|
||||
- Run `pnpm test` (vitest for web)
|
||||
- Run `pnpm test:ext` (vitest for browser-ext)
|
||||
- Generate coverage reports with @vitest/coverage-v8
|
||||
- Upload coverage to Codecov or similar
|
||||
4. Add job: build:
|
||||
- Run `pnpm build` for web
|
||||
- Run `pnpm build:ext` for browser-ext
|
||||
- Verify build artifacts exist
|
||||
5. Add job: security-scan:
|
||||
- Run `pnpm audit` with --audit-level=high
|
||||
- Post suggested fixes for audit findings as a PR comment
|
||||
- Add OWASP Dependency-Check
|
||||
6. Add job: docker-build:
|
||||
- Build scheduler Dockerfile
|
||||
- Verify Docker image builds successfully
|
||||
7. Configure branch protection:
|
||||
- Require all checks to pass before merge
|
||||
- Require 1 reviewer approval
|
||||
- Require up-to-date branch before merge
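The `--audit-level=high` setting in the security-scan job means findings below that severity are reported but do not fail the job. A minimal sketch of that gate, assuming a hypothetical `SeverityCounts` summary (the real `pnpm audit --json` output has a different shape and would need mapping into it):

```typescript
// Severities in ascending order, as used by npm and pnpm audits.
const SEVERITIES = ["low", "moderate", "high", "critical"] as const;
type Severity = (typeof SEVERITIES)[number];

// Hypothetical summary shape introduced for this example.
type SeverityCounts = Record<Severity, number>;

// True when any finding is at or above the minimum blocking severity.
function shouldBlockMerge(counts: SeverityCounts, minLevel: Severity): boolean {
  const start = SEVERITIES.indexOf(minLevel);
  for (let i = start; i < SEVERITIES.length; i++) {
    if (counts[SEVERITIES[i]] > 0) return true;
  }
  return false;
}
```

Keeping the threshold as a parameter makes it easy to tighten the gate to `moderate` later without changing the job structure.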
|
||||
|
||||
tests:
|
||||
- Integration: Create test PR, verify all checks run
|
||||
- Security: Introduce vulnerable dependency, verify scan catches it
|
||||
- Build: Verify build artifacts are created
|
||||
|
||||
acceptance_criteria:
|
||||
- All PRs trigger CI pipeline automatically
|
||||
- Lint, typecheck, test, build, and security jobs run in parallel
|
||||
- Failing tests block PR merge
|
||||
- Coverage report uploaded for every PR
|
||||
- Security vulnerabilities (high+) block PR merge
|
||||
- Docker build verified on every PR
|
||||
- Pipeline completes in <10 minutes
|
||||
|
||||
validation:
|
||||
- Open test PR → all checks green
|
||||
- Introduce TypeScript error → lint job fails
|
||||
- Add vulnerable package → security scan fails
|
||||
- Check Codecov → coverage diff visible in PR
|
||||
|
||||
notes:
|
||||
- Use pnpm/action-setup for proper pnpm installation
|
||||
- Cache the pnpm store between runs for speed (for example via `cache: pnpm` in actions/setup-node) rather than node_modules
|
||||
- Consider using GitHub Actions matrix for multiple Node versions
|
||||
75
tasks/web-production/14-deployment-pipeline.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# 14. Automated Deployment Pipeline
|
||||
|
||||
meta:
|
||||
id: web-production-14
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: [web-production-13, web-production-15, web-production-16]
|
||||
tags: [cicd, deployment, production]
|
||||
|
||||
objective:
|
||||
- Build automated deployment pipelines for staging and production environments with rollback capability
|
||||
|
||||
deliverables:
|
||||
- Staging deployment on merge to main
|
||||
- Production deployment with manual approval
|
||||
- Database migration automation
|
||||
- Rollback strategy
|
||||
|
||||
steps:
|
||||
1. Create .github/workflows/deploy-staging.yml:
|
||||
- Trigger on push to main
|
||||
- Build web application
|
||||
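- Run database migrations (`drizzle-kit migrate`, which applies the generated migration files; `drizzle-kit push` syncs the schema directly and leaves no migration file to roll back)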
|
||||
- Deploy to staging environment (Vercel, Railway, or VPS)
|
||||
- Run smoke tests against staging
|
||||
2. Create .github/workflows/deploy-production.yml:
|
||||
- Trigger on release published or manual dispatch
|
||||
- Require manual approval from 1 team member
|
||||
- Build and tag Docker image
|
||||
- Run database migrations in dry-run first
|
||||
- Deploy to production with blue-green or rolling strategy
|
||||
- Run post-deploy smoke tests
|
||||
3. Implement database migration safety:
|
||||
- Migrations run before app deployment
|
||||
- Backward-compatible migrations only (add columns, don't drop)
|
||||
- Migration rollback script for each migration
|
||||
- Database backup before production migration
|
||||
4. Add deployment notifications:
|
||||
- Slack notification on deploy start, success, failure
|
||||
- Include commit SHA, author, and changelog
|
||||
5. Implement rollback:
|
||||
- One-click rollback to previous release
|
||||
- Database migration rollback (if safe)
|
||||
- CDN cache purge on rollback
|
||||
6. Add smoke tests:
|
||||
- Test homepage loads
|
||||
- Test login API responds
|
||||
- Test health endpoint
|
||||
- Test critical user journey with Playwright
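The "backward-compatible migrations only" rule in step 3 can be partly enforced by a pre-deploy check on the migration SQL. The sketch below is a heuristic under stated assumptions: the pattern list is illustrative rather than exhaustive, statements are split naively on `;`, and `findDestructiveStatements` is a hypothetical name.

```typescript
// Heuristic patterns for statements that can break the previous app version.
// Illustrative only; this is not a SQL parser.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\bDROP\s+(TABLE|COLUMN|INDEX|SCHEMA)\b/i,
  /\bTRUNCATE\b/i,
  /\bRENAME\s+(TO|COLUMN)\b/i,
  /\bALTER\s+COLUMN\b[^;]*\b(TYPE|SET\s+NOT\s+NULL)\b/i,
];

// Returns the statements in a migration that look non-additive.
// Splitting on ";" ignores semicolons inside string literals.
function findDestructiveStatements(sql: string): string[] {
  return sql
    .split(";")
    .map((stmt) => stmt.trim())
    .filter((stmt) => stmt.length > 0)
    .filter((stmt) => DESTRUCTIVE_PATTERNS.some((p) => p.test(stmt)));
}
```

A deploy job could fail, or require an explicit override, whenever this returns a non-empty list.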
|
||||
|
||||
tests:
|
||||
- Integration: Deploy to staging, verify app functional
|
||||
- Rollback: Trigger rollback, verify previous version restored
|
||||
- Migration: Test that a failed migration aborts the deployment and leaves the running version untouched
|
||||
|
||||
acceptance_criteria:
|
||||
- Every merge to main auto-deploys to staging
|
||||
- Production deploy requires manual approval
|
||||
- Database migrations run automatically before app start
|
||||
- Rollback completes in <5 minutes
|
||||
- Smoke tests pass before marking deploy successful
|
||||
- Deployment notifications sent to Slack
|
||||
- Zero-downtime deployment for web app
|
||||
|
||||
validation:
|
||||
- Merge PR → staging deploys automatically within 5 minutes
|
||||
- Trigger production deploy → approval gate shown
|
||||
- Approve → production deploys, smoke tests pass
|
||||
- Introduce bug → rollback to previous version in <5 minutes
|
||||
|
||||
notes:
|
||||
- Vercel offers automatic preview deployments per PR
|
||||
- For VPS deployment, use Docker Compose with rolling restart
|
||||
- Consider using GitHub Environments for approval gates
|
||||
- Database migrations should be additive-only in production
|
||||
75
tasks/web-production/15-docker-infra.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# 15. Docker & Infrastructure Optimization
|
||||
|
||||
meta:
|
||||
id: web-production-15
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [infrastructure, docker, production]
|
||||
|
||||
objective:
|
||||
- Optimize Docker images and infrastructure for production deployment with security and efficiency
|
||||
|
||||
deliverables:
|
||||
- Multi-stage optimized Dockerfile for web app
|
||||
- Docker Compose for local production simulation
|
||||
- Infrastructure as Code (Terraform or Pulumi)
|
||||
- Security scanning for Docker images
|
||||
|
||||
steps:
|
||||
1. Create optimized Dockerfile for web app:
|
||||
- Multi-stage build (deps → build → runtime)
|
||||
- Use node:22-alpine for minimal image size
|
||||
- Run as non-root user
|
||||
- Copy only necessary files to runtime stage
|
||||
- Health check in Dockerfile
|
||||
2. Optimize scheduler Dockerfile:
|
||||
- Reduce image size (currently copies many files)
|
||||
- Use .dockerignore to exclude unnecessary files
|
||||
- Pin base image versions
|
||||
3. Create docker-compose.prod.yml:
|
||||
- Web app service with replicas
|
||||
- Redis service with persistence
|
||||
- PostgreSQL service (or external)
|
||||
- Nginx reverse proxy with SSL termination
|
||||
- Watchtower for automatic updates
|
||||
4. Add security scanning:
|
||||
- Trivy or Snyk scan in CI pipeline
|
||||
- Fail build on CRITICAL vulnerabilities
|
||||
- Weekly automated scan of production images
|
||||
5. Implement Infrastructure as Code:
|
||||
- Terraform configuration for AWS/GCP/Vultr
|
||||
- VPC, subnets, security groups
|
||||
- ECS/Fargate or Kubernetes deployment
|
||||
- Load balancer with SSL
|
||||
- RDS/Cloud SQL for PostgreSQL
|
||||
- ElastiCache/Memorystore for Redis
|
||||
6. Add environment-specific configs:
|
||||
- Production nginx.conf with rate limiting
|
||||
- SSL certificate management (Let's Encrypt)
|
||||
- Firewall rules
|
||||
|
||||
tests:
|
||||
- Integration: Build image, verify size <200MB
|
||||
- Security: Trivy scan shows no CRITICAL vulnerabilities
|
||||
- Deploy: Terraform apply creates infrastructure
|
||||
|
||||
acceptance_criteria:
|
||||
- Web Docker image <200MB compressed
|
||||
- Scheduler Docker image <150MB compressed
|
||||
- No CRITICAL vulnerabilities in image scans
|
||||
- docker-compose.prod.yml runs full stack locally
|
||||
- Terraform creates reproducible infrastructure
|
||||
- Nginx reverse proxy with SSL and rate limiting
|
||||
- Non-root user running containers
|
||||
|
||||
validation:
|
||||
- `docker images` → web image <200MB
|
||||
- `trivy image kordant-web` → no CRITICAL
|
||||
- `docker compose -f docker-compose.prod.yml up` → full stack running
|
||||
- `terraform plan` → no unexpected changes
|
||||
|
||||
notes:
|
||||
- Current scheduler/Dockerfile copies many source files — optimize with .dockerignore
|
||||
- Consider using distroless images for even smaller footprint
|
||||
- Use AWS Fargate or Google Cloud Run for serverless containers
|
||||
75
tasks/web-production/16-env-secrets.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# 16. Environment Management & Secrets Rotation
|
||||
|
||||
meta:
|
||||
id: web-production-16
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: []
|
||||
tags: [security, infrastructure, production]
|
||||
|
||||
objective:
|
||||
- Implement secure environment variable management and automated secrets rotation
|
||||
|
||||
deliverables:
|
||||
- Environment variable validation on startup
|
||||
- Secrets manager integration (AWS Secrets Manager, Doppler, or 1Password)
|
||||
- Automated secrets rotation
|
||||
- Environment documentation
|
||||
|
||||
steps:
|
||||
1. Create environment validation:
|
||||
- Create web/src/server/lib/env.ts with Zod/Valibot schema
|
||||
- Validate all required env vars on server startup
|
||||
- Fail fast with clear error messages for missing vars
|
||||
- Type-safe env access throughout codebase
|
||||
2. Migrate to secrets manager:
|
||||
- Set up Doppler or AWS Secrets Manager
|
||||
- Move DATABASE_URL, JWT_SECRET, STRIPE_SECRET_KEY, CLERK_SECRET_KEY to secrets manager
|
||||
- Remove secrets from .env files in production
|
||||
- Use short-lived tokens where possible
|
||||
3. Implement secrets rotation:
|
||||
- JWT secret: rotate quarterly
|
||||
- Database credentials: rotate monthly
|
||||
- Stripe keys: rotate after any suspected leak
|
||||
- API keys: rotate every 6 months
|
||||
- Automated rotation scripts
|
||||
4. Add environment documentation:
|
||||
- Document all environment variables in docs/ENVIRONMENT.md
|
||||
- Mark required vs optional
|
||||
- Include examples and validation rules
|
||||
- Document secrets rotation schedule
|
||||
5. Secure local development:
|
||||
- .env.example with dummy values
|
||||
- .env.local in .gitignore
|
||||
- Pre-commit hook to prevent secret commits
|
||||
- Use 1Password CLI or Doppler CLI for local secrets
|
||||
6. Audit existing secrets:
|
||||
- Scan git history for leaked secrets (git-secrets, truffleHog)
|
||||
- Rotate any potentially leaked secrets
|
||||
- Enable GitHub secret scanning
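Step 1 calls for a Zod or Valibot schema in web/src/server/lib/env.ts. As a dependency-free stand-in that shows the same fail-fast behaviour, the sketch below validates a raw environment map and returns a typed object. The `parseEnv` name, the optional `LOG_LEVEL` variable, and its default are assumptions made for illustration.

```typescript
// Required variables named in step 2 above.
const REQUIRED = ["DATABASE_URL", "JWT_SECRET", "STRIPE_SECRET_KEY", "CLERK_SECRET_KEY"] as const;

type RequiredKey = (typeof REQUIRED)[number];
// LOG_LEVEL is a hypothetical optional variable with a default.
type Env = Record<RequiredKey, string> & { LOG_LEVEL: string };

// Validates the raw map and returns a typed object, or throws one error
// listing every missing variable so a failed deploy shows the full picture.
function parseEnv(raw: Record<string, string | undefined>): Env {
  const missing = REQUIRED.filter((key) => {
    const value = raw[key];
    return value === undefined || value.trim() === "";
  });
  if (missing.length > 0) {
    throw new Error("Missing required environment variables: " + missing.join(", "));
  }
  const env = { LOG_LEVEL: raw["LOG_LEVEL"] || "info" } as Env;
  for (const key of REQUIRED) env[key] = raw[key] as string;
  return env;
}
```

Calling `parseEnv(process.env)` once at server startup and importing the result elsewhere gives type-safe access without reading `process.env` throughout the codebase.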
|
||||
|
||||
tests:
|
||||
- Unit: Test env validation catches missing vars
|
||||
- Security: Verify no secrets in codebase with scanner
|
||||
- Integration: Test secrets manager integration
|
||||
|
||||
acceptance_criteria:
|
||||
- Server fails to start with clear error if required env var missing
|
||||
- Zero secrets in codebase or git history
|
||||
- All production secrets stored in secrets manager
|
||||
- Rotation schedule documented and automated
|
||||
- Environment documentation complete and accurate
|
||||
- GitHub secret scanning enabled
|
||||
- Pre-commit hooks preventing secret commits
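A rotation schedule is only "automated" if something checks it. The sketch below flags secrets that are past their interval; a scheduled job could run it and trigger the rotation scripts. The secret names, the 90 and 30 day approximations of "quarterly" and "monthly", and the `secretsDueForRotation` function are assumptions for illustration.

```typescript
// Rotation intervals in days, approximating the schedule in step 3.
// Stripe keys rotate on suspicion of a leak, so they have no fixed interval.
const ROTATION_DAYS: Record<string, number> = {
  JWT_SECRET: 90,
  DATABASE_CREDENTIALS: 30,
  API_KEYS: 180,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Lists secrets whose last rotation is at least one interval old.
function secretsDueForRotation(
  lastRotated: Record<string, Date | undefined>,
  now: Date
): string[] {
  return Object.keys(ROTATION_DAYS).filter((name) => {
    const last = lastRotated[name];
    if (last === undefined) return true; // never rotated: treat as due
    const ageDays = (now.getTime() - last.getTime()) / MS_PER_DAY;
    return ageDays >= ROTATION_DAYS[name];
  });
}
```

Treating a missing rotation date as "due" errs on the safe side for secrets that predate the schedule.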
|
||||
|
||||
validation:
|
||||
- Remove DATABASE_URL → server exits with clear error
|
||||
- Run truffleHog → no secrets found in history
|
||||
- Check secrets manager → all production secrets stored
|
||||
- Run rotation script → new JWT secret generated, app continues working
|
||||
|
||||
notes:
|
||||
- Doppler is excellent for team secret management
|
||||
- AWS Secrets Manager integrates well with ECS/Fargate
|
||||
- Never commit .env files — use .env.example only
|
||||
- Consider using sealed secrets for Kubernetes
|
||||
73
tasks/web-production/17-e2e-testing.md
Normal file
@@ -0,0 +1,73 @@
|
||||
# 17. End-to-End Testing (Playwright)
|
||||
|
||||
meta:
|
||||
id: web-production-17
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: []
|
||||
tags: [testing, e2e, quality]
|
||||
|
||||
objective:
|
||||
- Implement comprehensive end-to-end tests covering critical user journeys using Playwright
|
||||
|
||||
deliverables:
|
||||
- Playwright test suite for critical flows
|
||||
- Test database seeding and cleanup
|
||||
- Visual regression testing setup
|
||||
- CI integration for E2E tests
|
||||
|
||||
steps:
|
||||
1. Install and configure Playwright:
|
||||
- Install @playwright/test in web/package.json
|
||||
- Create playwright.config.ts with project settings
|
||||
- Configure test database (separate from dev)
|
||||
2. Create test utilities:
|
||||
- Test user creation helper
|
||||
- Database reset between tests
|
||||
- Authentication state management
|
||||
- API mocking helpers
|
||||
3. Write critical path tests:
|
||||
- Landing page → Signup → Onboarding → Dashboard
|
||||
- Login → Dashboard → DarkWatch → Add watchlist item
|
||||
- Login → Settings → Update profile
|
||||
- Login → Billing → View pricing → Checkout (test mode)
|
||||
- Admin login → Blog → Create post → Publish
|
||||
- Real-time alerts: WebSocket connection and alert display
|
||||
4. Add visual regression tests:
|
||||
- Screenshot comparison for landing page
|
||||
- Screenshot comparison for dashboard
|
||||
- Screenshot comparison for mobile responsive layout
|
||||
5. Configure test data:
|
||||
- Seed test database with known data
|
||||
- Use test Stripe keys for billing tests
|
||||
- Mock external APIs (Twilio, FCM) in tests
|
||||
6. Add CI integration:
|
||||
- Run E2E tests on PR (not blocking initially)
|
||||
- Upload test artifacts (screenshots, videos)
|
||||
- Parallel test execution across browsers
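A possible starting point for playwright.config.ts that covers the cross-browser, mobile viewport, and artifact requirements above. This is a configuration sketch, not the project's actual config: the `./e2e` directory, the `E2E_BASE_URL` variable, and the localhost port are assumptions.

```typescript
// playwright.config.ts (sketch)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e",               // assumed location of the test files
  fullyParallel: true,            // run tests in parallel to stay within the time budget
  retries: process.env.CI ? 2 : 0, // retry only in CI, where flakiness costs a rerun
  reporter: "html",
  use: {
    baseURL: process.env.E2E_BASE_URL ?? "http://localhost:3000", // assumed variable and port
    trace: "on-first-retry",       // traces for the test report
    screenshot: "only-on-failure", // artifacts uploaded on failure
    video: "retain-on-failure",
  },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
    { name: "mobile-chrome", use: { ...devices["Pixel 5"] } },
  ],
});
```

Each entry in `projects` runs the whole suite once, so the mobile project doubles as the mobile viewport coverage.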
|
||||
|
||||
tests:
|
||||
- E2E: All critical paths pass in CI
|
||||
- Visual: Screenshot diffs reviewed and approved
|
||||
- Cross-browser: Tests pass on Chromium, Firefox, WebKit
|
||||
|
||||
acceptance_criteria:
|
||||
- 10+ E2E tests covering critical user journeys
|
||||
- Tests run in <5 minutes with parallel execution
|
||||
- Tests pass on Chromium, Firefox, and WebKit
|
||||
- Visual regression catching UI changes
|
||||
- Test artifacts (screenshots, videos) uploaded on failure
|
||||
- Tests use isolated test database
|
||||
- Mobile viewport tests included
|
||||
|
||||
validation:
|
||||
- `npx playwright test` → all tests pass
|
||||
- CI pipeline runs E2E tests on PR
|
||||
- Change button color → visual regression test fails
|
||||
- Check test report → screenshots and traces available
|
||||
|
||||
notes:
|
||||
- Playwright runs test files in parallel by default and drives Chromium, Firefox, and WebKit, which is why it is preferred over Cypress here
|
||||
- Use test database to avoid polluting dev data
|
||||
- Start with 5 critical paths, expand over time
|
||||
- Consider using MSW for API mocking in tests
|
||||
78
tasks/web-production/18-load-testing.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# 18. Load & Stress Testing
|
||||
|
||||
meta:
|
||||
id: web-production-18
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [testing, performance, production]
|
||||
|
||||
objective:
|
||||
- Validate application performance under production-like load and identify bottlenecks
|
||||
|
||||
deliverables:
|
||||
- Load test suite with k6 or Artillery
|
||||
- Performance baseline documentation
|
||||
- Bottleneck identification report
|
||||
- Scaling recommendations
|
||||
|
||||
steps:
|
||||
1. Set up load testing tool:
|
||||
- Install k6 or Artillery
|
||||
- Create tests/ directory for load tests
|
||||
- Configure test environment (staging)
|
||||
2. Write load tests for critical endpoints:
|
||||
- GET / (landing page)
|
||||
- POST /api/trpc/user.login
|
||||
- GET /api/trpc/user.me (authenticated)
|
||||
- GET /api/trpc/darkwatch.getExposures
|
||||
- GET /api/trpc/alerts.getAlerts
|
||||
- WebSocket connection and alert subscription
|
||||
3. Define load scenarios:
|
||||
- Baseline: 100 concurrent users, 5 minutes
|
||||
- Target: 1000 concurrent users, 10 minutes
|
||||
- Stress: 5000 concurrent users, 5 minutes
|
||||
- Spike: 0 to 2000 users in 10 seconds
|
||||
4. Measure and record:
|
||||
- Response time percentiles (p50, p95, p99)
|
||||
- Error rate
|
||||
- Requests per second (throughput)
|
||||
- CPU and memory usage on server
|
||||
- Database connection pool utilization
|
||||
- Redis memory usage
|
||||
5. Identify bottlenecks:
|
||||
- Slow queries from database
|
||||
- Memory leaks
|
||||
- Connection pool exhaustion
|
||||
- CPU-bound operations
|
||||
6. Document scaling recommendations:
|
||||
- Horizontal scaling (more instances)
|
||||
- Vertical scaling (bigger instances)
|
||||
- Caching improvements
|
||||
- Query optimization
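Load tools such as k6 report percentiles themselves, but it helps to know what p50, p95, and p99 mean when reading a report. The sketch below uses the nearest-rank definition; other tools may interpolate and give slightly different values. The function names are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Summarizes response times (in ms) into the percentiles recorded in step 4.
function summarize(samplesMs: number[]): LatencySummary {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

Percentiles are preferred over averages here because a handful of very slow requests barely moves the mean but shows up clearly at p99.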
|
||||
|
||||
tests:
|
||||
- Load: Baseline test passes with <200ms p95
|
||||
- Stress: App remains functional under 5x target load (5000 concurrent users)
|
||||
- Spike: App recovers within 30 seconds after spike
|
||||
|
||||
acceptance_criteria:
|
||||
- Baseline load (100 concurrent) → p95 < 200ms, 0% errors
|
||||
- Target load (1000 concurrent) → p95 < 500ms, <1% errors
|
||||
- Stress load (5000 concurrent) → no crashes, <5% errors
|
||||
- Spike test → recovery within 30 seconds
|
||||
- Performance baseline documented with metrics
|
||||
- Bottleneck report with actionable recommendations
|
||||
- Scaling plan documented
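The first three criteria above can be checked mechanically after each run. A minimal sketch, assuming a hypothetical `ScenarioResult` summary produced from the load tool's output; the thresholds are transcribed from the list above, and "no crashes" for the stress scenario still needs a separate check.

```typescript
type Scenario = "baseline" | "target" | "stress";

// Hypothetical summary of one load test run.
interface ScenarioResult {
  p95Ms: number;
  errorRatePercent: number;
}

interface Threshold {
  maxP95Ms: number | null;     // exclusive upper bound; null means no latency bound
  maxErrorRatePercent: number; // exclusive upper bound; 0 means no errors allowed
}

const THRESHOLDS: Record<Scenario, Threshold> = {
  baseline: { maxP95Ms: 200, maxErrorRatePercent: 0 },
  target: { maxP95Ms: 500, maxErrorRatePercent: 1 },
  stress: { maxP95Ms: null, maxErrorRatePercent: 5 },
};

// Returns a list of human-readable failures; an empty list means the run passed.
function checkScenario(scenario: Scenario, result: ScenarioResult): string[] {
  const t = THRESHOLDS[scenario];
  const failures: string[] = [];
  if (t.maxP95Ms !== null && !(result.p95Ms < t.maxP95Ms)) {
    failures.push(scenario + ": p95 " + result.p95Ms + "ms is not below " + t.maxP95Ms + "ms");
  }
  const errorsOk =
    t.maxErrorRatePercent === 0
      ? result.errorRatePercent === 0
      : result.errorRatePercent < t.maxErrorRatePercent;
  if (!errorsOk) {
    failures.push(scenario + ": error rate " + result.errorRatePercent + "% exceeds the limit");
  }
  return failures;
}
```

Returning every failure rather than stopping at the first makes the bottleneck report easier to assemble.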
|
||||
|
||||
validation:
|
||||
- Run k6 against staging → results within acceptable thresholds
|
||||
- Check server metrics during test → CPU <80%, memory <80%
|
||||
- Database connections → pool not exhausted
|
||||
- Review report → identified 3+ bottlenecks with fixes
|
||||
|
||||
notes:
|
||||
- Always test against staging, never production
|
||||
- Schedule load tests during low-traffic periods
|
||||
- Use k6 Cloud for distributed load testing if needed
|
||||
- Consider using Vercel Analytics for real-user monitoring (RUM)
|
||||
78
tasks/web-production/19-accessibility-audit.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# 19. Accessibility Audit & WCAG Compliance
|
||||
|
||||
meta:
|
||||
id: web-production-19
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [testing, accessibility, compliance]
|
||||
|
||||
objective:
|
||||
- Ensure the web application meets WCAG 2.1 AA standards and is usable by people with disabilities
|
||||
|
||||
deliverables:
|
||||
- Automated accessibility testing with axe-core
|
||||
- Manual keyboard navigation audit
|
||||
- Screen reader testing
|
||||
- Accessibility statement page
|
||||
|
||||
steps:
|
||||
1. Set up automated accessibility testing:
|
||||
- Install axe-core with vitest-axe (@axe-core/react is React-only and jest-axe targets Jest; the app uses SolidJS and Vitest)
|
||||
- Add accessibility tests to component test suite
|
||||
- Integrate axe-core with Playwright E2E tests
|
||||
- Fail build on critical accessibility violations
|
||||
2. Run automated audit:
|
||||
- Test all pages: landing, auth, dashboard, settings
|
||||
- Check for: missing alt text, low contrast, missing labels, focus issues
|
||||
- Generate report with violation severity
|
||||
3. Manual keyboard navigation audit:
|
||||
- Navigate entire app using only Tab, Enter, Space, Escape
|
||||
- Verify focus indicators visible on all interactive elements
|
||||
- Test skip links and logical tab order
|
||||
- Verify no keyboard traps
|
||||
4. Screen reader testing:
|
||||
- Test with NVDA (Windows) or VoiceOver (macOS)
|
||||
- Verify all interactive elements have accessible names
|
||||
- Test live regions for dynamic content (alerts, toasts)
|
||||
- Verify form error messages announced
|
||||
5. Fix critical issues:
|
||||
- Add missing aria-labels and aria-describedby
|
||||
- Fix color contrast ratios (minimum 4.5:1 for normal text, 3:1 for large text)
|
||||
- Ensure all images have alt text
|
||||
- Add proper heading hierarchy (h1 → h2 → h3)
|
||||
6. Create accessibility statement:
|
||||
- Page at /accessibility
|
||||
- Commitment to WCAG 2.1 AA
|
||||
- Known limitations
|
||||
- Contact for accessibility feedback
|
||||
7. Add accessibility CI check:
|
||||
- Lighthouse accessibility audit ≥ 95
|
||||
- axe-core scan in CI pipeline
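The 4.5:1 figure in step 5 comes from the WCAG 2.1 contrast ratio, (L1 + 0.05) / (L2 + 0.05), where L1 and L2 are the relative luminances of the lighter and darker colors. axe-core checks this automatically; the sketch below is only meant for spot-checking design tokens by hand. It accepts `#rrggbb` colors only, and the function names are hypothetical.

```typescript
// Parses "#rrggbb" into [r, g, b] channels in the range 0..255.
function parseHex(hex: string): [number, number, number] {
  const m = /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (m === null) throw new Error("expected #rrggbb, got " + hex);
  return [parseInt(m[1], 16), parseInt(m[2], 16), parseInt(m[3], 16)];
}

// WCAG 2.1 relative luminance of an sRGB color.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex).map((channel) => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA: 4.5:1 for normal text, 3:1 for large text.
function meetsAA(foreground: string, background: string, largeText: boolean = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

Gray `#767676` on white sits just above the 4.5:1 line, which makes it a handy reference point when choosing muted text colors.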
|
||||
|
||||
tests:
|
||||
- Automated: axe-core scan passes with 0 critical or serious violations
|
||||
- Manual: Keyboard navigation completes all flows
|
||||
- Screen reader: All critical paths navigable
|
||||
|
||||
acceptance_criteria:
|
||||
- WCAG 2.1 AA compliance on all pages
|
||||
- Lighthouse accessibility score ≥ 95
|
||||
- 0 critical or serious axe-core violations
|
||||
- All interactive elements keyboard accessible
|
||||
- Focus indicators visible and logical
|
||||
- All images have descriptive alt text
|
||||
- Color contrast ratios ≥ 4.5:1 for normal text
|
||||
- Accessibility statement page live
|
||||
|
||||
validation:
|
||||
- Run axe-core → 0 critical/serious violations
|
||||
- Lighthouse CI → Accessibility score ≥ 95
|
||||
- Navigate with keyboard only → complete signup flow
|
||||
- Screen reader test → all elements announced correctly
|
||||
|
||||
notes:
|
||||
- Current app has some accessibility features (skip link, aria-live) but needs audit
|
||||
- SolidJS components need proper aria attributes
|
||||
- Consider Kobalte or Ark UI for accessible primitives; Radix UI is React-only and the app uses SolidJS
|
||||
- Test with actual assistive technology, not just automated tools
|
||||
71
tasks/web-production/20-dependency-scanning.md
Normal file
@@ -0,0 +1,71 @@
|
||||
# 20. Dependency Vulnerability Scanning
|
||||
|
||||
meta:
|
||||
id: web-production-20
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: []
|
||||
tags: [security, dependencies, production]
|
||||
|
||||
objective:
|
||||
- Implement continuous dependency vulnerability scanning and automated updates
|
||||
|
||||
deliverables:
|
||||
- `pnpm audit` integration in CI
|
||||
- Snyk or Dependabot monitoring
|
||||
- Automated security patch PRs
|
||||
- SBOM (Software Bill of Materials) generation
|
||||
|
||||
steps:
|
||||
1. Set up automated scanning:
|
||||
- Enable Dependabot alerts in GitHub repository settings
|
||||
- Configure Dependabot version updates (weekly)
|
||||
- Add Snyk integration for deeper analysis
|
||||
- Configure Snyk to fail builds on high+ severity
|
||||
2. Add CI scanning:
|
||||
- `pnpm audit --audit-level=high` in GitHub Actions
|
||||
- `snyk test` in CI pipeline
|
||||
- Block PR merge on high/critical vulnerabilities
|
||||
3. Implement automated patching:
|
||||
- Dependabot auto-PR for patch updates
|
||||
- Snyk auto-fix PRs for fixable vulnerabilities
|
||||
- Manual review required for major version updates
|
||||
4. Generate SBOM:
|
||||
- Use cyclonedx or spdx-sbom-generator
|
||||
- Generate on every release
|
||||
- Store with release artifacts
|
||||
5. Audit current dependencies:
|
||||
- Run `pnpm audit` and fix all high/critical issues
|
||||
- Check for unmaintained packages
|
||||
- Review direct dependencies for necessity
|
||||
- Remove unused dependencies
|
||||
6. Set up alerting:
|
||||
- Slack notification for new vulnerabilities
|
||||
- Weekly vulnerability report
|
||||
- Emergency alert for critical CVEs
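Step 3 distinguishes patch updates, which can be automated, from major updates, which need manual review. Dependabot and Snyk classify updates themselves; the sketch below only illustrates that policy. The function names are hypothetical, and treating minor updates as review-required is an assumption, since the steps above do not say.

```typescript
type UpdateKind = "major" | "minor" | "patch" | "none";

// Parses "MAJOR.MINOR.PATCH", ignoring any pre-release or build suffix.
function parseVersion(version: string): [number, number, number] {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (m === null) throw new Error("not a semver version: " + version);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Classifies an update by the most significant component that changed.
function classifyUpdate(from: string, to: string): UpdateKind {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (b[0] !== a[0]) return "major";
  if (b[1] !== a[1]) return "minor";
  if (b[2] !== a[2]) return "patch";
  return "none";
}

// Patch updates may be automated; major (and, by assumption, minor)
// updates go to a human reviewer.
function requiresManualReview(from: string, to: string): boolean {
  const kind = classifyUpdate(from, to);
  return kind === "major" || kind === "minor";
}
```

This ignores downgrades and pre-release ordering, which is acceptable for a review-routing policy but not for a general semver comparison.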
|
||||
|
||||
tests:
|
||||
- Security: Introduce vulnerable package → CI blocks merge
|
||||
- Integration: Verify Dependabot creates PR for outdated package
|
||||
- Audit: SBOM generated and contains all dependencies
|
||||
|
||||
acceptance_criteria:
|
||||
- Zero high or critical vulnerabilities in dependencies
|
||||
- Dependabot monitoring all dependencies
|
||||
- CI fails on high+ severity vulnerabilities
|
||||
- SBOM generated for every release
|
||||
- Automated PRs for security patches within 24 hours
|
||||
- Weekly dependency update report
|
||||
- All unused dependencies removed
|
||||
|
||||
validation:
|
||||
- `pnpm audit` → 0 high/critical findings
|
||||
- Check GitHub Security tab → no open alerts
|
||||
- Open PR that adds a vulnerable package → CI fails and merge is blocked
|
||||
- Create release → SBOM artifact attached
|
||||
|
||||
notes:
|
||||
- Some vulnerabilities may be in devDependencies — these are lower priority
|
||||
- Focus on production dependencies first
|
||||
- Consider using pnpm overrides for emergency patches
|
||||
- Review major version updates carefully for breaking changes
|
||||
78
tasks/web-production/21-legal-pages.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# 21. Privacy Policy, TOS & Legal Pages
|
||||
|
||||
meta:
|
||||
id: web-production-21
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [compliance, legal, production]
|
||||
|
||||
objective:
|
||||
- Create and deploy all required legal pages for production operation
|
||||
|
||||
deliverables:
|
||||
- Privacy Policy page (/privacy)
|
||||
- Terms of Service page (/terms)
|
||||
- Cookie Policy page (/cookies)
|
||||
- Data Processing Agreement (DPA) page
|
||||
- Legal pages linked in footer
|
||||
|
||||
steps:
|
||||
1. Create Privacy Policy:
|
||||
- Data collection practices (what, why, how long)
|
||||
- Third-party services (Stripe, Clerk, Twilio, Firebase)
|
||||
- User rights (access, rectification, deletion, portability)
|
||||
- Contact information for privacy inquiries
|
||||
- Last updated date
|
||||
2. Create Terms of Service:
|
||||
- Service description and limitations
|
||||
- User responsibilities and prohibited conduct
|
||||
- Subscription terms and billing
|
||||
- Termination clauses
|
||||
- Limitation of liability
|
||||
- Dispute resolution
|
||||
3. Create Cookie Policy:
|
||||
- Types of cookies used (essential, analytics, marketing)
|
||||
- Purpose of each cookie
|
||||
- How to manage cookies
|
||||
- Third-party cookies
|
||||
4. Create Data Processing Agreement:
|
||||
- Roles and responsibilities
|
||||
- Data security measures
|
||||
- Subprocessor list
|
||||
- Breach notification procedures
|
||||
5. Add legal pages to app:
|
||||
- Create routes: /privacy, /terms, /cookies, /dpa
|
||||
- Add links in Footer component
|
||||
- Ensure pages are server-rendered for SEO
|
||||
6. Review with legal counsel:
|
||||
- Have privacy policy reviewed by attorney
|
||||
- Ensure compliance with applicable jurisdictions
|
||||
- Update based on feedback
|
||||
|
||||
tests:
|
||||
- Unit: Test routes render correctly
|
||||
- Integration: Verify links in footer navigate correctly
|
||||
- Compliance: Review with legal counsel
|
||||
|
||||
acceptance_criteria:
|
||||
- Privacy Policy live at /privacy
|
||||
- Terms of Service live at /terms
|
||||
- Cookie Policy live at /cookies
|
||||
- DPA live at /dpa
|
||||
- All pages linked in site footer
|
||||
- Pages reviewed and approved by legal counsel
|
||||
- Last updated date within 30 days of launch
|
||||
- Contact email for privacy inquiries functional
|
||||
|
||||
validation:
|
||||
- Navigate to /privacy → complete policy displayed
|
||||
- Click footer links → correct pages load
|
||||
- Legal counsel approval documented
|
||||
- Email to privacy@kordant.com → received
|
||||
|
||||
notes:
|
||||
- Consider using Termly or iubenda for generated policies
|
||||
- Ensure policies cover all data processors (Stripe, Clerk, etc.)
|
||||
- Update policies when adding new third-party services
|
||||
- Keep records of user consent to terms
|
||||
80
tasks/web-production/22-cookie-gdpr.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# 22. Cookie Consent & GDPR Compliance
|
||||
|
||||
meta:
|
||||
id: web-production-22
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [compliance, gdpr, cookies, production]
|
||||
|
||||
objective:
|
||||
- Implement GDPR-compliant cookie consent with granular controls and data processing transparency
|
||||
|
||||
deliverables:
|
||||
- Cookie consent banner component
|
||||
- Granular cookie preference management
|
||||
- Consent storage and enforcement
|
||||
- GDPR compliance verification
|
||||
|
||||
steps:
|
||||
1. Create cookie consent banner:
|
||||
- Banner appears on first visit
|
||||
- Accept all, reject non-essential, customize options
|
||||
- Links to cookie policy
|
||||
- Can be closed, but reappears until a choice is made
|
||||
- Mobile-responsive design
|
||||
2. Implement granular controls:
|
||||
- Essential cookies (always on): auth, security
|
||||
- Analytics cookies (opt-in): PostHog, Plausible
|
||||
- Marketing cookies (opt-in): retargeting, ads
|
||||
- Preference cookies (opt-in): theme, language
|
||||
3. Create preference modal:
|
||||
- Toggle switches for each category
|
||||
- Description of each cookie type
|
||||
- Save preferences button
|
||||
- Re-openable from footer link
|
||||
4. Implement consent enforcement:
|
||||
- Store consent in cookie/localStorage
|
||||
- Block analytics scripts until consent given
|
||||
- Block marketing scripts until consent given
|
||||
- Respect "Do Not Track" browser setting
|
||||
5. Add GDPR-specific features:
|
||||
- Data processing notice in signup flow
|
||||
- Right to access data (export tool)
|
||||
- Right to erasure (delete account)
|
||||
- Right to portability (data export)
|
||||
- Data retention periods documented
|
||||
6. Add consent logging:
|
||||
- Log consent choices with timestamp
|
||||
- Store for compliance audit trail
|
||||
- Allow users to view their consent history
|
||||
|
||||
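The enforcement step above can be sketched as a single gate that every script loader consults before injecting a tag. This is a minimal sketch, not the project's implementation: the names `ConsentCategory`, `ConsentRecord`, and `mayLoad` are hypothetical, and letting Do Not Track override an opt-in for analytics and marketing is an assumption about how "respect DNT" should be interpreted.

```typescript
// Hypothetical types and names, for illustration only.
type ConsentCategory = "essential" | "analytics" | "marketing" | "preferences";

interface ConsentRecord {
  // Opt-in categories the user has explicitly enabled.
  granted: Record<Exclude<ConsentCategory, "essential">, boolean>;
  // ISO 8601 timestamp of the choice, kept for the audit trail.
  decidedAt: string;
}

// Returns true only when a script in the given category may be loaded.
function mayLoad(
  category: ConsentCategory,
  consent: ConsentRecord | null,
  doNotTrack: boolean,
): boolean {
  // Essential cookies (auth, security) never depend on consent.
  if (category === "essential") return true;
  // No recorded choice yet: block every opt-in category.
  if (consent === null) return false;
  // Assumption: DNT overrides an opt-in for tracking categories.
  if (doNotTrack && (category === "analytics" || category === "marketing")) {
    return false;
  }
  return consent.granted[category];
}
```

Because the gate returns `false` when no choice has been stored, analytics and marketing scripts stay fully blocked on first visit rather than merely paused.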
tests:
|
||||
- Unit: Test consent banner rendering and interaction
|
||||
- Integration: Test analytics blocked until consent
|
||||
- Compliance: Verify DNT respected
|
||||
|
||||
acceptance_criteria:
|
||||
- Cookie banner appears on first visit to all users
|
||||
- Users can accept, reject, or customize cookie preferences
|
||||
- Analytics scripts load only after opt-in consent
|
||||
- Marketing scripts load only after opt-in consent
|
||||
- Essential cookies function without consent
|
||||
- Consent preferences persist across sessions
|
||||
- "Do Not Track" browser setting respected
|
||||
- Consent choice logged with timestamp
|
||||
- GDPR rights accessible from settings page
|
||||
- Cookie policy linked from banner and footer
|
||||
|
||||
validation:
|
||||
- Clear cookies → visit site → banner appears
|
||||
- Click "Reject" → analytics network requests blocked
|
||||
- Click "Customize" → toggle analytics on → requests allowed
|
||||
- Enable DNT in browser → banner shows "DNT detected"
|
||||
- Check localStorage → consent object stored
|
||||
|
||||
notes:
|
||||
- Use CookieConsent by Orestbida or build custom with SolidJS
|
||||
- Must comply with both GDPR (EU) and CCPA (California)
|
||||
- Analytics must be completely blocked, not just paused
|
||||
- Retain consent records for 2 years (GDPR requires demonstrable consent but sets no fixed period; confirm the duration with legal counsel)
|
||||
76
tasks/web-production/23-data-export-deletion.md
Normal file
@@ -0,0 +1,76 @@
|
||||
# 23. Data Export & Deletion Tools
|
||||
|
||||
meta:
|
||||
id: web-production-23
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [compliance, gdpr, privacy, production]
|
||||
|
||||
objective:
|
||||
- Implement user-facing data export and account deletion tools to comply with GDPR and CCPA requirements
|
||||
|
||||
deliverables:
|
||||
- Data export API and UI (/settings/data-export)
|
||||
- Account deletion API and UI (/settings/delete-account)
|
||||
- Data retention policy enforcement
|
||||
- Deletion confirmation and grace period
|
||||
|
||||
steps:
|
||||
1. Create data export functionality:
|
||||
- API endpoint: POST /api/trpc/user.exportData
|
||||
- Collect all user data: profile, alerts, exposures, watchlist, subscriptions, family members
|
||||
- Format as JSON or machine-readable format
|
||||
- Include metadata: export date, data categories
|
||||
- Email download link or provide direct download
|
||||
- Complete within 30 days (GDPR requirement)
|
||||
2. Create account deletion:
|
||||
- UI in settings page with warning and confirmation
|
||||
- Require password re-entry for confirmation
|
||||
- API endpoint: POST /api/trpc/user.delete
|
||||
- Soft delete first (mark deletedAt, anonymize)
|
||||
- Hard delete after 30-day grace period
|
||||
- Cancel active subscriptions via Stripe
|
||||
- Remove from email lists
|
||||
3. Implement family data handling:
|
||||
- If family group owner: transfer ownership or delete group
|
||||
- If family member: remove from group
|
||||
- Notify family members of account deletion
|
||||
4. Add data retention policy:
|
||||
- Define retention periods per data type
|
||||
- Automated cleanup of deleted accounts after 30 days
|
||||
- Audit logs retained for 1 year
|
||||
- Backup deletion after retention period
|
||||
5. Add admin tools:
|
||||
- Admin endpoint to fulfill data export requests
|
||||
- Admin endpoint to process deletion requests
|
||||
- Audit log of all export/deletion actions
|
||||
|
||||
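The soft-delete and grace-period rules in the steps above reduce to one date comparison that both the login path and the cleanup job can share. A minimal sketch, assuming a nullable `deletedAt` timestamp on the user record; `deletionState` and the state names are hypothetical.

```typescript
// Grace period from the steps above; the constant name is illustrative.
const GRACE_PERIOD_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type DeletionState = "active" | "soft_deleted" | "due_for_hard_delete";

// Classifies an account from its deletedAt timestamp (null = never deleted).
function deletionState(deletedAt: Date | null, now: Date): DeletionState {
  if (deletedAt === null) return "active";
  const elapsedMs = now.getTime() - deletedAt.getTime();
  return elapsedMs >= GRACE_PERIOD_DAYS * MS_PER_DAY
    ? "due_for_hard_delete"
    : "soft_deleted";
}
```

A scheduled cleanup job would hard-delete only accounts in `due_for_hard_delete`, while the login flow would offer restoration for accounts in `soft_deleted`.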
tests:
|
||||
- Unit: Test export includes all user data
|
||||
- Integration: Test deletion flow end-to-end
|
||||
- Compliance: Verify grace period and hard delete
|
||||
|
||||
acceptance_criteria:
|
||||
- Users can export all personal data from settings
|
||||
- Export includes: profile, alerts, exposures, watchlist, subscriptions, family data
|
||||
- Export delivered within 30 seconds for typical accounts (async job with emailed link for large data sets)
|
||||
- Account deletion requires password confirmation
|
||||
- Deleted accounts soft-deleted immediately, hard-deleted after 30 days
|
||||
- Active subscriptions cancelled on deletion
|
||||
- Family group handled correctly (ownership transfer)
|
||||
- Deletion audit log maintained
|
||||
- Data retention policy documented and enforced
|
||||
|
||||
validation:
|
||||
- Export data → JSON file contains all user data
|
||||
- Delete account → user marked deleted, can log in to restore within 30 days
|
||||
- After 30 days → user data completely removed from DB
|
||||
- Check Stripe → subscription cancelled
|
||||
- Check audit log → deletion action recorded
|
||||
|
||||
notes:
|
||||
- Soft delete preserves referential integrity for family groups
|
||||
- Hard delete must cascade through all related tables
|
||||
- Consider GDPR Article 17 exceptions (legal obligations)
|
||||
- Backup restoration may temporarily restore deleted data
|
||||
79
tasks/web-production/24-security-txt.md
Normal file
@@ -0,0 +1,79 @@
|
||||
# 24. Security.txt & Responsible Disclosure
|
||||
|
||||
meta:
|
||||
id: web-production-24
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [security, compliance, production]
|
||||
|
||||
objective:
|
||||
- Implement security.txt and responsible disclosure process for security researchers
|
||||
|
||||
deliverables:
|
||||
- security.txt file at /.well-known/security.txt
|
||||
- security@kordant.com email address
|
||||
- Responsible disclosure policy page
|
||||
- Bug bounty program foundation
|
||||
|
||||
steps:
|
||||
1. Create security.txt:
|
||||
- Contact: mailto:security@kordant.com
|
||||
- Expires: ISO 8601 date-time no more than 1 year in the future (RFC 9116 recommends less than a year)
|
||||
- Encryption: link to PGP key (optional)
|
||||
- Acknowledgments: link to hall of fame page
|
||||
- Policy: link to disclosure policy
|
||||
- Hiring: link to security jobs (if applicable)
|
||||
2. Create responsible disclosure policy:
|
||||
- Page at /security/disclosure
|
||||
- Scope of testing (what's in scope, what's out)
|
||||
- Rules of engagement (no DDoS, no data exfiltration)
|
||||
- Safe harbor promise (won't prosecute good faith research)
|
||||
- Reporting process and expected response time
|
||||
- Reward/recognition program details
|
||||
3. Set up security email:
|
||||
- Create security@kordant.com alias
|
||||
- Forward to engineering team
|
||||
- Set up auto-responder with acknowledgment
|
||||
- Create internal triage process
|
||||
4. Create vulnerability response process:
|
||||
- Internal SLA: acknowledge within 48 hours
|
||||
- Triage within 72 hours
|
||||
- Fix critical vulnerabilities within 7 days
|
||||
- Fix high severity within 30 days
|
||||
- Public disclosure after fix deployed
|
||||
5. Add hall of fame page:
|
||||
- Page at /security/hall-of-fame
|
||||
- List researchers who reported valid vulnerabilities
|
||||
- Include date, severity, and researcher name (with permission)
|
||||
6. Add security page to footer:
|
||||
- Link to disclosure policy
|
||||
- Link to security.txt
|
||||
- Link to hall of fame
|
||||
|
||||
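The security.txt fields listed in step 1 can be generated instead of hand-edited, which makes it harder to forget renewing the `Expires` value. This is a sketch under stated assumptions: `buildSecurityTxt` and its options type are hypothetical, and the absolute URLs in the usage note are built from the paths named in this task rather than confirmed production URLs.

```typescript
// Hypothetical helper; field names follow RFC 9116.
interface SecurityTxtOptions {
  contact: string; // e.g. "mailto:security@kordant.com"
  expires: Date; // required by RFC 9116
  policy?: string;
  acknowledgments?: string;
  encryption?: string;
  hiring?: string;
}

function buildSecurityTxt(options: SecurityTxtOptions): string {
  const lines: string[] = [
    `Contact: ${options.contact}`,
    // toISOString() emits an RFC 3339 compatible UTC timestamp.
    `Expires: ${options.expires.toISOString()}`,
  ];
  // Optional fields are emitted only when configured.
  if (options.encryption) lines.push(`Encryption: ${options.encryption}`);
  if (options.acknowledgments) lines.push(`Acknowledgments: ${options.acknowledgments}`);
  if (options.policy) lines.push(`Policy: ${options.policy}`);
  if (options.hiring) lines.push(`Hiring: ${options.hiring}`);
  return lines.join("\n") + "\n";
}
```

A route serving `/.well-known/security.txt` could call this with `policy: "https://kordant.com/security/disclosure"` and `acknowledgments: "https://kordant.com/security/hall-of-fame"`, and respond with `Content-Type: text/plain`.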
tests:
|
||||
- Integration: Verify security.txt accessible
|
||||
- Process: Test email auto-responder
|
||||
- Content: Review policy with security team
|
||||
|
||||
acceptance_criteria:
|
||||
- security.txt accessible at /.well-known/security.txt
|
||||
- Disclosure policy live at /security/disclosure
|
||||
- security@kordant.com email active with auto-responder
|
||||
- Hall of fame page live at /security/hall-of-fame
|
||||
- Safe harbor promise clearly stated
|
||||
- Response SLA documented and followed
|
||||
- Security links in site footer
|
||||
- PGP key available for encrypted communication (optional)
|
||||
|
||||
validation:
|
||||
- `curl https://kordant.com/.well-known/security.txt` → valid security.txt
|
||||
- Email security@kordant.com → auto-responder received
|
||||
- Navigate to /security/disclosure → complete policy visible
|
||||
- Check footer → security links present
|
||||
|
||||
notes:
|
||||
- security.txt standard defined by RFC 9116
|
||||
- Safe harbor is critical for encouraging responsible disclosure
|
||||
- Consider joining HackerOne or Bugcrowd for managed bug bounty
|
||||
- Document vulnerability severity classification (CVSS)
|
||||
83
tasks/web-production/25-seo-meta.md
Normal file
@@ -0,0 +1,83 @@
|
||||
# 25. Sitemap, Robots.txt & Open Graph
|
||||
|
||||
meta:
|
||||
id: web-production-25
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [seo, marketing, production]
|
||||
|
||||
objective:
|
||||
- Implement SEO fundamentals including sitemap, robots.txt, and Open Graph meta tags for all pages
|
||||
|
||||
deliverables:
|
||||
- Dynamic sitemap.xml generation
|
||||
- robots.txt configuration
|
||||
- Open Graph meta tags on all pages
|
||||
- Twitter Card meta tags
|
||||
- Canonical URLs
|
||||
|
||||
steps:
|
||||
1. Create dynamic sitemap:
|
||||
- Route: /sitemap.xml
|
||||
- Include all public pages: /, /about, /features, /pricing, /blog/*
|
||||
- Include auth pages: /login, /signup
|
||||
- Exclude admin pages and user-specific pages
|
||||
- Set priorities and change frequencies
|
||||
- Auto-update when blog posts published
|
||||
2. Create robots.txt:
|
||||
- Allow: all public pages
|
||||
- Disallow: /(admin)/*, /api/*, /billing/*, /auth/*
|
||||
- Sitemap reference
|
||||
- Crawl-delay for respectful crawling (ignored by Googlebot; honored by some other crawlers)
|
||||
3. Add Open Graph tags to all pages:
|
||||
- og:title matching page title
|
||||
- og:description from meta description
|
||||
- og:image with branded preview image (1200x630)
|
||||
- og:url with canonical URL
|
||||
- og:type (website, article for blog)
|
||||
- og:site_name: Kordant
|
||||
4. Add Twitter Card tags:
|
||||
- twitter:card: summary_large_image
|
||||
- twitter:title, twitter:description, twitter:image
|
||||
5. Add canonical URLs:
|
||||
- Prevent duplicate content issues
|
||||
- Use absolute URLs with https
|
||||
- Handle query parameters correctly
|
||||
6. Create branded OG image:
|
||||
- Design 1200x630px image with Kordant branding
|
||||
- Include logo, tagline, and shield icon
|
||||
- Generate dynamically for blog posts (optional)
|
||||
7. Add structured data:
|
||||
- Organization schema on homepage
|
||||
- WebSite schema with SearchAction
|
||||
- Article schema for blog posts
|
||||
- SoftwareApplication schema for app
|
||||
|
||||
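The dynamic sitemap in step 1 is essentially a string builder over the list of public routes. The sketch below assumes the route list is assembled elsewhere (static pages plus published blog slugs); `SitemapEntry`, `escapeXml`, and `buildSitemap` are hypothetical names.

```typescript
// Hypothetical types and helpers for illustration.
type ChangeFreq = "daily" | "weekly" | "monthly";

interface SitemapEntry {
  path: string; // path relative to the site root, e.g. "/pricing"
  changefreq?: ChangeFreq;
  priority?: number; // 0.0 to 1.0
}

// Escapes the five characters the sitemap protocol requires to be entity-encoded.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildSitemap(baseUrl: string, entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const parts = [`    <loc>${escapeXml(baseUrl + entry.path)}</loc>`];
    if (entry.changefreq) parts.push(`    <changefreq>${entry.changefreq}</changefreq>`);
    if (entry.priority !== undefined) {
      parts.push(`    <priority>${entry.priority.toFixed(1)}</priority>`);
    }
    return `  <url>\n${parts.join("\n")}\n  </url>`;
  });
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls.join("\n")}\n</urlset>\n`
  );
}
```

Admin and user-specific routes are excluded simply by never adding them to the entry list, so the exclusion rule lives in one place.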
tests:
|
||||
- Unit: Test sitemap XML generation
|
||||
- Integration: Verify meta tags on all pages
|
||||
- SEO: Test with Facebook Sharing Debugger and Twitter Card Validator
|
||||
|
||||
acceptance_criteria:
|
||||
- Sitemap accessible at /sitemap.xml with all public pages
|
||||
- robots.txt accessible at /robots.txt with correct directives
|
||||
- Open Graph tags present on all public pages
|
||||
- Twitter Card tags present on all public pages
|
||||
- Canonical URL on every page
|
||||
- Branded OG image displaying correctly in social shares
|
||||
- Structured data valid per schema.org (test with Google Rich Results)
|
||||
- Blog posts have Article schema
|
||||
|
||||
validation:
|
||||
- `curl /sitemap.xml` → valid XML with all routes
|
||||
- `curl /robots.txt` → correct allow/disallow directives
|
||||
- Facebook Sharing Debugger → OG image and title display correctly
|
||||
- Google Rich Results Test → structured data valid
|
||||
- View page source → all meta tags present
|
||||
|
||||
notes:
|
||||
- SolidJS MetaProvider already in use — extend with OG tags
|
||||
- Use @solidjs/meta for dynamic meta tags per route
|
||||
- Consider using @vercel/og or similar for dynamic OG images
|
||||
- Blog sitemap should update automatically on publish
|
||||
83
tasks/web-production/26-analytics.md
Normal file
@@ -0,0 +1,83 @@
|
||||
# 26. Analytics Integration (Plausible/PostHog)
|
||||
|
||||
meta:
|
||||
id: web-production-26
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [analytics, marketing, production]
|
||||
|
||||
objective:
|
||||
- Implement privacy-respecting analytics to understand user behavior and measure conversion funnels
|
||||
|
||||
deliverables:
|
||||
- Analytics tracking setup
|
||||
- Custom event tracking for key actions
|
||||
- Conversion funnel measurement
|
||||
- Dashboard for key metrics
|
||||
|
||||
steps:
|
||||
1. Set up analytics platform:
|
||||
- Choose: Plausible (privacy-first, simple) or PostHog (powerful, self-hostable)
|
||||
- Create account and add tracking script
|
||||
- Configure domain and goals
|
||||
2. Add tracking to app:
|
||||
- Add script to web/src/entry-client.tsx or layout
|
||||
- Respect cookie consent (load only after opt-in)
|
||||
- Respect Do Not Track
|
||||
- Exclude admin traffic
|
||||
3. Track page views:
|
||||
- All public pages
|
||||
- Dashboard pages (anonymized)
|
||||
- Blog post reads
|
||||
4. Track custom events:
|
||||
- signup_started, signup_completed
|
||||
- login, logout
|
||||
- subscription_started, subscription_completed
|
||||
- darkwatch_scan_initiated
|
||||
- alert_viewed, alert_resolved
|
||||
- feature_page_viewed (voiceprint, spamshield, etc.)
|
||||
5. Create conversion funnels:
|
||||
- Landing → Signup → Onboarding → Dashboard
|
||||
- Dashboard → Pricing → Checkout → Subscription
|
||||
- Blog → Signup (content marketing ROI)
|
||||
6. Set up dashboards:
|
||||
- Daily/weekly active users
|
||||
- Signup conversion rate
|
||||
- Subscription conversion rate
|
||||
- Feature adoption (DarkWatch, VoicePrint, etc.)
|
||||
- Churn rate
|
||||
- Revenue metrics (via Stripe integration)
|
||||
7. Add A/B testing foundation:
|
||||
- PostHog feature flags or Split.io
|
||||
- Test landing page variants
|
||||
- Test pricing page variants
|
||||
|
||||
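The "no PII in analytics payload" requirement is easiest to enforce at a single choke point that every custom event passes through. This is a minimal sketch: `sanitizeProps`, the blocked key list, and the email pattern are assumptions chosen for illustration, not part of the Plausible or PostHog APIs.

```typescript
// Hypothetical property bag for custom events such as signup_completed.
type EventProps = Record<string, string | number | boolean>;

// Keys that must never leave the browser (compared in lowercase).
const BLOCKED_KEYS = new Set(["email", "name", "userid", "user_id", "phone"]);

// Loose pattern to catch email addresses embedded in free-text values.
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/;

// Returns a copy of props with PII-looking keys and values removed.
function sanitizeProps(props: EventProps): EventProps {
  const clean: EventProps = {};
  for (const [key, value] of Object.entries(props)) {
    if (BLOCKED_KEYS.has(key.toLowerCase())) continue;
    if (typeof value === "string" && EMAIL_PATTERN.test(value)) continue;
    clean[key] = value;
  }
  return clean;
}
```

A thin `track(eventName, props)` wrapper would call `sanitizeProps` before handing the event to whichever analytics client is chosen, so the privacy test can target this one function.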
tests:
|
||||
- Integration: Verify events fire correctly
|
||||
- Privacy: Confirm no PII in analytics payload
|
||||
- Consent: Test analytics blocked until cookie consent
|
||||
|
||||
acceptance_criteria:
|
||||
- Analytics tracking active on all public pages
|
||||
- Custom events firing for signup, login, subscription, key features
|
||||
- Conversion funnels visible in dashboard
|
||||
- No PII (names, emails, IDs) sent to analytics
|
||||
- Analytics loads only after cookie consent (if required)
|
||||
- Admin pages excluded from tracking
|
||||
- Daily active users metric available
|
||||
- Subscription conversion rate tracked
|
||||
- A/B testing framework ready for use
|
||||
|
||||
validation:
|
||||
- Visit landing page → pageview event in analytics
|
||||
- Sign up → signup_completed event with funnel progression
|
||||
- Check analytics dashboard → conversion rates visible
|
||||
- Inspect network tab → no email addresses in payload
|
||||
- Reject cookies → analytics script not loaded
|
||||
|
||||
notes:
|
||||
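- Plausible is cookieless by design, so it typically does not require a cookie consent banner (confirm with legal counsel)
|
||||
- PostHog offers more features but sets cookies by default, which generally requires consent in the EU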
|
||||
- Consider self-hosting Plausible for complete data control
|
||||
- Stripe can send revenue data to analytics automatically
|
||||
82
tasks/web-production/27-structured-data.md
Normal file
@@ -0,0 +1,82 @@
|
||||
# 27. Structured Data & Rich Snippets
|
||||
|
||||
meta:
|
||||
id: web-production-27
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [seo, marketing, production]
|
||||
|
||||
objective:
|
||||
- Implement schema.org structured data to enable rich snippets in search results and improve SEO
|
||||
|
||||
deliverables:
|
||||
- JSON-LD structured data on all relevant pages
|
||||
- Organization schema
|
||||
- WebSite schema with search
|
||||
- Article schema for blog posts
|
||||
- SoftwareApplication schema
|
||||
- BreadcrumbList schema
|
||||
|
||||
steps:
|
||||
1. Add Organization schema to homepage:
|
||||
- @type: Organization
|
||||
- name: Kordant
|
||||
- url: https://kordant.com
|
||||
- logo: URL to logo image
|
||||
- sameAs: social media profiles
|
||||
- description: AI-powered identity protection
|
||||
2. Add WebSite schema:
|
||||
- @type: WebSite
|
||||
- url: https://kordant.com
|
||||
- potentialAction: SearchAction with search URL template
|
||||
3. Add SoftwareApplication schema:
|
||||
- @type: SoftwareApplication
|
||||
- name: Kordant
|
||||
- applicationCategory: SecurityApplication
|
||||
- operatingSystem: Web, iOS, Android
|
||||
- offers: Free tier, Plus ($12/mo), Premium ($29/mo)
|
||||
- aggregateRating (once reviews collected)
|
||||
- featureList: DarkWatch, VoicePrint, SpamShield, HomeTitle, RemoveBrokers
|
||||
4. Add Article schema for blog posts:
|
||||
- @type: Article
|
||||
- headline, author, datePublished, dateModified
|
||||
- image, articleBody, keywords
|
||||
- publisher (Organization reference)
|
||||
5. Add BreadcrumbList schema:
|
||||
- Dynamic breadcrumbs based on current route
|
||||
- Include in all non-home pages
|
||||
6. Add FAQPage schema (optional):
|
||||
- For /about or /features pages
|
||||
- Common questions and answers
|
||||
7. Validate all structured data:
|
||||
- Test with Google Rich Results Test
|
||||
- Test with Schema Markup Validator
|
||||
- Fix any warnings or errors
|
||||
|
||||
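The Organization schema from step 1 can be built as a plain object and serialized into a script tag. The sketch below is illustrative: `organizationJsonLd` and `toJsonLdScript` are hypothetical helpers, and the logo URL is passed in because the real asset path is not specified in this task.

```typescript
// Builds the Organization schema from step 1; logoUrl and sameAs are supplied by the caller.
function organizationJsonLd(logoUrl: string, sameAs: string[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Organization",
    name: "Kordant",
    url: "https://kordant.com",
    logo: logoUrl,
    sameAs,
    description: "AI-powered identity protection",
  };
}

// Serializes any JSON-LD object into a script tag string.
// "<" is escaped so a value containing "</script>" cannot close the tag early.
function toJsonLdScript(data: Record<string, unknown>): string {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

The same `toJsonLdScript` helper can serialize the WebSite, SoftwareApplication, Article, and BreadcrumbList objects, so unit tests only need to cover each builder's output shape.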
tests:
|
||||
- Unit: Test JSON-LD generation for each schema type
|
||||
- Integration: Verify schema present in page source
|
||||
- SEO: Validate with Google's tools
|
||||
|
||||
acceptance_criteria:
|
||||
- Organization schema on homepage
|
||||
- WebSite schema with SearchAction on homepage
|
||||
- SoftwareApplication schema with pricing and features
|
||||
- Article schema on all blog posts
|
||||
- BreadcrumbList on all non-home pages
|
||||
- All schemas pass Google Rich Results Test
|
||||
- No errors or warnings in Schema Markup Validator
|
||||
- Schemas dynamically generated based on page data
|
||||
|
||||
validation:
|
||||
- View homepage source → Organization and WebSite JSON-LD present
|
||||
- View blog post source → Article JSON-LD with correct dates
|
||||
- Google Rich Results Test → all schemas valid
|
||||
- Search console → rich results reported
|
||||
|
||||
notes:
|
||||
- Use @solidjs/meta or script tags in JSX for JSON-LD
|
||||
- Keep JSON-LD in <head> for optimal crawler discovery
|
||||
- Update SoftwareApplication schema when pricing changes
|
||||
- Consider adding Review schema once user reviews available
|
||||
73
tasks/web-production/28-api-versioning.md
Normal file
@@ -0,0 +1,73 @@
|
||||
# 28. API Versioning & Deprecation Strategy
|
||||
|
||||
meta:
|
||||
id: web-production-28
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [api, stability, mobile]
|
||||
|
||||
objective:
|
||||
- Establish API versioning and deprecation strategy to support mobile app updates without breaking existing clients
|
||||
|
||||
deliverables:
|
||||
- API versioning scheme
|
||||
- Deprecation policy documentation
|
||||
- Backward compatibility testing
|
||||
- Mobile client version tracking
|
||||
|
||||
steps:
|
||||
1. Implement API versioning:
|
||||
- Current: tRPC v10 (consider upgrade to v11)
|
||||
- Add version header or URL prefix for breaking changes
|
||||
- Version format: v1, v2, etc.
|
||||
- Mobile apps send X-API-Version header
|
||||
2. Create deprecation policy:
|
||||
- Document in docs/API_VERSIONING.md
|
||||
- Breaking changes only in major versions
|
||||
- Support previous version for minimum 6 months
|
||||
- Announce deprecations 3 months in advance
|
||||
- Sunset dates for old versions
|
||||
3. Add version negotiation:
|
||||
- Backend supports multiple tRPC router versions
|
||||
- Route to correct router based on version header
|
||||
- Default to latest for web clients
|
||||
4. Track client versions:
|
||||
- Log app version from User-Agent or X-Client-Version
|
||||
- Dashboard showing active client versions
|
||||
- Alert when old versions still in use near sunset
|
||||
5. Add compatibility tests:
|
||||
- Test all mobile app versions against current API
|
||||
- Automated compatibility matrix
|
||||
- Breaking change detection in CI
|
||||
6. Document API changes:
|
||||
- Changelog for all API modifications
|
||||
- Migration guides for major versions
|
||||
- Breaking vs non-breaking classification
|
||||
|
||||
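Version negotiation from step 3 comes down to parsing the `X-API-Version` header and picking a router. A minimal sketch, assuming two supported versions and that a missing header means a web client that should get the latest; `resolveApiVersion` and the constants are hypothetical names.

```typescript
// Assumption: v1 and v2 are the versions currently supported.
const SUPPORTED_VERSIONS = [1, 2] as const;
type ApiVersion = (typeof SUPPORTED_VERSIONS)[number];
const LATEST_VERSION: ApiVersion = 2;

type Resolution =
  | { ok: true; version: ApiVersion }
  | { ok: false; reason: string };

// Accepts "1", "v1", or "V1"; a missing header defaults to the latest version.
function resolveApiVersion(header: string | undefined): Resolution {
  if (header === undefined || header.trim() === "") {
    return { ok: true, version: LATEST_VERSION };
  }
  const match = /^v?(\d+)$/i.exec(header.trim());
  if (!match) return { ok: false, reason: `malformed version header: ${header}` };
  const requested = Number(match[1]);
  const found = SUPPORTED_VERSIONS.find((v) => v === requested);
  if (found === undefined) {
    return { ok: false, reason: `unsupported API version: ${requested}` };
  }
  return { ok: true, version: found };
}
```

The request handler would map the resolved version to the matching tRPC router and reject `ok: false` results with a 400 response, which also gives the client-version dashboard a clean signal for sunset candidates.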
tests:
|
||||
- Unit: Test version routing
|
||||
- Integration: Test old client with new API
|
||||
- Compatibility: Verify mobile app versions work
|
||||
|
||||
acceptance_criteria:
|
||||
- API versioning scheme documented and implemented
|
||||
- Mobile apps send version header in all requests
|
||||
- Backend supports at least 2 API versions simultaneously
|
||||
- Deprecation policy published and followed
|
||||
- 6-month support window for old versions
|
||||
- Client version tracking dashboard active
|
||||
- Compatibility tests passing for all supported versions
|
||||
- Changelog maintained for all API changes
|
||||
|
||||
validation:
|
||||
- Mobile app sends X-API-Version: 1 → receives v1 responses
|
||||
- Deploy v2 changes → v1 clients continue working
|
||||
- Check dashboard → active client versions visible
|
||||
- Review changelog → all changes documented
|
||||
|
||||
notes:
|
||||
- tRPC v10 to v11 is a breaking change — plan migration carefully
|
||||
- Mobile apps may take weeks to update — long support windows needed
|
||||
- Consider using feature flags instead of versioning for minor changes
|
||||
- Track iOS and Android app versions separately
|
||||
82
tasks/web-production/29-api-documentation.md
Normal file
@@ -0,0 +1,82 @@
|
||||
# 29. API Documentation (OpenAPI/tRPC Docs)
|
||||
|
||||
meta:
|
||||
id: web-production-29
|
||||
feature: web-production
|
||||
priority: P2
|
||||
depends_on: []
|
||||
tags: [api, documentation, production]
|
||||
|
||||
objective:
|
||||
- Generate and publish comprehensive API documentation for internal and external developers
|
||||
|
||||
deliverables:
|
||||
- Auto-generated API documentation
|
||||
- Interactive API explorer
|
||||
- Authentication documentation
|
||||
- Error code reference
|
||||
|
||||
steps:
|
||||
1. Set up tRPC documentation generation:
|
||||
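- Use trpc-openapi (tRPC v10) or its fork trpc-to-openapi (tRPC v11) to generate an OpenAPI spec; verify package names and maintenance status before adopting
|
||||
- Or use a tRPC-native explorer such as trpc-panel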
|
||||
- Export spec as JSON/YAML
|
||||
2. Create documentation site:
|
||||
- Use Swagger UI or Scalar for interactive docs
|
||||
- Host at /api/docs or separate docs subdomain
|
||||
- Include request/response examples
|
||||
- Include authentication requirements
|
||||
3. Document all routers:
|
||||
- User router: login, signup, profile, family
|
||||
- Billing router: subscription, checkout, webhooks
|
||||
- DarkWatch router: watchlist, exposures, scan
|
||||
- VoicePrint router: enrollments, analysis
|
||||
- SpamShield router: rules, phone check
|
||||
- HomeTitle router: properties, monitoring
|
||||
- RemoveBrokers router: listings, removals
|
||||
- Alerts router: list, resolve, correlation
|
||||
- Admin router: user management, blog
|
||||
4. Add authentication docs:
|
||||
- Session cookie authentication
|
||||
- JWT bearer token authentication
|
||||
- API key authentication (for extensions)
|
||||
- Clerk webhook handling
|
||||
5. Add error documentation:
|
||||
- Standard error codes (400, 401, 403, 404, 429, 500)
|
||||
- tRPC error codes and meanings
|
||||
- Rate limit headers explanation
|
||||
6. Add webhook documentation:
|
||||
- Stripe webhook events
|
||||
- Clerk webhook events
|
||||
- Payload schemas and verification
|
||||
7. Keep docs in sync:
|
||||
- Auto-generate on build
|
||||
- CI check for doc changes
|
||||
- Version docs with API versions
|
||||
|
||||
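The error reference in step 5 can be kept as data and rendered into the docs at build time, so the table never drifts from the code. The tRPC code names and HTTP statuses below follow tRPC's documented mapping; the `meaning` text, `ErrorDoc`, and `renderErrorTable` are illustrative assumptions.

```typescript
// Hypothetical structure for one row of the error reference.
interface ErrorDoc {
  trpcCode: string;
  httpStatus: number;
  meaning: string;
}

// Subset of tRPC error codes matching the HTTP statuses listed in step 5.
const ERROR_REFERENCE: ErrorDoc[] = [
  { trpcCode: "BAD_REQUEST", httpStatus: 400, meaning: "Input failed validation" },
  { trpcCode: "UNAUTHORIZED", httpStatus: 401, meaning: "Missing or invalid credentials" },
  { trpcCode: "FORBIDDEN", httpStatus: 403, meaning: "Authenticated but not permitted" },
  { trpcCode: "NOT_FOUND", httpStatus: 404, meaning: "Resource does not exist" },
  { trpcCode: "TOO_MANY_REQUESTS", httpStatus: 429, meaning: "Rate limit exceeded" },
  { trpcCode: "INTERNAL_SERVER_ERROR", httpStatus: 500, meaning: "Unexpected server failure" },
];

// Renders the reference as a Markdown table for the docs site.
function renderErrorTable(rows: ErrorDoc[]): string {
  const header = "| tRPC code | HTTP status | Meaning |\n| --- | --- | --- |";
  const body = rows
    .map((row) => `| ${row.trpcCode} | ${row.httpStatus} | ${row.meaning} |`)
    .join("\n");
  return `${header}\n${body}\n`;
}
```

Running `renderErrorTable(ERROR_REFERENCE)` during the docs build keeps the published table and the source list identical.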
tests:
|
||||
- Unit: Test OpenAPI spec generation
|
||||
- Integration: Verify docs site loads and examples work
|
||||
- Review: Team review for accuracy
|
||||
|
||||
acceptance_criteria:
|
||||
- API docs accessible at /api/docs
|
||||
- All tRPC routers documented with input/output schemas
|
||||
- Interactive explorer allowing test requests
|
||||
- Authentication methods documented with examples
|
||||
- All error codes explained with examples
|
||||
- Webhook payloads documented with verification steps
|
||||
- Docs auto-generated from code (single source of truth)
|
||||
- Examples use realistic test data
|
||||
|
||||
validation:
|
||||
- Navigate to /api/docs → interactive explorer loads
|
||||
- Try user.me endpoint → returns example response
|
||||
- Check auth section → all methods documented
|
||||
- Review webhook docs → verification steps clear
|
||||
|
||||
notes:
|
||||
- trpc-openapi requires adding meta tags to procedures
|
||||
- Consider using Scalar (modern alternative to Swagger UI)
|
||||
- Docs should be public but sensitive endpoints marked as auth-required
|
||||
- Keep examples updated when schemas change
|
||||
82
tasks/web-production/30-websocket-production.md
Normal file
@@ -0,0 +1,82 @@
|
||||
# 30. WebSocket Production Hardening
|
||||
|
||||
meta:
|
||||
id: web-production-30
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: []
|
||||
tags: [security, websockets, production]
|
||||
|
||||
objective:
|
||||
- Harden WebSocket server for production with authentication, rate limiting, and connection management
|
||||
|
||||
deliverables:
|
||||
- Authenticated WebSocket connections
|
||||
- Connection rate limiting
|
||||
- Connection cleanup on logout
|
||||
- Horizontal scaling support (Redis adapter)
|
||||
|
||||
steps:
|
||||
1. Harden WebSocket authentication:
|
||||
- Validate JWT token in connection query param
|
||||
- Reject unauthenticated connections immediately
|
||||
- Re-authenticate periodically (every 15 minutes)
|
||||
- Close connection on token expiry
|
||||
2. Implement connection rate limiting:
|
||||
- Max 1 WebSocket connection per user
|
||||
- Max 5 reconnection attempts per minute
|
||||
- IP-based connection limits (100 per IP)
|
||||
3. Add connection management:
|
||||
- Track active connections per user
|
||||
- Close duplicate connections
|
||||
- Heartbeat with timeout (current implementation good)
|
||||
- Graceful close on server shutdown
|
||||
4. Implement horizontal scaling:
|
||||
- If migrating to Socket.IO: use @socket.io/redis-adapter (the older socket.io-redis package is its predecessor); these adapters do not work with the plain ws library
|
||||
- Or use Redis pub/sub for broadcast across instances
|
||||
- Ensure alerts reach all connected clients regardless of instance
|
||||
5. Add message validation:
|
||||
- Validate all incoming message schemas
|
||||
- Reject malformed messages
|
||||
- Limit message size (max 10KB)
|
||||
- Sanitize message content
|
||||
6. Add monitoring:
|
||||
- Track active connection count
|
||||
- Track messages per second
|
||||
- Track connection duration
|
||||
- Alert on connection spikes (possible DDoS)
|
||||
7. Secure WebSocket server:
|
||||
- Run on separate port or path
|
||||
- TLS encryption (wss://)
|
||||
- No mixed content (never open ws:// from an https page; browsers block it)
|
||||
|
||||
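The "max 1 connection per user" and "max 10KB message" rules from the steps above can be sketched independently of any WebSocket library. In this sketch `Closable`, `ConnectionRegistry`, and `isMessageTooLarge` are hypothetical names, and close code 4000 is an arbitrary pick from the application-defined range (4000 to 4999); the standard code for an oversized message is 1009.

```typescript
// Minimal shape of a socket; the ws library's WebSocket satisfies it structurally.
interface Closable {
  close(code: number, reason: string): void;
}

const MAX_MESSAGE_BYTES = 10 * 1024;

// Tracks one live connection per user and closes the older one on replacement.
class ConnectionRegistry<T extends Closable> {
  private readonly byUser = new Map<string, T>();

  register(userId: string, socket: T): void {
    const existing = this.byUser.get(userId);
    if (existing !== undefined && existing !== socket) {
      // 4000 is an application-defined close code chosen for this sketch.
      existing.close(4000, "replaced by newer connection");
    }
    this.byUser.set(userId, socket);
  }

  // Only removes the entry if it still points at this socket, so a late
  // close event from a replaced socket cannot evict its successor.
  unregister(userId: string, socket: T): void {
    if (this.byUser.get(userId) === socket) this.byUser.delete(userId);
  }

  get size(): number {
    return this.byUser.size;
  }
}

// Callers would close the connection with code 1009 when this returns true.
function isMessageTooLarge(byteLength: number): boolean {
  return byteLength > MAX_MESSAGE_BYTES;
}
```

With multiple server instances this in-memory map only sees local connections, which is why the horizontal scaling step needs Redis pub/sub or an equivalent shared channel.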
tests:
|
||||
- Unit: Test authentication rejection
|
||||
- Integration: Test duplicate connection handling
|
||||
- Load: Test 1000 concurrent WebSocket connections
|
||||
- Security: Test unauthenticated connection rejection
|
||||
|
||||
acceptance_criteria:
|
||||
- All WebSocket connections authenticated with valid JWT
|
||||
- Unauthenticated connections rejected immediately
|
||||
- Max 1 connection per user (duplicates closed)
|
||||
- Heartbeat/ping-pong working with 30s interval
|
||||
- Redis adapter active for multi-instance deployment
|
||||
- Message size limited to 10KB
|
||||
- TLS encryption (wss://) in production
|
||||
- Connection metrics visible in monitoring
|
||||
- Graceful shutdown closes all connections cleanly
|
||||
|
||||
validation:
|
||||
- Connect without token → connection rejected
|
||||
- Connect with valid token → connection accepted
|
||||
- Open second connection → first connection closed
|
||||
- Send 20KB message → connection closed with error
|
||||
- Scale to 2 server instances → alerts broadcast to all clients
|
||||
- Check metrics → active connections, message rate visible
|
||||
|
||||
notes:
|
||||
- Current WebSocket in web/src/lib/websocket.ts and web/src/server/websocket.ts
|
||||
- The ws library has no built-in Redis adapter; use Redis pub/sub directly to fan out messages across instances
|
||||
- Consider using Socket.io for more robust connection management
|
||||
- WebSocket auth via query params is common, but tokens in URLs can leak into server and proxy logs; consider cookie-based auth or a short-lived connection ticket
|
||||
77
tasks/web-production/31-db-backup.md
Normal file
@@ -0,0 +1,77 @@
|
||||
# 31. Backup Strategy & Point-in-Time Recovery
|
||||
|
||||
meta:
|
||||
id: web-production-31
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: []
|
||||
tags: [database, reliability, production]
|
||||
|
||||
objective:
|
||||
- Implement automated database backups with point-in-time recovery capability
|
||||
|
||||
deliverables:
|
||||
- Automated daily backups
|
||||
- Point-in-time recovery setup
|
||||
- Backup testing and verification
|
||||
- Retention policy
|
||||
|
||||
steps:
|
||||
1. Set up automated backups:
|
||||
- If PostgreSQL: configure pg_dump cron job or managed backups (RDS, Cloud SQL)
|
||||
- If SQLite/Turso: configure Turso database branching/backups
|
||||
- Daily full backups at off-peak hours (3 AM UTC)
|
||||
- Hourly incremental backups (WAL archiving for Postgres)
|
||||
2. Configure backup storage:
|
||||
- Store in separate region/cloud provider (S3, GCS, R2)
|
||||
- Encrypt backups at rest
|
||||
- Versioning enabled (protect against deletion)
|
||||
3. Implement point-in-time recovery:
|
||||
- WAL archiving for PostgreSQL
|
||||
- Transaction log backups every 15 minutes
|
||||
- Test recovery to specific timestamp
|
||||
4. Add backup monitoring:
|
||||
- Alert on backup failure
|
||||
- Track backup size and duration
|
||||
- Verify backup integrity (checksum)
|
||||
5. Test restore procedures:
|
||||
- Monthly restore test to staging environment
|
||||
- Document step-by-step restore process
|
||||
- Measure RTO (Recovery Time Objective) and RPO (Recovery Point Objective)
|
||||
- Target: RTO < 1 hour, RPO < 15 minutes
|
||||
6. Document retention:
|
||||
- Daily backups: 7 days
|
||||
- Weekly backups: 4 weeks
|
||||
- Monthly backups: 12 months
|
||||
- Annual backups: 7 years (compliance)
|
||||
7. Add Redis backup:
|
||||
- RDB snapshots every 6 hours
|
||||
- AOF persistence for point-in-time
|
||||
- Backup to S3/GCS
|
||||
|
||||
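The retention tiers in step 6 can be enforced by a pruning job that asks one question per backup. This is a sketch under stated assumptions: the rule for which backups count as weekly (taken on a Sunday, UTC), monthly (first of the month), and annual (1 January) is not specified in the task and is chosen here for illustration, as is the name `shouldRetain`.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Decides whether a backup taken at `takenAt` must still be kept at `now`.
// Tiers: daily 7 days, weekly 4 weeks, monthly 12 months, annual 7 years.
function shouldRetain(takenAt: Date, now: Date): boolean {
  const ageDays = (now.getTime() - takenAt.getTime()) / MS_PER_DAY;

  // Assumed tier membership rules (not defined in the task itself).
  const isWeekly = takenAt.getUTCDay() === 0; // Sunday
  const isMonthly = takenAt.getUTCDate() === 1; // first of the month
  const isAnnual = isMonthly && takenAt.getUTCMonth() === 0; // 1 January

  if (ageDays <= 7) return true; // every backup counts as a daily
  if (isWeekly && ageDays <= 28) return true;
  if (isMonthly && ageDays <= 365) return true;
  if (isAnnual && ageDays <= 7 * 365) return true;
  return false;
}
```

The pruning job would delete every stored backup for which `shouldRetain` returns `false`, which also makes the "old backups purged per policy" validation step straightforward to test.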
tests:
|
||||
- Integration: Test backup creation
|
||||
- Recovery: Test restore to staging
|
||||
- Monitoring: Verify backup alerts
|
||||
|
||||
acceptance_criteria:
|
||||
- Daily automated backups running successfully
|
||||
- Backups stored in separate region with encryption
|
||||
- Point-in-time recovery tested and working
|
||||
- Backup failures trigger alerts within 5 minutes
|
||||
- Monthly restore test completed and documented
|
||||
- RTO < 1 hour, RPO < 15 minutes
|
||||
- Retention policy enforced automatically
|
||||
- Redis backups included in strategy
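The RTO and RPO criteria above can be checked mechanically after each restore drill. The sketch below assumes a hypothetical `RestoreDrill` record filled in by whoever runs the monthly test; the field names and the `evaluateDrill` helper are illustrative and not part of any existing tooling.

```typescript
// Hypothetical record of one restore drill; field names are illustrative.
interface RestoreDrill {
  incidentAt: Date;        // simulated moment of failure
  lastRecoverableAt: Date; // newest point the backups and logs can restore to
  serviceRestoredAt: Date; // when the restored database was usable again
}

// Targets from step 5.
const TARGET_RTO_MINUTES = 60;
const TARGET_RPO_MINUTES = 15;

function minutesBetween(earlier: Date, later: Date): number {
  return (later.getTime() - earlier.getTime()) / 60000;
}

// RPO is the data-loss window before the incident;
// RTO is the time from the incident until service is restored.
function evaluateDrill(drill: RestoreDrill) {
  const rpoMinutes = minutesBetween(drill.lastRecoverableAt, drill.incidentAt);
  const rtoMinutes = minutesBetween(drill.incidentAt, drill.serviceRestoredAt);
  return {
    rpoMinutes,
    rtoMinutes,
    meetsRpo: rpoMinutes < TARGET_RPO_MINUTES,
    meetsRto: rtoMinutes < TARGET_RTO_MINUTES,
  };
}
```

Recording these two numbers for every drill also gives the trend data needed to notice when restores start drifting toward the limits.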
|
||||
|
||||
validation:
|
||||
- Check backup storage → daily backups present
|
||||
- Trigger restore test → staging database restored successfully
|
||||
- Simulate backup failure → alert received
|
||||
- Check retention → old backups purged per policy
|
||||
|
||||
notes:
|
||||
- Turso offers automatic backups for SQLite — verify configuration
|
||||
- RDS automated backups are easiest for PostgreSQL
|
||||
- Test restores are critical — untested backups are useless
|
||||
- Document restore process for on-call engineers
|
||||
79
tasks/web-production/32-migration-safety.md
Normal file
@@ -0,0 +1,79 @@
|
||||
# 32. Migration Safety & Rollback Procedures
|
||||
|
||||
meta:
|
||||
id: web-production-32
|
||||
feature: web-production
|
||||
priority: P1
|
||||
depends_on: []
|
||||
tags: [database, reliability, production]
|
||||
|
||||
objective:
|
||||
- Ensure database migrations are safe, reversible, and won't cause downtime or data loss in production
|
||||
|
||||
deliverables:
|
||||
- Migration safety guidelines
|
||||
- Backward-compatible migration policy
|
||||
- Rollback scripts for each migration
|
||||
- Migration testing in staging
|
||||
|
||||
steps:
|
||||
1. Create migration safety guidelines:
|
||||
- Document in docs/MIGRATIONS.md
|
||||
- Additive changes only in production (add columns, create tables)
|
||||
- No destructive changes during deployment (no DROP COLUMN)
|
||||
- Two-phase migrations for destructive changes:
|
||||
- Phase 1: Add new column/table, deploy code to use it
|
||||
- Phase 2: Remove old column/table after code stable
|
||||
2. Audit existing migrations:
|
||||
- Review all drizzle migrations in web/src/server/db/
|
||||
- Check for any destructive operations
|
||||
- Add rollback scripts where missing
|
||||
3. Implement migration testing:
|
||||
- Run migrations against staging database copy
|
||||
- Verify app works after migration
|
||||
- Test rollback script
|
||||
- Measure migration duration (must be <30 seconds)
|
||||
4. Add migration safety checks:
|
||||
- CI check: verify no destructive migrations in PR
|
||||
- Pre-deploy: dry-run the migration against a production snapshot (or inside a transaction that is rolled back) before applying it
|
||||
- Post-deploy: verify migration applied successfully
|
||||
5. Document rollback procedures:
|
||||
- Step-by-step rollback for each migration
|
||||
- Database backup before migration
|
||||
- Code rollback procedure
|
||||
- Data recovery steps if needed
|
||||
6. Add migration monitoring:
|
||||
- Log migration start, duration, success/failure
|
||||
- Alert on migration failure
|
||||
- Track migration duration trends
|
||||
7. Set up migration automation:
|
||||
- GitHub Action to run migrations on staging deploy
|
||||
- Manual approval for production migrations
|
||||
- Automated rollback on migration failure
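The CI check from step 4 could start as a simple text scan over migration SQL. The sketch below is a heuristic under stated assumptions: the pattern list is illustrative and not exhaustive, `findDestructiveStatements` is a hypothetical helper name, and comment stripping only handles `--` line comments (it would also strip `--` inside a string literal).

```typescript
// Patterns treated as destructive; illustrative, not exhaustive.
const DESTRUCTIVE_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "DROP TABLE", pattern: /\bDROP\s+TABLE\b/i },
  { label: "DROP COLUMN", pattern: /\bDROP\s+COLUMN\b/i },
  { label: "TRUNCATE", pattern: /\bTRUNCATE\b/i },
  { label: "RENAME", pattern: /\bRENAME\s+(COLUMN|TO)\b/i },
];

// Removes `--` line comments so commented-out SQL is not flagged.
function stripLineComments(sql: string): string {
  return sql
    .split("\n")
    .map((line) => line.replace(/--.*$/, ""))
    .join("\n");
}

// Returns the labels of every destructive pattern found in the SQL text.
function findDestructiveStatements(sql: string): string[] {
  const cleaned = stripLineComments(sql);
  return DESTRUCTIVE_PATTERNS
    .filter(({ pattern }) => pattern.test(cleaned))
    .map(({ label }) => label);
}
```

In a CI job, each migration file changed in the PR would be read and passed through `findDestructiveStatements`, failing the job when the result is non-empty. An explicit override (for example a PR label) would still be needed for the cleanup phase of a two-phase migration.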
|
||||
|
||||
tests:
|
||||
- Unit: Test migration scripts in isolation
|
||||
- Integration: Test migration on staging database
|
||||
- Rollback: Test rollback procedure
|
||||
|
||||
acceptance_criteria:
|
||||
- All production migrations are additive-only, except the cleanup phase of a documented two-phase migration
|
||||
- Two-phase migration process documented for destructive changes
|
||||
- Rollback script exists for every migration
|
||||
- Migrations tested on staging before production
|
||||
- Migration duration <30 seconds
|
||||
- Automated CI check preventing destructive migrations
|
||||
- Backup taken before every production migration
|
||||
- Migration failure triggers automatic alert and rollback
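The duration and rollback criteria above can be combined in one wrapper. This is a sketch only: `runWithRollback` and `MigrationResult` are hypothetical names, and the `migrate` and `rollback` callbacks stand in for whatever actually applies the Drizzle migration and its rollback script. The clock is injectable so the timing logic can be tested.

```typescript
// Hypothetical result record suitable for logging and alerting.
interface MigrationResult {
  name: string;
  ok: boolean;
  durationMs: number;
  withinBudget: boolean; // true when the run finished in under 30 seconds
  rolledBack: boolean;
  error?: string;
}

const MAX_DURATION_MS = 30000;

async function runWithRollback(
  name: string,
  migrate: () => Promise<void>,
  rollback: () => Promise<void>,
  now: () => number = Date.now,
): Promise<MigrationResult> {
  const startedAt = now();
  try {
    await migrate();
    const durationMs = now() - startedAt;
    return {
      name,
      ok: true,
      durationMs,
      withinBudget: durationMs < MAX_DURATION_MS,
      rolledBack: false,
    };
  } catch (err) {
    // The migration failed: run the rollback, then report the failure.
    await rollback();
    const durationMs = now() - startedAt;
    return {
      name,
      ok: false,
      durationMs,
      withinBudget: durationMs < MAX_DURATION_MS,
      rolledBack: true,
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

The returned record carries what the monitoring step asks for (duration, success or failure), so the caller can log it and raise an alert whenever `ok` is false or `withinBudget` is false. If the rollback itself throws, the error propagates to the caller, which is the point where a human should be paged.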
|
||||
|
||||
validation:
|
||||
- Review migration history → no destructive changes in production
|
||||
- Test rollback → database restored to previous state
|
||||
- Run destructive migration in PR → CI blocks merge
|
||||
- Check migration logs → all migrations completed successfully
|
||||
|
||||
notes:
|
||||
- Drizzle migrations are generally safe but review generated SQL
|
||||
- Use drizzle-kit generate with --custom for complex migrations
|
||||
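- gh-ost and pt-online-schema-change are MySQL-only tools; for large PostgreSQL tables prefer non-blocking operations such as CREATE INDEX CONCURRENTLY and constraints added as NOT VALID, then validated separately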
|
||||
- Always have a database backup before running production migrations
|
||||
93
tasks/web-production/README.md
Normal file
@@ -0,0 +1,93 @@
|
||||
# Web Production Readiness
|
||||
|
||||
Objective: Harden, optimize, and operationalize the SolidStart web application for production deployment with enterprise-grade security, performance, monitoring, and compliance.
|
||||
|
||||
Status legend: [ ] todo, [~] in-progress, [x] done
|
||||
|
||||
## Tasks
|
||||
|
||||
### Security & Hardening
|
||||
- [ ] 01 — Security Headers & CORS Configuration → `01-security-headers-cors.md`
|
||||
- [ ] 02 — Rate Limiting & DDoS Protection → `02-rate-limiting-ddos.md`
|
||||
- [ ] 03 — Input Validation & XSS Prevention Audit → `03-input-validation-xss.md`
|
||||
- [ ] 04 — Authentication & Session Security Hardening → `04-auth-session-hardening.md`
|
||||
|
||||
### Performance & Reliability
|
||||
- [ ] 05 — CDN & Asset Optimization → `05-cdn-asset-optimization.md`
|
||||
- [ ] 06 — Database Connection Pooling & Query Optimization → `06-db-connection-pooling.md`
|
||||
- [ ] 07 — Caching Strategy (Redis + HTTP Cache) → `07-caching-strategy.md`
|
||||
- [ ] 08 — Graceful Shutdown & Health Check Endpoints → `08-health-checks-shutdown.md`
|
||||
|
||||
### Monitoring & Observability
|
||||
- [ ] 09 — Structured Logging & Log Aggregation → `09-structured-logging.md`
|
||||
- [ ] 10 — Error Tracking & Alerting (Sentry Integration) → `10-error-tracking.md`
|
||||
- [ ] 11 — Application Metrics & Dashboards → `11-metrics-dashboards.md`
|
||||
- [ ] 12 — Uptime & Performance Monitoring → `12-uptime-monitoring.md`
|
||||
|
||||
### CI/CD & DevOps
|
||||
- [ ] 13 — GitHub Actions CI Pipeline → `13-github-actions-ci.md`
|
||||
- [ ] 14 — Automated Deployment Pipeline → `14-deployment-pipeline.md`
|
||||
- [ ] 15 — Docker & Infrastructure Optimization → `15-docker-infra.md`
|
||||
- [ ] 16 — Environment Management & Secrets Rotation → `16-env-secrets.md`
|
||||
|
||||
### Testing & Quality Assurance
|
||||
- [ ] 17 — End-to-End Testing (Playwright) → `17-e2e-testing.md`
|
||||
- [ ] 18 — Load & Stress Testing → `18-load-testing.md`
|
||||
- [ ] 19 — Accessibility Audit & WCAG Compliance → `19-accessibility-audit.md`
|
||||
- [ ] 20 — Dependency Vulnerability Scanning → `20-dependency-scanning.md`
|
||||
|
||||
### Compliance & Legal
|
||||
- [ ] 21 — Privacy Policy, TOS & Legal Pages → `21-legal-pages.md`
|
||||
- [ ] 22 — Cookie Consent & GDPR Compliance → `22-cookie-gdpr.md`
|
||||
- [ ] 23 — Data Export & Deletion Tools → `23-data-export-deletion.md`
|
||||
- [ ] 24 — Security.txt & Responsible Disclosure → `24-security-txt.md`
|
||||
|
||||
### SEO & Marketing
|
||||
- [ ] 25 — Sitemap, Robots.txt & Open Graph → `25-seo-meta.md`
|
||||
- [ ] 26 — Analytics Integration (Plausible/PostHog) → `26-analytics.md`
|
||||
- [ ] 27 — Structured Data & Rich Snippets → `27-structured-data.md`
|
||||
|
||||
### API & Backend Stability
|
||||
- [ ] 28 — API Versioning & Deprecation Strategy → `28-api-versioning.md`
|
||||
- [ ] 29 — API Documentation (OpenAPI/tRPC Docs) → `29-api-documentation.md`
|
||||
- [ ] 30 — WebSocket Production Hardening → `30-websocket-production.md`
|
||||
|
||||
### Database Production Readiness
|
||||
- [ ] 31 — Backup Strategy & Point-in-Time Recovery → `31-db-backup.md`
|
||||
- [ ] 32 — Migration Safety & Rollback Procedures → `32-migration-safety.md`
|
||||
|
||||
## Dependencies
|
||||
- 01, 02, 03, 04 can be done in parallel (security foundation)
|
||||
- 05, 06, 07, 08 can be done in parallel (performance foundation)
|
||||
- 09, 10, 11, 12 can be done in parallel (observability)
|
||||
- 13 depends on 17, 18, 19, 20 (the CI pipeline runs these test suites and scans, so they must exist first)
|
||||
- 14 depends on 13, 15, 16 (CI + infra + env)
|
||||
- 21, 22, 23, 24 can be done in parallel (compliance)
|
||||
- 25, 26, 27 can be done in parallel (SEO)
|
||||
- 28, 29, 30 can be done in parallel (API stability)
|
||||
- 31, 32 can be done in parallel (DB ops)
|
||||
- Apart from 13, which waits on the testing group (17-20), all groups can proceed independently
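The dependency notes above constrain only tasks 13 and 14. As a rough illustration, a topological sort over those two rules yields one valid execution order; the `DEPENDS_ON` map simply restates the list, and `executionOrder` is a hypothetical helper.

```typescript
// Dependencies stated in the list above: task -> tasks it depends on.
const DEPENDS_ON: Record<string, string[]> = {
  "13": ["17", "18", "19", "20"],
  "14": ["13", "15", "16"],
};

// Topological sort: repeatedly schedule every task whose prerequisites are done.
function executionOrder(deps: Record<string, string[]>): string[] {
  // Collect every task mentioned as a dependent or as a prerequisite.
  const pending: Record<string, string[]> = {};
  for (const task of Object.keys(deps)) {
    pending[task] = deps[task].slice();
    for (const req of deps[task]) {
      if (!(req in pending)) pending[req] = (deps[req] || []).slice();
    }
  }

  const order: string[] = [];
  while (Object.keys(pending).length > 0) {
    // Tasks whose prerequisites have all been scheduled already.
    const ready = Object.keys(pending)
      .filter((t) => pending[t].every((req) => order.indexOf(req) !== -1))
      .sort();
    if (ready.length === 0) throw new Error("dependency cycle detected");
    for (const t of ready) {
      order.push(t);
      delete pending[t];
    }
  }
  return order;
}
```

Tasks not listed in `DEPENDS_ON` have no ordering constraint and can be slotted in anywhere.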
|
||||
|
||||
## Exit Criteria
|
||||
- All security headers present and scoring A+ on Security Headers scan
|
||||
- Rate limiting active on all public endpoints (100 req/min)
|
||||
- Database queries optimized with connection pooling (PgBouncer or equivalent)
|
||||
- Redis caching layer active for hot paths
|
||||
- Health check endpoint responding with 200 and dependency status
|
||||
- Structured logging shipping to aggregation service
|
||||
- Error tracking capturing 100% of unhandled exceptions
|
||||
- CI pipeline running tests, lint, typecheck, and build on every PR
|
||||
- Automated deployment to staging on merge to main
|
||||
- E2E tests covering critical user journeys (signup → dashboard → billing)
|
||||
- Load tests confirming 1000 concurrent users with <200ms p95 latency
|
||||
- Accessibility audit passing WCAG 2.1 AA
|
||||
- All production dependencies vulnerability-free
|
||||
- Legal pages live and linked in footer
|
||||
- Cookie consent banner functional with granular controls
|
||||
- GDPR data export and deletion APIs operational
|
||||
- SEO meta tags, sitemap, and robots.txt serving correctly
|
||||
- Analytics tracking page views and conversion events
|
||||
- API documentation publicly accessible and up-to-date
|
||||
- WebSocket connections stable with reconnection logic tested
|
||||
- Database backups automated with at least 7-day retention for daily backups (full retention tiers in task 31)
|
||||
- Migration rollback tested and documented
|
||||