Kordant/tasks/web-production/32-migration-safety.md
2026-05-26 16:06:34 -04:00

32. Migration Safety & Rollback Procedures

meta:

  • id: web-production-32
  • feature: web-production
  • priority: P1
  • depends_on: []
  • tags: [database, reliability, production]

objective:

  • Ensure database migrations are safe, reversible, and won't cause downtime or data loss in production

deliverables:

  • Migration safety guidelines
  • Backward-compatible migration policy
  • Rollback scripts for each migration
  • Migration testing in staging

steps:

  1. Create migration safety guidelines:
    • Document in docs/MIGRATIONS.md
    • Additive changes only within a single deployment (add columns, create tables)
    • No destructive changes during deployment (no DROP COLUMN, no DROP TABLE)
    • Two-phase migrations for destructive changes:
      • Phase 1: Add the new column/table and deploy code that uses it
      • Phase 2: Remove the old column/table in a later deployment, once the new code is stable and nothing reads or writes the old one
  2. Audit existing migrations:
    • Review all Drizzle migrations in web/src/server/db/
    • Check for any destructive operations
    • Add rollback scripts where missing
  3. Implement migration testing:
    • Run migrations against staging database copy
    • Verify app works after migration
    • Test rollback script
    • Measure migration duration (must be <30 seconds)
  4. Add migration safety checks:
    • CI check: verify no destructive migrations in PR
    • Pre-deploy: dry-run the migration against production data without committing changes (for example, on a fresh production snapshot)
    • Post-deploy: verify migration applied successfully
  5. Document rollback procedures:
    • Step-by-step rollback for each migration
    • Database backup before migration
    • Code rollback procedure
    • Data recovery steps if needed
  6. Add migration monitoring:
    • Log migration start, duration, success/failure
    • Alert on migration failure
    • Track migration duration trends
  7. Set up migration automation:
    • GitHub Action to run migrations on staging deploy
    • Manual approval for production migrations
    • Automated rollback on migration failure
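
The two-phase process from step 1 can be sketched as data plus a small gate. This is a minimal illustration, not the project's implementation: the `users` table, the `name` and `display_name` columns, the `Phase` type, and the `phasesAllowedNow` helper are all hypothetical, and the SQL is kept generic on purpose.

```typescript
// Hypothetical description of one phase of a two-phase migration.
type Phase = { name: string; destructive: boolean; sql: string[] };

// Assumed example: replacing users.name with users.display_name.
const replaceNameColumn: Phase[] = [
  {
    name: "phase-1-expand",
    destructive: false,
    sql: [
      "ALTER TABLE users ADD COLUMN display_name text;",
      // Backfill so code reading the new column sees existing data.
      "UPDATE users SET display_name = name WHERE display_name IS NULL;",
    ],
  },
  {
    name: "phase-2-contract",
    destructive: true,
    sql: ["ALTER TABLE users DROP COLUMN name;"],
  },
];

// Destructive phases are released only after the new code is confirmed stable.
function phasesAllowedNow(phases: Phase[], newCodeIsStable: boolean): Phase[] {
  return phases.filter((phase) => !phase.destructive || newCodeIsStable);
}
```

With `newCodeIsStable` set to `false`, only the additive phase is returned, so the DROP COLUMN statement cannot ship in the same deployment as the code change.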

tests:

  • Unit: Test migration scripts in isolation
  • Integration: Test migration on staging database
  • Rollback: Test rollback procedure
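
Before a rollback can be tested, each migration needs a rollback script to test. A rough sketch of that check follows, assuming rollback scripts are hand-written files that sit next to each migration under a hypothetical `<name>.down.sql` naming convention; the function name is also an assumption.

```typescript
// Returns the migration files that have no matching rollback script.
// Assumed (hypothetical) convention: "0001_add_nickname.sql" pairs with
// "0001_add_nickname.down.sql" in the same directory listing.
function findMissingRollbacks(fileNames: string[]): string[] {
  const known = new Set(fileNames);
  return fileNames
    .filter((file) => file.endsWith(".sql") && !file.endsWith(".down.sql"))
    .filter((file) => !known.has(file.replace(/\.sql$/, ".down.sql")))
    .sort();
}
```

The function takes a plain list of file names, so it can be fed from a directory listing in CI and unit-tested without touching the file system.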

acceptance_criteria:

  • All migrations shipped with a deployment are additive-only; destructive changes run only as Phase 2 of a two-phase migration
  • Two-phase migration process documented for destructive changes
  • Rollback script exists for every migration
  • Migrations tested on staging before production
  • Migration duration <30 seconds
  • Automated CI check preventing destructive migrations
  • Backup taken before every production migration
  • Migration failure triggers automatic alert and rollback
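
Several of these criteria (backup first, duration under 30 seconds, alert and rollback on failure) can be combined in one wrapper around the migration run. The sketch below is one possible shape under stated assumptions: `MigrationHooks`, `runGuardedMigration`, and the injected `backup`, `migrate`, `rollback`, and `alert` callbacks are hypothetical names, and the real backup, migration, and alerting commands are left to the caller.

```typescript
// Hypothetical hooks; the caller supplies the real backup/migrate/rollback/alert logic.
type MigrationHooks = {
  backup: () => Promise<void>;
  migrate: () => Promise<void>;
  rollback: () => Promise<void>;
  alert: (message: string) => Promise<void>;
  now?: () => number; // injectable clock, defaults to Date.now
};

type MigrationResult = {
  ok: boolean;
  durationMs: number;
  rolledBack: boolean;
  overBudget: boolean;
};

const MAX_DURATION_MS = 30_000; // the 30-second limit from the criteria above

async function runGuardedMigration(
  name: string,
  hooks: MigrationHooks,
): Promise<MigrationResult> {
  const now = hooks.now ?? Date.now;
  // A failed backup throws here, before any schema change is attempted.
  await hooks.backup();
  const startedAt = now();
  console.log(`[migration] ${name} started`);
  try {
    await hooks.migrate();
  } catch (error) {
    const durationMs = now() - startedAt;
    console.error(`[migration] ${name} failed after ${durationMs}ms`, error);
    await hooks.alert(`Migration ${name} failed; rolling back`);
    await hooks.rollback();
    return { ok: false, durationMs, rolledBack: true, overBudget: durationMs > MAX_DURATION_MS };
  }
  const durationMs = now() - startedAt;
  console.log(`[migration] ${name} succeeded in ${durationMs}ms`);
  return { ok: true, durationMs, rolledBack: false, overBudget: durationMs > MAX_DURATION_MS };
}
```

Passing the side effects in as callbacks keeps the ordering (backup, migrate, then alert and rollback on failure) testable without a database.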

validation:

  • Review migration history → no destructive changes in production
  • Test rollback → database restored to previous state
  • Run destructive migration in PR → CI blocks merge
  • Check migration logs → all migrations completed successfully
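
The "CI blocks merge" check can be approximated with a text scan over the migration SQL in a PR. This is a minimal sketch under assumptions: `findDestructiveStatements` and the rule list are hypothetical, the patterns are not exhaustive, and a regex scan can miss or over-report statements that a real SQL parser would classify correctly.

```typescript
// One flagged line in a migration file.
type Finding = { line: number; rule: string; text: string };

// Assumed rule set for illustration; extend it to match the team's policy.
const DESTRUCTIVE_RULES: { rule: string; pattern: RegExp }[] = [
  { rule: "DROP TABLE", pattern: /\bdrop\s+table\b/i },
  { rule: "DROP COLUMN", pattern: /\bdrop\s+column\b/i },
  { rule: "TRUNCATE", pattern: /\btruncate\b/i },
];

function findDestructiveStatements(sql: string): Finding[] {
  const findings: Finding[] = [];
  sql.split("\n").forEach((raw, index) => {
    // Strip "--" line comments (including Drizzle's "--> statement-breakpoint"
    // markers) so commented-out statements do not trip the check.
    const text = raw.replace(/--.*$/, "").trim();
    if (text === "") return;
    for (const { rule, pattern } of DESTRUCTIVE_RULES) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule, text });
    }
  });
  return findings;
}
```

A CI job could read each changed migration file, call this function, and fail the build when the returned list is non-empty, with an explicit override reserved for reviewed Phase 2 migrations.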

notes:

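  • Drizzle-generated migrations are generally safe, but always review the generated SQL for destructive statements before merging
  • Use drizzle-kit generate --custom to create an empty migration file for hand-written SQL when a migration is too complex to generate
  • For large tables on MySQL, consider gh-ost or pt-online-schema-change (both are MySQL-specific online schema change tools)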
  • Always have a database backup before running production migrations