Backup Strategy
Database Backups
Automated Backups
- Frequency: Daily at 3 AM UTC
- Retention: daily backups kept for 7 days, weekly backups for 4 weeks, monthly backups for 12 months
- Storage: Encrypted S3 bucket in separate region
- Type: Full backup + WAL archiving for point-in-time recovery
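The retention tiers above can be sketched as a small pruning helper. This is a minimal illustration, not the actual pruning job: the function names are hypothetical, and it assumes each tier keeps the newest backup in each of the N most recent buckets (days, ISO weeks, calendar months) for which a backup exists.

```python
from datetime import date, timedelta

def keep_newest_per_bucket(dates, bucket_key, limit):
    """Keep the newest backup in each of the `limit` most recent buckets."""
    seen = []
    kept = set()
    for d in sorted(dates, reverse=True):
        key = bucket_key(d)
        if key in seen:
            continue  # a newer backup already represents this bucket
        if len(seen) == limit:
            break  # all remaining backups are older than the tier allows
        seen.append(key)
        kept.add(d)
    return kept

def backups_to_keep(dates):
    """Union of the daily (7), weekly (4), and monthly (12) tiers."""
    daily = keep_newest_per_bucket(dates, lambda d: d, 7)
    weekly = keep_newest_per_bucket(dates, lambda d: tuple(d.isocalendar())[:2], 4)
    monthly = keep_newest_per_bucket(dates, lambda d: (d.year, d.month), 12)
    return daily | weekly | monthly

# Example: 400 consecutive daily backups ending on 2024-06-30
all_backups = [date(2024, 6, 30) - timedelta(days=i) for i in range(400)]
kept = backups_to_keep(all_backups)
to_delete = sorted(set(all_backups) - kept)
```

Anything in `to_delete` falls outside all three tiers and is a candidate for removal.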
Point-in-Time Recovery
- RPO: < 15 minutes
- RTO: < 1 hour
- Method: WAL archive restoration to specific timestamp
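Point-in-time recovery starts from a base backup taken before the target time and then replays archived WAL up to that time. The selection step might look like the sketch below; `pick_base_backup` is a hypothetical helper, and the timestamps are assumed to be the completion times of the daily full backups.

```python
from datetime import datetime, timezone

def pick_base_backup(backup_times, target):
    """Return the newest base backup that completed at or before `target`."""
    candidates = [t for t in backup_times if t <= target]
    if not candidates:
        raise ValueError("no base backup exists before the recovery target")
    return max(candidates)

# Example: daily backups at 03:00 UTC, recovery target in the afternoon
backups = [
    datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 3, 3, 0, tzinfo=timezone.utc),
]
target = datetime(2024, 1, 2, 14, 30, tzinfo=timezone.utc)
base = pick_base_backup(backups, target)
```

WAL archived between `base` and `target` is then replayed on top of the restored base backup.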
Backup Verification
- Monthly restore test to staging environment
- Automated integrity checks on backup files
- Alert on backup failure within 5 minutes
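One common form of automated integrity check is comparing a backup file's checksum with a value recorded when the backup was written. The sketch below assumes SHA-256 and a separately stored expected digest; neither is specified above, and the function names are illustrative.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_backup(path, expected_sha256):
    """Return True when the file's digest matches the recorded digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A checksum match only shows the file was not altered or truncated after it was written; the monthly restore test is still what demonstrates the backup is usable.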
Redis Backups
Configuration
- RDB snapshots: Every 6 hours
- AOF persistence: Enabled to capture writes made between RDB snapshots
- Storage: Backed up to S3 daily
Recovery
- Restore from latest RDB snapshot
- Replay AOF for recent changes
- Test data integrity after restore
Backup Monitoring
Alerts
- Backup failure → Immediate PagerDuty alert
- Backup size anomaly → Slack notification
- Restore test failure → Jira ticket creation
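A backup size anomaly can be detected by comparing each new backup against a baseline of recent sizes. The sketch below is one possible rule, not the configured one: it assumes a median baseline and a 30% tolerance, both of which are illustrative.

```python
from statistics import median

def is_size_anomaly(recent_sizes, new_size, tolerance=0.3):
    """Flag a backup whose size deviates from the recent median by more than `tolerance`."""
    if not recent_sizes:
        return False  # no history yet, nothing to compare against
    baseline = median(recent_sizes)
    if baseline == 0:
        return new_size != 0
    return abs(new_size - baseline) / baseline > tolerance

# Example: sizes in GiB for the last four backups
history = [100, 102, 98, 101]
shrunk = is_size_anomaly(history, 50)    # roughly half the usual size
normal = is_size_anomaly(history, 105)   # within normal variation
```

Using the median rather than the mean keeps a single earlier outlier from shifting the baseline.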
Metrics
- Backup duration
- Backup size
- Restore time
- Data loss window (actual, tracked against the RPO target)
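The data loss window can be approximated as the time elapsed since the last successfully archived WAL segment, then compared against the 15-minute RPO. This is a rough sketch: how the last archive timestamp is obtained is left out, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(minutes=15)

def data_loss_window(last_archived_at, now):
    """Time span of changes that would be lost if the primary failed at `now`."""
    return now - last_archived_at

def rpo_breached(last_archived_at, now, rpo=RPO):
    """The target is a window under `rpo`, so reaching it counts as a breach."""
    return data_loss_window(last_archived_at, now) >= rpo

# Example: last WAL segment archived 20 minutes before the check
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_archive = now - timedelta(minutes=20)
window = data_loss_window(last_archive, now)
breached = rpo_breached(last_archive, now)
```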
Emergency Procedures
Complete Data Loss
- Activate disaster recovery plan
- Restore from latest backup
- Replay WAL/AOF for recent changes
- Verify data integrity
- Resume operations
Partial Data Corruption
- Identify affected data
- Restore specific tables from backup
- Verify data consistency
- Resume operations