Backup and Restore
Backup and Restore
Making data is not a backup. Restoring is. The difference only shows up after an incident. Here is a summary of PostgreSQL's tools, policies, and rehearsals.
1. Types of backups
PostgreSQL backups split into two branches.
| Type | Format | Tools | Characteristics |
|---|---|---|---|
| Logical | SQL or custom archive | pg_dump, pg_dumpall, pg_restore |
Possible at table or schema level. Portable across versions and architectures. Slow on large DBs. |
| Physical | Data directory file copy | pg_basebackup, file system snapshot + WAL |
Fast restore. Compatible only on the same major version and same architecture. |
Logical is the basic operational backup. Physical comes up often as the base for PITR.
2. pg_dump, pg_dumpall, pg_restore
pg_dump produces a logical backup of one DB. It can output as SQL text, custom archive (-Fc), or directory format (-Fd).
pg_dump -Fc -d mydb -f mydb.dump
pg_restore -d mydb_new mydb.dump
-Fc enables compression, parallel restore, and selective restore, and is the format most often recommended in production.
pg_dumpall bundles the entire cluster. Roles, tablespaces, and all DBs come together, but data comes out as SQL text only. pg_restore restores pg_dump's custom and directory formats, with parallel restore via -j.
3. WAL archiving
PostgreSQL writes every change to the transaction log (WAL) first. Continuously moving this log to separate storage allows reconstructing any point in time from a full backup plus accumulated WAL.
The configuration core is three settings.
wal_level = replica
archive_mode = on
archive_command = 'cp %p /mnt/wal-archive/%f'
archive_library can replace it as well. The archived WAL is the input to PITR.
4. PITR — Point-In-Time Recovery
The PITR flow has 6 steps.
① Periodically create base backups (pg_basebackup)
② Keep the WAL archive without gaps
③ On incident: restore the base backup into an empty data directory
④ Create the recovery.signal file and configure restore_command
⑤ Specify the target with recovery_target_time (or _lsn or _xid)
⑥ Start Postgres → replay WAL → stop on reaching the target
Recovery starts from the base backup's point in time, so the further back the base, the longer the replay. The base backup cadence and WAL retention policy decide RPO and RTO.
- RPO (Recovery Point Objective) — acceptable data loss. Influenced by WAL transfer cadence and delay.
- RTO (Recovery Time Objective) — acceptable time to restore. Influenced by base cadence, hardware, and WAL volume.
PITR alone does not bring RPO close to zero. Some places additionally need a synchronous replication standby.
5. Operational tools
Rather than wiring archive_command directly, using a tool is more common.
pgBackRest — written in C. Supports parallel compression, incremental backups, encryption, and S3-compatible storage directly. The most often recommended path for production.
Barman — backed by EnterpriseDB. Python-based. Supports full and incremental backups and WAL streaming.
WAL-G — started by Citus and Yandex. WAL compression, encryption, and friendly support for cloud storage (S3, GCS, Azure Blob). Small operational surface is its strength.
Cloud managed — Amazon RDS, Aurora, Cloud SQL, and Azure Database provide automated backups and PITR managed. The operator focuses on retention period and restore procedure. Limits are that the restored result becomes a new instance and platform lock-in.
6. The 3-2-1 retention rule
A traditional backup retention guideline.
- 3 copies
- 2 types of media (local disk + object storage, etc.)
- 1 offsite (different region, different cloud, different physical location)
The meaning of an offsite copy shows up in broad incidents like data center failures and ransomware.
Retention windows commonly run 1430 days daily, 812 weeks weekly, 12 months monthly, and 3~7 years yearly (per regulation and contract). Accumulated backups cost money, so it is safer to automate cleanup with code-driven policies.
7. Encryption and key management
- In transit: TLS.
- At rest: server-side encryption or client-side GPG, age.
- Key management: KMS, Vault, cloud Key Service.
Inspect key access permissions during restore rehearsals. Lose the key and an encrypted backup cannot be restored.
8. Restore rehearsals
The most common incident is "the backup existed but the restore did not work." Periodic rehearsals prevent it.
① PITR with base backup + WAL into a separate environment
② Verify row counts and checksums on the resulting DB
③ Application smoke tests
④ Record elapsed time → validate RTO
Operational examples that automate the rehearsal itself are common.
9. Replication is not a backup
Streaming replication is a failover tool. Logical replication is a major-version and schema-migration tool. Neither is a backup.
If a user deletes data deliberately or by mistake, the replica deletes too. PITR holds the point-in-time recovery capability. Disk snapshots (EBS, ZFS) are fast for backup and restore but consistent snapshots require the pg_backup_start / pg_backup_stop flow (PostgreSQL 15+).
10. Common pitfalls
Settling for pg_dump only — as the DB grows, dump and restore stretch into hours. Unsuitable for PITR.
Gaps in WAL archive — temporary archive_command failures break PITR. Monitoring is mandatory.
Trusting only the backup-completion alarm — alarms may merely say "a file was created." Complement with integrity checks and restore rehearsals.
Version and platform compatibility — physical backups restore only on the same major version and same architecture. Logical backups loosen that assumption.
Extension dependencies — if the restore environment lacks the same extensions, it fails. State them in the environment build step.
Permission void at incident time — when normal-day backup access is closed off from people, the restore can be blocked when an incident hits. Person/role separation and an emergency runbook are needed.
Closing thoughts
Backups are harder to restore than to make. Spend more time on rehearsals than on picking tools. Whether to go all the way to PITR or to settle for daily dumps is decided by RPO.
Next
- replication-streaming
- pgbackrest-setup
References: PostgreSQL Backup and Restore, Continuous Archiving and PITR, pgBackRest official, WAL-G GitHub.