Jan 01, 2025

How to Set Up WAL Archiving and Restore a Backup in PostgreSQL

Introduction

In PostgreSQL, Write-Ahead Logging (WAL) is a fundamental mechanism for ensuring data integrity and durability. WAL records every change made to the database, which is crucial for crash recovery and point-in-time recovery (PITR). Setting up WAL archiving allows you to preserve these logs and enables you to restore your database to a specific point in time, even if the system crashes.

This process involves creating backups of the database using pg_basebackup, enabling WAL archiving to store transaction logs, and configuring PostgreSQL to use these archived logs during recovery. By following this approach, you can ensure that your PostgreSQL database is resilient and capable of recovering from failures, making it an essential part of any database administration strategy.

In this guide, we will walk you through the steps to set up WAL archiving, create a backup, and restore that backup to a new data directory, ensuring your database remains safe and recoverable at all times.

1. Create a Folder for Archiving WAL Records

First, you need to create a folder to store archived WAL files. This folder will store WAL logs as they are generated by PostgreSQL.

mkdir /home/postgres_code/postgresql/wal_archive

This folder will be used later to archive the WAL logs.

2. Enable Archive Settings in PostgreSQL

Next, you need to configure PostgreSQL to enable WAL archiving. This is done by modifying the postgresql.conf file.

* Go to the PostgreSQL configuration directory ( current data directory ) and open postgresql.conf.

cd /home/postgres_code/postgresql/my_data_directory

nano postgresql.conf

Enable WAL archiving by setting archive_mode to on and define the archive_command to copy the WAL logs into the archive folder.

archive_mode = on # Enables archiving

archive_command = 'cp %p /home/cybrosys/postgres_code/postgresql/wal_archive/%f'

Save and exit the file. This ensures that PostgreSQL will archive each WAL log file as it is written to disk.

3. Create a Backup Folder

Now, create a folder to store the base backups.

mkdir /home/postgres_code/postgresql/backup_data

This folder will hold the backup files generated by the pg_basebackup command.

4. Take a Base Backup Using pg_basebackup

Use the pg_basebackup utility to create a backup of the current data directory.

Before using this command , you need to ensure that you are running the server on current data directory

bin/pg_basebackup -D /home/postgres_code/postgresql/backup_data -Ft -z

* -D specifies the backup directory.

* -Ft specifies a tar format backup.

* -z compresses the backup.

After running this command, you will have three files in the backup_data directory:

* backup_manifest

* base.tar.gz

* pg_wal.tar.gz

These files contain your backup and the WAL logs necessary for recovery.

5. Initialize a New Data Directory

Now, let's create a new data directory where you will restore the backup.

bin/initdb /home/postgres_code/postgresql/my_data_directory

This command initializes the data directory with the necessary PostgreSQL configuration files.

6. Extract the Base Backup into the New Data Directory

Extract the base.tar.gz file into the newly created data directory.

tar -xvf /home/postgres_code/postgresql/backup_data/base.tar.gz -C /home/postgres_code/postgresql/my_data_directory

This restores the base backup, including the database files and system data, into the new data directory.

7. Extract the WAL Files to the WAL Archive Directory

Extract the pg_wal.tar.gz file into the WAL archive directory to ensure you have the required WAL files for recovery.

tar -xvf /home/postgres_code/postgresql/backup_data/pg_wal.tar.gz -C /home/postgres_code/postgresql/wal_archive

This ensures the archived WAL logs are available to restore any changes that occurred after the base backup was taken.

8. Configure the postgresql.conf for Recovery

Now, edit the postgresql.conf in the new data directory to configure the restore command that PostgreSQL will use to fetch WAL logs from the archive.

nano /home/postgres_code/postgresql/my_data_directory/postgresql.conf

Set the following line to tell PostgreSQL where to find the archived WAL files during recovery:

restore_command = 'cp /home/postgres_code/postgresql/wal_archive/%f %p'

This configuration tells PostgreSQL to copy the required WAL file from the wal_archive directory when needed.

9. Create the recovery.signal File

In order to initiate recovery using the archived WAL files, you need to create a recovery.signal file in the new data directory and add the below line to this file.

touch /home/postgres_code/postgresql/my_data_directory/recovery.signal

restore_command = 'cp /home/postgres_code/postgresql/wal_archive/%f %p'

This file signals to PostgreSQL that it should perform a recovery using the provided WAL files.

10. Restart the PostgreSQL Server

Finally, restart the PostgreSQL server to begin the recovery process and restore the database to the point-in-time state as defined by the WAL files.

bin/pg_ctl -D /home/postgres_code/postgresql/my_data_directory -l logfile restart

This will apply the archived WAL files and bring the database up to the latest committed transaction, completing the restore process.

Conclusion

By following these steps, you can successfully set up WAL archiving, create a base backup, and restore the backup to a new data directory in PostgreSQL. This setup ensures that your database can be recovered at any point in time using the WAL logs, providing robust data protection and disaster recovery capabilities.