Setting up checksums in PostgreSQL: A complete guide

Last updated: Nov 12, 2024
Views: 91
Author: Admin

The PostgreSQL DBMS provides extensive capabilities for protecting your data and ensuring its integrity. One of the important tools is the verification of checksums. This mechanism helps track possible data errors that occur due to hardware or software failures, thereby improving the reliability and stability of the database. Enabling checksums in PostgreSQL reduces the risk of data loss and increases its integrity.

In this article, we'll look at how to enable checksums in PostgreSQL, as well as what steps to take to avoid potential issues when configuring this setting on an existing database.

The article on the page is based on my many years of experience with this technology and is presented to you as a short instruction, but if you want to get acquainted with it in more detail, then I recommend that you go to the official documentation on PostgreSQL.

Contents of the article:

Checking the current checksum settings.
Stopping the database.
Enable checksum verification on the database.
We turn on the database.
Checking the result.
Checking for errors.
Simulate a database file failure.

1. Checking the current checksum settings.

Before we do anything with the database, we first need to make sure that checksum verification is really disabled. For a simple check, you can use several options.

We connect to the database using the psql utility and execute queries.

sql> show data_checksums;

postgresql checksum

sql> SELECT name, setting, category FROM pg_settings WHERE name = 'data_checksums';

postgresql checksum

In all variants, the data_checksums parameter value is off, which means that the checksum check in the PostgreSQL database is disabled.

Another option for checking can be the pg_controldata utility.

$. pg_controldata |grep checksum

postgresql checksum

The pg_controldata utility can return two values for the checksum parameter, 0 or 1. The value 0 means that checksum checking is disabled, and 1 means that checksum checking is enabled.

2. Stopping the database.

Before enabling checksum verification on the database, we must first stop it. If we do not stop the database and run the pg_checksums utility, nothing will happen and the PostgreSQL DBMS itself will show us messages that it needs to be stopped.

$. pg_checksums -D /app/postgresql/15/pg_dbcluster_1/data --enable --progress --verbose

postgresql checksum

As a result, we will receive the error message - pg_checksums: error: cluster must be shut down

Stopping the database.

$. pg_ctl stop

3. Enable checksum verification on the database.

After we have stopped the database, we can now run the pg_checksums utility which will enable checksums on the database.

$. pg_checksums -D /app/postgresql/15/pg_dbcluster_1/data --enable --progress --verbose

Parameter descriptions:

-D - The directory where you store all your database files.
--enable - Enable checksum verification.
--progress - Show progress as a percentage.
--verbose - Show detailed execution of the utility.

postgresql checksum

As a result, we should see messages - Checksums enabled in cluster

4. We turn on the database.

Once the pg_checksums utility has completed all its actions, we can now start the database back up.

$. pg_ctl start

5. Checking the result.

Now let's run back all the checks we did in the first point to make sure that we really have enabled checksum verification in the database.

sql> show data_checksums;

sql> SELECT name, setting, category FROM pg_settings WHERE name = 'data_checksums';

$. pg_controldata |grep checksum

postgresql checksum

As we can see, as a result, all values of the checksum parameter are equal to on and 1, which means that the checksum verification has been successfully enabled.

6. Checking for errors.

We can now periodically check if we have any errors related to the data, for this we execute the query:

sql> SELECT datname, checksum_failures, checksum_last_failure FROM pg_stat_database WHERE datname IS NOT NULL;

postgresql checksum

If you get 0 values for all databases as a result, then you currently have no problems with the data.

7. Simulate a database file failure.

Now I will show you how the database will behave in cases where the file system is damaged, for this I will manually damage one of the database files in which the table will be stored and look at the result.

DO NOT DO SUCH ACTIONS IN YOUR PRODUCTION DATABASE!

I have corrupted one of the testdb database files and am running a normal select query.

postgresql checksum

As a result, we see that our checksums do not match and the database informs us of an error.

And finally, today we looked at how to enable checksum verification on a PostgreSQL database, and also looked at an example of what information the database will output if corrupted data appears.

Setting up checksums in PostgreSQL: A complete guide

SIMILAR ARTICLES