postgresql when it's not your job

10:38

That Google Checksum Tool

17 July 2018

Google recently released a tool to check on-disk checksums in PostgreSQL. PostgreSQL being hot, and Google being Google, this generated a lot of press, much of it containing an element of FUD about the integrity of PostgreSQL backups (note that Google’s own announcements were quite straight-forward about it).

First, some background: Since PostgreSQL 9.3, it has had page checksums available as an option when initializing a new database. These are fast, lightweight checksums to verify that the data on the page hasn’t been corrupted. It’s an excellent and low-impact data integrity feature, and highly recommended.

However, PostgreSQL doesn’t just randomly check the checksums of pages it is not going to read into memory (and why would it?). Thus, there can be lurking corruption in database that you don’t discover until too late… specifically, after all of the backups also contain it. Some PostgreSQL page-level backup tools, such as pgBackRest, check the checksums during the backup, but other tools (such as WAL-E and WAL-G) do not.

Google’s tool adds to that collection by doing an on-disk scan of each page, and verifying the checksum. That’s all it does. That’s not useless! It’s relevant to backups because doing this scan on a backup image doesn’t create the large I/O burden that doing so on an active server would.

So, if you are using a backup tool that does not verify backups, and that creates a disk image (as opposed to say, pg_dump) by all means consider this new addition as part of your toolbox. But there’s nothing inherent about PostgreSQL backups that makes them more prone to corruption, and some of the unfortunate press around this has given that implication.

Patrick TJ McPhee at 18:07, 18 July 2018:

Note that postgres 11 (currently on its second beta) includes the tool pg_verify_checksums which does exactly the same thing as the google tool. I’m not sure if the two are related or which came first, but the official tool has some spin-offs, notably https://github.com/credativ/pg_checksums, which supports all postgres versions from 9.3 on, and which can be used to turn on checksums for clusters created with checksums turned off.