The importance of max_wal_size
23 March 2023
The reality is that most PostgreSQL configuration parameters don’t have a huge impact on overall system performance. There are, however, a couple that really can make a big difference when tuned away from the defaults.
work_mem is one of them, and max_wal_size is another.
max_wal_size controls how large the write-ahead log can get on disk before PostgreSQL does a checkpoint. It’s not a hard limit; PostgreSQL adapts checkpoint frequency to keep the WAL on disk no larger than that, but excursions above it can definitely happen. The only thing it costs you is disk space; there’s no other problem with it being too large.
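To see how the WAL on disk compares with the configured ceiling, something like this works on PostgreSQL 10 and later (pg_ls_waldir requires superuser or membership in pg_monitor):

```sql
-- The configured ceiling
SHOW max_wal_size;

-- The WAL actually on disk right now
SELECT pg_size_pretty(sum(size)) AS wal_on_disk
FROM pg_ls_waldir();
```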
A max_wal_size that is too small can cause checkpoints to happen very frequently. Frequent checkpointing is bad for two reasons:
- Checkpoints themselves are expensive, since all of the dirty buffers in shared_buffers need to be written out.
- The first time a page is changed after a checkpoint, the entire page is written to the WAL rather than just the change. On a busy system, this can be a very significant burst of WAL activity.
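On PostgreSQL 14 and later, the pg_stat_wal view breaks out those full-page images, so you can see how much of your WAL volume they account for:

```sql
-- wal_fpi counts full-page images; wal_records counts all WAL records
SELECT wal_records,
       wal_fpi,
       pg_size_pretty(wal_bytes) AS wal_written,
       stats_reset
FROM pg_stat_wal;
```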
Here’s a process for setting max_wal_size.
First, set the general checkpoint parameters. This is a good start:
```
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_compression = on
log_checkpoints = on
max_wal_size = 16GB
```
Then, let the system run, and check the logs (or any other tools you may have to determine checkpoint frequency). If the checkpoints are happening more frequently than every 15 minutes, increase max_wal_size until they are being triggered by the timeout.
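Besides the logs, checkpoint trigger counts are available in the statistics views. On PostgreSQL 16 and earlier they are in pg_stat_bgwriter (in version 17 they moved to pg_stat_checkpointer); if checkpoints_req dominates, checkpoints are being forced by WAL volume and max_wal_size is still too small:

```sql
-- timed = triggered by checkpoint_timeout; req = triggered by WAL volume
SELECT checkpoints_timed,
       checkpoints_req,
       stats_reset
FROM pg_stat_bgwriter;
```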
What about min_wal_size? This controls the amount of reserved WAL files that PostgreSQL will retain on disk even when it no longer needs them for other reasons. This can speed up WAL writing slightly, since PostgreSQL can recycle one of those retained files instead of having to create a new one. There’s no harm in bumping it up (again, all it costs is disk space), but in nearly every environment, the performance impact is small.
There is one comment.
Lukas Fittl at 17:19, 29 March 2023:
Re: “The only thing it costs you is disk space; there’s no other problem with it being too large.”
Doesn’t this omit the fact that a higher max_wal_size leads to longer recovery times after a crash? In my experience that was the reason why you wouldn’t want max_wal_size to e.g. be 100GB, since it means your database might take a while to get back up and running after crashes.