PostgreSQL

PostgreSQL

All Your GUCs in a Row: default_statistics_target

default_statistics_target is one of the most-recommended and least-explained parameters in PostgreSQL. Tuning guides say “raise it to 500 for data warehouses” with the confidence of scripture and rarely a word about what the number is. The default is 100, the range is 1 to 10,000, the context is user, and what the number actually controls is the

How the Other Half Plans

Statistics are the input. Planning is what the database does with them: it takes a declarative query, which describes what you want and says nothing about how, and turns it into an executable plan, which is nothing but how. There are two jobs inside that. First, rewrite the query into a logically equivalent but more tractable shape, which is where

All Your GUCs in a Row: the debug_* family

Twelve parameters share the debug_ prefix, and the prefix is load-bearing: these are PostgreSQL’s own development and QA apparatus, exposed as runtime settings so the buildfarm and the core developers can exercise code paths without recompiling. Pavlo Golub, writing about one of them, summed up the correct posture: “I’ll never-ever touch a runtime option with a ‘debug’ prefix on my

All Your GUCs in a Row: deadlock_timeout

The name is a small lie. deadlock_timeout is not how long PostgreSQL tolerates a deadlock before breaking it — deadlocks are broken the instant they’re found. It’s how long a process waits on a lock before PostgreSQL bothers to look for a deadlock at all. The default is 1s, the context is superuser, and the gap between what

All Your GUCs in a Row: DateStyle

After the fsync abyss, something genial. DateStyle controls how PostgreSQL prints dates and how it resolves ambiguous date input, and the only thing you really need to know about it is: leave it on ISO, and the rest is interesting trivia.

The default is ISO, MDY. Context is user, with PGDATESTYLE as the environment variable libpq

How the Other Half Counts

PostgreSQL has ANALYZE. You run it (or autovacuum runs it for you), it draws a sample of 300 × default_statistics_target rows, and it writes a row per column into pg_statistic: a null fraction, an n-distinct estimate, a most-common-values list, an equi-depth histogram, and a physical-vs-logical correlation. The planner reads those numbers, multiplies selectivities together, costs a handful of

All Your GUCs in a Row: data_sync_retry

data_sync_retry is a boolean, it defaults to off, and its context is postmaster so changing it needs a restart. You will almost certainly never change it. It exists as the visible scar tissue from the single most unsettling thing the PostgreSQL community ever learned about its own durability assumptions, and to explain a one-line setting we have to explain

All Your GUCs in a Row: data_directory_mode

What is the point of this one? SHOW data_directory_mode reports the Unix permission bits on the data directory — 0700 or 0750 — and that is the entire extent of what it does. It’s read-only; you can’t set it. And it reports a fact that ls -ld $PGDATA would tell you just as well. So why is it a GUC?

All Your GUCs in a Row: data_directory

data_directory names the location of the cluster’s data — the directory people mean when they write $PGDATA, the one holding base/, pg_wal/, global/, and the rest. Context is postmaster: settable in postgresql.conf or on the command line, never at runtime. And like a small handful of others, it cannot be set with ALTER SYSTEM

All Your GUCs in a Row: data_checksums

A read-only preset, like block_sizeSHOW data_checksums tells you whether the cluster has page checksums, and that’s the only interaction the GUC offers. But unlike block_size, this one has a thirteen-year history that’s still being written, and the history is the post.

When checksums are on, every data page carries a checksum that’s written when the page