PostgreSQL

What a Data Lake Actually Is (and why you probably don’t need one)

If your engineers have told you that you need a data lake, you should be a little suspicious. Most organizations that build data lakes don’t need them, and a substantial fraction of the ones that do build them end up with what the industry — without any irony — calls a “data swamp.” So before we get to what a

All Your GUCs in a Row: autovacuum_multixact_freeze_max_age

This parameter is the multixact equivalent of autovacuum_freeze_max_age. The mechanism is parallel; the underlying object being protected is not the transaction ID space but the MultiXact ID space, which most PostgreSQL users have never had to think about and the rest learned about during an outage. So before the parameter, the multixact tour.

Managed Postgres, Examined: Amazon Aurora PostgreSQL

Second in a series of dispassionate tours of managed PostgreSQL services. The first covered Amazon RDS for PostgreSQL. Aurora sits one tab over in the same AWS console and is architecturally unlike anything else on the list.

All Your GUCs in a Row: autovacuum_max_workers

autovacuum_max_workers sets the maximum number of autovacuum worker processes that may run simultaneously. Default is 3. Context is postmaster, so changing it requires a restart. The launcher process is separate and not counted against this number.

This is the parameter that gets raised from 3 to 10 by someone who has decided autovacuum is too slow, after which

Failover Slots, Two Years On

For a long time, logical replication and physical standbys did not get along. You could have one or the other surviving a failover, but not both. PostgreSQL 17 finally shipped the machinery to fix this. PostgreSQL 18 did not really add to it. PostgreSQL 19, currently in feature freeze, adds two genuine improvements and one quality-of-life change. The honest question

PgQue: Two Snapshots and a Diff

PgQue shipped its v0.1 last week, and the part I want to talk about is not what the announcement leads with — managed-Postgres compatibility, no C extension, no daemon. Those are real, but they’re packaging. The part that’s worth understanding is the implementation, because PgQue is a working in-database queue whose hot path contains zero UPDATEs, zero DELETEs,

All Your GUCs in a Row: autovacuum_freeze_max_age

This parameter is the last line of defense against PostgreSQL’s most famous failure mode. To explain what it does, a brief detour into how PostgreSQL knows which rows you are allowed to see.

All Your GUCs in a Row: autovacuum_analyze_scale_factor and autovacuum_analyze_threshold

These two are inseparable. They combine in a single formula that decides when autovacuum runs ANALYZE against a table, and discussing one without the other gives you half a picture. So: a double-header.

The formula:

analyze threshold = autovacuum_analyze_threshold
                  + autovacuum_analyze_scale_factor × reltuples
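
To make the arithmetic concrete, here is a minimal Python sketch of that formula. The function names are hypothetical, the defaults (50 and 0.1) are the stock PostgreSQL values for the two parameters, and the strict greater-than comparison is an assumption about how the trigger is evaluated; this is an illustration, not autovacuum's actual code.

```python
# Hypothetical helpers illustrating the analyze-threshold formula.
# Defaults mirror the stock settings: autovacuum_analyze_threshold = 50,
# autovacuum_analyze_scale_factor = 0.1.

def analyze_threshold(reltuples, base_threshold=50, scale_factor=0.1):
    """Number of changed tuples a table must exceed before auto-ANALYZE."""
    return base_threshold + scale_factor * reltuples


def needs_analyze(changed_tuples, reltuples, base_threshold=50, scale_factor=0.1):
    """True when changes since the last ANALYZE exceed the threshold (assumed strict >)."""
    return changed_tuples > analyze_threshold(reltuples, base_threshold, scale_factor)


if __name__ == "__main__":
    # The same defaults give very different absolute thresholds by table size.
    for rows in (1_000, 1_000_000, 1_000_000_000):
        print(f"{rows:>13,} rows -> threshold {analyze_threshold(rows):,.0f}")
```

With the defaults, the scale factor term dominates on any table of real size, so the number of changes required grows in proportion to the row count.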

When the number of tuples inserted, updated, or deleted since

wal_sender_shutdown_timeout: Now Actually a Timeout

If you have ever run pg_ctl stop -m fast on a primary and watched it hang well past wal_sender_shutdown_timeout, you have met a bug that has been sitting in walsender.c for years. As of commit c0b24b3 on master (Fujii Masao, May 1, reported by Andres Freund via FreeBSD CI), it is fixed. PostgreSQL 19 will enforce the timeout. PostgreSQL

All Your GUCs in a Row: autovacuum

autovacuum is a boolean. Default on, context sighup. Set it to off for any meaningful length of time and you have purchased a tour of every PostgreSQL failure mode worth knowing about, in escalating order, at no extra charge. Let me describe the tour.

The autovacuum launcher and its workers run VACUUM and ANALYZE against tables based