A genuinely useful parameter that almost nobody knows about. Added in PostgreSQL 14, off by default, fixes a specific pathology that anyone who has run analytics workloads has lived through at least once.

The pathology

A user runs an expensive query — a multi-hour analytics scan, a runaway report — and walks away. Or their laptop dies. Or their VPN drops. The TCP connection is gone, but PostgreSQL doesn’t know. The backend continues executing the query at full cost for the full duration, eventually completes, attempts to send the result to a socket that closed two hours ago, and only at that point notices the client is gone. The query then logs a “connection to client lost” error, the resources are released, and the only person who paid for those two hours of CPU and I/O is you.

The pre-14 mitigations were all partial:

  • TCP keepalives detect dead connections at the kernel level, but PostgreSQL only acts on the result when it next interacts with the socket — which doesn’t happen during a query, because the query has nothing to send yet.
  • statement_timeout caps query duration, but it’s a sledgehammer; it doesn’t distinguish “client is still waiting” from “client is dead.”
  • Periodic cleanup scripts that pg_terminate_backend() long-running queries based on heuristics. They work, sort of, when someone remembers to maintain them.

None of these solve the actual problem: the server has no idea the client is gone, so it dutifully completes work that nobody will ever see.

What the parameter does

client_connection_check_interval is the interval at which the backend polls the client socket during query execution to check whether the kernel has reported the connection as closed. A value given without units is taken as milliseconds. If the kernel reports the connection closed, the backend aborts the query, logs “connection to client lost,” and cleans up.

Default is 0 (disabled). Context is user, so it can be set per session, per role, or per database as well as cluster-wide. Supported on Linux, macOS, illumos, and the BSD variants per the PG 18 docs; the original PostgreSQL 14 implementation was Linux-only because it relied on the POLLRDHUP extension to poll(). Windows does not have an equivalent.
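To make the mechanism concrete, here is a minimal Python sketch of the same kind of check on Linux: ask the kernel, via poll() with POLLRDHUP, whether the peer has closed, without reading anything from the socket. This is an illustration only, not PostgreSQL’s code. The helper names peer_has_hung_up and demo are made up for this example, and it assumes a Linux host where Python exposes select.POLLRDHUP.

```python
import select
import socket


def peer_has_hung_up(sock, timeout_ms=0):
    """Return True if the kernel reports that the peer closed its end.

    Nothing is read from the socket; we only ask poll() for hang-up events.
    """
    poller = select.poll()
    # POLLRDHUP is the Linux-specific "peer closed or shut down writing" event.
    poller.register(sock.fileno(), select.POLLRDHUP)
    hangup_bits = select.POLLRDHUP | select.POLLHUP | select.POLLERR
    return any(mask & hangup_bits for _fd, mask in poller.poll(timeout_ms))


def demo():
    """Open a loopback TCP connection, then close the client side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _addr = listener.accept()
    try:
        before = peer_has_hung_up(server_side)  # client is still connected
        client.close()  # clean close: the kernel sends a FIN
        after = peer_has_hung_up(server_side, timeout_ms=1000)
        return before, after
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

The check works here because a clean close() sends a FIN, so the kernel knows immediately. A peer that disappears without sending anything looks identical to an idle one until the kernel’s own keepalive probes fail.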

The polling is cheap. The backend already passes through the query-execution loop frequently; checking a socket descriptor is a single syscall and adds essentially no overhead. The docs and the original patch author were both explicit that the cost is negligible.

So why is the default 0?

Conservatism, primarily. PostgreSQL ships new features off by default until the community has confidence in the operational behavior, which is the same reason jit defaulted to off when it first appeared. The parameter has been available since PostgreSQL 14, and the operational case for enabling it is strong; the default has not yet moved. There is an ongoing conversation in the community about flipping the default, and a reasonable expectation that PG 19 or 20 will do so.

Tuning

For most workloads, turn it on. Reasonable values:

  • 10s — sensible default for OLTP. Cheap, conservative, catches obvious cases.
  • 5s — more aggressive; appropriate for analytics workloads with very long-running queries where you really want to know within a few seconds that the client has disappeared.
  • 1s — the original patch’s default. Fine on modern hardware; overkill for most production.

Don’t enable this and then ignore tcp_keepalives_*; the two settings interact. The PostgreSQL parameter checks what the kernel knows about the socket. A client that closes its connection cleanly is reported right away, but when a laptop dies or a VPN drops, nothing tells the kernel, and it learns that the peer is dead only through keepalive probes. If your TCP keepalive settings have the kernel itself taking five minutes to notice a dead connection, then client_connection_check_interval = 5s doesn’t help: it polls promptly, but the kernel doesn’t have an answer yet. Both layers need to be tuned together for fast detection.
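A back-of-the-envelope way to see the interaction is to add the two delays together. The sketch below is a rough model, not a guarantee: it assumes a client that vanished without closing the connection, an otherwise idle socket, and that the kernel declares the peer dead after the idle period plus every probe going unanswered. The function name worst_case_detection_seconds is made up for this example.

```python
def worst_case_detection_seconds(keepalives_idle, keepalives_interval,
                                 keepalives_count, check_interval):
    """Rough upper bound, in seconds, on noticing a silently dead client.

    Assumed model: the kernel waits keepalives_idle seconds, then sends
    keepalives_count probes spaced keepalives_interval seconds apart. Only
    after the last probe goes unanswered does it mark the socket dead, and
    the backend sees that on its next check, up to check_interval later.
    """
    kernel_seconds = keepalives_idle + keepalives_interval * keepalives_count
    return kernel_seconds + check_interval


if __name__ == "__main__":
    # Stock Linux keepalive defaults (7200s idle, 75s interval, 9 probes)
    # combined with a 5s connection check.
    print(worst_case_detection_seconds(7200, 75, 9, 5))  # 7880
    # Hypothetical tuned values: 60s idle, 10s interval, 3 probes.
    print(worst_case_detection_seconds(60, 10, 3, 5))    # 95
```

In the first case the 5s check contributes almost nothing: the kernel takes over two hours to give up on the peer, and only then does the poll have anything to report.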

The cloud-provider angle: AWS RDS and Aurora set tcp_keepalives_idle = 300 and don’t change client_connection_check_interval from its 0 default, which means dead-client detection on RDS during a long query can take five minutes plus the rest of the query. If you’re on managed PostgreSQL and run long queries, enabling this parameter is one of the higher-leverage configuration changes available to you.

Recommendation: Set client_connection_check_interval = 10s on any cluster that runs queries longer than a few seconds and has clients that might not stick around. The default is wrong for almost every modern workload, the cost is negligible, and the savings on a single abandoned long query pay for the parameter’s entire operational lifetime. This is the parameter you wish you’d known about the last time a runaway analytics query ate a CPU core for six hours after the analyst’s laptop went to sleep.