postgresql when it's not your job

10 July 2024

10:48

Checking Your Privileges, 2

I turned the last blog post into a talk; you can get the slides here.

no comments

25 March 2024

10:00

Checking Your Privileges

The PostgreSQL roles and privileges system can be full of surprises.

Let’s say we have a database test, owned by user owner. In it, we create a very secret function f that we do not want just anyone to be able to execute:

test=> select current_user;
 current_user 
--------------
 owner
(1 row)

test=> CREATE FUNCTION f() RETURNS int as $$ SELECT 1; $$ LANGUAGE sql;
CREATE FUNCTION
test=> select f();
 f 
---
 1
(1 row)

There are two other users: hipriv and lowpriv. We want hipriv to be able to run the function, but not lowpriv. So, we grant EXECUTE to hipriv, but revoke it from lowpriv:

test=> GRANT EXECUTE ON FUNCTION f() TO hipriv;
GRANT
test=> REVOKE EXECUTE ON FUNCTION f() FROM lowpriv;
REVOKE

Let’s test it! We log in as hipriv and run the function:

test=> SELECT current_user;
 current_user 
--------------
 hipriv
(1 row)

test=> SELECT f();
 f 
---
 1
(1 row)

Works great. Now, let’s try it as lowpriv:

test=> SELECT current_user;
 current_user 
--------------
 lowpriv
(1 row)

test=> SELECT f();
 f 
---
 1
(1 row)

Wait, what? Why did it let lowpriv run f()? We explicitly revoked that permission! Is the PostgreSQL privileges system totally broken?

Well, no. But there are some surprises.

Let’s look at the privileges on f():

test=> SELECT proacl FROM pg_proc where proname = 'f';
                 proacl                  
-----------------------------------------
 {=X/owner,owner=X/owner,hipriv=X/owner}
(1 row)

The interpretation of each of the entries is “=/“. We see that owner has X (that is, EXECUTE) on f() granted by itself, and hipriv has EXECUTE granted by owner. But what’s with that first one that doesn’t have a role at the start? And where is our REVOKE on lowpriv?

The first thing that may be surprising is that there is no such thing as a REVOKE entry in the privileges. REVOKE removes a privilege that already exists; it doesn’t create a new entry that says “don’t allow this.” This means that unless there is already an entry that matches the REVOKE, REVOKE is a no-op.

The second thing is that if there is no role specified that, that means the special role PUBLIC. PUBLIC means “all roles.” So, anyone can execute f()! This is the default privilege for new functions.

Combined, this means that when the function was created, EXECUTE was granted to PUBLIC. The REVOKE was a no-op, because there was no explicit grant of privileges to lowpriv.

How do we fix it? First, we can revoke that undesirable first grant to PUBLIC:

test=> REVOKE EXECUTE ON FUNCTION f() FROM PUBLIC;
REVOKE

hipriv can still run the function, because we gave it an explicit grant:

test=> SELECT current_user;
 current_user 
--------------
 hipriv
(1 row)

test=> SELECT f();
 f 
---
 1
(1 row)

But lowpriv can’t skate in under the grant to PUBLIC, so it can’t run the function anymore:

test=> SELECT current_user;
 current_user 
--------------
 lowpriv
(1 row)

test=> SELECT f();
ERROR:  permission denied for function f

So, if you are counting on the PostgreSQL privilege system to prevent roles from running functions (and accessing other objects), be sure you know what the default permissions are, and adjust them accordingly.

no comments

4 October 2023

08:16

“Look It Up: Real-Life Database Indexing” at PgConf.NYC

The slides for my talk “Look It Up: Real-Life Database Indexing” are now available.

1 comment

9 May 2023

10:58

Don’t use ChatGPT to solve problems.

I shouldn’t have to say this, but don’t use ChatGPT for technical advice.

In an experiment, I asked 40 questions about PostgreSQL. 23 came back with misleading or simply inaccurate information. Of those, 9 came back with answers that would have caused (at best) performance issues. One of the answers could result in a corrupted database (deleting WAL files to recover disk space).

LLMs are not a replacement for expertise.

4 comments

3 May 2023

09:55

Running PostgreSQL on two ports

Recently on one of the PostgreSQL mailing lists, someone wrote in asking if it was possible to get PostgreSQL to listen on two ports. The use case, to paraphrase, was that there was a heterogeneous mix of clients, some of which could connect with TLS, some of which couldn’t. They wanted the clients that could use TLS to do so, while allowing the non-TLS clients access.

The simple answer is: Upgrade your non-TLS clients already! But of course the world is a complicated place. And for reasons that weren’t given (but which we will accept for now), it has to be two different ports.

The PostgreSQL server itself can only listen on one port. But there were two options presented that could fix this:

Run pgbouncer with TLS turned on, on a different port, and have it forward the connections to the PostgreSQL server via a local socket.
Run stunnel to listen for TLS connections, and route those to PostgreSQL.

I don’t imagine many people will have this exact situation, but if you do… there are options!

1 comment

2 May 2023

10:23

“Writing a Foreign Data Wrapper” at PGCon 2023

I’ll be speaking about Writing a Foreign Data Wrapper at PGCon 2023 in Ottawa, May 30-June 2, 2023. Do come! It’s the premiere technical/hacker conference for PostgreSQL.

no comments

30 March 2023

01:14

A little more on max_wal_size

In a comment on my earlier post on max_wal_size, Lukas Fittl asked a perfectly reasonable question:

Re: “The only thing it costs you is disk space; there’s no other problem with it being too large.”

Doesn’t this omit the fact that a higher max_wal_size leads to longer recovery times after a crash? In my experience that was the reason why you wouldn’t want max_wal_size to e.g. be 100GB, since it means your database might take a while to get back up and running after crashes.

The answer is… as you might expect, tricky.

The reason is that there are two different ways a checkpoint can be started in PostgreSQL (in regular operations, that is; there’s a few more, such as manual CHECKPOINT commands and the start of a backup using pg_start_backup). Those are when PostgreSQL thinks it needs to checkpoint to avoid overrunning max_wal_size (by too much), and when checkpoint_timeout is reached. It starts a checkpoint on the first of those that it hits.

The theory behind my recommendations on checkpoint tuning is to increase max_wal_size to the point that you are sure that it is always checkpoint_timeout that fires rather than max_wal_size. That in effect caps the checkpoint interval, so larger values of max_wal_size don’t change the checkpoint behavior once it has reached the level that checkpoint_timeout is always the reason a checkpoint starts.

But Lukas does raise a very good point: the time it takes to recover a PostgreSQL system from a crash is proportionate to the amount of WAL that it has to replay, in bytes, and that’s soft-capped by max_wal_size. If crash recovery speed is a concern, it might make sense to not go crazy with max_wal_size, and cap it at a lower level.

Pragmatically, crashes are not common and checkpoints are very common, so I recommend optimizing for checkpoint performance rather than recovery time… but if your system is very sensitive to recovery time, going crazy with max_wal_size is probably not a good idea.

1 comment

23 March 2023

13:21

The importance of max_wal_size

The reality is that most PostgreSQL configuration parameters don’t have a huge impact on overall system performance. There are, however, a couple that really can make a huge difference when tuned from the defaults. work_mem is one of them, and max_wal_size is another.

max_wal_size controls how large the write-ahead log can get on disk before PostgreSQL does a checkpoint. It’s not a hard limit; PostgreSQL adapts checkpoint frequency to keep the WAL on disk no larger than that, but excursions above it can definitely happen. The only thing it costs you is disk space; there’s no other problem with it being too large.

Having max_wal_size too small can cause checkpoints to happen very frequently. Frequent checkpointing is bad for two reasons:

Checkpoints themselves are expensive, since all of the dirty buffers in shared_buffers need to be written out.
The first time a page is changed after a checkpoint, the entire page is written to the WAL rather than just the change. On a busy system, this can be a very significant burst of WAL activity.

Here’s a process to set max_wal_size properly:

First, set the general checkpoint parameters. This is a good start:

checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_compression = on
log_checkpoints = on
max_wal_size = 16GB

Then, let the system run, and check the logs (or any other tools you may have to determine checkpoint frequency). If the checkpoints are happening more frequently than every 15 minutes, increase max_wal_size until they are being triggered by the timeout.

How about min_wal_size? This controls the amount of reserved WAL files that PostgreSQL will retain on disk even if it doesn’t need it for other reasons. This can speed up the WAL slightly, since PostgreSQL can use one of those retained files instead of having to create a new one. There’s no harm in bumping it up (again, all it costs is disk space), but on nearly every environment, the performance impact is small.

1 comment

21 March 2023

07:10

“Real-World Logical Replication” at Nordic PGDay 2023

The slides from my presentation “Real-World Logical Replication” are now available.

no comments

06:48

“Database Antipatterns, and where to find them” at SCaLE 20x

The slides are now available for my talk “Database Antipatterns, and where to find them” at SCaLE 20x.

no comments