postgresql when it's not your job

00:00

A foreign key pathology to avoid

18 January 2023

There’s a particular anti-pattern in database design that PostgreSQL handles… not very well.

For example, let’s say you are building something like Twitch. (The real Twitch doesn’t work this way! At least, not as far as I know!) So, you have streams, and you have users, and users watch streams. So, let’s do a schema!

CREATE TABLE stream (stream_id bigint PRIMARY KEY);

CREATE TABLE "user" (user_id bigint PRIMARY KEY);

CREATE TABLE stream_viewer (    
    stream_id bigint REFERENCES stream(stream_id),    
    user_id bigint REFERENCES "user"(user_id),   
    PRIMARY KEY (stream_id, user_id));

OK, schema complete! Not bad for a day’s work. (Note the double quotes around "user". USER is a keyword in PostgreSQL, so we have to put it in double quotes to use as a table name. This is not great practice, but more about double quotes some other time.)

Let’s say we persuade a very popular streamer over to our platform. They go on-line, and all 1,252,136 of our users simultaneously log on and start following that stream.

So, we now have to insert 1,252,136 new records into stream_viewer. That’s pretty bad. But what’s worse is now we have 1,252,136 records with a foreign key relationship to a single record in stream. During the operation of the INSERT statement, the transaction that is doing the INSERT will take a FOR KEY SHARE lock on that record. This means that at any one moment, several thousand different transactions will have a FOR KEY SHARE lock on that record.

This is very bad.

If more than one transaction at a time has a lock on a single record, the MultiXact system handles this. MultiXact puts a special transaction ID in the record that’s locked, and then builds an external data structure that holds all of the transaction IDs that have locked the record. This works great… up to a certain size. But that data structure is of fixed size, and when it fills up, it spills onto secondary storage.

As you might imagine, that’s slow. You can see this with lots of sessions suddenly waiting on various MultiXact* lightweight locks.

You can get around this in a few ways:

Not many systems have this particular design issue. (You would never actually build a streaming site using that schema, just to start.) But if you do, this particular behavior is a good thing to avoid.

Comments are closed.