The Build

14 January 2016


Doing a bulk merge in PostgreSQL 9.5

I gave a talk earlier this week about the new features in PostgreSQL 9.5. The headline feature, of course, is the new INSERT ... ON CONFLICT clause, aka “Upsert,” which allows an INSERT that has a key conflict to do an UPDATE (or just ignore the insert) instead.

A perfectly reasonable question was: “Does this work on COPY“? And the answer is no, COPY doesn’t have an ON CONFLICT clause. But the answer is actually yes, because you can get these semantics without too much trouble using things you already have around the house.

First, note that INSERT can take a SELECT statement as the source of the records to import:

INSERT INTO t(pk, col1, col2)
   SELECT pk, col1, col2 FROM t2
      SET col1 = EXCLUDED.col1,
          col2 = EXCLUDED.col2;

So, of course, you could COPY the import file into a secondary table like t2 above, and then do the INSERT.

But we don’t have to! We can use the file_fdw foreign data wrapper to “mount” the text file just like it was a local table.

First, we need to create the file_fdw in the database (it is part of the PostgreSQL distribution);


… and a “server” to go with it:


… and then “mount” the table in the local database:

   pk uuid,
   col1 integer,
   col2 text
) SERVER import_t
  OPTIONS ( filename '/path/to/file/t.csv', format 'csv' );

And then we can do the INSERT operation just as above.

So, we have proper MERGE-type operations in PostgreSQL, right out of the box!

4 January 2016


Always Do This #3: Log Lock Waits and Temp Files

How much and what to log in PostgreSQL is something that doesn’t really admit a single solution. Logging every connection can swamp a server, as can too low a setting of log_min_statement_duration. But there are two settings I always turn on: log_lock_waits, and log_temp_files (with logtempfiles being set to 0).

log_lock_waits will log any lock wait that goes on longer than the deadlock_timeout setting (the same process that checks for deadlocks also emits the log message). By default, this is one second, and if my database has a lock on which a process is waiting for that long, I want to know about it.

log_temp_files is a size over which a log message will be generated for the temp file creation. Zero is everything. Temp files are (almost always) bad; any time the system needs to do something on disk instead of in memory, I want to know about it.

So, just turn these on, and feed the results through pgbadger. If you are getting so many lock waits or temp files that you are getting excessive logging, the problem is the waits or the temp files, not the logging!


January SFPUG: What’s New in PostgreSQL 9.5?

I’ll be presenting at the January San Francisco PostgreSQL Users’ Group meeting on what’s new in PostgreSQL 9.5. I hope you can join us!

21 December 2015


Always Do This #2: ssl_renegotiation_limit = 0

At the point that there are two separate warnings advising you to turn off a configuration parameter in postgresql.conf, it’s probably a good idea to take the advice and disable it.

In theory, this parameter sets a maximum amount of data that will flow over an SSL connection before key renegotiation, to prevent an eavesdropping attacker from determining the session key through collection of a large amount of ciphertext. In practice, it just causes broken connections and miscellaneous problems. Turn it off, especially in situations where you have funky networking and long-standing SSL connections (such as between a primary and secondary).

17 December 2015


Always Do This #1: Create New Clusters with Checksums

(First in a series of things I always do when setting up or configuring PostgreSQL.)

Since version 9.3, PostgreSQL has had the option to create new database clusters with data checksums. These are fast, simple checksums to verify that the data coming off of disk is what PostgreSQL expects it to be. The checksums don’t correct errors, but it can be a warning that something bad has happened to your data.

Always initialize new clusters with them turned on, unless you are running on a file system (like ZFS) that does checksumming itself. You have uncorrected errors on your disk, so you might as well find them.

Turning them on is the –data-checksums (-k) option to initdb. If you are using Debian packaging, you can set this in the /etc/postgresql-common/createcluster.conf file.

18 November 2015


“The PCI-Compliant Database” at PGConfSV

I’ll be speaking about “The PCI-Compilant Database” at PGConf Silicon Valley!

30 October 2015


Don’t Assume PostgreSQL is Slow

You can’t build a real-life system without caching.

That being said, it’s often the case that parts of the system you think are going to be slow aren’t. I’ve noticed a tendency to build out a huge stack of components (“we’ll have PostgreSQL, and Redis, and Celery, and Varnish, and…”) without actually measuring where the bottlenecks are.

Example: A counter.

Suppose you need a global counter for something. It needs to be super-fast, and available across all of the web front ends. It’s not transactional (you never “uncount” based on a rollback), but you do want it to be persistent.

Option 1: Drop Redis into the stack, use INCR to keep the counter, and have some other process that reads the counter and spills it into PostgreSQL, then have some other process that picks up the count when Redis starts and initializes it (or be smart enough to read from both places and add them when yo need it), and accept that there are windows in which you might use counts.

Option 2: Use SERIAL in PostgreSQL.

But option 2 is really really really slow compared to super-ultra-fast Redis, right?

Not really (test on an Amazon i2-2xlarge instance, client over local sockets, Python client):

So: Slower. 6.8 microseconds per increment slower. And no elaborate Redis tooling.

So, build for operation, apply load, then decide what to cache. Human intuition about what might be slow is almost certainly wrong.


ProTip: Digital Ocean, Please Don’t Do This.

Digital Ocean, who I assume are very nice people and meant well, did a Hacktoberfest event in which people were encouraged to submit a pull request to any open source GitHub project. In exchange, “contributors” would get a t-shirt.

You can probably guess what happened:

sigmavirus24: Hey @digitalocean your October pull request event has done nothing but wreak havoc on projects I maintain. I’ve had to reject too many PRs

Reviewing pull requests is a lot of work. Getting buried under trivial pull requests by people who are only after a t-shirt is a great way to get people to take projects private. Please don’t use open source projects as free marketing.

8 October 2015


UUID vs BIGSERIAL for Primary Keys

SERIAL (32 bit integer) or BIGSERIAL (64 bit integer) are the first choice for most people for a synthetic primary key. They’re easy, they’re comprehensible, and they’re transaction-safe. The values that come out of them are, at least to start, manageable and human-scale. They can also provide an easy sortation on creation order.

They’re not perfect, though: If you have to merge together two tables that were generated using SERIALs, you have a massive key update ahead of you to avoid conflicts. If you use SERIAL, exhausting the range is a possibility. If you have a sharded database, you need some way of keeping the sequences independent, such as different starting offsets (but what if you get the offset wrong?) or creating them using different increments (but what if you add another server)

A good alternative is using UUIDs, generated using the uuid_generate_v4() function in PostgreSQL’s uuid-ossp contrib module. This makes mergers much easier, and guarantees independence across multiple tables.

But UUIDs are 128 bits, not 64, and require a function call to generate. How much of a problem is that, really?

As a test, I created a table with a single primary key column and a single float field:

   id <type>,
   f float,

<type> could be one of three possibilities:

The test inserted 10,000,000 rows into the table. In one run, it did a COMMIT after each INSERT; in the other, a single COMMIT after all INSERTs. This was on PostgreSQL 9.4.4 on an AWS i2.2xlarge instance, with the two SSDs in a RAID-0 as the database volume.

The results were:

COMMITing after each INSERT:

column typetime (s)size (MB)


column typetime (s)size (MB)

Overall, the INSERT time for UUIDs was slightly longer than that for BIGSERIAL, but not appreciably. The BIGINT column was notably slower, due to the PL/pgSQL function generating the new keys.

The UUID tables were bigger, of course, although this is an extremely example in that the primary key was only one of two fields in the table; more realistic tables with more columns would not show the same percentage increase.

The conclusion I draw is that it is fine to use UUIDs unless you are faced with a very tight INSERT performance requirement; they are surprisingly efficient compared to BIGSERIAL. My supposition is that the increased computation for the UUID is balanced agains the I/O to maintain the SERIAL.

10 September 2015


Beyond the B-Tree: GIST and GIN Indexes

I was honored to be invited to give a presentation at the Austin PostgreSQL Users’ Group Meetup, and the slides for my presentation Beyond the B-Tree are now available.

« Older Entries

Newer Entries »