The Build

30 December 2016

13:50

The Multi-Column Index of the Mysteries

The one thing that everyone knows about compositive indexes is: If you have an index on (A, B, C), it can’t be used for queries on (B) or (B, C) or (C), just (A), (A, B) or (A, B, C), right? I’ve said that multiple times in talks. It’s clearly true, right?

Well, no, it’s not. It’s one of those things that is not technically true, but it is still good advice.

The documentation on multi-column indexes is pretty clear:

A multicolumn B-tree index can be used with query conditions that involve any subset of the index’s columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned.

Let’s try this out!

First, create a table and index:

xof=# CREATE TABLE x ( 
xof(#     i integer,
xof(#     f float,
xof(#     g float
xof(# );
CREATE TABLE
xof=# CREATE INDEX ON x(i, f, g);
CREATE INDEX

And fill it with some test data:

xof=# INSERT INTO x SELECT 1, random(), random() FROM generate_series(1, 10000000);
INSERT 0 10000000
xof=# INSERT INTO x SELECT 2, random(), random() FROM generate_series(1, 10000000);
INSERT 0 10000000
xof=# INSERT INTO x SELECT 3, random(), random() FROM generate_series(1, 10000000);
INSERT 0 10000000
xof=# ANALYZE x;
ANALYZE

And away we go!

xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE f BETWEEN 0.11 AND 0.12;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=599859.50..599859.51 rows=1 width=8) (actual time=91876.057..91876.057 rows=1 loops=1)
   ->  Index Only Scan using x_i_f_g_idx on x  (cost=0.56..599097.71 rows=304716 width=8) (actual time=1820.699..91652.409 rows=300183 loops=1)
         Index Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
         Heap Fetches: 300183
 Planning time: 3.384 ms
 Execution time: 91876.165 ms
(6 rows)

And sure enough, it uses the index, even though we didn’t include column i in the query. In this case, the planner thinks that this will be more efficient than just doing a sequential scan on the whole table, even though it has to walk the whole index.

Is it right? Let’s turn off index scans and find out.

xof=# SET enable_indexonlyscan = 'off';
SET
xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE f BETWEEN 0.11 AND 0.12;
                                                                QUERY PLAN                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=599859.50..599859.51 rows=1 width=8) (actual time=39691.081..39691.081 rows=1 loops=1)
   ->  Index Scan using x_i_f_g_idx on x  (cost=0.56..599097.71 rows=304716 width=8) (actual time=1820.676..39624.144 rows=300183 loops=1)
         Index Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
 Planning time: 0.181 ms
 Execution time: 39691.128 ms
(5 rows)

PostgreSQL, you’re not helping!

xof=# SET enable_indexscan = 'off';
SET
xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE f BETWEEN 0.11 AND 0.12;
                                                                   QUERY PLAN                                                                    
-------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=689299.60..689299.61 rows=1 width=8) (actual time=40593.427..40593.428 rows=1 loops=1)
   ->  Bitmap Heap Scan on x  (cost=513444.70..688537.81 rows=304716 width=8) (actual time=37901.773..40542.900 rows=300183 loops=1)
         Recheck Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
         Rows Removed by Index Recheck: 8269763
         Heap Blocks: exact=98341 lossy=53355
         ->  Bitmap Index Scan on x_i_f_g_idx  (cost=0.00..513368.52 rows=304716 width=0) (actual time=37860.366..37860.366 rows=300183 loops=1)
               Index Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
 Planning time: 0.160 ms
 Execution time: 40593.764 ms
(9 rows)

Ugh, fine!

xof=# SET enable_bitmapscan='off';
xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE f BETWEEN 0.11 AND 0.12;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=641836.33..641836.34 rows=1 width=8) (actual time=27270.666..27270.666 rows=1 loops=1)
   ->  Seq Scan on x  (cost=0.00..641074.54 rows=304716 width=8) (actual time=0.081..27195.552 rows=300183 loops=1)
         Filter: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
         Rows Removed by Filter: 29699817
 Planning time: 0.157 ms
 Execution time: 27270.726 ms
(6 rows)

It turns out the seq scan is faster, which isn’t that much of a surprise. Of course, what’s really fast is using the index properly:

xof=# // reset all query planner settings
xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE i IN (1, 2, 3) AND f BETWEEN 0.11 AND 0.12;
                                                                QUERY PLAN                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=92459.82..92459.83 rows=1 width=8) (actual time=6283.162..6283.162 rows=1 loops=1)
   ->  Index Only Scan using x_i_f_g_idx on x  (cost=0.56..91698.03 rows=304716 width=8) (actual time=1.295..6198.409 rows=300183 loops=1)
         Index Cond: ((i = ANY ('{1,2,3}'::integer[])) AND (f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
         Heap Fetches: 300183
 Planning time: 1.264 ms
 Execution time: 6283.567 ms
(6 rows)

And, of course, a dedicated index for that particular operation is the fastest of all:

xof=# CREATE INDEX ON x(f);
CREATE INDEX
xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE f BETWEEN 0.11 AND 0.12;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=188492.00..188492.01 rows=1 width=8) (actual time=5536.940..5536.940 rows=1 loops=1)
   ->  Bitmap Heap Scan on x  (cost=4404.99..187662.16 rows=331934 width=8) (actual time=209.854..5466.633 rows=300183 loops=1)
         Recheck Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
         Rows Removed by Index Recheck: 8258716
         Heap Blocks: exact=98337 lossy=53359
         ->  Bitmap Index Scan on x_f_idx  (cost=0.00..4322.00 rows=331934 width=0) (actual time=163.402..163.402 rows=300183 loops=1)
               Index Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
 Planning time: 5.586 ms
 Execution time: 5537.235 ms
(9 rows)

Although, interestingly enough, PostgreSQL doesn’t quite get it right here:

xof=# SET enable_bitmapscan='off';
SET
xof=# EXPLAIN ANALYZE SELECT SUM(g) FROM x WHERE f BETWEEN 0.11 AND 0.12;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=203875.29..203875.30 rows=1 width=8) (actual time=2178.215..2178.216 rows=1 loops=1)
   ->  Index Scan using x_f_idx on x  (cost=0.56..203045.45 rows=331934 width=8) (actual time=0.161..2110.903 rows=300183 loops=1)
         Index Cond: ((f >= '0.11'::double precision) AND (f <= '0.12'::double precision))
 Planning time: 0.170 ms
 Execution time: 2178.279 ms
(5 rows)

So, we conclude:

And there we are.

xof=# DROP TABLE x;
DROP TABLE

28 December 2016

20:24

A Cheap and Cheerful Replication Check

On a PostgreSQL primary / secondary pair, it’s very important to monitor replication lag. Increasing replication lag is often the first sign of trouble, such as a network issue, the secondary disconnecting for some reason (or for no reason at all, which does happen rarely), disk space issues, etc.

You can find all kinds of complex scripts that do math on the various WAL positions that are available from the secondary and from pgstatreplication on the primary.

Or you can do this. It’s very cheap and cheerful, and for many installations, it gets the job done.

First, on the primary (and thus on the secondary), we create a one-column table:

CREATE TABLE replication_monitor (
   last_timestamp TIMESTAMPTZ
);

Then, we insert a singel row into the table (you can probably already see where this is going):

INSERT INTO replication_monitor VALUES(now());

Having that, we can start a cron job that runs every minute, updating that value:

* * * * * /usr/bin/psql -U postgres -c "update replication_monitor set last_update=now()" postgres > /dev/null

On the secondary (which is kept in sync with the primary via NTP, so make sure ntpd is running on both!), we have a script, also run from cron, that complains if the value in has fallen more than a certain amount behind now(). Here’s a (pretty basic) Python 2 version:

#!/usr/bin/python

import sys

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres")

cur = conn.cursor()

cur.execute("select (now()-last_update)>'5 minutes'::interval from replication_monitor")

problem = cur.fetchone()[0]

if problem:
    print >>sys.stderr, "replication lag over 5 minutes."

We make sure we get the output from stderr for cron jobs on the secondary, set it up to run every so often, and we’re done!

Of course, this has its limitations:

It has the advantage that it works even if the server is otherwise not taking traffic, since it creates traffic all by itself.

As a final note, check_postgres has this check integrated as one of the (many, many) checks it can do, as the replicate_row check. If you are using check_postgres, by all means use that one!

3 November 2016

06:51

Securing PostgreSQL at PGConf EU

The slides for my talk, Securing PostgreSQL at PGConf EU 2016 are now available.

2 November 2016

03:03

Unclogging the VACUUM at PGConf EU

The slides for my presentation Unclogging the VACUUM at PGConf EU in Tallinn, Estonia are now available.

4 June 2016

21:34

Recent Slides

16 February 2016

11:51

Always Do This #5: The Elements of postgresql.conf Style

Everyone has their own style, unfortunately, on how they edit postgresql.conf. Some like to uncomment specific values and edit them at the point they appear in the default file, some like to tack overrides onto the end… and some do a mixture of those (don’t do that).

My personal preference is to leave everything in the default file commented, and include an overrides file (postgresql.local.conf). That way, I have an easy reference for what the defaults are, and minimum changes when a new version of postgresql.conf lands (you do know that checkpoint_segments is obsolete in 9.5, and you can’t specify it, right?). It’s easy for me to see what I’ve changed, since I can just consult that one included file.

I highly recommend this. But whatever you do, don’t do the “some things in the middle, some things at the end,” please. The next person to edit the file will thank you.

11 February 2016

20:53

Indexes Are Not Cheap

I’ve noticed an increasing tendency in PostgreSQL users to over-index tables, often constructing very complex partial indexes to try to speed up very particular queries.

Be careful about doing this. Not only do additional indexes increase the plan time, they greatly increase insert time.

By way of example, I created a table with a single bigint column, and populated it with:

time psql -c "insert into t select generate_series(1, 100000000)"

That run without any indexes took 1 minute, 55 seconds; that run with eight indexes on the same table took 26 minutes, 39 seconds, or almost 14 times slower.

Regularly consult pg_stat_user_indexes and drop indexes that aren’t being used. Your disk space and insert times will thank you.

9 February 2016

11:46

Django 1.8/1.9 and PostgreSQL: An Ever-Closer Union

I’ll be speaking about Django and PostgreSQL at PyCon US 2016.

8 February 2016

20:46

PostgreSQL High Availability, 2016 Edition

The slides from my talk at PG Day at FOSDEM 2016 are now available.

2 February 2016

19:59

Always Do This #4: Put stats_temp_directory on a memory file system

The PostgreSQL statistics collector generates a lot of very important statistics about the state of the database. If it’s not working, autovacuum doesn’t work, among other problems. But it does generate a lot of write activity, and by default, that goes back onto the database volume.

Instead, always set statstempdirectory to point to a RAM disk (which has to be owned by the postgres user, with 0600 permissions). The statistics are written back to the database on shutdown, so in normal operations, you won’t lose anything on a reboot. (You’ll lose the most recent statistics on a crash, but you will anyway; the statistics are reset on recovery operations, including restart from a crash.)

This can substantially cut down the amount of write I/O the main database storage volume has to receive, and it’s free!

« Older Entries

Newer Entries »