9 June 2017
11:27
There’s a lot of mystery-cult information floating around out there about what to set shared_buffers to in PostgreSQL. Most of it is, at best, overthinking a fairly simple setting.
You can get 90% of the way there, on 95%+ of PostgreSQL systems, just by setting it to 25% of total system memory, to a maximum of 32GB. (It used to be 8GB, and then 16GB; as processors get faster, the number tends to go up.)
In particular, the difference between 24GB and 32GB, for example, will almost certainly be unmeasurable, on a system with 128GB or more memory.
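For example, on that 128GB machine, one way of setting it (ALTER SYSTEM needs 9.4 or later; editing postgresql.conf by hand works on any version, and either way it takes a restart):
-- 25% of 128GB is already past the 32GB cap, so just use the cap.
ALTER SYSTEM SET shared_buffers = '32GB';
-- shared_buffers only changes at server start, so restart afterwards.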
It can be of benefit to lower shared_buffers on systems that have a high write rate, to reduce the total amount of data flushed on one checkpoint. I don’t have analytic values for that (shame on me!), but I’ve seen benefits from reducing it to as low as 2GB on a very high write-rate system.
But don’t agonize.
7 June 2017
15:03
For some reason, this is something that even experienced PostgreSQL people don’t know: What do you do if you’ve forgotten the postgres user’s password, and you have no other superuser available?
The answer is: Bring up PostgreSQL in single-user mode, reset the password, bring it back up in standard mode.
Of course, this requires access to the shell on the machine running PostgreSQL.
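Roughly, as the OS user that owns the data directory (and assuming the data directory is /var/lib/postgresql/data, which is almost certainly not where yours is):
$ pg_ctl stop -D /var/lib/postgresql/data
$ postgres --single -D /var/lib/postgresql/data postgres
backend> ALTER USER postgres WITH PASSWORD 'a-password-you-will-remember';
Exit single-user mode with Ctrl-D, bring the server back up with pg_ctl start, and the new password is in effect.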
28 April 2017
09:00
The slides from my talk at Percona Live 2017, A PostgreSQL Response to Uber, which I also gave as “Why Uber Was (Mostly) Wrong” at PGDay Nordic 2017, are now available.
31 March 2017
09:34
The slides from my presentation, Corruption War Stories, are now available.
09:33
The slides from my presentation, Django and PostgreSQL, are now available.
29 March 2017
07:08
My slides from my PGConf US 2017 tutorial, PostgreSQL When It’s Not Your Job, are available now.
7 February 2017
03:14
The slides for my talk Corruption War Stories are now available.
4 January 2017
12:00
(One of an occasional series of posts about strange errors you can get out of Django, and what it is trying to tell you.)
This exception was being thrown when I was trying to render a template:
dictionary update sequence element #0 has length 1; 2 is required
It turned out to be a slightly bad url() in a urlconf. I had forgotten the name= before the URL’s name. Oops!
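In other words, something shaped like this (the view and URL names are invented, but the mistake is the same):
from django.conf.urls import url

from . import views  # hypothetical views module

urlpatterns = [
    # Broken: the third positional argument to url() is kwargs, which must be
    # a dict, so a bare string here is what produces the exception above.
    #   url(r'^orders/$', views.order_list, 'order-list'),
    #
    # Fixed: the name belongs in the name= keyword argument.
    url(r'^orders/$', views.order_list, name='order-list'),
]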
10:00
So, what’s weird about this plan, from a query on a partitioned table? (PostgreSQL 9.3, in this case.)
test=> explain select distinct id from orders where order_timestamp > '2016-05-01';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=347341.56..347343.56 rows=200 width=10)
Group Key: orders.id
-> Append (cost=0.00..337096.10 rows=4098183 width=10)
-> Seq Scan on orders (cost=0.00..0.00 rows=1 width=178)
Filter: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
-> Index Scan using orders_20160425_order_timestamp_idx on orders_20160425 (cost=0.43..10612.30 rows=120838 width=10)
Index Cond: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
-> Seq Scan on orders_20160502 (cost=0.00..80539.89 rows=979431 width=10)
Filter: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
-> Seq Scan on orders_20160509 (cost=0.00..74780.41 rows=909873 width=10)
Filter: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
-> Seq Scan on orders_20160516 (cost=0.00..68982.25 rows=845620 width=10)
Filter: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
-> Seq Scan on orders_20160523 (cost=0.00..65777.68 rows=796054 width=10)
Filter: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
-> Seq Scan on orders_20160530 (cost=0.00..36403.57 rows=446366 width=10)
Filter: (order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone)
(17 rows)
That estimate on the HashAggregate certainly looks wonky, doesn’t it? Just 200 rows even with a huge number of rows below it?
What if we cut down the number of partitions being hit?
test=> explain select distinct id from orders where order_timestamp > '2016-05-01' and order_timestamp < '2016-05-15';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=169701.02..169703.02 rows=200 width=10)
Group Key: orders.id
-> Append (cost=0.00..165026.92 rows=1869642 width=10)
-> Seq Scan on orders (cost=0.00..0.00 rows=1 width=178)
Filter: ((order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone) AND (order_timestamp < '2016-05-15 00:00:00'::timestamp without time zone))
-> Index Scan using orders_20160425_order_timestamp_idx on orders_20160425 (cost=0.43..10914.39 rows=120838 width=10)
Index Cond: ((order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone) AND (order_timestamp < '2016-05-15 00:00:00'::timestamp without time zone))
-> Seq Scan on orders_20160502 (cost=0.00..82988.46 rows=979431 width=10)
Filter: ((order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone) AND (order_timestamp < '2016-05-15 00:00:00'::timestamp without time zone))
-> Index Scan using orders_20160509_order_timestamp_idx on orders_20160509 (cost=0.42..71124.06 rows=769372 width=10)
Index Cond: ((order_timestamp > '2016-05-01 00:00:00'::timestamp without time zone) AND (order_timestamp < '2016-05-15 00:00:00'::timestamp without time zone))
(11 rows)
Still 200 exactly. OK, that’s bizarre. Let’s select exactly one partition:
test=> explain select distinct id from orders where order_timestamp > '2016-05-14' and order_timestamp < '2016-05-15';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=14669.26..14671.26 rows=200 width=10)
Group Key: orders.id
-> Append (cost=0.00..14283.05 rows=154481 width=10)
-> Seq Scan on orders (cost=0.00..0.00 rows=1 width=178)
Filter: ((order_timestamp > '2016-05-14 00:00:00'::timestamp without time zone) AND (order_timestamp < '2016-05-15 00:00:00'::timestamp without time zone))
-> Index Scan using orders_20160509_order_timestamp_idx on orders_20160509 (cost=0.42..14283.05 rows=154480 width=10)
Index Cond: ((order_timestamp > '2016-05-14 00:00:00'::timestamp without time zone) AND (order_timestamp < '2016-05-15 00:00:00'::timestamp without time zone))
(7 rows)
Still 200 exactly. What happens if we select from the child directly?
test=> explain select distinct id from orders_20160509;
QUERY PLAN
-------------------------------------------------------------------------------------
HashAggregate (cost=74780.41..75059.51 rows=27910 width=10)
Group Key: id
-> Seq Scan on orders_20160509 (cost=0.00..72505.73 rows=909873 width=10)
(3 rows)
A much more reasonable estimate. So, what’s going on?
That 200 should be something of a flag, as it’s a compiled-in constant that PostgreSQL uses when it doesn’t have ndistinct information, usually because no statistics have been collected on the table.
In this case, the issue was that an ANALYZE had never been done on the parent table. This isn’t surprising: autovacuum will never hit that table, since (like most parent tables in a partition set) it has no rows and never gets inserts or updates. That lack of information gets passed up through the Append node, and the HashAggregate just uses the default of 200.
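You can see the hole directly: ANALYZE on a parent table stores a second set of statistics covering the parent plus all of its children, flagged with inherited = true in pg_stats, and a never-ANALYZEd parent won’t have any:
test=> SELECT attname, n_distinct FROM pg_stats WHERE tablename = 'orders' AND inherited;
If that comes back empty, the planner has nothing to work with above the Append.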
Sure enough, when the parent table was ANALYZE’d, the estimates became much more reasonable.
So: It can pay to do an initial ANALYZE on a newly created partitioned table so that the planner gets statistics for the parent table, even if those statistics are “no rows here.”
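In this case, it was nothing more than:
test=> ANALYZE orders;
ANALYZE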
1 January 2017
16:53
I love Django a lot; it still surprises me how productive I can be in it. And, especially in more recent versions, the error handling for weird configuration problems is much, much better than it used to be.
But sometimes, you get an error whose origin is slightly mysterious. Thus, it can be helpful to have a log of what Django says, and what it means.
Today’s is:
Site matching query does not exist
This usually means one of two things:
- You forgot to create the record for the site in your django_site table. If so, create a data migration or fixture!
- You forgot the SITE_ID setting in your settings.py entirely. I did this while moving a project from an old version of Django.
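For the first case, a data migration along these lines does the job (the domain and name are placeholders; adjust for your site):
from django.db import migrations


def create_default_site(apps, schema_editor):
    # Use the historical model from the migration state, not a direct import.
    Site = apps.get_model('sites', 'Site')
    Site.objects.get_or_create(
        id=1,
        defaults={'domain': 'example.com', 'name': 'example.com'},
    )


class Migration(migrations.Migration):

    dependencies = [
        ('sites', '0001_initial'),
    ]

    operations = [
        migrations.RunPython(create_default_site, migrations.RunPython.noop),
    ]
For the second case, it’s one line in settings.py: SITE_ID = 1 (or whatever the id of the row in django_site actually is).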