postgresql when it's not your job

20 April 2012

09:37

Two very cool things.

First, read this essay about the disaster that is PHP. Every word is correct.

Then, view this photo set.

18 April 2012

23:55

Blah, Blah, Blah, First Half of 2012 Edition

I’ll be speaking at the following conferences through July:

13 April 2012

08:00

The Elements of postgresql.conf Style

… or, inexcusable things I am tired of seeing in postgresql.conf files.

Do not mix ‘n’ match override styles.

There are two valid styles for overriding the default values in postgresql.conf: Putting your changes as a cluster at the end, or uncommenting the defaults and overriding in place. Both have advantages and disadvantages. Having some settings one way and some another is pure disadvantage. Do not do this.

Use units.

Quick, what is log_min_duration_statement set to here?

log_min_statement_duration = 2000

Now, what is it set to here?

log_min_statement_duration = 2s

Always use units with numeric values if a unit is available.

Do not remove the default settings.

If you strip out all of the defaults, it becomes impossible to tell what a particular value is set to. Leave the defaults in place, and if you comment out a setting, reset the value to the default (or at least include comments that make it clear what is going on).

Do not leave junk postgresql.conf files scattered around.

If you need to move postgresql.conf (and the other configuration files) to a different location from where the package for your system puts it, don’t leave the old, dead postgresql.conf lying around. Delete any trace of the old installation hierarchy.


Thank you.

12 April 2012

12:09

Instagram’s Technology Stack

Instagram has been in the news lately. In this really great post on Tumblr, Instagram talks about its technology stack.

I have some acquaintance with the Instagram people, and they are among the smartest technologists I’ve met. Really nice, too. (Of course, they mention this blog in the post, so I’m biased.)

19 March 2012

12:33

A Recipe for Django Transactions on PostgreSQL

As noted before, Django has a lot of facilities for handling transactions, and it’s not at all clear how to use them. In an attempt to cut through the confusion, here’s a recipe for handling transactions sensibly in Django applications on PostgreSQL.

The goals are:

The bits of the recipe are:

The quick reasons behind each step:

This recipe a few other nice features:

xact() also supports the using parameter for multiple databases.

Of course, a few caveats:

To use, just drop the source (one class definition, one function) into a file somewhere in your Django project (such as the omni-present utils application every Django project seems to have), and include it.

Examples:

from utils.transaction import xact

@xact
def my_view_function1(request):
   # Everything here will be in a transaction.
   # It'll roll back if an exception escapes, commits otherwise.

def my_view_function2(request):
   # This stuff won't be in a transaction, so don't modify the database here.
   with xact():
      # This stuff will be, and will commit on normal completion, roll back on a exception

def my_view_function3(request):
   with xact():
      # Modify the database here (let's call it "part 1").
      try:
         with xact():
            # Let's call this "part 2."
            # This stuff will be in its own savepoint, and can commit or
            # roll back without losing the whole transaction.
      except:
         # Part 2 will be rolled back, but part 1 will still be available to
         # be committed or rolled back.  Of course, if an exception
         # inside the "part 2" block is not caught, both part 2 and
         # part 1 will be rolled back.

The source is available on GitHub. It’s licensed under the PostgreSQL License.

24 January 2012

23:03

PostgreSQL Performance When It’s Not Your Job

My presentation from SCALE 10x, “PostgreSQL Performance When It’s Not Your Job” is now available for download.

30 September 2011

22:39

“Sharding & IDs at Instagram”

I’d like to recommend an interesting post, “Sharding & IDs at Instagram”, about sharding using Postgres.

27 September 2011

22:25

Cleaning up after your Bucardo goats

If you are not familiar with it already, Bucardo is a nifty multi-master replication system for PostgreSQL, written by Greg Sabino Mullane. Written in Perl, it is great if you need replication that doesn’t have the restrictions associated with PG 9’s streaming replication.

To keep your Bucardo installation clean and tidy, a few regular cron jobs are required. One of them cleans up the archived replicated data (stored in a separate database by Bucardo) once you know you are done with it.

The Bucardo page above has a recommended script using all sorts of bashing, but I wanted something a bit more pure-PostgreSQL; it also doesn’t purge more than one old table at a time. So, I whipped up the following PL/pgSQL function.

(Note that this is for Bucardo 4.4. I haven’t played with the forthcoming Bucardo 5, so I’m not sure if this is still required.)

CREATE OR REPLACE FUNCTION bucardo.purge_frozen_child_qs(far_back interval)
    RETURNS SETOF TEXT AS
$purge_frozen_child_qs$
DECLARE
    t TEXT;
    qt TEXT;
BEGIN

    IF far_back IS NULL THEN
        RAISE EXCEPTION 'Interval cannot be null.'
            USING HINT = 'So, do not do that.';
    END IF;

    IF (now() + far_back) > now() THEN
        RAISE EXCEPTION 'Interval must be negative.'
            USING HINT = 'Consider using the "ago" form of intervals.';
    END IF;

    FOR t IN 
        SELECT tablename 
            FROM pg_tables
            WHERE schemaname='freezer' 
                  AND tablename like 'child_q_%' 
                  AND (replace(tablename, 'child_q_', '')::timestamp with time zone) < now() + far_back::interval
            ORDER BY tablename
    LOOP
        qt := 'freezer.' || t;
        EXECUTE 'DROP TABLE ' || qt;
        RETURN NEXT qt;
    END LOOP;

    DELETE FROM bucardo.q 
        WHERE (started < now() + far_back::interval 
                OR ended < now() + far_back::interval 
                OR aborted < now() + far_back::interval 
                OR cdate < now() + far_back::interval) 
              AND (ended IS NULL OR aborted IS NULL);

    RETURN;

END
$purge_frozen_child_qs$
LANGUAGE plpgsql
    VOLATILE;

To use it, just call it repeatedly from a cron job with the appropriate argument, along the lines of:

SELECT * FROM bucardo.purge_frozen_child_qs('7 days ago'::interval);

It returns the names of the tables it deleted.

This particular function doesn’t need to be run more often than once a day. And it keeps your Bucardo goats nice and clean.

(A “bucardo” is a now-extinct species of goat. For why Bucardo is goat-related, ask Greg.)

26 July 2011

14:14

Unbreaking Your Django Application

My tutorial at OSCON 2011, Unbreaking Your Django Application, is now available for download.

18 May 2011

11:31

Life with Object-Relational Mappers

The slides from my talk at PGCon 2011 are now available.

« Older Entries

Newer Entries »