postgresql when it's not your job

14:40

Django, PostgreSQL, and Transaction Management

7 November 2009

Django has quite a bit of code in it devoted to transaction management. Although the documentation goes into quite a bit of depth on transactions, I’ve never felt that the docs by themselves let you build a good mental model of how transactions actually work. So, I decided to approach it experimentally: Build a small Django app, and see how the various options actually work.

For all of this, I’m using Django 1.1.1 and PostgreSQL 8.4.1, through the psycopg2 interface.

For the application, we have a models.py:

from django.db import models

class Address(models.Model):
    street_address = models.CharField(max_length=80)
    city = models.CharField(max_length=80)
    state = models.CharField(max_length=2)
    zip = models.CharField(max_length=5)

    def __unicode__(self):
        return self.street_address + " " + self.city + " " + self.state + " " + self.zip

class Order(models.Model):
    customer_name = models.CharField(max_length=80)
    shipping_address = models.ForeignKey(Address)

    def __unicode__(self):
        return "Order " + unicode(self.id) + " going to " + \
            self.customer_name + ", " + unicode(self.shipping_address)

and a views.py:

from django.http import HttpResponse
from djangotrans.example.models import Order, Address
from django.db import transaction, connection

def create_order(request):

    address = Address(street_address="1112 E Broad St", city="Westfield", state="NJ", zip="07090")
    address.save()

    order = Order(customer_name="Gomez Addams", shipping_address=address)
    order.save()

    return HttpResponse("Created: " + unicode(order))


def show_order(request, order_id):
    order = Order.objects.get(pk=order_id)

    return HttpResponse(unicode(order))

and a urls.py:

from django.conf.urls.defaults import *

urlpatterns = patterns('djangotrans.example.views',
    (r'^createorder/$', 'create_order'),
    (r'^showorder/(?P<order_id>\d+)$', 'show_order'),

)

The Basic Rule

In absence of any mention of transaction management at all, Django opens a new transaction with each database operation, and closes it when the operation is complete. If all you do is read operations on the model, it does a ROLLBACK to close the transaction; if you do a data-changing operation, it does a COMMIT. For example, in create_order, the pattern looks something like:

# BEGIN
address = Address(street_address="1112 E Broad St", city="Westfield", state="NJ", zip="07090")
address.save()
# COMMIT

# BEGIN
order = Order(customer_name="Gomez Addams", shipping_address=address)
order.save()
# COMMIT

… and in show_order:

# BEGIN
order = Order.objects.get(pk=order_id)
# ROLLBACK

This is simple, straight-forward… and wrong. The two operations in save_order should be wrapped in a single transaction. If we saved the address, but didn’t then save the order, we’d have an “orphan” address, and that would be just tragic. So, what to do? Fortunately, Django provides a rather large toolbox full of other ways of managing transactions.

Transaction Middleware

First, let’s take a look at the transaction middleware. This is a middleware component that you add to settings.py:

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.transaction.TransactionMiddleware',
)

This component sets a flag in Django turning off the automatic COMMIT/ROLLBACK behavior, and instead does an automatic BEGIN when each request starts, and then a COMMIT (if the view function returns normally and did a data-changing operation) or ROLLBACK (when the view function throws an exception, or returns normally having not modified data.)

Doing this to our extremely sophisticated application above, we get a much more pleasing pattern of transaction management:

# BEGIN
address = Address(street_address="1112 E Broad St", city="Westfield", state="NJ", zip="07090")
address.save()

order = Order(customer_name="Gomez Addams", shipping_address=address)
order.save()
# COMMIT

… and in show_order:

# BEGIN
order = Order.objects.get(pk=order_id)
# ROLLBACK

Much better.

Transaction Decorators

Django provides a set of transaction decorators that can be added to view functions to explicitly control how transactions work for that particular function. They work the same way whether or not the transaction middleware is present. They are:

So, What’s Not to Like?

That seems to be a pretty comprehensive set of transaction control mechanisms. So, what else could one want? Let’s take a look at what is actually sent to the database when we show an order:

LOG:  statement: SET DATESTYLE TO 'ISO'
LOG:  statement: SHOW client_encoding
LOG:  statement: SHOW default_transaction_isolation
LOG:  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
LOG:  statement: SET TIME ZONE E'America/Los_Angeles'
LOG:  statement: SELECT "example_order"."id", "example_order"."customer_name", "example_order"."shipping_address_id" FROM "example_order" WHERE "example_order"."id" = 47 
LOG:  statement: SELECT "example_address"."id", "example_address"."street_address", "example_address"."city", "example_address"."state", "example_address"."zip" FROM "example_address" WHERE "example_address"."id" = 79 
LOG:  statement: ROLLBACK

The transaction here isn’t actually required at all, setting aside the other overhead. Can we get rid of it? In part two, we’ll see how we can.

Django, PostgreSQL, and Autocommit at 19:25, 7 November 2009:

[…] November 2009In part 1, we ran down a list of the standard Django features for controlling transactions. Now, we’re […]

Joshua Jonah at 01:55, 8 November 2009:

Wow, great article. I’ve always wished there was clearer documentation on this. Especially since my db is getting a little busy lately. :D