postgresql when it's not your job

19:34

Django and PostgreSQL “Idle In Transaction” Connections

25 October 2010

A well-known issue that can come up with Django sites running on PostgreSQL is that connections in “Idle in Transaction” state can pile up. There’s a relatively straight-forward fix, but ultimately, it’s due to a bug in Django’s transaction management, at least when PostgreSQL is the back-end.

Let’s run through it.

First, what’s the “Idle in Transaction” state in PostgreSQL? Like the name implies, it’s when a transaction has been opened on a connection to a PostgreSQL server, but no query is running at that moment. Generally, you want to avoid these; they consume considerable system resources, defeat connection poolers (like pgbouncer) running in transaction pooling mode, and can seriously slam the performance of your server.

Django sites seem to be particularly susceptible to this; I’ve encountered client sites building up quite literally hundreds of these connections.

The short reason why is: Django, in its default transaction management scheme, does not issue a COMMIT if it thinks no data modification has occurred on the database. This wouldn’t be a problem if a BEGIN hadn’t been issued, but by default, one is issued by the psycopg2 database interface. Thus, in a read-only transaction, a BEGIN is issued, but no COMMIT, and since the connection stays open (potentially for quite some time), the connection sits in Idle in Transaction more or less indefinitely.

There’s even a bug in the Django issue tracker over it, but there’s a lot of discussion there over what to do about the existing behavior, and how to maintain backwards compatibility. In my opinion, there is no backwards-compatibility to maintain: Any application which relies on transactions staying open a random period of time and then being closed in some arbitrary fashion is essentially depending on uninitialized memory, and shouldn’t be surprised when the decision is made firmly one way or another.

That being said, how can we fix it in existing applications? The easiest way is to fix the bug in the transaction middleware, which I expect most Django applications use, as it provides a very clean and natural transaction model:

from django.db import transaction

class TransactionMiddleware(object):
    """
    Transaction middleware. If this is enabled, each view function will be run
    with commit_on_response activated - that way a save() doesn't do a direct
    commit, the commit is done when a successful response is created. If an
    exception happens, the database is rolled back.
    """
    def process_request(self, request):
        """Enters transaction management"""
        transaction.enter_transaction_management()
        transaction.managed(True)

    def process_exception(self, request, exception):
        """Rolls back the database and leaves transaction management"""
        if transaction.is_dirty():
            transaction.rollback()
        transaction.leave_transaction_management()

    def process_response(self, request, response):
        """Commits and leaves transaction management."""
        if transaction.is_managed():
            if transaction.is_dirty():
                transaction.commit()
            transaction.leave_transaction_management()
        return response

Note that the transaction is only closed if the transaction is marked as dirty. This is the source of the problem: psycopg2 by default opens a transaction, even for read-only operations, but that transaction is never closed. The fix is simplicity itself:

from django.db import transaction

class TransactionMiddleware(object):
    def process_request(self, request):
        transaction.enter_transaction_management()
        transaction.managed(True)

    def process_exception(self, request, exception):
        transaction.rollback()
        transaction.leave_transaction_management()

    def process_response(self, request, response):
        if transaction.is_managed():
            transaction.commit()
            transaction.leave_transaction_management()
        return response

The various decorators, such as @commit_on_success, have the same issue, so you’ll need to use patched versions of those.

Another option is to use AUTOCOMMIT mode. If this is set, Django will tell psycopg2 not to issue the BEGIN, and it will be up to the application to properly manage the BEGIN and COMMIT/ROLLBACK blocks. This is the most detailed and controlled way of handling it, but it does create the possibility of unbalanced transactions.

I believe the correct answer is just for Django to always commit (on normal return) or rollback (on exception), whether or not is_dirty is set, but until and unless that patch is applied, this should greatly reduce the Idle in Transaction problems.

Brett at 20:31, 25 October 2010:

Have you tried this fix in production? I’m not sure that you’d notice a change.

The reason Django only commits if the transaction is dirty is because the connection is closed at the end of each request anyway,

See: http://github.com/django/django/blob/master/django/db/__init__.py#L80

So connections are actually only idle in transaction between request_start and request_end, then a new connection is made and that one can also be idle. It isn’t great, and it’s actually a problem for us at work, but I don’t think your code change actually fixes anything…

chaitany kulkarni at 23:47, 25 October 2010:

Thanks for the details. Same thing may occur when using Zope application server with postgresql.

Daniele Varrazzo at 01:52, 26 October 2010:

I agree with the analysis. Psycopg implements what required from the DBAPI 2.0, which says about ‘commit()’:

[…] Note that
if the database supports an auto-commit feature, this must
be initially off. An interface method may be provided to
turn it back on.

In order to “not autocommit”, a transaction must be started at the first operation in a session. The driver can’t tell apart a SELECT without side effects from an update operation requiring a commit, so it’s up to the application using it to balance each bunch of operation with a related commit()/rollback().

Nic Ferrier at 02:33, 26 October 2010:

We submitted the patch to use autocommit for this some time ago. Unbalanced transactions? I think not. Autocommit is the standard Postgresql way of talking to the db, the reason Django didn’t do that originally seems to be a lack of knowledge about Postgresql and a confused attempt to do something more clever.

I don’t really look at autocommit mode as doing transactions because each operation is separately atomic. The difficult bit of transactions is combining things.

If you look at it like this, where using a transaction is a deliberate step by the programmer, then autocommit makes a lot of sense.

It works for us, we’ve had no problems with this ever since.

Django and PostgreSQL “Idle In Transaction” Connections « imitatio creatio at 03:38, 26 October 2010:

[…] Django and PostgreSQL “Idle In Transaction” Connections. Tagged as: databases, django, postgres, python Leave a comment Comments (0) Trackbacks (0) ( subscribe to comments on this post ) […]

Xof at 21:44, 27 October 2010:

Have you tried this fix in production? I’m not sure that you’d notice a change.

I have, and I do. The point about connections automatically closing on request end is a good one, and I’m not sure why that isn’t accomplishing the same goal. More research required, clearly.

Psycopg implements what required from the DBAPI 2.0 […]

No question that psycopg2 is doing the right thing here. It’s unfortunate that the DBAPI decided to come in at a relatively high level in transaction management, but one does what one must.

We submitted the patch to use autocommit for this some time ago.

This is not directed at the commenter, but i have to say that *hate hate hate* the term “autocommit.” It really conceals far more than it reveals.

In the particular case of the DBAPI and (thus) psycopg2, what “autocommit” does is defeat the automatic transaction open that is otherwise required by the API.

The best answer is to run without an explicit transaction block (i.e., “autocommit”) on read-only operations, and use explicit transaction management for read/write blocks. That’s not a standard Django idiom, I realize, but it’s definitely the best approach to get the best behavior out of the database.

SQL: How problematic are long-lived SQL transactions? - Quora at 17:46, 28 October 2010:

[…] "across" a commit.PS This is a little worrying if you are using Django with PostgreSQL – http://thebuild.com/blog/2010/10…Insert a dynamic date here BIU     @   @ ReferenceEdit […]

Josh Berkus at 11:28, 29 October 2010:

Brett,

I believe that the transactions only get automatically closed when the connection is *garbage-collected*. This is an important distinction because garbage collection happens asynchronously, resulting in IIT transactions persisting for 30-60 seconds after the code is done with them. And when the Django server gets busy, garbage collection gets delayed, as much as 15 or 20 minutes.

I’m not 100% certain of the above, but it would certainly explain the problems I’ve seen at 6 different Django-using sites. The IIT problem with Django seems universal.

Mike Krieger at 03:18, 30 October 2010:

Amazing, amazing series of articles on autocommit. Your help was a lifesaver as we diagnosed an overloaded Postgres server at 3am. Thanks :)

Brett at 10:09, 2 November 2010:

Josh,

Are you saying the transactions are held even after the connection is closed? Django forcibly disconnects Postgres connections after each request finishes, they aren’t just left for the GC to deal with. (See link to code in my first comment)