The Build


Using Server-Side PostgreSQL Cursors in Django

14 December 2010

This is a follow-up to the previous post, in which we talked about ways of handling huge result sets in Django.

Two commenters (thanks!) pointed out that psycopg2 has built-in support for server-side cursors, using the name option on the .cursor() function.

To use this in Django requires a couple of small gyrations.

First, Django wraps the actual database connection inside of the django.db.connection object, as property connection. So, to create a named cursor, you need:

cursor = django.db.connection.connection.cursor(name='gigantic_cursor')

If this is the first call you are making against that connection wrapper object, it’ll fail; the underlying database connection is created lazily. As a rather hacky solution, you can do this:

from django.db import connection

if connection.connection is None:
    cursor = connection.cursor()
       # This is required to populate the connection object properly

cursor = connection.connection.cursor(name='gigantic_cursor')

You can then iterate over the results using the standard iterator or cursor.fetchmany() method, and that will grab results in from the server in the appropriate chunks.

Thomas G├╝ttler at 03:34, 5 January 2011:

It would be nice to have a ticket in django’s trac system for this issue.

Daniele Varrazzo at 03:59, 8 February 2011:

In the just released psycopg 2.4, iteration on named cursors is more efficient as the results are fetched from the server in chunks.