19 March 2012
As noted before, Django has a lot of facilities for handling transactions, and it’s not at all clear how to use them. In an attempt to cut through the confusion, here’s a recipe for handling transactions sensibly in Django applications on PostgreSQL.
The goals are:
- Database operations that do not modify the database aren't wrapped in a transaction at all.
- Database operations that modify the database are wrapped in a transaction.
- We have fine-grained control over which sections modify the database and which don't.
The bits of the recipe are:
- Turn on the psycopg2 autocommit option in the database settings.
- Leave the transaction middleware out of the middleware list.
- Wrap each database-modifying operation in the xact() decorator/context manager.
The quick reasons behind each step:
- Turning on autocommit prevents psycopg2 from automatically starting a new transaction on the first database operation on each connection; this means that the transaction only starts when we want it to.
- Similarly, the transaction middleware will set the connection state to “managed,” which will defeat the autocommit option above, so we leave it out.
- The xact() decorator will set up the connection so that a transaction is started in the relevant block, which is what we want for database-modifying operations.
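For concreteness, the first two points might look like the following settings fragment. This is a sketch for a Django 1.3-era project; the database name and the middleware list are hypothetical, but the OPTIONS key is the documented autocommit switch for the postgresql_psycopg2 backend.

```python
# Hypothetical settings.py fragment (Django 1.3-era names assumed).
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',                   # hypothetical database name
        'OPTIONS': {'autocommit': True},  # psycopg2 won't open an implicit transaction
    },
}

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    # 'django.middleware.transaction.TransactionMiddleware' is deliberately
    # absent; it would put the connection back into managed mode.
)
```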
This recipe has a few other nice features:
- xact() operates like commit_on_success(), in that it will issue a rollback if an exception escapes from the block or function it is wrapping.
- xact() ignores the dirty flag on the Django connection. Since we're deliberately wrapping stuff that modifies the database with it, the chance of it being dirty is near 100%, and a commit on a transaction that did not modify the database is no more expensive in PostgreSQL than a rollback. It also means you can do raw SQL inside an xact() block without the foot-gun of forgetting to call set_dirty().
- Like the built-in Django transaction decorators, it can be used either as a decorator or as a context manager with the with statement.
- xact() can be nested, giving us nested transactions! If it sees that there is already a transaction open when it starts a new block, it will use a savepoint to set up a nested transaction block. (PostgreSQL does not have nested transactions as such, but you can use savepoints to get 99.9% of the way there.)
- By not wrapping operations that do not modify the database, we get better behavior when using pgPool II (more on that in a future post).
- xact() works around an outstanding bug in Django's transaction handling on psycopg2.
- xact() also supports the using parameter for multiple databases.
Of course, a few caveats:
- xact() requires the postgresql_psycopg2 backend, and PostgreSQL 8.2 or higher. It's possible it can be hacked to work on other backends that support savepoints.
- xact() works just the way you want if it is nested inside a commit_on_success() block (it will properly create a savepoint instead of a new transaction). However, a commit_on_success() block nested inside of xact() will commit or roll back the entire transaction, somewhat defeating the outer xact(). To the extent possible, use only xact() in code you write.
- Be sure you catch exceptions outside of the xact() block; otherwise, the automatic rollback will be defeated. Allow the exception to escape the xact() block, and then catch it. (Of course, if the intention is to always commit and to defeat the rollback, by all means catch the exception inside the block.)
To use, just drop the source (one class definition, one function) into a file somewhere in your Django project (such as the omnipresent utils application every Django project seems to have), and import it:
from utils.transaction import xact

@xact()
def do_something():  # illustrative function name
    # Everything here will be in a transaction.
    # It'll roll back if an exception escapes, commits otherwise.
    ...

def do_something_else():  # illustrative function name
    # This stuff won't be in a transaction, so don't modify the database here.
    with xact():
        # This stuff will be, and will commit on normal completion,
        # roll back on an exception.
        ...

with xact():
    # Modify the database here (let's call it "part 1").
    try:
        with xact():
            # Let's call this "part 2."
            # This stuff will be in its own savepoint, and can commit or
            # roll back without losing the whole transaction.
            ...
    except Exception:
        # Part 2 will be rolled back, but part 1 will still be available to
        # be committed or rolled back. Of course, if an exception inside
        # the "part 2" block is not caught, both parts will be rolled back.
        ...
The source is available on GitHub. It’s licensed under the PostgreSQL License.
24 January 2012
My presentation from SCALE 10x, “PostgreSQL Performance When It’s Not Your Job” is now available for download.
30 September 2011
I’d like to recommend an interesting post, “Sharding & IDs at Instagram”, about sharding using Postgres.
27 September 2011
If you are not familiar with it already, Bucardo is a nifty multi-master replication system for PostgreSQL, written by Greg Sabino Mullane. Written in Perl, it is great if you need replication that doesn’t have the restrictions associated with PG 9’s streaming replication.
To keep your Bucardo installation clean and tidy, a few regular cron jobs are required. One of them cleans up the archived replicated data (stored in a separate database by Bucardo) once you know you are done with it.
The Bucardo page above has a recommended script using all sorts of bash-ing, but I wanted something a bit more pure-PostgreSQL; that script also doesn't purge more than one old table at a time. So, I whipped up the following PL/pgSQL function.
(Note that this is for Bucardo 4.4. I haven’t played with the forthcoming Bucardo 5, so I’m not sure if this is still required.)
CREATE OR REPLACE FUNCTION bucardo.purge_frozen_child_qs(far_back interval)
RETURNS SETOF TEXT AS $$
DECLARE
   t text;
   qt text;
BEGIN
   IF far_back IS NULL THEN
      RAISE EXCEPTION 'Interval cannot be null.'
         USING HINT = 'So, do not do that.';
   END IF;

   IF (now() + far_back) > now() THEN
      RAISE EXCEPTION 'Interval must be negative.'
         USING HINT = 'Consider using the "ago" form of intervals.';
   END IF;

   FOR t IN
      SELECT tablename FROM pg_tables
         WHERE schemaname = 'freezer'
           AND tablename LIKE 'child_q_%'
           AND (replace(tablename, 'child_q_', '')::timestamp with time zone) < now() + far_back::interval
         ORDER BY tablename
   LOOP
      qt := 'freezer.' || t;
      EXECUTE 'DROP TABLE ' || qt;
      RETURN NEXT qt;
   END LOOP;

   DELETE FROM bucardo.q
      WHERE (started < now() + far_back::interval
             OR ended < now() + far_back::interval
             OR aborted < now() + far_back::interval
             OR cdate < now() + far_back::interval)
        AND (ended IS NULL OR aborted IS NULL);

   RETURN;
END;
$$ LANGUAGE plpgsql;
To use it, just call it repeatedly from a cron job with the appropriate argument, along the lines of:
SELECT * FROM bucardo.purge_frozen_child_qs('7 days ago'::interval);
It returns the names of the tables it deleted.
This particular function doesn’t need to be run more often than once a day. And it keeps your Bucardo goats nice and clean.
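A once-a-day cron job for this might look like the following. This is a sketch only: the schedule, database name, and psql invocation are assumptions to adapt to your installation.

```
# Hypothetical crontab entry: purge frozen child_q tables older than a week,
# once a day at 03:30, connecting to a database named "bucardo".
30 3 * * *  psql -d bucardo -qAt -c "SELECT * FROM bucardo.purge_frozen_child_qs('7 days ago'::interval);"
```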
(A “bucardo” is a now-extinct species of goat. For why Bucardo is goat-related, ask Greg.)
26 July 2011
My tutorial at OSCON 2011, Unbreaking Your Django Application, is now available for download.
18 May 2011
The slides from my talk at PGCon 2011 are now available.
28 March 2011
DjangoCon Europe and the Apple WWDC are at the exact same time. This is going to be a tough call.
Update: Well, that was quick. WWDC sold out in 10 hours, while I was dithering.
18 March 2011
To bring everyone up to date:
- Justin Vincent wrote a post offering an opinion about the downsides of the chase of tech entrepreneurs over VC funding.
- Amy Hoy wrote a post expanding on Mr Vincent’s post.
- Alex Payne wrote a post criticizing this position, while finding it necessary to describe “long time acquaintance” Amy Hoy’s product as “duping credulous customers into overpaying for a time-tracking tool styled with this month’s CSS trends.”
- Unaccountably, he seems to have been surprised by the negative reaction this post generated, so he posted an explanation and partial retraction here.
Sadly, I find his last post as incoherent as his first one is vitriolic.
Rather than go through it point by point, the crux of his argument is:
Building a business around maximizing your individual happiness is not particularly useful or admirable. That is my position, and I’m well aware that it may be unpopular with some.
I am pleased to report, then, that Mr Payne has absolutely nothing to worry about, because no business that is built around the happiness of the owner as a primary goal has a hope of ever getting anywhere, unless the business consists of the owner taking money out of one pocket and putting it in the other. Any business, unless it is operating in a grotesquely distorted marketplace, is primarily about pleasing its customers in exchange for their money.
I’m really not sure what these vaguely masturbatory companies Mr Payne is talking about do for a living, but every (successful) micro-business I know of is insanely, intensely focused on pleasing its customers. They have to be, because they don’t have an installed base, government-granted advantages, or (yes) piles of venture capital in the bank to fall back on if they fail to do so.
Mr Payne wants to run a big company. I wish him all the best. He seems to have his young heart in the right place. I have to say, though, that his emotional overreaction to the idea that someone might want to run a micro-business instead strikes me as the Puritan reacting to the idea that someone, somewhere, might be happy.
13 March 2011
Advisory locks are one of the cool unsung features of PostgreSQL. In 9.1, they are getting even cooler with transaction level locks. Many details here.
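For anyone who hasn't used them, a quick sketch (the lock key 42 is arbitrary): session-level advisory locks must be released explicitly, while the 9.1 transaction-level variants are released automatically when the transaction ends.

```sql
-- Session-level: held until explicitly released (or the session ends).
SELECT pg_advisory_lock(42);
SELECT pg_advisory_unlock(42);

-- Transaction-level (new in 9.1): released automatically at COMMIT/ROLLBACK.
BEGIN;
SELECT pg_advisory_xact_lock(42);
COMMIT;  -- lock released here; there is no explicit unlock for this variant
```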
9 March 2011
Suppose a major manufacturer of computer keyboards announced a very serious security problem with a specific competitor’s keyboard: Someone could plug this keyboard into a computer running a malicious app, and cause a user to enter sensitive information. Thus, the manufacturer demands that their competitor recall all of these “insecure” keyboards.
Anyone with the technical sense of a rock would pause for a moment, and then burst out in laughter at the utter absurdity of this proclamation. No one would ever attempt to make such a ludicrous and obviously self-serving claim, would they?
Verifone would. Verifone is very very concerned about Square’s iPhone card scanner, because someone could run a malicious app on the iPhone and collect card data using it. The fact that Square just announced new pricing undercutting Verifone’s is, of course, completely coincidental.
Where to begin?
It is true that Square's attachment does not encrypt the card track information between the iPhone and the card reader. This is true of pretty much every single card reader in the entire world. It is not the job of the card reader to encrypt data, any more than it is the job of the keyboard to encrypt your password. Verifone seems unconcerned about all of the other card readers you can buy from, say, Amazon (just for example).
For Verifone’s apocalyptic scenario to occur, the iPhone into which the card reader is plugged must be running a malicious app. This pretty much requires the iPhone user to be in on the scam, which means that they could be using any hardware they wish to collect this card data. If the merchant is crooked, then they’ll find a way to collect the card data, since they have possession of the card (on which is printed essentially all of the relevant data that is on the mag tracks, plus the CVV printed on the back).
Verifone’s competing solution, if the brochure is to be believed, encrypts the data at swipe-time. That’s nice, but the chance of card data being compromised between the reader and the iPhone, or during that extremely limited time that it is sitting unencrypted in the iPhone’s memory, is essentially zero. Again, Verifone seems unconcerned that Square’s app works exactly like every other PC-based credit card processing application in the entire world; indeed, Square’s is considerably more secure than most, since the merchant doesn’t have access to the card information. (For example, on my completely certified, authorized, and every-spec-compliant Nurit wireless card processing terminal, I can retrieve credit card numbers from a batch with no hassle whatsoever.)
In short, Verifone is bashing a competitor because the competitor’s pricing is more consumer-friendly than Verifone’s. Their technical arguments are nonsense, and they should be ashamed of launching a FUD campaign that plays on credit card security paranoia.