22:25
Cleaning up after your Bucardo goats
27 September 2011
If you are not familiar with it already, Bucardo is a nifty multi-master replication system for PostgreSQL, written by Greg Sabino Mullane. Written in Perl, it is great if you need replication that doesn’t have the restrictions associated with PG 9’s streaming replication.
To keep your Bucardo installation clean and tidy, a few regular cron jobs are required. One of them cleans up the archived replicated data (stored in a separate database by Bucardo) once you know you are done with it.
The Bucardo page above has a recommended script using all sorts of bash
ing, but I wanted something a bit more pure-PostgreSQL; it also doesn’t purge more than one old table at a time. So, I whipped up the following PL/pgSQL function.
(Note that this is for Bucardo 4.4. I haven’t played with the forthcoming Bucardo 5, so I’m not sure if this is still required.)
CREATE OR REPLACE FUNCTION bucardo.purge_frozen_child_qs(far_back interval)
RETURNS SETOF TEXT AS
$purge_frozen_child_qs$
DECLARE
t TEXT;
qt TEXT;
BEGIN
IF far_back IS NULL THEN
RAISE EXCEPTION 'Interval cannot be null.'
USING HINT = 'So, do not do that.';
END IF;
IF (now() + far_back) > now() THEN
RAISE EXCEPTION 'Interval must be negative.'
USING HINT = 'Consider using the "ago" form of intervals.';
END IF;
FOR t IN
SELECT tablename
FROM pg_tables
WHERE schemaname='freezer'
AND tablename like 'child_q_%'
AND (replace(tablename, 'child_q_', '')::timestamp with time zone) < now() + far_back::interval
ORDER BY tablename
LOOP
qt := 'freezer.' || t;
EXECUTE 'DROP TABLE ' || qt;
RETURN NEXT qt;
END LOOP;
DELETE FROM bucardo.q
WHERE (started < now() + far_back::interval
OR ended < now() + far_back::interval
OR aborted < now() + far_back::interval
OR cdate < now() + far_back::interval)
AND (ended IS NULL OR aborted IS NULL);
RETURN;
END
$purge_frozen_child_qs$
LANGUAGE plpgsql
VOLATILE;
To use it, just call it repeatedly from a cron job with the appropriate argument, along the lines of:
SELECT * FROM bucardo.purge_frozen_child_qs('7 days ago'::interval);
It returns the names of the tables it deleted.
This particular function doesn’t need to be run more often than once a day. And it keeps your Bucardo goats nice and clean.
(A “bucardo” is a now-extinct species of goat. For why Bucardo is goat-related, ask Greg.)
There is one comment.
Greg at 09:51, 28 September 2011:
Thanks, but I didn’t originally write it, Jon Jensen did. But I’ve certainly modified the original version a bit. You are correct to ask about Bucardo 5 – the function won’t be needed, as both the q and the child_q tables are no more. Although bucardo_purge_delta() still needs to be run, the MCP will spawn a process to handle that for you in Bucardo 5. Which means that you should not need any maintenance cronjobs at all.