Welcome to my blog where I ramble on about PostgreSQL and other random things…

PostgreSQL Extension Code Organization

I was going to title this “Why is my extension failing to install in 9.3?”, but I figured a more general post about how I’ve been organizing my extension code would be more helpful. But I will also be answering that question along the way since my organization method provided a solution for me.

For extension installation, the code that installs objects within the database must be in a file in the format extension–version.sql. If your extension is primarily C code with just a few functions to provide calling methods, the SQL is usually pretty easy to maintain in a single file. You can organize the pre-compiled source code as needed and the Makefile can then designate the files that get compiled and installed during installation. But if your entire extension is PL/pgSQL (or any of the other scripting languages), it can really start to get hard to manage all that code in a single file, which is what I see most people doing.

Table Partitioning & Long Names

While working on a way to reapply indexes to existing partitions in my partition management extension, it dawned on me that PostgreSQL’s default object name length limit of 63 characters could cause some interesting issues for child table names when setting up partitioning. Since most partitioning relies on a suffix on the end of the child table names for uniqueness and to easily identify their contents, the automatic truncation of a long child name could really cause some big problems. So I set about seeing if I could handle this with pg_partman.

PostgreSQL Partition Manager

Update June 11, 2015: Version 2.0.0 of pg_partman has been released and changed some of what is discussed below, most significantly there are no longer distinct Static & Dynamic partition trigger functions. Please see my updated post, http://www.keithf4.com/pg-partition-manager-v2-0-0, as well as the CHANGELOG included with the extension.

I’ve finally gotten a lot of the planned features implemented for another extension, so it’s about time I wrote a blog post on it. One of the bigger features missing from PostgreSQL that some other database systems have is table partition management. That’s not to say there isn’t a way to do it, or that it isn’t being worked on, but at the current time it’s not the easiest thing to get going from scratch.

Mimeo – Repulling Incremental Replication Data

With other per-table replication methods, if the source and destination become out of sync, you typically have to repull the entire table. One of the nice things about using an incremental based replication method (based on incrementing time or serial number; see previous blog post) is that it can make repulling smaller batches of that data much easier.

One of our clients had been having some issues with their website hits tracking table. Some of the hits had been missed via the normal tracking method and the had to be re-obtained via other means and re-inserted into the hits tracking table on production. This table is also replicated to a data warehouse system for reporting. Since this table uses incremental replication based on time, the old data that was reinserted to the source with the old timestamp values would never make it to the reporting database on its own.

PostgreSQL Extension Updates & Preserving Privileges

My latest update of Mimeo required me to do something that I knew I would eventually need to do: drop and recreate a function in an extension without breaking the original privileges it had. If the code within a function changes, a CREATE OR REPLACE wouldn’t affect privileges. But when you need to change the parameters (not overload it) or return type of a function, it must be dropped and recreated.

Mimeo - DML Replication

For the last introductory post about mimeo, I’ll be talking about DML replication (previous posts here and here). This is the most common way replication is done on a per table basis (at least that I’ve seen). Typically, a trigger is placed on the source table that tracks all changes (INSERTS, UPDATES & DELETES) and then some mechanism is used to replay those statements on the destination.

For mimeo, this is done with a queue table that just contains the primary key columns to note that a change was done to that row. The trigger places the primary key values into a queue table (also located on the source system) and then mimeo reads the queue table values to replay them on the destination. Saying that the statements are just replayed on the destination is really simplifying things though. While that is technically a legitimate way to replicate table changes, it is far from the most efficient. What mimeo actually does is