Keith's Ramblings…

"Fixing a complex system sounds pretty easy when you're an idiot."

Archive for February, 2014

Partitioning With Custom Time Intervals

with 2 comments

This has been something I’ve had in the works since I started on this extension, but it took a while to settle on the best method to use. The pre-determined time ranges for PG Partition Manager were done because it was easy and, as it turns out, provide a very noticeable performance improvement when used with the static partitioning method. Note that custom time partitioning, while the most flexible, is the least performant of the 3 methods now supported with this extension.

The biggest roadblock to implementing this was trying to find an answer to the following question:

If an interval can be any value, how do you determine the partition that the currently inserted timestamp goes into?

Seems a simple enough question, but when you try and implement it, you realize the impact solving it can have on the potential performance of every insert to the table. My solution in the end (which I was trying to avoid) was to create a lookup table. Thankfully the new range type added in PostgreSQL 9.2 made this a whole lot easier than it could’ve been, but it does mean this feature is only supported on the latest versions. If anyone has an efficient solution as an algorithm that can be implemented solely in the partitioning function, I’d be extremely grateful for a contribution.

Until then, pg_partman 1.6.0 hopefully now supports any interval value from 1 second up to the minimum and maximum values for the timestamp type that PostgreSQL supports. Less than that is potentially possible, but even a partitioning interval of 1 second seemed ridiculous enough to me, so I doubt I’m going to pursue it. And I say hopefully because it’s impossible for me to make unit tests for every possible interval there is. I test for intervals less than 1 minute (due to the new child table naming pattern) and greater than 100 years (due to the seemingly non-standard century partition boundaries (2001, 2101, 2201, etc)) so hopefully that will be enough for now. Would greatly appreciate feedback.

This version also bring the ability to manually designate your starting partition at creation. Be aware that if you’re using static partitioning, the trigger function on the parent may not cover all the partitions this could create. The partitioning functions (partition_data_time() & partition_data_id()) will create these tables automatically when you migrate data out of the parent. But if you’re starting from scratch, I thought this might make things easier so if you’re loading data externally, you can have the tables already set to have data loaded directly into them. Below is an example of using both the new “time-custom” type and setting a custom start partition (by default it would’ve only created 4 partitions behind CURRENT_DATE)

I’ve gotten a lot of positive responses (and bug fixes!) from the community so far and am glad that so many people are finding this a useful tool. It’s nice to see that the effort has been worth it.

Written by Keith

February 13th, 2014 at 11:34 am

Posted in PostgreSQL

Tagged with , ,