Keith's Ramblings…


PostgreSQL Partition Manager


Update June 11, 2015: Version 2.0.0 of pg_partman has been released and changed some of what is discussed below; most significantly, there are no longer distinct static & dynamic partition trigger functions. Please see my updated post, as well as the CHANGELOG included with the extension.

I’ve finally gotten a lot of the planned features implemented for another extension, so it’s about time I wrote a blog post on it. One of the bigger features missing from PostgreSQL that some other database systems have is table partition management. That’s not to say there isn’t a way to do it, or that it isn’t being worked on, but at the current time it’s not the easiest thing to get going from scratch.

The most common methods of partitioning that I’ve seen are either time or serial/id based. There are of course other partitioning methods used, such as alphabetically or via UUID. But those methods are usually a once-and-done setup and don’t require as much on-going maintenance. Time & serial based partitioning require new tables and/or functions to be created indefinitely, and it can be quite a challenge to set it all up and handle the edge cases consistently. So that is what my PG Partition Manager extension does. It currently only handles inserts to the parent table, but I’ll be trying to get update support in the future. It automates several processes involved in partitioning:

  • Parent & child table creation
  • Trigger function creation and updating
  • Dropping unneeded partitions
  • Moving existing data into partitions
  • Undoing partitioning and moving data back into the parent

I went with triggers instead of rules for handling inserts on the parent. Another blog post explains the differences better, so please read that if you’re curious about them. Child tables are always pre-created before they are needed, to handle the race condition involved with triggers creating partitions on the fly.

Static vs Dynamic

The other major difference of methods for partitioning with triggers is whether you statically list each child table in the trigger function or use a dynamic function that determines which child table to insert to based on the data in the statement. The first, static method leads to better performance since the query plan in the function can be cached. However, the more child tables that are added to the conditional statement, the slower performance becomes, so this method is usually limited to partition sets where only recent data is inserted. The dynamic method uses an EXECUTE statement in the function to designate which child table to use on the fly, so it cannot be cached. But, it is capable of handling inserts to any child table in the set at any time. This extension supports both of these methods.
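
To illustrate the trade-off, here is a simplified sketch of the two trigger styles (these are not the exact functions the extension generates; the table and column names are made up for the example):

```sql
-- Static method: each child table is hard-coded in a conditional,
-- so the statements can have cached plans, but only the listed
-- children (recent data) are handled.
CREATE OR REPLACE FUNCTION measurement_part_trig_static() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= '2013-07-08' AND NEW.logdate < '2013-07-09' THEN
        INSERT INTO measurement_p2013_07_08 VALUES (NEW.*);
    ELSIF NEW.logdate >= '2013-07-07' AND NEW.logdate < '2013-07-08' THEN
        INSERT INTO measurement_p2013_07_07 VALUES (NEW.*);
    ELSE
        RETURN NEW;  -- no matching child: let the row land in the parent
    END IF;
    RETURN NULL;  -- row was redirected; suppress the insert on the parent
END;
$$ LANGUAGE plpgsql;

-- Dynamic method: the target child name is built from the row itself
-- and run with EXECUTE, so the plan cannot be cached, but any child
-- table in the set can receive data at any time.
CREATE OR REPLACE FUNCTION measurement_part_trig_dynamic() RETURNS trigger AS $$
BEGIN
    EXECUTE 'INSERT INTO measurement_p'
        || to_char(NEW.logdate, 'YYYY_MM_DD')
        || ' VALUES ($1.*)' USING NEW;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```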

Time & Serial Partitioning

For time-based partitioning, the following time periods are supported: yearly, quarterly, monthly, weekly, daily, hourly, half-hourly and quarter-hourly. If anyone needs decade, century, or a smaller regular time interval, I can probably get that added fairly easily; just put in a request on GitHub. Outside of that, dynamic or non-regular time periods are a bit more challenging, so I don’t know if I’ll ever be doing that. The tables are pre-created using a cronjob to call a maintenance function. The number of child tables to pre-create is configurable. That premake number also designates how many child tables the static partitioning method will handle, both forward and backward in time. So if you partition daily with a premake value of 4 (the default) on July 4th, it will handle inserting data from June 30th to July 8th.
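
Setting up a time-based set is a single function call plus the scheduled maintenance call. A sketch using the 1.x API (the schema and table names here are placeholders):

```sql
-- Create a daily, static partition set on tracking.hits,
-- partitioned by the "logdate" column.
SELECT partman.create_parent('tracking.hits', 'logdate', 'time-static', 'daily');

-- Call this from a cronjob so new child tables keep getting pre-created:
SELECT partman.run_maintenance();
```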

Serial/id-based partitioning is done using a fixed integer interval that each child table will contain, based on an integer column’s value (typically the primary key). Pre-creation is not managed via cron, since it’s not predictable how quickly or slowly a partition will fill up. At the cost of a slight performance hit vs the time-based method, the current partition column value is checked at each insert, and when it reaches 50% of maximum, the next partition is made. Partitions are still pre-made in the same manner, though, so the next partition being created is typically several intervals ahead, avoiding the race condition of the next child table not existing for an insert.
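
The id-based setup looks the same, with the interval given as the number of ids each child will contain (again a sketch with placeholder names):

```sql
-- Each child table of tracking.hits_by_id will contain
-- 100000 values of the "id" column.
SELECT partman.create_parent('tracking.hits_by_id', 'id', 'id-static', '100000');
```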

For both time & serial based partitioning, if for some reason the required child table doesn’t exist, the data is inserted into the parent. A check_parent() function is included so you can monitor all partition sets managed by the extension for data getting put into the parent table.
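
A monitoring check can be as simple as the following; an empty result means every row landed in a child table:

```sql
-- Lists any managed parent tables that contain rows, with their counts.
SELECT * FROM partman.check_parent();
```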

To make configuring indexes, constraints, defaults & permissions easier, all of those table properties are inherited from the parent. Just set what you need on the parent, and all future partitions will obtain those properties. Due to performance concerns, privileges are not retroactively applied to all prior partitions if you change them on an existing partition set. But there is a function included to do that for you when your systems can handle it.
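
When you do want prior partitions brought up to date, the included function can be run on a whole set at once (the function name here matches current versions of the extension; check the docs for yours, and the table name is a placeholder):

```sql
-- Copy the parent's current privileges to every existing child table.
SELECT partman.reapply_privileges('tracking.hits');
```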

Partitioning Existing Data

Tables with existing data can also be partitioned with minimal impact to a production system. When you create a partition set, a brief exclusive lock is taken on the given parent table to add the required trigger and child tables. Any new data inserted from then on is properly partitioned. All existing data is left in the parent table, and an included python script can move that data in batches, creating any child tables that are needed.

To give you an example of how well this works, I successfully partitioned a high-traffic, 74 million row table with zero noticeable interruption. There was about a 2-3 second lock on the parent table, which caused inserts to wait, but they all completed successfully once the new partitions & trigger were in place. I then partitioned the existing data using a 1 hour interval to avoid large, long insert batches, and the 5 second wait option to ease the load on the streaming slave a bit. This averaged 50,000 – 200,000 rows per batch, and the slave never fell more than 100 seconds behind and quickly caught up once it was done.
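
A sketch of what that partition_data.py invocation could look like (the table name is a placeholder, and the flag names are my assumption from the options described above; they may differ in your version of the script):

```shell
python partition_data.py --parent="tracking.hits" --type=time \
    --interval="1 hour" --wait=5
```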

There is also an included python script that can undo a partition set, moving all data from the child tables back into the parent. This script can actually work on any inheritance table setup in PostgreSQL, not just ones managed by the extension.


One of the biggest advantages of partitioning is being able to remove unneeded data without expensive delete commands and the resulting table bloat. Using the extension’s retention configuration options, the same maintenance function that manages pre-making time-based partitions can also automatically drop partitions you no longer need. By default, tables are just uninherited from the parent and not actually dropped, but you can also have it drop just the old child table’s indexes to recover space, move the table to a different schema, or actually drop the table. Moving the table to a new schema also allows the use of another included python script that can use pg_dump to extract the tables in that schema, along with a SHA-512 hash, before actually dropping them from the database. This allows you to keep old data around without having it take up space in the database itself.
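
Retention is configured per partition set in the extension’s configuration table. A sketch (the column names follow the extension’s part_config table, but check the docs for your version, and the table name is a placeholder):

```sql
-- The next run_maintenance() call will handle any child table
-- containing data older than 30 days: here, moving it to the
-- "archive" schema instead of uninheriting or dropping it.
UPDATE partman.part_config
SET retention = '30 days',
    retention_schema = 'archive'
WHERE parent_table = 'tracking.hits';
```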


To provide monitoring of any issues with partition creation, and an audit trail, my pg_jobmon extension works with pg_partman as well. It is optional and will automatically be used if it is available. The premake value was set to 4 by default because pg_jobmon’s default is to alert when a function has failed to run 3 times in a row. That way, out of the box, an alert is raised before the last pre-made partition is used up.


I’ve gotten a lot of positive feedback from people when I’ve given Lightning Talks at recent PostgreSQL conferences about this extension. It’s also made common partitioning solutions for clients a lot easier to implement. Not having partitioning built into PostgreSQL still has its shortcomings, one of the biggest being that primary/unique keys are not enforced across the entire partition set unless they happen to be on the column the partition constraints are based on, along with the lack of constraint exclusion support outside of the partition column. But, along with update support, I’m working on some other options that won’t provide a perfect solution to these issues, but will hopefully make trying to manage them easier.

Written by Keith

July 8th, 2013 at 11:48 am

  • Odd that Postgres can’t “automatically” [internally] do interval partitioning but this is the next best thing, so thanks!

    • keithf4

      It’s being discussed, but isn’t a high priority.
      The thing that needs to be figured out is which method they’re going to use. MySQL, Oracle & SQL Server all use slightly different methods, and each has its good parts and bad. And it’s likely that whatever is built in won’t be able to be as flexible as a specialized third-party tool like this either. But it will hopefully perform a whole lot better.

  • tom_m

    Just wanted to say thank you for your hard work. This is an extremely important feature (partitioning) that is often overlooked. As the whole “big data” thing becomes more commonplace, I feel as though we’ll start seeing these kinds of features take a bigger priority in databases. For now, your work can weather us through, and it’s great. So thanks!

    • keithf4

      Glad you’ve found it useful 🙂

  • Pingback: PG Partition Manager v2.0.0 – Background Worker, Better Triggers & Extension Versioning Woes at Keith's Ramblings…

  • Lorenzo

    Hi Keith, speaking about the p_type parameter, what is the difference between ‘time’ and ‘time-static’ ( “… If you create each type of time-based partitioning (time-static, time-dynamic & time-custom) …” )?
    Thank you, Lorenzo

    • keithf4

      Please see the update note at the top of this post. The distinct static & dynamic partitioning types are no longer relevant as of version 2.0.0 of pg_partman. There is now only “time” or “time-custom”.

      • Lorenzo

        Great, thank you.
        I would ask you few more help because we are the way to use partman without install it as extension. This because we would be able to create and manage partitions from ETL jobs that are installed on different servers (access to different databases) and could be often switched to other databases.

        We embedded an old version of partman (prior to 2.0, because we use the ‘time-static’ type) in our ETL job (executing the partman SQL sources at job runtime, partitioning tables if not already partitioned at job runtime, executing run_maintenance() at job runtime…), and everything works, and we are able to deploy our ETL job everywhere, because it is able to set up everything it needs for partman.
        Do you think we can do the same with the latest partman version (obviously using “time” instead of “time-static”)? I mean, is the SQL code enough (possibly executing a few perl scripts from our ETL job), or does the new partman — since it is a Postgres extension — now need to be installed? (Is compiling the .c mandatory? Or do the .pl scripts need to be copied onto the server because they are used by stored procedures?)

        Thank you for your support, let me know if you prefer I get in contact elsewhere with you.

        • keithf4

          I’m sorry, but I cannot provide any support for using pg_partman when it is not installed as an extension. That is a source of way too many complexities to manage, which the extension system addresses. I would highly recommend finding a way to install it as an extension, especially if you’re embedding code into an automated job. It makes version upgrading tremendously easier, and you know exactly which features are available according to the version number installed.

          Compiling the C code is only necessary if you want to have maintenance automatically managed for you with a background worker. If you don’t need that you can pass NO_BGW=1 to the make command and it will skip any C compilation.