Keith's Ramblings…

Archive for December, 2016

PostgreSQL 10 Built-in Partitioning

with 3 comments

Since I have a passing interest in partitioning in PostgreSQL, I figured I’d check out a recent commit to the development branch of PostgreSQL 10

Implement table partitioning – https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63

After many years of waiting, one of the major features missing from PostgreSQL is finally getting its first major step forward with the inclusion of a built in partitioning option. The syntax and usage is fairly straight forward so let’s jump straight into it with the examples from the documentation (slightly modified)

The basic syntax has two parts, one for initially creating a partitioned set

And another for adding a child table to that partition set

A ranged partition set is likely the most common use-case and is what pg_partman provides for time & id partitioning. Each child table is given a specific upper and lower bound of data. Once nice thing is that it can support multiple columns as well.

You can see that unlike all partitioning methods up until now, there is no user-visible trigger involved. All the data routing is handled internally which greatly simplifies setting things up and opens up greater opportunities for optimization. Also, constraints and some other properties are automatically inherited from the parent. But since indexes cannot be defined on the parent, they have to be defined per child. Above you can see you can define different defaults per child as well as other properties.

The other partitioning type is LIST and this is for explicitly defining values to go into specific children.

Here you can see the default of the sequence is inherited and an example of defining a constraint on the child. So what happens if you try to insert data and a child table is not defined for it?

Invalid data is currently rejected. It also seems to have incremented the sequence as well (I re-ran this again to be sure). Currently I’m not sure if there’s a way around the data rejection, and defining a trigger with a exception handler is not a good idea. As I found out the hard way in pg_partman, just the existence of an exception block in a trigger will also increase xid consumption since every row inserted will increment the global xid value. If this table has a high insert rate, you can quickly reach xid wraparound. I made a suggestion to the core devs to define a child table as the default and have any data that doesn’t match a child table go there instead.

Sub-partitioning is also possible and you can easily mix and match both range and list partitioning types. Here we redefine the cities_west table to be partitioned as well

Global indexes do not yet exist in PostgreSQL, so the issue of primary/unique keys being enforced across a partition set is still an issue. And while you can define foreign keys on individual children to other tables, defining them back to a partitioned set will not work. Also, foreign keys cannot be defined on the parent table with this new feature, so they’ll have to be set per child. Thankfully they just return an error if you try any of these things instead of allowing them and causing confusion the way most current partition setups do.

To get rid of a partitioned set, you’d use the CASCADE option just like with an inheritance set

Currently the TRUNCATE command run on the parent table will truncate the entire partition set, but I did see discussion on the mailing lists of a WHERE clause to only target specific children. That would be a great feature, but even better would be a WHERE clause to the DROP TABLE command to make retention maintenance to drop old tables much easier.

You may have noticed there’s no mention of ongoing creation of new partitions when needed. This is where I see pg_partman still being useful at this time. I’ve only just barely started playing with this feature to see how I can incorporate it. My main method of determining what characteristics were inherited to children (indexes, constraints, defaults, FKs etc) was done by defining them on the parent, but the current limitations on what can be defined on the parent with this new feature make this trickier to handle. Whether I’ll keep my old methods and just add this on as a new feature or do another major code revision will have to wait to be seen. It will all depend if I can find a way to implement all existing features I’ve added to pg_partman since I’d rather not take any steps backwards if I don’t have to. Reading the relevant threads on the mailing lists, it seems they have a lot more features for partitioning to be added in, so I may wait a bit before diving into incorporating this into pg_partman too tightly since I may over complicate things on my part due to a current lack of features. This email from Robert Haas gives me hope this this will be a well maintained core feature – https://www.postgresql.org/message-id/CA%2BTgmobTxn2%2B0x96h5Le%2BGOK5kw3J37SRveNfzEdx9s5-Yd8vA%40mail.gmail.com

From the moment I started writing pg_partman, I knew built-in partitioning wasn’t too far around the corner. But we build the tools that we need at that time. I’m looking forward to seeing how this feature develops and I hope I can help out where possible to overcome any limitations encountered. Many thanks to Amit Langote and the review team!

Written by Keith

December 12th, 2016 at 11:48 am

Posted in PostgreSQL

Tagged with , ,