Custom Time Intervals with Partitioning

Overview

When I first started designing pg_partman for time-based partitioning, it only had preset intervals that users could choose from simply because I didn’t know how to allow arbitrary intervals at the time. When deciding which intervals to choose from I looked at the most popular ones out there that I could find and settled on the following list, largest to smallest:

yearly
quarterly
monthly
weekly
daily
hourly
half-hour
quarter-hour 

Not all of these were particularly popular, but this set of intervals seemed easy enough to start with and I felt covered most of what people would want. Some ended up not being quite as easy as I thought in the end and, for version 5.x, I’d completely gotten rid of these preset intervals now that arbitrary intervals were easy enough to do. Even with allowing any possible interval, two of these intervals still proved a little tricky: weekly and quarterly. But thankfully pg_partman still has options to make these intervals easy and now even more flexible.

Weekly

Weekly partitioning was and still is a fairly popular partitioning interval. When I started working on it, I’d thankfully found the ISO week date standard to allow me to handle the more difficult issues of handling weeks when I’d wanted to label the children with the week number (leap years, starting days, 53 week years). However with declarative partitioning I found an opportunity to allow this to be more flexible when redesigning things for version 5. While this did get rid of the nice weekly numbering pattern I liked for this interval (IYYYwIW which came out to something like “2024w15”), the new method let people start their week on whichever day they desired. However, with flexibility always comes a little more complexity.

When you set your partitioning interval to 1 week in pg_partman, the day that starts that weekly pattern will be whatever day of the week it is when you run create_parent(). So today being a wednesday when I’m writing this blog post, my partition naming pattern AND constraints for the child tables would be as follows:

CREATE TABLE time_stuff(id int GENERATED ALWAYS AS IDENTITY, created_at timestamptz NOT NULL) PARTITION BY RANGE (created_at);

SELECT partman.create_parent('public.time_stuff', 'created_at', '1 week');
 create_parent 
---------------
 t
(1 row)

\d+ time_stuff
                                                      Partitioned table "public.time_stuff"
   Column   |           Type           | Collation | Nullable |           Default            | Storage | Compression | Stats target | Description 
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 id         | integer                  |           | not null | generated always as identity | plain   |             |              | 
 created_at | timestamp with time zone |           | not null |                              | plain   |             |              | 
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240327 FOR VALUES FROM ('2024-03-27 00:00:00-04') TO ('2024-04-03 00:00:00-04'),
            time_stuff_p20240403 FOR VALUES FROM ('2024-04-03 00:00:00-04') TO ('2024-04-10 00:00:00-04'),
            time_stuff_p20240410 FOR VALUES FROM ('2024-04-10 00:00:00-04') TO ('2024-04-17 00:00:00-04'),
            time_stuff_p20240417 FOR VALUES FROM ('2024-04-17 00:00:00-04') TO ('2024-04-24 00:00:00-04'),
            time_stuff_p20240424 FOR VALUES FROM ('2024-04-24 00:00:00-04') TO ('2024-05-01 00:00:00-04'),
            time_stuff_p20240501 FOR VALUES FROM ('2024-05-01 00:00:00-04') TO ('2024-05-08 00:00:00-04'),
            time_stuff_p20240508 FOR VALUES FROM ('2024-05-08 00:00:00-04') TO ('2024-05-15 00:00:00-04'),
            time_stuff_p20240515 FOR VALUES FROM ('2024-05-15 00:00:00-04') TO ('2024-05-22 00:00:00-04'),
            time_stuff_p20240522 FOR VALUES FROM ('2024-05-22 00:00:00-04') TO ('2024-05-29 00:00:00-04'),
            time_stuff_default DEFAULT

I ran these statements on Wednesday, April 24, 2024 so you can see the partition time_stuff_p20240424 with the minimum value of that same day. And each subsequent child table is 7 days later, starting on every Wednesday. So while we’ve accomplished our weekly partitioning goal, this is not a common day to start the week. There is thankfully a very easy solution with pg_partman: we tell it the date to start making partitions. Say we wanted our weeks to start on Sunday. Just pick any Sunday date that would work for child tables we’d like to have initially created

SELECT partman.create_parent('public.time_stuff', 'created_at', '1 week', p_start_partition => '2024-04-17');
 create_parent 
---------------
 t
(1 row)

\d+ time_stuff
                                                      Partitioned table "public.time_stuff"
   Column   |           Type           | Collation | Nullable |           Default            | Storage | Compression | Stats target | Description 
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 id         | integer                  |           | not null | generated always as identity | plain   |             |              | 
 created_at | timestamp with time zone |           | not null |                              | plain   |             |              | 
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240417 FOR VALUES FROM ('2024-04-17 00:00:00-04') TO ('2024-04-24 00:00:00-04'),
            time_stuff_p20240424 FOR VALUES FROM ('2024-04-24 00:00:00-04') TO ('2024-05-01 00:00:00-04'),
            time_stuff_p20240501 FOR VALUES FROM ('2024-05-01 00:00:00-04') TO ('2024-05-08 00:00:00-04'),
            time_stuff_p20240508 FOR VALUES FROM ('2024-05-08 00:00:00-04') TO ('2024-05-15 00:00:00-04'),
            time_stuff_p20240515 FOR VALUES FROM ('2024-05-15 00:00:00-04') TO ('2024-05-22 00:00:00-04'),
            time_stuff_p20240522 FOR VALUES FROM ('2024-05-22 00:00:00-04') TO ('2024-05-29 00:00:00-04'),
            time_stuff_default DEFAULT

We don’t have the partitions prior to our starting date created, but you can just pick an even earlier Sunday if you need more older partitions to start with. You can see that April 17, 2024 is a Sunday and every subsequent child table has its lower boundary on a Sunday as well. So using this method you can start on any day of the week you desire. And simply using the day of the lower boundary for the suffix name the got rid of the complexities of trying to use week numbers.

Quarterly

I’d always liked the idea of quarterly partitioning since it seemed to be a nice balance between larger and smaller partitioning intervals. PostgreSQL did have some limited quarterly timestamp formatting options, but if you go back and look at the partman source code for older versions, you’ll see it was way more complex than I’d expected it to be. And it pretty much locked the quarters into 4 pre-defined month blocks. With version 5.x of pg_partman, I decided to do the same as I did with weekly and simply allow any arbitrary 3 month interval people may want. So while it lost the nicer quarterly suffix pattern (YYYYq#, 2024q2), it’s now much more flexible.

The problem and solution for quarterly is the same as weekly. It’s not quite as bad of a problem in that the child lower boundaries are always rounded to the first of the month, but the quarter will default to start in the month that create_parent() runs. So running in April 2024 results in:

keith=# SELECT partman.create_parent('public.time_stuff', 'created_at', '3 months');
 create_parent 
---------------
 t
(1 row)

keith=# \d+ time_stuff
                                                      Partitioned table "public.time_stuff"
   Column   |           Type           | Collation | Nullable |           Default            | Storage | Compression | Stats target | Description 
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 id         | integer                  |           | not null | generated always as identity | plain   |             |              | 
 created_at | timestamp with time zone |           | not null |                              | plain   |             |              | 
Partition key: RANGE (created_at)
Partitions: time_stuff_p20230401 FOR VALUES FROM ('2023-04-01 00:00:00-04') TO ('2023-07-01 00:00:00-04'),
            time_stuff_p20230701 FOR VALUES FROM ('2023-07-01 00:00:00-04') TO ('2023-10-01 00:00:00-04'),
            time_stuff_p20231001 FOR VALUES FROM ('2023-10-01 00:00:00-04') TO ('2024-01-01 00:00:00-05'),
            time_stuff_p20240101 FOR VALUES FROM ('2024-01-01 00:00:00-05') TO ('2024-04-01 00:00:00-04'),
            time_stuff_p20240401 FOR VALUES FROM ('2024-04-01 00:00:00-04') TO ('2024-07-01 00:00:00-04'),
            time_stuff_p20240701 FOR VALUES FROM ('2024-07-01 00:00:00-04') TO ('2024-10-01 00:00:00-04'),
            time_stuff_p20241001 FOR VALUES FROM ('2024-10-01 00:00:00-04') TO ('2025-01-01 00:00:00-05'),
            time_stuff_p20250101 FOR VALUES FROM ('2025-01-01 00:00:00-05') TO ('2025-04-01 00:00:00-04'),
            time_stuff_p20250401 FOR VALUES FROM ('2025-04-01 00:00:00-04') TO ('2025-07-01 00:00:00-04'),
            time_stuff_default DEFAULT

This does starts on the quarterly months people typically expect but only by coincidence. To start your quarters in whichever month you’d like, simply set the starting partition as we did with weekly. The day doesn’t really matter, just the month.

SELECT partman.create_parent('public.time_stuff', 'created_at', '3 months', p_start_partition => '2024-03-15');
 create_parent 
---------------
 t
(1 row)

\d+ time_stuff
                                                      Partitioned table "public.time_stuff"
   Column   |           Type           | Collation | Nullable |           Default            | Storage | Compression | Stats target | Description 
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 id         | integer                  |           | not null | generated always as identity | plain   |             |              | 
 created_at | timestamp with time zone |           | not null |                              | plain   |             |              | 
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240301 FOR VALUES FROM ('2024-03-01 00:00:00-05') TO ('2024-06-01 00:00:00-04'),
            time_stuff_p20240601 FOR VALUES FROM ('2024-06-01 00:00:00-04') TO ('2024-09-01 00:00:00-04'),
            time_stuff_p20240901 FOR VALUES FROM ('2024-09-01 00:00:00-04') TO ('2024-12-01 00:00:00-05'),
            time_stuff_p20241201 FOR VALUES FROM ('2024-12-01 00:00:00-05') TO ('2025-03-01 00:00:00-05'),
            time_stuff_p20250301 FOR VALUES FROM ('2025-03-01 00:00:00-05') TO ('2025-06-01 00:00:00-04'),
            time_stuff_default DEFAULT

Any Arbirtary Interval

While solving for these two custom intervals isn’t too complicated, I did run into issues with allowing any arbitrary custom interval before 5.0. The issue is with how partman rounds the intervals to give the normally expected lower boundaries depending on the length of the interval: daily rounds to midnight, monthly rounds to the first of the month, etc. What if we wanted to partition by 9 week intervals and we wanted it to start on Mondays beginning with April 22, 2024?

keith=# SELECT partman.create_parent('public.time_stuff', 'created_at', '9 weeks', p_start_partition => '2024-04-22');
 create_parent 
---------------
 t
(1 row)

keith=# \d+ time_stuff
                                                      Partitioned table "public.time_stuff"
   Column   |           Type           | Collation | Nullable |           Default            | Storage | Compression | Stats target | Description 
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 id         | integer                  |           | not null | generated always as identity | plain   |             |              | 
 created_at | timestamp with time zone |           | not null |                              | plain   |             |              | 
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240401 FOR VALUES FROM ('2024-04-01 00:00:00-04') TO ('2024-06-03 00:00:00-04'),
            time_stuff_p20240603 FOR VALUES FROM ('2024-06-03 00:00:00-04') TO ('2024-08-05 00:00:00-04'),
            time_stuff_p20240805 FOR VALUES FROM ('2024-08-05 00:00:00-04') TO ('2024-10-07 00:00:00-04'),
            time_stuff_p20241007 FOR VALUES FROM ('2024-10-07 00:00:00-04') TO ('2024-12-09 00:00:00-05'),
            time_stuff_p20241209 FOR VALUES FROM ('2024-12-09 00:00:00-05') TO ('2025-02-10 00:00:00-05'),
            time_stuff_default DEFAULT

That doesn’t look right! The issue here is that since the interval is less than 1 year but greater than or equal to 1 month, partman always tries to round to the first day of the month. What we want partman to do is round to the nearest week instead since our interval is based on an arbitrary amount of weeks. As I said, this was an issue before 5.0 but fixed only fairly recently thanks to a bug report from a user. This was solved in 4.6.0 by adding another option to create_parent().

keith=# SELECT partman.create_parent('public.time_stuff', 'created_at', '9 weeks', p_start_partition => '2024-04-22', p_date_trunc_interval => 'week');
 create_parent 
---------------
 t
(1 row)

keith=# \d+ time_stuff
                                                      Partitioned table "public.time_stuff"
   Column   |           Type           | Collation | Nullable |           Default            | Storage | Compression | Stats target | Description 
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
 id         | integer                  |           | not null | generated always as identity | plain   |             |              | 
 created_at | timestamp with time zone |           | not null |                              | plain   |             |              | 
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240422 FOR VALUES FROM ('2024-04-22 00:00:00-04') TO ('2024-06-24 00:00:00-04'),
            time_stuff_p20240624 FOR VALUES FROM ('2024-06-24 00:00:00-04') TO ('2024-08-26 00:00:00-04'),
            time_stuff_p20240826 FOR VALUES FROM ('2024-08-26 00:00:00-04') TO ('2024-10-28 00:00:00-04'),
            time_stuff_p20241028 FOR VALUES FROM ('2024-10-28 00:00:00-04') TO ('2024-12-30 00:00:00-05'),
            time_stuff_p20241230 FOR VALUES FROM ('2024-12-30 00:00:00-05') TO ('2025-03-03 00:00:00-05'),
            time_stuff_default DEFAULT

The p_date_trunc_interval parameter takes values that are valid for the PostgreSQL built-in function date_trunc. This tells partman how to round the boundaries to get the values you’re more expecting. One unfortunate thing that is unique for the weekly option here is that date_trunc('week', <timetamptz>) always rounds to a Monday. So in this case you wouldn’t be able to have an arbitrary amount of weeks that start on a Sunday or any other day of the week.

If you cannot use more common partition intervals (daily, monthly, etc), you’ll likely have to experiment with this feature to see if it allows you do do what you need. I would personally recommend trying to stick with more common intervals if at all possible, but business requirements sometimes require the uncommon.

Conclusion

When people first see the documentation for pg_partman, it can seem overwhelming. But 95% of the time you’re probably only going to be using the basic features or only a few more advanced ones at any one time. Partitioning has many small caveats to it that I’ve tried to help find solutions for over the years while working on pg_partman. Hopefully this blog post has helped to clarify how some of these smaller advanced features can help with more complex partitioning requirements.