Welcome to my blog where I ramble on about PostgreSQL and other random things…

Checking for PostgreSQL Bloat

My post almost 2 years ago about checking for PostgreSQL bloat is still one of the most popular ones on my blog (according to Google Analytics anyway). Since that’s the case, I’ve gone and changed the URL to my old post and reused that one for this post. I’d rather people be directed to correct and current information as quickly as possible instead of adding an update to my old post pointing to a new one. I’ve included my summary on just what exactly bloat is again below since that seemed to be the most popular part.

Document Storage in PostgreSQL & Open Source Benefits

Update 2016-07-20: Since writing this blog post, I’ve gone back and updated pg_doc_store to take advantage of the new INSERT ON CONFLICT (upsert) feature in 9.5, so the extension is much closer to being ready for production use if anyone finds it useful.
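For anyone unfamiliar with upsert semantics, here is a minimal sketch of the idea. It is not pg_doc_store's code or schema: it uses Python's built-in sqlite3 module as a stand-in (SQLite 3.24+ accepts an ON CONFLICT clause modeled on PostgreSQL's), and the docs table and both helper functions are hypothetical names made up for illustration.

```python
import json
import sqlite3


def save_document(conn, doc_id, body):
    """Insert a document, or overwrite the body if the id already exists."""
    conn.execute(
        """
        INSERT INTO docs (id, body) VALUES (?, ?)
        ON CONFLICT (id) DO UPDATE SET body = excluded.body
        """,
        (doc_id, json.dumps(body)),
    )


def load_document(conn, doc_id):
    """Return the stored document as a dict, or None if it is missing."""
    row = conn.execute(
        "SELECT body FROM docs WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Hypothetical table: one row per document, keyed by id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT NOT NULL)")

# Saving the same id twice updates the row instead of raising an error.
save_document(conn, "a1", {"title": "first draft"})
save_document(conn, "a1", {"title": "second draft"})
```

The second save hits the primary key conflict and takes the DO UPDATE branch, so a single statement covers both the "new document" and "existing document" cases without a separate existence check.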

This past week I’ve had two experiences that show the amazing benefits of having your code be part of an open source community. The first involves my pg_partman extension. People have been asking me for quite some time about a more generalized partitioning solution beyond just time/serial. I’ve resisted because that’s really not the focus of the tool: outside of those two types, partitioning is usually a one-and-done setup that only rarely needs further maintenance. It would also add quite a bit more complexity if I wanted to support, for example, all the things that MySQL does (range, list, hash, key) and all the variations possible within them. I’ve already been feeling feature creep as it is, so I wasn’t in a hurry for this. But a user of pg_partman submitted some great work on a generalized range partition setup, with hopes of possibly integrating it with pg_partman. I instead thought it would be better released as its own tool. You can find his repo here and give it a try if you’ve been needing that feature. (UPDATE: The Range Partitioning project has been discontinued since native partitioning was included in PG10+. The link has been removed since the repo is gone.)

PG Partition Manager v2.0.0 – Background Worker, Better Triggers & Extension Versioning Woes

PG Partition Manager has been the most popular project I’ve ever done a significant amount of work on and I really appreciate everyone’s feedback for the roughly 2 years it’s been out there. I’ve got plenty more ideas for development and features and look forward to being able to move forward on them with this new major version released.

PostgreSQL 9.3 introduced user-created, programmable background workers (BGW). 9.4 then introduced the ability to dynamically start & stop these in an already running cluster. The first thing that popped into my mind when I heard about this was hopefully having some sort of built-in scheduling system. There still hasn’t been a generalized version of anything like this, so in the meantime I studied the worker_spi contrib module. This is a very simple BGW example with a basic scheduler that runs a process at a configurable interval. This is basically all pg_partman needs for partition maintenance, which previously required an external scheduler like cron.
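The "run a process at a configurable interval" idea is simple enough to sketch outside the database. The following is only a rough Python illustration of that loop, not worker_spi or pg_partman code (real background workers are written in C against the server's BGW API). The names run_scheduler, run_maintenance, interval_seconds, and max_runs are all hypothetical.

```python
import time
from typing import Callable, List


def run_scheduler(task: Callable[[], None],
                  interval_seconds: float,
                  max_runs: int,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Call task() max_runs times, pausing interval_seconds between calls.

    The sleep function is injectable so the loop can be exercised
    without actually waiting.
    """
    runs = 0
    while runs < max_runs:
        task()
        runs += 1
        if runs < max_runs:
            sleep(interval_seconds)
    return runs


# Hypothetical stand-in for the partition maintenance job.
maintenance_log: List[str] = []


def run_maintenance() -> None:
    maintenance_log.append("maintenance ran")
```

Calling run_scheduler(run_maintenance, interval_seconds=3600, max_runs=24) would run the stand-in job once an hour for a day. A real worker would loop until the server tells it to shut down rather than counting runs; max_runs is here only so the sketch terminates.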

PG Partman – Sub-partitioning

After my talk at PGCon 2014 where I discussed pg_partman, someone I met at the bar track said they’d use it in a heartbeat if it supported sub-partitioning. Discussing this with others and reading online, I found that there is quite a demand for this feature, and the partitioning methods in MySQL & Oracle both support it as well. So I set out to see if I could incorporate it. I thought I’d had it figured out pretty easily and started writing this blog post a while ago (last October) to precede the release of version 1.8.0. Then I started working on the examples here and realized this is a trickier problem to manage than I anticipated. The tricky part is managing the context relationship between the top-level parent and its child sub-partitions in a general manner that would work for all partitioning types pg_partman supports. When I first started working on the feature, I’d get things like this:

A Small Database Does Not Mean Small shared_buffers

As a follow-up to my previous blog post, A Large Database Does Not Mean Large shared_buffers, I had some more interesting findings applying the queries in that blog post to another client recently. I’ll assume you have already read that one and won’t repeat any of what I explained previously, so if you haven’t read it and aren’t familiar with the pg_buffercache extension, I highly recommend you go read that one first.

A Large Database Does Not Mean Large shared_buffers

A co-worker of mine did a blog post last year that I’ve found incredibly useful when assisting clients with getting shared_buffers tuned accurately.

Setting shared_buffers the hard way

You can follow his queries there for using pg_buffercache to find out how your shared_buffers are actually being used. But I had an incident recently that I thought would be interesting to share, since it shows how shared_buffers may not need to be set nearly as high as you believe it should be. Or it can equally show you that you definitely need to increase it. Object names have been sanitized to protect the innocent.