
A Large Database Does Not Mean Large shared_buffers


A co-worker of mine wrote a blog post last year that I’ve found incredibly useful when helping clients get shared_buffers tuned accurately.

Setting shared_buffers the hard way

You can follow his queries there for using pg_buffercache to find out how your shared_buffers are actually being used. But I had an incident recently that I thought would be interesting to share because it shows how shared_buffers may not need to be set nearly as high as you believe it should be. It can equally show you that you definitely need to increase it. Object names have been sanitized to protect the innocent.

IMPORTANT NOTE: A caveat to using this extension is that when you query pg_buffercache, it takes a lock on the shared buffers in order to get an accurate assessment of usage. For the most part this lock is brief and you probably won’t even notice anything, but on extremely busy systems you may notice some queries taking a bit longer than usual. This is not something to monitor in real time for any extended period; it’s mostly something to check occasionally when you’re noticing performance problems. If you do set up automated monitoring for it, do not be aggressive.
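
For reference, pg_buffercache ships with PostgreSQL’s contrib modules; on 9.1 and later it is installed in each database you want to inspect with:

    CREATE EXTENSION pg_buffercache;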

To set the stage: the total database size is roughly 260GB, and the use case is high data ingestion with some reporting done on just the most recent data. shared_buffers is set to 8GB. The other thing to note is that this is the only database in the cluster. pg_buffercache has information for all databases in the cluster, but when you join against pg_class to get object information, you can only do that for one database at a time.
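
The per-object breakdown came from a query roughly like the one below. This is a sketch in the spirit of the post linked above rather than the exact query that was run, and it assumes the default 8kB block size:

    -- How much of each relation in the current database is sitting in shared_buffers,
    -- and what fraction of shared_buffers and of the relation itself that represents.
    SELECT c.relname,
           pg_size_pretty(count(*) * 8192) AS buffered,
           round(100.0 * count(*) /
                 (SELECT setting FROM pg_settings WHERE name = 'shared_buffers')::bigint, 1)
               AS pct_of_shared_buffers,
           round(100.0 * count(*) * 8192 / pg_table_size(c.oid), 1) AS pct_of_relation
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = c.relfilenode
    JOIN pg_database d ON b.reldatabase = d.oid AND d.datname = current_database()
    GROUP BY c.oid, c.relname
    ORDER BY count(*) DESC
    LIMIT 10;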

You can see that table1 is taking up the vast majority of the space here, and since it’s a large table, only 9% of it is actually in shared_buffers. What’s more interesting, though, is how much of the space for that table is actually in high demand.

Data blocks that go into and come out of postgres all pass through shared_buffers. To review the blog post I linked to: whenever a block is used in shared memory, its usage count is incremented; the clock-sweep algorithm uses this count, which ranges from 1-5, with 5 indicating an extremely high-use data block. This means high-usage blocks are likely to be kept in shared_buffers (if there’s room), while low-usage blocks get moved out when space is needed for higher-usage ones. We believe that a simple insert or update sets a usagecount of 1, so now we look at the difference when blocks at that usage count are excluded.
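
Repeating the same aggregation with low-usage blocks filtered out shows that difference. The threshold used here (usagecount > 1, i.e. dropping the blocks a single insert would produce) is my assumption for what counts as "further use":

    -- Same per-relation summary, but only counting blocks with a usagecount above 1.
    SELECT c.relname,
           pg_size_pretty(count(*) * 8192) AS buffered,
           round(100.0 * count(*) * 8192 / pg_table_size(c.oid), 1) AS pct_of_relation
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = c.relfilenode
    JOIN pg_database d ON b.reldatabase = d.oid AND d.datname = current_database()
    WHERE b.usagecount > 1
    GROUP BY c.oid, c.relname
    ORDER BY count(*) DESC
    LIMIT 10;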

So shared_buffers is actually being filled mostly by the data ingestion process, but relatively little of that data is of any further use afterwards. If anything of greater importance were needed in shared_buffers, there’s plenty of space currently held by low-usage blocks, and that inserted data would quickly get flushed out of shared memory due to its low usagecount.

So, with pg_buffercache installed, we’ve found that the query below seems to give a good estimate of an optimal, minimal shared_buffers setting.
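
The version shown here is a sketch rather than the exact query we run: the usagecount cutoff of 3 used to define "high use" and the default 8kB block size are assumptions on my part.

    -- Rough minimum for shared_buffers: total size of the currently buffered
    -- blocks in this database that have a high usagecount.
    SELECT pg_size_pretty(count(*) * 8192) AS ideal_shared_buffers
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = c.relfilenode
    JOIN pg_database d ON b.reldatabase = d.oid AND d.datname = current_database()
    WHERE b.usagecount >= 3;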

This is the sort of query you would run after your database has been running through its expected workload for a while. Also, note my use of the key word minimal. This does not account for unexpected spikes in shared_buffers usage that may occur during a session of reporting queries or the like, so you definitely want to set it higher than this value. But it can at least show you how effectively postgres is using its shared memory. In general, we’ve found the typical suggestion of 8GB to be a great starting point for shared_buffers.

So, in the end, the purpose of this post was to show that shared_buffers is something that needs further investigation to be set optimally, and that there is a pretty easy method for figuring it out once you know where to look.

UPDATE:

I’ve written a followup post to this one as well: A Small Database Does Not Mean Small shared_buffers.

So, as someone commented below, you don’t really need to join against pg_class and pg_database to get the suggested ideal minimum. This also avoids having to manually total the results across multiple databases in the cluster. The reason for joining against those two was to identify which databases and objects the blocks in shared buffers were associated with; pg_class can only identify objects in the database you’re currently connected to.
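
A sketch of the shorter, cluster-wide version follows, again assuming a usagecount cutoff of 3 and using current_setting('block_size') instead of a hard-coded 8192, as suggested in the comments below:

    -- Cluster-wide suggested minimum: no joins needed, and it covers every
    -- database in the cluster at once.
    SELECT pg_size_pretty(count(*) * current_setting('block_size')::bigint)
               AS ideal_shared_buffers
    FROM pg_buffercache
    WHERE usagecount >= 3;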

Also, for really high-traffic databases with fluctuating query activity, the suggested-minimum query isn’t something you can run just once. It has to be run multiple times, because the values can vary drastically. Below are the results of running the shorter query just a few times within less than a minute on a different client of ours that has a much different traffic pattern (OLTP) than the one above. There are 46 databases in the cluster with a total size of roughly 900GB: 800GB in one database, 30GB in the next largest, and quickly getting smaller from there. For this one we actually have shared_buffers set down to 4GB, and it’s been working great for years.

Written by Keith

September 11th, 2014 at 2:53 pm

Posted in PostgreSQL


  • Alexey Lesovsky

    Good post, Thanks!
    In the last query, I think you can query the data from pg_buffercache directly; the join against pg_class is expensive.

    • keithf4

      You’re probably right. I was adapting it from the one where I had to get specific table data. Will see how it goes.

    • keithf4

      Posted an update. You’re right about it making the total suggested minimum easier to find. When you’re trying to figure out what objects the blocks belong to, that’s when the joins come in handy. Thanks!

  • Jim Nasby

    You might want to switch to using current_setting( ‘block_size’ ) instead of hard-coding 8192.

    • keithf4

      Good point. Thanks!

      • Jim Nasby

        Oh, you could use current_setting( ‘shared_buffers’ ) instead of the subselect on pg_settings too.

        • keithf4

          Ok, so I went to replace the static block_size values with current_setting(‘block_size’) and that worked fine. But then I went to change the pg_settings shared_buffers values and ran into the issue of current_setting() returning the actual value given in postgresql.conf (4GB), not the block-count value that pg_settings returns. That obviously can’t be used directly, and I’d have to go through some conversion process to get it back into a byte value. Unless there’s something that does the opposite of pg_size_pretty()?

          I’d imagine those people with a higher block size setting would run into the same issue.

          • Jim Nasby

            Interesting… I didn’t realize current_setting() didn’t do the same thing as pg_settings. Looks like you need to stick with pg_settings here…

  • Noah Yetter

    Couple of problems with this:
    1. There’s an implicit argument here that all other things equal, a smaller shared_buffers is better, or stated differently, that you’re better off relying on OS disk caching. That argument seems weak. What’s the evidence that this is true?

    2. This method will only ever recommend a LOWER value of shared_buffers than your current setting, because you can’t count buffers you don’t have. If I’m running 4GB and my “ideal” value is 16GB, it will take a lot of testing iterations and attendant postgres restarts to find that out.

    • keithf4

      1. I’ve made no such implicit argument. I’m just saying that a large database does not necessarily mean a large shared_buffers setting is needed. After working with many clients over the years, and talking to people at conferences, I’ve noticed that many people who haven’t really looked into what shared_buffers is used for assume bigger is better. One of the few times we’ve seen a very high shared_buffers setting work reliably well the majority of the time is when you can fit the entire database into memory. Outside of that, if you set it very high while that amount is still only a small fraction of the database, you can run into double-buffering. Relying on the OS cache can then actually be beneficial if the majority of your high-use data isn’t being kept in shared_buffers. Check out Greg Smith’s High Performance PostgreSQL book and blog posts by Robert Haas for more in-depth discussions of the repercussions of setting shared_buffers higher or lower than it needs to be.

      2. If you look at the original blog post I refer to, you can see the methods used when you don’t have enough shared_buffers available. If you’ve got a lot of high-demand data filling an amount close to your current shared_buffers setting, then you can likely benefit from increasing it. Of course the suggested value is always going to be less than the current setting, but if you’re close, that’s when you should start investigating more. This blog post was an example of when you clearly don’t need to increase it and your performance problems are likely elsewhere, even in a massive OLTP system. It’s a small piece of the database tuning puzzle. I made the update to my blog post to try to make it clearer that this isn’t a once-and-done query. You’ll use the first query to see which objects are using up space, what percentage of them is in memory, and how much more memory would be needed to get the ideal amount into memory. The second query, which targets a specific table, then shows you how much of that table is in demand. If you’re in a situation where increasing shared_buffers is necessary, then yes, it’s going to take some work to get there.

      Most people don’t know that this extension exists, and the documentation isn’t really clear on how useful it actually is. I figured I’d give an example I’ve run across quite frequently to show more clearly how beneficial it is.
