<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>PostgreSQL - Category - Keith&#39;s Ramblings...</title>
        <link>https://www.keithf4.com/categories/postgresql/</link>
        <description>PostgreSQL - Category - Keith&#39;s Ramblings...</description>
        <generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Wed, 27 Sep 2023 05:05:05 -0400</lastBuildDate><atom:link href="https://www.keithf4.com/categories/postgresql/" rel="self" type="application/rss+xml" /><item>
    <title>New Site, New Partman</title>
    <link>https://www.keithf4.com/posts/2023-05-30-new-hugo-new-partman/</link>
    <pubDate>Wed, 27 Sep 2023 05:05:05 -0400</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/posts/2023-05-30-new-hugo-new-partman/</guid>
    <description><![CDATA[<p>Thanks to some help from my co-workers <a href="https://github.com/elizabeth-christensen" target="_blank" rel="noopener noreffer ">Elizabeth</a> and <a href="https://github.com/pgguru" target="_blank" rel="noopener noreffer ">David Christensen</a> and their son, I got my site migrated from WordPress to Hugo! Being a DBA, you&rsquo;d think I wouldn&rsquo;t mind having a database backing my website, but the simplicity of managing a static site like Hugo was much more appealing at this point. With a new site that&rsquo;s far easier to manage, I&rsquo;m hoping that will motivate me to get back to writing new content on a regular basis.</p>]]></description>
</item>
<item>
    <title>Managing Transaction ID Exhaustion (Wraparound) in PostgreSQL</title>
    <link>https://www.keithf4.com/managing-transaction-id-exhaustion-wraparound-in-postgresql/</link>
    <pubDate>Mon, 12 Apr 2021 17:10:16 -0600</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/managing-transaction-id-exhaustion-wraparound-in-postgresql/</guid>
    <description><![CDATA[<p>One of the most critical topics to understand when administering a PostgreSQL database is the concept of transaction IDs (TXID) and that they can be exhausted if not monitored properly. However, this blog post isn&rsquo;t going to go into the details of what TXID exhaustion actually is. The <a href="https://www.postgresql.org/docs/current/routine-vacuuming.html" target="_blank" rel="noopener noreffer ">Routine Vacuuming</a> section of the documentation is probably one of the most important to read and understand, so I will refer you there. What this blog post is going to cover is an easy way to monitor for it and what can be done to prevent it from ever being a problem.</p>]]></description>
</item>
<item>
    <title>Per-Table Autovacuum Tuning</title>
    <link>https://www.keithf4.com/per-table-autovacuum-tuning/</link>
    <pubDate>Mon, 01 Oct 2018 12:26:24 -0600</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/per-table-autovacuum-tuning/</guid>
    <description><![CDATA[<p>My blog posts tend to be driven by the frequency of client questions, and that is definitely the case here again. Vacuum tuning to manage bloat and transaction ID wraparound on production systems has been a hot topic, and lately this has even been getting down to tuning autovacuum on an individual table basis. I&rsquo;ve already discussed bloat pretty extensively in <a href="http://localhost:1313/posts/checking-for-psql-bloat.md" target="_blank" rel="noopener noreffer ">previous posts</a>. While I&rsquo;d like to get into the details of transaction ID wraparound, that really isn&rsquo;t the focus of this post, so I&rsquo;ll refer you to <a href="https://www.postgresql.org/docs/current/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND" target="_blank" rel="noopener noreffer ">the documentation</a>.</p>]]></description>
</item>
<item>
    <title>Removing A Lot of Old Data (But Keeping Some Recent)</title>
    <link>https://www.keithf4.com/removing-old-data/</link>
    <pubDate>Wed, 15 Mar 2017 11:33:40 -0600</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/removing-old-data/</guid>
    <description><![CDATA[<p>I&rsquo;ve had this situation crop up a few times with clients and, after a discussion on #<a href="https://freenode.net" target="_blank" rel="noopener noreffer ">postgresql on Freenode</a> recently, decided a blog post may be in order. The pitfalls that led me to this solution are useful to cover, and it seems a useful set of steps to have documented and be able to share again later.</p>
<p>There comes a time for most people when you have a table that builds up quite a lot of rows and you then realize you didn&rsquo;t actually need to keep all of it. But you can&rsquo;t just run a TRUNCATE because you do want to keep some of the more recent data. If it&rsquo;s just a few million small rows, it&rsquo;s not a huge deal to just run a simple DELETE. But when that starts getting into the billions of rows, or your rows are very large (long text, bytea, etc), a simple DELETE may not be realistic.</p>]]></description>
</item>
<item>
    <title>PostgreSQL 10 Built-in Partitioning</title>
    <link>https://www.keithf4.com/postgresql-10-built-in-partitioning/</link>
    <pubDate>Mon, 12 Dec 2016 16:48:54 &#43;0000</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/postgresql-10-built-in-partitioning/</guid>
    <description><![CDATA[<p>Since I have a <a href="https://github.com/keithf4/pg_partman" target="_blank">passing interest</a> in partitioning in PostgreSQL, I figured I&rsquo;d check out a recent commit to the development branch of PostgreSQL 10:</p>
<p>Implement table partitioning – <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63" target="_blank" rel="noopener noreffer ">https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63</a></p>
<pre tabindex="0"><code>Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own.  The children are called
partitions and contain all of the actual data.  Each partition has an
implicit partitioning constraint.  Multiple inheritance is not
allowed, and partitioning and inheritance can&#39;t be mixed.  Partitions
can&#39;t have extra columns and may not allow nulls unless the parent
does.  Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn&#39;t yet supported for partitions which are foreign
tables, and it doesn&#39;t handle updates that cross partition boundaries.

Currently, tables can be range-partitioned or list-partitioned.  List
partitioning is limited to a single column, but range partitioning can
involve multiple columns.  A partitioning &#34;column&#34; can be an
expression.

Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations.  The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.

Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others.  Minor revisions by me.
</code></pre><p>After many years of waiting, one of the major features missing from PostgreSQL is finally getting its first major step forward with the inclusion of a built-in partitioning option. The syntax and usage are fairly straightforward, so let&rsquo;s jump straight into it with the examples from the documentation (slightly modified).</p>]]></description>
</item>
<item>
    <title>Cleaning Up PostgreSQL Bloat</title>
    <link>https://www.keithf4.com/cleaning-up-postgresql-bloat/</link>
    <pubDate>Wed, 08 Jun 2016 21:25:45 &#43;0000</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/cleaning-up-postgresql-bloat/</guid>
    <description><![CDATA[<p>As a follow-up to my <a href="http://www.keithf4.com/checking-for-postgresql-bloat/" target="_blank" rel="noopener noreffer ">previous post on checking for bloat</a>, I figured I&rsquo;d share some methods for actually cleaning up bloat once you find it. I&rsquo;ll also be providing some updates on the script I wrote, due to issues I encountered and thanks to feedback from people who have used it already.</p>
<p>First, as these examples will show, the most important thing you need to clean up bloat is extra disk space. This means it is critically important to monitor your disk space usage if bloat turns out to be an issue for you. And if your database is of any reasonably large size, and you regularly do updates &amp; deletes, bloat will be an issue at some point. I&rsquo;d say a goal is to always try and stay below 75% disk usage, either by archiving and/or pruning old data that&rsquo;s no longer needed, or by simply adding more disk space or migrating to new hardware altogether. Having less than 25% free can put you in a precarious situation where you may have a whole lot of disk space you can free up, but not enough room to actually do any cleanup at all, or not without possibly impacting performance in big ways (e.g., you have to drop &amp; recreate a bloated index instead of rebuilding it concurrently, making previously fast queries extremely slow).</p>]]></description>
</item>
<item>
    <title>Checking for PostgreSQL Bloat</title>
    <link>https://www.keithf4.com/checking-for-postgresql-bloat/</link>
    <pubDate>Fri, 27 May 2016 15:55:50 &#43;0000</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/checking-for-postgresql-bloat/</guid>
    <description><![CDATA[<p>My <a href="http://www.keithf4.com/checking-for-postgresql-bloat-old/" target="_blank">post almost 2 years ago</a> about checking for PostgreSQL bloat is still one of the most popular ones on my blog (according to Google Analytics anyway). Since that&rsquo;s the case, I&rsquo;ve gone and changed the URL to my old post and reused that one for this post. I&rsquo;d rather people be directed to correct and current information as quickly as possible instead of adding an update to my old post pointing to a new one. I&rsquo;ve included my <a href="#why" rel="">summary on just what exactly bloat is again below</a> since that seemed to be the most popular part.</p>]]></description>
</item>
<item>
    <title>Document Storage in PostgreSQL &amp; Open Source Benefits</title>
    <link>https://www.keithf4.com/document-storage-in-postgresql-open-source-benefits/</link>
    <pubDate>Fri, 06 Nov 2015 16:48:20 &#43;0000</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/document-storage-in-postgresql-open-source-benefits/</guid>
    <description><![CDATA[<p><em><strong>Update 2016-07-20:</strong></em> Since this blog post, I&rsquo;ve recently gone back and updated pg_doc_store to take advantage of the new INSERT ON CONFLICT (upsert) feature in 9.5. So the extension is much more ready for possible production use if anyone finds it useful.</p>
<p>This past week I&rsquo;ve had two experiences that show the amazing benefits of having your code be part of an open source community. The first involves my <a href="https://github.com/keithf4/pg_partman" target="_blank" rel="noopener">pg_partman</a> extension. People have been asking me for quite some time about having a more generalized partitioning solution beyond just time/serial. I&rsquo;ve resisted because that&rsquo;s really not the focus of the tool since outside of those two types, partitioning is usually a once and done setup and only rarely needs further maintenance. Also, that would add quite a bit more complexity if I wanted to support, for example, all the things that MySQL does (range, list, hash, key) and all the variations that can be possible in there. I&rsquo;ve already been feeling feature creep as it is, so wasn&rsquo;t in a hurry for this. But, a user of pg_partman submitted some great work for getting a generalized range partition setup going with hopes of possibly integrating it with pg_partman. I instead thought it would be better released as its own tool. You can find his repo here and give it a try if you&rsquo;ve been needing that feature: (UPDATE: Range Partitioning project has been discontinued since native partitioning was included in PG10+. Link removed since repo is gone).</p>]]></description>
</item>
<item>
    <title>PG Partition Manager v2.0.0 – Background Worker, Better Triggers &amp; Extension Versioning Woes</title>
    <link>https://www.keithf4.com/pg-partition-manager-v2-0-0/</link>
    <pubDate>Thu, 11 Jun 2015 19:30:10 &#43;0000</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/pg-partition-manager-v2-0-0/</guid>
    <description><![CDATA[<p><a href="https://github.com/keithf4/pg_partman" target="_blank">PG Partition Manager</a> has been the most popular project I&rsquo;ve ever done a significant amount of work on and I really appreciate everyone&rsquo;s feedback for the roughly 2 years it&rsquo;s been out there. I&rsquo;ve got plenty more ideas for development and features and look forward to being able to move forward on them with this new major version released.</p>
<p>PostgreSQL 9.3 introduced the ability for user-created, <a href="http://www.postgresql.org/docs/current/static/bgworker.html" data-versionurl="https://www.keithf4.com/amber/cache/fa76218d0f780695877e91efaed7eeb8/" data-versiondate="2016-02-04T15:15:27+00:00" data-amber-behavior="" target="_blank">programmable background workers</a> (BGW). 9.4 then introduced the ability to dynamically start &amp; stop these with an already running cluster. The first thing that popped into my mind when I heard about this was hopefully having some sort of built-in scheduling system. There still hasn&rsquo;t been a generalized version of anything like this, so in the meantime I studied the worker_spi contrib module. This is a very simple BGW example with a basic scheduler that runs a process at a configurable interval. This is basically all pg_partman needs for partition maintenance, which previously required an external scheduler like cron.</p>]]></description>
</item>
<item>
    <title>PG Partman – Sub-partitioning</title>
    <link>https://www.keithf4.com/pg-partman-sub-partitioning/</link>
    <pubDate>Mon, 09 Mar 2015 15:40:02 &#43;0000</pubDate>
    <author>Author</author>
    <guid>https://www.keithf4.com/pg-partman-sub-partitioning/</guid>
    <description><![CDATA[<p>After <a href="http://www.pgcon.org/2014/schedule/events/639.en.html" data-versionurl="https://www.keithf4.com/amber/cache/746f72b77e8e2103781c1c2b46ca6c09/" data-versiondate="2016-02-04T15:13:52+00:00" data-amber-behavior="" target="_blank">my talk at PGCon 2014</a> where I discussed <a href="https://github.com/keithf4/pg_partman" target="_blank">pg_partman</a>, someone I met at the bar track said they&rsquo;d use it in a heartbeat if it supported sub-partitioning. Discussing this with others and reading online, I found that there is quite a demand for this feature and the partitioning methods in MySQL &amp; Oracle both support this as well. So I set out to see if I could incorporate it. I thought I&rsquo;d had it figured out pretty easily and started writing this blog post a while ago (last October) to precede the release of version 1.8.0. Then I started working on the examples here and realized this is a trickier problem to manage than I anticipated. The tricky part is managing the context relationship between the top-level parent and its child sub-partitions in a general manner that would work for all partitioning types pg_partman supports. When I first started working on the feature, I&rsquo;d get things like this:</p>]]></description>
</item>
</channel>
</rss>
