Keith's Ramblings…

Archive for the ‘extensions’ tag

PostgreSQL Extension Updates & Preserving Privileges


My latest update of Mimeo required me to do something that I knew I would eventually need to do: drop and recreate a function in an extension without breaking the original privileges it had. If only the code within a function changes, CREATE OR REPLACE leaves its privileges untouched. But when you need to change a function’s parameters (not overload it) or its return type, it must be dropped and recreated, and any privileges granted on it are lost along with it.

Since extension updates are plain SQL, this is a little trickier than it would be if I could use variables or plpgsql (like I did in the refresh_snap() function to preserve privileges when column changes force a destination table recreation). I’d had an idea of how to do it, but until I actually tried it I wasn’t sure it would work in the extension update process. Here is some of the code from the beginning and end of the 0.11.1 to 0.12.0 update of mimeo:

CREATE TEMP TABLE mimeo_preserve_privs_temp (statement text);

INSERT INTO mimeo_preserve_privs_temp 
SELECT 'GRANT EXECUTE ON FUNCTION @extschema@.logdel_maker(text, int, text, boolean, text[], text, boolean, text[], text[], boolean) TO '||array_to_string(array_agg(grantee::text), ',')||';' 
FROM information_schema.routine_privileges
WHERE routine_schema = '@extschema@'
AND routine_name = 'logdel_maker'; 

INSERT INTO mimeo_preserve_privs_temp 
SELECT 'GRANT EXECUTE ON FUNCTION @extschema@.refresh_logdel(text, int, boolean, boolean) TO '||array_to_string(array_agg(grantee::text), ',')||';' 
FROM information_schema.routine_privileges
WHERE routine_schema = '@extschema@'
AND routine_name = 'refresh_logdel'; 

CREATE FUNCTION replay_preserved_privs() RETURNS void
    LANGUAGE plpgsql
    AS $$
DECLARE
v_row   record;
BEGIN
    FOR v_row IN SELECT statement FROM mimeo_preserve_privs_temp LOOP
        EXECUTE v_row.statement;
    END LOOP;
END
$$;

DROP FUNCTION @extschema@.logdel_maker(text, int, text, boolean, text[], text, boolean, text[], text[]);
DROP FUNCTION @extschema@.refresh_logdel(text, int, boolean);

[... Bunch of code that does the actual update here ...]

SELECT @extschema@.replay_preserved_privs();
DROP FUNCTION @extschema@.replay_preserved_privs();
DROP TABLE mimeo_preserve_privs_temp;

I’m really glad this works because it’s much less complicated than I expected it to be. I just create a temp table to hold GRANT commands re-creating the original privileges, run some SQL to generate them, and have a temporary function replay them at the end of the update before the now-unneeded objects are dropped. If you look closely, the generated GRANT statements use the signatures of the new functions while the DROP statements use the old ones.

The refresh functions (refresh_logdel() here) are functions we’ve typically granted other users permission to execute so they can refresh tables on demand as needed. You could check those permissions before the update, make note of them, and reapply them afterwards, but I think it’s far preferable for the extension update to handle it itself when it can. This same method can be used to preserve permissions on any object just by looking it up in the relevant information_schema view.
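As a quick illustration, here is a hedged sketch of the same trick applied to a table instead, using information_schema.table_privileges (the table name some_table is made up for the example):

-- One GRANT per privilege type (SELECT, INSERT, etc.), with all
-- grantees of that privilege rolled into a single statement
INSERT INTO mimeo_preserve_privs_temp
SELECT 'GRANT '||privilege_type||' ON @extschema@.some_table TO '||array_to_string(array_agg(grantee::text), ',')||';'
FROM information_schema.table_privileges
WHERE table_schema = '@extschema@'
AND table_name = 'some_table'
GROUP BY privilege_type;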

Don’t know if this is the best way to do this, but so far it works. I’d appreciate any feedback if anyone has a better approach.

UPDATE 2014/02/14: At the time I wrote this post I wasn’t really familiar with the anonymous DO block. Since then I’ve redone this process and completely eliminated the need for the temporary function. Redoing the code block above, you can see it simplifies things a lot.

CREATE TEMP TABLE mimeo_preserve_privs_temp (statement text);

INSERT INTO mimeo_preserve_privs_temp
SELECT 'GRANT EXECUTE ON FUNCTION @extschema@.logdel_maker(text, int, text, boolean, text[], text, boolean, text[], text[], boolean) TO '||array_to_string(array_agg(grantee::text), ',')||';'
FROM information_schema.routine_privileges
WHERE routine_schema = '@extschema@'
AND routine_name = 'logdel_maker';

INSERT INTO mimeo_preserve_privs_temp
SELECT 'GRANT EXECUTE ON FUNCTION @extschema@.refresh_logdel(text, int, boolean, boolean) TO '||array_to_string(array_agg(grantee::text), ',')||';'
FROM information_schema.routine_privileges
WHERE routine_schema = '@extschema@'
AND routine_name = 'refresh_logdel'; 

DROP FUNCTION @extschema@.logdel_maker(text, int, text, boolean, text[], text, boolean, text[], text[]);
DROP FUNCTION @extschema@.refresh_logdel(text, int, boolean);

[... Bunch of code that does the actual update here ...]

DO $$
DECLARE
v_row   record;
BEGIN
    FOR v_row IN SELECT statement FROM mimeo_preserve_privs_temp LOOP
        EXECUTE v_row.statement;
    END LOOP;
END
$$;

DROP TABLE mimeo_preserve_privs_temp;


Written by Keith

March 28th, 2013 at 12:51 pm

Posted in PostgreSQL


Mimeo – DML Replication


For the last introductory post about mimeo, I’ll be talking about DML replication (previous posts here and here). This is the most common way replication is done on a per-table basis (at least that I’ve seen). Typically, a trigger is placed on the source table that tracks all changes (INSERTS, UPDATES & DELETES) and then some mechanism is used to replay those statements on the destination.

For mimeo, this is done with a queue table (also located on the source system) that just contains the primary key columns of each row that changed. The trigger places the primary key values into the queue table and then mimeo reads those values to replay the changes on the destination. Saying that the statements are just replayed on the destination is really simplifying things, though. While that is technically a legitimate way to replicate table changes, it is far from the most efficient. What mimeo actually does is

  1. Grab all queue table values, using a SELECT DISTINCT to only get a single copy of each row changed (since multiple changes to the same row put the same value into the queue table multiple times).
  2. Grab the full row from the source using the primary key to get the most recent values.
  3. Perform a DELETE … USING … command, removing all rows from the destination table that have a matching primary key value in the queue table.
  4. INSERT full rows from step 2 into the destination table.
  5. Clear the processed rows from the queue table.
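A rough sketch of those five steps in SQL is below. This is just a simplified illustration with made-up table and column names; the real extension pulls the queue and source rows over a remote connection to the source database.

BEGIN;

-- Step 1: one entry per changed row, however many times it changed
CREATE TEMP TABLE changed_keys AS
SELECT DISTINCT pk_col FROM queue_table;

-- Step 2: the most recent full version of each changed row
CREATE TEMP TABLE changed_rows AS
SELECT s.* FROM source_table s
JOIN changed_keys c ON s.pk_col = c.pk_col;

-- Step 3: remove every changed row from the destination,
-- whether it was updated or deleted on the source
DELETE FROM dest_table d
USING changed_keys c
WHERE d.pk_col = c.pk_col;

-- Step 4: reinsert the latest values; rows deleted on the source
-- were not found in step 2, so they are never reinserted
INSERT INTO dest_table SELECT * FROM changed_rows;

COMMIT;

-- Step 5 happens on the source: clear only the queue rows just processed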

This method is much more efficient because

  • Even if a row is updated 100,000 times between refresh runs, only a single delete & insert of that row is ever run on the destination, with its latest values.
  • Since all changed rows are deleted from the destination first, there’s no need to work out whether a given change was an update or a delete. If it was an update, the row is reinserted with the latest values pulled from the source. If it was a delete, the row no longer exists on the source, so nothing gets reinserted.

And since this is all done in a single transaction on the destination, the result appears exactly the same as if the statements had actually been replayed as they happened on the source.

Mimeo also has a specialized DML replication method that can be useful in a data warehousing environment. One common need is to preserve deleted rows, but not track every single update done to a row; only the last state of each row needs to be kept for archival purposes. The log deletion (logdel) replication method provides this. It uses the same basic method as the normal DML above, but the trigger & queue table on the source are a little different. The queue table has the same columns as the source table plus an extra timestamp column that records when a row was deleted. For an insert or update, just the primary key values are stored in the queue table, but for deletes the entire row gets stored. The replication steps are pretty much the same as DML except there’s an extra one to insert the deleted rows. The destination table also has an extra timestamp column to record when each row was deleted on the source.
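To make that more concrete, here is a hedged sketch of what such a trigger function could look like, with made-up table and column names (mimeo’s maker functions generate the real versions of these objects for you):

CREATE FUNCTION logdel_queue_trig() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        -- Inserts & updates queue only the primary key; the current
        -- full row is fetched from the source table at refresh time
        INSERT INTO queue_table (pk_col) VALUES (NEW.pk_col);
        RETURN NEW;
    ELSE
        -- Deletes queue the entire row plus a deletion timestamp,
        -- since the row is about to disappear from the source
        INSERT INTO queue_table (pk_col, col1, col2, source_deleted)
        VALUES (OLD.pk_col, OLD.col1, OLD.col2, now());
        RETURN OLD;
    END IF;
END
$$;

CREATE TRIGGER logdel_queue
AFTER INSERT OR UPDATE OR DELETE ON source_table
FOR EACH ROW EXECUTE PROCEDURE logdel_queue_trig();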

So that’s basically how mimeo dml replication works. The dml/logdel maker functions take care of setting up the source table triggers, trigger functions, and queue tables for you, as long as you’ve got the permissions set properly. The rest of the replication methods also have maker & destroyer functions to make setup and teardown easier. I’ll have further blog posts with tips and use cases showing how we’ve put mimeo to work for our clients. If you’ve got any questions or suggestions, please feel free to post here, on github, or poke me on freenode IRC in #postgresql.

Written by Keith

March 15th, 2013 at 1:25 pm

Mimeo – Incremental Replication


Continuing from my introductory post to mimeo, I’ll now discuss one of the methods that’s more unique to this replication tool.

If a table happens to have a timestamp column that is set on every insert or update of a row, the incremental replication methods can be used. They simply use that column to track where the replication process left off each time it runs. There’s one method for insert-only tables and another for tables where that column is also set on every row update. While the insert-only method requires no primary/unique key, the updater method does. Neither method replicates row deletions.

This means that, just like the snapshot method, only select privileges and no triggers are required on the source table. This method is ideal for insert-only, high-transaction tables such as one tracking hits on a website. Adding a trigger to track changes to such a table for replication could place a lot of extra load on your front-end production systems.

Since this method was introduced in our environment, we’ve had to deal with several edge cases. One of the first showed up when rows were grabbed by simply getting everything larger than the last recorded timestamp. This runs into problems when the source table hasn’t stopped inserting rows for its latest timestamp value. Since that value is used as the next batch’s lower boundary, and the process believes it already has all rows matching that timestamp, it may miss some rows in the next batch. So a boundary interval was introduced. This caps the batch’s upper limit at the current time minus a given interval. For example, say the last recorded timestamp on the destination was 2013-02-18 12:30:00 and mimeo runs the same day at 13:30:00. With a 10 minute upper boundary (mimeo’s default), this would get all rows with values > 2013-02-18 12:30:00 and < 2013-02-18 13:20:00. If rows are constantly being inserted, this does mean the destination is always 10 minutes behind, but it also ensures that no rows are ever missed. The boundary interval is required to enforce data integrity, but it is configurable on a per-table basis.
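Using the numbers from that example, the pull boils down to something like this (a sketch against a hypothetical hits table with a created_at timestamp column):

-- Lower boundary: the last recorded timestamp on the destination
-- Upper boundary: the current time minus the 10 minute interval
SELECT *
FROM hits
WHERE created_at > '2013-02-18 12:30:00'
AND created_at < now() - interval '10 minutes';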

Another issue along the same lines has to do with setting a limit on how many rows are pulled each batch. If the maximum number of rows in a batch is pulled, the upper boundary could get cut off in the middle of any timestamp value, not just the latest values being inserted. This is handled by always removing the rows with the highest timestamp from the batch whenever the maximum number is pulled, delaying them until the next run (see the sketch below). Bigger problems occur when the batch contains timestamp values that are all identical: there is no way to ensure a consistent pull of data from the source in that case, so mimeo just cancels the batch entirely. To fix it, you must either remove the batch limit or set it high enough that the batch can contain at least two different timestamp values. The internal logging & monitoring system (pg_jobmon) sets off a critical alert if this occurs, so you will know if it happens. Before v0.10.0 a batch limit was always used, so this was a bigger concern then. Since that version, batch limits are no longer turned on by default, but I left the option available. If you use that option, just be aware of these limitations.
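Here is a hedged sketch of that safeguard, continuing with the hypothetical hits table from above and an example limit of 10000:

-- Pull a bounded batch, oldest rows first
CREATE TEMP TABLE batch AS
SELECT *
FROM hits
WHERE created_at > '2013-02-18 12:30:00'
AND created_at < now() - interval '10 minutes'
ORDER BY created_at
LIMIT 10000;

-- If the limit was reached, the highest timestamp may have been cut
-- off mid-value, so hold all of its rows back until the next run
DELETE FROM batch
WHERE created_at = (SELECT max(created_at) FROM batch)
AND 10000 = (SELECT count(*) FROM batch);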

And lastly, probably the biggest cause of issues with time-based replication is daylight saving time. When the time changes, you risk losing data on your destination table, especially when the clock is set back, since the process thinks it already got that data. Now is a good time to mention that if you can run your databases on a system that uses UTC/GMT time, you can make a lot of time-based issues with storing data go away. But if that’s not possible, and you want to use this replication method, there are some configuration options available. When you set up incremental replication using mimeo’s maker functions, it checks what timezone the database is running in. If you’re in UTC/GMT, you’ve got nothing to worry about. If you’re not, mimeo sets a flag to turn off replication around the time change. I haven’t found any better (ie, less outrageously complex) solution to this than just completely stopping replication for that time period. By default it turns off replication between 12:30am and 02:30am. If you need to narrow down or change that window, the start & end times are configurable.

So at first, basing replication on a timestamp seems like it wouldn’t be too complex an issue. But as always, things are never as simple as they may seem to be.

Written by Keith

February 18th, 2013 at 12:41 pm

PostgreSQL Extension Developer Tips – Part 4


So my latest release of mimeo (v0.10.0) taught me a lesson: when using a new feature, look up when it was added and make sure it’s compatible with all the versions you’re trying to support. The feature in question was the new GET STACKED DIAGNOSTICS command, which lets you capture and display more details in an exception block to make debugging problems easier. However, it was introduced in 9.2 and I’m trying to stay compatible with 9.1. I tried wrapping the call in an IF statement that checked the current postgres version, but the body of a plpgsql function is parsed when the function is installed, not branch by branch at execution time, so the IF couldn’t hide it. I’ve had to remove what I thought was a nice additional feature for now.
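For anyone unfamiliar with the feature, here is a generic example of what it offers on 9.2 (not mimeo’s actual code). Since the whole plpgsql body is parsed at creation, this function won’t even install on 9.1, no matter what IF statements surround the call:

CREATE FUNCTION error_context_demo() RETURNS void
    LANGUAGE plpgsql
    AS $$
DECLARE
    v_message   text;
    v_context   text;
BEGIN
    PERFORM 1/0;    -- force an error to land in the exception block
EXCEPTION WHEN OTHERS THEN
    -- 9.2+ only: capture extra details about the error just caught
    GET STACKED DIAGNOSTICS
        v_message = MESSAGE_TEXT,
        v_context = PG_EXCEPTION_CONTEXT;
    RAISE WARNING 'Message: % Context: %', v_message, v_context;
END
$$;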

The reason this post is an extension developer tip is the way I handled releasing a fix for this. If you’re upgrading an extension and you’re several released versions behind, running the ALTER EXTENSION … UPDATE TO … command runs all of the updates between your version and the target version in sequence. However, due to the issue described above, the 0.9.3 -> 0.10.0 update script fails on anything older than postgres 9.2. That means anyone on an older version updating from 0.9.3 or earlier to 0.10.0 or later could never get past that intermediary update script, and hence never upgrade. The key to allowing a bypass of this version update is in the way the update scripts are named.

mimeo--0.9.3--0.10.0.sql

The name you give the file must be the version you’re upgrading from and the version it is upgrading you to. The trick to bypass the 0.10.0 update is to just create an update script like this

mimeo--0.9.3--0.10.1.sql

With this file in place, the extension update planner will see that it provides a shorter update path and choose it instead of running the intermediate update step from 0.9.3 to 0.10.0. The important thing to remember is to include ALL changes that occurred between the given versions in the bypass script.

There’s a handy function that can show you the update path that an extension update (or downgrade) can take so you can be sure things will work as expected

pg_extension_update_paths('extension_name')

This will show you every single update and downgrade path possible, and if a valid path exists, all the steps along that path. This can be quite spammy for extensions with a long update history. Luckily this function returns a record set, so you can filter with a WHERE condition. Here’s an example with mimeo 0.9.2 installed and the path that would be taken without the bypass script.

keith=# select * from pg_extension_update_paths('mimeo') where source = '0.9.2' and path is not null;
 source | target |             path             
--------+--------+------------------------------
 0.9.2  | 0.9.3  | 0.9.2--0.9.3
 0.9.2  | 0.10.0 | 0.9.2--0.9.3--0.10.0
 0.9.2  | 0.10.1 | 0.9.2--0.9.3--0.10.0--0.10.1

With the bypass script available, the update paths turn into this.

keith=# select * from pg_extension_update_paths('mimeo') where source = '0.9.2' and path is not null;
 source | target |         path         
--------+--------+----------------------
 0.9.2  | 0.9.3  | 0.9.2--0.9.3
 0.9.2  | 0.10.0 | 0.9.2--0.9.3--0.10.0
 0.9.2  | 0.10.1 | 0.9.2--0.9.3--0.10.1

So now when I give the update command

ALTER EXTENSION mimeo UPDATE TO '0.10.1';

it will skip right over the 0.10.0 update script. This will allow PostgreSQL versions older than 9.2 to update without any issues. So, my apologies to anyone who grabbed my extension right away after my blog post and ran into issues. At least this gave me an opportunity for another tip!

Written by Keith

February 9th, 2013 at 11:42 am

Posted in PostgreSQL


Mimeo – A per-table replication extension for PostgreSQL


One of the biggest projects I’ve been working on over the last few months is an extension that came about from trying to formalize a per-table replication process that has long been in use with several of our clients but was never really organized. After nearly 300 hours of time logged working on it, and mentioning it several times in other blog posts, I figured it’s about time I talked more about it.

https://github.com/omniti-labs/mimeo

I got the name mimeo after searching a thesaurus for words similar to “copy” & “replicate” and coming across the mimeograph. The terms “low cost” and copying in “small quantities” seemed to fit the theme of what I was trying to accomplish, so the name stuck. There are other great add-ons for PostgreSQL that allow per-table replication (Bucardo being the one I’m most familiar with), but their setup and use can be a bit daunting, and if you just need a few tables copied, a bit overkill. The goal with mimeo was to keep the installation, maintenance and monitoring as simple as possible. Honestly, the hardest part of the extension, both in my experience and from what others have reported to me, is just managing the permissions. I’ve got some plans to make that easier, but the extension doesn’t assume or require any superuser privileges, which I think is another plus.

The existing code I was working from had several different replication methods, each with its own merits. I’ll be covering them over several blog posts, along with some general usage tips, since a single post discussing the entire thing would be a bit much. I’ve already done some pretty extensive documentation and even written a howto, so these posts will be mostly informational and not focused too much on the full setup and maintenance process.

Before I get into too much detail, I just have to thank the PostgreSQL team for getting the extension system into place with 9.1. The only reason I was able to even come close to organizing the existing processes into something formal like this was because of that. The versioning control of a group of objects within the database allows a much smoother upgrading (and downgrading) process. And also a big thanks to David Wheeler for the pgTAP suite. If you look in the tests folder you can see I made extensive use of it and it’s saved me a tremendous amount of development time (especially with the latest v0.10.0 release where I rewrote much of the refresh process). When dealing with data replication you want to be really sure you’re getting every bit of data across 100% of the time and not breaking anything as development progresses.

Now onto the details. The most basic way to copy a table from one database to another is to just grab the entire thing. That’s covered in mimeo with the snapshot replication method. A table set up with this method has its entire contents refreshed every time the refresh runs. To make this process more transparent to users of the table, a view with two underlying tables is used. The view only ever points to one table at a time. When the refresh runs, the table the view is not pointing at is truncated and repopulated. A brief lock is then taken to swap the view over to that freshly loaded table. If you had to refresh a rather large table in place, the lock a truncate takes would make the table unusable for the duration, and doing a delete instead of a truncate could cause some very heavy bloat in addition to the locking. The view swapping minimizes both of these issues.
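A hedged sketch of the swap, with made-up object names and a local table standing in for the remote source:

-- The view users query only ever points at one of two snap tables;
-- suppose it currently points at snap_table_0
TRUNCATE snap_table_1;

-- In reality the rows are pulled from the table on the source database
INSERT INTO snap_table_1 SELECT * FROM source_table;

-- A brief lock while the view definition is swapped to the fresh table
CREATE OR REPLACE VIEW dest_view AS SELECT * FROM snap_table_1;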

Some additions I was able to make to the snapshot process allow the automatic propagation of column changes. This includes adding & dropping columns as well as type changes. Indexes are also copied over at creation time (this will be optional soon). You shouldn’t need constraints on the destination end since those are already enforced on the source table, and re-checking them would just make the replication needlessly slower. Since the destination table is re-created from scratch on a column change, there is an extra configuration option to run additional commands afterwards, such as setting permissions on the view/table. This can also be handled with the default privileges settings that were added in 9.0, but the extension option lets you control them for each individually replicated table.

The snapshot method is ideal for smaller tables. It’s also much more efficient than the DML replication method (replaying every insert/update/delete, which I’ll be covering in a later post) if almost the entire table is updated in the span of time between table refreshes. One of the things I am currently working on is a way to just skip the refresh process if nothing has changed on the source. This would then make the snapshot process ideal for static tables or ones that rarely ever change.

That’s all for this post. We’re using this extension in some of our production environments already, so I’m confident in the code even though it’s not at “1.0” yet. I’d appreciate any feedback to help speed up my decision to make a stable release.

My next post will cover the incremental replication methods.

Written by Keith

February 7th, 2013 at 11:33 am