Keith's Ramblings…

Archive for January, 2012

PostgreSQL Oracle FDW… in 8i?!


So one of our clients is still stuck running an old Oracle 8i database. Please, no comments on how there’s no more support and they should have moved off this long ago. We know. They know. Moving on…

The introduction of a new Oracle Foreign Data Wrapper piqued our interest. Could it possibly still work with 8i? Working with our SA team, we found that the oldest client libraries still available from Oracle are 10.2. We’re not exactly sure when Oracle dropped 8i support from its client libraries, so rather than experiment right away, the team went with the client libraries already known to work with our current setup: 10.1.0.2.0 (going by the package info). They compiled up some packages for our Solaris environment and off I went.

With the packages installed, setting up the extension couldn’t have been easier.
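All it takes is the standard extension creation command:

    CREATE EXTENSION oracle_fdw;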

This was my first attempt at using Foreign Data Wrappers, so the next hour or so was spent reading the oracle_fdw docs and jumping around the Postgres docs to see how it all works. We already had a connection between PostgreSQL and Oracle working via the dbi_link package, so the Oracle client connection was already set up and working (explaining that setup is a little out of scope for this blog post). The commands to create the server, user mapping & foreign table follow…
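Here’s a sketch of those three commands; the server name, credentials, and table definition are placeholders rather than our actual setup:

    -- Define the remote Oracle server (connect string is a placeholder)
    CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw
        OPTIONS (dbserver '//ora-host:1521/ORCL');

    -- Map the local postgres user to Oracle credentials (placeholders)
    CREATE USER MAPPING FOR postgres SERVER oradb
        OPTIONS (user 'scott', password 'tiger');

    -- Foreign table pointing at a hypothetical Oracle table; note the
    -- Oracle schema/table names are case-sensitive in the OPTIONS clause
    CREATE FOREIGN TABLE ora_customers (
        id    integer,
        name  character varying(100)
    ) SERVER oradb OPTIONS (schema 'SCOTT', table 'CUSTOMERS');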

Then run a select and see…
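For example, against the placeholder table above:

    SELECT * FROM ora_customers LIMIT 5;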

It works! This will make the (hopeful) migration off of Oracle 8i that much easier.

Could this possibly be faster than dbi_link for replicating data from Oracle to Postgres? I’ll be rewriting some of our data replication functions to use the FDW and running comparisons, and I’ll share the results in a future post.

Written by Keith

January 12th, 2012 at 1:47 pm

Posted in PostgreSQL


Linux Automated Minecraft Backup


No, I’m not a Minecraft player. I got into it for a little while during the Alpha, paid for the full client to support Mojang just because I thought the idea was fantastic and loved seeing what other people do. But I just couldn’t stick with it. I do, however, have several friends that are fanatical fans of the game and some cycles on the home server to spare. So I offered to host a server for them so they could all play together instead of their own single player clients at home.

One shortcoming I ran into for a while with the Linux version was the lack of a way to do automated backups. When running Minecraft from the command line in a screen session, there was no way to automatically shut the server down cleanly without possibly causing world corruption. I hadn’t really been keeping up with the small updates to the backend code, but apparently they finally added a “save-off” console command that allows a clean backup of the world files while the game is still running. The next problem was figuring out how to send that command to a screen session automatically.

I spent some time looking at screen options I’d never had a need for before, so I wasn’t familiar with them. The following is along the lines of the script I whipped up to perform a full backup of a Minecraft server running in screen window 3.
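(The directories, backup destination, and world folder name below are placeholder assumptions.)

    #!/bin/bash
    # Sketch of a Minecraft backup via screen's "stuff" command. Assumes a
    # single screen session with the Minecraft console in window 3; all
    # paths and the world folder name are placeholders.

    MC_DIR=/opt/minecraft
    BACKUP_DIR=/srv/backups/minecraft
    WINDOW=3

    # Send a command to the Minecraft console. printf appends the carriage
    # return that "presses enter" on the command inside the screen window.
    mc_cmd() {
        screen -p "$WINDOW" -X stuff "$(printf '%s\r' "$1")"
    }

    mc_cmd "say Backup starting. World saving is being disabled..."
    mc_cmd "save-off"    # stop the server writing world files mid-backup
    mc_cmd "save-all"    # flush pending world changes to disk first
    sleep 10             # give the flush a moment to finish

    tar czf "$BACKUP_DIR/world_$(date +%Y%m%d).tar.gz" -C "$MC_DIR" world

    mc_cmd "save-on"     # re-enable world saving
    mc_cmd "say Backup complete. World saving re-enabled."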

The stuff command was the magic I was looking for. It sends some nice ‘say’ commands to the console to let any players online know the backup is running, and it turns world saving off and on so a good, clean backup can be taken. It took some hunting around the web to find a way to send a return character in a stuff command. I found several, but printf was the first one that worked, so I went with it. I set this to run daily in cron, and my friends are now a little more thankful that their world is backed up more reliably than by me just remembering to do it on occasion.

Written by Keith

January 12th, 2012 at 12:35 am

Posted in Uncategorized


PG Extractor – A smarter pg_dump


UPDATE: This post still gets a lot of views due to search results for pg_extractor. Please note that the Perl version this post discusses is no longer in development; a complete rewrite in Python was done for version 2.0, and the version control features were removed. Please see the README file for more information.

For my debut blog post, I’ll be explaining a tool, pg_extractor, that I began working on at my current job. I’d like to first give a big thank you to depesz and the DBA team at OmniTI for their help in making this tool.

We already had a tool in use for doing a one-file-per-object schema dump of PostgreSQL databases, for sending to version control. It was originally designed around some complex queries that fetched the internal table, view & function schema, but with the release of 9.x, many of those queries broke due to internal structure changes. It also didn’t account for partitioned tables with inheritance, overloaded functions, and several other more advanced internal structures. I’d originally set out just to fix the schema queries to account for these changes and shortcomings, but depesz suggested a complete reworking of the tool using pg_dump/pg_restore instead of complex queries. This would hopefully keep it as future-proof as possible and keep the output in a format better guaranteed to represent how PostgreSQL sees its own object structure, no matter how that may change in the future.

SVN options have been integrated into the tool, and I plan to get Git in as well. While the ideal way to track a database’s schema in VCS is before you actually commit it to your production system, this was an easy way to have the schema available for reference and to automatically track changes outside of the normal development cycle.

Working on this tool has shown many shortcomings of the built-in pg_dump/pg_restore. One of my hopes for this project is to show what improvements can be made and possibly get them built in sometime in the future. One of the biggest is the lack of ACL and comment output for anything but tables. If you have many individual objects you’d like to export, formatting that for pg_dump can be quite tedious, and pg_dump can’t filter on anything other than tables and schemas. The filter options list for pg_extractor has become quite long and will probably only continue to grow; just run --help or see the documentation on GitHub. I think one of the nicest features is the ability to use external files as input to the filter options, and also to filter with regex!
Probably best to move on to some examples instead of continuing this wall of text…
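First, a basic run (the database name is a placeholder, and the exact flag spellings may differ from what’s shown here, so double-check against --help):

    pg_extractor.pl -U postgres --dbname=mydb --getall --sqldump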

This is a very basic use of the tool using many assumed defaults. It uses the postgres database user, grabs all objects (tables, views, functions/aggregates, custom types, & roles), and also makes a permanent copy of the pg_dump file used to create the individual schema files. I tried to use options that were similar to, or exactly the same as, the options for pg_dump/restore. There’s only so many letters in the alphabet though, so please check the documentation to be sure. Just like pg_dump, it will also use the PG environment variables $PGDATABASE, $PGPORT, $PGUSER, $PGHOST, $PGPASSFILE, $PGCLIENTENCODING if they are set (they are actually used internally by the script as well if the associated options are used).
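Next, pulling just the functions in one schema (again a sketch; -n here mirrors pg_dump’s schema option, and the flag names are assumptions to verify against --help):

    pg_extractor.pl -U postgres --dbname=mydb --getfuncs -n keith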

This will extract only the functions from the “keith” schema. Any overloaded functions are put into the same file. I’d thought about using the parameters to somehow make unique names for each version of an overloaded function, but I found this much easier for now, especially when going back and removing files if the --delete cleanup option is set.

This will extract only the specifically named functions in the given filename. You must ensure the full function signature is given, with only the variable types for the arguments. When using include files, it’s best to also explicitly name the schemas that the listed objects are in (it makes the temporary dump file that’s created smaller). The contents of any external file list are newline-separated, like so…
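(These signatures are made-up examples; note the overloaded function appears once per signature.)

    keith.add_customer(integer, text)
    keith.add_customer(integer, text, boolean)
    keith.remove_customer(integer)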

One of the things I made sure was possible for external file lists is that you can use an object list created with psql using the \t and \o options…
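For example, to build a table list (the schema and output file names are placeholders):

    \t
    \o table_filter
    SELECT schemaname || '.' || tablename FROM pg_tables WHERE schemaname = 'keith';
    \o
    \t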

You could then use the “table_filter” file with the --t_file or --T_file options. I use this ability at work for a cronjob that generates a specific list of views I want exported and tracked in SVN. That view list is not static and can change at any time, so the cronjob generates a new view filter file every day.
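The next example pulls table data as well. --t_file comes straight from the option list above; the data and format flags shown are assumptions, so check --help:

    pg_extractor.pl -U postgres --dbname=mydb --gettables --t_file=table_filter --getdata --Fc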

This extracts only the tables listed in the given filename along with the data in the pg_dump “custom” format. By default, pg_extractor’s output will all be in “plain” (-Fp) format since the intended purpose of the tool is to create output in a human-readable format.
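An exclusion-heavy run looks something like this sketch (the binary-path and regex-exclusion flag names are assumptions, and the schema names are placeholders):

    pg_extractor.pl -U postgres --dbname=mydb --getall \
        --pgdump=/opt/pgsql/bin/pg_dump --pgrestore=/opt/pgsql/bin/pg_restore \
        -N schema1 -N schema2 --regex_excl_file=part_exclude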

This excludes objects (most likely table partitions) whose names match the pattern objectname_pYYYY_MM or something fairly similar. The binaries are also not in the $PATH, and it excludes several schemas.
part_exclude file contains: _p_?(20|19)\d\d(_?\d+)*
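An SVN run would then look something like this (--svndel comes from the option list; --svn as the enabling flag is an assumption):

    pg_extractor.pl -U postgres --dbname=mydb --getall --svn --svndel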

This is an example using SVN. To keep the svn password out of system logs, the svn username & password have to be manually entered into the script file. It’s the only option that requires any manual editing of the source code; I may see about having this point to an external file instead. The --svndel option cleans up objects that have been removed from the database from the svn repository.
Update: Writing up this blog post got me motivated to fix having to edit the source to get a certain option to work. There is now an --svn_userfile option that takes the path to a file containing the svn username and password to use. See the help for formatting, set permissions appropriately on the file for security, and ensure the user running pg_extractor can read it.

One of the first fun issues I came across was handling objects that have special characters in their names. It’s rare, but since each object gets its own file based on its object name, the OS doesn’t particularly like those characters, and it caused some rather annoying errors every time the script ran. This is now handled by hex-encoding the special character and preceding it with a comma in the file name.
For example: table|name becomes table,7cname.sql
This hopefully makes it easy to decode if needed by any other tools. It’s actually decoded internally by the tool when you want objects that were deleted from the database to also have their files deleted on subsequent runs.

One issue that I still haven’t solved (and make note of in the source where it would possibly be handled) is that, in the pg_restore -l output, the signature shown for a function differs from the signature shown for any comment associated with that function whenever the variables in its parameter list are named.
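For example, hypothetical entries from a pg_restore -l listing: the function entry gives only the argument types, while the comment entry repeats the argument names too.

    236; 1255 16403 FUNCTION keith do_stuff(integer, text) keith
    4056; 0 0 COMMENT keith FUNCTION do_stuff(p_id integer, p_name text) keith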

Since looping back over the pg_restore -l list is how I find overloaded functions, ACLs and comments, it’s very hard to match those comments with their functions in all cases, because PostgreSQL has variable types with spaces in them. If anyone has a solution for this, I’d appreciate the feedback. This honestly seems like a bug to me, since only the variable types in the parameter list are needed to uniquely identify a function. If that’s OK for the function definition itself, it should be OK for the associated comments as well.

As I said in the beginning, I will hopefully get Git integration done as well. Another option I’m working on right now is a filter for objects that a role has permissions on, not just ownership of (which there is already an option for!). For example: dump all objects that the role “keith” has permissions on. Maybe even more specific permissions.

I hope others will find this as useful as I have. It’s one of my first large projects released into the public, so I’d appreciate any constructive feedback on code quality and bugs found. I’ve enjoyed working on it and learned a great deal in the process.

https://github.com/omniti-labs/pg_extractor

Written by Keith

January 5th, 2012 at 1:39 pm

Posted in PostgreSQL
