
PostgreSQL and pgloader

Recently I migrated a small but non-trivial MySQL schema (25 tables, about 500 columns, no triggers or stored procedures) to PostgreSQL using the open source pgloader utility.

pgloader supports migration from several databases/formats (MySQL, SQLite, MS SQL, dBase, CSV) to PostgreSQL.

pgloader is fast: it took less than a minute to migrate/load both the schema and data into PostgreSQL 9.1 on my Mac notebook running VirtualBox.

Executive Summary

pgloader is a free utility that lets you quickly migrate database schemas and data to PostgreSQL. It is helpful for setting up a PoC to estimate the work required for a full migration analysis. (It does not migrate application SQL or application code.)

Installation on CentOS 7

yum install freetds postgresql
createdb mydb
# unpack and run the build script from github or use the Docker image
make pgloader
pgloader mysql://root@localhost/mydb postgresql:///mydb

Installation on Debian or Ubuntu

apt-get install pgloader postgresql
createdb mydb
pgloader mysql://root@localhost/mydb postgresql:///mydb

pgloader pluses:

– free
– fast
– configurable with pgloader rules
– reasonable result for a first pass even without custom pgloader rules
– “pro DBA” interface – accepts a configuration file, called a load file, and emits pgloader.log.
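
As a rough illustration of the load file mentioned above, here is a minimal sketch that writes one to disk and runs it only when asked. The database URLs are the ones used in this post; the file name migrate.load, the RUN_MIGRATION switch, the CAST rule and the rename rule are assumptions for illustration, not my actual configuration, and clause support may vary with your pgloader version.

```shell
#!/bin/sh
# Sketch: write a pgloader load file and (optionally) run it.
# The URLs come from the article; everything else is an illustrative assumption.
set -eu

LOAD_FILE="migrate.load"   # hypothetical file name

# Quoted heredoc delimiter: the shell copies the text verbatim.
cat > "$LOAD_FILE" <<'EOF'
LOAD DATABASE
     FROM mysql://root@localhost/mydb
     INTO postgresql:///mydb

 WITH include drop, create tables, create indexes, reset sequences

 CAST type datetime to timestamptz
           drop default drop not null using zero-dates-to-null

 ALTER TABLE NAMES MATCHING 'user' RENAME TO 'users';
EOF

echo "wrote $LOAD_FILE"

# "include drop" drops existing target tables, so only run on request.
if [ "${RUN_MIGRATION:-0}" = "1" ]; then
    pgloader "$LOAD_FILE"
fi
```

Running the script with RUN_MIGRATION=1 set in the environment would hand the file to pgloader; without it, the script only writes the file so you can review the rules first.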

pgloader minuses:

– unusual in that it is written in Lisp, but that is not user-visible
– doesn’t create PostgreSQL users/roles (to be expected as the SQL standard doesn’t specify these, thus they vary greatly across databases)
– detects and double-quotes reserved object names, but doesn’t warn about names that clash with PostgreSQL built-ins, e.g. the table name ‘user’
– will require time-consuming schema cleanup for most use cases. See below.

Schema/Data Cleanup Notes

  • MySQL timestamps are displayed without a time zone, but by default pg shows … “+00” unless you declare the column as “timestamp without time zone”, or you format the value with EXTRACT() or TO_CHAR(). Read about related pgloader rules here.
  • MySQL text and varchar columns can be migrated to either pg text or bytea types, with varying results
  • treat the table name ‘user’ as reserved: USER is a reserved keyword in PostgreSQL (equivalent to CURRENT_USER), so an unquoted reference to user does not resolve to your table, whatever the search path
  • don’t underestimate how long schema cleanup will take. Although pgloader runs quickly, PostgreSQL performs far fewer implicit type casts than MySQL, so application SQL statements that relied on them fail until they are fixed
  • MySQL and PostgreSQL have different models for representing users/roles and timezones that need to be dealt with sooner rather than later. Here is some advice on timezone setting: Adding timezone to naive datetime fields from MySQL #331
  • migrating applications from MySQL to PostgreSQL is easier with PostgreSQL 9.5 or later since it has INSERT … ON CONFLICT DO UPDATE (UPSERT).
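
Several of the cleanup notes above can be sketched as SQL. The script below writes a small cleanup file and runs it with psql only when asked. The column names created_at, id and name, the file name cleanup.sql and the RUN_CLEANUP switch are hypothetical; the upsert assumes id has a primary key or unique constraint, and the timestamp conversion assumes the stored values are UTC.

```shell
#!/bin/sh
# Sketch: generate post-migration cleanup SQL and (optionally) apply it.
# Table and column names other than "user" are illustrative assumptions.
set -eu

SQL_FILE="cleanup.sql"   # hypothetical file name

cat > "$SQL_FILE" <<'EOF'
-- USER is a reserved word, so the migrated table must be double-quoted
-- until it is renamed (skip this statement if it was already renamed).
ALTER TABLE "user" RENAME TO users;

-- Show a timestamptz value without the trailing "+00".
SELECT to_char(created_at, 'YYYY-MM-DD HH24:MI:SS') AS created_at
  FROM users;

-- Or store it as a zone-less timestamp, treating existing values as UTC.
ALTER TABLE users
  ALTER COLUMN created_at TYPE timestamp without time zone
  USING created_at AT TIME ZONE 'UTC';

-- PostgreSQL 9.5+ counterpart of MySQL's INSERT ... ON DUPLICATE KEY UPDATE.
-- Requires a unique or primary key constraint on id.
INSERT INTO users (id, name)
VALUES (1, 'alice')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
EOF

echo "wrote $SQL_FILE"

# Stop at the first failing statement instead of carrying on.
if [ "${RUN_CLEANUP:-0}" = "1" ]; then
    psql -v ON_ERROR_STOP=1 -d mydb -f "$SQL_FILE"
fi
```

Keeping the statements in a file makes it easy to rerun the cleanup after each fresh pgloader pass while the rules are still changing.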

Perl CGI::Session Notes

pgloader migrates the MySQL CGI::Session table using a PostgreSQL text column. Converting the a_session column to bytea works better:

alter table sessions alter a_session type bytea using a_session::bytea;

My Migration Results

After renaming the table ‘user’ to ‘users’ and altering the sessions table (see above), around half of the SELECT queries worked as-is and I could log in to the application and click around. :)

However, virtually all of the INSERT and UPDATE statements had to be rewritten, taking 2 man days to get to an alpha version.

Database Migration Alternatives

AWS provides 3 powerful data migration tools under the AWS Database Migration Service banner that are either free to use or have a 6-month trial.

pgloader: Homepage, github, Manual, Licence
postgresql.org: pg_dump

