Successful conversion of OpenX from MySQL to PostgreSQL

PostgreSQL is a great database and OpenX was open source ad server of choice.  After a recent update OpenX works well with PostgreSQL and was in production here at Persistent Objects.

The MySQL database had been in production for a while and there were differences between the old and new database schemas so needed a few minor modifications before conversion could take place.

There is a small amount of data that will continue to go to the old database before the conversion is complete, not because anything is lost in the conversion but because of timing, the data is converted then the database switched and the duration was in our case in the order of about two minutes, this was recoverable and timed for least impact.

Setup

  • Apache 2.2
  • MySQL 5.1 (before)
  • PostgreSQL 9.2 (after)
  • OpenX 2.8.11 (patched)
  • py-mysql2pgsql 1.6

Preparation

Initial testing with OpenX and PostgreSQL showed high compatibility and confirmed conversion was viable.  The only deficiency was with the storage of images in the database and this was not a requirement as performance is poor and not used here.

Data conversion with mysql2pgsql

Python mysql2pgsql version 1.6 was used for the conversion (available on GitHub) and the mysql2pgsql.yml configuration file follows.

# if a socket is specified we will use that
# if tcp is chosen you can use compression
mysql:
 hostname: localhost
 port: 3306
 socket: /tmp/mysql.sock
 username: openxusername
 password: *****
 database: openx
 compress: false
destination:
 # if file is given, output goes to file, else postgres
 file: openx.sql
 postgres:
  hostname: localhost
  port: 5432
  username: mysql2psql
  password: 
  database: mysql2psql_test

# if tables is given, only the listed tables will be converted.  leave empty to convert all tables.
#only_tables:
#- table1
#- table2
# if exclude_tables is given, exclude the listed tables from the conversion.
exclude_tables:
- ox_z_0a45e1e07196fa49
- ox_z_1cd43d7ae97488e5
- ox_z_5e1c055313ce2125
- ox_z_905126a6dd8302d0
- ox_z_a6946b3d6aa54d36
- tmp_ad_zone_impression

# if supress_data is true, only the schema definition will be exported/migrated, and not the data
supress_data: false

# if supress_ddl is true, only the data will be exported/imported, and not the schema
supress_ddl: true

# if force_truncate is true, forces a table truncate before table loading
force_truncate: true

# if timezone is true, forces to append/convert to UTC tzinfo mysql data
timezone: false

When running mysql2pgsql for the first time a sample configuration is created that can be modified.  As the two databases were slightly different it was decided to create a fresh database from a new install of OpenX and then add the converted data to that rather than use the conversion script for both.  Although we could have updated the database directly in this instance it was decided to convert to an intermediate file with the option 'file: openx.sql' and without the table creation scripts using the option supress_ddl: true.  There were also a number of previous backup tables that were not required so these were excluded in the exclude_tables list.

MySQL data sanitising prior to conversion

A column lb_reporting was found to be redundant and required removing, a few tinyint(1) columns would have been incorrectly converted to boolean as when newly created are of type smallint so these were amended in MySQL to facilitate the conversion, there is no functional difference other than being of a larger datatype and could have been easily reversed.  A few rows were found to contain nulls instead of zeros and were rejected by the strict checking of PostgreSQL and needed fixing.  The full mysql script follows.

alter table ox_clients drop column lb_reporting;
alter table ox_account_user_permission_assoc modify column is_allowed smallint not null default 1;
alter table ox_ad_zone_assoc modify column to_be_delivered smallint not null default 1;
alter table ox_banners modify column transparent smallint not null default 0;
update ox_campaigns set min_impressions= 0 where min_impressions is null;
alter table ox_clients modify column advertiser_limitation smallint not null default 0;
alter table ox_data_raw_ad_click modify column max_https smallint null;
alter table ox_data_raw_ad_impression modify column max_https smallint null;
alter table ox_data_summary_ad_zone_assoc modify column to_be_delivered smallint not null default 1;
alter table ox_users modify column active smallint not null default 1;
alter table ox_zones modify column is_in_ad_direct smallint null;

A dry run confirmed that the script worked smoothly and without any errors and subsequent testing proved successful.

OpenX configuration example.com.conf.php

The differences between the configuration file were minimal and scripted with diff. Although there were few differences, by scripting them they are easily reproducible and automated.

--- openx.p-o.co.uk.conf.php	2013-09-30 17:18:57.000000000 +0100
+++ openx.p-o.co.uk.conf.php	2013-09-30 15:29:31.000000000 +0100
@@ -24,16 +24,16 @@
 zoneLinkingStatistics=1
 
 [database]
-type=mysql
+type=pgsql
 host=localhost
 socket=
-port=3306
+port=5432
 username=openxuser
 password="*******"
 name=openxdatabase
-persistent=1
-mysql4_compatibility=1
-protocol=unix
+persistent=
+mysql4_compatibility=
+protocol=tcp
 compress=
 ssl=
 capath=

Going live

A newly created database was prepared in PostgreSQL.

With reliable backups and a period when activity was predicted to be minimal the conversion started.

The conversion commenced and the subsequent file loaded into the newly created database, on completion the configuration file was patched and the server gracefully restarted with no loss of service.

With sufficient preparation the conversion was automated and tested successfully so that on the live conversion, the transition proceeded as expected with no downtime.  The two minutes during the transition was minimal and following a second conversion of the original database showed the difference to be minimal and easily transferred.

Since writing this article, ad services have moved on and Persistent Objects almost exclusively use Google Adwords and Google AdSense.  Although our preference is still for PostgreSQL over MySQL, Oracle have improved MySQL 5.6 considerably.

by Alan Hicks