Upgrading DBMail from MySQL to PostgreSQL

Upgrading a MySQL database to PostgreSQL has many benefits and our experience of converting/upgrading many of them is captured in this DBMail to PostgreSQL conversion; thanks to good preparation and testing it was both straightforward and without issue.

Choosing a suitable time where impact would be minimal and then...

Conversion Tasks

  • Suspend delivery
  • Suspend Nagios or other automatic monitoring
  • Halt the IMAP and POP services
  • Alter the MySQL database prior to migration
  • Transfer the data from MySQL to PostgreSQL
  • Fix the PostgreSQL sequences
  • Undo the MySQL schema changes
  • Vacuum the PostgreSQL database
  • Send a test message manually
  • Re-enable the Nagios automatic monitoring
  • Resume delivery as normal

Preparation

Although the process took a couple of hours, the process was largely automated and had been fully tested beforehand ensuring there were no surprises.  All the fun happened during the preparation for example, an old MySQL database with duplicate entries that needed to be fixed as PosgreSQL is strict with its indexes.  Your mileage may vary.

For anyone wishing to upgrade, the following shell script adapted from the dbmail source may be of help.

#!/bin/sh

DBMAIL_DATABASE=dbmail
DBMAIL_PREFIX=dbmail
echo Converting dbmail from MySQL to PostgreSQL

## Pre conversion to PostgreSQL
echo Preparing MySQL structure
mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_mailboxes
MODIFY seen_flag smallint NOT NULL default 0,
MODIFY answered_flag smallint NOT NULL default 0,
MODIFY deleted_flag smallint NOT NULL default 0,
MODIFY flagged_flag smallint NOT NULL default 0,
MODIFY recent_flag smallint NOT NULL default 0,
MODIFY draft_flag smallint NOT NULL default 0,
MODIFY no_inferiors smallint NOT NULL default 0,
MODIFY no_select smallint NOT NULL default 0,
MODIFY permission smallint NULL default 2;"

mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_messages
MODIFY seen_flag smallint NOT NULL default 0,
MODIFY answered_flag smallint NOT NULL default 0,
MODIFY deleted_flag smallint NOT NULL default 0,
MODIFY flagged_flag smallint NOT NULL default 0,
MODIFY recent_flag smallint NOT NULL default 0,
MODIFY draft_flag smallint NOT NULL default 0;"

mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_partlists
MODIFY is_header smallint NOT NULL default 0;"

mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_sievescripts
MODIFY active smallint NOT NULL default 0;"

py-mysql2pgsql -v

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(alias_idnr) FROM ${DBMAIL_PREFIX}_aliases"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_aliases
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(alias_idnr) + 1 FROM ${DBMAIL_PREFIX}_aliases"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_alias_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_aliases is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_authlog"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_authlog
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_authlog"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_authlog_id_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_authlog is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_envelope"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_envelope
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_envelope"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_envelope_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_envelope is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_filters"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_filters
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_filters"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_filters_id_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_filters is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_headername"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_headername
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_headername"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_headername_id_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_headername is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_headervalue"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_headervalue
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_headervalue"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_headervalue_id_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_headervalue is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(mailbox_idnr) FROM ${DBMAIL_PREFIX}_mailboxes"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_mailboxes
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(mailbox_idnr) + 1 FROM ${DBMAIL_PREFIX}_mailboxes"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_mailbox_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_mailboxes is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(message_idnr) FROM ${DBMAIL_PREFIX}_messages"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_messages
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(message_idnr) + 1 FROM ${DBMAIL_PREFIX}_messages"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_message_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_messages is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(messageblk_idnr) FROM ${DBMAIL_PREFIX}_messageblks"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_messageblks
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(messageblk_idnr) + 1 FROM ${DBMAIL_PREFIX}_messageblks"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_messageblk_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_messageblks is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_mimeparts"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_mimeparts
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_mimeparts"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_mimeparts_id_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_mimeparts is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_physmessage"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_physmessage
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_physmessage"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_physmessage_id_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_physmessage is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_referencesfield"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_referencesfield
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_referencesfield"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_referencesfield_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_referencesfield is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(idnr) FROM ${DBMAIL_PREFIX}_pbsp"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_pbsp
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(idnr) + 1 FROM ${DBMAIL_PREFIX}_pbsp"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_seq_pbsp_id', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_pbsp is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(id) FROM ${DBMAIL_PREFIX}_sievescripts"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_sievescripts
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(id) + 1 FROM ${DBMAIL_PREFIX}_sievescripts"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_sievescripts_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_sievescripts is empty
fi

NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT count(user_idnr) FROM ${DBMAIL_PREFIX}_users"`
if [ ${NEWSEQ} -gt 0 ]
then
	echo Updating sequence for ${DBMAIL_PREFIX}_users
	NEWSEQ=`psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT max(user_idnr) + 1 FROM ${DBMAIL_PREFIX}_users"`
	psql -d ${DBMAIL_DATABASE} --tuples-only -c "SELECT setval('public.${DBMAIL_PREFIX}_user_idnr_seq', ${NEWSEQ} , true)"
else
	echo ${DBMAIL_PREFIX}_users is empty
fi

## Return to MySQL
echo Restoring MySQL structure
mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_mailboxes
MODIFY seen_flag tinyint(1) NOT NULL default 0,
MODIFY answered_flag tinyint(1) NOT NULL default 0,
MODIFY deleted_flag tinyint(1) NOT NULL default 0,
MODIFY flagged_flag tinyint(1) NOT NULL default 0,
MODIFY recent_flag tinyint(1) NOT NULL default 0,
MODIFY draft_flag tinyint(1) NOT NULL default 0,
MODIFY no_inferiors tinyint(1) NOT NULL default 0,
MODIFY no_select tinyint(1) NOT NULL default 0,
MODIFY permission tinyint(1) NULL default 2;"

mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_messages
MODIFY seen_flag tinyint(1) NOT NULL default 0,
MODIFY answered_flag tinyint(1) NOT NULL default 0,
MODIFY deleted_flag tinyint(1) NOT NULL default 0,
MODIFY flagged_flag tinyint(1) NOT NULL default 0,
MODIFY recent_flag tinyint(1) NOT NULL default 0,
MODIFY draft_flag tinyint(1) NOT NULL default 0;"

mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_partlists
MODIFY is_header tinyint(1) NOT NULL default 0;"

mysql ${DBMAIL_DATABASE} --exec "ALTER TABLE ${DBMAIL_PREFIX}_sievescripts
MODIFY active tinyint(1) NOT NULL default 0;"

Data conversion used the Python mysql2pgsql version 1.6 (available on GitHub) and the mysql2pgsql.yml configuration file follows.

# if a socket is specified we will use that
# if tcp is chosen you can use compression
mysql:
 hostname: localhost
 port: 3306
 socket: /tmp/mysql.sock
 username: dbmailusername
 password: password
 database: dbmail
 compress: false
destination:
 # if file is given, output goes to file, else postgres
 file:
 postgres:
  hostname: localhost
  port: 5432
  username: dbmailusername
  password: password
  database: dbmail

# if tables is given, only the listed tables will be converted.  leave empty to convert all tables.
only_tables:
- dbmail_acl
- dbmail_users
- dbmail_mailboxes
- dbmail_physmessage
- dbmail_messages
- dbmail_aliases
- dbmail_authlog
- dbmail_auto_notifications
- dbmail_auto_replies
- dbmail_filters
- dbmail_headername
- dbmail_headervalue
- dbmail_header
- dbmail_keywords
- dbmail_messageblks
- dbmail_mimeparts
- dbmail_partlists
- dbmail_pbsp
- dbmail_referencesfield
- dbmail_replycache
- dbmail_sievescripts
- dbmail_subscription
- dbmail_usermap
- dbmail_envelope

# if exclude_tables is given, exclude the listed tables from the conversion.
exclude_tables:
- dbmail_fromfield
- dbmail_ccfield
- dbmail_tofield
- dbmail_subjectfield
- dbmail_datefield


# if supress_data is true, only the schema definition will be exported/migrated, and not the data
supress_data: false

# if supress_ddl is true, only the data will be exported/imported, and not the schema
supress_ddl: true

# if force_truncate is true, forces a table truncate before table loading
force_truncate: true

# if timezone is true, forces to append/convert to UTC tzinfo mysql data
timezone: false

A few hours well spent.

by Alan Hicks