Migrate Amazon RDS (MySQL) Data from Azure MySQL — MyDumper, MyLoader

Amar Daxini
2 min readNov 15, 2021

It’s the third part of a series of our cloud migration journey

Database is a critical part of any application. We had to make sure that we plan the migration carefully since we had more than 300–400GB of data and the entire data migration and validation should be completed within 6 to 7 hours of the cutover time.

In this blog, I will discuss the data migration approach and key learning.

Few more things to be noted before we start the migration.

  • We had MyISAM for some of the tables which we need to move to Innodb since azure only supports Innodb.
  • We also observed that we used different charset at various places so we need to standardize to utf8 in azure.
  • Few tables where containing large data.

Azure Data Migration Service (DMS)

Initially, we explored DMS and it is a great tool but we hit some limitations.

Security was major concern, we don’t want to open our database to the public both aws and azure side.

They also need all tables in InnoDB format. We can do table migration at aws but it will be problematic and time-consuming so we decided to go back to older ways of migration which is script-based dump and restore approach.

Script Base Dump and Restore Approach

It’s easier to manage and have more control and we can run this script parallel

Approach

  • Created VM in which IP is whitelisted at RDS security group and since it’s inside out vnet so it can easily access our database.
  • Create schema for MySQL table in which we have removed different charset and remove the index and move MyISAM to Innodb.
  • We restore Schema on Azure Database.
  • Get Database dump: Initially, we tried with msyqldump but it’s not good for large tables and databases and we want to reduce time so we moved to MyDumper, MyLoader which gives us parallelism and performance both.
  • Since this operation is very high IO so during migration we increase our IOPS limit on both sides.
  • After we got data from MySQL rds which took not more than 20–40 min max but restoring data took almost 4–5 hours
  • We also tune a few configurations from azure MySQL to optimize further, a few parameters we identify based on our data. Some of them are listed below
log_bin = OFF
innodb_strict_mode = OFF
innodb_io_capacity_max = 8000
innodb_io_capacity = 8000
innodb_write_io_threads = 4
innodb_read_io_threads = 4
innodb_log_file_size = 127418953172
innodb_max_dirty_pages_pct = 1
innodb_flush_log_at_trx_commit = 0
  • After migration some of parameter is reverted and restarted mysql
log_bin = ON
innodb_strict_mode = ON
innodb_log_file_size = <256MB>
innodb_max_dirty_pages_pct = 75
innodb_flush_log_at_trx_commit = 1

There are a few problems we encounter while using Azure Mysql Architecture.

We were using PHP for a few applications and it’s a framework and doesn’t have a connection pooler. So every request open a new connection which is heavy and due to Azure Mysql Architecture being a little bit different from AWS(Out of the box connection pooling) we have added ProxySQL and PgBouncer as a connection pooler to our applications.

--

--

Amar Daxini

15+ years of experience who enjoys building large scalable products & platforms. Passionate about startups, working with new and emerging technologies.