Back in 2017, our own Pavel Yarema touched on the best ways to migrate to RDS, with AWS Database Migration Service (DMS) as the top method for migration. In this article, I’ll expand on the benefits of using the service, cover appropriate use cases, and highlight some gaps in the service.
The traditional database move has historically been met with a high level of anxiety from multiple stakeholders in an organization. The business was concerned with losing data during or after the migration. IT was concerned with application uptime. The DBA was rehearsing her migration steps for her zero-dark-thirty process so as to appease both parties with minimal downtime. This started with shutting off traffic to the database, restoring a backup to the new instance, and validating that the data was correct, and that was just for a homogeneous migration on the same database platform. If you were moving from SQL Server to MySQL, SQL syntax corrections were required as well.
AWS Database Migration Service solves the migration problem as a managed service in the Cloud.
What is DMS
Just like the name implies, DMS is for migrating data from one database to another. DMS currently supports all of the major database platforms, including MySQL, SQL Server, Azure SQL Database, MongoDB, and PostgreSQL, with additional platforms added frequently. DMS can also do heterogeneous migrations, for example from MySQL to SQL Server.
Data Migration Use Cases
Homogeneous migrations are historically the happiest of migration paths, as you’re moving between database instances using the same database technology. Despite this, exhaustive planning was required to ensure uptime SLAs were met and no data was lost in the process.
Heterogeneous migrations meant a team was migrating from, say, MySQL to a completely different database technology, such as SQL Server. Reasons could vary widely, but this process brought with it a host of challenges, not the least of which was the difference in supported data types between platforms.
Maintaining development environments also brings a need to regularly migrate data to other databases. This is to ensure testing is occurring on accurate, up-to-date, production-like data.
Lastly, and most relevant, is moving data from an on-premises database to the Cloud. Database services like RDS and Redshift present their own challenges in that they are managed services and don’t offer every feature that we may be used to in, for example, a native SQL Server environment.
DMS Use Cases
Before embarking on your migration journey, we highly recommend using the Schema Conversion Tool so you have a clear understanding of which objects are supported in your target AWS database platform. In it you’ll also find an application called the Workload Qualification Framework that will assess and rate the workload for the entire migration.
As managed services go, the beauty of DMS is the pricing model and the ease of migration via a replication instance. When you’re done migrating, you can tear down your replication instance and charges for DMS drop to zero. You can also right-size your replication instance to meet the throughput needs of your source, and shrink it back down as needed. If this is to be a repeatable process for many different databases, DMS is well supported in CloudFormation.
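To make the right-size and tear-down steps concrete, here is a minimal sketch using the Boto3 DMS client. The helper names and the ARN in the usage note are hypothetical, the client is passed in as a parameter so the helpers can be exercised without AWS credentials, and the instance class names assume DMS's `dms.`-prefixed naming.

```python
def resize_instance(dms_client, instance_arn, instance_class):
    """Change the replication instance class, e.g. scale up before a large load."""
    return dms_client.modify_replication_instance(
        ReplicationInstanceArn=instance_arn,
        ReplicationInstanceClass=instance_class,  # e.g. "dms.c4.xlarge"
        ApplyImmediately=True,  # do not wait for the next maintenance window
    )


def tear_down_instance(dms_client, instance_arn):
    """Delete the replication instance so that its hourly charge stops."""
    return dms_client.delete_replication_instance(
        ReplicationInstanceArn=instance_arn,
    )
```

In practice you would call these with `boto3.client("dms")` and your own instance ARN, for example `resize_instance(client, arn, "dms.c4.xlarge")` before the load and `tear_down_instance(client, arn)` afterwards. Tasks attached to the instance generally need to be removed before the delete call will succeed.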
Below is a typical architecture for replicating between an on-premises database and RDS. There are two options for connecting to on-premises data sources securely: Peering or VPN.
Replication Tasks are the objects that do all of the work in DMS. In these tasks you specify which tables, columns, or filters you would like to end up in your target.
Filters make it easy to migrate specific source rows to your target database. This fits scenarios where a department wants to migrate to a data warehouse with only the data pertaining to that department. For example, if you only want to migrate rows from a table where the value in the “DepartmentName” column is equal to “Finance”, DMS would migrate only those rows.
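As a rough sketch of the Finance example, the snippet below builds a DMS table-mapping document with a single selection rule and a source filter. The schema name `hr`, the table name `Employees`, the rule name, and the helper function are hypothetical; only the “DepartmentName” column and the “Finance” value come from the example above.

```python
import json


def department_filter_mapping(schema, table, column, value):
    """Build a table mapping that includes only rows where column == value."""
    return {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "department-rows-only",
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
                "filters": [
                    {
                        "filter-type": "source",
                        "column-name": column,
                        "filter-conditions": [
                            {"filter-operator": "eq", "value": value}
                        ],
                    }
                ],
            }
        ]
    }


mapping = department_filter_mapping("hr", "Employees", "DepartmentName", "Finance")
# DMS expects the mapping as a JSON string when the task is created.
table_mappings_json = json.dumps(mapping, indent=2)
```

The resulting JSON string is what you would paste into the task's table mappings in the console, or pass as the `TableMappings` argument when creating the task through the API.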
Replication Task Limitations
AWS doesn’t currently allow you to schedule DMS tasks. If you have batch replication needs that should run on a schedule, you can use a Lambda function and the Boto3 DMS client to start DMS tasks.
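A minimal sketch of such a Lambda function might look like the following. It assumes the task ARN is supplied through a hypothetical `DMS_TASK_ARN` environment variable and that something outside this snippet (for example a scheduled EventBridge rule) invokes the function; neither is prescribed by DMS. The `reload-target` start type is used on the assumption that each scheduled run should redo the full load.

```python
import os


def start_task(dms_client, task_arn, start_type="reload-target"):
    """Start an existing DMS task and return the status DMS reports for it."""
    response = dms_client.start_replication_task(
        ReplicationTaskArn=task_arn,
        StartReplicationTaskType=start_type,
    )
    return response["ReplicationTask"]["Status"]


def lambda_handler(event, context):
    # boto3 is imported here so start_task can be tested without the AWS SDK.
    import boto3

    task_arn = os.environ["DMS_TASK_ARN"]  # hypothetical variable name
    return {"status": start_task(boto3.client("dms"), task_arn)}
```

Keeping the DMS call in its own function, with the client passed in, makes it easy to test the logic locally with a stub before wiring up the schedule.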
The simplest use case for DMS is batch replication. We’ll often see this when customers are doing a one-time migration to a new AWS platform from an on-premises database and staying there.
When DMS is creating a database in the target for the first time, both Data Manipulation Language (DML) and Data Definition Language (DDL) statements are executed for supported object types. Again, it’s important to understand what is supported in both the target database platform and the DMS service itself so you are aware of any pitfalls.
Batch Replication Limitations
DMS is primarily a data migration service concerned with getting just the data from point A to point B. It won’t migrate anything that isn’t essential to this end which includes “…secondary indexes, sequences, default values, stored procedures, triggers, synonyms, views and other schema objects not specifically related to data migration [insert notation for article] ”.
Also, “Either the source or the target database (or both) need to reside in RDS or on EC2. Replication between on-premises to on-premises databases is not supported. [location of quote]”
Continually replicating data from an on-premises database to RDS via DMS is not as well supported as point-in-time (batch) replication, but it can still be a useful option.
If you are looking to offload read I/O from your on-premises database to AWS and a read replica is not an option, DMS ongoing replication is a good alternative. You can create a DMS task that kicks off a Change Data Capture (CDC) process to read change logs from your source database. Depending on the database engine you’re using, the exact configuration can vary widely.
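For illustration, creating an ongoing-replication task through the Boto3 DMS client could be sketched as below. The function name, the task identifier, and the choice to include every table in one schema are assumptions made for the example; the endpoint and instance ARNs must already exist, and any engine-specific source configuration (log retention, supplemental logging, and so on) is outside this snippet.

```python
import json


def create_cdc_task(dms_client, task_id, source_arn, target_arn, instance_arn, schema):
    """Create a task that does a full load and then keeps applying changes."""
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-schema",
                # "%" is the wildcard: every table in the given schema.
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return dms_client.create_replication_task(
        ReplicationTaskIdentifier=task_id,
        SourceEndpointArn=source_arn,
        TargetEndpointArn=target_arn,
        ReplicationInstanceArn=instance_arn,
        MigrationType="full-load-and-cdc",  # use "cdc" to replicate changes only
        TableMappings=json.dumps(table_mappings),
    )
```

Creating the task does not start it; it still has to be started separately once DMS reports it as ready.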
Ongoing Replication Limitations
To reiterate the core competency of DMS: it’s focused squarely on data migration. This is important to remember for ongoing replication as well, because DDL changes to your source won’t be honored at the target database and could cause replication to stop altogether. For example, a column added to a source table won’t be created in the replication target database. Likewise, a change to a column’s data type at the source won’t be applied at the target.
The cost of DMS hangs primarily on the use of your replication instance and the resources it uses.
The cost model is based on a compute instance, storage, and data transfer. Currently US West (Oregon) has on-demand, Single-AZ, c4.large instances available at $0.154/hour. Your instance will come with storage already on it, and you can extend storage at a rate of $0.115 per GB per month for Single-AZ deployments.
For data transfer, if you are migrating data to and from AWS databases, transfer is free within the same Availability Zone. Anything you transfer out to a different AZ, Region, or outside of AWS will have charges associated with it.
Here’s how the monthly math breaks down for one replication instance and one target RDS instance both in us-west-2a (same availability zone), replicating from on-premises:
| Item | Monthly cost |
|---|---|
| Single-AZ c4.large replication instance (720 hrs) | $110.88 |
| 100 GB storage | Free |
| Transfer to replication instance from on-prem | Free (data in from Internet to AWS) |
| Transfer from replication instance to RDS | Free (within AWS network) |
* VPN Charges not included
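The arithmetic behind the table can be checked with a few lines of Python. This is only a sketch using the rates quoted in this article; the function name is made up, and `extra_storage_gb` is assumed to mean storage added beyond what comes with the instance.

```python
from decimal import Decimal

HOURLY_RATE = Decimal("0.154")          # c4.large, Single-AZ, us-west-2 (from above)
STORAGE_RATE_PER_GB = Decimal("0.115")  # per GB per month, Single-AZ (from above)
HOURS_PER_MONTH = 720


def monthly_cost(hours=HOURS_PER_MONTH, extra_storage_gb=0):
    """Return the monthly DMS charge in dollars, rounded to cents."""
    compute = HOURLY_RATE * hours
    storage = STORAGE_RATE_PER_GB * extra_storage_gb
    return (compute + storage).quantize(Decimal("0.01"))


print(monthly_cost())                     # 110.88
print(monthly_cost(extra_storage_gb=50))  # 116.63
```

Data transfer is left out because both transfers in this scenario are free; a cross-AZ or cross-Region target would add a transfer term.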
In summary, DMS solves the headache of migrating data to a growing list of supported database types without the cost and time we were historically used to. Whether you’re tasked with homogeneous or heterogeneous migrations, a move to the Cloud, or simple ongoing replication, DMS can be the lowest-cost option with the lowest risk.
For a deeper dive on the problems DMS solves, check out the AWS resources page for step-by-step migration playbooks, blog posts, and FAQs on the service.