Backstory
We were operating a legacy system backed by a DocumentDB v3.6 cluster. Over the years the workload on this cluster grew. While its performance remained adequate, costs skyrocketed because the instance type in use incurred IO-based charges on top of the hourly charges. Another instance type with a fixed hourly charge existed, but it was not available for this older version of the database engine. Consequently, we decided to migrate to the DocumentDB v5 engine, which supports instances with this newer charging model.
Challenges and preparation
The old cluster held 2TB of data, and we could not find any published details on how long a migration of that size would take. So we had to test it ourselves before moving on.
The old cluster was used by many legacy and newer apps written in Python 2.7 and 3.9. All of them used the PyMongo library to communicate with the cluster, and PyMongo had discontinued support for Python 2.7. Luckily, the last PyMongo version available for Python 2.7 supported the DocumentDB engine version we were migrating to. The Python 3.x apps were already using a newer version of the library, so no upgrade was needed there.
Another hurdle was that PyMongo handles the “retryWrites” connection parameter differently across versions. DocumentDB does not support retryable writes, so the parameter has to be set to false. Older versions of PyMongo did not need it declared explicitly, but the newest versions do. Because of this, we had to update the configuration of every app written in Python 3.x. We made the same change in the older apps as a precaution.
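As an illustration of the configuration change, here is a minimal sketch of a helper that builds a connection URI with “retryWrites” always set to false. The function name, host name and credentials are hypothetical and not taken from our apps. The import fallback is only there so the same snippet runs on both Python 2.7 and 3.x.

```python
try:
    from urllib.parse import quote_plus, urlencode  # Python 3.x
except ImportError:
    from urllib import quote_plus, urlencode  # Python 2.7


def build_docdb_uri(host, username, password, port=27017, **options):
    """Build a MongoDB-style URI that always disables retryable writes."""
    params = {"retryWrites": "false"}
    # Extra options (for example tls) are passed through as strings.
    params.update({key: str(value) for key, value in options.items()})
    # Re-apply the setting so a caller cannot switch it back on by accident.
    params["retryWrites"] = "false"
    credentials = "{}:{}".format(quote_plus(username), quote_plus(password))
    return "mongodb://{}@{}:{}/?{}".format(
        credentials, host, port, urlencode(params)
    )


# Hypothetical host and credentials, for illustration only.
uri = build_docdb_uri("docdb.example.internal", "app_user", "p@ss/word")
# With PyMongo installed, the URI would be handed to the client as usual:
#   from pymongo import MongoClient
#   client = MongoClient(uri)
```

Keeping the parameter in one shared helper, rather than in each app's own connection string, is one way to avoid a single app being missed. It is a design suggestion and not a description of how our apps were actually structured.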
The migration
We followed the migration guide published on the AWS blog to accomplish the task. In summary, we did the following.
1. Cloned the existing cluster
2. Carried out an in-place major version upgrade of the cloned cluster
3. Synchronized both clusters using the Amazon DocumentDB MVU CDC migrator tool
4. Changed over to the new cluster
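For readers who prefer to script steps 1 and 2, the following is a rough sketch of how they might look with the boto3 "docdb" client. It is an assumption that the steps are driven through the API; the same actions can be done from the console. The function names and cluster identifiers are hypothetical. The sketch leaves out creating instances in the clone and waiting for each operation to finish, both of which a real run would need.

```python
def clone_cluster(client, source_id, clone_id):
    """Step 1: create a copy-on-write clone of the existing cluster."""
    return client.restore_db_cluster_to_point_in_time(
        DBClusterIdentifier=clone_id,
        SourceDBClusterIdentifier=source_id,
        RestoreType="copy-on-write",  # clone rather than full copy
        UseLatestRestorableTime=True,
    )


def upgrade_cluster(client, cluster_id, engine_version="5.0.0"):
    """Step 2: request an in-place major version upgrade of the clone."""
    return client.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        EngineVersion=engine_version,
        AllowMajorVersionUpgrade=True,
        ApplyImmediately=True,
    )


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   client = boto3.client("docdb")
#   clone_cluster(client, "legacy-docdb", "legacy-docdb-clone")
#   upgrade_cluster(client, "legacy-docdb-clone")
```

Passing the client in as an argument keeps the two functions easy to dry-run against a stub before touching a real cluster.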
The migration process was smooth. Even though the old cluster held 2TB of data, cloning was quick. After cloning, we were able to move the new cluster to a newer instance generation and to the instance type with the new charging model. The migration tool also did a very good job, and synchronization was quick too. Changing over to the new cluster was easy because a domain name was masking the cluster URI. Downtime was minimal: we only stopped traffic to the old cluster during the data synchronization.
Final words
We saw a big reduction in DocumentDB costs after switching to storage optimized instances, which carry only a fixed hourly charge. The cluster has been stable and is performing well. Being able to move to a newer instance generation, which offers more performance for the same cost, was an added advantage.