How to migrate from Oracle to Postgres on AWS?
Introduction
Since open source is getting popularity and become a new trend most of the companies in the world are moving from license-based software to open source for saving license costs and taking advantage of new-age technologies. PostgreSQL is one of open source database which is getting popular day by day. Many of the organizations migrating their existing databases to PostgreSQL database or choosing PostgreSQL their database option for new applications. I have done the migration for a couple of databases from oracle to PostgreSQL for CRM applications on the AWS cloud environment. Oracle to PostgreSQL database migration is very challenging and complex, it requires lots of refactoring and deep analysis of schema’s objects.
If you are looking to migrate it manually by yourself on-premises then ora2pg is an open-source tool that is used to migrate an Oracle database to a PostgreSQL compatible schema. It connects to the Oracle database, scans the database for its all objects, extracts schema’s structure and then generates SQL scripts that you can be applied into your target PostgreSQL database.
If you are looking to migrate your database to the AWS cloud then AWS provides tools to help you to make your migration smoother and easier.
PostgreSQL database feature
Compatible on several operation systems
Compatible with various programming languages
Highly extensible
Highly Reliable
Secure
The article will describe how to migrate from Oracle to PostgreSQL on the AWS RDS. Since it is heterogeneous migration of data engine. It involves the conversion of schemas as well.
The two parts of a database migration are the schema conversion and data replication. We will see how to use the AWS Schema Conversion Tool (AWS SCT) and the AWS Database Migration Service (AWS DMS) to tackle these two parts. Depending on the type of database migration you are doing, you can use several AWS tools and services to migrate data from an external database to AWS. You may find that the native migration tools for your database engine are more effective.
You would need the following prerequisite
- Set up target PostgreSQL database on AWS RDS or on Aurora PostgreSQL
- Download and install the AWS Schema Conversion Tool (AWS SCT) on your windows server which can be either on-prem or EC2 windows server on AWS
- Download the JDBC driver for your source database (Oracle)
- Download the JDBC driver for your target database (PostgreSQL)
AWS Schema Conversion Tool (AWS SCT)
AWS SCT is a tool that can converts schema from one database engine to another database engine. AWS SCT automatically converts schema from source to target database and generate an assessment report. The AWS SCT assessment report is a high level overview of the effort required to convert your source database to the target database. This report explains how much of schema can be converted automatically and how much of schema have to be converted manually. You need to check and take the necessary action based on the recommendations provided. This report provides details of objects with complexity for manual conversion.
Sample of how assessment report looks like
Once you setup SCT and provided drivers for your source and target databases.
Create a project in SCT and specify the source and target database.
Provide driver details for both source and target database as below.
Go to global settings-> Driver
Then connect to the source and target database specifying their connection details and generate a migration assessment report as explained above.
AWS Database Migration Service (AWS DMS)
AWS DMS migrates your data from your source database into your target database for any database engine. AWS DMS captures data manipulation language and supported data definition language changes that happen on the source database and applies these changes into the target database. By doing so, AWS DMS keeps your source and target databases in sync with each other.
Steps to perform data migration on AWS from source to target database
- Create your AWS RDS database instance for PostgreSQL and configure it.
- Install and setup SCT on your windows server
- Download drivers for oracle and PostgreSQL databases.
- Once you setup SCT and provided drivers for your source and target databases.
- Create a project in SCT and specify the source and target database.
- Then connect to the source and target database specifying their connection details and generate a migration assessment report for specific schema or for all schemas in the database.
- Create your schema in the target PostgreSQL database in the AWS RDS.
- Validate your schema, tables, and other objects connecting to DB. Also, perform manual steps recommended by the assessment report. Manual steps involve refactoring your database objects, table’s columns data type, and other steps according to your target DB.
- Drop foreign keys and secondary indexes on the target PostgreSQL database.
- Disable triggers in the target database.
- Create a replication instance.
- Create an IAM role with proper permission and attach it to the replication instance.
- Create the source and target database endpoint.
- Set up a DMS task to replicate your data – either full load or both full load and change data capture (CDC).
- Stop the task when the full load phase is complete, and recreate foreign keys and rebuild secondary indexes.
- Enable the DMS task.
- Check the logs generated by tasks in the Cloud watch service to see if there are any issues in data migration.
Schema creation in the target database can be done by AWS SCT. Once you covert schema from the AWS SCT tool. You can use AWS DMS for data replication from source to target database.
Now that we have prepared the schema on the target PostgreSQL database, we can start replicating the data on the target database. This is where DMS comes in. The great thing about DMS is that it does not only replicate data for each table, but it keeps that data up to date with CDC mode until you’re ready to migrate.
So to replicate data create a replication instance in the DMS service console in AWS.
Provide the necessary details to create a replication instance.
Note: Please choose the proper instance type and size depending on the workload.
Once the replication instance is created. Create source and target endpoint in the endpoint section of DMS service
Now setup Database migration tasks in AWS console. Click on create a task
Choose the replication instance and endpoint details created before. Now choose the migration type as per your need. Whether you want to migrate existing data or existing data with current changes as well.
Select value for Target table preparation mode like Do nothing or Truncate
Under Table mappings, select which schema you want to migrate and choose Add selection rule and transformation rule. The selection rule can be used to select a portion of schema, specific tables, eliminating unused columns, etc. The transformation rule can be used to transform table attributes like uppercase, lowercase, renaming multiple tables, etc.
You can directly modify the JSON file as well.
Note: check upper and lower case for tables name in source and target database .use mapping rule to convert lower to upper case or vice versa.
Now create a task. When the task is created, it can be started automatically if the checkbox starts on is enabled. You can monitor task progress in the DMS console by selecting the task and clicking on the Table statistics tab. When the full load is complete and cached changes are applied to the database, the task gets stop on its own.
If there is an error in a task like a table error please check cloud watch logs to see details of the error. If you are able to identify error then correct data as per error and restart the task.
Your table loads are now complete. Now please review the task logs in the AWS DMS console to make sure that there are no errors in the task. The next task is applying CDC, which applies changes in the order that they occurred in the source database.
Recreate foreign keys and any secondary indexes that were dropped earlier from the generated SQL script because the parent tables are updated before child tables in the target database.
Hold off enabling triggers until the migration complete because triggers can update data coming from the source.
That’s how data migration is completed by DMS. In some cases where DMS can include a few wrinkles, most of which are possible to troubleshoot without too much difficulty
Objects that are not supported by PostgreSQL.
- Bitmap, reverse key, join, and global indexes are not currently supported
- PostgreSQL does not have packages
- Synonyms are not currently supported
- PostgreSQL does not support view with read only option
Oracle vs PostgreSQL
- PostgreSQL’s PL/pgSQL procedural language is similar in many respects to Oracle’s PL/SQL. Both are block-structured, imperative languages, with similar formats for assignments, loops, and conditionals statements.
- Sequences have a different syntax in Oracle and PostgreSQL and will need to be updated manually
- A function that returns current dates and times are different in oracle and PostgreSQL. For example, current_date() function returns the current date in the oracle database while now() function returns the current date in the PostgreSQL database.
- in the oracle to_date function is used to return or convert the date data type while in the PostgreSQL to_timestamp in used to return or convert the date data type
- Time zone and date formats are different in the Oracle and the PostgreSQL DB
- RETURN is used to return value in the oracle database while RETURNS is used to return value in the PostgreSQL database
- Dual table present in the oracle database while it is not present in the PostgreSQL database
Verification and validation
Once the DMS task is completed. Login to your target database and verify the database objects in the schema. In case something is missing you need to manually rebuild it.
For example, in our case, we have rebuilt many indexes, modified a few data types. We have had
Logs like dbms_output.put_line in many stored procedures and functions in oracle that we have to manually rewritten like raise debug, raise a warning, raise note in PostgreSQL.
We have changed many below settings in the PostgreSQL configuration file postgresql.conf on the RDS server. These settings can also be applied through the pg_settings table in the database.
Changing search path parameters
Setting permissions
Setting certain performance parameters Setting different memory parameters
If you need more information in converting different objects for the database. Amazon has provided a playbook that provides detailed information about each object type, Data type, etc. Click here for more information.
Once verification of all schema’s objects are completed then start testing for database components. Testing the whole application and the migrated database is necessary because some of the functions are different in both databases, so the behaviour for those functions can be different.
Anticipate business impacts
Lastly, you need to anticipate how the migration will impact other business processes. Making sure the project timing doesn’t interfere with other business projects, as will determining how much downtime your business can withstand during the migration.
If your application cannot handle large downtime then do the full load migration in advance and the plan CDC changes on the day of cutover.
Conclusion
Migrating from one database to another database is not quite easy migration, it seems to be very challenging for heterogeneous migration. Oracle is pretty much compatible with PostgreSQL but still needs to work on some database objects. For Oracle to PostgreSQL migration AWS SCT and AWS DMS help a lot to make migration smooth. AWS provided playbook gives deep insight for not compatible database objects.
This article would help you to better understand the database migration process and strategies to follow, and make your cloud journey less challenging on AWS.
I hope you would have liked this article. Happy learning!!!