Jump to content

Bulk Upsert enhancements to Data Migrator


Recommended Posts

internet-8097838.thumb.jpg.49c5d73979375d40f84ee9c13119a71a.jpg

Data Migrator BULK UPSERT

Over the last year ibi has made a concerted effort to improve performance for data flows upserting data to target databases.

The effort has come in the form of updating our adapters for all databases that support merge and upsert. This allows for us to support bulk upserts, improving performance significantly.

In the example flows I will be showing a flow using insert/update to do upserts to a database table and then a flow using a bulk upsert and pointing out the improvement in throughput. 

These first two images show a flow loading data from a table in a MS SQL Server database to a Postgresql database table using insert/update. This is how upserts were done in Data Migrator before Bulk Upserts were available.

upd_flow_1.thumb.png.a71397686820bab85560f55277763420.png

 

upd_flow_2.thumb.png.190c70e7960c0c80b79137924ee6ffcb.png

 

As you can see the log below insert/update process took over 35 minutes to upsert over 2 million rows into the target table.

 

upd_flow_6.thumb.png.60122f60c5bf9538fee08c8c43f91ac6.png

The next two images show the configuration of a flow that utilizes bulk upsert. The flow is configured to be optimized and Bulk Load is used to Merge data into the Target. This leverages the Merge ability of the database and is significantly more efficient than insert/update.

upd_flow_3.thumb.png.fdad0d3f9f70dc85b019a0cfb537e7b6.png

*Note: On the Web Console Optimize is always set on, so this setting change is unnecessary.

upd_flow_4.thumb.png.e8843c1e0e1a289bfc8bd88944b03167.png

 

Both of these flows used the same source table, with target tables that are identical. As you see in the log below, using bulk upsert, upserted over 2 million rows into the table in just over 2 minutes. A significant improvement over insert/update which took roughly 35 minutes.

upd_flow_5.thumb.png.e144e8cd94803ede5ae46861d8e983d4.png

As of 9.3 bulk upsert is supported by the following databases:

MS Azure Synapse Analytics (formerly SQL DW)

MS SQL Server ODBC/AzureDB

MS SQL Server OLE DB/AzureDB

MySQL

Oracle

Teradata

Amazon Redshift

Apache Hive

EXASol

Google BigQuery

Greenplum DB

Hyperstage (PG)

Informix

MariaDB

Netezza

PostgreSQL

Snowflake

Sybase

Vertica

Salesforce.com

  • Like 1
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
  • Create New...