ETL Patterns using iWay


iWay Service Manager ships with a unique connector that empowers developers to build flows implementing ETL-based operations. The challenges developers face in building ETL-based integration solutions often come down to performance and memory usage. The Data Flow connector is designed to address both.


Let's take a simple example where we move database rows from one table to another using the Data Flow connector.


The Database

Below is a screenshot of MySQL Workbench. We are going to use the Community Edition of MySQL.



In this MySQL instance we have two schemas, one named "demo" and the other "demo-target". Each of these schemas has a table named "models" with an identical structure. In this example we will move the data found in demo.models to demo-target.models.


Building our Flow

To get started we launch iWay Integration Tools (IIT) and create an application named "dataflowdemo" as shown below. Next we create a flow named "copydata" and drag a Data Flow object from the Palette onto the Flow Canvas. As shown below, IIT indicates that we have a bit more work to do: we will have to configure the Data Flow connector to get rid of these errors and also draw the proper connections in the flow.


We will start by arranging our objects as shown below:



The Data Flow connector is actually an iterator: on each pass it queries the source database for a specific number of rows (a batch) and creates an XML document whose values can be transformed using an XSLT stylesheet, an iWay transform, or an external program.
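The extract-in-batches, transform, load pattern the connector implements can be sketched outside iWay. The following is a minimal illustration in Python, with in-memory SQLite databases standing in for the two MySQL schemas; the table and column names come from this example, while the batch size and the `transform` hook are arbitrary stand-ins for the connector's configuration:

```python
import sqlite3

def copy_in_batches(src, dst, batch_size=100, transform=lambda row: row):
    """Iterate over the source table in fixed-size batches, optionally
    transforming each row before loading it into the target table."""
    cur = src.execute("SELECT model, categorycode, name, skucount FROM models")
    while True:
        batch = cur.fetchmany(batch_size)   # one pass of the iterator
        if not batch:
            break                           # source exhausted
        dst.executemany(
            "INSERT INTO models (model, categorycode, name, skucount) "
            "VALUES (?, ?, ?, ?)",
            [transform(row) for row in batch],
        )
        dst.commit()

# In-memory stand-ins for the demo and demo-target schemas.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE models "
               "(model TEXT, categorycode TEXT, name TEXT, skucount INTEGER)")
src.executemany("INSERT INTO models VALUES (?, ?, ?, ?)",
                [("M1", "CAT-A", "Widget", 3), ("M2", "CAT-B", "Gadget", 5)])

copy_in_batches(src, dst, batch_size=1)
print(dst.execute("SELECT COUNT(*) FROM models").fetchone()[0])  # 2
```

The connector does the equivalent work for you declaratively: the two queries and the batch size are configuration, and the transform step is whatever XSLT, iWay transform, or program you wire into the flow.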


Configuring the Data Flow Connector

Now we need to complete the flow by configuring the connector. 

Click on the plus sign to the right of the Data Flow object's configuration drop-down. This brings up the following dialog:



As we can see, this connector needs its JNDI/JDBC configuration requirements met. We will need to do this for both the source and target databases.



Depending on your database, these values will differ. Since we are using MySQL, yours might look something like this:





Once we have our providers defined, we're back to completing the Data Flow connector's generic configuration. When finished, your configuration should look like this:


Notice that we see no errors, and better yet, nothing in red.

Clicking Finish brings us back to the main configuration of the object itself. Notice that we have now met the connector's configuration requirements.




All that is left to do is provide some additional configuration properties local to the object. This connector requires both a source query and a target query.


For this example we have decided to keep it fairly simple.

Type in both the Source Extraction Query

Source Extraction Query
SELECT model,categorycode,name,skucount FROM demo.models;


 and the Target Load Query.

Target Load Query
INSERT INTO demotarget.models (model, name, categorycode, skucount) values (?model,?name,?categorycode,?skucount)
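Note that the target load query lists its columns in a different order (model, name, categorycode, skucount) than the extraction query. The `?model`-style markers appear to bind values by column name rather than by position, which is why the INSERT's column order need not match the extraction order. A minimal sketch of name-based binding, using Python's sqlite3 `:name` parameter style in place of iWay's `?name` syntax, with a dict standing in for one extracted row:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE models "
           "(model TEXT, name TEXT, categorycode TEXT, skucount INTEGER)")

# One row in extraction order, keyed by column name; because the INSERT
# binds by name, its column order does not have to match.
row = {"model": "M1", "categorycode": "CAT-A", "name": "Widget", "skucount": 3}
db.execute(
    "INSERT INTO models (model, name, categorycode, skucount) "
    "VALUES (:model, :name, :categorycode, :skucount)",
    row,
)
print(db.execute("SELECT name, skucount FROM models").fetchone())  # ('Widget', 3)
```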


The result is that we should see no errors, as shown below.



Once completed, we need to include the MySQL JDBC driver, which the Data Flow connector will use to perform the data transfer.

Adding the MySQL JDBC Driver

Double-clicking on the application's bundle launches its editor. We are interested in the Libraries tab, as shown below.



Eclipse gives you the option to add a driver found within your workspace or to browse your machine for the driver. We downloaded our driver from Maven Central
and made it available on our local disk, which we now bring into our application. All of our application's setup and dependencies are now in place.


Testing our Application

To test our application we have a couple of options: we could use IIT's step debugger, or invoke IIT's Test Run capability. Since the debugger is a bit more visual,
let's launch it by selecting the Debug icon found at the top of IIT and clicking on the runtime launcher, as shown below.



Once the debugger is started you should see a view similar to the one below.


By stepping through the flow you should see the variables change within the variables tab, as well as the document being transformed as the flow executes. On the first pass of the iteration you will see the first extraction document.



On the first execution of the Data Flow connector, the source extraction query runs and a document is created, as shown above. We can now transform the values
as needed; the result then feeds back into the connector to be loaded into the target database using the target load query. We call this a batch. The batch size is
configurable within the connector and can greatly impact both performance and memory usage.
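The batch-size tradeoff can be seen directly: a small batch bounds how many rows sit in memory at once but requires more passes of the iterator, while a large batch does the opposite. A small sketch, again using sqlite3's `fetchmany` as a stand-in for the connector's batched reads:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE models (model TEXT)")
db.executemany("INSERT INTO models VALUES (?)", [(f"M{i}",) for i in range(10)])

def count_passes(batch_size):
    """Number of iterator passes needed to drain the source table;
    batch_size is also the peak number of rows held in memory per pass."""
    cur = db.execute("SELECT model FROM models")
    passes = 0
    while cur.fetchmany(batch_size):
        passes += 1
    return passes

print(count_passes(1))   # 10 passes, 1 row in memory at a time
print(count_passes(10))  # 1 pass, all 10 rows in memory
```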

If we let this flow run to completion, the end result is that our target database is populated as shown below:




Notice that only the columns specified in our target query get populated. The user has full control over what gets extracted, transformed, and loaded.


Follow Up

If you like articles like this, please let us know and leave a comment. Our goal is to provide value to you and your organization and to help you reach your goals.
Feedback is always appreciated.




Edited by Joseph Vogel