Skill Level: Beginner

It is required to know how to create Bluemix web applications, connecting web services to them, and basic database administration.

A step-by-step tutorial to synchronize data between an on-premise and an on-demand database using Talend Data Integration and an IBM Bluemix application.

This tutorial is brought to you by

-Nora Castillo
-Jerson Rivas
-David Heredia


- An IBM Bluemix application

- A cloud database service

- A Database Management System (and a database) stored on your computer


  1. Create a Bluemix application

    First, create a PHP application using IBM Bluemix.

    A database service such as ClearDB must be connected to the application.


    In order execute the web application properly, do not forget to create a file called “options.json”, and include it in a folder named “.bp-config”.

    The options.json file must include the following code:

    "PHP_EXTENSIONS" : ["mysqli"]}
  2. Create a Talend Data Integration project

    Create a new Talend Data Integration Project:

    Click on “create new project”, and then, “Create”


    Then, Tick “Select an existing project” to select the previously created project and after that, click “Finish”.



  3. Create a connection for each database

    After the project is loaded, go to the Repository Panel and expand the Metadata menu. Then, right click on “DB Connections” and select “Create Connection”.


    It is important to create an on-premise database to store the data that would be uploaded to the on-demand database when performing an update.

    For this example, the on-premise database was called “Provider”, and the on-demand database was named “Datawarehouse”.

    Write the on-premise database name and then click “Next”:


    Then, insert the database Type and the database credentials

    For this example, the database management system that was used to create the “Provider” database was MySQL. So, that will be the DB Type.

    Then, click “Check” to confirm if the connection was successful.



    If so, click OK, and then, Finish.

    After creating a connection for the created for the on-premise database called “Provider”, We should proceed to create another for the on-demand database (Datawarehouse) in a very similar way to the first connection:


    Then, write the database credentials and click “Finish”.


  4. Extract Schemas from the two databases

    First, right click the “Datawarehouse” database connection and select “Extract Schema”.


    A window will appear.


    Click Next.

    Then, select the database and the tables you wish to transfer. In this case, all tables will be selected.


    After that, click “Finish”.


    Finally, extract the schema from the “Provider” database (right click on the database name displayed in the repository panel on the left, select “Extract Schema”, select the tables you wish to work with, click Next, and finally, click Finish)

  5. Synchronize!

    Go to the Repository Panel on the left, and select “Job Design”. Right click on it, then, select “Create Standard Job”.


    The following window will show up:


    Write the job name. In this case, it was called “etlFromProvider”. Then, click Finish.

    Go to Provider (Database) in the Repository Panel. Expand “Table Schemas” and select the table you are going to work with.

    Drag the table to the main area, and the Components window will pop up. Select tMySqlInput from the Components Window, and click OK.


    It will look this way:


    After that, expand “Datawarehouse” (name of the on-demand database) from the repository menu, and find “Table Schemas”. Expand it and drag a table to the main area.


    Select tMysqlOutput, and click OK.

    The diagram will look like the following one:


    Go to the palette, expand it, and select tMap.


    Drag tMap to the main area.

    Right click the Providers table icon in the main area. Select Row, and then, Main. Drag the arrow to the tMap component.


    Right click tMap, select Row, and then, select New Output. Drag the arrow to the Datawarehouse table.



    Give any name to the output.

    This would be the result:


    Double click the tMap component

    Then, click on Automap, on the upper left corner.


    All table rows should be related. Click on APPLY and then, OK.

    Finally, execute the process by clicking RUN:


    If the process ran successfully, the following message would appear:


    And the synchronized tables will be updated

Join The Discussion