Overview

Skill Level: Any Skill Level

This recipe helps you create, configure, compile, and run a DataStage job that writes data from DB2 to Google BigQuery.

Ingredients

1. IBM InfoSphere Information Server DataStage 11.7.1 or later

2. DB2 database

3. Google BigQuery account

Step-by-step

  1. Description

    BigQuery is Google’s fully managed, petabyte-scale, low-cost analytics data warehouse that lets users run super-fast SQL queries using the processing power of Google’s infrastructure.

    Information Server provides a native BigQuery Connector for reading data from and writing data to BigQuery tables, so that BigQuery can be integrated into ETL job designs.

    We demonstrate a sample use case here that performs a write operation on BigQuery using the BigQuery Connector. The DataStage job includes a DB2 Connector as the source stage and a BigQuery Connector as the target, so that data from DB2 is written to a table on Google BigQuery, moving data from an on-premises environment to the cloud.

    [Screenshot: DataStage job design with a DB2 Connector source stage linked to a BigQuery Connector target stage]

    In this recipe, I will show how to configure the BigQuery Connector properties to write data from DB2 to Google BigQuery.

  2. Configure the DB2 Connector as the source

    1. Provide the Database, Username, and Password details for DB2 in the connection properties of the DB2 Connector, as shown below:

    [Screenshot: DB2 Connector connection properties with Database, Username, and Password]

    2. Select the Generate SQL option to auto-generate the SELECT statement.

    3. Provide the DB2 Table name for the table that holds the data to be read; a script-level equivalent of this read is sketched after the screenshot.

    [Screenshot: DB2 Connector usage properties with Generate SQL and Table name]
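
    For reference, the following is a minimal Python sketch of the read that this step configures, using the ibm_db driver outside of DataStage. The hostname, port, credentials, and table name (MYSCHEMA.EMPLOYEE) are placeholders rather than values taken from this recipe.

      # Rough equivalent of the DB2 Connector with Generate SQL enabled:
      # connect with the database, username, and password from the connection
      # properties and run the generated SELECT against the source table.
      import ibm_db

      conn = ibm_db.connect(
          "DATABASE=SAMPLE;HOSTNAME=db2host.example.com;PORT=50000;"
          "PROTOCOL=TCPIP;UID=db2inst1;PWD=changeme;",
          "", ""
      )

      # The auto-generated statement is simply a SELECT over the chosen table.
      stmt = ibm_db.exec_immediate(conn, "SELECT * FROM MYSCHEMA.EMPLOYEE")

      rows = []
      row = ibm_db.fetch_assoc(stmt)
      while row:
          rows.append(row)  # each row is a column-name -> value dictionary
          row = ibm_db.fetch_assoc(stmt)

      ibm_db.close(conn)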

  3. Configure the BigQuery Connector properties to write to Google BigQuery

    1. Download the Google service account credentials JSON file and copy it to a location on the engine tier.

    2. Provide the fully qualified path to this JSON file in the Credentials file field of the connection properties, as shown below; a quick way to verify the key outside DataStage is sketched after the screenshot.

    [Screenshot: BigQuery Connector connection properties with the Credentials file path]
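
    To confirm that the credentials file on the engine tier is usable before running the job, the key can be loaded with the Google client libraries. This is a minimal check, assuming the google-auth and google-cloud-bigquery packages are installed; the file path below is a placeholder.

      # Load the service-account key and open a BigQuery client with it.
      from google.oauth2 import service_account
      from google.cloud import bigquery

      CREDENTIALS_FILE = "/opt/IBM/InformationServer/creds/bq-service-account.json"  # placeholder path

      credentials = service_account.Credentials.from_service_account_file(CREDENTIALS_FILE)
      client = bigquery.Client(credentials=credentials, project=credentials.project_id)
      print(client.project)  # prints the project that the key authenticates against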

    3. Provide the Schema name property for the schema (the BigQuery dataset) in which the table should reside.

    4. Provide the Table name property for the table to which the DB2 data should be written. If the table does not already exist, it is created during the job run and the data is then written to it; if it already exists, the data from the job is appended to it.

    [Screenshot: BigQuery Connector usage properties with Schema name and Table name]

    5. On the Input tab, provide the column names and types of the data to be written to Google BigQuery, as shown below:

    [Screenshot: BigQuery Connector Input tab with the column definitions]

    6. Compile and run the job. The data from the DB2 table is written to the BigQuery table; a script-level equivalent of this write is sketched after the screenshot below.

    When DataStage is configured to run on multiple nodes, each node writes its partition of the data to the BigQuery table in parallel.

    [Screenshot: job run showing the data from DB2 written to the BigQuery table]
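
    The following Python sketch is a rough, script-level equivalent of the write configured above, not the connector’s actual implementation: it authenticates with the service account JSON, describes the columns from the Input tab, creates the table if it does not exist, and appends the rows otherwise. The dataset, table, and column names, the file path, and the sample row are all placeholders.

      from google.cloud import bigquery
      from google.oauth2 import service_account

      CREDENTIALS_FILE = "/opt/IBM/InformationServer/creds/bq-service-account.json"  # placeholder path

      credentials = service_account.Credentials.from_service_account_file(CREDENTIALS_FILE)
      client = bigquery.Client(credentials=credentials, project=credentials.project_id)

      # Column names and types, mirroring the definitions on the connector's Input tab.
      schema = [
          bigquery.SchemaField("EMPNO", "STRING"),
          bigquery.SchemaField("FIRSTNME", "STRING"),
          bigquery.SchemaField("SALARY", "NUMERIC"),
      ]

      job_config = bigquery.LoadJobConfig(
          schema=schema,
          create_disposition=bigquery.CreateDisposition.CREATE_IF_NEEDED,  # create the table on the first run
          write_disposition=bigquery.WriteDisposition.WRITE_APPEND,        # append on later runs
      )

      # "rows" stands in for the records read from DB2 in the earlier sketch.
      rows = [{"EMPNO": "000010", "FIRSTNME": "CHRISTINE", "SALARY": 152750}]

      load_job = client.load_table_from_json(rows, "my_dataset.EMPLOYEE", job_config=job_config)
      load_job.result()  # wait for the load to finish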

  4. References

    https://cloud.google.com/iam/docs/creating-managing-service-account-keys

    https://cloud.google.com/bigquery/
