2021 Call for Code Awards: Live from New York, with SNL’s Colin Jost! Learn more

Using Db2 native connectivity in IBM DataStage SaaS

IBM DataStage SaaS is a true cloud-native SaaS data integration service re-imagined and re-built on IBM Cloud Pak for Data as a Service.

Native connectivity enables IBM DataStage jobs to transfer data between IBM DataStage SaaS and data sources without the hassle of configuring database connections, installing the driver/database client libraries, setting up additional environment variables, or patching to resolve compatibility.

Native connectivity enables out-of-the-box use of performant connectors to easily design integration flows in IBM DataStage SaaS.

Learning objectives

The purpose of this tutorial is to create native Db2 connectivity and work with the data connection on IBM DataStage SaaS.

In this tutorial, you will learn how to:

  • Create a Db2 connection
  • Work with data connection and run an IBM DataStage flow

Prerequisites

You will need an IBM Cloud account with IBM DataStage service instance to complete this tutorial.

For instructions on creating a project and working with an IBM DataStage flow, please refer to the Using the new IBM DataStage SaaS beta tutorial.

About the data

For this tutorial, we made use of an existing Db2 data connection. Please substitute details pertaining to data connection and its containing data with your own.

Estimated time

Completing this tutorial should take about 10 minutes.

Steps

Step 1. Navigate to project on IBM Cloud Pak for Data as a Service

  1. Create a project or navigate to an existing project on IBM Cloud Pak for Data as a Service.

For additional instructions on creating an IBM Cloud account, creating a project, or creating an IBM DataStage flow, please refer to the Using the new IBM DataStage SaaS beta tutorial.

Navigate

Step 2. Creating your data connection

  1. Click Add to Project in the upper-right corner.

Add to project

  1. Click on Connections to create a data connection.

Connections

  1. There are two options available for Db2: Db2 and Db2 (Optimized). Db2 is the platform-common connectivity connector, and Db2 (Optimized) is the IBM DataStage native connector. Clicking on the connection will provide the connector description and service compatibility information.

Click Db2 (Optimized) connector, then click Next.

Optimized

  1. Enter details for your Db2 (Optimized) connector.

Details

The Db2 (Optimized) connector eliminates previously required lengthy installation. You can connect simply by entering the database name, hostname, and port number.

Connect

For credentials, you can use username and password or an API key. Once the credentials are entered, click Create.

Credentials

This creates your Db2 (Optimized) connection as a data asset in your project.

Connection created

Step 3. Working with the data connection on the IBM DataStage flow editor

  1. Navigate to an existing DataStage flow or create a new flow under Projects > Assets.

Navigate to DataStage

  1. Open the DataStage flow, click on Connectors, and drag Db2 (Optimized) onto the canvas.

Drag Db2 (Optimized)

  1. Double-click on the connector to launch the connector details. Edit the connector name as desired (was named Db2_optimized_connector for this tutorial).

Click the Stage tab to see the connection properties, then click on the drop-down for Select connection, which displays all data assets for the project. Selected Db2_Optimized_DataStage created earlier.

Data assets

  1. Click the Output tab, then click on the Edit button under Columns.

View tables

This allows you to view tables and specify columns desired for use from connections defined under this project.

  1. Click Load in the upper-right corner.

Import Columns

This launches Import Columns. Select the connector you are interested in under Connections, and all schemas will be displayed. Select a schema to see all available tables. Click on the Preview icon to the right of every table to launch data details for the table. Preview allows you to see data directly on IBM DataStage without going to the database itself.

Preview

Select the DEPT_FROM_COS table to preview column details and data.

Preview column details

  1. Select the DEPT_FROM_COS table and click Next.

Select DEPT_FROM_COS

  1. You can choose to remove unnecessary columns before importing by clicking Import.

Remove columns

After import, your desired columns will be loaded.

After import

  1. To run a job with the connector, click back to the Stage tab and enter the schema and table name into the table name field. You can then create a flow and run a job using your IBM DataStage native Db2 connection.

Run using DataStage connection

For simple flow, attach a peek after Db2_optimized_connector and click Run to run the job.

Simple flow

Summary and next steps

In this tutorial, you learned how to create an IBM DataStage-native Db2 connection, view schema and data from the connection, and design and run a simple IBM DataStage flow job using a created connection. Learn more by visiting IBM DataStage.