IBM DataStage SaaS is a true cloud-native SaaS data integration service re-imagined and re-built on IBM Cloud Pak for Data as a Service.
Native connectivity enables IBM DataStage jobs to transfer data between IBM DataStage SaaS and data sources without the hassle of configuring database connections, installing the driver/database client libraries, setting up additional environment variables, or patching to resolve compatibility.
Native connectivity enables out-of-the-box use of performant connectors to easily design integration flows in IBM DataStage SaaS.
The purpose of this tutorial is to create native Db2 connectivity and work with the data connection on IBM DataStage SaaS.
In this tutorial, you will learn how to:
- Create a Db2 connection
- Work with data connection and run an IBM DataStage flow
You will need an IBM Cloud account with IBM DataStage service instance to complete this tutorial.
For instructions on creating a project and working with an IBM DataStage flow, please refer to the Using the new IBM DataStage SaaS beta tutorial.
About the data
For this tutorial, we made use of an existing Db2 data connection. Please substitute details pertaining to data connection and its containing data with your own.
Completing this tutorial should take about 10 minutes.
Step 1. Navigate to project on IBM Cloud Pak for Data as a Service
- Create a project or navigate to an existing project on IBM Cloud Pak for Data as a Service.
For additional instructions on creating an IBM Cloud account, creating a project, or creating an IBM DataStage flow, please refer to the Using the new IBM DataStage SaaS beta tutorial.
Step 2. Creating your data connection
- Click Add to Project in the upper-right corner.
- Click on Connections to create a data connection.
- There are two options available for Db2: Db2 and Db2 (Optimized). Db2 is the platform-common connectivity connector, and Db2 (Optimized) is the IBM DataStage native connector. Clicking on the connection will provide the connector description and service compatibility information.
Click Db2 (Optimized) connector, then click Next.
- Enter details for your Db2 (Optimized) connector.
The Db2 (Optimized) connector eliminates previously required lengthy installation. You can connect simply by entering the database name, hostname, and port number.
For credentials, you can use username and password or an API key. Once the credentials are entered, click Create.
This creates your Db2 (Optimized) connection as a data asset in your project.
Step 3. Working with the data connection on the IBM DataStage flow editor
- Navigate to an existing DataStage flow or create a new flow under Projects > Assets.
- Open the DataStage flow, click on Connectors, and drag Db2 (Optimized) onto the canvas.
- Double-click on the connector to launch the connector details. Edit the connector name as desired (was named Db2_optimized_connector for this tutorial).
Click the Stage tab to see the connection properties, then click on the drop-down for Select connection, which displays all data assets for the project. Selected Db2_Optimized_DataStage created earlier.
- Click the Output tab, then click on the Edit button under Columns.
This allows you to view tables and specify columns desired for use from connections defined under this project.
- Click Load in the upper-right corner.
This launches Import Columns. Select the connector you are interested in under Connections, and all schemas will be displayed. Select a schema to see all available tables. Click on the Preview icon to the right of every table to launch data details for the table. Preview allows you to see data directly on IBM DataStage without going to the database itself.
DEPT_FROM_COS table to preview column details and data.
- Select the
DEPT_FROM_COStable and click Next.
- You can choose to remove unnecessary columns before importing by clicking Import.
After import, your desired columns will be loaded.
- To run a job with the connector, click back to the Stage tab and enter the schema and table name into the table name field. You can then create a flow and run a job using your IBM DataStage native Db2 connection.
For simple flow, attach a peek after Db2_optimized_connector and click Run to run the job.
Summary and next steps
In this tutorial, you learned how to create an IBM DataStage-native Db2 connection, view schema and data from the connection, and design and run a simple IBM DataStage flow job using a created connection. Learn more by visiting IBM DataStage.