Google Cloud Storage is an online file storage web service for storing and accessing data on Google Cloud Platform infrastructure. The service combines the performance and scalability of Google’s cloud with advanced security and sharing capabilities.
Information Server provides a native Google Cloud Storage Connector that reads data from and writes data to files on Google Cloud Storage, so that this data can be integrated into an ETL job design.
This sample use case performs a write operation on Google Cloud Storage using the Google Cloud Storage Connector. The DataStage job has a DB2 Connector as the source stage and a Google Cloud Storage Connector as the target. Data read from DB2 is written to a file stored on Google Cloud Storage, moving it from an on-premises environment to the cloud.
In this recipe, I will show you how to configure the Google Cloud Storage Connector properties to write data from DB2 to Google Cloud Storage.
Configure DB2 connector as source
1. In the connection properties of the DB2 Connector, provide the Database, Username, and Password details for DB2.
2. Select the Generate SQL option to auto-generate the SELECT statement.
3. Provide the name of the DB2 table that contains the data to be read.
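To make step 2 concrete, here is a minimal sketch of the kind of statement an auto-generated SELECT amounts to: a column list taken from the stage's column definitions plus the table name. The helper `build_select` and the table and column names are hypothetical, and the exact text the DB2 Connector generates may differ.

```python
def build_select(table, columns):
    """Build a plain SELECT statement that lists each column explicitly.

    This only illustrates the idea behind Generate SQL; it is not the
    connector's own implementation.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(columns)
    return f"SELECT {column_list} FROM {table}"


# Hypothetical table and column names, for illustration only.
statement = build_select("SALES.CUSTOMERS", ["CUST_ID", "NAME", "CITY"])
print(statement)
```

Listing columns explicitly, rather than using `SELECT *`, keeps the statement aligned with the column definitions that the downstream stage expects.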
Configure Google Cloud Storage Connector Properties to write to Google Cloud Storage
1. Download the Google service account credentials JSON file and copy it to any location on the Engine tier.
2. In the Connection Properties, provide the fully qualified path to this JSON file under Credentials file.
3. Set the Write Mode to “Write” and provide the name of the Bucket to which the file is to be written. If the bucket does not already exist in Google Cloud Storage, it can be created during the job run by setting the Create Bucket option to “Yes”.
4. In the File Name property, provide the name of the file to which the data from DB2 is to be written.
5. Set the File format to Delimited. Six file formats are currently supported: Delimited, CSV, Parquet, Avro, JSON, and Excel. Select whichever format suits your requirement.
6. Once the file format is selected, optional formatting properties such as the delimiter and quotation mark can be set as required.
7. On the Input tab, provide the names and types of the columns that are to be written to Google Cloud Storage.
8. Compile and run the job. The data from the DB2 table is written to the file on Google Cloud Storage.
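Two of the settings above can be sanity-checked outside DataStage before the job is run. The sketch below is not part of the connector. It assumes the credentials file is a standard Google service account key, which normally contains the fields `type`, `project_id`, `private_key`, and `client_email`, and it uses Python's `csv` module to preview how a delimiter and quotation mark affect delimited output. The function names and sample values are hypothetical.

```python
import csv
import io
import json

# Fields normally present in a Google service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_credentials(text):
    """Return a list of problems found in the credentials JSON text."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append("type is not 'service_account'")
    return problems


def preview_delimited(rows, delimiter="|", quotechar='"'):
    """Render rows as delimited text, quoting only fields that need it."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder credentials content; a real key file holds real values.
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "example-project",
    "private_key": "placeholder",
    "client_email": "etl@example-project.iam.gserviceaccount.com",
})
print(check_credentials(sample_key))

# A field that contains the delimiter is wrapped in the quotation mark.
print(preview_delimited([["1", "Acme | Sons"], ["2", "Widgets"]]))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `check_credentials`. An empty list means the expected fields are present. It does not prove that the key is accepted by Google Cloud Storage.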
When DataStage is configured to run on multiple nodes, multiple files are created, with the node number appended to the file name: <filename>.0, <filename>.1, and so on.
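Downstream consumers then need to collect these per-node files in order. The sketch below shows one way to handle the naming pattern. It assumes that node numbers start at 0 and are contiguous, as in the `<filename>.0`, `<filename>.1` example. The helper names and the sample file name are hypothetical.

```python
import re


def part_file_names(filename, node_count):
    """List the object names expected for a job that runs on node_count nodes."""
    return [f"{filename}.{node}" for node in range(node_count)]


def sort_parts(names, filename):
    """Pick out the part files for filename and order them by node number."""
    pattern = re.compile(re.escape(filename) + r"\.(\d+)")
    parts = []
    for name in names:
        match = pattern.fullmatch(name)
        if match:
            # Sort numerically so that node 10 comes after node 2.
            parts.append((int(match.group(1)), name))
    return [name for _, name in sorted(parts)]


# Hypothetical file name and bucket listing, for illustration only.
print(part_file_names("customers.txt", 2))
print(sort_parts(["customers.txt.10", "customers.txt.2", "other.txt.0"], "customers.txt"))
```

Sorting on the numeric suffix rather than on the raw string matters once a job runs on ten or more nodes, because a plain string sort would place `.10` before `.2`.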