Overview

Skill Level: Beginner

You should know how to create DataStage parallel jobs.

This recipe shows how to configure a Snowflake connector write job to perform Updates, Deletes, and the dual-mode operations Delete then Insert and Insert then Update using Merge statements.

Ingredients

  1.  IBM InfoSphere Information Server DataStage 11.7 FP1 or later
  2.  Snowflake Connector patch (JR61390)
  3.  Snowflake data warehouse account

Step-by-step

  1. Description

    In the initial version of the Snowflake connector, DML operations such as Update, Delete, Insert then Update, and Delete then Insert were executed using SQL prepared statements. The Snowflake JDBC driver executes these prepared statements sequentially, which hampers job performance.
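
    For illustration, the pre-patch behavior amounts to the driver executing one parameterized statement per input row, along these lines (the table and column names here are hypothetical):

        -- One statement execution per input row; the Snowflake JDBC driver
        -- runs these sequentially, so throughput degrades as row counts grow.
        UPDATE TARGET_TABLE
        SET    NAME = ?     -- bound from the incoming record
        WHERE  ID   = ?;    -- key column bound from the incoming record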

    The Snowflake data warehouse supports Merge statements, which allow table Updates, Deletes, and the dual-mode operations to be executed much faster. The Snowflake connector executes Merge statements by reading data from the staging area.

    This recipe explains the Snowflake connector properties that must be set to leverage Merge statements for DML operations.

    When the Merge statement functionality is used, the connector first writes the input data to the staging area using the PUT command and then executes the Merge statement by reading the staged files.
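
    As a rough sketch, the flow resembles the following Snowflake SQL. The stage, file, table, and column names are hypothetical, and the statements the connector actually generates may differ:

        -- 1. Upload the staging file produced from the input link
        --    (PUT is issued client-side through the JDBC connection).
        PUT file:///tmp/ds_staging/part0001.csv @DS_STAGE AUTO_COMPRESS = TRUE;

        -- 2. Merge the staged rows into the target in a single set-based
        --    statement, reading the files directly from the staging area.
        MERGE INTO TARGET_TABLE t
        USING (
            SELECT $1 AS ID, $2 AS NAME
            FROM   @DS_STAGE (FILE_FORMAT => 'DS_CSV_FORMAT')
        ) s
        ON t.ID = s.ID
        WHEN MATCHED     THEN UPDATE SET t.NAME = s.NAME   -- Insert then Update semantics
        WHEN NOT MATCHED THEN INSERT (ID, NAME) VALUES (s.ID, s.NAME);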

  2. Configure Snowflake Connector write job

    1. Create a Snowflake connector write job.

    2. Select Update, Delete, Insert then Update, or Delete then Insert for the ‘Write mode’ property, depending on the requirement (a sketch after these steps shows how each mode maps to Merge clauses).

    3. Select ‘Yes’ for the ‘Use merge statement’ property (this is the default value).

    4. Provide the name of the target table in the ‘Table name’ property.

    5. Under the ‘Load from file properties’ property –

         a. Provide the path where the staging files should be created in the ‘Directory path’ property. This path must be a location on the engine tier that is accessible to the DataStage user.

         b. Select ‘Yes’ to create a new staging area for this job, or select ‘No’ to use an existing staging area.

         c. Under the ‘Staging area format type options’ property, specify the format options of the staging area. These format options are used when the Merge statement runs. If an existing staging area is being used, these format options override the options with which the stage was created.
    For example, if the staging area was created with comma (,) as the field delimiter, but pipe (|) is specified in the ‘Field delimiter’ property, then during execution of the Merge statement the data in the staging area is read using pipe (|) as the field delimiter. The same applies to the other format options (see the file-format sketch after these steps).

         d. Specify the staging area name in the ‘Staging area name’ property. This can be either an existing staging area or a new one that will be created during the job run.

         e. Specify ‘Yes’ or ‘No’ for the ‘Purge files after loading’ property, depending on whether the files in the staging area should be purged or retained at the end of the job run.

         f. Run the job.
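
    The following sketch shows how each write mode plausibly maps to Merge clauses. It is illustrative only; the stage, table, key, and column names are hypothetical, and the statements the connector generates may differ:

        -- Write mode 'Update': only rows whose keys match are changed.
        MERGE INTO TARGET_TABLE t
        USING (SELECT $1 AS ID, $2 AS NAME
               FROM @DS_STAGE (FILE_FORMAT => 'DS_CSV_FORMAT')) s
        ON t.ID = s.ID
        WHEN MATCHED THEN UPDATE SET t.NAME = s.NAME;

        -- Write mode 'Delete': rows whose keys match are removed.
        MERGE INTO TARGET_TABLE t
        USING (SELECT $1 AS ID
               FROM @DS_STAGE (FILE_FORMAT => 'DS_CSV_FORMAT')) s
        ON t.ID = s.ID
        WHEN MATCHED THEN DELETE;

        -- The dual modes ('Insert then Update', 'Delete then Insert') combine a
        -- WHEN MATCHED clause with a WHEN NOT MATCHED ... INSERT clause, as in
        -- the Merge statement shown in the Description section.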

    [Image: Snowflake Connector properties]
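
    To illustrate the format-option behavior described in step c: options supplied through the stage properties take effect when the staged files are read, regardless of how the staging area was originally created. A hypothetical Snowflake SQL equivalent:

        -- Staging area originally created with a comma as the field delimiter.
        CREATE STAGE IF NOT EXISTS DS_STAGE
            FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',');

        -- Format options specified at read time override the stage's own
        -- options; here the files are parsed with a pipe delimiter instead
        -- of the comma the staging area was created with.
        CREATE OR REPLACE FILE FORMAT DS_CSV_FORMAT
            TYPE = CSV
            FIELD_DELIMITER = '|';

        SELECT $1, $2
        FROM   @DS_STAGE (FILE_FORMAT => 'DS_CSV_FORMAT');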

  3. References

    Writing data using Snowflake Connector –

    https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.snow.usage.doc/topics/writing_data_parent_snow.html

    Configuring Snowflake Connector as target –

    https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.7.0/com.ibm.swg.im.iis.conn.snow.usage.doc/topics/specifying_write_mode_snow.html


    The Readme for patch JR61390 lists the fixes made to the Snowflake connector since its initial release.
