Learn how you can use IBM App Connect Designer to create an event-driven flow that parses the contents of a new comma-separated values (CSV) file that’s uploaded to an SFTP server, creates records in an IBM Db2 database from the parsed contents, and then renames the file to indicate it’s been processed.

Scenario:

You’re using an SFTP server to securely transfer and work with files in the Sales department of your global company. Each day, Sales managers upload sales data for their regions to a folder in the SFTP server for processing. The data is uploaded in CSV files and needs to be extracted and stored as records in your IBM Db2 database.

You’d like to use App Connect to poll the SFTP server periodically to detect new CSV files, retrieve and parse the file contents, and then use the parsed contents to create individual sales records in your Db2 database. And after each file is processed, you’d also like to rename the file on the SFTP server.

SFTP, CSV Parser, and IBM Db2 tutorial flow in App Connect

First, find or create everything you need:

  1. Obtain an App Connect service. (A Lite plan service was used for this tutorial.)
  2. Ensure that you have access to an on-premises SFTP server or cloud-hosted SFTP service, and have the relevant permissions to perform file-based operations. You’ll also need an SFTP client that you can use to transfer files to the server. If you don’t have a paid or business SFTP account (or restrictions are placed on your business account), you can sign up for a free or trial account for a cloud-hosted SFTP service.

    For this tutorial:

    • We used a 30-day trial version of a cloud-hosted SFTP service from BrickFTP, and performed all file-based SFTP operations by using FileZilla.
    • We modified the user settings in BrickFTP to set the time zone to UTC (Coordinated Universal Time) because our time zone setting at the time of writing was Greenwich Mean Time (with a UTC+0 offset). The flow we’re about to create will be triggered when files are added to the SFTP server and requires that their time stamps be read. Therefore, it’s imperative that the time zone of our BrickFTP user account matches the time zone configured in App Connect. (We modified the BrickFTP time zone because we are using the default UTC setting in App Connect. More on this later.)

      Note: If you live in a country where the clocks change (for example, for Daylight Saving Time), you might prefer to change from the UTC time zone to a more appropriate setting at the relevant time of year.

      • Example: Setting the time zone to UTC

        In BrickFTP, we accessed User Settings and edited the settings for the signed-in user to change the default time zone to UTC.
        More information: Users (BrickFTP documentation)

        Setting the time zone in BrickFTP

      • Example: Setting the time zone to accommodate Daylight Saving Time/British Summer Time

        During Daylight Saving Time/British Summer Time, the server’s UTC time in BrickFTP was found to be one hour behind. We accessed User Settings and edited the settings for the signed-in user to change the time zone for Daylight Saving Time/British Summer Time.
        More information: Users (BrickFTP documentation)

        Setting the time zone in BrickFTP for DST /BST

  3. Ensure that you have access to an IBM Db2 server with a database and schema that you can run operations on. For information about the supported Db2 offerings, see How to use IBM App Connect with IBM Db2. You might also find it useful to have a GUI tool that you can use to administer the database; for example, IBM Data Studio. If you don’t have a paid or business account for IBM Db2 (or restrictions are placed on your business account), you can sign up for a free or trial account. You’ll require an IBM account with an IBMid and password to download the product.

    For this tutorial, we used the IBM Db2 database (Windows) 90-day trial version, which was downloaded from the Trial and Downloads page. We also created a database called SAMPLE (with a default db2admin schema) for use. If you are using a non-trial version of Db2, create a database called SAMPLE that you can use in this tutorial (or tailor the steps to suit the database you’re using).

    • Tips for installing the Db2 trial and creating the SAMPLE database

      Here are the high level steps that were used to install and set up our trial Db2 environment. We installed locally as a user logged in to Windows with an Administrator account. If you are using a trial edition and haven’t installed Db2 before, you might find this information useful.

      1. Install Db2:
        1. Extract the contents of the downloaded file to a temporary location.
        2. Go to the extracted_files_dir\SERVER_T\image location, right-click setup.exe and then select Run as administrator.
        3. From the DB2 Setup launchpad, click Install a Product. Then, click Install New (under the Db2 Version v.r.m Server Editions section in the “Install a Product” pane), and follow the installation prompts, accepting the defaults.

          Note: Accept db2admin as the user name for the Db2 Administration Server (DAS) and specify a password for this user. (We’ll use this user name and password later, so make a note of the values.) The default host name and port for connecting to the Db2 instance are given as localhost and 50000.

          At the end of the installation, close the DB2 First Steps window if open. (We’ll verify the installation in the next step.)

        More information: Installing DB2 database servers using the DB2 Setup wizard (Windows)

      2. Verify the installation by creating a SAMPLE database and running SQL commands to retrieve sample data. (We’ll be using this database in this tutorial.)
        1. Start the command line processor in admin mode by selecting Start > All Programs > IBM DB2 DB2COPY1 (Default) > DB2 Command Window – Administrator.
        2. When prompted, log on to Windows using the db2admin user name and password that you specified for the DAS user during the Db2 installation.
        3. Run these series of commands to start the database manager, create the SAMPLE database, connect to the database, and retrieve all employees (from table STAFF) who work in department 20.
          
          db2start
          db2sampl
          db2
          connect to sample
          select * from staff where dept = 20
          
          

        If you have an existing IBM Db2 installation, but do not have a SAMPLE database, you can use the db2sampl command to create one.

        More information:
        Verifying the installation using the command line processor (CLP)
        The SAMPLE database
        db2sampl – Create sample database command

    • Tips for installing and configuring IBM Data Studio client (Optional)

      Here are the high level steps that were used to install IBM Data Studio client and create a database connection for the SAMPLE database. You might find this information useful if you decide to install the client.

      1. Download IBM Data Studio client from https://www.ibm.com/developerworks/downloads/im/data/. (Note that the Data Studio web console is no longer supported.)
      2. Extract the files. Then right-click launchpad.exe and select Run as administrator to start the Data Studio client installation using the IBM Installation Manager Install wizard. (Log on to Windows using the db2admin user name and password when prompted.)
      3. In the IBM Data Studio client window, expand Install or Update Product, and choose the Administrative Installation or Update option to open the Installation Manager. In our case, there was no prior installation of Installation Manager, so that was first installed by following the prompts. On completion, we clicked the Restart Installation Manager button and then clicked Install to install the IBM Data Studio client.

        Installation Manager window

        More information: Installing IBM Data Studio client with the Installation Manager Install wizard

      4. Once installed, open the IBM Data Studio client and connect to the SAMPLE database so that you can browse its objects:
        1. Click Start > All Programs > IBM Data Studio > Data Studio 4.1.3 Client to start the Data Studio client.
        2. From the Data Source Explorer view, click the New Connection Profile toolbar icon.

          New Connection Profile

        3. In the New Connection window, select DB2 for Linux, UNIX, and Windows in the Local tab. Then from the General tab, complete the connection details. Some fields are completed by default, so you should only need to specify the db2admin user name and password, select Save password, and set the schema as DB2ADMIN. Then click Test Connection. (You should see a Connection succeeded message.)

          New Connection details

        4. Click Finish.

          You can now expand the tree view to browse the objects in the SAMPLE database.

          SAMPLE database objects in Data Studio

          More information: IBM Data Studio: Connecting to a database by using a wizard

  4. Ensure that your IBM Db2 server is running and that a database connection exists for the database (SAMPLE) you want to use with App Connect. If the DB2 Command Window is still open from the previous step (where you verified the trial Db2 installation and created the database), you should still be connected to the SAMPLE database; so no further action is required.
    • If the DB2 Command Window is no longer open

      Start the database manager and connect to the SAMPLE database as follows:

      1. Select Start > All Programs > IBM DB2 DB2COPY1 (Default) > DB2 Command Window – Administrator.
      2. Log on to Windows using the db2admin user name and password that you specified for the DAS user during the Db2 installation.
      3. Run the following commands:
        
        db2start
        db2
        connect to sample
        

  5. If the SFTP server or IBM Db2 server is in a private network (for example, behind a firewall), you’ll need to set up a gateway that App Connect will use to securely access that server. You can use the IBM Secure Gateway Client to set up the required network connection for accessing protected data. For this tutorial, we’ll need a private network for our trial Db2 installation, but not for our cloud-hosted SFTP service.

    If you’ve previously used the Secure Gateway Client to set up a network connection for an App Connect application that is on the same private network as the SFTP and Db2 servers, you can use this network connection with these servers.

    • If you don’t have such a network connection in place

      You can configure one as follows:

      1. Download and install the Secure Gateway Client. During this process, you are prompted to set up a network connection with a network name, and an auto-generated Gateway ID and Security token. (The Gateway ID will be used to start the Secure Gateway Client and the Security token provides additional security on startup. Make a note of the network name because you’ll need to specify it later when connecting App Connect to your SFTP or IBM Db2 server.)
      2. Start the Secure Gateway Client. By default, the Secure Gateway Client is denied remote access to the SFTP and Db2 servers. So, to grant the Client access to the servers, you’ll need to configure the Client’s access control list to allow connections to the servers.

        For detailed information about configuring a network connection by using the IBM Secure Gateway Client, see Configuring a private network for App Connect Designer.

  6. Ensure that the IBM Secure Gateway Client has been started. If you need to, you can start the Secure Gateway Client as described in Configuring a private network for App Connect Designer: Finally, start and configure the Secure Gateway Client.
  7. Obtain account details for connecting App Connect to your integrating applications.
    • (If you haven’t already connected App Connect to your SFTP account):
      • The host name/IP address and port number of an on-premises SFTP server (specified in the format http://hostname:port), or the instance host name or custom domain of a cloud-hosted SFTP service (specified in the format mysubdomain.sftpprovider.com).
      • The user name and password that you use to log in to the SFTP server.
      • Optional. The SFTP server’s public key in MD5 format that carries the identity of the server.
      • The name of the network (configured using the IBM Secure Gateway Client) for connecting to an on-premises SFTP server in a private network. Not required if connecting to a cloud-hosted SFTP service.

        For more information about the required values, see How to use IBM App Connect with SFTP.

    • (If you haven’t already connected App Connect to your IBM Db2 account):
      • The host name/IP address and the port number of the Db2 server (specified in the format hostname:port).
      • The Db2 database name.
      • The user name and password that you use to log in to the Db2 server.
      • The schema qualifier for the database objects you want to perform operations on.
      • The name of the network (configured using the IBM Secure Gateway Client) for connecting to an on-premises Db2 server in a private network.

        For more information about the required values, see How to use IBM App Connect with IBM Db2.

    If you haven’t yet connected to your IBM Db2 or SFTP account, you can do so either from the Applications tab on the App Connect Catalog page, or while creating the flow. In this tutorial, let’s connect from the Applications tab.

    • Connection steps

      1. From the App Connect dashboard, click Catalog to view the Applications tab.
      2. If Not connected is shown for SFTP or IBM Db2, click to expand that row and then click the Connect button to enter the account information. For example:

        Example: Connecting to IBM Db2

    • Example of completed fields for connecting App Connect to a cloud-hosted SFTP service

      Example of completed fields for connecting to SFTP

    • Example of completed fields for connecting App Connect to a locally installed Db2 server

      Example of IBM Db2 connection details

    Tip: Immediately after you connect, rename your Db2 account with a meaningful name that helps you identify which schema the account relates to; for example, DB2ADMIN schema in SAMPLEdb. Do the same for your SFTP account; for example, Cloud-hosted SFTP Account. You can rename an account only from the Applications tab and before it’s used in a flow.

    Renaming an account

    For information about renaming accounts and setting up multiple accounts, see FAQ: How do I configure App Connect to connect to my applications?

  8. Create two text files that contain sales data in a tabular format, and which can be added to the SFTP server to trigger the flow.
    1. Using a text editor, create a plain text file named area66_sales.txt with the following comma-separated values:
      
      Sales Date,Sales Person,Region,Sales
      2018-03-12,QueenLiz,London,11
      2018-03-12,PrincePhil,Manchester,9
      2018-03-12,KingLear,Leeds,7
      
    2. Create a second text file named area32_sales.txt with the following comma-separated values:
      
      Sales Date,Sales Person,Region,Sales
      2018-03-12,BullockB,Frankfurt,8
      2018-03-12,CullockC,Berlin,12
      2018-03-12,DullockD,Hamburg,6
      

    We’ll be inserting this data into the SALES table in the SAMPLE database, which has columns that match the column headers in our CSV file (with matching data types). The SALES table contains the following columns: SALES_DATE (DATE), SALES_PERSON (VARCHAR 15), REGION (VARCHAR 15), and SALES (INTEGER). This image of the SALES table is from our IBM Data Studio installation:

    Column names in the SALES table

Next, create your flow:

  1. Log in to App Connect Designer.
  2. From the Dashboard, click New > Event-driven flow.
  3. Complete the following steps. As you progress, App Connect automatically saves your changes. If you navigate away from the flow at any stage, the flow is saved as a draft flow that you can complete at another time.
    1. Enter a name that identifies the purpose of your flow; for example: Parse CSV data to create IBM Db2 sales records.

      SFTP-to-IBM Db2 flow name

    2. Select SFTP as the source application and then click Configure events to see which events can be configured for polling. Expand Folder Items – you’ll see a New or updated Folder item event.

      Discover SFTP events for polling

      We’ll configure this event to monitor the root folder in our SFTP server for file modifications, and trigger the flow if it detects that a new file or folder has been created or updated.

      Note: For the New or updated Folder item event, the polling mechanism will query the following directories for new or updated folder items:

      • If you logged in to the SFTP server (using your App Connect account) as an admin or root user, the directory polled will be the root folder. (In our trial cloud-hosted SFTP service, the account we signed up with is assigned Administrator access by default.)
      • If you logged in to the SFTP server (using your App Connect account) as an non-admin user, the directory polled will be the /user/your_username location.
    3. Click New or updated Folder item and then configure polling as follows:
      1. Ensure modifyTime is selected for both Created timestamp field and Last updated timestamp field. This default value denotes the time stamp for when a file or folder was created or last modified.
      2. Accept the default value in the Format of the timestamps field.
      3. In the Time zone field, accept UTC as the default time zone that App Connect will use to detect time stamp updates that occur after the flow is started. (Remember that in the previous section, we set the time zone on our cloud-hosted SFTP service to UTC.) If you are on a different time zone, ensure that your SFTP server’s time zone and the App Connect time zone specified here match.

        Important: If you live in a country where the clocks change (for example, for Daylight Saving Time), you might need to change from the UTC time zone to a more appropriate setting at the relevant time of year. For example, during British Summer Time (BST), set the time zone to Europe/London if your SFTP server is also on BST.

        BST time zone setting for  SFTP polling

      4. Select 5 from the Check for new ‘folderitem’ every (minutes) drop-down list.
        (We want to check for new or updated folder items every 5 minutes. We’ve specified a short polling interval so we don’t have to wait too long when testing the flow later.)

        Configuration fields for  SFTP polling

    4. Click the (+) and then select SFTP > Files > Retrieve file contents as the action that App Connect should use to retrieve the contents of the plain text CSV file that was added.
    5. Specify the details for the file whose contents you want to retrieve:
      1. Type of file: Select Text from the drop-down list.
      2. Source Path: Click within the field and click Insert a reference Insert a reference icon. Under SFTP / New or updated Folder item in the list of available inputs, select Path. This mapping represents the location where the uploaded file is stored, and resolves to the absolute path of the file (including its name); for example, /area66_sales.txt.

      SFTP field selections for the Retrieve file contents action

      We want to make the CSV data available for mapping in a subsequent action, so we’ll insert a CSV parser node next, and then use this node to define a structure for the parsed data.

      Tip: The CSV parser typically generates schema in the form of a JSON object with comma-separated key/value pairs that are surrounded by curly braces {}. The keys are defined as strings, and their values can be one of the following valid JSON data types: string, number, object, array, boolean, or null. Once generated, these keys will be included in the list of fields that are available for mapping when you click Insert a reference Insert a reference icon in a subsequent action.

    6. Click the (+), go to the Toolbox tab, and then click CSV parser.

      Selecting the CSV parser from the Toolbox tab

    7. Click within the CSV Input field and then click Insert a reference Insert a reference icon. From the list of available inputs, expand SFTP / Retrieve file contents and select File Data. (We want to parse the contents of the file we retrieved.)

      CSV Input field

    8. Expand the CSV Parsing Options and Data Schema section and supply a sample of the CSV data you want to parse:
      1. Select the First row of the CSV data is a header check box. This setting will cause the column headers in the CSV file to be parsed as keys in the JSON schema that App Connect generates.
      2. Next, copy the column headers in one of the CSV files you created earlier (Sales Date,Sales Person,Region,Sales), and paste them into the Example CSV field. Only the first row of the example CSV is used to analyze the CSV format of the schema, so there’s no need to specify any other rows from the file.

      Example CSV fields

    9. To convert the sample CSV into a JSON object, click Generate Schema. You can see the result in the JSON Schema field. Notice that the defined column headers are parsed as keys (or properties), which you can map to later.

      Generated JSON schema from the CSV parser

    10. Browse through the JSON schema. You’ll notice that all properties have been assigned a data type of string. However, we want the Sales property to have a data type of number (to match the integer data type for the SALES column in the SALES database table). So, in the JSON Schema field, let’s overwrite "type": "string" with "type": "number" for this property.

      Tweaking the generated JSON schema from the CSV parser

    11. Click the (+), go to the Toolbox tab, and then click For each. (We want to add a ‘For each’ node that will loop through all the rows in the CSV file and insert a row into the SALES database table, for each row in the file.)

      Selecting For each from the Toolbox tab

    12. Configure the For each node as follows:
      1. Click within the Select the collection of items to process field and click Insert a reference Insert reference icon (square). Then, from the list of available inputs, expand CSV Parser / Parse and select Parsed CSV.
      2. In the For each field, change the default display name of the For each node from CSV Parser Object to Row in the CSV file.
      3. Accept the default selection of Process all items in the collection sequentially. (We want to process the rows in order.)
      4. Select Exit the flow with an error. (It’s important that all the rows are processed, so we’d like to be alerted if an error occurs during processing, and stop the flow at this point.)

        For each field selections

      5. Click the plus (+) within the For each node and then click IBM Db2 > Show More in the Applications tab.

        For each node - IBM Db2 Show More link

        You’ll see a list of all tables and views in the schema you connected to.

      6. Expand SALES and click Create SALES record. (For each row parsed in the CSV file, we want to insert a row into the SALES table.)

        IBM Db2 - Create SALES record option

        We’ll now map the JSON keys that were generated by the CSV parser to the columns in the SALES database table.

      7. Click the Insert a reference icon Insert a reference icon next to the SALES_DATE field. From the list, expand For each: Row in the CSV file and select Sales Date.

        Mapping the Sales Date field

      8. Click within the SALES_PERSON field, click Insert a reference Insert a reference icon, and then expand For each: Row in the CSV file and select Sales Person.
      9. Click within the REGION field, click Insert a reference Insert a reference icon, and then expand For each: Row in the CSV file and select Region.
      10. Click within the SALES field, click Insert a reference Insert a reference icon, and then expand For each: Row in the CSV file and select Sales. (Notice that the SALES field is identified as an integer using the notation 123 because the data type for this column in the SALES database table is INTEGER. And, because in the CSV parser, we modified the JSON schema to set "type": "number" for the Sales property, we can now add a mapping with the correct data type. If we hadn’t modified the JSON schema, we would have seen a warning under the SALES field about a data mismatch.)

        Your completed fields should look like this:

        Completed Create SALES record fields

    13. Click the (+) shown to the right of the For each node. Then, from the Applications tab, click SFTP > Files > Rename file. (Now that the CSV file has been processed, we want to rename the file to indicate this.)

      SFTP Rename file

    14. Click the Insert a reference icon Insert a reference icon next to the Source Path field. From the list, expand SFTP / New or updated Folder item and select Path. (This mapping identifies the path and file name of the new file that was uploaded to the SFTP server, and which has just been processed.)
    15. Click within the Destination Path field, click Insert a reference Insert a reference icon, and then expand SFTP / New or updated Folder item and select Parent Path. Type PROCESSED_, click Insert a reference Insert a reference icon again, and then expand SFTP / New or updated Folder item and select Name. (This mapping defines the parent path and new name for the renamed file.)

      Completed fields for the SFTP Rename file action

    Your completed flow should look like this. (Click the image to view in full size mode.)

    SFTP, CSV parser, and IBM Db2 flow image

  4. From the banner, open the options menu [⋮] and click Start flow. Then click Dashboard to exit the flow. Your flow is displayed on your Dashboard as Running, and is listening for your event – a new CSV file in your SFTP server.

    SFTP-to-IBM Db2-to-SFTP Dashboard tile

Finally, test your flow:

Complete these steps to test the flow. The documented steps relate to file uploads to the root folder of our cloud-hosted SFTP service.

  1. Use your SFTP client to connect to your SFTP server.

    We used FileZilla to connect by specifying the SFTP server host name (prepended with sftp://), the login credentials for the server, and the default port of 22.

    FileZilla credentials for connecting to an SFTP host

    Initially, the root folder on our server was empty.

    Remote SFTP location in FileZilla

  2. Use your SFTP client to upload the area66_sales.txt file to the root folder of the server.

    area66_sales.txt file uploaded to the SFTP server

  3. Check the SALES table in the SAMPLE database to ensure that you can see a record for each of the rows from the CSV file. You can use a SELECT statement in the Db2 Command Window, or use a GUI tool like Data Studio to browse the data in the SALES table.

    Parsed contents of area66_sales.txt in Db2

  4. Go back to FileZilla and refresh your view of the SFTP server. You should see that the area66_sales.txt file has been renamed to PROCESSED_area66_sales.txt.

    area66_sales.txt renamed on SFTP server

  5. Wait for about five minutes and then use your SFTP client to upload the area32_sales.txt file to the root folder of the server.

    area32_sales.txt file uploaded to the SFTP server

  6. Check the SALES table in the SAMPLE database to ensure that you can see a record for each of the rows from the CSV file.

    Parsed contents of area32_sales.txt in Db2

  7. Go back to FileZilla and refresh your view of the SFTP server. You should see that the area32_sales.txt file has been renamed to PROCESSED_area32_sales.txt.

    area32_sales.txt renamed on SFTP server

Congratulations, you have created a flow that parses a CSV file when it is uploaded to an SFTP server, uses the parsed contents to insert rows in an IBM Db2 database table, and then renames the file to indicate it has been processed.

Join The Discussion

Your email address will not be published. Required fields are marked *