Have you ever wondered how to easily store analyzed data from Streams into a relational database? Streams 4.2 includes the open source JDBC toolkit that allows you to do just that. This toolkit is an alternative to the existing database toolkit that was present in previous Streams releases. The JDBC toolkit is easier to use than the database toolkit because it requires much less configuration, so it is a great way to get started. The following article will provide you with an easy step by step guide on how to integrate this toolkit with IBM DB2 database in various environment.

Skill level

To follow along in this article, you need a basic understanding of Streams and relational databases.

Prerequisites

You will need:

  • Streams 4.2 and above, which includes the JDBC toolkit. Get the Streams Quick Start edition. For Streams 4.0.x and 4.1, download the JDBC toolkit release 1.0 from GitHub.
  • A JDBC compliant database. This article will use DB2 Express-C edition.
  • JDBC Connection information.
    • JDBC driver library for your database. Download the JDBC JCC driver for DB2
    • JDBC Driver class name and connection URL for your database. Depending on the type of database source you are connecting to, the format of the JDBC URL and driver information will be different. This table lists the JDBC URL format and driver information for some databases, but is by no means exhaustive.  Consult your database vendor’s documentation to find the correct driver library, driver class name and URL.  For DB2, the URL is of the form:
      jdbc:db2://hostname:port/database_name

      and the classname is:

      com.ibm.db2.jcc.DB2Driver
  • (Optional) Username and password for the DB2 server, if needed
  • (Optional) To run the sample application in this article, you need to create a database called “CUSTDB”

    $ db2 create db custdb
    Database 1 entry:
    Database alias = CUSTDB
    Database name = CUSTDB
    Local database directory = /home/kandu
    Database release level = 10.00
    Comment =
    Directory entry type = Indirect
    Catalog database partition number = 0
    Alternate server hostname =
    Alternate server port number =

 

The JDBC toolkit

As was mentioned earlier, the JDBC toolkit supports any JDBC compliant database. It allows you to execute standard SQL statement using the JDBCRun operator, which is the only operator in this toolkit. You simply specify the JDBC connection information mentioned earlier, and the SQL statement and its parameters using the statement and statementParamAttrs parameters. When a tuple is received on the required input port, the operator runs the SQL statement.

For example, you can Insert a row in a table:

stream <insertSchema> insert = JDBCRun(create){
    param
        jdbcDriverLib: $jdbcDriverLib;
        jdbcClassName: $jdbcClassName;
        jdbcUrl: $jdbcUrl;
        jdbcUser: $jdbcUser;
        jdbcPassword: $jdbcPassword;
               statement: "INSERT INTO JDBCRUN_SAMPLE (ID, FNAME, LNAME, AGE, GENDER, SCORE, TOTAL) VALUES (?, ?, ?, ?, ?, ?, ?)";
              statementParamAttrs: "ID, FNAME, LNAME, AGE, GENDER, SCORE, TOTAL"; 
}

 

As you can see, all that is needed are the connection information for the database, and the SQL statement you want to execute. The operator expects that the input stream will contain attributes that match the column names in the statement, and that any columns in the result set of the statement will have corresponding attributes on the operator’s output stream.

One powerful feature of the JDBCRun operator is that the SQL statement to be executed does not have to be specified at compile time, but it can be bound to an attribute on the operator’s input stream. In the snippet below, the operator receives a generated SQL statement and executes it:

  stream<selectSchema >genSelect = Functor(insert){
    output
        genSelect: sql="SELECT ID, FNAME, LNAME, AGE, GENDER, SCORE, TOTAL FROM JDBCRUN_SAMPLE";
    }
        
  stream<rsSchema> select = JDBCRun(genSelect){
    param
        jdbcDriverLib: $jdbcDriverLib;
        jdbcClassName: $jdbcClassName;
        jdbcUrl: $jdbcUrl;
        jdbcUser: $jdbcUser;
        jdbcPassword: $jdbcPassword;
        statementAttr:   sql;
    }

Trying it out

The above snippets are taken from the JDBCRunSample application that is included in the toolkit. This sample application creates a table named JDBCRUN_SAMPLE in the default schema and then it will insert one row to the table. Finally, a SELECT statement will retrieve the result set and output it to a file.

Here is the application graph for the JDBCRunSample application:

 

20f157c6ea2bbaad
To see the operator in action, follow these steps to compile and run the sample in Studio.

  1. Import the sample into your workspace. In Streams 4.2, the project is located at $STREAMS_INSTALL/samples/com.ibm.streamsx.jdbc. If you downloaded the toolkit from GitHub, the sample is in the samples directory.
    jdbc sample app
  2. If you haven’t already, download the JDBC JCC driver and copy the db2jcc.jar to the opt directory in your project.
    jccdriver
  3. Build and compile the sample application. If you are not using Studio, you can simply run “make” from the command line.
  4. Submit the job, providing the following submission variables :
    jdbcDriverLib : JDBC driver library. (/opt/db2jcc4.jar)
    jdbcClassName : the JDBC driver class Name (com.ibm.db2.jcc.DB2Driver)
    jdbcUrl : JDBC connection string URL (jdbc:db2://server:port/CUSTDB)
    jdbcUser : user name
    jdbcPassword : password
    
  5. Verify the table was created and the inserted record is present in DB2 database:
    jdbc4

 

That’s it! That is all that is needed to communicate with an external database using the JDBC toolkit.

 

You can see the full source for this sample on GitHub.

Using the JDBCRun operator in Bluemix environment

Using the JDBCRun operator to connect to a database running in the Bluemix environment does not require any additional configuration.  You just need to get the JDBC connection information required by the operator as discussed in the “Prerequisites” section above and configure the operator with that information:

  •  Update the jdbcUrl, jdbcUser and jdbcPassword parameters of the operator with the correct values for your database instance.
  •  Set the jdbcDriverLib and jdbcClassName with the correct values for your database vendor.

If, for example, you are using the dashDB for Analytics service, here is how to get your credential information:

  •  From your Bluemix dashboard, go to the dashDB service, and click “Service Credentials”. If no credentials are listed, click “New Credentials”, accept the defaults, and click “Add”. Once you have credentials created, click “View Credentials”.
    ddb

You should see a JSON doc with connection information. Note down the jdbcurl, username and password attributes from this JSON doc.

dashdb

  • Modify your application and replace the jdbcUrl, jdbcUser and jdbcPassword parameters of the JDBCRun operator with the correct information you have obtained from the Bluemix dashDB credential service.

 

Once you have updated the JDBCRun operator with those parameters, you can compile and run your application and submit it to the Streaming Analytics service via the Streams Console:

  • From the dashboard of the Streaming Analytics service in Bluemix, Click “Launch” to launch the Streams console.
  • Select “Submit Job”:
  • Browse to the location of the compiled .sab file as discussed earlier and click “Submit”.

Once the job completes, you can open up the dashDB service’s dashboard and verify your data is saved as expected.

 

For dashDB, the jdbcClassName and jdbcDriverLib are the same as for DB2, so you can use "com.ibm.db2.jcc.DB2Driver" for the class name and the same db2jcc4.jar as the driver.

Connecting to other databases

The Bluemix catalog features database offerings other than dashDB, so consult the documentation for the specific database service you are using to get the right credentials, JDBC driver and JDBC driver class name.

 

Sample application for Bluemix

Here is a simple application that you can use to test your connection to the dashDB for Analytics service in Bluemix. The application connects to a REST endpoint to retrieve weather data, computes the average temperature, visibility, and wind speed, and then inserts that data into a dashDB database, which is running in Bluemix.

Assuming you have a running instance of the dashDB for Analytics service, here is the “CREATE TABLE” SQL statement you need to deploy in dashDB before running the application:

create table weather(maxtemp int, mintemp int, maxtempfeelslike int, mintempfeelslike int, avgvisibility int, avgwindspeed int, decision char(150))

To run the application, first download it from GitHub.  In WeatherAnalysis.spl, you will notice that some of the JDBCRun operator’s parameters are empty:

stream<CalculatedAvg, tuple<rstring notice>> JDBCRunInsert =
                JDBCRun(DecisionMaking)
{
  param
    jdbcClassName : "com.ibm.db2.jcc.DB2Driver" ;
    jdbcDriverLib : "opt/db2jcc4.jar" ;
    jdbcUrl :"" ;
    jdbcUser : "" ;
    jdbcPassword : "" ;
    statement :"insert into weather(maxtemp,mintemp,maxtempfeelslike,mintempfeelslike,avgvisibility,avgwindspeed,decision) values (?,?,?,?,?,?,?)" ;
    statementParamAttrs:"MaxTemp,MinTemp,MaxTempFeelsLike,MinTempFeelsLike,AvgVisibility,AvgWindSpeed,notice" ;
    }

Follow the aforementioned steps to get your database credentials, and set the values of the parameters. Then you can compile the application and submit it to Bluemix.

 

Conclusion

This has been a brief demonstration of how to use the JDBC toolkit to connect to external JDBC compliant databases.

Reference links

Join The Discussion