Overview

Skill Level: Intermediate

This guide describes the concept of using NoSql DB instead of RDBMS to store transient data of IBPM processes to accelerate development and reduce complexity. It has been created for Digital Integration Practice (DIP) at IBM.

Ingredients

IBM BPM 7.5+ Standard version

Any NoSql DB provider (Cloudant over IBM Cloud is mentioned in this example)

WAS admin access

Step-by-step

  1. Problem Overview

    IBM BPM architecture normally consists of integration with two databases. IBM BPM uses RDBMS as its database for internal data persistence called as Process DB. For transient and application database, system modeler can choose from various relational database products.

    In almost all use cases, for an application database connection JNDI is configured and same is used by BPM services for CRUD operations throughout the process application. These CRUD services have implementation to parse the input business object, create sql queries, execute the queries and parse the result set into desired business object as per the requirement of workflow. Based on the complexity of database schema, complexity of SQL queries and encapsulating services varies.

    This As-Is approach with application DB with relational database has hard dependency with schema. For every table/ entity specific CRUD service need to be written and for any changes in schema all CRUD services need to be changed. Development of CRUD services takes significant amount of time and efforts as its specific for each table. For Read services, based on complexity of database and joins associated with it, select query can become very complex.

    In this article, we look at the implementation of NoSql databases as application database with IBM BPM. With implementation of NoSql DB as application database, efforts /cost of CRUD operation can be significantly reduced and complexity of the BPM services will be reduced to measurable extent

  2. Case Study with real world problem

    Consider a logical data model for a typical banking system which is used for persisting case data in application database. Once data is persisted in DB in case creation, case data will be retrieved by activities in BPM BPD instances as a part of business processing workflow.

    For a below DB model, ‚ÄėCase‚Äô table acts as parent entity and holds basic case related data and¬† rest of tables act as children. Since a Case can have multiple parties associated with it and each party can have multiple Accounts and Addresses and party related data.

    A sample normalized data model structure is depicted in below diagram for a typical case journey implemented in IBM BPM.

     DB_Structure

     

     Since this data base will be populated and data retrieved from IBM BPM services, various services will be written in IBM BPM process application to support CRUD operations. Let’s take a look at what activities/ technical changes required to use this data model in practice with RDBMS approach.

    • Insert operation
       For a scenario with such data model, insert operation will require individual insert queries and all need to be run in same transaction to maintain atomicity. Sample queries to insert data in DB will be as below.

     

    INSERT INTO “CASES” (CASE_ID,CASE_TYPE,CASE_STATUS,CREATED_TIME_STAMP,LAST_UPDATED_TIMESTAMP,CASE_CREATED_USER) VALUES (?,?,?,?,?,?);

    ¬†INSERT INTO “PARTY” (PARTY_ID,CASE_ID,PARTY_FIRST_NAME,PARTY_MIDDLE_NAME,PARTY_LAST_NAME,PARTY_DOB,PARTY_TYPE_CODE,START_DATE,END_DATE,CREATED_USER,CREATED_TIMESTAMP,LAST_UPADATED_TIME_STAMP) VALUES (?,? ,?,?,?,,?,?,?,?,?,?);

    ¬†INSERT INTO “PARTY_EMAIL” (CASE_ID,EMAIL_ADDRESS_TYPE,EMAIL_ADDRESS_SEQUENCE_NUMBER,EMAIL_ADDRESS) VALUES (?,?,?,?);

    ¬†INSERT INTO “ADDRESS” (CASE_ID,PARTY_ID,ADDRESS_TYPE,FIRST_LINE,SECOND_LINE,THIRD_LINE,FOURTH_LINE,POST_CODE,COUNTRY) VALUES (? ,?,?,?,?,?,?,?,?);

    ¬†INSERT INTO “ACCOUNT” (ACCOUNT_NUMBER,CUSTOMER_NUMBER,RELATIONSHIP_START_DATE,PARTY_ID,CASE_ID) VALUES (?,?,?,?,?);

    • Select Operation
      For a back-office operation, when case data need to be retrieved based on case id, joins will need to be put on all the required tables to fetch the data. If the complexity of tables increases, correspondingly joins and complexity of the query will also increase.

    A sample query to get all the case details based on case Id will be as follows ‚Äď

     

    SELECT CASES.CASE_ID AS “CASE ID”,

    CASES.CASE_TYPE AS “CASE TYPE”,

    CASES.CASE_STATUS AS “CASE STATUS”,

    PARTY.PARTY_FIRST_NAME AS “FIRST NAME”,

    PARTY.PARTY_MIDDLE_NAME AS “MIDDLE NAME”,

    PARTY.PARTY_LAST_NAME AS “LAST NAME”,

    PARTY.PARTY_DOB AS “DATE OF BIRTH”,

    PARTY.PARTY_TYPE_CODE AS “TYPE OF PARTY”,

    ACCOUNT.ACCOUNT_NUMBER AS “ACCOUNT NUMBER”,

    ACCOUNT.CUSTOMER_NUMBER AS “CUSTOMER NUMBER”,

    ADDRESS.ADDRESS_TYPE AS “TYPE OF ADDRESS”,

    ADDRESS.FIRST_LINE AS “FIRST LINE OF ADDRESS”,

    ADDRESS.SECOND_LINE AS “SECOND LINE OF ADDRESS”,

    ADDRESS.THIRD_LINE AS “THIRD LINE OF ADDRESS”,

    ADDRESS.FOURTH_LINE AS “FOURTH LINE OF ADDRESS”,

    ADDRESS.POST_CODE AS “POST CODE”,

    ADDRESS.COUNTRY AS COUNTRY,

    PARTY_EMAIL.EMAIL_ADDRESS_TYPE AS “EMAIL ADDRESS TYPE”,

    PARTY_EMAIL.EMAIL_ADDRESS AS “EMAIL ADDRESS”

    FROM CASES

    JOIN PARTY ON CASES.CASE_ID = PARTY.CASE_ID

    JOIN ADDRESS ON CASES.CASE_ID = ADDRESS.CASE_ID AND ADDRESS.PARTY_ID = PARTY.PARTY_ID

    JOIN PARTY_EMAIL ON CASES.CASE_ID =  PARTY_EMAIL.CASE_ID

    JOIN ACCOUNT ON CASES.CASE_ID =  ACCOUNT.CASE_ID AND ACCOUNT.PARTY_ID =  PARTY.PARTY_ID

    WHERE CASES.CASE_ID = ?

    Based on the number of tables in data model from which data is to be retrieved, and number of columns associated, above query will add more join condition and query will become complex.

    Change in data model
    If there is change in requirement and to meet the new specifics, data model need to be uplifted. To perform this, various activities need to be worked out.

    Ex- As per new requirement a report should reflect how many cases are created for each process application in BPM. As a solution, process application name need to be persisted in Case table. To incorporate this change following activities need to be worked out.

    • Data model change is required to incorporate new column ‚ÄúProcess_Application_Name‚ÄĚ in Case table with nullable constraint.
    • Insert Services will need to be updated to accommodate following changes
      • Business object changes to include new variable to hold ‚ÄėProcess Application Name‚Äô.
      • Data mapping of new variable to query creation services
      • Insert query will change to include new column name.
    • Select/Retrieve service will need to be updated to accommodate following changes
      • Business object changes to include new variable to hold ‚ÄėProcess Application Name‚Äô.
      • Select query will change to include new column name and provide alias for mapping.
      • Parsing services will be changed to map result set to business objects.
      • Data mapping may need to be revised.

     On the same lines if column is needs to be updated in the flow, update services will need to be updated for following changes

    • Business object changes to include new variable to hold column name
    • Data mapping of new variable to query creation services
    • Update query will change to include new column name.

    With the usage of NoSql DB as application database CRUD services will have to created once and  all above steps won’t be required if there is change in data model.

  3. Advantages for NoSql database as Application database over RDBMS approach

    • No hard dependency on DB schema. BPM cargo object structure since its decided at design time, same can be used as-is to persist data in No-Sql DB.
    • If any change is required in object structure, minute changes will be required in CRUD services for data mapping.
    • For any change required in object schema, accommodating this change would require changes only at UI and interface layer. Service and persistence layer will remain immune to object schema change.
    • Since no queries need to be written, developer need not worry about query performance and implement logic by assuming that the services will return the data as per object structure.
    • CRUD services need to be written only once and can be reused without any change despite object structure changes.
    • Scalability of database is easier.
    • Build efforts for services development for CRUD operation can be saved at great extent.
    • DB skills are not mandatorily needed for BPM practitioners
    • Caching to DB results can be done easily to reduce unnecessary REST calls.

     

    Below Logical diagram shows implementation design for NoSql Database over RDBMS approach

    Approach

  4. Solution Implementation Step 1. How to set up and create a database in No-Sql DB provider over BlueMix.

    For illustration purpose a Cloudant DBaaS on BlueMix is used. However same solution can be implemented on other No-Sql DB products.

    • Login into IBM Cloud (Bluemix)

    Navigate to Catalogue and look up Cloudant NoSQL DB and hit Create.

    CreateCloudantDB

    • Once created Click on ‚ÄėLaunch‚Äô button on top right corner to launch the DB service.
    • Click on Databases on left menu and click on ‚ÄėCreate Database‚Äô link in top right corner

    CreateDB

     

    • Enter the database name when prompted and create the database.
    • On database manage screen , click on ‚ÄėPermission‚Äô on left menu to create a authentication key. Click on Generate API Key

    GenerateAPI

    • Make a note of key and password on this screen, once navigated away from this screen password cannot be viewed again.
    • This key and password need to be shared with BPM application developer and BPM will execute the REST calls by providing these credentials in Authentication header.
    • Make a note of Host name in address bar and same need to be invoked by BPM for Rest operation.
    • For BPM rest calls, Base URL will be https://<hostName>/<databaseNameCreated>/
  5. Step 2. How to add Cloudant host as Trusted host in BPM ‚Äď WAS

    • Use this link to set up SSL certificate in WAS of BPM for Cloudant¬†
    • Use the host url noted in Step 1 and use port no as 443
    • Restart Websphere server
  6. Step 3. How to use it in BPM services

    Create reusable services for For DB operations such as Insert, Update, Select, Delete.

    Below service reference can be used to create reusable services.

    • Create General system service with inputs as request body and identifier (With cloudant documents ‚Äė_id‚Äô is auto generated id and can be used as primary key for document store and retrieve).

    ServiceIOExample

     

    • In service implementation drag and drop server script from pallet and in script implementation write below code for GET and POST service

     

    ServiceDef

     

    • To Retrieve a document from Cloudant based on unique id , below code sample will be used. Code Below uses HTTP client to make REST call to ¬†Cloudant host. An id is passed as request parameter to retrieve documents from Cloudant.

     

    // Initilization

    var client =  Packages.org.apache.commons.httpclient.HttpClient();

    tw.local.baseUrl = tw.env.REST_URL;

    // For Get Call

    if(tw.local.method == ‘GET’){

    ¬†¬†¬† var url = tw.local.baseUrl+”/”+tw.local.id;

        var method = new Packages.org.apache.commons.httpclient.methods.GetMethod(url);

        //Setting up header

    ¬†¬†¬† method.setRequestHeader(new Packages.org.apache.commons.httpclient.Header(“content-type”,”application/json”));

    ¬†¬†¬† method.setRequestHeader(new Packages.org.apache.commons.httpclient.Header(“Authorization”,

    ¬†¬†¬† “Basic ZGVhbG1heWV0ZXJzYXlyZXN0ZXJnZXR1OjUyYTkwZTg4MGMzYzk5ZmJlNWRkOGZkYmYwZTdkN2Q4ZDQ5NzkzMzA=”));

        //REST call

        tw.local.responseCode = String(client.executeMethod(method));

        tw.local.getResponseBody =  method.getResponseBodyAsString();

    }

     

    • To create an record in Cloudant ,¬† service implementation will be created with below code. It uses a http client and request is sent in body to create a document. Request to be sent in body will be received from calling systems/ services.

     

    // For POST call

    if(tw.local.method == ‘POST’){

        var url = tw.local.baseUrl;

        var postRequestBody =  tw.local.postRequestBody;

        var postMethod = new Packages.org.apache.commons.httpclient.methods.PostMethod(url);

        // Setting headers

    ¬†¬†¬† postMethod.setRequestHeader(new Packages.org.apache.commons.httpclient.Header(“content-type”,”application/json”));

    ¬†¬†¬† postMethod.setRequestHeader(new Packages.org.apache.commons.httpclient.Header(“Authorization”,

    ¬†¬†¬† “Basic ZGVhbG1heWV0ZXJzYXlyZXN0ZXJnZXR1OjUyYTkwZTg4MGMzYzk5ZmJlNWRkOGZkYmYwZTdkN2Q4ZDQ5NzkzMzA=”));

        // REST call

        postMethod.setRequestBody(postRequestBody);

        // Passing the response object to output variables

        tw.local.response = String(client.executeMethod(postMethod));

        tw.local.postResponseBody =  postMethod.getResponseBodyAsString();

    }

  7. Disclaimer on usage of NoSql DB as application database

    This approach can beused as an alternative  to use NoSql DB in IBPM applications where data persistence is requirement for process workflow

    • Design efforts to design services, Business objects will still be required.
    • In this example, REST calls are executed with Java script, but in IBM 8.6 onswards, external services can also be used.
    • An application can have a large and complex object structure and dealing with large objects can have following challenges. Design strategy need to be optimally formulated to map such objects to documents to be stored in DB for below scenario
      • Putting entire big object structure in DB may not be good idea as updating a small variable can cause updating entire object each time update is made. Which can increase complexity
      • Moreover, breaking the object into various nested object to leaf level may cause excessive REST operation over DB.
    • If there are any upgrade in object structure, coexistence of different release snapshots will be difficult as GET operation of older snapshots will fail because of BO incompatibility. Such cases need to be handled with common coexistence pattern for IBM BPM.
    • BPM applications where huge common data model is used , developers should create reusable services in toolkits and can be used across all applications. Complex queries in such big data model will be avoided by using NoSql.
    • Processes where requirement is not to store application related business data in DB but only be shown to user on UI, application can be created as headless implementation and BPM will make rest calls to persist the data before instigation of process and at human tasks, UI layer will directly call REST GET services to get the data from DB and show it on screen
    • Proper design strategy and implementation will be required to tackle few tricky scenarios in this implementation such as
      • Application related to reporting where data is fetched based on specific DB data columns, In such cases contrary to RDBMS way where simple SQL query can provide aggregate values of DB columns, in NoSql way; code / logic need to be written to fetch and parse the data to show it in desired results.
      • Performance issues due to process DB queries can still arise¬† as BPM uses RDBMS as its persistence storage for internal operation. And queries related to product can get very complex which causes performance issues at many occasions. In such cases proper data purging strategy and best practices should be used to mitigate this.

1 comment on"IBM Integration with NoSQL Database as application database"

  1. PramodDPant March 15, 2018

    Good Article explaining step by step process to integrate No-SQL with BPM

Join The Discussion