Overview

Skill Level: Intermediate

Java/J2EE, Docker, Bluemix essentials

This recipe showcases a design pattern that pulls data from data stores, which in this case are secure FTP servers located in data centers in different regions.

Ingredients

IBM Bluemix account; skills in Java/J2EE, Docker, and Liberty.

Step-by-step

  1. Introduction

    This article discusses a pattern involving Message Hub on Bluemix in which a data proxy application pulls data and pushes it into the respective queues based on the ID available in the REST URL. The data stores in this scenario are meant to be secure FTP servers. Because no free secure FTP server was available, I used folders in the application as the source of the data. In a real scenario, an API such as ChannelSftp could be used.

  2. High-level Data Ingestion Architecture

    SC2

    Steps:

    1. The Bluemix Workload Scheduler service is configured to make REST calls to the proxy application at regular intervals. The proxy application sets the secure FTP URL to point to a specific SFTP server depending on the parameter passed in the REST call.

    2. The proxy application polls the directory on the FTP server, pulls any available file locally, and processes it.

    3. The enterprise applications could differ from region to region, although the picture above depicts them as one. For example, SAP could be pushing the data to the US FTP server, while the UK data might come from a database.

    4. The proxy application is responsible for pushing the data to the respective topics on Message Hub. Data can be filtered in the proxy application before it is pushed.

    5. The consumer applications pull the data from these topics for further processing.
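    Steps 1 and 4 above imply a lookup from the region keyword to a data source and a Message Hub topic. The following is a minimal sketch of such a lookup, not the recipe's actual code: the class names, folder paths, and topic names are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical pairing of a source folder and a topic for one region. */
final class RegionRoute {
    final String sourceDir; // folder standing in for the region's secure FTP server
    final String topic;     // Message Hub topic that receives the region's data

    RegionRoute(String sourceDir, String topic) {
        this.sourceDir = sourceDir;
        this.topic = topic;
    }
}

/** Resolves the region keyword from the REST call to a route. */
final class RegionRouter {
    // Illustrative values only; these names do not come from the recipe.
    private static final Map<String, RegionRoute> ROUTES = new HashMap<>();
    static {
        ROUTES.put("USA", new RegionRoute("/data/usa", "usa-topic"));
        ROUTES.put("UK", new RegionRoute("/data/uk", "uk-topic"));
    }

    /** Returns the route for a region keyword, ignoring case and surrounding spaces. */
    static Optional<RegionRoute> resolve(String region) {
        if (region == null) {
            return Optional.empty();
        }
        String key = region.trim().toUpperCase(Locale.ROOT);
        return Optional.ofNullable(ROUTES.get(key));
    }
}
```

    Returning an empty Optional for an unknown keyword lets the REST service reject the call instead of polling a wrong location.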

  3. Workload Scheduler Service

    The Workload Scheduler service is configured to call the REST URLs that correspond to the various secure FTP servers in data centers across the regions. Each URL carries a keyword in its query string that identifies the SFTP server and the topic to map to. The schedules are set according to data availability on the secure FTP servers, which in turn follows the schedule of the applications that push data to those servers. To implement a polling mechanism, frequent schedules need to be set. In the current case a fixed schedule is set, as shown below.

    Note: Click the bound instance of the Scheduler service and configure it as described below.

     

    SC5

     

    For polling, set the Repeat every option. To pull the data, the scheduler invokes the REST URL with the keyword as a parameter.

    http://<Service_IP>:9080/DataInjestAppWeb/rest/sharadtestsample/fetchDCData?region=USA

    where Service_IP is the public IP allocated to your Docker container.
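    As an illustration of how one URL per region could be derived from the pattern above, here is a small sketch. The path and port are taken from the URL shown in this step; the class name and the example IP address are assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the fetch URL the scheduler calls for a region. */
final class FetchUrlBuilder {
    // Port and path as they appear in the recipe's example URL.
    private static final String PORT = "9080";
    private static final String PATH = "/DataInjestAppWeb/rest/sharadtestsample/fetchDCData";

    static String build(String serviceIp, String region) {
        try {
            // Encode the keyword so that spaces or special characters stay valid in a query string.
            String encoded = URLEncoder.encode(region, "UTF-8");
            return "http://" + serviceIp + ":" + PORT + PATH + "?region=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }
}
```

    For example, FetchUrlBuilder.build("203.0.113.10", "USA") yields the URL shown above with the placeholder replaced by that (made-up) address.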

     

    SC3

     

  4. Data Proxy App

    This application exposes a service that the scheduler calls. The service pulls data from the secure FTP server identified by the keyword sent in the REST URL.

    SC4

    After pulling the data, the service pushes it into the respective queues. For instance, each region has a separate queue (a topic in Message Hub), and the interested applications subscribe to these queues. As data lands in a queue, the subscribed applications pull it for further processing. Below are the Message Hub queues for each application.

    Note: Configure the bound instance of the Message Hub service for the topic definitions and use its credentials.

    SC6 
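    The pull-and-push behaviour described in this step can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the repository: it reads from a local folder (as the recipe does in place of a real SFTP server), and TopicSink is a hypothetical interface standing in for the Message Hub producer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical abstraction over whatever client publishes to Message Hub. */
interface TopicSink {
    void send(String topic, String message);
}

final class DirectoryIngestor {
    /**
     * Sends every non-blank line of every regular file in dir to the given topic.
     * Returns the number of messages sent.
     */
    static int ingest(Path dir, String topic, TopicSink sink) {
        int sent = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-folders and other non-file entries
                }
                for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                    if (line.trim().isEmpty()) {
                        continue; // a simple example of filtering before the push
                    }
                    sink.send(topic, line);
                    sent++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sent;
    }
}
```

    Keeping the publisher behind an interface means the polling logic can be exercised with an in-memory sink before it is wired to a real producer.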

  5. Configuring Proxy Application in Bluemix

    The proxy application is a web application deployed in a Liberty container. The Message Hub and Workload Scheduler services must be instantiated beforehand and bound to this container so that the application and the services can communicate with each other.

    Note: Configure and use the credentials of the bound service instances.

     

    SC7

    To bind the services to the Bluemix container, use the command below:

    sudo bluemix ic run -P -e "CCS_BIND_SRV=kafka, WorkloadScheduler" -it --name libertycontainer registry.ng.bluemix.net/sharad_container/libertysrv

    Note the CCS_BIND_SRV variable, which is used in the command with a comma-separated list of service names.

    To meet the Message Hub security prerequisites, modify the default Liberty server.xml and add JAAS entries to it. Refer to the link below for details:

    https://github.com/ibm-messaging/message-hub-samples/blob/master/java/message-hub-liberty-sample/src/main/wlp/server.xml
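    Besides the JAAS entries in server.xml, the Kafka client inside the proxy application also needs its own connection properties. The sketch below shows one plausible set, using standard Kafka client configuration keys. The class name is hypothetical, and the exact settings required depend on the Kafka client version and on the credentials of the bound Message Hub instance, so treat it as an assumption rather than the recipe's configuration.

```java
import java.util.Properties;

/** Hypothetical factory for the producer settings used to reach Message Hub. */
final class MessageHubProducerConfig {
    private static final String STRING_SERIALIZER =
            "org.apache.kafka.common.serialization.StringSerializer";

    /** bootstrapServers is the broker list taken from the bound service's credentials. */
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // Authentication is delegated to the JAAS login configured in server.xml.
        props.put("security.protocol", "SASL_SSL");
        props.put("ssl.protocol", "TLSv1.2");
        // Messages in this sketch are plain strings, for example CSV lines.
        props.put("key.serializer", STRING_SERIALIZER);
        props.put("value.serializer", STRING_SERIALIZER);
        return props;
    }
}
```

    The resulting Properties object would be passed to the Kafka producer constructor when the producer is created.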

    Once the prerequisites are met, create a Dockerfile that modifies the configuration of the existing Liberty profile downloaded from Bluemix.

    The modified server.xml references the JAR files in the kafkalibs folder.

    Copy all the Dockerfile dependencies, including the application WAR file, into one folder as shown below.

    SC81

     

    SC9

    We are now ready to build and deploy the application container on Bluemix from the command line. Refer to the Docker documentation to build the Docker image locally, and to the link below to upload and run the container in Bluemix:

    https://console.ng.bluemix.net/docs/containers/container_cli_reference_cfic.html

     SC12

  6. Triggering Data Ingestion Service

    When the scheduled job starts, it calls the application through a REST call. The application pulls the data from the respective secure FTP site based on the token it receives in the URL and puts the update in the respective Message Hub queue, from which the different consumers can pull and process the data.

    SC14

  7. References

    a) Code : https://github.com/sharadc2001/DataInjestAppWeb

    b) Docker Assets : https://github.com/sharadc2001/MsgHubDI

    c) MessageHubWebSubscriber: https://github.com/sharadc2001/MessageHubSubscriber

    d) Test Data: https://github.com/sharadc2001/MsgHubDI/tree/master/CSVData

    Video Demo:

    https://youtu.be/ewjv1KaNaU0

     
