Overview

Skill Level: Any Skill Level

INTERMEDIATE

Basic knowledge of:

  1. IBM Watson IoT Platform
  2. Apache Spark
  3. IBM IoT Real-Time Insights
  4. SPSS Modeler

Step-by-step

  1. Recipes to enhance Analytics in IBM Watson IoT Platform

    Before you proceed, review the analytical and cognitive recipes in the lists below and pick the ones that suit your needs.

    [Image links: list of analytical recipes, list of cognitive recipes]

  2. Introduction

    IoT devices and systems produce an enormous amount of data, and it arrives at very high velocity. However, more than 90% of this data gets lost unless it is analyzed. One way to perform this analysis is to set a threshold that triggers an action once it is breached. The danger-zone readings in the time-series data shown below illustrate this.

    However, this approach is reactive at best and futile at worst, because the event has already occurred.

    The real benefit of the massive amount of data produced by IoT lies in analyzing it in real time to discover trends and patterns, and in using those patterns to predict failures in a timely manner (for example, the unexpected temperature dip in the time-series data above). One mechanism for performing this analysis is predictive analytics.

    Predictive analytics encompasses a variety of statistical techniques from predictive modeling, machine learning, and data mining that analyze current and historical facts to make predictions about the future. At its core, predictive analytics captures relationships between explanatory variables and predicted variables from past occurrences and exploits them to predict an unknown outcome. Note, however, that the accuracy and usability of the results depend greatly on the level of data analysis and the quality of the assumptions.

  3. Design and Architecture

    This recipe explains how to integrate the IBM Watson Machine Learning service with IBM Watson IoT Platform to detect a temperature change before it hits the danger zone. A similar approach can be taken to apply other types of analytics.

    The following diagram shows the components involved in the integration.

    [Figure: architecture diagram]

    A device with a temperature sensor keeps publishing events to IBM Watson IoT Platform. In the absence of an actual device, we provide a simulator that keeps publishing the events.

    Multiple receivers, running in the Apache Spark service, subscribe to these events and make REST calls to the SPSS model deployed in the Watson Machine Learning service.

    The SPSS stream is built on top of the SPSS streaming time series expert model. Based on the input data, it finds the most suitable time series forecast model and trains the model automatically at scoring time. The first 50 data points are used for training, and the model adjusts itself over time. The stream is deployed in the Watson Machine Learning service. Through an API call, the service returns the next few forecasts based on the input.

    When a real-time data reading is received, the Spark streaming job gets the next few forecasts from the Watson Machine Learning service. It also calculates the z-score (also known as the standard score, which indicates how many standard deviations an element is from the mean) to indicate how far the actual reading differs from the forecast readings.

    A z-score can be calculated from the following formula:

    z = (X - µ) / σ

    where z is the z-score, X is the value of the element, µ is the population mean, and σ is the population standard deviation.
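
    The formula above can be tried out with a few lines of Python. This is a minimal sketch, not the recipe's Spark implementation: the function name z_score and the sample readings are made up for illustration, and the mean and standard deviation are taken over a plain list of values rather than over the forecasts returned by the Watson Machine Learning service.

```python
from statistics import mean, pstdev


def z_score(x, values):
    """Return how many population standard deviations x lies from the mean of values."""
    mu = mean(values)        # population mean (µ)
    sigma = pstdev(values)   # population standard deviation (σ)
    if sigma == 0:
        # All values are identical, so the z-score is undefined.
        raise ValueError("standard deviation is zero; z-score is undefined")
    return (x - mu) / sigma


# Hypothetical temperature readings, used only for illustration.
readings = [17.0, 17.5, 18.0, 17.5, 17.5]
print(z_score(18.5, readings))
```

    With these sample readings the mean is 17.5, so a reading of 18.5 lands a little over three standard deviations above it.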

    Since the forecast is a trend indicator, a difference larger than the normal range indicates a sudden change in value. In this way, the z-score serves as an indicator that a reading is about to fall outside the acceptable threshold. The z-score can therefore be used in an RTI rule to determine when an alert needs to be raised. A larger threshold value in the rule filters out smaller spikes and dips.

    This section briefly explained the architecture of the components and gave a short introduction to the theory behind the statistical model used in the recipe. The next sections take a hands-on approach to carrying out the recipe in your Bluemix environment.

  4. Deploy the Model in IBM Watson Machine Learning Service

    In this step, we will deploy the model in the Watson Machine Learning service running in IBM Bluemix.

    IBM Watson Machine Learning is a service in Bluemix that makes it easy for developers and data scientists to work together to integrate predictive capabilities with their applications. Built on IBM's proven SPSS analytics platform, Watson Machine Learning lets you develop applications that make smarter decisions and improve user outcomes. It exposes a set of REST APIs that can be called from any programming language to score data against a deployed model.

    What is a Predictive Model?

    A predictive model is a mathematical function that learns the mapping between a set of input data variables (usually bundled into a record) and the target variable (response). We can use IBM SPSS Modeler to create predictive models. IBM SPSS Modeler is a data mining and text analytics application from IBM, used to build predictive models and conduct other analytic tasks. It has a visual interface that lets users apply statistical and data mining algorithms without programming.

    In this recipe, we use a time series expert model that forecasts temperatures. Based on the input data, it finds the most suitable time series forecast model and trains the model automatically at scoring time. The first 50 data points are used for training, and the model adjusts itself over time.

    To keep things simple, a prebuilt model is available on GitHub.

    Deploy the model

    1. Open your browser and go to Bluemix. If you are an existing Bluemix user, log in as usual. If you are new to Bluemix, you can sign up for a free 30-day trial.
    2. In Bluemix, click Catalog and then select the Watson Machine Learning service (under the Data and Analytics category).
    3. Type a name for the service and, for now, leave the service unbound; you can bind it later. Click the Create button.
    4. Click the Dashboard tab to load the SPSS model that is already built.
    5. Download the time series expert model from this link and drop it onto the Watson Machine Learning service dashboard.
    6. Specify the context id and deploy the model. The context id is a unique id that refers to the deployed model; it is used later when invoking the model for scoring.

    Retrieve access key

    In order to invoke the model, one needs to obtain an access key of the Watson Machine Learning service.

    • Go to the Bluemix dashboard and open the Watson Machine Learning service that you just created.
    • Click the View Credentials button to get the access_key and url of the ML service.
    • Note down the access_key and url for later use.

    In this step, we have successfully deployed the SPSS model to the Watson Machine Learning service in Bluemix. In the next steps, we will register a device in Watson IoT Platform and then start the data simulator that generates the temperature data at specified intervals.

  5. Register your Device(s) In Watson IoT Platform

    To send the temperature data (IoT sensor data), we first need to register the device(s) in IBM Watson IoT Platform. This section guides you through that.

    Carry out the steps in this recipe to register your device(s) in IBM Watson IoT Platform. When the device is registered, you are shown its registration details. Make a note of them; we need these details later to connect the device to Watson IoT Platform.

    Generate API Key and Token of Watson IoT Platform

    To connect the Apache Spark service and the IoT Real-Time Insights (RTI) service to IBM Watson IoT Platform so that they can receive device events and results, we first need to generate an API key and token. To do so, carry out the steps in this section: Generate API Key in Watson IoT Platform.

    Note down the key and token; we need them later to connect the Spark application and RTI to Watson IoT Platform.

    At this point, we have created the IBM Watson IoT Platform service, registered the device(s) in it, and generated the API key.

  6. Publish Temperature Data

    In this step, we will publish temperature events to IBM Watson IoT Platform so that changes in temperature can be predicted beforehand.

    1. Download and install Maven and Git if not installed already.

    2. Clone the iot-predictive-analytics repository as follows:
      git clone https://github.com/ibm-messaging/iot-predictive-analytics-samples.git 
    3. Navigate to the DeviceDataGenerator project and build it using Maven:
      cd iot-predictive-analytics-samples/DeviceDataGenerator
      mvn clean package
    4. This downloads all required dependencies and starts the build. Once built, the sample can be found in the target directory, with the filename IoTDataGenerator-1.0.0-SNAPSHOT.jar.
    5. Modify the device.prop file in the target/classes directory by entering the device registration details that you noted in the previous step:
      Organization-ID = <Your Organization ID>
      Device-Type = <Your Device Type>
      Device-ID = <Your Device ID>
      Authentication-Method = token
      Authentication-Token = <Your Device Token>

      (Note: options must be modified based on your device registration)

    6. Run the data generator sample using the following command:

      mvn exec:java -Dexec.mainClass="com.ibm.iot.iotdatagenerator.IoTDataGenerator"
    7. Observe that the device connects to IBM Watson IoT Platform and publishes the simulated temperature data (from the testDataSet file).
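
    Before running the generator, it can help to check that device.prop no longer contains placeholders. The following Python sketch is a hypothetical helper and not part of the sample: it assumes the file uses the "key = value" layout shown in step 5, and the names parse_props and unfilled_keys are invented for illustration.

```python
# Keys listed in step 5 of this section.
REQUIRED_KEYS = (
    "Organization-ID",
    "Device-Type",
    "Device-ID",
    "Authentication-Method",
    "Authentication-Token",
)


def parse_props(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def unfilled_keys(props):
    """Return required keys that are missing, empty, or still '<placeholder>' values."""
    bad = []
    for key in REQUIRED_KEYS:
        value = props.get(key, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            bad.append(key)
    return bad


# Illustrative content only; the organization and device values are made up.
sample = """\
Organization-ID = abc123
Device-Type = <Your Device Type>
Device-ID = Device01
Authentication-Method = token
"""
print(unfilled_keys(parse_props(sample)))
```

    To check a real file, read it with open("target/classes/device.prop").read() and pass the text to parse_props.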

    Viewing your device and events in Watson IoT Platform

    1. Open the Watson IoT Platform service that you created in the above step “Register your Device(s) In Watson IoT Platform” and click Launch Dashboard.
    2. Select Devices tab and observe that your device is connected to Watson IoT Platform.
    3. Click on the device to view the sensor events published by the simulator. To view an individual event, click on the

    In this section, we have successfully started a device sample. Let's consume and process these events by creating the Apache Spark application in the next section.

  7. Create Spark Streaming service

    In this step, we will create the Scala notebook application to onboard the device events to Apache Spark service and invoke the Watson Machine Learning service.

    Setup the Apache Spark service in Bluemix

    1. In the Bluemix Catalog, go to the Data and Analytics section and select the Apache Spark service.
    2. This service can be bound to your existing IoT application or left unbound.
    3. Click Create.
    4. After the Spark service is created, if it is bound to an application, click the Apache Spark service. If it is unbound, you do not need to click the service. The Spark service UI is then shown.
    5. Notebooks are interactive environments for exploring, analyzing, and visualizing data, integrated for use with IBM Analytics for Apache Spark. Click the NOTEBOOKS button to show existing notebooks, then click the NEW NOTEBOOK button.
    6. Enter a name, select Scala under Language, and click the CREATE NOTEBOOK button.

    Create the notebook application to receive the device events in the Spark service

    1. Go to the notebook. In the first cell (next to In [ ]), enter the following AddJar special commands to upload the streaming application jar and all the dependent jars to the Spark environment.
      %AddJar https://github.com/ibm-watson-iot/predictive-analytics-samples/releases/download/0.0.3/IoTSparkAsServiceSample-3.0.0.jar -f
      %AddJar https://github.com/sathipal/spark-streaming-mqtt-with-security_2.10-1.3.0/releases/download/0.0.1/spark-streaming-mqtt-security_2.10-1.3.0-0.0.1.jar -f
      %AddJar http://central.maven.org/maven2/org/apache/wink/wink-json4j/1.4/wink-json4j-1.4.jar
      %AddJar https://repo.eclipse.org/content/repositories/paho-releases/org/eclipse/paho/org.eclipse.paho.client.mqttv3/1.0.2/org.eclipse.paho.client.mqttv3-1.0.2.jar
      %AddJar http://central.maven.org/maven2/org/apache/commons/commons-math/2.2/commons-math-2.2.jar
      %AddJar http://repo1.maven.org/maven2/args4j/args4j/2.0.12/args4j-2.0.12.jar
    2. Add another cell by clicking the + button below the Edit menu option. Enter the following configuration parameters so that the Spark streaming application can talk to Watson IoT Platform and the Watson Machine Learning service. Note that the values placed between angle brackets must be replaced with your own credentials. Also, the appid must be less than 20 characters.
      import com.ibm.iot.iotspark.IoTSparkAsServiceSample

      //Watson IoT Platform related parameters
      IoTSparkAsServiceSample.setConfig("appid","a:<orgid>:<unique appid>")
      IoTSparkAsServiceSample.setConfig("uri","ssl://<orgid>.messaging.internetofthings.ibmcloud.com:8883")
      IoTSparkAsServiceSample.setConfig("mqtopic","iot-2/type/+/id/+/evt/temperature/fmt/+")
      IoTSparkAsServiceSample.setConfig("apikey","<a-orgaid-apikey>")
      IoTSparkAsServiceSample.setConfig("authtoken","<AUTHTOKEN>")

      // Predictive Service related parameters
      IoTSparkAsServiceSample.setConfig("window","10")
      IoTSparkAsServiceSample.setConfig("cycle","10")
      IoTSparkAsServiceSample.setConfig("predictive-service-url","<APPENDED URL>")

      (The APPENDED URL must be in the following format: {Watson ML service URL}/pm/v1/score/{contextId}?accesskey={access_key for this bound application}
      For example, if the URL obtained from the Watson Machine Learning service credentials is “https://ibm-watson-ml.mybluemix.net”, the context id is “predict”, and the access key is “xxxxxxxxxx”, then the appended URL is “https://ibm-watson-ml.mybluemix.net/pm/v1/score/predict?accesskey=xxxxxxxxxx”)

    3. Trigger the streaming job by adding the following in the next cell:
      IoTSparkAsServiceSample.startStreaming(sc, 4)
      (Note that the streaming batch interval is set to 4 seconds; you can increase or decrease it by changing the value.)
    4. Run all the cells at once by going to Cell->Run All.
    5. Observe that when the application starts, it verifies connectivity to Watson IoT Platform and the Watson Machine Learning service with the given configuration parameters. If there are any issues, the application stops. If connectivity is fine, it reads the sensor events from Watson IoT Platform in real time, invokes the Watson Machine Learning service, calculates the zscore and wzscore values based on the predicted values, and publishes the result back to IBM Watson IoT Platform.

    6. Observe in the notebook that the temperature (originating from the device), the forecast temperature (returned by the Watson Machine Learning service), and the zscore and wzscore values (calculated by this Spark application) are printed every 4 seconds.

    7. You can also observe the results in the Watson IoT Platform dashboard by clicking the device row.

    To stop the program, use the Interrupt Kernel button just below the Kernel menu.
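
    The predictive-service-url value from step 2 is easy to get wrong by hand, so here is a minimal Python sketch that assembles it from the three values noted earlier (service url, context id, access key). The function name scoring_url is hypothetical, and percent-encoding the context id and access key is an assumption about what the service accepts rather than something the recipe states.

```python
from urllib.parse import quote, urlencode


def scoring_url(service_url, context_id, access_key):
    """Build {service url}/pm/v1/score/{contextId}?accesskey={access_key}."""
    base = service_url.rstrip("/")          # avoid a double slash before /pm
    path = quote(context_id, safe="")       # assumption: encode unusual characters
    query = urlencode({"accesskey": access_key})
    return f"{base}/pm/v1/score/{path}?{query}"


# Values taken from the example in step 2; the access key is a placeholder.
print(scoring_url("https://ibm-watson-ml.mybluemix.net", "predict", "xxxxxxxxxx"))
```

    The printed string is the value to paste into the predictive-service-url setting in the second notebook cell.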

    Building and running your own code

    You can easily modify the existing Spark streaming application to suit your use case and run it. Follow the steps below:

    1. Clone the iot-predictive-analytics repository as follows:

      git clone https://github.com/ibm-messaging/iot-predictive-analytics-samples.git 
    2. Import the SparkComponentproject project into the Eclipse environment and make necessary changes.
    3. Build the project using maven (either via Eclipse or command line)
      mvn clean package 
    4. This downloads all required dependencies and starts the build. Once built, the sample can be found in the target directory. Post the jar IoTSparkAsServiceSample-2.0.0-jar-with-dependencies.jar at a publicly available URL, for example on Box or Dropbox.
    5. Go to the notebook and modify the first cell to upload the streaming application jar that you built instead of the one available on GitHub:

      %AddJar https://github.com/sathipal/spark-streaming-mqtt-with-security_2.10-1.3.0/releases/download/0.0.1/spark-streaming-mqtt-security_2.10-1.3.0-0.0.1.jar -f
      %AddJar <URL of IoTSparkAsServiceSample-2.0.0-jar-with-dependencies.jar> -f

      Note: Replace the URL of the IoTSparkAsServiceSample jar with the URL where you placed the built application (say Box or Dropbox). In the case of Dropbox, you may have to change the last part of the URL from '?dl=0' to '?dl=1'.

    6. Because IoTSparkAsServiceSample-2.0.0-jar-with-dependencies.jar is built with all the dependencies, you don't need to specify any dependencies except spark-streaming-mqtt-security_2.10-1.3.0-0.0.1.jar.
    7. Keep the contents of the remaining 2 notebook cells the same and start the streaming application by carrying out the steps in the sub-section “Create the notebook application to receive the device events in the Spark service”.
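
    The Dropbox adjustment mentioned in the note under step 5 (changing '?dl=0' to '?dl=1') can be done programmatically. The Python sketch below is illustrative only: the function name force_direct_download and the share link are made up, and it simply sets the dl query parameter to 1 while leaving the rest of the URL untouched.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_direct_download(url):
    """Return url with its 'dl' query parameter set to 1."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["dl"] = "1"  # overwrite dl=0, or add dl=1 if it was absent
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical share link, used only to show the transformation.
print(force_direct_download("https://www.dropbox.com/s/abc123/app.jar?dl=0"))
```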

    In this section, we started a Scala Spark application that reads the device events and calls the Watson Machine Learning service, which returns the forecast temperatures. The Spark application then calculates the zscore and wzscore and publishes them back to IBM Watson IoT Platform, where they are used for charting and for creating alerts in the IBM Real-Time Insights (RTI) service. We will demonstrate this in the next step.
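
    The mqtopic value configured in this section, iot-2/type/+/id/+/evt/temperature/fmt/+, is an MQTT topic filter in which each + matches exactly one topic level. The Python sketch below illustrates standard MQTT wildcard matching so you can check which events a filter would select; it is not code from the sample, the function name topic_matches is hypothetical, and the device type "sensor" in the usage is made up.

```python
def topic_matches(topic_filter, topic):
    """Return True if topic matches the MQTT topic_filter ('+' and '#' wildcards)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' matches all remaining levels and must be the last filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # the topic has fewer levels than the filter requires
        if level != "+" and level != topic_levels[i]:
            return False  # a literal level that does not match
    # Without '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


FILTER = "iot-2/type/+/id/+/evt/temperature/fmt/+"
print(topic_matches(FILTER, "iot-2/type/sensor/id/Device01/evt/temperature/fmt/json"))
print(topic_matches(FILTER, "iot-2/type/sensor/id/Device01/evt/humidity/fmt/json"))
```

    The first call matches because only the wildcard levels differ; the second does not, because the event name level is fixed to temperature. If your device publishes under a different event name, the filter has to be changed accordingly.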

  8. Create Rules in Watson IoT

    In this step, we will show how to create a rule for the wzscore value to alert when it crosses the threshold. 

    Create a Message Schema

    Make sure the Apache Spark Streaming application is running; otherwise, you may not get the right data points.

    1. In the Devices tab, select the Manage Schemas tab.
    2. Click Add Schema to add a new schema.
    3. Select the DeviceType for which the schema is created and click Next.
    4. Click Add a property to add the datapoints from the connected device.
    5. Select the “From Connected” tab and then select the required datapoints. If the datapoints are not listed, check that the Apache Spark Streaming application is running.
    6. Click Finish to finish the schema creation.

    Add a Rule

    • In the Rules tab, select the Browse tab and click “Create Cloud Rule”.
    • Provide a name for the rule, select the schema name in the “Applies to” column, and click Next.
    • Define the rule so that it triggers an alert when the wzscore value is either above 3 (temperature spike) or below -3 (temperature dip).
    • You can also associate different actions with the rule. Refer to this recipe for more information about the list of available actions in RTI and how to associate them with the rule.
    • Click Save to save the rule and then Activate to activate the rule.

    In this step, we have successfully set up RTI and configured rules so that alerts are generated when the wzscore crosses the threshold.
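
    The rule condition can be mirrored in a few lines of Python to sanity-check which readings would raise an alert. This is only a sketch of the comparison described above, not how RTI evaluates rules; the function name classify is hypothetical, and the sample values are illustrative.

```python
def classify(wzscore, threshold=3.0):
    """Label a wzscore as a 'spike', a 'dip', or 'normal' for a symmetric threshold."""
    if wzscore > threshold:
        return "spike"   # above +threshold: temperature spike
    if wzscore < -threshold:
        return "dip"     # below -threshold: temperature dip
    return "normal"


# Illustrative wzscore values.
for value in (0.61, -2.33, 3.5, -3.2):
    print(value, classify(value))
```

    Raising the threshold argument makes the rule less sensitive, which matches the earlier observation that a larger value filters out smaller spikes and dips.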

  9. Results

    Realtime Alerts

    Now, when the predicted results are sent to Watson IoT Platform, the rules analyze the wzscore data in real time and take action when a threshold is crossed. Go to the Dashboard tab to view the alerts and notifications.

    1. Click the Boards tab to view the analytical cards.
    2. Click the Device-Centric-Analytics card and select the device to view the list of alerts generated for this device.

    Realtime charting

    With the new cards in Watson IoT Platform, you can build your own custom dashboard with visualization charts for the real-time data coming in from the devices. Refer to this recipe for detailed information about creating visualization charts.

    Carry out the following steps to visualize the results in charts.

    1. In the dashboard, select the Boards tab.
    2. Click + Create New Board to create a board for visualization.
    3. Specify a name and create the dashboard.
    4. Open the new board and click the + Add New Card button.
    5. In the Devices section, select Realtime Chart.
    6. Select a device.
    7. Now, define the data set for the visualization. Click Connect new data set.
      • Enter the name for your data set
      • Select the event
      • Select the property of the event as temperature
      • Optionally, you can select the unit of the data set as well
      • Repeat these steps to add the wzscore property.
    8. Click Next.
    9. Preview the card. You can select the size of the card now. By default, Small is selected.
    10. Enter the title for the card and click Submit.
    11. Observe that the values are plotted in the chart.
  10. Conclusion and the Road Ahead

    This recipe shows how to integrate IBM Watson IoT Platform, the Apache Spark service, and the Watson Machine Learning service so as to take timely action before an (unacceptable) event occurs. Developers can look at the model and code available in the GitHub repository to understand what's happening under the hood, and can treat this recipe as a template for integrating the Watson Machine Learning service with IBM Watson IoT Platform. They can modify the existing Spark application, as well as the model, depending on the use case.

    Go through the next part of this recipe “Timeseries Data Analysis of IoT events by using Jupyter Notebook” to analyze the resultant events produced by this recipe, in a Jupyter Notebook using Spark SQL and Pandas DataFrames.

19 Comments on "Engage Machine Learning for detecting anomalous behaviors of things"

  1. Can IBM QEWS be also used on WIoT platform?

  2. YMDH_sathish_Palaniappan August 29, 2016

    Yes, one can easily integrate IBM’s Predictive Maintenance and Quality (PMQ) system with Watson IoT Platform by creating a custom flow in the IBM Integration Bus (IIB) node that receives the events from the Watson IoT Platform. We are in the process of drafting a recipe that shows how one can achieve this. It might take a couple of weeks to complete the recipe and publish it. We will post a comment once it’s published. Thank you!

  3. RomeoKienzler September 01, 2016

    > Specify the context id and deploy the model.
    What is the “context id” ?

    • The context id is a unique id that refers to the deployed model. This context id is used later when invoking the model for scoring.

  4. Hi – where is the sub-section “create the notebook application to receive the device events in the Spark service”?

  5. Hi, when I run the code in Step 7 (Create Spark Streaming Service) I get the following error:

    Name: Compile Error
    Message: :21: error: not found: value IoTSparkAsServiceSample
    IoTSparkAsServiceSample.startStreaming(sc, 4)
    ^
    StackTrace:

    Any ideas where I am going wrong?

    • Recipes@IoTF November 02, 2016

      Looks like the first 2 cells are not run. Can you make sure that the first 2 cells are run before the third cell? Actually the first cell downloads all the necessary jars that are required to run this sample which is missing.

  6. Thanks for the excellent tutorial, I was able to reach and complete step 7. Step 8 and the rest should be easier to do once we get it right until this step.

    My thanks!, I have been learning a lot in the last few days. Need to integrate this to real IoT devices such as Raspi with Node-RED for example instead of simulated one, as well as modifying the sample predictive model.

    My output at the completion of step 7:

    -------------------------------------------
    Time: 1478358216000 ms
    -------------------------------------------
    (Device01,State [prediction={"wzscore":0.6136529541488246,"name":"datacenter","temperature":17.69,"forecast":17.63,"zscore":1.0805949411139768,"timestamp":"2016-Nov-05 15:03:33"}])

    -------------------------------------------
    Time: 1478358220000 ms
    -------------------------------------------
    (Device01,State [prediction={"wzscore":-2.3332071988718557,"name":"datacenter","temperature":17.583,"forecast":17.66,"zscore":-0.5894687217371645,"timestamp":"2016-Nov-05 15:03:38"}])

    -------------------------------------------
    Time: 1478358224000 ms
    -------------------------------------------
    (Device01,State [prediction={"wzscore":-2.1799095025532944,"name":"datacenter","temperature":17.56,"forecast":17.66,"zscore":-0.847580531770821,"timestamp":"2016-Nov-05 15:03:40"}])

    • Hi Andi! How did you modify the sample predictive model? I attempt to use real IoT devices to send temperature similar to simulated one but nothing in my output at the completion of step 7 no Time, no wzscore,…and i think the reason is i didn’t modify the sample predictive model. Can you show me the way to do that?!
      thank so much, i hope receive response as soon as possible from you!

      • YMDH_sathish_Palaniappan January 02, 2017

        Hi Hieu Le,

        Thank you for contacting us. If i understand correctly, you are able to perform all the steps with the simulator available in the recipe, but facing error when you try to use the real IoT device.

        As long as the device (real device or simulator) sends the temperature events in the following format, then there is no change required in the Spark application,

        Event name: temperature <— this can be anything, but make sure that the same event name is used in Spark configuration
        payload:

        {"name":"datacenter","temperature":17.47, "timestamp": "xxxxxx"}

        Let me know the event format that is sent to the Watson IoT Platform. Also the Spark configuration (Step 7 values).

        Thanks & regards,
        Sathish

  7. In step 8, in an attempt to get the generated values on Watson IoT Platform, I’ve just realized that I am not getting the forecast, zscore and wzscore values.

    I am getting all the scores in Spark…
    Somehow Spark is not writing the value back to Watson IoT Platform, or I was missing something during steps 1-7?
    Any idea what I was doing wrong?

    Thanks.

  8. AllefA.Silva February 10, 2017

    I would like to know what to fill in for the

  9. AllefA.Silva February 10, 2017

    I would like to know what to fill in for the unique appid
