Step by step

Understand anomaly detection using a moving z-score (optional)
Although not strictly necessary, a basic understanding of how the anomaly detection algorithm used in this tutorial works is very beneficial. I won’t make it too complicated, I promise.
Basically the algorithm depends on two statistical measures, mean and standard deviation:
 Mean
The mean (in layman’s terms also called the average) is a measure of the central tendency of your data. It can be calculated as follows:
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
This means you just sum up all individual values and divide by the number of values you’ve summed up.
 Standard Deviation
Standard deviation is a measure of how widely the data is spread around the mean and can be calculated as follows:
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
(Note that the mean, written as x-bar in the previous formula, is now denoted by the Greek letter mu, but I won’t go into the details here.) Although the formula looks slightly more complicated, what you can observe is that the distance between every data point (or measurement) and the mean is evaluated and then summed up. So the further the data points are spread around the mean, the higher the standard deviation is. This is important because if your data is already widely distributed around the mean, a data point has to lie even further away from the mean to be detected as an anomaly.
So now we have all the ingredients to calculate the z-score, which is defined as follows:
$$z = \frac{x - \mu}{\sigma}$$
This means that for every measurement you just subtract the mean and divide the result by the standard deviation. So we are nearly done; the only thing left to do is to turn this into a “moving z-score”, since we want to detect anomalies in time series, right?
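To make the three measures concrete, here is a minimal JavaScript sketch. It assumes the population form of the standard deviation (dividing by N), and the function names and the sample readings are made up for illustration; they are not taken from the deployed flow.

```javascript
// Arithmetic mean: sum of all values divided by their count.
function mean(values) {
  return values.reduce((sum, x) => sum + x, 0) / values.length;
}

// Population standard deviation (divides by N, an assumption of this sketch).
function standardDeviation(values) {
  const mu = mean(values);
  const squaredDistances = values.map((x) => (x - mu) ** 2);
  return Math.sqrt(mean(squaredDistances));
}

// z-score of one measurement x relative to a set of values.
function zScore(x, values) {
  return (x - mean(values)) / standardDeviation(values);
}

// Hypothetical readings, chosen so the numbers come out round.
const readings = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(mean(readings));              // 5
console.log(standardDeviation(readings)); // 2
console.log(zScore(9, readings));         // 2
```

The reading 9 lies two standard deviations above the mean, which is exactly what a z-score of 2 expresses.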
The trick is simple. Instead of using ALL measurements to calculate the mean and standard deviation, we take only the latest k measurements into account. This approach is called windowing and is described here very nicely.
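The windowing idea can be sketched in a few lines of JavaScript. This is only an illustration under some assumptions of mine: the window includes the newest measurement, the population standard deviation is used, and a flat window (standard deviation of 0) yields a z-score of 0. The name `createMovingZScore` is hypothetical and the code in the deployed flow may look different.

```javascript
// Returns a function that remembers the latest k measurements and reports
// the z-score of each new measurement relative to that window.
function createMovingZScore(k) {
  const recent = [];
  return function update(x) {
    recent.push(x);
    if (recent.length > k) {
      recent.shift(); // drop the oldest measurement
    }
    const mu = recent.reduce((sum, v) => sum + v, 0) / recent.length;
    const variance =
      recent.reduce((sum, v) => sum + (v - mu) ** 2, 0) / recent.length;
    const sigma = Math.sqrt(variance);
    // A flat window has sigma 0; report 0 instead of dividing by zero.
    return sigma === 0 ? 0 : (x - mu) / sigma;
  };
}

// Hypothetical readings: four steady values followed by a jump.
const movingZScore = createMovingZScore(4);
const scores = [10, 10, 10, 10, 14].map((x) => movingZScore(x));
console.log(scores);
```

The first four scores are 0 because the window is flat. For the fifth value the window is [10, 10, 10, 14], so the mean is 11, the variance is 3, and the score is 3 divided by the square root of 3, roughly 1.73.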

Deploy the application
To fast-track things, you can just click on the deploy button below, which will automatically deploy a Node-RED data flow tool acting as a device simulator in the IBM Cloud. It also comes with a preconfigured “edge” implementation of the algorithm described above. As already mentioned, this is based on a full-length tutorial which can be found here.
The full tutorial explains the concept of “Cognitive IoT”, where advanced machine learning algorithms and neural networks can be trained and run in various locations (on the edge of an IoT system, in a batch processing system or in a real-time data processing system in the cloud). In this fast-track tutorial we concentrate only on the edge rule, which has been obtained by running the full-stack process mentioned above.
Please click on the following deploy button:
This will create a Node-RED instance in the IBM Cloud with a data flow preconfigured for our application. Please log in with your IBM Bluemix account and click on “deploy”.

Understand what’s happening
After a successful deployment you’ll see a screen like this; please click on “view app”:
You’ll be taken to the Node-RED flow editor, where you can see the already deployed and running application. Please have a look; it should look something like the following:
So let me walk you through each element:
 Node-RED is free, open source and runs everywhere: in the IBM Cloud, in any other cloud or data center, on your laptop and even on an IoT gateway like a Raspberry Pi. So imagine this flow running on an IoT gateway connected to an elevator and measuring the voltage of the main drive motor. As we don’t have a Raspberry Pi in place, we just simulate these sensor values using an “Inject” node in Node-RED. Otherwise you would see a dedicated sensor node here
 In order to create some randomness, this JavaScript function node adds random noise to the signal and also occasionally adds an anomaly, which we want to detect
 In addition, we want to send data upstream to the cloud, so let’s add a timestamp. It is always good to generate a timestamp as close as possible (temporally and spatially) to the sensor, so that this value can be referred to as “event time” rather than “processing time”
 To stream this data to the IBM Watson IoT Platform via MQTT, only one simple node is necessary
 In addition, we want to create a little dashboard to monitor the voltage sensor values
 In order to achieve this, we need to shift and shuffle the values a bit
 Now we are at the heart of the system. This JavaScript function actually calculates the moving z-score for us. Since this function is described in great detail in this tutorial, we’ll skip it here for now.
 Of course we want to plot the moving z-score as well, in parallel to the voltage, in order to really understand what’s going on
 Again we need to do some shifting and shuffling as preparation
 Now we generate an alert message in case the z-score drops below 0.5, which means some major fluctuation has taken place recently
 We just display this alert message under the two other charts
 In order to get rid of the message once in a while, we reset it
 And we delay the deletion by 5s, so the message stays displayed for 5s
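To give a feeling for what the simulator, timestamp and alert nodes in the walkthrough do, here is a hedged sketch written as plain JavaScript functions rather than as the actual Node-RED function nodes. All names, the base voltage, the noise amplitude, the anomaly probability and the alert text are my own assumptions for illustration; only the 0.5 threshold comes from the walkthrough above.

```javascript
// Simulates one voltage reading: a base value plus random noise and,
// occasionally, a large spike standing in for an anomaly.
// All option values are hypothetical defaults chosen for illustration.
function simulateVoltage(random = Math.random, options = {}) {
  const {
    baseVoltage = 230,
    noiseAmplitude = 1,
    anomalyProbability = 0.02,
    anomalySize = 30,
  } = options;
  const noise = (random() * 2 - 1) * noiseAmplitude;
  const isAnomaly = random() < anomalyProbability;
  return baseVoltage + noise + (isAnomaly ? anomalySize : 0);
}

// Attaches the event time right where the reading is produced.
function addTimestamp(msg, now = Date.now) {
  return { ...msg, payload: { voltage: msg.payload, timestamp: now() } };
}

// Alert rule from the walkthrough: flag a z-score below 0.5.
// The message text is made up.
function alertMessage(zScore, threshold = 0.5) {
  return zScore < threshold ? 'Anomaly detected' : '';
}

const msg = addTimestamp({ payload: simulateVoltage() });
console.log(msg.payload);
console.log(alertMessage(0.3));
```

Passing `random` and `now` as parameters is just a convenience so that the functions can be tried with fixed values; in a function node you would simply work on `msg.payload` and return `msg`.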

Observe what's going on using the real-time dashboard
In order to open the dashboard just click on the dashboard tab and on the dashboard icon as shown here:
You will observe two time series charts (run charts), one for the voltage and one for the moving z-score:
Wait for some time until you observe a z-score below 0.5 and you’ll see that an alert message is generated. Of course you can also trigger something more important, like initiating an emergency shutdown of the system or raising an alert, either by sending an email/SMS directly from the edge using a Node-RED node (see twilio or email for this) or by sending the alert upstream to the cloud using MQTT. The latter would be a perfect example of how edge analytics can reduce the amount of data transferred to the cloud by adding intelligence to the edge gateway device.
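The data reduction idea can be illustrated with a tiny sketch: instead of publishing every reading, the gateway forwards only those flagged by the edge rule. The function name, the predicate and the sample values below are hypothetical, and the actual publishing step (for example via an MQTT node) is deliberately left out.

```javascript
// Keeps only the readings that the edge rule flags, so that only those
// would be published upstream. `isAlert` is a hypothetical predicate
// standing in for whatever rule runs on the gateway.
function selectForUpstream(scoredReadings, isAlert) {
  return scoredReadings.filter(isAlert);
}

// Made-up readings with made-up moving z-scores.
const scoredReadings = [
  { voltage: 230.1, zScore: 0.9 },
  { voltage: 229.8, zScore: 0.8 },
  { voltage: 260.2, zScore: 0.2 },
  { voltage: 230.0, zScore: 1.1 },
];

const upstream = selectForUpstream(scoredReadings, (r) => r.zScore < 0.5);
console.log(`${upstream.length} of ${scoredReadings.length} readings sent`);
// prints "1 of 4 readings sent"
```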
Thanks for sticking with me; we are done. I hope this tutorial was of value to you. Please let me know your questions, comments and thoughts below…
Hello, was trying to look at the example, but clicking on Deploy to Bluemix gets me ‘JazzHub is now retired. Projects are now toolchains.’ Any way to deploy to a toolchain? Thank you!