Every data project starts with data. Data is a very broad term. It can be structured or unstructured, big or small, fast or slow, and accurate or noisy. To create a deep learning solution for anomaly detection in IoT time-series data generated by a vibration sensor (accelerometer), we need data that is structured, fast, and big, and that can also be noisy.

In this tutorial, we create a simple framework that generates data by sampling various physical models, lets us decide on the degree of noise, and switches between different states (healthy and broken) of the physical model for anomaly detection and classification tasks. We use Node-RED as the runtime platform for the simulator because it is a very fast way of implementing data-centric applications. Node-RED is open source and runs entirely on Node.js. If you want to learn more about Node-RED, check out this excellent tutorial.

Because the data simulator is completely implemented as a Node-RED flow, we can use Node-RED from the IoT Starter Boilerplate on the IBM Cloud. Of course, the data simulator can run on any Node-RED instance, even on a Raspberry Pi, where it can be used to simulate sensor data on the edge.

Before you get started, you’ll need an IBM Cloud Account. (You can request a free account here, which you can convert into a freemium account later.)

Creating the test data simulator

  • Create an IBM Cloud app using the Internet of Things Platform Starter. If you are not logged in to IBM Cloud, log in.

  • Give your application a unique name, and click Create.

  • Wait until your application is running, and then click Visit App URL.

    Note: If you get a “404 – No route defined” error, reload the page after a couple of minutes. This error is a known issue in the Cloud Foundry open source cloud platform component and occasionally occurs when IBM Cloud faces high workloads. Cloud Foundry communicates asynchronously between its components, so the UI can report that the app is running before the load balancer has been updated.

Application UI

  • Before you can access and open your Node-RED flow editor, you must secure your Node-RED editor.

    • Click Next.
    • Set a user name and a password.

      Select the check box to share your application with others. By not selecting the check box, you will keep your implementation private.

    • Click Next.

    • Click Finish.
  • Click Go to your Node-RED flow editor.

Node-RED Flow

  • Log in with the user name and password you’ve just created.

  • Using your mouse, select all the nodes in Flow 1; press the Delete key to empty the flow.


  • From the upper-right menu, click Import > Clipboard.

  • Open this simulatorflow.json file; copy the JSON object to the clipboard.

  • On the Import nodes window, paste the JSON object to the text field, and click Import.

    The following flow is displayed in the Flow 1 tab.


  • Click Deploy. The message “Successfully deployed” is displayed.

    The debug tab displays the generated messages. Congratulations! Your Test Data Generator is working.


Understanding Node-RED flow

You’ve got it working, but what is going on in this Node-RED flow?

Consider the node labeled with the word timestamp.

Timestamp Node

This node is an inject node that generates messages at defined intervals. This is very useful as a starting point for our simulator. In a real-life scenario, this node would be replaced with nodes that are connected to accelerometer sensors.

Double-click the timestamp node. Notice that it is configured to generate 100 messages per second (a sampling rate of 100 Hz).

Sample Rate

Next, look at the function node. It is the heart of the simulator.

Function Node

Double-click this node and see the following function code:

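// Model constants, read from the global context with healthy defaults.
var h = context.global.get('h') || 0.008; // integration step size
var a = context.global.get('a') || 10;
var b = context.global.get('b') || 28;
var c = context.global.get('c') || 8 / 3;
// Current state, with starting values for the first message.
var x = context.global.get('x') || 2;
var y = context.global.get('y') || 3;
var z = context.global.get('z') || 4;

// Advance the model by one time step.
x += h * a * (y - x);
y += h * (x * (b - z) - y);
z += h * (x * y - c * z);

// Store the new state so that the next message continues from it.
context.global.set('x', x);
context.global.set('y', y);
context.global.set('z', z);

msg.payload = {};
msg.payload.x = x;
msg.payload.y = y;
msg.payload.z = z;
return msg;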

The constants of the model are h, a, b, and c, and the state variables x, y, and z are initialized to some starting values. The three update equations are the actual model; they depend on h, a, b, c, x, y, and z. In every time step (currently 100 per second), the model is advanced one step into the future: x, y, and z are updated from the constants h, a, b, and c and from the previous values of x, y, and z.
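
The update rule has the form of the Lorenz system, advanced with a fixed step h. As a minimal sketch outside Node-RED, the following plain Node.js code applies the same arithmetic for one second of simulated data. The `lorenzStep` helper and the plain state object are illustrative stand-ins for the function node and its global context; they are not part of the flow.

```javascript
// Hypothetical stand-alone version of the function node's update rule.
// `state` plays the role of the x, y, z values kept in the global context.
function lorenzStep(state, params) {
  const { h, a, b, c } = params;
  let { x, y, z } = state;
  // Same update order as in the flow: y uses the already updated x,
  // and z uses the already updated x and y.
  x += h * a * (y - x);
  y += h * (x * (b - z) - y);
  z += h * (x * y - c * z);
  return { x, y, z };
}

// Default constants and starting values taken from the function node.
const healthyParams = { h: 0.008, a: 10, b: 28, c: 8 / 3 };
let state = { x: 2, y: 3, z: 4 };

// 100 steps correspond to one second of data at 100 messages per second.
const samples = [];
for (let i = 0; i < 100; i++) {
  state = lorenzStep(state, healthyParams);
  samples.push(state);
}
console.log(samples[0], samples[99]);
```

Because each step starts from the state returned by the previous one, the samples form a continuous trajectory rather than 100 copies of the same point.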

You need to set a limit on the output for two reasons:

  • At the current sample rate (100 messages per second), you’ll use up the free 200 MB per month on the Watson IoT Platform within a couple of hours.
  • The downstream analysis might not be able to cope with this data rate.

Now let’s look at the function node labeled limit to max 3000. Currently, a simple counter limits the output to 3000 messages, which is 30 seconds’ worth of data at 100 messages per second.


Double-click the node to see the function code:

var count = context.global.get('count') || 0;
count += 1;
context.global.set('count', count); // remember the count for the next message
if (count <= 3000) {
   return msg;
}

Now, consider the reset node. The function node associated with it lets the next 30 seconds’ worth of data through to the message queue.


Double-click the function node. It is implemented as follows:

context.global.set('count', 0); // restart the count so that messages pass again
msg.payload = context.global.get('count');
return msg;
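
To see how the limit and the reset interact, here is a minimal sketch in plain Node.js. It assumes the counter lives in the global context under the key count; `makeGlobalContext`, `limitNode`, and `resetNode` are hypothetical stand-ins for the Node-RED context and the two function nodes, written only for this illustration.

```javascript
// Minimal stand-in for Node-RED's global context (get/set only).
function makeGlobalContext() {
  const store = new Map();
  return {
    get: (key) => store.get(key),
    set: (key, value) => { store.set(key, value); },
  };
}

// Counter gate: pass a message on only while the count is within the limit.
function limitNode(global, msg, max) {
  const count = (global.get('count') || 0) + 1;
  global.set('count', count);
  return count <= max ? msg : null; // null means "send nothing"
}

// Reset: zero the counter so that the next batch of messages can pass.
function resetNode(global, msg) {
  global.set('count', 0);
  msg.payload = global.get('count');
  return msg;
}

const globalContext = makeGlobalContext();
let passed = 0;
for (let i = 0; i < 3500; i++) {
  if (limitNode(globalContext, { payload: i }, 3000) !== null) passed++;
}
console.log(passed); // 3000, i.e. 30 seconds at 100 messages per second

resetNode(globalContext, {});
const afterReset = limitNode(globalContext, { payload: 'next' }, 3000);
console.log(afterReset !== null); // true: messages flow again after the reset
```

The gate keeps counting even after the limit is reached, so only an explicit reset reopens it.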

The next-to-last step is to switch the simulator between the broken and healthy states. To simulate faulty or broken data, we use the function node associated with the broken inject node.


The only thing this node does is update the model constants.


return msg;

And, of course, take a look at the function to switch it back to a healthy state.


It is implemented as follows:


return msg;
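
Both state-switching nodes only write model constants to the global context, and the simulator picks them up on its next step. The following plain Node.js sketch illustrates that idea with a Map-backed stand-in for the global context. The healthy constants are the defaults from the simulator function; the “broken” constants are illustrative values assumed for this sketch, not values taken from the flow, and the helper names are hypothetical.

```javascript
// Minimal stand-in for Node-RED's global context (get/set only).
function makeContext() {
  const store = new Map();
  return {
    get: (key) => store.get(key),
    set: (key, value) => { store.set(key, value); },
  };
}

// Healthy constants: the defaults used by the simulator function node.
const HEALTHY = { h: 0.008, a: 10, b: 28, c: 8 / 3 };
// "Broken" constants: an illustrative assumption for this sketch only.
const BROKEN_EXAMPLE = { h: 0.008, a: 30, b: 128, c: 28 / 3 };

// What a state-switching node does: overwrite the model constants.
function switchState(ctx, constants) {
  for (const [name, value] of Object.entries(constants)) {
    ctx.set(name, value);
  }
}

// One simulator step that reads constants and state from the context.
function simulatorStep(ctx) {
  const h = ctx.get('h') || 0.008;
  const a = ctx.get('a') || 10;
  const b = ctx.get('b') || 28;
  const c = ctx.get('c') || 8 / 3;
  let x = ctx.get('x') || 2;
  let y = ctx.get('y') || 3;
  let z = ctx.get('z') || 4;
  x += h * a * (y - x);
  y += h * (x * (b - z) - y);
  z += h * (x * y - c * z);
  ctx.set('x', x);
  ctx.set('y', y);
  ctx.set('z', z);
  return { x, y, z };
}

const ctx = makeContext();
switchState(ctx, HEALTHY);
const healthyStep = simulatorStep(ctx);
switchState(ctx, BROKEN_EXAMPLE);
const brokenStep = simulatorStep(ctx); // continues from the healthy state
switchState(ctx, HEALTHY);             // back to the healthy constants
console.log(healthyStep, brokenStep);
```

Switching states changes only the constants; the x, y, z state carries over, so the time series stays continuous while its dynamics change.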

Last, but not least, let’s look at how this data travels to the IBM Watson IoT Platform’s MQTT message broker.


You can leave the configuration as it is; Cloud Foundry, running on IBM Cloud, injects the credentials for you.



You’ve successfully deployed a test data simulator creating a time series of events sampled from a physical model. You can also switch between two states (healthy and broken) for anomaly detection and classification.

You can now use this test data to develop deep learning cognitive IoT solutions for anomaly detection with Deeplearning4j, Apache SystemML, and TensorFlow (TensorSpark).