What’s new in the Streaming Analytics service on Bluemix

IBM Streaming Analytics service – Improved Python Support: The combined release of the latest IBM Streaming Analytics service on IBM Bluemix and version 1.6 of the Python Application API introduces a suite of Python language features that greatly ease the development of streaming applications in the cloud. Applications developed with Python 3.5 can now be […] Continue reading What’s new in the Streaming Analytics service on Bluemix

Automatic Fusion and Threading

IBM Streams 4.2 introduces two new features designed to make it easier to attain high performance: automatic submission time fusion and the automatic threading model. Automatic submission time fusion means that you can achieve a reasonable number of PEs for your system without having to manually fuse operators, and the automatic threading model means that you can take advantage of multiple cores per host, without having to manually place threaded ports in your application. Continue reading Automatic Fusion and Threading

How to classify text using the IBM Streams Natural Language Processing (NLP) Toolkit LinearClassification operator?

The goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. Such classifiers work well for practical problems such as document classification. The LinearClassification operator identifies the category of text from streaming data according to a model. It is part of the IBM Streams NLP Toolkit (formerly known as Extension Text Toolkit). This post describes how to use it. Continue reading How to classify text using the IBM Streams Natural Language Processing (NLP) Toolkit LinearClassification operator?
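
As a rough illustration of what "identifying the category of text according to a model" can mean for a linear classifier, the sketch below scores a bag of words against per-class weights and picks the highest score. It is plain Python, not the LinearClassification operator or its model format; the class names, weights, and the `classify` helper are all hypothetical.

```python
# Hypothetical per-class word weights. A real model would be trained offline.
WEIGHTS = {
    "sports": {"match": 1.2, "goal": 1.5, "team": 0.8},
    "finance": {"stock": 1.4, "market": 1.1, "bank": 0.9},
}
BIAS = {"sports": 0.0, "finance": 0.0}


def classify(text):
    """Return (best_label, scores), where each score is bias + sum of word weights."""
    tokens = text.lower().split()
    scores = {
        label: BIAS[label] + sum(weights.get(tok, 0.0) for tok in tokens)
        for label, weights in WEIGHTS.items()
    }
    return max(scores, key=scores.get), scores


label, scores = classify("The team scored a late goal")
print(label, scores)
```

The score is linear in the word counts, which is what makes the classifier "linear"; everything else (tokenization, feature choice, training) is left out of this sketch.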

How to extract text using the IBM Streams Natural Language Processing (NLP) Toolkit RutaText operator?

Text extraction is one way to gain insight into unstructured data such as text or speech transcribed into text. There are different methods for writing text extraction rules. One of them is the UIMA Ruta language. The RutaText operator extracts data from streaming text according to predefined UIMA Ruta rules. It is part of the IBM Streams Natural Language Processing (NLP) Toolkit. This post describes how to use it. Continue reading How to extract text using the IBM Streams Natural Language Processing (NLP) Toolkit RutaText operator?

Rules based processing in real-time streaming applications

Introduction: In IBM Streams 4.2, we have added support for authoring rules compatible with the Operational Decision Manager (ODM) product in Streams Studio, converting them to an SPL composite, and using them for real-time analysis in IBM Streams. In this tutorial, we will develop a sample application that will demonstrate each of the steps and walk […] Continue reading Rules based processing in real-time streaming applications

IBM Streams Network Toolkit Overview

IBM Streams 4.2 brings many new capabilities to customers, especially in the area of specialized toolkits. One of the new toolkits released along with IBM Streams 4.2 is the streamsx.network toolkit. The network toolkit enables SPL applications to analyze low-level network packets, such as parsing DHCP, DNS, Netflow, and IPFIX messages, enriching IPV4 and […] Continue reading IBM Streams Network Toolkit Overview

Cybersecurity Toolkit – What’s New!

The Cybersecurity Toolkit provides operators that are capable of analyzing DNS response records. The operators in this toolkit use machine learning models to analyze DNS traffic and report on suspicious behaviour. The Cybersecurity Toolkit v2.0.0 includes new operators to further allow users to detect suspicious behaviour in their network as […] Continue reading Cybersecurity Toolkit – What’s New!

IBM Streams @ World of Watson 2016

IBM Streams @WoW 2016. Monday, October 24, 2016 (session, time and room, speakers, session #): Real-Time Analytics for Fast Data and IoT with Business Rules, Python and Advanced Analytics; 8:00am-8:45am, South Pacific H; Mike Spicer, IBM; ALY-1632. Linear Road Streaming Benchmark: Paving Ways for High-Performance Analytics and Insights at Walmart; 11:00am-11:45am, Islander D; Sung Kim/Walmart, Roger Rea/IBM; ALY-1384. Cognitive […] Continue reading IBM Streams @ World of Watson 2016

Streams Console 4.2: Secure Configuration Repository – RabbitMQ Example

New for the 4.2 release of Streams is a feature giving users the ability to securely store application specific configuration information or parameters that can be used during application authentication or connection time. These objects, called application configurations, can now be stored securely in a Streams domain. This article will demonstrate how to create and […] Continue reading Streams Console 4.2: Secure Configuration Repository – RabbitMQ Example

Configuring Kerberos authentication for Streamtool, Domain Manager, and Streams Console

In this tutorial, you will learn how to configure Kerberos authentication for three IBM Streams interfaces for the current user. The interfaces are: Streams Console, Streamtool, and Domain Manager. Prior to Streams 4.2, login modules and client certificates could be used to customize user authentication for an enterprise domain. Now, Kerberos is an option. Kerberos […] Continue reading Configuring Kerberos authentication for Streamtool, Domain Manager, and Streams Console

Streams Console 4.2 – ZooKeeper Health and Metrics

The Streams 4.2 console includes increased monitoring capabilities for ZooKeeper ensembles. The console now displays information about the ZooKeeper ensemble in the domain and allows the user to monitor the health and metrics of each ZooKeeper node in the ensemble. What’s new? The ZooKeeper Health Analysis dialog displays metrics and health statistics for all nodes […] Continue reading Streams Console 4.2 – ZooKeeper Health and Metrics

Bluemix Streaming Analytics Development Guide

This guide will help you through the process of building, submitting, and monitoring a streaming analytics application using the Streaming Analytics service on IBM Bluemix. In this guide you will learn how to download and set up the IBM Streams Quick Start Edition VM to use as a development environment. A sample application is provided to download, build, and run in the cloud. Continue reading Bluemix Streaming Analytics Development Guide

Hourly Moving Average — Streams makes it simple

In a recent discussion around using Streams, the following use case was considered problematic for an existing system. Given a set of devices that produce metrics, calculate the hourly moving average of the metric per device. In Streams this is very simple and a sample application took about 15 minutes to construct. Continue reading Hourly Moving Average — Streams makes it simple
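
To make the use case concrete, here is a minimal sketch of a per-device hourly moving average in plain Python. It is not the SPL application from the post; the class name, the `(device, timestamp, value)` input shape, and the assumption that timestamps arrive in non-decreasing order per device are all illustrative choices.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # one hour


class HourlyMovingAverage:
    """Keeps a sliding one-hour window of readings for each device."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.readings = defaultdict(deque)  # device -> deque of (timestamp, value)
        self.sums = defaultdict(float)      # device -> running sum of the window

    def add(self, device, timestamp, value):
        """Add a reading and return the device's current moving average."""
        window = self.readings[device]
        window.append((timestamp, value))
        self.sums[device] += value
        # Evict readings that are a full window or more older than the newest one.
        while window[0][0] <= timestamp - self.window_seconds:
            _, old_value = window.popleft()
            self.sums[device] -= old_value
        return self.sums[device] / len(window)


avg = HourlyMovingAverage()
print(avg.add("sensor-a", 0, 10.0))
print(avg.add("sensor-a", 1800, 20.0))
print(avg.add("sensor-a", 3600, 30.0))
```

Keeping a running sum per device means each reading is processed in amortized constant time, which is the same partition-by-key, sliding-window idea the excerpt describes.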

Streaming Analytics Airport Sentiment Demo

This article describes a demo application that runs on the Bluemix Streaming Analytics Service in the cloud. It uses a Streams Application to read from the FAA website to get airport weather and delay information. It retrieves tweets from the IBM Insights for Twitter Bluemix service. It uses Streams text analytic capabilities to categorize the area the tweets are related to, such as "baggage" or "maintenance". Continue reading Streaming Analytics Airport Sentiment Demo

Detect Active Threats in Real-time: Streams Cybersecurity Toolkit

Streams is an ideal platform for providing cybersecurity analytics, and a new toolkit, the Cybersecurity Toolkit, has been added in the Streams 4.1 release. This toolkit provides the building blocks that enable developers and cybersecurity analysts to gain insight into their networks in real time. Continue reading Detect Active Threats in Real-time: Streams Cybersecurity Toolkit

IBM Streams V4.1 and User Authentication with Client Certificates

Scott Timmerman is a member of the IBM Streams development team. In his presentation, Scott provides an introduction to user authentication with client certificates, discusses public key infrastructure terms and concepts, and demonstrates how to configure Streams to authenticate using client certificates. Continue reading IBM Streams V4.1 and User Authentication with Client Certificates

Introduction to the Spark MLLib Toolkit in IBM Streams V4.1

Ankit Pasricha is the team lead of the IBM Streams Toolkit development team. In his presentation, Ankit will introduce the new Spark MLLib Toolkit that is available in IBM Streams V4.1. This toolkit combines the power of Spark MLLib and the real-time streaming capabilities of Streams. Continue reading Introduction to the Spark MLLib Toolkit in IBM Streams V4.1

Streams 4.1 — Info

This page contains a list of Streams V4.1 documents, articles and videos. Check back for updates over the next couple of weeks as we add more materials. Getting started: There are several quick start guides available here. Getting started with application dashboards; Getting Started with the Spark MLLib Toolkit; Setting up IBM Streams v4.1 using […] Continue reading Streams 4.1 — Info

Introduction to the Bluemix Streaming Analytics Service

IBM Streaming Analytics is available on Bluemix (www.bluemix.net). Streaming Analytics is built upon the IBM Streams technology. Streams is an advanced analytic platform allowing user-developed applications to quickly ingest, analyze, and correlate information as it arrives from a wide variety of real-time sources. The Streaming Analytics service gives you the ability to deploy Streams applications to run in the Bluemix cloud. Continue reading Introduction to the Bluemix Streaming Analytics Service

Extending Streams Functionality with Native Functions

This post demonstrates how to write C++ native functions to add functionality to Streams. When we need to wrap a library so that we can use it from SPL, there are two options. One option is to add a new primitive operator. But an alternate choice is to add a native function. A native function is an SPL function where the code is written in C++ or Java. Continue reading Extending Streams Functionality with Native Functions

Predicting the Future in a Streams Application

Time series forecasting is a very broad subject. The ability to forecast future values is applicable in areas such as sales forecasting, stock market analysis and utilities forecasting (e.g. energy consumption). Forecasting can be a complicated subject as there are many different forecasting algorithms, each with properties that make it useful only in specific circumstances. This article demonstrates how to easily introduce forecasting into an application using the AutoForecaster operator. Continue reading Predicting the Future in a Streams Application

Integrating with Cloudant and many other RESTful Services

Streams integrates with other technologies using adapters to popular protocols such as TCP, ODBC, Kafka, JMS, MQTT and HDFS, just to name a few. REST is another established protocol that is gaining popularity because of its use in many cloud-based services. This article describes how to use Streams HTTP adapters to integrate SPL applications to Cloudant and other RESTful, web-based services. Continue reading Integrating with Cloudant and many other RESTful Services

Java Application API — An Introduction

The Java Application API allows streaming applications to be written in Java for IBM Streams. Tuples on a stream can be any Java object that is serializable. A stream, represented by the interface TStream, is processed using a functional programming style, where a function transforms the stream by being called on each tuple, and the returned value drives the contents of the new stream. Continue reading Java Application API — An Introduction

Parallelized File Processing with the Parse Operator

Reading from an external source, such as the network or filesystem, is often a performance bottleneck. When source operators are the performance bottleneck for a streaming application, we tend to blame the read from the external source. But that is not always the case. Particularly for large tuples with many attributes, the actual performance bottleneck can be parsing. Continue reading Parallelized File Processing with the Parse Operator
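
The idea of separating a cheap read stage from a more expensive parse stage, and then running the parse stage in parallel, can be sketched in plain Python as follows. This is not the Parse operator or SPL; the function names and the three-attribute CSV schema are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def read_lines(raw_lines):
    """Cheap source stage: pass non-empty lines along without parsing them."""
    for line in raw_lines:
        if line.strip():
            yield line


def parse_line(line):
    """Expensive stage: turn one raw CSV line into typed attributes (hypothetical schema)."""
    device, timestamp, value = line.strip().split(",")
    return device, int(timestamp), float(value)


def parse_in_parallel(raw_lines, workers=4):
    """Fan the parse stage out over a pool of workers."""
    # Executor.map preserves input order, so downstream sees tuples in source order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_line, read_lines(raw_lines)))


print(parse_in_parallel(["a,1,2.5\n", "\n", "b,2,3.0\n"]))
```

A thread pool keeps the sketch short and shows the structure. In CPython, CPU-bound parsing usually needs a process pool (or a runtime without a global interpreter lock) to see an actual speedup.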

Text Analytics To Go

In this article, I'm going to give you two simple applications that can serve as starting points for Text Analytics applications on Streams. The first example will use BigInsights Text Analytics to do normalization of terms. The second example will show how to tokenize, both with the simple SPL function, and using the more full-featured BigInsights Text Analytics. Continue reading Text Analytics To Go

Geospatial Toolkit Hands-on Lab Solution: Part 2

Application graph. SPL code: The code below is formatted as generated by the graphical editor. I only changed the following preferences (in SPL editor, right-click > Preferences…): In General > Editors > Text Editors: set Displayed tab width: to 3 and check Insert spaces for tabs. In InfoSphere Streams > SPL > Formatter: set Maximum line width (in characters): […] Continue reading Geospatial Toolkit Hands-on Lab Solution: Part 2

Geofence – Smart Marketing

In Streams 4.0, the geospatial toolkit introduced a new operator called Geofence.  The Geofence operator allows you to dynamically add or remove geographical regions of interest.  As entities move in and out of these regions, the operator will provide entry and exit events. This video demonstrates how you can use the Geofence operator to run […] Continue reading Geofence – Smart Marketing
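
The behaviour described here (regions added and removed at run time, with entry and exit events as entities move) can be sketched in plain Python. This is not the Geofence operator: the class and method names are hypothetical, and regions are simplified to circles on a flat plane rather than geographic polygons.

```python
import math


class Geofence:
    """Tracks which circular regions each entity is inside and reports changes."""

    def __init__(self):
        self.regions = {}  # region id -> (center_x, center_y, radius)
        self.inside = {}   # entity id -> set of region ids currently containing it

    def add_region(self, region_id, x, y, radius):
        self.regions[region_id] = (x, y, radius)

    def remove_region(self, region_id):
        # Simplification: removing a region drops it silently, without exit events.
        self.regions.pop(region_id, None)
        for current in self.inside.values():
            current.discard(region_id)

    def update(self, entity_id, x, y):
        """Record a new position and return the entry/exit events it causes."""
        now = {rid for rid, (cx, cy, r) in self.regions.items()
               if math.hypot(x - cx, y - cy) <= r}
        before = self.inside.get(entity_id, set())
        events = [("entry", entity_id, rid) for rid in sorted(now - before)]
        events += [("exit", entity_id, rid) for rid in sorted(before - now)]
        self.inside[entity_id] = now
        return events


fence = Geofence()
fence.add_region("store", 0.0, 0.0, 5.0)
print(fence.update("car", 10.0, 0.0))
print(fence.update("car", 3.0, 0.0))
print(fence.update("car", 9.0, 0.0))
```

Events are produced only when membership changes, so an entity that stays inside a region generates no further output until it leaves.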

Multi-host environment: Installing to each host and setting up a domain

Getting your hosts installed and set up in a domain using InfoSphere Streams 4.0 is very easy. This article will walk through installing the product to each host in your multi-host environment and setting up the InfoSphere Streams services to automatically restart when the host reboots after a host failure. Continue reading Multi-host environment: Installing to each host and setting up a domain

Streams 4.0 — Streams for Microsoft Excel

One of the new features introduced in Streams 4.0 is Streams for Microsoft Excel. This feature allows an Excel user to quickly and easily identify and access streaming data, to enable analysis and visualization on continually updating data with the full power of Excel. Streams for Excel is an Excel add-in which uses the Excel […] Continue reading Streams 4.0 — Streams for Microsoft Excel

Migration to Streams 4.0 for Toolkits Developer

The document Migration to Streams 4.0 for SPL Developer explains the steps required to migrate an existing SPL application to Streams v4. The Application Bundle feature also affects custom operators and toolkits, because it requires operators to handle files differently at compile time and runtime. This document will describe the steps to migrate a custom […] Continue reading Migration to Streams 4.0 for Toolkits Developer

Migration to Streams 4.0 for SPL Developer

In Streams V4, some of the major enhancements to the product have an impact on SPL application migration. In this document, we are going to discuss what’s changed, and how it affects your SPL application. We will also discuss the steps required to successfully migrate your SPL application from a previous release to Streams V4. Refer to […] Continue reading Migration to Streams 4.0 for SPL Developer

Migration to Streams 4.0 for Streams Integration Developers

If you have applications that use the REST API, you will need to make changes to certificate validation.  For information about these changes, see http://www.ibm.com/support/knowledgecenter/SSCRJU_4.0.0/com.ibm.streams.install.doc/doc/ibminfospherestreams-migrating-applications-rest.html. Streams 4.0 introduces a set of Java™ Management Extensions (JMX) APIs to provide programmatic access to configuration and status information for Streams objects, such as a domain and its instances, […] Continue reading Migration to Streams 4.0 for Streams Integration Developers

Tool for upgrading InfoSphere Streams Version 3.2.1 Instances to Version 4.0.0.0

Upgrading InfoSphere Streams Version 3.2.1 Instances to Version 4.0.0.0: As part of the new functions and features, InfoSphere Streams Version 4.0.0.0 has introduced the concept of a domain. An InfoSphere Streams domain is a container for InfoSphere Streams instances which provides a single point for configuring and managing common resources, security, and instances. A domain […] Continue reading Tool for upgrading InfoSphere Streams Version 3.2.1 Instances to Version 4.0.0.0

Using Streams Studio to develop applications with consistent regions

As announced in the What’s New in Streams V4 post, one of the key features in InfoSphere Streams V4 is application resiliency. The concept of consistent regions was introduced, which provides an SPL application the ability to recover from failures and guarantee at-least-once tuple processing. This is described in more detail in this post. This […] Continue reading Using Streams Studio to develop applications with consistent regions

How to submit a Consistent Region application using Redis as the checkpoint store

If you are planning to use Consistent Region Applications in Streams 4.0, you will need to set up the Checkpoint Repository Store. There are two options: file system or Redis. If your Streams installation does not use a shared file system, you will need to use Redis. Also, you might prefer to use Redis to take […] Continue reading How to submit a Consistent Region application using Redis as the checkpoint store

How to set up Redis replication with InfoSphere Streams 4.0

In InfoSphere Streams 4.0, you can use Redis replication as a part of your HA strategy. Although Redis itself supports master-slave replication (http://redis.io/topics/replication), this is NOT supported in Streams. Streams needs to manage the replication instead of using Redis’ own replication. Below is an example of how to set this up. 1. First you will […] Continue reading How to set up Redis replication with InfoSphere Streams 4.0

Streams 4.0 — Operating without a shared filesystem

Streams 4.0 allows users to operate without a shared filesystem. This article highlights the differences between running with and without a shared filesystem as it pertains to installation, building an application, and application development. In the application development portion, we go through an example of how to migrate an application from reliance on a shared […] Continue reading Streams 4.0 — Operating without a shared filesystem

Migration Information for Streams 4.0

InfoSphere Streams Version 4.0 is a major new release with significant advances in high availability and ease of use. This release includes a number of new features that make InfoSphere Streams simpler to manage and more resilient, as well as providing integration with Microsoft Excel. This release also adds new and improved analytics and connectivity […] Continue reading Migration Information for Streams 4.0

Streams 4.0 — Managing Instances in the Domain Console

The redesign of the administrator console for the Streams 4.0 release allows Streams users to quickly identify problem areas in the domain. They can act rapidly to resolve these issues, as well as perform general tasks like creating instances and monitoring Streams objects. This article is a follow-up to the “Navigating in the Domain Console” article […] Continue reading Streams 4.0 — Managing Instances in the Domain Console

Bandpass and bandstop filters using the DSPFilter operator

The DSPFilter operator implements a Butterworth filter and can be used to isolate frequencies in a time series. For example, a low pass filter can be used to reject all frequencies above a certain point (this point is referred to as the cut-off frequency). Likewise, a high pass filter can be used to reject frequencies […] Continue reading Bandpass and bandstop filters using the DSPFilter operator
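
For readers who want to see the low-pass case in code, the sketch below implements a first-order Butterworth low-pass filter in plain Python, discretized with the bilinear transform. It is not the DSPFilter operator, whose filter order and parameters are not given in this excerpt; the function name and arguments are hypothetical.

```python
import math


def butterworth_lowpass_1(samples, cutoff_hz, sample_rate_hz):
    """First-order Butterworth low-pass via the bilinear transform."""
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)  # pre-warped cutoff
    b = k / (1.0 + k)          # feed-forward coefficient (b0 == b1)
    a = (k - 1.0) / (k + 1.0)  # feedback coefficient
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Difference equation: y[n] = b*(x[n] + x[n-1]) - a*y[n-1]
        y = b * (x + prev_x) - a * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


steady = butterworth_lowpass_1([1.0] * 200, cutoff_hz=10.0, sample_rate_hz=100.0)
print(round(steady[-1], 6))
```

A constant (0 Hz) input passes through with a gain of one, while a signal alternating at half the sample rate is rejected, which matches the description of a low-pass filter above. Higher-order and bandpass or bandstop variants follow the same pattern with more coefficients.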

Example: Analyzing Weather Data using Windowing

My goal with this article is to provide a working example of how to build a Streams Application. In this example, I will be building a Streams application to calculate the average surface temperature and wind speed over a period of time. I will be using a CSV file that contains the temperature and wind […] Continue reading Example: Analyzing Weather Data using Windowing

Application High Availability in IBM® InfoSphere® Streams with Active Replicas – Part 2

This article outlines a technique for supporting redundant flows that only generate a single external effect. The technique is described in terms of an application that generates alerts using text messages (SMS) with a theoretical SMSSink operator, though it can be applied to any sink operator. Continue reading Application High Availability in IBM® InfoSphere® Streams with Active Replicas – Part 2
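
The excerpt does not spell out the technique, so the following is only one generic way to get a single external effect from redundant flows: suppress duplicates at the sink by alert identifier. The `DedupSink` class, the `alert_id` field, and the `send` callback standing in for the SMS call are all hypothetical and are not taken from the article.

```python
class DedupSink:
    """Forwards each alert once, even if several replicas deliver it."""

    def __init__(self, send):
        self.send = send   # the single external effect, e.g. an SMS call
        self.seen = set()  # alert ids already sent (unbounded in this sketch)

    def submit(self, replica, alert_id, message):
        """Return True if the alert was sent, False if it was a duplicate."""
        if alert_id in self.seen:
            return False   # another replica already produced this alert
        self.seen.add(alert_id)
        self.send(message)
        return True


sent = []
sink = DedupSink(sent.append)
sink.submit("replica-1", 42, "temperature high")
sink.submit("replica-2", 42, "temperature high")
print(sent)
```

A production version would need to bound or expire the set of seen identifiers and decide what happens if the sink itself fails; both are outside the scope of this sketch.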

Application High Availability in IBM® InfoSphere® Streams with Active Replicas – Part 1

Overview: An SPL application submitted to an IBM® InfoSphere® Streams instance represents a dataflow graph or flow, processing continuous data streams. Streams provides capabilities to restart processing elements that have failed due to host or process failure. However, while the processing element […] Continue reading Application High Availability in IBM® InfoSphere® Streams with Active Replicas – Part 1