Automatic Fusion and Threading

IBM Streams 4.2 introduces two new features designed to make it easier to attain high performance: automatic submission-time fusion and the automatic threading model. Automatic submission-time fusion means that you can achieve a reasonable number of PEs for your system without having to manually fuse operators, and the automatic threading model means that you can take advantage of multiple cores per host without having to manually place threaded ports in your application.

How to classify text using the IBM Streams Natural Language Processing (NLP) Toolkit LinearClassification operator?

The goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. Such classifiers work well for practical problems such as document classification. The LinearClassification operator identifies the category of text from streaming data according to a model. It is part of the IBM Streams NLP Toolkit (formerly known as Extension Text Toolkit). This post describes how to use it.
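
To make the idea of classifying text "according to a model" concrete, here is a minimal Python sketch of a linear classifier over word counts. It is not the LinearClassification operator or its model format; the class names, weights, and the `classify` function are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical per-class word weights (illustrative values, not a trained model).
WEIGHTS = {
    "sports": {"match": 1.2, "team": 1.0, "goal": 0.8},
    "finance": {"stock": 1.3, "market": 1.1, "bank": 0.9},
}
# Hypothetical per-class bias terms.
BIAS = {"sports": 0.0, "finance": 0.0}


def classify(text):
    """Return the class with the highest linear score: bias + sum(weight * word count)."""
    counts = Counter(text.lower().split())
    scores = {
        label: BIAS[label] + sum(w * counts[token] for token, w in weights.items())
        for label, weights in WEIGHTS.items()
    }
    return max(scores, key=scores.get)
```

Each class gets a score that is a weighted sum of the text's characteristics (here, word counts), and the text is assigned to the class with the highest score. A real model would learn the weights from labelled documents.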

How to extract text using the IBM Streams Natural Language Processing (NLP) Toolkit RutaText operator?

Text extraction is one way to gain insight into unstructured data such as text or speech transcribed into text. There are different methods for writing text extraction rules. One of them is the UIMA Ruta language. The RutaText operator extracts data from streaming text according to predefined UIMA Ruta rules. It is part of the IBM Streams Natural Language Processing (NLP) Toolkit. This post describes how to use it.

Rules based processing in real-time streaming applications

In IBM Streams 4.2, we have added support for authoring rules compatible with the Operational Decision Manager (ODM) product in Streams Studio, converting them to an SPL composite and using them for real-time analysis in IBM Streams. In this tutorial, we will develop a sample application that will demonstrate each of the steps and walk […]

IBM Streams Network Toolkit Overview

IBM Streams 4.2 brings many exciting new capabilities to the customer, especially in the area of specialized toolkits. One of the new toolkits released along with IBM Streams 4.2 is the streamsx.network toolkit. The network toolkit enables SPL applications to analyze low-level network packets, for example by parsing DHCP, DNS, NetFlow, and IPFIX messages, enriching IPv4 and […]

Cybersecurity Toolkit – What’s New!

The Cybersecurity Toolkit provides operators that are capable of analyzing DNS response records. The operators in this toolkit use machine learning models to analyze DNS traffic and report on suspicious behaviour. The Cybersecurity Toolkit v2.0.0 includes new operators to further allow users to detect suspicious behaviour in their network as […]

Detect Active Threats in Real-time: Streams Cybersecurity Toolkit

Streams is an ideal platform for providing cybersecurity analytics, and the Streams 4.1 release adds a new toolkit called the Cybersecurity Toolkit. This toolkit provides the building blocks that enable developers and cybersecurity analysts to gain insight into their networks in real time.

Text Analytics To Go

In this article, I'm going to give you two simple applications that can serve as starting points for Text Analytics applications on Streams. The first example will use BigInsights Text Analytics to do normalization of terms. The second example will show how to tokenize, both with the simple SPL function and with the more full-featured BigInsights Text Analytics.

Application High Availability in IBM® InfoSphere® Streams with Active Replicas – Part 1

An SPL application submitted to an IBM® InfoSphere® Streams instance represents a dataflow graph or flow, processing continuous data streams. Streams provides capabilities to restart processing elements that have failed due to host or process failure. However, while the processing element […]

Application High Availability in IBM® InfoSphere® Streams with Active Replicas – Part 2

This article outlines a technique for supporting redundant flows that only generate a single external effect. The technique is described in terms of an application that generates alerts using text messages (SMS) with a theoretical SMSSink operator, though it can be applied to any sink operator.

Bandpass and bandstop filters using the DSPFilter operator

The DSPFilter operator implements a Butterworth filter and can be used to isolate frequencies in a time series. For example, a low-pass filter can be used to reject all frequencies above a certain point (this point is referred to as the cut-off frequency). Likewise, a high-pass filter can be used to reject frequencies […]

Example: Analyzing Weather Data using Windowing

My goal with this article is to provide a working example of how to build a Streams application. In this example, I will be building a Streams application to calculate the average surface temperature and wind speed over a period of time. I will be using a CSV file that contains the temperature and wind […]
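
As a rough illustration of averaging over a period of time, the following Python sketch groups CSV readings into fixed-length (tumbling) windows and averages each window. It is not the SPL application from the article; the column names, the sample data, and the `window_averages` function are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the column names are assumptions, not the article's file layout.
SAMPLE = """timestamp,temperature,wind_speed
0,10.0,4.0
30,12.0,6.0
60,20.0,8.0
90,22.0,10.0
"""


def window_averages(csv_text, window_seconds=60):
    """Average temperature and wind speed per tumbling window of window_seconds."""
    # Maps window start time -> [temperature sum, wind speed sum, reading count].
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = (int(row["timestamp"]) // window_seconds) * window_seconds
        acc = sums[start]
        acc[0] += float(row["temperature"])
        acc[1] += float(row["wind_speed"])
        acc[2] += 1
    return {start: (t / n, w / n) for start, (t, w, n) in sorted(sums.items())}
```

Calling `window_averages(SAMPLE)` groups the four readings into two 60-second windows and returns one (average temperature, average wind speed) pair per window.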

Geofence – Smart Marketing

In Streams 4.0, the geospatial toolkit introduced a new operator called Geofence. The Geofence operator allows you to dynamically add or remove geographical regions of interest. As entities move in and out of these regions, the operator will provide entry and exit events. This video demonstrates how you can use the Geofence operator to run […]

How to submit a Consistent Region application using Redis as the checkpoint store

If you are planning to use consistent region applications in Streams 4.0, you will need to set up the checkpoint repository store. There are two options: file system or Redis. If your Streams installation does not use a shared file system, you will need to use Redis. Also, you might prefer to use Redis to take […]

Migration Information for Streams 4.0

InfoSphere Streams Version 4.0 is a major new release with significant advances in high availability and ease of use. This release includes a number of new features that make InfoSphere Streams simpler to manage and more resilient, as well as providing integration with Microsoft Excel. This release also adds new and improved analytics and connectivity […]

Migration to Streams 4.0 for SPL Developer

In Streams V4, some of the major enhancements to the product have an impact on SPL application migration. In this document, we are going to discuss what’s changed and how it affects your SPL application. We will also discuss the steps required to successfully migrate your SPL application from a previous release to Streams V4. Refer to […]

Migration to Streams 4.0 for Streams Integration Developers

If you have applications that use the REST API, you will need to make changes to certificate validation. For information about these changes, see http://www.ibm.com/support/knowledgecenter/SSCRJU_4.0.0/com.ibm.streams.install.doc/doc/ibminfospherestreams-migrating-applications-rest.html. Streams 4.0 introduces a set of Java™ Management Extensions (JMX) APIs to provide programmatic access to configuration and status information for Streams objects, such as a domain and its instances, […]

Migration to Streams 4.0 for Toolkits Developer

The document Migration to Streams 4.0 for SPL Developer explains the steps required to migrate an existing SPL application to Streams v4. The Application Bundle feature also affects custom operators and toolkits, because it requires operators to handle files differently at compile time and runtime. This document will describe the steps to migrate a custom […]

Streams 4.0 — Streams for Microsoft Excel

One of the new features introduced in Streams 4.0 is Streams for Microsoft Excel. This feature allows an Excel user to quickly and easily identify and access streaming data, to enable analysis and visualization on continually updating data with the full power of Excel. Streams for Excel is an Excel add-in which uses the Excel […]

Using Streams Studio to develop applications with consistent regions

As announced in the What’s New in Streams V4 post, one of the key features in InfoSphere Streams V4 is application resiliency. The concept of consistent regions was introduced, which provides an SPL application the ability to recover from failures and guarantee at-least-once tuple processing. This is described in more detail in this post. This […]