This document summarizes the toolkits that are shipped as part of the Streams product.
Although toolkits marked as open source are included in the product, you can download newer releases, request or contribute fixes, and get support from each toolkit’s project page on GitHub. This allows you to take advantage of the latest enhancements and features without having to update Streams.
The following toolkits are included as part of the product as of Streams 4.3.1:
Topology Toolkit (com.ibm.streamsx.topology)
The Topology toolkit allows you to create Streams applications entirely in Java and Python, without having to learn SPL. It is developed as part of the GitHub open-source project:
Mail Toolkit (com.ibm.streamsx.mail)
The open-source streamsx.mail toolkit enables sending and receiving mail. It supports the SMTP and IMAP protocols. Project on GitHub:
PMML Toolkit (com.ibm.streams.pmml)
The Predictive Model Markup Language (PMML) toolkit enables real-time scoring of machine learning models stored in PMML files. It also supports using a model downloaded from the Watson Machine Learning service.
Avro Toolkit (com.ibm.streamsx.avro)
Datetime Toolkit (com.ibm.streamsx.datetime)
This toolkit provides additional functionality for working with dates and times in SPL, and is part of the GitHub open-source project:
JDBC Toolkit (com.ibm.streamsx.jdbc)
This toolkit allows Streams applications to connect to databases via JDBC. It is developed as part of the GitHub open-source project:
JSON Toolkit (com.ibm.streamsx.json)
This toolkit provides operators and functions for converting data to and from JSON. It is developed as part of the GitHub open-source project:
Object Storage Toolkit (com.ibm.streamsx.objectstorage)
Use this toolkit to connect to Object Storage services such as the IBM Cloud Object Storage service.
It is part of the GitHub open-source project: http://ibmstreams.github.io/streamsx.objectstorage
Network Toolkit (com.ibm.streamsx.network)
The Network toolkit provides operators and functions for working with network data, allowing you to ingest live or recorded network traffic, parse DHCP and DNS messages, and more. This toolkit is developed as part of the GitHub open-source project:
RabbitMQ Toolkit (com.ibm.streamsx.rabbitmq)
Use this toolkit to integrate with RabbitMQ. It is also an open-source project: http://ibmstreams.github.io/streamsx.rabbitmq
JMS Toolkit (com.ibm.streamsx.jms)
The JMS toolkit provides operators and functions that help you use IBM Streams to interact with JMS systems such as WebSphere MQ or Apache ActiveMQ. This toolkit is developed as part of the GitHub open-source project: http://ibmstreams.github.io/streamsx.jms/
MessageHub Toolkit (com.ibm.streamsx.messagehub)
The Message Hub toolkit allows you to connect to IBM Event Streams (formerly IBM Message Hub), IBM’s Kafka-as-a-Service offering. This toolkit is developed as part of the GitHub open-source project: http://ibmstreams.github.io/streamsx.messagehub/
IOT Toolkit (com.ibm.streamsx.iot)
This toolkit allows Streams applications to process data from IoT devices that are connected to the IBM Watson IoT Platform and other IoT Foundation services. It is developed as part of the GitHub open-source project: http://ibmstreams.github.io/streamsx.iot
Complex Event Processing Toolkit (com.ibm.streams.cep)
This toolkit provides the ‘MatchRegex’ operator to perform complex event processing in Streams. This allows you to use patterns, expressed as regular expressions, to detect composite events in a stream of tuples.
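The idea of matching a regular expression over a stream of tuples can be illustrated with a small Python sketch. This is not the MatchRegex operator or its syntax; the function names, the U/D/S symbols, and the sample prices are all hypothetical. Each tuple is mapped to a one-letter symbol by a predicate, and an ordinary regular expression is then run over the resulting symbol sequence.

```python
import re


def classify(prev_price, price):
    """Map a price change to a symbol: U (up), D (down), S (same)."""
    if price > prev_price:
        return "U"
    if price < prev_price:
        return "D"
    return "S"


def find_composite_events(prices, pattern):
    """Return (first, last) inclusive indexes into prices for each match.

    symbols[i] describes the move from prices[i] to prices[i + 1], so a
    match over symbols[s:e] spans the tuples prices[s] .. prices[e].
    """
    symbols = "".join(classify(a, b) for a, b in zip(prices, prices[1:]))
    return [(m.start(), m.end()) for m in re.finditer(pattern, symbols)]


# Hypothetical composite event: at least two consecutive drops, then a rise.
prices = [10, 9, 8, 7, 9, 11, 11, 10]
matches = find_composite_events(prices, r"D{2,}U+")
print(matches)
```

In this sketch the whole sequence is available up front; a streaming operator would instead evaluate the pattern incrementally as tuples arrive.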
Data Explorer Toolkit (com.ibm.streams.dataexplorer)
This toolkit provides an operator that enables stream processing applications to insert data into IBM InfoSphere Data Explorer.
Database Toolkit (com.ibm.streams.db)
This toolkit provides a set of operators that enable your Streams application to integrate with a wide range of external data systems. Supported data systems include IBM DB2, IBM InfoSphere BigInsights, IBM Netezza, Oracle Database, and Teradata Database, among others. The toolkit provides adapters that let you read from and write to the supported data systems. You can also run SQL queries against these systems.
Financial Services Toolkit (com.ibm.streams.financial)
The Financial Services Toolkit provides pre-built solution models, finance-specific edge adapters, and operators to reduce the time and effort needed to develop a wide variety of financial applications.
Geospatial Toolkit (com.ibm.streams.geospatial)
The Geospatial Toolkit includes operators and functions that facilitate efficient processing and indexing of location data. The toolkit provides a set of operators that help you perform tasks such as geofencing and hangout detection. It also provides a rich set of functions for manipulating location data, determining distances, checking whether geometries intersect, and more.
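To make the distance and geofencing tasks concrete, here is a minimal Python sketch. It does not use the toolkit's own operators or functions; `haversine_m` and `inside_geofence` are hypothetical names, the fence is assumed to be a simple circle, and the Earth is approximated as a sphere.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(point, center, radius_m):
    """True if point (lat, lon) lies within radius_m metres of center."""
    return haversine_m(*point, *center) <= radius_m


# A hypothetical 200 m circular fence around (0, 0).
fence_center = (0.0, 0.0)
print(inside_geofence((0.0, 0.001), fence_center, 200))  # about 111 m away
print(inside_geofence((0.0, 0.01), fence_center, 200))   # about 1.1 km away
```

Real geofences are often polygons rather than circles, in which case a point-in-polygon test would replace the radius check.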
HBase Toolkit (com.ibm.streamsx.hbase)
The HBase toolkit provides support for interacting with Apache HBase systems from InfoSphere Streams. It supports reading from and writing to HBase.
This toolkit is developed as part of the GitHub open-source project: http://ibmstreams.github.io/streamsx.hbase/
HDFS Toolkit (com.ibm.streamsx.hdfs)
The HDFS Toolkit provides operators that read data from and write data to the Hadoop Distributed File System (HDFS).
This toolkit is developed as part of the GitHub open-source project: http://ibmstreams.github.io/streamsx.hdfs/
Inet Toolkit (com.ibm.streamsx.inet)
The Inet toolkit provides support for common internet protocols. This toolkit contains a subset of the functions from the Inet Toolkit project on GitHub.
See this link: http://ibmstreams.github.io/streamsx.inet/
MQTT Toolkit (com.ibm.streamsx.mqtt)
The MQTT Toolkit provides integration between Streams and MQTT providers.
This toolkit is developed as part of the GitHub open-source project: https://github.com/IBMStreams/streamsx.mqtt
Messaging Toolkit (com.ibm.streamsx.messaging)
The Messaging Toolkit provides operators that allow your Streams application to read messages from and send messages to popular messaging systems such as WebSphere MQ and Apache ActiveMQ. Support for Kafka, JMS and MQTT is now available via separate toolkits.
This toolkit is developed as part of the GitHub open-source project: https://github.com/IBMStreams/streamsx.messaging
Mining Toolkit (com.ibm.streams.mining)
The Mining Toolkit includes operators that you can use to mine data streams by applying models. Stream mining requires applying models that were learned from historical data to streaming data in order to detect patterns of interest in real time. The Mining Toolkit supports scoring complex models against real-time data using the Predictive Model Markup Language (PMML) standard.
R-Project Toolkit (com.ibm.streams.rproject)
The R-project Toolkit includes an RScript operator that you can use to run R commands and apply complex data mining algorithms to detect patterns of interest in data streams. The R scripts can also be updated dynamically on demand, without having to restart your Streams application.
Rules Toolkit (com.ibm.streams.rules)
The Rules toolkit integrates with IBM Operational Decision Manager (ODM). It allows you to execute business rules against streaming data, so that you can make critical business decisions in real time.
For more information about use cases and how Streams and ODM can work together, refer to this technote: Patterns for IBM Operational Decision Manager in Big Data Streams
Telecommunications Event Data Analytics Toolkit (com.ibm.streams.teda)
The Telecommunications Event Data Analytics toolkit provides a set of generic operators that are used in telecommunications applications, and it also provides an application framework that enables you to set up new file-to-file applications. These applications are based on code templates and support customization, configurable parallel processing, graceful application shutdown, and reliable file processing.
Text Toolkit (com.ibm.streams.text)
The Text Toolkit provides an operator that helps you extract and analyse information from text data. The Text Toolkit integrates the Text Analytics component from IBM InfoSphere BigInsights. By using the Text Toolkit, a stream processing application can read text data and derive structured information that is based on various rules. These rules are defined in extractors, which are programs that extract information from within a text field. Extractors are written in AQL.
Time Series Toolkit (com.ibm.streams.timeseries)
This toolkit provides a rich set of operators and functions for processing and analysing time series data. A time series is a sequence of numerical data that represents the value of an object or multiple objects over time. With the Time Series toolkit, you can read, repair or condition a time series in real time. You can also perform statistical analysis, correlation, decomposition and transformation of your time series data. Finally, the toolkit provides a set of data modeling operators that allow you to create statistical models for prediction and regression analysis.
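As a rough illustration of what repairing and conditioning a time series can mean, the Python sketch below fills missing samples by linear interpolation and then smooths the result with a moving average. It does not use the toolkit's operators; `fill_gaps` and `moving_average` are hypothetical helpers, and marking missing samples with `None` is an assumption made for this example.

```python
def fill_gaps(series):
    """Replace None values by linear interpolation between known neighbours.

    Gaps at the start or end have only one known neighbour, so they are
    filled by carrying that nearest known value outward.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        return filled  # nothing to interpolate from
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    return filled


def moving_average(series, window):
    """Simple moving average; the output has len(series) - window + 1 values."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


raw = [1.0, None, 3.0, None, None, 6.0]
repaired = fill_gaps(raw)
smoothed = moving_average(repaired, 3)
print(repaired)
print(smoothed)
```

In a streaming setting the same steps would run over a sliding window of recent samples rather than over a complete list.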
Cybersecurity Toolkit (com.ibm.streams.cybersecurity)
The Cybersecurity toolkit can detect active threats occurring within a network in real time. The toolkit uses machine learning models to analyze and score network traffic. The toolkit contains analytics that can build profiles of either domains or hosts within a network and report on suspicious behaviour in real time. Furthermore, the toolkit can predict whether a domain should be added to a blacklist. The analytics provided as part of the toolkit can be used in tandem with existing forensic security software to enable both online and offline cybersecurity analysis.
Spark MLLib Toolkit (com.ibm.streamsx.sparkmllib)
The Spark MLLib toolkit can load machine learning models saved using Apache Spark’s MLLib library and use them for scoring real-time data in IBM Streams. The toolkit supports a number of machine learning algorithms, including collaborative filtering, K-means clustering, decision trees, and regression models such as isotonic, logistic and linear regression.
This toolkit is developed as part of the GitHub open-source project: https://github.com/IBMStreams/streamsx.sparkMLLib
DPS Toolkit (com.ibm.streamsx.dps)
The Distributed Process Store (DPS) toolkit provides a collection of APIs that allow non-fused SPL, Java and C++ operators to share data by using an external NoSQL key-value store such as Redis.
This toolkit is developed as part of the GitHub open-source project: https://github.com/IBMStreams/streamsx.dps