This is a summary of all IBMStreams open-source projects on GitHub. The main IBMStreams organization page is here:  http://ibmstreams.github.io/ To propose a project on GitHub, open an issue in the administration project here:  https://github.com/IBMStreams/administration

Need Help?

If you need help with any of these projects, please ask your question here:
  Note: streamsLogo indicates a toolkit that is shipped with Streams.

Adapters

Projects that help with integrating Streams with other systems.

streamsLogo HBase Toolkit (streamsx.hbase)

The HBase toolkit provides support for interacting with Apache HBase systems from InfoSphere Streams, including reading from and writing to HBase.  This toolkit has releases that support Streams v3.2.1.x and v4.0.

This toolkit is included as part of the Streams v4.0 product.

Project Page:  http://ibmstreams.github.io/streamsx.hbase/

streamsLogo HDFS Toolkit (streamsx.hdfs)

The HDFS Toolkit provides operators that read data from and write data to the Hadoop Distributed File System (HDFS).  This toolkit has releases that support Streams v3.2.1.x and v4.0.

This toolkit is included as part of the Streams v4.0 product.

Project Page:  http://ibmstreams.github.io/streamsx.hdfs/

streamsLogo Inet Toolkit (streamsx.inet)

The Inet toolkit provides support for common internet protocols.  This toolkit has releases that support Streams v3.2.1.x and v4.0.

Project Page:  http://ibmstreams.github.io/streamsx.inet/

streamsLogo Messaging Toolkit (streamsx.messaging)

The Messaging Toolkit provides operators that allow your Streams application to read messages from and send messages to popular messaging systems, such as Kafka, MQTT, WebSphere MQ, and Apache ActiveMQ.  This toolkit has releases that support Streams v3.2.1.x and v4.0.  The Kafka operators in this toolkit are going to be deprecated; use streamsx.kafka instead.  The RabbitMQ operators in this toolkit are also going to be deprecated; use streamsx.rabbitmq instead.

This toolkit is included as part of the Streams v4.0 product.

Project Page:  https://github.com/IBMStreams/streamsx.messaging

Kafka Toolkit (streamsx.kafka)

This toolkit provides operators that allow your Streams application to read messages from and send messages to Kafka.

Project Page:  https://github.com/IBMStreams/streamsx.kafka

RabbitMQ Toolkit (streamsx.rabbitmq)

This toolkit provides operators that allow your Streams application to read messages from and send messages to RabbitMQ.

Project Page:  https://github.com/IBMStreams/streamsx.rabbitmq

Multi-Connection TCP Server Toolkit (streamsx.tcp)

This toolkit contains a TCPServer operator, a multi-threaded source operator that supports multiple connections.  The operator accepts text or binary data from one or more TCP sockets.

Project Page:  https://github.com/IBMStreams/streamsx.tcp

MongoDB Toolkit (streamsx.mongoDB)

This toolkit provides support for reading streaming data from and writing it to MongoDB.

Project Page:  https://github.com/IBMStreams/streamsx.mongoDB

Thrift Toolkit (streamsx.thrift)

This toolkit provides Thrift server and client functionality.

Project Page:  http://ibmstreams.github.io/streamsx.thrift/

CDC Toolkit (streamsx.cdc)

This toolkit provides support for efficiently reading data from and writing data to InfoSphere CDC (Change Data Capture).

Project Page:  https://github.com/IBMStreams/streamsx.cdc

Graph DB (streamsx.graphdb)

This repository was created for reading from and writing to a graph database.

Project Page: https://github.com/IBMStreams/streamsx.graphdb

streamsLogo JDBC Toolkit (streamsx.jdbc)

This toolkit enables an IBM Streams application to interact with databases via JDBC.

Project Page:  https://github.com/IBMStreams/streamsx.jdbc

Mail Toolkit (streamsx.mail)

This toolkit enables an IBM Streams application to send and receive emails.

Project Page:  https://github.com/IBMStreams/streamsx.mail

Cassandra Toolkit (streamsx.cassandra)

This toolkit enables an IBM Streams application to send data to Cassandra.

Project Page:  https://github.com/IBMStreams/streamsx.cassandra

Object Storage Toolkit (streamsx.objectstorage)

This toolkit provides operators that allow your Streams application to read data from and send data to Object Storage.

Project Page:  https://github.com/IBMStreams/streamsx.objectstorage


Parsers and Formatters

Projects that help with parsing data in popular formats into Streams tuples, or formatting Streams tuple data into other popular formats.

streamsLogo JSON Toolkit (streamsx.json)

The JSON toolkit allows you to convert data from JSON to Streams tuple format, and vice versa.  This toolkit has been tested on Streams 3.2 and later.

Project Page:  https://github.com/IBMStreams/streamsx.json
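
The JSON-to-tuple conversion described above can be illustrated with a minimal Python sketch. This is not the toolkit's API (the toolkit itself provides SPL operators and functions); the Reading type and the json_to_tuple and tuple_to_json helpers are hypothetical names chosen only to show the idea of mapping JSON fields onto a fixed tuple schema and back.

```python
import json
from typing import NamedTuple


class Reading(NamedTuple):
    """Hypothetical tuple schema with two typed attributes."""
    sensor: str
    value: float


def json_to_tuple(line: str) -> Reading:
    # Parse the JSON text and map its fields onto the fixed schema.
    obj = json.loads(line)
    return Reading(sensor=str(obj["sensor"]), value=float(obj["value"]))


def tuple_to_json(reading: Reading) -> str:
    # Serialize the tuple back to JSON, using attribute names as keys.
    return json.dumps(reading._asdict())


if __name__ == "__main__":
    r = json_to_tuple('{"sensor": "a", "value": 1.5}')
    print(r)
    print(tuple_to_json(r))
```

The fixed schema is the point of the sketch: a Streams tuple has a declared type, so a conversion has to decide how each JSON field maps to a typed attribute.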

Document Toolkit (streamsx.document)

This toolkit extracts text and metadata from documents in binary formats such as PDF, Word, and Office.  For this purpose, the toolkit implements a DocumentSource operator.  Some of the supported text extractors are:  Apache Tika, PDFBox, TrueZip, JUnrar, and Plain Text.

Project Page:  https://github.com/IBMStreams/streamsx.document

Bytes Toolkit (streamsx.bytes)

This toolkit eases the development and analysis of applications that work with binary data.  It provides functions to process string data, such as ASCII to HEX and HEX to BIN conversions.  It can also extract raw strings from binary data.

Project Page:  https://github.com/IBMStreams/streamsx.bytes
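
To make the kinds of conversions mentioned above concrete, here is a minimal Python sketch of ASCII to HEX, HEX to BIN, and raw string extraction. It does not use the toolkit and does not mirror its function names; ascii_to_hex, hex_to_bin, and extract_raw_string are hypothetical helpers, and "raw string" is assumed here to mean the printable ASCII bytes.

```python
def ascii_to_hex(text: str) -> str:
    # Encode the text as ASCII bytes and render each byte as two hex digits.
    return text.encode("ascii").hex()


def hex_to_bin(hex_string: str) -> str:
    # Expand every hex digit into its 4-bit binary representation.
    return "".join(format(int(digit, 16), "04b") for digit in hex_string)


def extract_raw_string(data: bytes) -> str:
    # Assumption: keep only printable ASCII bytes (0x20 to 0x7E).
    return "".join(chr(b) for b in data if 32 <= b < 127)


if __name__ == "__main__":
    hexed = ascii_to_hex("Hi")
    print(hexed)
    print(hex_to_bin(hexed))
    print(extract_raw_string(b"\x00Hi\xff!"))
```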

Parquet Toolkit (streamsx.parquet)

Parquet is a columnar storage format for Hadoop.  This repository hosts operators for reading and writing data in Parquet format.

Project Page:  https://github.com/IBMStreams/streamsx.parquet

Adaptive Parser (streamsx.adaptiveParser)

This toolkit includes operators for parsing structured, semi-structured, and unstructured text data.  The goal of this project is to ease the parsing, tuple structure definition, and tuple mapping development steps in a Streams application.

Project Page:  https://github.com/IBMStreams/streamsx.adaptiveParser

Speech2Text (streamsx.speech2text)

This repository is under construction.  This project allows Streams applications to convert speech to text.

Project Page:  https://github.com/IBMStreams/streamsx.speech2text

Avro Toolkit (streamsx.avro)

A toolkit to support serialization and deserialization in Avro format.

Project Page:  https://github.com/IBMStreams/streamsx.avro


Analytics and Processing

Projects that help with processing and analysing data.

streamsLogo Spark MLLib (streamsx.sparkMLLib)

This repository contains a toolkit for real-time scoring using Spark MLLib.

Project Page:  https://github.com/IBMStreams/streamsx.sparkMLLib

SPSS Toolkit

This toolkit contains operators that allow your application to perform real-time scoring using existing SPSS models.  The SPSS model can also be updated dynamically at runtime without restarting the application.

Project Page: https://github.com/IBMPredictiveAnalytics/streamsx.spss.v4

Healthcare Toolkit (streamsx.health)

This toolkit is intended to contain operators and functions related to real-time analytics for healthcare.

Project Page:  https://github.com/IBMStreams/streamsx.health

Watson Explorer Toolkit (streamsx.watsonexplorer)

This toolkit provides operators and functions to extract data and analytics results from IBM Watson Explorer (formerly known as IBM Data Explorer).  This toolkit replaces the IBM Data Explorer toolkit (com.ibm.streams.dataexplorer).  Customers are encouraged to migrate from the Data Explorer toolkit to this Watson Explorer toolkit.

Project Page:  https://github.com/IBMStreams/streamsx.watsonexplorer

Regex Toolkit (streamsx.regex)

This toolkit provides support for the RE2 regular expression library.

Project Page:  http://ibmstreams.github.io/streamsx.regex/

Math Toolkit (streamsx.math)

This repository contains operators and functions for complex mathematics and statistics.

Project Page:  https://github.com/IBMStreams/streamsx.math

streamsLogo Date Time Toolkit (streamsx.datetime)

This toolkit contains additional operators and functions to process dates and times in data.

Project Page:  https://github.com/IBMStreams/streamsx.datetime

Geospatial Extension Toolkit (streamsx.geoext)

This repository is intended to contain a toolkit that provides extended functions and operators to process geospatial data.

Project Page:  https://github.com/IBMStreams/streamsx.geoext

Transportation Toolkit (streamsx.transportation)

This toolkit is intended to contain adapters to access transit feeds, as well as generic transportation-based operators and functions.  For the project proposal, see:  https://github.com/IBMStreams/administration/issues/42

Project Page:  https://github.com/IBMStreams/streamsx.transportation

Logging Toolkit (streamsx.logging)

This repository is under construction.  It is intended to contain operators and functions for analysing log files.  For the project proposal, see:  https://github.com/IBMStreams/administration/issues/45

Project Page:  https://github.com/IBMStreams/streamsx.logging

Social Toolkit (streamsx.social)

This toolkit contains operators and functions for integrating Streams with social media sites.

Project Page:  https://github.com/IBMStreams/streamsx.social

Anomaly Detection Toolkit (streamsx.anomalyDetection)

This repository contains operators and functions for anomaly detection.

Project Page:  https://github.com/IBMStreams/streamsx.anomalyDetection

streamsLogo Internet of Things Toolkit (streamsx.iot)

This toolkit enables an IBM Streams application to interact easily with IoT, either in Bluemix (Streaming Analytics Service) or on-premises (IBM Streams).

Project Page:  https://github.com/IBMStreams/streamsx.iot

Solr Toolkit (streamsx.solr)

This toolkit enables an IBM Streams application to integrate with Apache Solr.

Project Page:  https://github.com/IBMStreams/streamsx.solr

OpenCV Toolkit (streamsx.opencv)

This toolkit enables an IBM Streams application to process video feeds.

Project Page:  https://github.com/IBMStreams/streamsx.opencv

Insights for Weather Toolkit (streamsx.weather)

A toolkit for accessing data from the Insights for Weather Bluemix service.

Project Page:  https://github.com/IBMStreams/streamsx.weather

Natural Language Processing Toolkit (streamsx.nlp)

This toolkit provides operators and functions to extract information from, classify, and analyze text data.  It includes operators with various algorithms for calculating n-grams, lemmatization, and content ranking.  It also includes operators that integrate with Apache UIMA Ruta (Rule-based Text Annotation) scripts or existing project-specific UIMA PEAR files.

Project Page:  https://github.com/IBMStreams/streamsx.nlp


Utilities

Useful utility projects for Streams.

Visualization Toolkit

A project that supports building visualizations for data from IBM Streams applications.

Project Page: https://github.com/IBMStreams/streamsx.visualization

streamsLogo Topology Toolkit

A project that supports building streaming topologies (applications) for IBM Streams in different programming languages, such as Java.

Project Page: http://ibmstreams.github.io/streamsx.topology/

Plumbing Toolkit (streamsx.plumbing)

This toolkit contains an operator that allows your application to dynamically manipulate application tuple flow to achieve best performance.

Project Page: https://github.com/IBMStreams/streamsx.plumbing

Process Store Toolkit (streamsx.ps)

The Process Store toolkit provides a simple way for the SPL and C++ operators that are fused inside a single PE to share application-specific state information.  It does this via a collection of APIs that can be called from any part of the SPL and C++ operator code.

Project Page:  https://github.com/IBMStreams/streamsx.ps

streamsLogo Distributed Process Store Toolkit (streamsx.dps)

The Distributed Process Store toolkit provides a simple way for the SPL, C++, and Java operators belonging to one or more applications to share application state information via an external key-value store.  Some of the supported key-value stores are:  Memcached, Redis, Cassandra, IBM Cloudant, HBase, Mongo, Couchbase, and Aerospike.

Project Page:  https://github.com/IBMStreams/streamsx.dps

Utilities Project (streamsx.utility)

This repository contains useful utilities for InfoSphere Streams.  For example, the repository currently has a utility that displays CPU utilization for PEs in a Streams instance.

Project Page:  https://github.com/IBMStreams/streamsx.utility

streamsLogo Network Toolkit (streamsx.network)

It is intended to contain operators and functions for processing network data.

Project Page:  https://github.com/IBMStreams/streamsx.network

Transform Toolkit (streamsx.transform)

This toolkit contains building-block operators that efficiently transform input data from one format to another.

Project Page:  http://ibmstreams.github.io/streamsx.transform

Shell Toolkit (streamsx.shell)

Utility toolkit to execute shell commands in a Streams application.

Project Page:  https://github.com/IBMStreams/streamsx.shell

PyPI Toolkit (pypi.streamsx)

This project is a step toward making Streams natural to use for Python developers.  It will be registered with PyPI to allow ‘pip install’ of Python packages that help Python developers interact with IBM Streams.

Project Page:  https://github.com/IBMStreams/pypi.streamsx

Metrics Toolkit

This toolkit supports monitoring Streams applications by producing a stream of metrics from one or more running jobs.

Project Page: https://github.com/IBMStreams/streamsx.metrics

 

Samples and Demos

Sample and demo projects that are useful for learning about Streams.

Water Conservation Starter Kit (streamsx.waterConservation.starterKit)

A starter kit for a smart, connected sprinkler system using Apache Quarks, Streaming Analytics, and Insights for Weather.

Project Page:  https://github.com/IBMStreams/streamsx.waterConservation.starterKit

Cybersecurity Starter Applications (streamsx.cybersecurity.starterApps)

This repository contains starter applications to help you get up and running with Streams Cybersecurity Toolkit quickly.

Project Page:  http://ibmstreams.github.io/streamsx.cybersecurity.starterApps/

Samples

This project contains a set of useful sample Streams applications.

Project Page:  https://github.com/IBMStreams/samples

Benchmark (benchmarks)

This repository contains performance benchmark applications for Streams.  The project contains two email processing applications: one written with InfoSphere Streams and the other written with Apache Storm.

These two benchmarks were used as part of a detailed performance report:  https://developer.ibm.com/streamsdev/wp-content/uploads/sites/15/2014/04/Streams-and-Storm-April-2014-Final.pdf

Project Page:  https://github.com/IBMStreams/benchmarks

Accelerometer Demo Project (streamsx.demo.accelerometer)

This repository contains a collection of demo streaming applications for analyzing smartphone accelerometer or gyroscope data.

Project Page:  https://github.com/IBMStreams/streamsx.demo.accelerometer

streamsLogo Resource Manager Project (resourceManagers)

This repository contains projects for getting Streams to work with other resource managers, such as YARN.

Project Page:  https://github.com/IBMStreams/resourceManagers

Mesos Resource Manager Project (streamsx.resourcemanager.mesos)

This repository contains projects on getting Streams to work with Apache Mesos.

Project Page:  https://github.com/IBMStreams/streamsx.resourcemanager.mesos

Log Watch Demo (streamsx.demo.logwatch)

This repository contains a set of applications to demonstrate basic concepts of SPL and Streams, while working through some real-world examples.  The applications are self-contained, small, easy to understand, with well-defined problem statements.

Project Page: https://github.com/IBMStreams/streamsx.demo.logwatch

Patterns Repository (streamsx.patterns)

This toolkit is under construction.  This repository is intended to host pattern classes and common functionality for Java primitive operators.  See this project proposal for details:  https://github.com/IBMStreams/administration/issues/51

Project Page:  https://github.com/IBMStreams/streamsx.patterns


Tutorials

Tutorial projects that are useful for learning about Streams.

Tutorials

This project contains a set of useful sample Streams applications.

Project Page:  https://github.com/IBMStreams/tutorials

TEDA Tutorial

This project contains tutorials for TEDA.

Project Page: https://github.com/IBMStreams/streamsx.tutorial.teda

3 Comments on "GitHub Projects Overview"

  1. Please fix the link to github page for the visualization toolkit. The link is broken.

  2. Leonid Gorelik November 24, 2016

    NgramHashing Toolkit is merged with streamsx.nlp now.
    Also please update AdaptiveParser description, as it’s not only a repository for parsers, but a powerful generic parser by itself.
