The goal of statistical classification is to use an object’s characteristics to identify which class (or group) it belongs to. Such classifiers work well for practical problems such as document classification. The LinearClassification operator assigns a category to text in streaming data according to a model. It is part of the IBM Streams NLP Toolkit (formerly known as the Extension Text Toolkit). This post describes how to use it.
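To make the idea of linear classification concrete, here is a minimal sketch in plain Python. It is not the LinearClassification operator or its model format. The class labels, word weights, and the `classify` helper are all hypothetical and chosen only for illustration. Each class gets a score that is a weighted sum of word counts plus a bias, and the class with the highest score wins.

```python
from collections import Counter

# Hypothetical per-class word weights and biases (illustration only).
# A real model would be trained offline on labeled documents.
WEIGHTS = {
    "sports": {"match": 1.5, "team": 1.2, "goal": 1.0},
    "finance": {"stock": 1.5, "market": 1.2, "bank": 1.0},
}
BIAS = {"sports": 0.0, "finance": 0.0}


def classify(text):
    """Return (best_label, scores) for a piece of text.

    The features are lowercase word counts; each class score is
    bias + sum(weight * count) over the words that class knows about.
    """
    counts = Counter(text.lower().split())
    scores = {
        label: BIAS[label] + sum(w * counts[word] for word, w in weights.items())
        for label, weights in WEIGHTS.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


if __name__ == "__main__":
    label, scores = classify("The team scored a late goal in the match")
    print(label, scores)
```

In a streaming setting the same scoring step would presumably run once per incoming tuple, with the model loaded once at startup rather than hard-coded as it is here.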
Text extraction is one way to gain insight into unstructured data such as text or speech that has been transcribed to text. There are several ways to write text extraction rules. One of them is the UIMA Ruta language.
The RutaText operator extracts data from streaming text according to predefined UIMA Ruta rules. It is part of the IBM Streams Natural Language Processing (NLP) Toolkit. This post describes how to use it.
In IBM Streams 4.2, we have added support for three steps: authoring rules in Streams Studio that are compatible with the Operational Decision Manager (ODM) product, converting those rules to an SPL composite, and using them for real-time analysis in IBM Streams. In this tutorial, we develop a sample application that demonstrates each of these steps and walks you through using rules within your streaming applications.