IBM Streams (“Streams”) enables continuous and fast analysis of massive volumes of moving data to help improve the speed of business insight and decision-making. Streams provides an execution platform and services for user-developed applications that ingest, filter, analyze, and correlate the information in data streams.
With IBM® Streaming Analytics for Bluemix™, you can perform real-time analysis on data in motion as part of your Bluemix application. The Streaming Analytics service is powered by IBM Streams.
This article describes a demo application in which a Streams application reads airport weather and delay information from the FAA website and retrieves tweets from the IBM Insights for Twitter Bluemix service. It uses Streams text analytics capabilities to categorize the subject area of each tweet, such as “baggage” or “maintenance”. The Streams SPL application uses the HTTPTupleView and WebContext operators, which create an embedded Jetty server, to provide data. A Bluemix Liberty application uses a proxy servlet to provide the public web interface and to interact with the Streams Jetty server.
Information about the demo app, along with instructions for deploying and running the demo, can be found at Application Source.
What the Demo Shows
The demo application reads from two different data sources. The first is FAA data provided at http://services.faa.gov/docs/services/airport/#airportStatus. This data includes the airport status for any major airport, including known delays and weather data from NOAA. The second feed uses the IBM Insights for Twitter Bluemix service to retrieve tweets related to the major US airlines. This service provides annotated data for tweets that mention any of the airlines. Each tweet is processed using text analytics operators to produce an Airlines view, a Cities view and a Complaints view. The Complaints view, for example, looks for words such as luggage, baggage, suitcase, lost, wait, delay, delays, cancel, cancellation, service, attendant, food, rude, maintenance, repair, fault and faulty, and groups them into four categories: baggage, delay, service and maintenance. The complaints are aggregated for each airline and airport. The results are displayed in a browser.
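The keyword grouping behind the Complaints view can be sketched in a few lines of Python. This is only an illustration, not the Streams text analytics operators the demo uses. The article lists the keywords and the four categories but not which keyword belongs to which category, so the mapping below is an assumption, and the function name categorize_complaints is hypothetical.

```python
import re

# Assumed grouping: the keyword-to-category assignment is a guess based on
# the order in which the article lists the words.
COMPLAINT_KEYWORDS = {
    "baggage": {"luggage", "baggage", "suitcase", "lost"},
    "delay": {"wait", "delay", "delays", "cancel", "cancellation"},
    "service": {"service", "attendant", "food", "rude"},
    "maintenance": {"maintenance", "repair", "fault", "faulty"},
}


def categorize_complaints(tweet_text):
    """Return the sorted complaint categories whose keywords appear in the tweet."""
    # Lowercase the text and split it into alphabetic words.
    words = set(re.findall(r"[a-z]+", tweet_text.lower()))
    # A category matches when the tweet shares at least one word with it.
    return sorted(
        category
        for category, keywords in COMPLAINT_KEYWORDS.items()
        if words & keywords
    )
```

Matching is on whole words only, so a word such as "delayed" would not match unless it were added to the list. A real text analytics pipeline would normally handle such variants.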
The main page shows a map of the United States with airport locations indicated. A green circle indicates a location that has FAA weather data, is not experiencing a delay, and has no sentiment. A black circle indicates a delay but no sentiment. An arrow over the airport indicates that sentiment is available. The arrow points in the direction of the sentiment: up indicates more positive sentiment, down more negative sentiment, and to the right neutral sentiment.
You can zoom in on areas of the map for a closer look:
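The marker rules above amount to a small decision function. The sketch below is illustrative only and is not taken from the demo's user interface code. It assumes sentiment is a numeric score where positive values mean positive sentiment. The function name airport_marker and the neutral band set by threshold are hypothetical.

```python
def airport_marker(has_delay, sentiment=None, threshold=0.1):
    """Pick the circle colour and arrow direction for one airport.

    sentiment is None when no tweets are available for the airport.
    The numeric scale and the threshold are assumptions for illustration.
    """
    # Circle colour depends only on the FAA delay status.
    circle = "black" if has_delay else "green"

    # The arrow is drawn only when sentiment is available.
    if sentiment is None:
        arrow = None
    elif sentiment > threshold:
        arrow = "up"      # more positive sentiment
    elif sentiment < -threshold:
        arrow = "down"    # more negative sentiment
    else:
        arrow = "right"   # neutral sentiment

    return {"circle": circle, "arrow": arrow}
```

For example, a delayed airport with strongly negative tweets yields a black circle with a downward arrow, which matches the New York case described in this article.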
You can click on an airport for its details. For example, clicking on the San Jose airport shows the weather, no delay indicated and the positive sentiment:
Zooming in over New York shows one airport that is black (indicating a delay) with an arrow pointing down, indicating overall negative sentiment in the tweets related to that airport:
Clicking on this airport shows the details:
The table under the map shows a summary of the social reputation trend for each airline:
The interface provides a set of buttons that let you drill into the underlying data in tabular form:
Clicking the FAA button shows the weather and delay details for each airport:
Clicking the Airport button shows the sentiment details for each airport:
Clicking the Airline button shows the sentiment details for each airline:
Clicking the Tweets Sample button shows several of the recent tweets:
Details of the components
The Streams Application
The Streams application consists of a MasterController main composite that is made up of three parts:
The FAA composite, which retrieves and processes the data from the FAA website.
The Tweet composite, which interacts with the Bluemix Insights for Twitter service to retrieve and process tweet data.
The WebContext operator.
The FAA Composite
The FAA composite is responsible for retrieving and processing the data from the FAA website. It uses the InetSource operator to read the data, formats it, and uses the HTTPTupleView operator, which shares the same Jetty web server, to provide a REST interface to the formatted FAA data that the user interface Dojo objects depend on.
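The formatting step can be pictured as flattening one airport status record into the fields the user interface needs. The Python sketch below illustrates that idea only. It is not the SPL code of the FAA composite, and the JSON field names in the sample record are assumptions rather than a documented description of the FAA response.

```python
import json

# Hypothetical sample record; the field names are assumed for illustration.
SAMPLE_RECORD = """
{
  "IATA": "SJC",
  "delay": "false",
  "weather": {"weather": "Fair", "temp": "68.0 F (20.0 C)"},
  "status": {"reason": "No known delays for this airport."}
}
"""


def format_airport_status(raw_json):
    """Flatten one airport status record into a simple dictionary."""
    record = json.loads(raw_json)
    weather = record.get("weather", {})
    status = record.get("status", {})
    return {
        "airport": record["IATA"],
        # The sample encodes the delay flag as a string, so normalise it.
        "delayed": str(record.get("delay", "false")).lower() == "true",
        "weather": weather.get("weather", ""),
        "temperature": weather.get("temp", ""),
        "reason": status.get("reason", ""),
    }
```

Calling format_airport_status(SAMPLE_RECORD) returns a flat dictionary with a boolean delayed field, which is the kind of shape a REST view could serve to the browser.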
The Tweet Composite
The Tweet composite is responsible for interacting with the Bluemix Insights for Twitter service to retrieve tweet data. It uses text analytics to characterize that data in terms of airports, airlines and complaint categories. It then aggregates that data and uses the HTTPTupleView operator, which shares the same Jetty web server, to provide a REST interface to the raw tweets and to the aggregated Airport and Airline data.
The portion of the application that interacts with the Bluemix Insights for Twitter service is encapsulated in a composite. It uses custom-written Java operators that internally use the Java libraries provided by that service to retrieve the data, plus additional operators that control the retrieval and prepare the returned results for later processing.
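The aggregation step can be sketched as grouping annotated tweets by airline and by airport. This Python illustration is not the demo's SPL aggregation logic. The article does not say how sentiment is combined, so averaging a numeric score is an assumption, and the tweet field names (airline, airport, sentiment) and the function name aggregate_sentiment are hypothetical.

```python
from collections import defaultdict


def aggregate_sentiment(tweets):
    """Count tweets and average sentiment per airline and per airport.

    Each tweet is a dict with a numeric "sentiment" and optional
    "airline" and "airport" keys (hypothetical field names).
    """
    # Maps (kind, name) to a running [sentiment total, tweet count].
    totals = defaultdict(lambda: [0.0, 0])
    for tweet in tweets:
        for kind in ("airline", "airport"):
            name = tweet.get(kind)
            if name is None:
                continue  # the tweet was not linked to this kind of entity
            entry = totals[(kind, name)]
            entry[0] += tweet["sentiment"]
            entry[1] += 1
    return {
        key: {"count": count, "mean_sentiment": total / count}
        for key, (total, count) in totals.items()
    }
```

Keying the result by both kind and name keeps the Airline and Airport summaries separate, so each could back its own table in the user interface.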
The Bluemix Liberty for Java Application
The StreamsProxyApp Bluemix application uses a Liberty runtime to host a proxy servlet, the EdwardsTx.net HTTP Proxy Servlet, which provides a public browser interface to the internal Jetty server.
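Conceptually, the proxy maps each public request onto the internal Jetty server. The Python sketch below shows only that URL mapping idea. It is not the proxy servlet's code or configuration, and the host names, the /streams prefix, the example path and the function name to_internal_url are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_internal_url(public_url, internal_base, public_prefix="/streams"):
    """Rewrite a public URL so that it targets the internal server."""
    public = urlsplit(public_url)
    base = urlsplit(internal_base)

    # Only forward requests that fall under the proxied prefix.
    under_prefix = (
        public.path == public_prefix
        or public.path.startswith(public_prefix + "/")
    )
    if not under_prefix:
        raise ValueError("path is not under the proxied prefix")

    # Drop the public prefix and append the rest to the internal base path.
    remainder = public.path[len(public_prefix):].lstrip("/")
    path = base.path.rstrip("/") + "/" + remainder

    # Keep the query string; take scheme and host from the internal server.
    return urlunsplit((base.scheme, base.netloc, path, public.query, ""))
```

With this mapping the browser only ever sees the public host, while the internal server address stays private to the Bluemix application.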
This demo application shows some of the power of IBM Streams applications along with the ease of deploying those applications to the cloud in the Bluemix Streaming Analytics service and leveraging other Bluemix services for a complete solution.