Attention: This article is obsolete. The general concepts still apply but the specific sample and code are no longer being maintained.

 

Introduction

IBM Streams (“Streams”) enables continuous and fast analysis of massive volumes of moving data to help improve the speed of business insight and decision-making. Streams provides an execution platform and services for user-developed applications that ingest, filter, analyze, and correlate the information in data streams.

With IBM® Streaming Analytics for Bluemix™, you can perform real-time analysis on data in motion as part of your Bluemix application. The Streaming Analytics service is powered by IBM Streams.

This article describes a demo in which a Streams application reads airport weather and delay information from the FAA website and retrieves tweets from the IBM Insights for Twitter Bluemix service. It uses Streams text analytics capabilities to categorize the topic each tweet relates to, such as “baggage” or “maintenance”. The Streams SPL application uses the HTTPTupleView and WebContext operators, which create an embedded Jetty server, to serve the data. A Bluemix Liberty application uses a proxy servlet to provide the public web interface and to interact with the Streams Jetty server.

components

Information about the demo app, along with instructions for deploying and running it, can be found in the Application Source section.

What the Demo Shows

The demo application reads from two data sources. The first is the FAA airport status service documented at http://services.faa.gov/docs/services/airport/#airportStatus. It reports the status of any major airport, including known delays and weather data from NOAA. The second is the IBM Insights for Twitter Bluemix service, which the demo uses to retrieve annotated tweets that mention the major US airlines. Each tweet is processed with text analytics operators to produce an Airlines view, a Cities view and a Complaints view. The Complaints view, for example, looks for words such as luggage, baggage, suitcase, lost, wait, delay, delays, cancel, cancellation, service, attendant, food, rude, maintenance, repair, fault and faulty, and groups them into four categories: baggage, delay, service and maintenance. The complaints are aggregated for each airline and airport, and the results are displayed in a browser.
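To make the Complaints view concrete, here is a minimal Python sketch of keyword-based categorization and per-airline aggregation. It is not the demo's SPL text analytics code. The assignment of each keyword to a category is an assumption (the article lists the words and the four categories but not the mapping), and the `(airline, text)` input shape and the airline names are hypothetical.

```python
import re
from collections import Counter

# Assumed grouping of the article's keywords into its four categories.
CATEGORY_KEYWORDS = {
    "baggage": {"luggage", "baggage", "suitcase", "lost"},
    "delay": {"wait", "delay", "delays", "cancel", "cancellation"},
    "service": {"service", "attendant", "food", "rude"},
    "maintenance": {"maintenance", "repair", "fault", "faulty"},
}


def complaint_categories(text):
    """Return the set of categories whose keywords appear as whole words in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {cat for cat, keywords in CATEGORY_KEYWORDS.items() if words & keywords}


def aggregate_complaints(tweets):
    """Count complaint categories per airline.

    tweets is an iterable of (airline, text) pairs (hypothetical input shape).
    Each tweet adds at most one to each category it matches.
    """
    counts = {}
    for airline, text in tweets:
        counter = counts.setdefault(airline, Counter())
        counter.update(complaint_categories(text))
    return counts
```

A tweet that matches several categories is counted once in each, which keeps a single long complaint from dominating one category.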

Main2

The main page shows a map of the United States with airport locations marked. A green circle indicates an airport that has FAA weather data, is not experiencing a delay and has no sentiment data. A black circle indicates a delay but no sentiment data. An arrow over the airport indicates that sentiment data is available, and its direction shows the sentiment: up for mostly positive, down for mostly negative and right for neutral.
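The marker rules can be summarized in a few lines of Python. This is only an illustration of the rules as described, not the demo's JavaScript. The numeric sentiment score in the range -1.0 to 1.0 and the `neutral_band` threshold are assumptions, since the article does not say how sentiment is scored.

```python
def airport_marker(has_delay, sentiment=None, neutral_band=0.1):
    """Return (color, shape) for one airport on the map.

    has_delay:    True if the FAA data reports a delay.
    sentiment:    None when no tweets were scored, otherwise an assumed
                  score in [-1.0, 1.0].
    neutral_band: hypothetical threshold; scores within +/- this value
                  are treated as neutral.
    """
    color = "black" if has_delay else "green"
    if sentiment is None:
        return color, "circle"
    if sentiment > neutral_band:
        return color, "arrow-up"
    if sentiment < -neutral_band:
        return color, "arrow-down"
    return color, "arrow-right"
```

For example, `airport_marker(True, -0.6)` yields a black marker with a downward arrow: a delayed airport with mostly negative tweets.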
You can zoom in on areas of the map for a closer look:
ZoomCalifornia
You can click on an airport for its details. For example, clicking on the San Jose airport shows the weather, no delay indicated and the positive sentiment:
hoverSJC
Zooming in over New York shows one airport that is black (indicating a delay) with an arrow pointing down, indicating overall negative sentiment in the tweets related to that airport:
zoomNY
Clicking on this airport shows the details:
hoverNY
The table under the map shows a summary of the social reputation trend for each airline:
trends

The interface provides a set of buttons that let you drill into the underlying data in tabular form:
revealButtons

Clicking the FAA button shows the weather and delay details for each airport:
revealFAA

Clicking the Airport button shows the sentiment details for each airport:
revealAirport

Clicking the Airline button shows the sentiment details for each airline:
revealAirline

Clicking the Tweets Sample button shows several of the recent tweets:
revealTweets

Details of the components

The Streams Application

MasterController

The Streams application consists of the MasterController main composite, which is made up of three parts:

  1. The WebContext operator, which instantiates a Jetty web server running on the Streams application resource and serves the HTML and JavaScript needed for the browser user interface.
  2. The FAA composite, which retrieves and processes the data from the FAA website.
  3. The Tweet composite, which interacts with the Bluemix Insights for Twitter service to retrieve and process tweet data.

The WebContext Operator

The WebContext operator instantiates the Jetty server and serves the HTML and JavaScript files. These JavaScript files use Dojo widgets to manage and display the information produced by the FAA and Tweet composites.

The FAA Composite

The FAA composite retrieves and processes the data from the FAA website. It uses the InetSource operator to read the data, formats it, and uses the HTTPTupleView operator, running on the same Jetty web server, to provide a REST interface to the formatted FAA data that the user interface's Dojo objects depend on.
FAAComposite
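The formatting step in the FAA composite can be pictured as flattening one airport-status document into a flat record for the user interface. The Python sketch below is not the demo's SPL code, and the JSON field names (`IATA`, `delay`, `status.reason`, `weather.weather`, `weather.temp`, `weather.wind`) are assumptions about the service's response shape. The sample values are made up.

```python
import json


def flatten_airport_status(raw_json):
    """Flatten one airport-status JSON document into a flat dict.

    Missing fields become empty strings so the record always has the same keys.
    """
    doc = json.loads(raw_json)
    weather = doc.get("weather", {})
    status = doc.get("status", {})
    return {
        "iata": doc.get("IATA", ""),
        "name": doc.get("name", ""),
        # The delay flag is assumed to arrive as the string "true" or "false".
        "delayed": str(doc.get("delay", "false")).lower() == "true",
        "reason": status.get("reason", ""),
        "weather": weather.get("weather", ""),
        "temp": weather.get("temp", ""),
        "wind": weather.get("wind", ""),
    }


# Made-up sample in the assumed response shape.
SAMPLE_RESPONSE = json.dumps({
    "IATA": "SJC",
    "name": "Example International",
    "delay": "false",
    "status": {"reason": "No known delays for this airport."},
    "weather": {"weather": "Fair", "temp": "66.0 F (18.9 C)", "wind": "North at 5.8mph"},
})
```

Calling `flatten_airport_status(SAMPLE_RESPONSE)` returns a record with `iata` set to `"SJC"` and `delayed` set to `False`.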

The Tweet Composite

The Tweet composite interacts with the Bluemix Insights for Twitter service to retrieve tweet data. It uses text analytics to characterize that data by airport, airline and complaint category. It then aggregates the data and uses the HTTPTupleView operator, running on the same Jetty web server, to provide a REST interface to the raw tweets and to the aggregated Airport and Airline data.
tweetComposite1
The portion of the application that interacts with the Bluemix Insights for Twitter service is encapsulated in a composite. It uses custom Java operators, which internally use the Java libraries provided by that service to retrieve the data, plus additional operators that control the retrieval and prepare the returned results for later processing.
tweetComposite2
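Two ideas in this composite, aggregating by key and exposing recent tuples to the browser as JSON, can be sketched in Python. This is a conceptual model only, not how HTTPTupleView or the demo's aggregation is implemented. The `TupleView` class, the `sentiment` attribute and its numeric scale are hypothetical.

```python
import json
from collections import defaultdict, deque


class TupleView:
    """Keep the most recent tuples in memory and render them as JSON."""

    def __init__(self, max_tuples):
        # A bounded deque silently drops the oldest tuple when full.
        self._window = deque(maxlen=max_tuples)

    def submit(self, tup):
        self._window.append(tup)

    def to_json(self):
        return json.dumps(list(self._window))


def average_sentiment(tuples, key):
    """Average the (assumed numeric) 'sentiment' attribute per value of key."""
    totals = defaultdict(lambda: [0.0, 0])
    for tup in tuples:
        entry = totals[tup[key]]
        entry[0] += tup["sentiment"]
        entry[1] += 1
    return {k: total / count for k, (total, count) in totals.items()}
```

The same `average_sentiment` function serves both views: pass `"airport"` as the key for the Airport table and `"airline"` for the Airline table.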

The Bluemix Liberty for Java Application

The StreamsProxyApp Bluemix application uses a Liberty runtime to host a proxy servlet, the EdwardsTx.net HTTP Proxy Servlet, which provides a public browser interface to the internal Jetty server.
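At its core, a proxy of this kind maps each public request URL onto the internal server while preserving the path and query string. The Python sketch below shows only that mapping. It is not the servlet's code, and the host names and path in the usage example are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_backend_url(public_url, backend_base):
    """Map a public request URL onto the internal server.

    The scheme and host come from backend_base; the path and query string
    come from the public request. Any fragment is dropped.
    """
    public = urlsplit(public_url)
    backend = urlsplit(backend_base)
    path = backend.path.rstrip("/") + public.path
    return urlunsplit((backend.scheme, backend.netloc, path, public.query, ""))
```

For example, `to_backend_url("https://demo.example.com/faa/tuples?max=5", "http://streams-host.internal:8080")` returns `"http://streams-host.internal:8080/faa/tuples?max=5"`.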

Application Source

The demo code is available at:
StreamsAirportSentimentDemo

Instructions for deploying and running the Bluemix application and a pre-built streams application bundle file can be found at: Streams Airport Sentiment Demo README

Conclusion

This demo application shows some of the power of IBM Streams applications along with the ease of deploying those applications to the cloud in the Bluemix Streaming Analytics service and leveraging other Bluemix services for a complete solution.

23 comments on "Streaming Analytics Airport Sentiment Demo"

  1. Hi, I am trying to access the Demo code but it redirects me to a page that says “Jazzhub has retired”. It leads me to an IBM Cloud’s Toolchain page but there’s no provision to search this demo code. I also don’t know whom and where to request its Toolchain. Can anyone please help?

  2. MikeBranson January 29, 2018

    Sorry for the confusion. With the retirement of Jazzhub, this demo was also retired.

    If you are looking for another demo that uses Twitter, see the app that is used in our development guide: https://developer.ibm.com/streamsdev/docs/bluemix-streaming-analytics-development-guide/

    Other demos are linked off of this page: https://developer.ibm.com/streamsdev/docs/roadmap-for-streaming-analytics-service-on-bluemix/

    If none of these demos meet your needs, please reply with the characteristics of the kind of app you are looking for, and I’ll try to point you at the best example. Thanks.

    • Hi, Mike. I was actually trying to figure out how this project fetched data from the aviation authority (FAA) website so that I can check whether I can replicate it as is for the public website of Civil Aviation Authority in Pakistan or request CAA’s IT team for relevant APIs to fetch their data. I think there’s a relevant project in the “Other demos” link you gave me. It is about fetching traffic data from NYC DOT. However, I’d be thankful to you if you can let me know of a project that fetches airport’s data like this project does. Good day!

      • We’ll work on digging up that part of the code and get it to you. At a high level, the demo uses the InetSource operator to bring in that data. And as you’ve found, the NYCTraffic sample also does something similar with InetSource.

        Another example of using InetSource is the EventDetection sample (which should be under that other demos link you have). I’ve posted a code snippet from that one below, getting weather data. (I’ve deleted the code comments to get it to fit, so if you download the full sample you will be able to see the comments.) There are a lot of variations possible with InetSource depending on the parameters that you set.

        // Requires: use com.ibm.streamsx.inet::InetSource;
        // Note: the output schema was lost when this snippet was posted. The single
        // rstring attribute and its name are an assumption.
        stream<rstring observation> RawObservations = InetSource() {
            param
                URIList : ["http://tgftp.nws.noaa.gov/data/observations/metar/cycles/00Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/01Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/02Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/03Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/04Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/05Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/06Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/07Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/08Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/09Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/10Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/11Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/12Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/13Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/14Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/15Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/16Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/17Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/18Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/19Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/20Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/21Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/22Z.TXT",
                           "http://tgftp.nws.noaa.gov/data/observations/metar/cycles/23Z.TXT"];
                incrementalFetch : true;    // only emit content that is new since the last fetch
                fetchInterval : 240.0l;     // seconds between polls of each URI
                inputLinesPerRecord : 3u;   // group three input lines into one tuple
                initDelay : 10.0l;          // seconds to wait before the first fetch
                punctPerFetch : true;       // emit a punctuation after each fetch
        }

        Hope this helps. Let me know if you have any questions.

        • Hi again, Mike. Need a bit of help from you. I wanted to ask if this code snippet depicts .TXT static files. If yes, how is it being fetched here exactly? Is this link in accordance with the IBM Cloud Foundry app on IBM Cloud?

          Also, here is the website I need to fetch flight’s status data from: http://karachiairport.com.pk/
          Are there any API’s that you can recommend me for fetching this website’s data on real-time basis? If not, can you please correct me if I am wrong in my perception that the data can be fetched after short intervals, let’s say 10 mins and then add some snippets from EventDetection so that each updated status is regarded as an ‘event’?

          • MikeBranson April 13, 2018

            The .TXT files in my example can change in that they might have more data appended to them during the hour that they represent. That is why there is a fetchInterval parameter on the InetSource operator. It will poll those files for changes on that interval. They are fetched over HTTP. You can put any one of those URIs in a browser and you will see the file contents. The Streams app’s InetSource operator fetches the data. The CF app doesn’t participate in the fetching of the data.

            For the website you listed, are there URIs where they publish the (raw) data you want to analyze? Or are you trying to scrape it off of web pages? I guess the key question is where is the data you want to fetch/analyze? If it’s published somewhere like the weather data in EventDetection then it’s fairly straightforward, or if they provide a REST API or another API that allows you to programmatically retrieve it then it’s straightforward. But right now I don’t understand where the data is.
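The polling behavior described for the .TXT files (files that grow during the hour, re-read on an interval) can be modeled in Python as follows. This is a rough model of the idea, not the InetSource implementation. The function name and the sample observation lines are made up.

```python
def incremental_records(previous, current, lines_per_record=3):
    """Return records built from the text appended to current since previous.

    previous and current are the file contents from two consecutive polls.
    If current does not start with previous (for example, the file was
    replaced), the whole of current is treated as new. Only complete groups
    of lines_per_record lines are returned.
    """
    new_text = current[len(previous):] if current.startswith(previous) else current
    lines = new_text.splitlines()
    usable = len(lines) - len(lines) % lines_per_record
    return [
        "\n".join(lines[i:i + lines_per_record])
        for i in range(0, usable, lines_per_record)
    ]


# Made-up contents: each observation is a date line, a report line and a blank line.
FIRST_POLL = "2018/04/13 00:00\nKSJC 130000Z 30010KT\n\n"
SECOND_POLL = FIRST_POLL + "2018/04/13 00:10\nKJFK 130010Z 18005KT\n\n"
```

Comparing `FIRST_POLL` with `SECOND_POLL` yields only the second observation, and polling unchanged content yields nothing, which is why a short fetch interval does not produce duplicate records in this model.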

  3. Hello, Mike. Thanks for the guidance.

    About the website, I consulted them but it’s a semi-govt. body so I can’t get their URIs where they publish raw data so I am left with no option but to scrape it from here http://karachiairport.com.pk/Schedule.aspx?Type=Departure upon certain intervals. Can you guide me how to proceed with that?

    Also if this kind of demo is available anywhere because Jazzhub has retired…? I could use some guidance. Once again thanks a lot for your cooperation!

    • Natasha DSilva May 08, 2018

      Hi @Aamna
      If there is no API and that is the only data source, then scraping and parsing might be the only way.

      Bear in mind that this is not a robust solution at all since your application will break if they change the website structure.

      I used the HTTPRequest operator in the inet toolkit.
      This code allowed me to pull the raw html from that URL. It is based on the HTTPRequestDemo sample included in the toolkit.
      Latest release: https://github.com/IBMStreams/streamsx.inet/releases

      
      
      // Needed so the compiler can resolve HTTPRequest (streamsx.inet toolkit).
      use com.ibm.streamsx.inet.http::HTTPRequest;

      public composite Main {
          graph
              // Emit one trigger tuple per scrape attempt.
              stream<uint64 scrapeCount> Signal as O = Beacon() {
                  param
                      iterations : 2;   // number of attempts
                      period : 60.0;    // seconds to wait between attempts
                  output O :
                      scrapeCount = IterationCount();
              }

              // Send a GET request for each trigger tuple and capture the response.
              stream<uint64 scrapeCount, rstring status, int32 stat, rstring contentEncoding,
                     rstring contentType, list<rstring> responseHeader, rstring respData> Response
                      = HTTPRequest(Signal as I) {
                  param
                      fixedUrl : "http://karachiairport.com.pk/Schedule.aspx?Type=Departure";
                      fixedMethod : GET;
                      outputBody : "respData";
                      outputStatus : "status";
                      outputStatusCode : "stat";
                      outputContentEncoding : "contentEncoding";
                      outputContentType : "contentType";
                      outputHeader : "responseHeader";
                      fixedContentType : "text/html";
                      requestAttributesAsUrlArguments : true;
              }

              // Print each response to standard output.
              () as Printer = Custom(Response as I) {
                  logic
                      onTuple I : {
                          printStringLn("******************************************");
                          printStringLn("Number of attempts =" + (rstring)scrapeCount);
                          printStringLn("status=" + status + "\tcode=" + (rstring)stat);
                          printStringLn("contentEncoding=" + contentEncoding + "\t   contentType=" + contentType);
                          printStringLn("ResponseHeader");
                          printStringLn((rstring)responseHeader);
                          printStringLn("body");
                          printStringLn(respData);
                      }
                      onPunct I : println(currentPunct());
              }
      }
      
      

      It pulls that data every 60 seconds, for the number of iterations set on the Beacon operator (two in this example).
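Once the raw HTML has been retrieved, the parsing half of scraping amounts to walking the markup and collecting table cells. The Python sketch below uses only the standard library. It assumes the schedule is published as an HTML table, which has not been verified against the site, and the sample markup in the usage note is made up. As noted above, any such parser breaks when the page structure changes.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return all non-empty table rows found in html."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Given markup such as `<table><tr><td>PK 301</td><td> Departed </td></tr></table>`, `extract_rows` returns one row with the cells `PK 301` and `Departed`.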

      • Hi @Natasha, I’m trying to achieve the same thing as aamna. Do I run this code as it is? Because I tried doing that and it throws following errors
        1. A token is missing in the operator invocation head of the SPL program. The token is identifier. The expected token is ”
        6.missing ‘<' at 'Signal'

        • Natasha DSilva May 10, 2018

          Hi, it’s supposed to be
          public composite Main {

          graph
          stream <uint64 scrapeCount> Signal as O = Beacon() {…

      • Hi @natasha, this code gives the following error:
        CDISP0127E ERROR: The following toolkit file is out of date: ./toolkit.xml. This file is newer: Main.spl.

        what to do?

        • Natasha DSilva May 28, 2018

          This code gave that error when you tried to compile it?
          Are you compiling from Streams Studio or from the command line? If you are compiling from Streams Studio, the toolkit.xml file should automatically be updated when you change a source file. Select the project, right click, and click “Build toolkit index”, and then try compiling again.

          If you are compiling from the command line using “sc”, make sure that you are not using the --no-toolkit-indexing flag. This flag prevents the toolkit.xml file from being generated when you compile.
          If that still doesn’t work,
          run spl-make-toolkit -i [directory]
          where [directory] is the directory containing the toolkit.xml, then try compiling again.

          Please let me know what works.

          • Thanks. I tried the Streams Studio solution you gave. But I am still getting errors. Here are some:

            CDISP0053E An unknown identifier was referenced in the SPL program:
            HTTPRequest.

            CDISP0053E An unknown identifier was referenced in the SPL program:
            GET.

            CDISP0053E An unknown identifier was referenced in the SPL program:
            scrapeCount.

            Multiple markers at this line
            – CDISP0053E An unknown identifier was referenced in the SPL
            program: stat.
            – CDISP0053E An unknown identifier was referenced in the SPL
            program: status.

            Multiple markers at this line
            – CDISP0053E An unknown identifier was referenced in the SPL
            program: contentEncoding.
            – CDISP0053E An unknown identifier was referenced in the SPL
            program: contentType.

            If you can please guide me on these too. I think they are all same in nature. So maybe one remedy would suffice. What would you say?

          • Natasha DSilva June 04, 2018

            Hi,
            It seems you do not have import statements for the needed operators. That is what “unknown identifier” means.
            Add
            use com.ibm.streamsx.inet.http::HTTPRequest; to the main composite. Sorry, I should have pasted the whole application.
            You also need to have the streamsx.inet toolkit added as a dependency to your application:
            Add the toolkit to Streams Studio by following the “Procedure” section of Adding toolkit locations.

          • Also, upon launching, it gives error:
            Error: Cannot run program “/tmp/3699269222452820477/BuildConfig/bin/standalone”: error=2, No such file or directory

            any idea what could be wrong?

  4. Javeria Nadeem May 05, 2018

    Hi, I want to try out this demo code for a project exhibition but it says that it has been obsolete. I want to create an exact version of this demo using IBM streams. Can you provide me with the working demo that has all the activated tools and products of IBM.

    My project is basically an exact replication of this demo, including retrieving twitter tweets and using weather data. I have seen the example of NYC Traffic but it does not have twitter application and text analytics involved. Guide me with a good example to fetch tweets using IBM streams and its analytics.

  5. Hi, Natasha. I tried adding that statement and the toolkit, still the errors are unresolved. Console says:
    Adding Toolkit Location: file:///opt/ibm/InfoSphere_Streams/4.2.1.1/toolkits/com.ibm.streamsx.inet/toolkit.xml encountered problems.
    Toolkit location file file:/opt/ibm/InfoSphere_Streams/4.2.1.1/toolkits/com.ibm.streamsx.inet/toolkit.xml is not a valid Toolkit Location list file.

    Any idea what could I be doing wrong?

    • Natasha DSilva June 18, 2018

      Hi,

      Does that file /opt/ibm/InfoSphere_Streams/4.2.1.1/toolkits/com.ibm.streamsx.inet/toolkit.xml exist?
      If not, it seems something has gone wrong with your Streams installation.
      I would download the latest copy of the inet toolkit: https://github.com/IBMStreams/streamsx.inet/releases/
      If you are using the Quick Start edition, get the `el6` release.
      Unpack it, and follow the instructions under the “Procedure” section of Adding toolkit locations.
      Then on your project, right click, click “Edit dependencies” and add the newly downloaded inet toolkit (v2.9.6) to the dependencies. Make sure no other versions of the com.ibm.streamsx.inet toolkit are listed.
      Then try compiling again.

      Lastly, can we please move this discussion to the forums: https://developer.ibm.com/answers/smart-spaces/22/streamsdev.html
      So others can see the solution as well?
      Please post your results there as a new question.
      Thank you.

      • Hi, Natasha. I really appreciate your help but I don’t understand that even after following your suggestions/instructions as is, I am still unable to get this working :/

        • Natasha DSilva June 21, 2018

          Hi, I am sure you are frustrated and sorry you are not making progress.
          I do not know exactly what you are trying to do, so is it possible for you to open an issue on the forums describing what you are trying to do exactly?
          https://developer.ibm.com/answers/smart-spaces/22/streamsdev.html

          – If possible, please attach a screenshot of what you are seeing/doing, and attach your project (In studio, right click your project, click export ,archive file).

          – If you cannot attach the project, paste the SPL you are trying to compile and attach the info.xml of the toolkit so we know what the dependencies are.
          – Also paste the output of
          `ls $STREAMS_INSTALL/toolkits`

          – An important comment: When reporting a problem, please be as detailed as possible and describe what you did, for example, “I right clicked my project and then chose “run”, I compiled using “sc – m abc def”, and then I got this error”. “It doesn’t work” is too vague.
          – Also include the operating system for Streams and Streams version, or QSE version if using QSE.

          Thank you and I hope to sort out your problems ASAP!
