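Welcome to our Quick Start Guide on using Streams Studio, the best way to develop streaming applications. One of the main ways that IBM Streams sets itself apart is its extensive tooling, and Streams Studio is a great example of this. Studio is your hub for developing, testing, debugging, and launching SPL applications. This guide covers an overview of the tooling, installing Studio, starting and configuring it, launching and monitoring an application, developing an application, and a time-saving debug tip.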

Overview

This section outlines the basic concepts and approaches for using the tooling to develop Streams applications. Detailed instructions follow below, so if you want to get started right away, you can jump ahead and start building applications.

  • GRAPHICAL TOOLING – Start developing streaming applications intuitively using the SPL Graphical Editor, without having to know the details of the SPL language. Studio’s graphical tooling supports all stages of development: from writing code, to debugging and troubleshooting, to performance analysis and monitoring.
  • SPL TEXT EDITOR – Use the SPL Editor to take advantage of syntax highlighting and warnings, code completion, and in-line error messages.
  • DUAL EDITING – Get the best of both worlds. Develop your application quickly and efficiently using the graphical and text editors side by side. Changes made to the SPL application in one editor are reflected in the other editor automatically.
  • REFACTORING – Use the refactoring support to quickly rename or clone different elements of your Streams applications. All references to the renamed element are updated automatically.
  • AUTOMATIC BUILD – Catch errors as you work with automatic builds going on in the background every time you save your changes.
  • APPLICATION LAUNCH – Launch your applications easily and quickly with a couple of clicks.
  • APPLICATION MONITORING – Monitor your applications in real-time through live graphs, metrics views, and trace logs.

We are going to walk through building a basic application for the sake of demonstrating Studio features. To learn more about the Streams concepts and go through an introduction to SPL for the same application, take a look at the Streams Quick Start Guide.  If you are left with unanswered questions after going through this guide, you’re probably not the only one. Post your questions in the comments section or in our forum and we will get you an answer as soon as we can.

Installing Streams Studio

If you are using the VM, then Streams Studio is already installed and you can move to the next step. If you aren’t using the VM, you can install Streams Studio by following these instructions: Streams Studio Installation Instructions.

Starting and Configuring Streams Studio

  1. From the VM, double-click the InfoSphere Streams Studio (Eclipse) icon.

  2. Choose a workspace other than the default one. I chose /home/streamsadmin/myWorkspace. Next, click OK in the Workspace Launcher window. When prompted by the Edit Install Location window, click OK.

  3. When you are prompted by the Add Streams Domain connection window, click Find domain…

In the Select ensembles window click OK.

Click OK.

  4. Click on the Streams Explorer tab in the upper left part of the screen. The Streams Explorer tab is Studio’s dashboard for monitoring your domain, instance, and running applications (jobs).

  5. Expand the Streams Domain Connections and Streams Instances to see the status of the domain and instance that were already created and started for you.

If you are not already connected to a domain, right-click on Streams Domain Connections and select Add Connection… Next, select Find domain…, choose your ZooKeeper ensemble, and then select your domain.

If you don’t already have an instance ready and started, follow these instructions on creating one and these instructions on starting one.

  6. Click on the Project Explorer tab (next to the Streams Explorer tab). The Project Explorer tab is where you get to start the fun part: developing and deploying your microsecond-latency Streams applications.

Launching and Monitoring An Application

Download Sample Files
  1. Download and unzip the sample files. This zip file contains the StockTradesAppComplete and the StockTradesAppStarter projects. The StockTradesAppComplete project contains a full application that reads in stock data from trades.csv, filters it, aggregates it, summarizes it, then writes it out to tradesSummary.csv. The StockTradesAppStarter project contains a boilerplate application that reads in stock data from trades.csv, then writes it directly to tradesSummary.csv. We will modify this application to add in the filter, aggregate, and summary steps.

  2. Import the StockTradesAppComplete project. Select File -> Import. Expand InfoSphere Streams Studio in the Import window, then select SPL Application Project and click Next.

Browse to the location where you unzipped the sample projects and select StockTradesAppComplete. Check the box next to StockTradesAppComplete, then click Finish.

Expand the project in your Project Explorer tab. Your workspace should now look like this:

[Screenshot: the StockTradesAppComplete project expanded in the Project Explorer]

  3. Launch your application. In the Project Explorer view, right-click on TradesAppMain and select Launch…

Leave all of the defaults in the Edit Configuration box. Click Apply, then click Continue.

Launching like this, directly to an instance, means that you are launching a Distributed build. Distributed builds are used in production environments and for running across multiple hosts. The other option is a Standalone build, which lets you run your application as a single binary executable and is useful in the early stages of development. You can find out how to do a Standalone build here.

  4. Monitor your job. Click on the Streams Explorer tab to switch to that view.

Right-click on default:StreamsInstance@StreamsDomain and select Show Instance Graph. If you launch the application again, you can watch as the operators/PEs become healthy.

You can look at the metrics of each operator in a few different ways. The simplest way is to hover over the operator of interest within the instance graph.

Alternatively, you can get more detail by right-clicking on your instance in the Streams Explorer view and selecting Metrics -> Show Metrics. You can expand the operators you are interested in to get live metrics data.

  5. Check your results. Right-click on your data folder in the Project Explorer view and select Refresh.

Then double-click tradesSummary.csv to open it. The first few lines should look something like this:

"TWX",0,17.7,10.62
"FSL",0,26.19,15.62
"FSLb",0,26.29,20.928
"BK",0,32.53,26.024
"NFJ",0,21.59,17.26
"RIO",0,41.15,32.872
"NLY",0,11.23,8.97
"BGG",0,39.81,31.848
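
If you would rather check the output with a script than by eye, the rows can be read with a few lines of Python. This is only a sketch and not part of Studio: it assumes the column order ticker, min, max, average (taken from the output schema defined later in this guide), and the names `read_summary` and `suspicious` are made up for illustration. The sample text is copied from the rows above; in practice you would read the file from your project's data folder.

```python
import csv
import io

# First rows of tradesSummary.csv, copied from the listing above.
SAMPLE = '''"TWX",0,17.7,10.62
"FSL",0,26.19,15.62
"FSLb",0,26.29,20.928
"BK",0,32.53,26.024
'''

def read_summary(text):
    """Parse rows of ticker,min,max,average into dictionaries."""
    rows = []
    for ticker, lo, hi, avg in csv.reader(io.StringIO(text)):
        rows.append({
            "ticker": ticker,
            "min": float(lo),
            "max": float(hi),
            "average": float(avg),
        })
    return rows

def suspicious(rows):
    """Return tickers whose average falls outside the [min, max] range."""
    return [r["ticker"] for r in rows
            if not r["min"] <= r["average"] <= r["max"]]

rows = read_summary(SAMPLE)
print(len(rows), "rows parsed; suspicious tickers:", suspicious(rows))
```

A quick check like this catches a swapped column or a malformed row before you start trusting the numbers.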

Developing Your Streams Application in Studio

We are going to walk through the steps of developing the StockTrades application that you launched and monitored above. We will start from a base application that reads from trades.csv and writes out to tradesSummary.csv.

  1. Import the StockTradesAppStarter project. File -> Import. Expand InfoSphere Streams Studio in the Import window, then select SPL Application Project and click Next.

Browse to the location where you unzipped the sample projects and select StockTradesAppStarter. Check the box next to StockTradesAppStarter, then click Finish.

  2. Open the SPL graphical editor by right-clicking on TradesAppMain, then selecting Open With -> SPL Graphical Editor.

  3. Add a filter operator between the file source and sink. Data streams often carry far more information than we are interested in. The filter operator drops the tuples we don’t need, which reduces unnecessary processing and keeps your solutions efficient.

Click in the Find box and type Filter. Click on “Filter – Filter” and drag it onto the stream between TradeQuoteSrc and CheckedTradeSummaryFile.

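Use the toolbar button that tidies the graph layout to clean up the view of the operator placement. Your application should now show the new filter operator between TradeQuoteSrc and CheckedTradeSummaryFile (your filter name might be slightly different).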

Double-click on the filter operator to open and modify its properties. Edit the filter parameter as shown below:

[Screenshot: the filter operator’s properties with the filter parameter filled in]

Save your work. The automatic build should complete successfully.
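
To picture what the filter step does to the stream, here is a small plain-Python model (not SPL, and not how Streams implements it). The attribute names ticker and askprice come from the application, but the predicate used here, keeping tuples with a non-empty ticker, is a hypothetical stand-in and not the expression from the guide. `filter_stream` is likewise a made-up helper name.

```python
from typing import Callable, Iterable, Iterator

Trade = dict  # one tuple, e.g. {"ticker": "TWX", "askprice": 17.7}

def filter_stream(tuples: Iterable[Trade],
                  keep: Callable[[Trade], bool]) -> Iterator[Trade]:
    """Forward only the tuples for which the predicate holds, one at a time."""
    for t in tuples:
        if keep(t):
            yield t

# Made-up input data for illustration.
trades = [
    {"ticker": "TWX", "askprice": 17.7},
    {"ticker": "", "askprice": 0.0},
    {"ticker": "BK", "askprice": 32.53},
]

# Hypothetical predicate: keep tuples that carry a non-empty ticker.
kept = list(filter_stream(trades, lambda t: t["ticker"] != ""))
print([t["ticker"] for t in kept])
```

Each tuple is examined once and either forwarded or dropped, so everything downstream of the filter only ever sees the tuples that passed.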

  4. Add an aggregate operator between the filter and sink. The aggregate operator allows you to compute user-specified aggregations over tuples gathered in a window. We often use this operator to calculate means, medians, and sums.

Click in the Find box and type Aggregate. Click on “Aggregate – Aggregate” and drag it onto the stream between the filter operator and CheckedTradeSummaryFile.

Double-click on the aggregate operator to open and modify its properties. Navigate to the Window tab and click Edit… Fill out the form to configure a partitioned, tumbling window that evicts after a count of 5 tuples. A tumbling window is cleared entirely every time it fills up. Partitioned means that a separate window is kept for each value of a key. In our case, we will partition by ticker symbol so that we can aggregate by company.

Go to the Param tab and click Add… -> PartitionBy. Click OK. Fill in the partitionBy value with “ticker”.

Rather than finish the rest within the graphical editor, let’s demonstrate the SPL Editor and dual-editing mode. Right-click within the graphical editor and select Open with SPL Editor. Next, close the Outline view. Click on the SPL Editor tab of TradesAppMain.spl and drag it so that it sits side by side with the SPL Graphical Editor.

One of my favorite features of this side-by-side development is being able to click on an operator in the SPL Graphical Editor and see it highlighted in the SPL Editor.

To demonstrate this, click on the aggregate operator in the graphical editor.

Within the text editor, modify the aggregate operator’s output stream schema to be of type: stream<rstring ticker, float64 min, float64 max, float64 average>

Next, change the output section to this: Aggregate_4_out0 : average = Average(askprice), min = Min(askprice), max = Max(askprice) ;

The resulting operator should look like this (your operator names will probably have different numbers):

(stream<rstring ticker, float64 min, float64 max, float64 average> Aggregate_4_out0)
    as Aggregate_4 = Aggregate(Filter_3_out0 as inputStream)
{
    window
        // One window per ticker; each window is emitted and cleared after 5 tuples.
        inputStream : tumbling, count(5), partitioned ;
    param
        partitionBy : ticker ;
    output
        Aggregate_4_out0 : average = Average(askprice), min = Min(askprice),
            max = Max(askprice) ;
}
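
If the window settings feel abstract, the following plain-Python model may help. It is a sketch of the behavior described above (one window per ticker, a summary emitted and the window cleared after 5 tuples), not the operator's actual implementation. The function name `aggregate` and the input data are invented for illustration, and the sketch does not model what happens to partially filled windows when the stream ends.

```python
from collections import defaultdict

def aggregate(tuples, size=5):
    """Model of a partitioned, tumbling, count-based window.

    Each ticker gets its own window. When a window holds `size` tuples,
    one summary is emitted and that window is cleared.
    """
    windows = defaultdict(list)  # ticker -> ask prices collected so far
    for t in tuples:
        prices = windows[t["ticker"]]
        prices.append(t["askprice"])
        if len(prices) == size:
            yield {
                "ticker": t["ticker"],
                "min": min(prices),
                "max": max(prices),
                "average": sum(prices) / size,
            }
            prices.clear()  # tumbling: the whole window is flushed

# Made-up input: five TWX tuples interleaved with two BK tuples.
pairs = [("TWX", 0.0), ("TWX", 1.0), ("BK", 30.0), ("BK", 32.0),
         ("TWX", 2.0), ("TWX", 3.0), ("TWX", 4.0)]
stream = [{"ticker": tk, "askprice": p} for tk, p in pairs]

summaries = list(aggregate(stream))
print(summaries)
```

In this run only TWX produces a summary, because its window is the only one that reaches 5 tuples. The two BK tuples stay in their own window and wait for more BK data, which is what partitioning by ticker is for.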

Save your work, then click on the graphical editor. You will be asked whether you would like to load the changes that you made in the text editor. Click Yes. The updates you made in the text editor now appear in the graphical editor.

Your project should automatically build without errors.

There are no set rules on when the graphical editor is best, or when the basic text editor is best for development. When applications get large and complicated, I find that the graphical editor provides a great way to think about and build my conceptual approach, while the text editor gives me full control to quickly make the changes I want. Here is an example of an application where the graphical editor is saving me a lot of struggle:

[Screenshot: a large application graph in the SPL Graphical Editor]

To finish your application, you can paste the following code below your aggregate operator (make sure it’s still within the composite brackets). However, I recommend that instead of pasting this code, you try to match it using a mix of the SPL Graphical Editor and the code that you already have in the text editor.

(stream<rstring ticker, float64 min, float64 max, float64 average> CheckedTradesSummary)
    as CustomProcess = Custom(Aggregate_4_out0 as TradesSummary)
{
    logic
        onTuple TradesSummary :
        {
            // An average of exactly zero is reported as an error and not forwarded.
            if(average == 0.0l)
            {
                printStringLn("ERROR: " + ticker) ;
            }
            else
            {
                // Log the summary, then pass it downstream unchanged.
                printStringLn("Submitting summary: " + (rstring) TradesSummary) ;
                submit(TradesSummary, CheckedTradesSummary) ;
            }
        }
}
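
The logic of this Custom operator is small enough to model in a few lines of plain Python, which can be handy for reasoning about which summaries reach the sink. This is a sketch under assumptions, not SPL: `check_summaries` and its `log` parameter are invented names, and the ticker "XYZ" is made-up test data.

```python
def check_summaries(summaries, log=print):
    """Forward summaries whose average is non-zero; report the rest as errors."""
    for s in summaries:
        if s["average"] == 0.0:
            log("ERROR: " + s["ticker"])
        else:
            log("Submitting summary: " + str(s))
            yield s

# Collect the log lines in a list instead of printing them.
messages = []
checked = list(check_summaries(
    [
        {"ticker": "TWX", "min": 0.0, "max": 17.7, "average": 10.62},
        {"ticker": "XYZ", "min": 0.0, "max": 0.0, "average": 0.0},  # made-up row
    ],
    log=messages.append,
))
print(len(checked), "summary forwarded;", len(messages), "log lines")
```

Only the summary with a non-zero average is forwarded; the other produces an error line and is dropped, mirroring the if/else in the operator.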

After you add the code and save your work, your application should look like this:

[Screenshot: the finished application in the SPL Graphical Editor]

You can now submit and monitor this job as you did with the completed version you imported.

For more detailed information on how this application works, as well as the SPL constructs behind it, check out Samantha Chan’s great Streams Quick Start Guide.

A time-saving debug tip

When a Streams application runs in distributed mode, its processes may run on different hosts, so it is not always obvious how to debug the application.

For example, in the CustomProcess operator from the StockTrades app, we have this print statement:

printStringLn("Submitting summary: " + (rstring) TradesSummary);

If you’re running your application in Standalone mode, that gets printed to the terminal and all is well. But what if you’re running in Distributed mode on 15 systems, or you have 30 operators printing at the same time? This is where Studio comes to the rescue.

In the Instance Graph view, right-click on the CustomProcess operator and select Show Log -> Show PE Console. You should see printed results just from that PE that look like this:

Submitting summary: {ticker="TWX",min=0,max=17.7,average=10.62}
Submitting summary: {ticker="FSL",min=0,max=26.19,average=15.62}
Submitting summary: {ticker="FSLb",min=0,max=26.29,average=20.928}
Submitting summary: {ticker="BK",min=0,max=32.53,average=26.024}

Running a FileSink from every stream that you want to debug is also a popular approach, but I personally find the printing to be much more efficient.

I hope you found this helpful! Comment below if you have any questions, or if you have ideas on how we can make Studio work better for you.

3 comments on "Streams Studio Quick Start Guide"

  1. When I import StockTradesAppComplete I notice two things:

    1) The app TradesAppMain does not show [Build: Distributed]
    2) The “launch” option is greyed out when I right click on the app.

    My Console shows a clean build upon startup. Is there a step missing in the flow above?

    Thanks!

    • Thanks for catching that Bryan. I have updated the zip file to make sure they create a distributed build automatically (Studio won’t let you try to launch something that doesn’t have a build).

      If you don’t want to download the new zip, you can also right-click on a main composite -> New -> Distributed Build. Hope this helps!

  2. Hi Alex, I thought I had updated my post but apparently not. Thanks for the update. Creating a new build, distributed or standalone is, as you mention, the fix to allow a launch.

    Excellent article BTW!
