By Mark Sturdevant, Jacques Roy | Updated April 22, 2018 - Published April 4, 2018
If you want to analyze event data while it is streaming, this code pattern is for you. It uses Jupyter Notebook, Spark SQL, and matplotlib to show taxi trip event statistics while the events are streaming. A Java program streams the data into IBM Db2 Event Store, which is optimized for event-driven data processing and analytics.
In this code pattern, a Java program runs as a daemon and submits events to IBM Db2 Event Store. The Jupyter Notebook is used to show how to interact with the event store using Python. An animated matplotlib chart is used to visualize the changing data while the events are streaming. Taxi trip data is used as the event stream. The average trip duration for each start time is continuously updated. The chart also shows the trip count to help visualize the growing database of taxi trips.
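The continuously updated statistics described above can be sketched in plain Python. This is only an illustration of the idea, not the notebook's code: the pattern itself queries IBM Db2 Event Store with Spark SQL, whereas this sketch keeps running totals in memory. The `TripStats` class, its method names, the grouping by start hour, and the sample events are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


class TripStats:
    """Running trip count and mean duration, grouped by start hour.

    Hypothetical helper for illustration; grouping by hour is an assumption.
    """

    def __init__(self):
        self.count = defaultdict(int)            # trips seen per start hour
        self.total_seconds = defaultdict(float)  # summed duration per start hour

    def add(self, pickup, dropoff):
        """Fold one trip event into the running aggregates."""
        hour = pickup.hour
        self.count[hour] += 1
        self.total_seconds[hour] += (dropoff - pickup).total_seconds()

    def average_minutes(self):
        """Return {start_hour: mean trip duration in minutes}."""
        return {
            hour: self.total_seconds[hour] / self.count[hour] / 60
            for hour in sorted(self.count)
        }

    def trip_count(self):
        """Return the total number of trips seen so far."""
        return sum(self.count.values())


# Made-up sample events: (pickup time, dropoff time).
events = [
    (datetime(2018, 4, 4, 8, 0), datetime(2018, 4, 4, 8, 10)),
    (datetime(2018, 4, 4, 8, 30), datetime(2018, 4, 4, 8, 50)),
    (datetime(2018, 4, 4, 17, 5), datetime(2018, 4, 4, 17, 35)),
]

stats = TripStats()
for pickup, dropoff in events:
    stats.add(pickup, dropoff)  # in a live setting, call this as each event arrives

print(stats.average_minutes())
print(stats.trip_count())
```

Because each event only touches one counter and one sum, the averages can be refreshed after every event without rescanning earlier trips. A chart callback could read `average_minutes()` and `trip_count()` on each redraw.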
To keep things simple, this example uses taxi data in a CSV file, so you can run the code pattern without signing up for an external data feed. The pattern itself is designed to demonstrate event-driven data processing and analytics that scale to support massive amounts of data, and it can easily be modified to work with your own event stream. Our data includes timestamps, which make it easy to see simple statistics on all the data, including the latest events. With your own events, you can use the notebook to experiment with charts and show how those events are trending with up-to-the-minute statistics.
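As a rough sketch of how a CSV file can stand in for an event feed, the Python generator below reads rows one at a time and yields each as a timestamped event. The pattern's own loader is a Java program; this is not that code. The column names `pickup_datetime` and `dropoff_datetime`, the timestamp format, and the sample rows are assumptions, so adjust them to match your data.

```python
import csv
import io
from datetime import datetime

# Made-up sample data; the column names are assumptions for this sketch.
SAMPLE_CSV = """pickup_datetime,dropoff_datetime
2018-04-04 08:00:00,2018-04-04 08:10:00
2018-04-04 17:05:00,2018-04-04 17:35:00
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def trip_events(lines):
    """Yield one event dict per CSV row, parsed lazily as rows are read."""
    for row in csv.DictReader(lines):
        pickup = datetime.strptime(row["pickup_datetime"], TIME_FORMAT)
        dropoff = datetime.strptime(row["dropoff_datetime"], TIME_FORMAT)
        yield {
            "pickup": pickup,
            "dropoff": dropoff,
            "duration_seconds": (dropoff - pickup).total_seconds(),
        }


# io.StringIO stands in for an open file; open("trips.csv", newline="") works the same way.
events = list(trip_events(io.StringIO(SAMPLE_CSV)))
for event in events:
    print(event["pickup"], event["duration_seconds"])
```

To work with your own event stream, you would swap the CSV reader for whatever source delivers your events and keep the same shape of output.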
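After completing this code pattern, you will understand how to stream events into IBM Db2 Event Store with a Java program, interact with the event store from a Jupyter Notebook using Python and Spark SQL, and visualize the changing data with an animated matplotlib chart.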
Ready to put this code pattern to use? Complete details on how to get started running and using this application are in the README.