Ingest and analyze event data streams for timely insights


Summary

If you’re interested in analyzing event data while it is still streaming, then this code pattern is for you. This code pattern uses Jupyter Notebook, Spark SQL, and matplotlib to show taxi trip event statistics while the events are streaming. A Java program streams the data into IBM Db2 Event Store, which is optimized for event-driven data processing and analytics.

Description

In this code pattern, a Java program runs as a daemon and submits events to IBM Db2 Event Store. The Jupyter Notebook is used to show how to interact with the event store using Python. An animated matplotlib chart is used to visualize the changing data while the events are streaming. Taxi trip data is used as the event stream. The average trip duration for each start time is continuously updated. The chart also shows the trip count to help visualize the growing database of taxi trips.
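To make the "continuously updated" statistic concrete, here is a minimal sketch in plain Python of a running trip count and average trip duration per start time. It is not the notebook's code and does not call Db2 Event Store. The `TripStats` class, its method names, the choice to group by start hour, and the sample timestamps are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


class TripStats:
    """Running trip count and average duration, grouped by start hour.

    Hypothetical helper: grouping by hour is an assumption, not
    necessarily how the notebook buckets start times.
    """

    def __init__(self):
        self._count = defaultdict(int)
        self._total_seconds = defaultdict(float)

    def add(self, pickup, dropoff):
        """Fold one trip event into the running totals."""
        hour = pickup.hour
        self._count[hour] += 1
        self._total_seconds[hour] += (dropoff - pickup).total_seconds()

    def total_trips(self):
        """Number of trips seen so far (the growing trip count)."""
        return sum(self._count.values())

    def average_minutes(self):
        """Average trip duration in minutes for each start hour seen."""
        return {
            hour: self._total_seconds[hour] / self._count[hour] / 60.0
            for hour in sorted(self._count)
        }


# Made-up sample events standing in for the live stream.
stats = TripStats()
stats.add(datetime(2016, 1, 1, 8, 0), datetime(2016, 1, 1, 8, 10))
stats.add(datetime(2016, 1, 1, 8, 30), datetime(2016, 1, 1, 8, 50))
stats.add(datetime(2016, 1, 1, 9, 5), datetime(2016, 1, 1, 9, 11))

print(stats.total_trips())      # 3
print(stats.average_minutes())  # {8: 15.0, 9: 6.0}
```

Each new event only touches the totals for its own start hour, so the averages can be redrawn after every batch of events without rescanning earlier trips. In the code pattern itself, the equivalent aggregation is done by querying the event store rather than by keeping totals in memory.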

To keep things simple, this example uses taxi data in CSV format. With this data, you can run this code pattern without signing up for an external data feed. Even so, the code pattern is designed to demonstrate event-driven data processing and analytics that scale to massive amounts of data, and it can easily be modified to work with your own event stream. Our data includes timestamps, which make it easy to see simple statistics on all the data, including the latest events. With your own events, you can use the notebook to experiment with charts and show how those events are trending with up-to-the-minute statistics.
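As a rough illustration of why timestamped CSV rows are convenient, the sketch below turns rows into events and picks out the latest one. The column names `pickup_datetime` and `dropoff_datetime`, the timestamp format, the `read_events` helper, and the two sample rows are assumptions; check the actual CSV file in the repository for its real layout.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows; the real file's columns may differ.
SAMPLE_CSV = """pickup_datetime,dropoff_datetime
2016-01-01 08:00:00,2016-01-01 08:10:00
2016-01-01 08:30:00,2016-01-01 08:50:00
"""

# Assumed timestamp layout for the sample rows above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def read_events(text_stream):
    """Yield one event dict per CSV row, with a parsed start time and duration."""
    for row in csv.DictReader(text_stream):
        pickup = datetime.strptime(row["pickup_datetime"], TIME_FORMAT)
        dropoff = datetime.strptime(row["dropoff_datetime"], TIME_FORMAT)
        yield {
            "pickup": pickup,
            "duration_seconds": (dropoff - pickup).total_seconds(),
        }


events = list(read_events(io.StringIO(SAMPLE_CSV)))

# The timestamps make it easy to find the most recent event.
latest = max(events, key=lambda event: event["pickup"])
print(len(events), latest["pickup"])
```

To try this with your own feed, you would replace `io.StringIO(SAMPLE_CSV)` with an open file or another line-oriented source and adjust the column names and time format to match.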

After completing this code pattern, you will understand how to:

  • Install IBM Db2 Event Store developer edition
  • Interact with Db2 Event Store using Python and Jupyter Notebook
  • Use a Java program to insert events into IBM Db2 Event Store
  • Query the database while inserts are in progress
  • Show live updates with an animated chart

Flow


  1. Run the Jupyter Notebook
  2. Connect the Jupyter Notebook to Db2 Event Store to analyze the live event stream
  3. External Java program sends live events

Instructions

Ready to put this code pattern to use? Complete details on how to get started running and using this application are in the README.