Stream and store retail order data for analysis

Summary

In this code pattern, we’ll build a Scala app that uses Akka to implement a WebSockets endpoint which streams data to a Db2 Event Store database. For our data, we’ll use online retail order details in CSV format. We’ll use Jupyter notebooks with Scala and Brunel to visualize the Event Store data.

Description

To react quickly to changes in your business, you need high-speed data collection that supports both current, event-driven analysis and historical analysis. We use thin, fast WebSockets clients to send messages and let the Akka toolkit handle the message streams, transform the data, and feed it into Event Store.

Akka HTTP gives us a very easy way to provide the WebSockets endpoint. Akka Streams is a powerful and elegant way to move the data from source to sink. We use Akka Streams with Alpakka to parse and transform the data from CSV text strings or CSV files into Spark SQL Rows. We use divertTo with a simple predicate to separate new orders from cancellations.
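To make the parse-and-divert step concrete, here is a minimal sketch rather than the pattern's actual code. It assumes Akka 2.6 or later with the Alpakka CSV module on the classpath. The object name `DivertSketch`, the sample lines, and the rule that a cancellation is a record whose first field starts with "C" are all assumptions made for illustration. It uses `divertToMat`, the variant of `divertTo` that keeps the side sink's materialized value, so both result sets can be collected. The conversion to Spark SQL Rows is left out.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.alpakka.csv.scaladsl.CsvParsing
import akka.stream.scaladsl.{Flow, Keep, Sink, Source}
import akka.util.ByteString

import scala.concurrent.duration._
import scala.concurrent.{Await, Future}

object DivertSketch {
  // Assumption: a cancellation is a record whose first field (the invoice
  // number) starts with "C". The real app may apply a different rule.
  def isCancellation(fields: List[String]): Boolean =
    fields.headOption.exists(_.startsWith("C"))

  // Alpakka's line scanner emits one List[ByteString] per CSV line;
  // decode each field to a String.
  val parseCsv: Flow[ByteString, List[String], NotUsed] =
    CsvParsing.lineScanner().map(_.map(_.utf8String))

  // Returns (new orders, cancellations) once the stream has completed.
  def split(lines: List[String])(
      implicit system: ActorSystem): Future[(Seq[List[String]], Seq[List[String]])] = {
    import system.dispatcher
    val (cancelled, ordered) =
      Source(lines)
        .map(line => ByteString(line + "\n"))
        .via(parseCsv)
        // Records matching the predicate leave the main stream for the side sink.
        .divertToMat(Sink.seq[List[String]], isCancellation)(Keep.right)
        .toMat(Sink.seq[List[String]])(Keep.both)
        .run()
    for (o <- ordered; c <- cancelled) yield (o, c)
  }

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("divert-sketch")
    // Hypothetical sample data; the column layout is illustrative only.
    val sample = List(
      "536365,85123A,6,2.55",
      "C536379,D,-1,27.50",
      "536366,22633,6,1.85"
    )
    val (orders, cancellations) = Await.result(split(sample), 10.seconds)
    println(s"new orders:    $orders")
    println(s"cancellations: $cancellations")
    system.terminate()
  }
}
```

In a real pipeline, the two `Sink.seq` sinks would be replaced by sinks that write to the appropriate tables; collecting into sequences here just keeps the example easy to inspect.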

We’re providing code to implement EventStoreSink and EventStoreFlow. These ultimately push the transformed data into Event Store tables. Once the data is in Event Store, we use the bundled Jupyter notebooks environment to analyze and visualize the data.

This is a simple example you can run on a laptop, but it’s built with tools that are designed to scale. It can easily be extended, whether you want to build a client, use more Akka functionality to handle the data streams, or flex your data science muscles with Event Store and Jupyter notebooks.

After completing this code pattern, you’ll understand how to:

  • Implement WebSockets with Scala and Akka
  • Use Alpakka Flows for CSV parsing and inserts into Event Store
  • Use Jupyter notebooks and Scala to interact with Event Store
  • Use Spark SQL and Brunel visualizations to analyze the data

Flow

  1. Set up the database with a Jupyter Notebook.
  2. Submit CSV data via WebSockets.
  3. Use Akka and Alpakka to transform the data and feed it into Event Store.
  4. Present the data with Brunel visualizations in a Jupyter Notebook.
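
Steps 2 and 3 meet at the WebSockets endpoint. The following is a minimal sketch of such an endpoint, assuming Akka HTTP 10.2 or later and Akka 2.6 or later; it is not the pattern's own code. The object name `WebSocketSketch`, the `websocket` path, the host and port, and the acknowledgement text are hypothetical. Instead of writing to Event Store, this flow only counts the CSV lines in each text message and replies with an acknowledgement.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.ws.{BinaryMessage, Message, TextMessage}
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.scaladsl.{Flow, Sink}

object WebSocketSketch {
  // Replies to each strict text message with a line count.
  // Streamed and binary messages are drained and ignored.
  def ackFlow(implicit system: ActorSystem): Flow[Message, Message, NotUsed] =
    Flow[Message].mapConcat[Message] {
      case TextMessage.Strict(text) =>
        val rows = text.split("\n").count(_.trim.nonEmpty)
        List(TextMessage(s"received $rows line(s)"))
      case streamed: TextMessage =>
        streamed.textStream.runWith(Sink.ignore)
        Nil
      case binary: BinaryMessage =>
        binary.dataStream.runWith(Sink.ignore)
        Nil
    }

  // Hypothetical path; clients would connect to ws://localhost:8080/websocket.
  def route(implicit system: ActorSystem): Route =
    path("websocket") {
      handleWebSocketMessages(ackFlow)
    }

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("ws-sketch")
    // Host and port are assumptions for a local run.
    Http().newServerAt("localhost", 8080).bind(route)
  }
}
```

In the full app, the handler flow would pass the received text through the CSV parsing and Event Store stages rather than simply acknowledging it.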

Instructions

Get the detailed instructions in the README file. These steps will show you how to:

  1. Clone the repo.
  2. Install IBM Db2 Event Store Developer Edition.
  3. Run the database setup notebook.
  4. Run the Scala app.
  5. Feed in data.
  6. Visualize the data.