Build a machine learning recommendation engine to encourage additional purchases based on past buying behavior


Most websites that sell products online show you a list of items you might be interested in. The better the recommendations, the more likely you are to buy one of them, which increases the seller's revenue. But how are these recommendations created? This code pattern shows you how to build a recommendation engine from customer data with Jupyter Notebooks and Apache Spark, both of which are open source projects. When combined with Watson Studio and Watson Machine Learning, you can quickly produce an interactive dashboard to explore and test a recommendation model.


Using purchase data from all customers is the fastest way to create recommendations. With this data, you can create groups (clusters) of customers who have bought similar products. The customers within each cluster are more similar to each other than to the customers in other clusters.
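To make "similar" concrete, here is a minimal sketch in plain Python (not the notebook's Spark code). It assumes each customer can be represented as a vector of quantities per product and that similarity is measured with Euclidean distance. The records and the names `purchases`, `purchase_vectors`, and `distance` are hypothetical and used only for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase records: (customer, product, quantity).
purchases = [
    ("alice", "coffee", 3), ("alice", "milk", 2),
    ("bob", "coffee", 2), ("bob", "milk", 3),
    ("carol", "diapers", 4), ("carol", "wipes", 3),
]

def purchase_vectors(records):
    """Turn purchase records into one quantity-per-product vector per customer."""
    products = sorted({product for _, product, _ in records})
    index = {product: i for i, product in enumerate(products)}
    vectors = defaultdict(lambda: [0] * len(products))
    for customer, product, quantity in records:
        vectors[customer][index[product]] += quantity
    return products, dict(vectors)

def distance(a, b):
    """Euclidean distance between two purchase vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

products, vectors = purchase_vectors(purchases)

# Customers with similar purchases end up close together.
close = distance(vectors["alice"], vectors["bob"])
far = distance(vectors["alice"], vectors["carol"])
print(products)
print(f"alice-bob: {close:.2f}, alice-carol: {far:.2f}")
```

A clustering algorithm groups customers so that small distances like the one between the first two customers fall inside a cluster and large ones fall between clusters.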

In this code pattern, we use historical shopping data to build a recommendation engine with Spark and Watson Machine Learning. The model is then used to create a list of recommendations based on the contents of a shopping basket.
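One way a clustering model could turn a shopping basket into recommendations is sketched below in plain Python. It assumes the model exposes one center per cluster (the average quantity bought per product), assigns the basket to the nearest center, and suggests that cluster's most-bought products that are not yet in the basket. The product names, the values in `CENTERS`, and the function names are hypothetical. The notebook's actual logic may differ.

```python
from math import sqrt

PRODUCTS = ["coffee", "milk", "diapers", "wipes"]

# Hypothetical cluster centers: average quantity per product in each cluster.
CENTERS = {
    0: [1.5, 1.0, 0.0, 0.0],
    1: [0.0, 0.2, 1.5, 1.0],
}

def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def basket_vector(basket):
    """Convert a {product: quantity} basket into a vector ordered like PRODUCTS."""
    return [basket.get(product, 0) for product in PRODUCTS]

def nearest_cluster(vector):
    """Return the id of the cluster whose center is closest to the vector."""
    return min(CENTERS, key=lambda cluster: distance(vector, CENTERS[cluster]))

def recommend(basket, top_n=2):
    """Suggest the cluster's most-bought products that are not in the basket."""
    cluster = nearest_cluster(basket_vector(basket))
    ranked = sorted(zip(PRODUCTS, CENTERS[cluster]),
                    key=lambda pair: pair[1], reverse=True)
    return [product for product, score in ranked
            if score > 0 and product not in basket][:top_n]

print(recommend({"diapers": 1}))  # ['wipes', 'milk']
print(recommend({"coffee": 1}))   # ['milk']
```

Filtering out products with a zero score keeps the list limited to items that customers in the matched cluster actually bought.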

When you have completed this code pattern, you will understand how to:

  • Use Jupyter Notebooks in IBM Watson Studio
  • Build a recommendation model with SparkML and Watson Machine Learning to provide product recommendations for customers based on their purchase history



  1. Log in to IBM Watson Studio.
  2. Load the provided notebook into Watson Studio.
  3. Load and transform the customer data in the notebook.
  4. Build a k-means clustering model with SparkML.
  5. Deploy the model to Watson Machine Learning.
  6. Test and compare the models built in the notebook and through the Watson Machine Learning API.


Find the detailed instructions in the README file. The steps there explain how to:

  1. Sign up for Watson Studio.
  2. Create a project and add services.
  3. Create a notebook.
  4. Load customer data in the notebook.
  5. Add a Watson Machine Learning service.