
Build a machine learning model for calculating product order return propensity


As e-commerce continues to grow, product returns account for a large portion of lost revenue. The sooner we can identify the likelihood that an order will be returned, the better equipped we are to reduce that loss, both directly and indirectly in the form of lost customers.

This developer pattern shows how to build and deploy a machine learning model on IBM Cloud Pak for Data using IBM Watson® Studio and Watson Machine Learning. This model can then be used to predict the probability (return propensity) of a particular order being returned.


As part of the intelligent IBM Sterling Call Center solution, our AI assistant can surface real-time insights about a customer’s orders, based on the customer’s order history and previous transactions. It also offers actionable recommendations for resolving common customer issues, leading to an enhanced customer experience. Machine learning models and techniques are applied to customer data aggregated from multiple sources to gain insights such as customer lifetime value, churn, return propensity, and up-sell purchase probability. These help the call center associate know the customer better and make better business decisions, such as up-selling, cross-selling, and customer appeasement, during a customer conversation.

This developer code pattern demonstrates how to build and deploy a machine learning model using a dataset consisting of curated order records obtained from IBM Sterling Order Management. When you have completed this code pattern, you will understand how to:

  • Leverage IBM Cloud Pak for Data, Watson Studio, and Watson Machine Learning to build and deploy machine learning models.
  • Build a model to classify returns.
  • Generate predictions using the deployed model by making REST calls.
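Once deployed, the model exposes a scoring endpoint that accepts JSON over REST. The sketch below shows the general shape of a Watson Machine Learning v4 scoring request; the field names, host, and deployment ID are illustrative placeholders, not values from the pattern's dataset:

```python
# Minimal sketch of a Watson Machine Learning scoring request.
# Field names and the endpoint URL are illustrative placeholders.

def build_scoring_payload(fields, values):
    """Shape order records into the WML v4 scoring request format."""
    return {"input_data": [{"fields": fields, "values": values}]}

fields = ["order_value", "item_count", "customer_tenure_days", "discount_pct"]
values = [[120.50, 3, 420, 10.0]]  # one hypothetical order record
payload = build_scoring_payload(fields, values)

# A deployed model on Cloud Pak for Data exposes a URL of this general form:
#   https://<cpd-host>/ml/v4/deployments/<deployment_id>/predictions?version=2020-09-01
# which you would POST to with an authorization token, for example:
#   import requests
#   r = requests.post(url, json=payload,
#                     headers={"Authorization": "Bearer <token>"})
#   print(r.json()["predictions"])
```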



  1. User loads the Jupyter Notebook into IBM Cloud Pak for Data.
  2. The dataset is loaded into the Jupyter Notebook, either directly from the GitHub repo or by uploading a copy obtained from the GitHub repo.
  3. The data is preprocessed, and machine learning models are built and saved to Watson Machine Learning on IBM Cloud Pak for Data.
  4. The model is deployed into production on IBM Cloud Pak for Data and a scoring endpoint is obtained.
  5. Using the scoring endpoint, a front-end application calls the model to predict the return propensity of an order.
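The preprocess-train-evaluate portion of the flow above might look like the following in a notebook. This is a sketch on synthetic data: the features, labels, and choice of a scikit-learn random forest are assumptions for illustration, while the pattern's notebook defines the actual dataset and model.

```python
# Sketch of building a returns classifier, assuming scikit-learn and
# synthetic data in place of the curated Sterling Order Management records.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))            # 4 hypothetical order features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Per-order return propensity: probability of the "returned" class.
proba = clf.predict_proba(X_test)[:, 1]

# Saving and deploying on Watson Machine Learning would use the
# ibm_watson_machine_learning client, roughly:
#   client.repository.store_model(...) followed by client.deployments.create(...)
```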


Ready to get started? Check out the README for step-by-step details on:

  • Creating a new project
  • Adding the dataset and custom library ZIP to the assets section of your project
  • Adding the notebook to your project
  • Following the steps in the notebook
  • Loading and preprocessing the data
  • Building the model
  • Saving and deploying the model
  • Testing the model
  • Creating a Python Flask app that uses the model
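The Flask app in the last step could be sketched as below. The route name and request shape are assumptions, and the call to the Watson Machine Learning scoring endpoint is stubbed out so the sketch runs standalone; the README describes the actual app.

```python
# Minimal sketch of a Flask front end for the deployed model.
# The /predict route and payload shape are illustrative assumptions.
from flask import Flask, request, jsonify

app = Flask(__name__)

def score(values):
    # In the real app this would POST the records to the WML scoring
    # endpoint and return the predicted return propensities; stubbed
    # here with a constant so the sketch is self-contained.
    return [0.5 for _ in values]

@app.route("/predict", methods=["POST"])
def predict():
    body = request.get_json()
    probs = score(body["values"])
    return jsonify({"return_propensity": probs})

if __name__ == "__main__":
    app.run(port=8080)
```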