Solve a business problem and predict customer churn using a customer churn dataset
Use Watson Machine Learning and Jupyter Notebooks on IBM Cloud Pak for Data to predict customer churn
This pattern is part of the Getting started with IBM Cloud Pak for Data learning path.
| Level | Title | Type |
|-------|-------|------|
| 100 | Introduction to IBM Cloud Pak for Data | Article |
| 101 | Virtualizing Db2 Warehouse data with data virtualization | Tutorial |
| 201 | Data visualization with data refinery | Tutorial |
| 202 | Find, prepare, and understand data with Watson Knowledge Catalog | Tutorial |
| 301A | Data analysis, model building, and deploying with Watson Machine Learning with notebook | Pattern |
| 301B | Automate model building with AutoAI | Tutorial |
| 301C | Build a predictive machine learning model quickly and easily with IBM SPSS Modeler | Tutorial |
| 401 | Monitoring the model with Watson OpenScale | Pattern |
In this developer code pattern, we’ll use IBM Cloud Pak® for Data to go through the whole data science pipeline to solve a business problem and predict customer churn using a Telco customer churn dataset. IBM Cloud Pak for Data is an interactive, collaborative, cloud-based environment. It helps data scientists, developers, and others interested in data science use tools to collaborate, share, and gather insights from their data — as well as build and deploy machine learning and deep learning models.
Customer churn (when a customer ends their relationship with a business) is one of the most basic factors in determining the revenue of a business. You need to know which of your customers are loyal and which are at risk of churning — and you need to know the factors that affect these decisions from a customer perspective. This code pattern explains how to build a machine learning model and use it to predict whether a customer is at risk of churning. This is a full data science project, and you can use your model findings for prescriptive analysis later or for targeted marketing.
After you’ve completed this code pattern, you’ll understand how to:
- Use Jupyter Notebooks to load, visualize, and analyze data.
- Run Notebooks in IBM Cloud Pak for Data.
- Build, test, and deploy a machine learning model using Spark MLlib on IBM Cloud Pak for Data.
- Deploy a selected machine learning model to production using IBM Cloud Pak for Data.
- Create a front-end application to interface with the client and start consuming your deployed model.
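As a taste of the first objective, here is a minimal sketch of loading and inspecting churn data in a notebook with pandas. The inline sample rows are illustrative stand-ins; in the actual notebook you would read the full Telco CSV from the pattern's GitHub repo or your virtualized data source.

```python
import io

import pandas as pd

# Tiny illustrative sample of Telco-style churn data; the real notebook
# loads the full CSV from the repo or a virtualized Db2 Warehouse table.
csv = io.StringIO(
    "customerID,tenure,MonthlyCharges,Churn\n"
    "0001,1,29.85,Yes\n"
    "0002,34,56.95,No\n"
    "0003,2,53.85,Yes\n"
)
df = pd.read_csv(csv)

# A first analysis step: how imbalanced is the target class?
churn_rate = (df["Churn"] == "Yes").mean()
print(f"Churn rate in sample: {churn_rate:.0%}")
```

In the notebook you would typically follow this with visualizations (for example, `df["Churn"].value_counts().plot(kind="bar")`) to explore class balance and feature distributions before modeling.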
- User loads the Jupyter Notebook into the IBM Cloud Pak for Data platform.
- Telco customer churn data set is loaded into the Jupyter Notebook, either directly from the GitHub repo or as virtualized data after following the Data virtualization tutorial from the Getting started with Cloud Pak for Data learning path.
- Preprocess the data, build machine learning models, and save to Watson® Machine Learning on IBM Cloud Pak for Data.
- Deploy a selected machine learning model into production on the IBM Cloud Pak for Data platform and obtain a scoring endpoint.
- Use the model for churn prediction using a front-end application.
Ready to put this code pattern to use? Complete details on how to get started running and using this application are in the README, including how to:
- Create a new project.
- Create a space for machine learning deployments.
- Upload the dataset if you are not on the IBM Cloud Pak for Data learning path.
- Import Jupyter Notebook to IBM Cloud Pak for Data.
- Run the notebook.
- Deploy the model using the IBM Cloud Pak for Data UI.
- Test the model.
- Create a Python Flask app that uses the model.
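For the last step, a minimal Flask app might look like the sketch below. The scoring URL, access token, and payload shape are placeholders and assumptions (the `input_data`/`fields`/`values` structure is assumed from the Watson Machine Learning v4 scoring API); substitute the values from your own deployment.

```python
# Minimal Flask front end that forwards input to a deployed model's REST
# scoring endpoint. URL and token below are placeholders, not real values.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

SCORING_URL = "https://<your-cluster>/ml/v4/deployments/<deployment-id>/predictions"
API_TOKEN = "<your-access-token>"

def build_payload(fields, values):
    """Shape one row of inputs into the JSON body the scoring endpoint
    expects (structure assumed from the WML v4 scoring API)."""
    return {"input_data": [{"fields": fields, "values": [values]}]}

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    payload = build_payload(data["fields"], data["values"])
    resp = requests.post(
        SCORING_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    # Pass the model's prediction (or error) straight back to the client.
    return jsonify(resp.json()), resp.status_code
```

A browser form or curl request posting `{"fields": [...], "values": [...]}` to `/predict` would then receive the churn prediction returned by the deployed model.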
This code pattern showed how to use IBM Cloud Pak for Data to work through the whole data science pipeline to solve a business problem and predict customer churn using a Telco customer churn dataset. The pattern is part of the Getting started with IBM Cloud Pak for Data learning path. To continue the series and learn more about IBM Cloud Pak for Data, take a look at the next code pattern, Monitoring the model with Watson OpenScale.