Build a predictive machine learning model quickly and easily with IBM SPSS Modeler
Tap into data assets and modern applications with complete algorithms and models that are ready for immediate use.
This tutorial is part of the Getting started with IBM Cloud Pak for Data learning path.
| Level | Topic | Type |
| --- | --- | --- |
| 100 | Introduction to IBM Cloud Pak for Data | Article |
| 101 | Virtualizing Db2 Warehouse data with data virtualization | Tutorial |
| 201 | Data visualization with data refinery | Tutorial |
| 202 | Find, prepare, and understand data with Watson Knowledge Catalog | Tutorial |
| 301A | Data analysis, model building, and deploying with Watson Machine Learning with notebook | Pattern |
| 301B | Automate model building with AutoAI | Tutorial |
| 301C | Build a predictive machine learning model quickly and easily with IBM SPSS Modeler | Tutorial |
| 401 | Monitoring the model with Watson OpenScale | Pattern |
In this tutorial, we will use IBM Cloud Pak for Data to build a predictive machine learning model with IBM SPSS Modeler and predict whether a telco customer will churn. IBM Cloud Pak for Data is an interactive, cloud-based environment that allows developers and data scientists to work collaboratively, gain insight from data, and build machine learning models.
After completing this tutorial, you will know:
- How to upload data to IBM Cloud Pak for Data
- How to create an SPSS Modeler flow
- How to use the SPSS tool to inspect data and glean insights
- How to modify and prepare data for AI model creation using SPSS
- How to train a machine learning model with SPSS and evaluate the results
Completing this tutorial should take about 30 minutes.
- Upload the data
- Create an SPSS Modeler Flow
- Import the data
- Inspect the data
- Data preparation
- Train the ML Model
- Evaluate the results
Step 1: Upload the data
Download the Telco-Customer-Churn.csv dataset.
From the Assets tab of your project, click on the 01/00 icon. You can either drag and drop the file or click on browse to choose and upload the file.
Step 2: Create an SPSS Modeler flow
From the Project home, click the Add to Project button and choose Modeler Flow. Give the flow a meaningful name such as “Telco Customer Churn Flow” and click Create.
Step 3: Import the data
- From the Import section, drag and drop a Data Asset node onto the canvas. Double click on the node and click on Change data asset.
- On the Assets page, open the Data Assets tab, choose the “Telco-Customer-Churn.csv” file you have previously uploaded and click OK.
Step 4: Inspect the data
- To gain insight into your data, open the Output tab and drag and drop the Data Audit node onto the canvas. Connect the Data Audit node to the Data Asset node by drawing a line between the little circles on their sides. The node will be automatically renamed to 21 Fields.
- Click on the three dots on the Data Audit node or right click on the node to open up the menu for the node and click Run. The output can be viewed from the Outputs menu on the right. Double click on the output to view statistics about the data.
- Click on the project name on top to go back.
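For readers who want to see what a data audit amounts to, here is a minimal Python sketch of the same idea: per-field counts of valid, missing, and unique values, plus basic statistics for numeric fields. It is not what the Data Audit node runs internally. The four-row sample and the field names other than Churn are made up for illustration; the real file has 21 fields.

```python
import csv
import io
import statistics

# Tiny made-up sample (assumption); the real Telco-Customer-Churn.csv has 21 fields.
SAMPLE_CSV = """customerID,tenure,MonthlyCharges,Contract,Churn
0001-A,1,29.85,Month-to-month,No
0002-B,34,56.95,One year,No
0003-C,2,53.85,Month-to-month,Yes
0004-D,45,,One year,No
"""

def audit(rows):
    """Return per-field summary statistics, loosely like a data audit report."""
    report = {}
    for field in rows[0].keys():
        values = [r[field] for r in rows]
        present = [v for v in values if v != ""]
        summary = {
            "valid": len(present),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None  # at least one value is not numeric
        if numbers:
            summary.update(min=min(numbers), max=max(numbers),
                           mean=statistics.mean(numbers))
        report[field] = summary
    return report

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = audit(rows)
for field, summary in report.items():
    print(field, summary)
```

To try this on the uploaded dataset, replace the in-memory sample with `open("Telco-Customer-Churn.csv", newline="")`, assuming the file is in the working directory.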
Step 5: Data preparation
- From the Field Operations tab, drag and drop the Type node onto the canvas. Connect the Type node with the Data Asset node and double click on the Type node to make the necessary configurations.
- Click on Read Values. Check that the Measure and Role for each Field are correct. Change the role of the Churn field from Input to Target. Then click Save to close the tab.
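Conceptually, this step attaches two pieces of metadata to every field: a measurement level and a role. The sketch below illustrates that idea in plain Python. The heuristic in `infer_measure` is a simplification and an assumption, not the rule the Type node applies, and the sample rows and field names other than Churn are hypothetical.

```python
def infer_measure(values):
    """Guess a measurement level from string values (simplified heuristic)."""
    present = [v for v in values if v != ""]
    try:
        for v in present:
            float(v)
        return "continuous"
    except ValueError:
        pass
    # Two distinct categories behave like a flag; more than two are nominal.
    return "flag" if len(set(present)) == 2 else "nominal"

def build_types(rows, target):
    """Assign a measure and a role to every field; only `target` gets the target role."""
    types = {}
    for field in rows[0]:
        values = [r[field] for r in rows]
        types[field] = {
            "measure": infer_measure(values),
            "role": "target" if field == target else "input",
        }
    return types

# Hypothetical sample rows (assumption), not taken from the real dataset.
rows = [
    {"tenure": "1", "Contract": "Month-to-month", "Churn": "No"},
    {"tenure": "34", "Contract": "One year", "Churn": "No"},
    {"tenure": "2", "Contract": "Two year", "Churn": "Yes"},
]
types = build_types(rows, target="Churn")
for field, info in types.items():
    print(field, info)
```

Marking Churn as the target is what tells the modeling step which field to predict; every other field stays an input.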
Step 6: Train the ML model
- From the Modeling tab, add a Random Forest node onto the canvas and connect it with the Type node. The node will be automatically renamed to Churn.
- Right click on the Random Forest node and click Run. When the execution is done, you will see a new golden nugget-like Churn node added to the canvas.
- Right click on the Churn golden nugget node and choose Preview to inspect the output results.
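To give a feel for what a random forest does when it trains, the sketch below builds a heavily simplified stand-in: an ensemble of one-split trees (decision stumps), each fitted on a bootstrap sample with a random subset of features, combined by majority vote. This is not SPSS Modeler's Random Forest implementation, and real forests grow much deeper trees. The training rows (tenure, monthly charges) and their labels are made up for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(X, y, features):
    """Pick the threshold split among `features` with the lowest weighted Gini."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]  # (feature index, threshold, left label, right label)

def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample rows with replacement
        feats = rng.sample(range(n_features), max(1, n_features // 2))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Made-up training data (assumption): [tenure in months, monthly charges].
X = [[1, 70.0], [2, 85.5], [3, 90.0], [5, 75.0],
     [24, 60.0], [36, 55.0], [48, 80.0], [60, 65.0]]
y = ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"]

forest = fit_forest(X, y)
print(predict(forest, [1, 90.0]), predict(forest, [60, 55.0]))
```

Averaging many weak, decorrelated trees is the design choice that makes the method robust: any single stump can be wrong, but the vote tends to be stable.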
Step 7: Evaluate the results
- Drag and drop an Analysis node from the Output tab onto the canvas. Connect it with the Churn golden nugget node. Right click on the Analysis node and click Run.
- From the Outputs tab on the right, double click on the Analysis output to gain insight on the accuracy of the results.
- Click on the flow name on top to go back.
- From the Graphs tab, drag and drop the Evaluation node onto the canvas and connect it with the Churn golden nugget node.
- Right click on the Evaluation node and click Run. Then double click on the R-Churn output to visualize the graph. Click on the flow name to go back.
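The numbers behind these two outputs can be sketched in a few lines of Python: a coincidence (confusion) matrix with overall accuracy, and cumulative gains, which is one common chart type for this kind of evaluation. The labels and scores below are made up for illustration, and the function names are hypothetical; the exact output of the Analysis and Evaluation nodes may differ.

```python
from collections import Counter

def coincidence_matrix(actual, predicted):
    """Count each (actual, predicted) label pair."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of records where the prediction matches the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def cumulative_gains(actual, scores, positive="Yes", bins=4):
    """Share of all positives captured in the top-scored fraction of records."""
    ranked = [a for _, a in sorted(zip(scores, actual),
                                   key=lambda pair: pair[0], reverse=True)]
    total_pos = ranked.count(positive)
    gains = []
    for b in range(1, bins + 1):
        cutoff = round(len(ranked) * b / bins)
        gains.append(ranked[:cutoff].count(positive) / total_pos)
    return gains

# Made-up evaluation data (assumption): true labels, predicted labels,
# and the model's churn scores for eight customers.
actual = ["Yes", "No", "Yes", "No", "No", "Yes", "No", "No"]
predicted = ["Yes", "No", "No", "No", "Yes", "Yes", "No", "No"]
scores = [0.9, 0.2, 0.4, 0.1, 0.6, 0.8, 0.3, 0.05]

matrix = coincidence_matrix(actual, predicted)
acc = accuracy(actual, predicted)
gains = cumulative_gains(actual, scores)
print(matrix)
print(acc)
print(gains)
```

A gains curve that rises steeply at the start means the highest-scored customers contain most of the actual churners, which is what makes a churn model useful for targeting.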
This tutorial demonstrates a small example of creating a predictive machine learning model with IBM SPSS Modeler on IBM Cloud Pak for Data. It covers importing the data into the project and the modeler flow and preparing the data for modeling. It then walks through choosing an appropriate algorithm for the data and training a prediction model. The last step shows how to visualize and evaluate the results of the trained model.
This tutorial is part of the Getting started with IBM Cloud Pak for Data learning path. To continue the series and learn more about IBM Cloud Pak for Data, take a look at the next tutorial, Monitoring the model with Watson OpenScale.