Build a predictive machine learning model quickly and easily with IBM SPSS Modeler

This tutorial is part of the Getting started with IBM Cloud Pak for Data learning path.

In this tutorial, we will use IBM Cloud Pak for Data to build a predictive machine learning model with IBM SPSS Modeler and predict whether a telco customer will churn. IBM Cloud Pak for Data is an interactive, cloud-based environment in which developers and data scientists can work collaboratively, gain insight from data, and build machine learning models.

Learning objectives

After completing this tutorial, you will understand:

  • How to upload data to IBM Cloud Pak for Data
  • How to create an SPSS Modeler flow
  • How to use the SPSS tool to inspect data and glean insights
  • How to modify and prepare data for AI model creation using SPSS
  • How to train a machine learning model with SPSS and evaluate the results

Prerequisites

Estimated time

Completing this tutorial should take about 30 minutes.

Steps

  1. Upload the data
  2. Create an SPSS Modeler Flow
  3. Import the data
  4. Inspect the data
  5. Data preparation
  6. Train the ML Model
  7. Evaluate the results

Step 1: Upload the data

Download the Telco-Customer-Churn.csv dataset.

From the Assets tab of your project, click on the 01/00 icon. You can either drag and drop the file or click on browse to choose and upload the Telco-Customer-Churn.csv file.

Upload data
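Before uploading, it can help to confirm locally that the file has a header row and contains the Churn field that the later steps rely on. The following is a minimal sketch using only the Python standard library. The inline sample and its field names other than Churn are made-up stand-ins for illustration, not rows from the real dataset, and `summarize_csv` is a hypothetical helper name.

```python
import csv
import io


def summarize_csv(stream, target="Churn"):
    """Return (field names, data row count, whether the target field exists)."""
    reader = csv.DictReader(stream)
    fields = reader.fieldnames or []  # reading this consumes the header row
    row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return fields, row_count, target in fields


# Made-up sample standing in for the real file (assumption for illustration).
SAMPLE = """customerID,tenure,MonthlyCharges,Churn
0001,1,29.85,No
0002,34,56.95,No
0003,2,53.85,Yes
"""

fields, row_count, has_target = summarize_csv(io.StringIO(SAMPLE))
print(f"{len(fields)} fields, {row_count} rows, target present: {has_target}")
```

To check the downloaded file instead of the sample, one could pass `open("Telco-Customer-Churn.csv", newline="")` as the stream, assuming the file sits in the current directory.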

Step 2: Create an SPSS Modeler flow

From the project home page, click the Add to Project button and choose Modeler Flow. Give the flow a meaningful name such as “Telco Customer Churn Flow” and click Create.

Create flow

Step 3: Import the data

  1. From the Import section, drag and drop a Data Asset node onto the canvas. Double-click the node and click Change data asset.

Data asset

  2. On the Assets page, open the Data Assets tab, choose the “Telco-Customer-Churn.csv” file you uploaded previously, and click OK.

Import data

Step 4: Inspect the data

  1. To gain insight into your data, open the Output tab and drag and drop the Data Audit node onto the canvas. Connect the Data Audit node to the Data Asset node by drawing a line between the small circles on their sides. The node will be automatically renamed to 21 Fields.

Data Audit

  2. Click the three dots on the Data Audit node, or right-click the node, to open its menu, and click Run. The output can be viewed from the Outputs menu on the right. Double-click the output to view statistics about the data.

Data Inspection

Data Inspection 2

  3. Click the project name at the top to go back.
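To make the kind of per-field statistics mentioned above more concrete, here is a rough sketch of a data audit in plain Python. It is not how the Data Audit node is implemented; it only computes a few comparable summaries (valid and missing counts, distinct values, and min, max, and mean for numeric fields). The sample rows and the `audit` helper are assumptions made for illustration.

```python
import csv
import io
import statistics


def _as_float(value):
    """Return the value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def audit(stream):
    """Summarize every field of a CSV stream, one dict per field."""
    reader = csv.DictReader(stream)
    rows = list(reader)
    report = {}
    for field in reader.fieldnames or []:
        values = [(row[field] or "").strip() for row in rows]
        present = [v for v in values if v != ""]
        numbers = [_as_float(v) for v in present]
        summary = {
            "valid": len(present),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
        # Only add numeric statistics when every present value parses.
        if present and all(n is not None for n in numbers):
            summary["min"] = min(numbers)
            summary["max"] = max(numbers)
            summary["mean"] = statistics.mean(numbers)
        report[field] = summary
    return report


# Made-up sample rows (assumption), including one blank value.
SAMPLE = """tenure,MonthlyCharges,TotalCharges,Churn
1,29.85,29.85,No
34,56.95,1889.5,No
2,53.85,,Yes
45,42.30,1840.75,No
"""

report = audit(io.StringIO(SAMPLE))
for field, summary in report.items():
    print(field, summary)
```

A blank value is counted as missing here, which is one reason to look at an audit before modeling: fields with many missing values may need attention during data preparation.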

Step 5: Data preparation

  1. From the Field Operations tab, drag and drop the Type node onto the canvas. Connect the Type node to the Data Asset node and double-click the Type node to configure it.

Type

  2. Click Read Values. Check that the Measure and Role for each field are correct. Change the role of the Churn field from Input to Target. Then click Save to close the tab.

Data Preparation
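The idea behind this step, giving each field a measurement level and a role, can be sketched in a few lines of Python. The heuristic below is a simplification chosen for illustration and is not the logic SPSS Modeler uses when it reads values: numeric fields are treated as Continuous, non-numeric fields with exactly two distinct values as Flag, and everything else as Nominal. The sample columns and helper names are hypothetical.

```python
def infer_measure(values):
    """Guess a measurement level from raw string values (simplified heuristic)."""
    present = [v for v in values if v != ""]
    if not present:
        return "Typeless"
    try:
        for v in present:
            float(v)
        return "Continuous"
    except ValueError:
        pass
    return "Flag" if len(set(present)) == 2 else "Nominal"


def build_type_table(columns, target="Churn"):
    """Assign a measure and a role to each field; only the target is not an Input."""
    return {
        name: {
            "measure": infer_measure(values),
            "role": "Target" if name == target else "Input",
        }
        for name, values in columns.items()
    }


# Made-up column values (assumption for illustration).
columns = {
    "tenure": ["1", "34", "2", "45"],
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "Churn": ["No", "No", "Yes", "No"],
}

type_table = build_type_table(columns)
for name, settings in type_table.items():
    print(f"{name}: measure={settings['measure']}, role={settings['role']}")
```

Marking Churn as the target is the important part: it tells the modeling step which field to predict and which fields to treat as predictors.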

Step 6: Train the ML model

  1. From the Modeling tab, add a Random Forest node to the canvas and connect it to the Type node. The node will be automatically renamed to Churn.

Random Forest

  2. Right-click the Random Forest node and click Run. When the execution is done, you will see a new golden nugget-like Churn node added to the canvas.

Start Training

  3. Right-click the Churn golden nugget node and choose Preview to inspect the output results.
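For readers curious about what a random forest does conceptually, the sketch below shows the two core ideas, bootstrap sampling and majority voting over many small trees, in plain Python. It is a heavily simplified stand-in and not the algorithm the Random Forest node runs: each "tree" here is a single-split decision stump on one randomly chosen feature. The training data, feature meanings, and function names are all made up for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Find the one-feature threshold split that misclassifies the fewest rows."""
    classes = sorted(set(labels))
    best = None  # (errors, threshold, label_at_or_below, label_above)
    for threshold in sorted({row[feature] for row in rows}):
        for low in classes:
            for high in classes:
                errors = sum(
                    (low if row[feature] <= threshold else high) != label
                    for row, label in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return feature, threshold, low, high


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(
            fit_stump([rows[i] for i in picks], [labels[i] for i in picks], feature)
        )
    return forest


def predict(forest, row):
    """Return the label that most stumps vote for."""
    votes = Counter(
        low if row[feature] <= threshold else high
        for feature, threshold, low, high in forest
    )
    return votes.most_common(1)[0][0]


# Made-up training data: each row is (tenure in months, monthly charges).
churners = [(i, 80 + i) for i in range(1, 11)]
loyal = [(40 + i, 20 + i) for i in range(1, 11)]
rows = churners + loyal
labels = ["Yes"] * len(churners) + ["No"] * len(loyal)

forest = fit_forest(rows, labels)
print(predict(forest, (0, 100)))  # short tenure, high charges
print(predict(forest, (70, 20)))  # long tenure, low charges
```

A full random forest grows deeper trees and draws a fresh random subset of features at every split, which is what lets it capture interactions between fields that single stumps cannot.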

Step 7: Evaluate the results

  1. Drag and drop an Analysis node from the Output tab onto the canvas. Connect it to the Churn golden nugget node. Right-click the Analysis node and click Run.

Analysis

  2. From the Outputs tab on the right, double-click the Analysis output to gain insight into the accuracy of the results.

Analysis Output

  3. Click the flow name at the top to go back.

  4. From the Graphs tab, drag and drop the Evaluation node onto the canvas and connect it to the Churn golden nugget node.

Evaluation

  5. Right-click the Evaluation node and click Run. Then double-click the R-Churn output to visualize the graph. Click the flow name to go back.

Evaluation Graph
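The two kinds of output in this step can be illustrated with a short calculation. The sketch below computes accuracy and confusion counts, similar in spirit to what an analysis of predictions reports, and a cumulative gains curve, assuming the evaluation graph is a gains-style chart. The actual and predicted values and the confidence scores are made up for illustration; they are not results from the tutorial's model.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of records where the prediction matches the actual value."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def confusion_counts(actual, predicted):
    """Count each (actual, predicted) pair."""
    return Counter(zip(actual, predicted))


def cumulative_gains(actual, scores, positive="Yes"):
    """Share of all positives captured after each record, ranked by score."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    total = sum(label == positive for label in actual)  # assumes at least one positive
    gains, hits = [], 0
    for _, label in ranked:
        hits += label == positive
        gains.append(hits / total)
    return gains


# Made-up example values (assumption), one entry per customer.
actual = ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No"]
predicted = ["Yes", "No", "No", "No", "No", "Yes", "Yes", "No"]
scores = [0.9, 0.2, 0.1, 0.4, 0.3, 0.6, 0.8, 0.05]  # predicted churn likelihood

print("accuracy:", accuracy(actual, predicted))
print("confusion:", dict(confusion_counts(actual, predicted)))
print("gains:", [round(g, 2) for g in cumulative_gains(actual, scores)])
```

In a gains curve, a model that ranks customers well reaches a high share of churners after contacting only the top-scored few, whereas a random ordering would capture churners roughly in proportion to the number of customers contacted.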

Summary

This tutorial demonstrates a small example of creating a predictive machine learning model with IBM SPSS Modeler on IBM Cloud Pak for Data. It covers importing the data into the project and the modeler flow, and preparing the data for modeling. It then walks through choosing an appropriate algorithm for the data and training a prediction model. The last step shows how to visualize and evaluate the results of the trained model.

To continue the learning path and learn more about IBM Cloud Pak for Data, take a look at the next tutorial, Monitoring the model with Watson OpenScale.

Begum Demirel
Scott Dangelo