Getting started with IBM Watson OpenScale (automated setup)

This tutorial is part of the Getting started with Watson OpenScale learning path.


In this tutorial, you’ll see how IBM® Watson™ OpenScale can be used to monitor your artificial intelligence (AI) models for fairness and accuracy. You’ll get a hands-on look at how Watson OpenScale automatically generates a debiased model endpoint to mitigate fairness issues and provides an explainability view to help you understand how your model makes its predictions. In addition, you’ll see how Watson OpenScale uses drift detection, which tells you when runtime data is inconsistent with your training data or when there is an increase in the kind of data that is likely to lead to lower accuracy.

This tutorial works on IBM Cloud Pak for Data or on IBM Cloud (with a free trial). The automated setup is used to get you started quickly with an example model.

The fairness use case

The model used in this tutorial is a credit risk predictor. The data set contains loan applicant data and is used to predict “Risk” or “No Risk”. The data includes two attributes that are considered sensitive: sex (gender) and age. Using Watson OpenScale with this model, you will be able to detect, explain, and fix gender discrimination in the credit risk predictor.

Automated setup

The automated setup guides you through the process by performing tasks for you in the background. The automated setup tour is designed to work with the least possible user interaction. It automatically makes the following decisions for you:

  • If you have multiple IBM Watson Machine Learning instances set up, the installation process runs an API call to list the instances and chooses the Watson Machine Learning instance that appears first in the resulting list.
  • To create a new lite version of a Watson Machine Learning instance, the Watson OpenScale installer uses the default resource group for your IBM Cloud account.

The automated setup ends with a guided tour, which highlights key features of Watson OpenScale as you move through the scenario by clicking Next. When you exit the tour (you can exit at any point), you can explore the UI on your own. The credit risk model was automatically deployed so that you have something to explore. This tutorial uses the credit risk model to help you explore the features of Watson OpenScale.

Estimated time

It should take you approximately 45 minutes to complete this tutorial.


In this tutorial, you learn how to:

  1. Provision a Watson OpenScale service
  2. Take the guided tour
  3. Examine model fairness
  4. Use a debiased model
  5. Explain how a prediction was determined
  6. Monitor model accuracy
  7. Detect drift in accuracy

Provision a Watson OpenScale service

In IBM Cloud Pak for Data

  1. Sign in to your IBM Cloud Pak for Data instance.
  2. Click the Services icon (add-ons-icon).
  3. Click the Watson OpenScale tile.
  4. Click Open.
  5. Click Auto setup.
  6. Choose whether to use the locally installed instance of Watson Machine Learning, or to use a remote instance.
  7. To use the local instance, set the Use Watson Machine Learning instance on the local environment checkbox, and click Next.
  8. Provide the Host name/IP address, Port, Username, Password, SSL option, and Database for your Db2 Warehouse, then click Prepare.
  9. Click Let’s go to tour the Watson OpenScale dashboard.

On IBM Cloud

  1. If you do not have an IBM Cloud account, register for a free trial account.
  2. Create a Watson OpenScale instance from the catalog.
  3. Select the Lite (Free) plan, enter a Service name, and click Create.
  4. Click Launch Application to start Watson OpenScale.
  5. Click Auto setup to automatically set up your Watson OpenScale instance with sample data.
    Demo welcome
  6. Click Start tour to tour the Watson OpenScale dashboard.

Take the guided tour

At the end of the auto setup, choose to take the tour. The guided tour introduces you to the Watson OpenScale user interface. As features are highlighted, read the pop-up and click Next to walk through the short demo. After you complete the guided tour, the rest of the tutorial gives you more freedom to explore.

OpenScale tour

Finishing the tour

After you finish the auto setup and exit the tour, you can either add your own model deployment to the dashboard or continue to explore the tutorial deployment.

Note: To add your own model to the dashboard, click Add to dashboard.

In the rest of this tutorial, we’ll explore the UI using the credit risk model that was deployed during the auto setup.

Examine model fairness

The Insights dashboard tab provides a high-level view of your deployment monitoring. This dashboard provides a summary of all deployments and a tile for each deployment. The auto setup configured a deployment for a German credit risk model, as shown in the following image.

Insight dashboard

Data for an individual deployment displays in a series of charts. The charts track metrics such as fairness, average requests per minute, and accuracy over days, weeks, or months.

  1. Select the Model Monitors tab.
  2. Select the German credit risk model tile to view more details about that deployment.
  3. Notice the red alert indicators. You should see a red indicator under Fairness for Sex (female). This indicates that there has been an alert for the Fairness monitor. Alerts are configurable based on thresholds for fairness outcomes, which can be set and altered as needed.
  4. Click on the Fairness score.

    Insight dashboard

  5. Hover over the chart to see the statistics for an individual hour.


  6. Click on the chart.

    gender bias

    The first view uses the Payload + Perturbed data set. This data set uses actual requests sent to the model, as well as perturbed data generated by Watson OpenScale to test the effect of altering certain feature values.

    Only 66% of the Female group received favorable outcomes (your numbers might vary), compared to 77% of the Male group.

    Notice the recommendation: Watson OpenScale has already created a model that is more fair!

  7. Click the Debiased radio button near the top to see how the debiased model performed.

    gender debiased
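The fairness numbers shown above can be understood as a group comparison of favorable-outcome rates. The following sketch computes that comparison over invented records (Watson OpenScale computes it over the payload and perturbed data); the 66%/77% split mirrors the example numbers, and the threshold shown is illustrative, not the product default for every deployment:

```python
# Sketch of the fairness calculation behind the numbers shown above:
# the rate of favorable outcomes ("No Risk") for the monitored group
# compared to the reference group. The records here are invented to
# reproduce the 66% vs 77% example.

def favorable_rate(records, group):
    """Fraction of a group's predictions that are favorable."""
    group_records = [r for r in records if r["sex"] == group]
    favorable = [r for r in group_records if r["prediction"] == "No Risk"]
    return len(favorable) / len(group_records)

records = (
    [{"sex": "female", "prediction": "No Risk"}] * 66
    + [{"sex": "female", "prediction": "Risk"}] * 34
    + [{"sex": "male", "prediction": "No Risk"}] * 77
    + [{"sex": "male", "prediction": "Risk"}] * 23
)

female_rate = favorable_rate(records, "female")  # 0.66
male_rate = favorable_rate(records, "male")      # 0.77

# Disparate impact ratio: values below the configured fairness
# threshold trigger an alert like the one on the dashboard.
disparate_impact = female_rate / male_rate
print(f"{female_rate:.0%} vs {male_rate:.0%}, ratio {disparate_impact:.2f}")
```

Because the alert threshold is configurable (as noted earlier), the same ratio can be acceptable in one deployment and raise an alert in another.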

Use a debiased model

Click View Debiased endpoint. Watson OpenScale provides a scoring endpoint for this debiased model. Example code snippets are provided to help you use this debiased endpoint in your apps for further testing or production.
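As a rough illustration of what calling such an endpoint can look like, the sketch below builds a Watson Machine Learning-style fields/values scoring request. The URL, token, and field names are placeholders (copy the real values and the exact payload schema from the View Debiased endpoint panel; your deployment's schema may differ):

```python
import json
import urllib.request

# Placeholder values -- copy the real endpoint URL and an access token
# from the "View Debiased endpoint" panel in Watson OpenScale.
DEBIASED_ENDPOINT = "https://example.com/v2/debiased_scoring"  # placeholder
ACCESS_TOKEN = "<your-token>"                                  # placeholder

# Scoring payload in the fields/values style: column names plus one
# row of values per applicant to score. Field names are illustrative.
payload = {
    "fields": ["CheckingStatus", "LoanAmount", "Age", "Sex"],
    "values": [["no_checking", 5000, 28, "female"]],
}

request = urllib.request.Request(
    DEBIASED_ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {ACCESS_TOKEN}",
    },
)
# urllib.request.urlopen(request) would send the request; it is left
# out here because the endpoint above is only a placeholder.
```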

Explain how a prediction was determined

To better understand and fix bias, you want to examine some of the individual transactions that contributed to bias.

  1. Click View transactions to view the individual transactions that contributed to bias.
  2. Use the radio button to select Biased transactions.

A list of transactions where the deployment has acted in a biased manner is shown. Click Explain for any of the transaction IDs to get details about that transaction in the Explainability tab.

transaction explainability

The Explainability tab shows how this prediction was determined.

How this prediction was determined: The GermanCreditRiskModelICP predicts Risk with 58.37% confidence. The following features were most important in determining this prediction: Age (23.90%), CheckingStatus (14.02%), and LoanAmount (10.26%).

The chart also shows the features that indicated Risk or No Risk. In this example, the focus is on the factors that contributed the most to the Risk prediction.
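The percentage contributions shown above can be thought of as normalized weights from a local explanation: each feature gets a signed weight, and the displayed percentage is its share of the total absolute weight. This is a simplified sketch with invented weights, not Watson OpenScale's exact algorithm:

```python
# Sketch: turning raw local-explanation weights into percentage
# contributions like those shown above. The weights are invented;
# negative weights push toward "Risk", positive toward "No Risk".
raw_weights = {
    "Age": -0.41,
    "CheckingStatus": 0.24,
    "LoanAmount": -0.18,
    "InstallmentPercent": 0.09,
}

total = sum(abs(w) for w in raw_weights.values())
contributions = {
    feature: round(100 * abs(weight) / total, 2)
    for feature, weight in raw_weights.items()
}

# Print features from most to least influential, with the outcome
# that each one supports.
for feature, pct in sorted(contributions.items(), key=lambda kv: -kv[1]):
    direction = "Risk" if raw_weights[feature] < 0 else "No Risk"
    print(f"{feature}: {pct}% (supports {direction})")
```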

Monitor model accuracy

Use quality monitoring to determine how well your model predicts outcomes. When quality monitoring is enabled, it generates a set of metrics every hour by default. You can generate these metrics on demand by clicking Check quality now.

quality monitor

You can review all metrics values over time on the Watson OpenScale dashboard. Instead of selecting a Fairness metric, use the left sidebar to select a Quality metric.

The following table describes the available metrics.

  • Area under ROC: Area under the curve of true positive rate (recall) versus false positive rate.
  • Area under PR: Area under the precision versus recall curve.
  • Accuracy: Proportion of correct predictions.
  • True positive rate (TPR): Proportion of positive-class instances that are predicted correctly.
  • False positive rate (FPR): Proportion of negative-class instances that are incorrectly predicted as positive.
  • Recall: Proportion of positive-class instances that are predicted correctly (equivalent to TPR).
  • Precision: Proportion of positive predictions that are correct.
  • F1-Measure: Harmonic mean of precision and recall.
  • Logarithmic loss: Mean of the negative logarithms of the predicted probabilities (confidence) for the actual classes; also known as expected log-likelihood.
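Apart from the two area-under-curve metrics and logarithmic loss, every metric in this table can be derived from a confusion matrix. The following sketch computes them in plain Python over invented labels, with Risk treated as the positive class:

```python
# Computing the core quality metrics from a confusion matrix.
# "Risk" is the positive class; the labels below are invented.
actual    = ["Risk", "Risk", "No Risk", "Risk", "No Risk", "No Risk", "Risk", "No Risk"]
predicted = ["Risk", "No Risk", "No Risk", "Risk", "Risk", "No Risk", "Risk", "No Risk"]

tp = sum(a == "Risk" and p == "Risk" for a, p in zip(actual, predicted))
fp = sum(a == "No Risk" and p == "Risk" for a, p in zip(actual, predicted))
fn = sum(a == "Risk" and p == "No Risk" for a, p in zip(actual, predicted))
tn = sum(a == "No Risk" and p == "No Risk" for a, p in zip(actual, predicted))

accuracy  = (tp + tn) / (tp + tn + fp + fn)   # proportion of correct predictions
precision = tp / (tp + fp)                    # correct among predicted positives
recall    = tp / (tp + fn)                    # correct among actual positives (TPR)
fpr       = fp / (fp + tn)                    # negatives wrongly flagged positive
f1        = 2 * precision * recall / (precision + recall)
```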

Detect drift in accuracy

Even if you start with a great training data set and create a fair and accurate model, over time your training data can become less representative of real-world decisions. Watson OpenScale uses drift detection to warn you and to help you update your model so that it stays relevant.

  • To see drift in accuracy and data consistency over time, use the left sidebar to select Drop in accuracy.

Watson OpenScale uses drift detection to alert you when there is a drift in accuracy or a drift in data consistency.

  • A drift in data consistency indicates that the runtime data is not consistent with the training data.
  • A drift in accuracy indicates an increase in transactions that are similar to those the model did not evaluate correctly during training.
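The data-consistency side of drift can be illustrated by comparing feature distributions between training data and runtime payload data. Watson OpenScale's actual drift monitor trains a drift-detection model; the sketch below, with invented data and an illustrative threshold, only shows the underlying idea for one categorical feature:

```python
from collections import Counter

def distribution(values):
    """Relative frequency of each category in a list of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)

# Invented CheckingStatus values: runtime traffic has shifted heavily
# toward applicants with low checking balances.
training_checking = ["no_checking"] * 60 + ["0_to_200"] * 30 + ["less_0"] * 10
runtime_checking  = ["no_checking"] * 20 + ["0_to_200"] * 30 + ["less_0"] * 50

drift_score = total_variation(
    distribution(training_checking), distribution(runtime_checking)
)
DRIFT_THRESHOLD = 0.1  # illustrative threshold, not a product default
print(f"drift score {drift_score:.2f}, alert={drift_score > DRIFT_THRESHOLD}")
```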


Transaction contribution to drift

To see the transactions that contributed to drift:

  1. Click on the chart at a point in time that shows a drop in accuracy.
  2. Click on Transactions responsible for drop in accuracy.

Watson OpenScale analyzes all transactions to find the ones that contribute to drift. It then groups the transactions based on the similarity of each feature’s contribution to the drift.

In each group, Watson OpenScale also estimates the important features that played a major role in the drift in accuracy and classifies their feature impact as large, some, or small.
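A much-simplified version of the grouping described above is to cluster the drifted transactions by the feature that contributed most to each one, then rate each group. The contributions, grouping rule, and impact cutoffs below are all invented for illustration and are not Watson OpenScale's actual algorithm:

```python
from collections import defaultdict

# Invented per-transaction feature contributions to drift.
transactions = [
    {"id": "t1", "contrib": {"Age": 0.7, "LoanAmount": 0.2}},
    {"id": "t2", "contrib": {"Age": 0.6, "LoanAmount": 0.3}},
    {"id": "t3", "contrib": {"LoanAmount": 0.8, "Age": 0.1}},
]

# Group each transaction under its single largest-contributing feature.
groups = defaultdict(list)
for txn in transactions:
    top_feature = max(txn["contrib"], key=txn["contrib"].get)
    groups[top_feature].append(txn["id"])

def impact_label(share):
    """Classify a group's share of drifted transactions (cutoffs invented)."""
    if share >= 0.5:
        return "large"
    return "some" if share >= 0.2 else "small"

for feature, ids in groups.items():
    share = len(ids) / len(transactions)
    print(feature, ids, impact_label(share))
```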


Fixing drift

After Watson OpenScale detects drift, you must build a new version of the model that fixes the problem. A good place to start is with the data points that are highlighted as reasons for the drift. Manually label the drifted transactions, and then use them to retrain the model.


This tutorial covered some of the features available in Watson OpenScale to help you detect and fix model issues with fairness, accuracy, and drift. The tutorial is part of the Getting started with Watson OpenScale learning path. To continue, look at the next step, Monitoring the model with Watson OpenScale.