This tutorial explains how to add trusted AI packages to a custom Anaconda channel and use them inside IBM Watson® Studio on IBM Cloud Pak® for Data to solve a business use case.
Learning objectives
The use case is a binary classification problem: predicting whether a mortgage application is approved. Several models are evaluated along multiple trust dimensions, including predictive performance, fairness, and adversarial robustness.
When using the Anaconda central-enterprise repository, you can:
- Get access to more than 7,500 Python and R packages
- Add your own proprietary packages
- Manage package vulnerability, access, and usage — providing a secure pipeline
- Easily distribute packages across user workflows
We will combine these features to develop a secure application.
Steps
- Install IBM Cloud Pak for Data.
- Install the Anaconda repository for IBM Cloud Pak for Data. Store the link to the Anaconda user management UI (Keycloak) and login credentials.
- Use the login credentials displayed to open the Anaconda Team Edition landing page.
- Create channels and set appropriate permissions.
- Since trusted AI packages are not available by default, upload the package files (wheel, sdist, egg) to the Anaconda channel.
- Configure Anaconda in IBM Cloud Pak for Data to access packages from your Anaconda repository.
- Configure the Watson Studio environment to use the custom channel.
- Create a notebook in the custom environment and run it.
Trusted AI packages include:
- AIF360, which helps detect and mitigate bias in machine learning models throughout the AI application lifecycle
- AIX360, offering interpretability and explainability of data and machine learning models
- ART, a Python library for machine learning security that covers evasion, poisoning, extraction, and inference attacks
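To make the bias-detection idea concrete, here is a minimal sketch of one widely used fairness metric, disparate impact, written in plain Python. It does not use the AIF360 API; the function names, group labels, and decisions are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def approval_rate(approved: Sequence[int]) -> float:
    """Fraction of applications approved (1 = approved, 0 = denied)."""
    return sum(approved) / len(approved)


def disparate_impact(unprivileged: Sequence[int], privileged: Sequence[int]) -> float:
    """Ratio of approval rates: unprivileged group over privileged group.

    A value near 1.0 means both groups are approved at similar rates;
    a value well below 1.0 means the unprivileged group is approved less often.
    """
    return approval_rate(unprivileged) / approval_rate(privileged)


# Hypothetical model decisions for two applicant groups (illustrative data only).
group_a = [1, 0, 0, 1, 0, 0, 0, 1]  # unprivileged: 3 of 8 approved
group_b = [1, 1, 0, 1, 1, 0, 1, 1]  # privileged: 6 of 8 approved

ratio = disparate_impact(group_a, group_b)
print(f"Disparate impact: {ratio:.2f}")  # prints "Disparate impact: 0.50"
```

Libraries such as AIF360 compute this kind of metric, and many others, directly on labeled datasets, so a hand-written version like this is useful mainly for understanding what the number means.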
Steps
Step 1. Use the credentials
The installation output lists the login credentials in a form like this:
User anaconda created, realm=dev, roles=admin
password: xxxx
User admin created, realm=master, roles=admin
password: xxx
…
…
…
Install success
Please login using admin username and generated admin password above at http://xx.xx.xx.xx/auth/admin
Step 2. Create a new channel
Create a channel by clicking Create Channel, filling in the details, and setting the privacy level. In this tutorial, the channel is named trusted_ai.
Step 3. Upload files
Next, you will upload package files to the trusted_ai channel.
Package | Wheel File location |
---|---|
AI Fairness 360 | https://pypi.org/project/aif360/#files |
AI Explainability 360 | https://pypi.org/project/aix360/#files |
Adversarial Robustness Toolbox | https://pypi.org/project/adversarial-robustness-toolbox/#files |
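Before uploading a downloaded wheel to the channel, it can be worth confirming that the file matches the SHA-256 digest that PyPI lists for it. The following is a minimal sketch using only the standard library; the helper names and the sample file name are hypothetical, and the demo hashes a placeholder file rather than a real package.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the digest copied from the package index."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo with a placeholder file; in practice, point this at the downloaded wheel
# and paste the SHA-256 value shown on the package's PyPI files page.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "example-0.0.0-py3-none-any.whl"  # hypothetical file name
    sample.write_bytes(b"placeholder contents")
    expected = hashlib.sha256(b"placeholder contents").hexdigest()
    print(matches_expected(sample, expected))  # prints True
```

Reading the file in chunks keeps memory use flat, which matters for larger wheels.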
After the upload completes, the packages are listed in the trusted_ai channel.
You can view the vulnerability scores of the packages, groups using this channel, dependencies for each package, and more. This provides a way to create secure managed pipelines.
Step 4. Access channel packages
Update the condarc file for IBM Cloud Pak for Data so that packages are resolved from the new channel.
channel_alias: http://xx.xx.xx.xx/api/repo
channels:
- trusted_ai
default_channels:
- http://xx.xx.xx.xx/api/repo/trusted_ai
ssl_verify: False
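Because the channel name appears in two places in this file, a typo in either one leaves conda unable to find the packages. One way to avoid that is to generate the file contents from a single channel name. The sketch below is an assumption about how you might do this, not part of the product; `render_condarc` is a hypothetical helper, the host placeholder is left as in the example above, and the channel name should match the one created in Step 2 (trusted_ai in this tutorial).

```python
def render_condarc(repo_host: str, channel: str, ssl_verify: bool = False) -> str:
    """Build condarc text that points conda at one channel on the repository."""
    base = f"http://{repo_host}/api/repo"
    lines = [
        f"channel_alias: {base}",
        "channels:",
        f"  - {channel}",
        "default_channels:",
        f"  - {base}/{channel}",
        f"ssl_verify: {ssl_verify}",
    ]
    return "\n".join(lines) + "\n"


# Replace the placeholder host with the address of your Anaconda repository.
condarc_text = render_condarc("xx.xx.xx.xx", "trusted_ai")
print(condarc_text)
```

Setting `ssl_verify` to False, as in the example above, disables certificate checks; that is reasonable for a test cluster over plain HTTP but worth revisiting for production use.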
Step 5. Create a Watson Studio project
Log in to your Cloud Pak for Data cluster, create a new Watson Studio project, and create a new environment in it.
Step 6. Update definition
Update the environment definition to use the custom channel: under Customization, click Create, edit the definition following the instructions shown, and apply the changes.
Step 7. Run using new environment
Upload the mortgage approval notebook and run it using the new environment.
The notebook will start by applying the customizations. Now you can run the notebook without installing trusted AI packages.
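A quick way to confirm that the environment picked up the packages is to check, in the first notebook cell, whether their import names can be found. This is a minimal sketch using the standard library; the mapping and the `check_modules` helper are illustrative, and the import names (`aif360`, `aix360`, `art`) are the top-level modules these packages install.

```python
import importlib.util

# Display name -> top-level import name for each trusted AI package.
TRUSTED_AI_MODULES = {
    "AI Fairness 360": "aif360",
    "AI Explainability 360": "aix360",
    "Adversarial Robustness Toolbox": "art",
}


def check_modules(modules: dict) -> dict:
    """Return {display name: True/False} depending on whether each module is importable."""
    return {
        label: importlib.util.find_spec(name) is not None
        for label, name in modules.items()
    }


for label, available in check_modules(TRUSTED_AI_MODULES).items():
    print(f"{label}: {'available' if available else 'missing'}")
```

If any package is reported as missing, the environment customization or the condarc channel settings are the first places to look.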
Summary
This tutorial has shown how to add trusted AI packages to the custom Anaconda channel, create a project, and use the packages inside IBM Watson Studio on IBM Cloud Pak® for Data to solve a business use case.