Automate Kubeflow deployment to IBM Cloud with Schematics

Archived content

Archive date: 2024-06-25

This content is no longer being updated or maintained. The content is provided “as is.” Given the rapid evolution of technology, some content, steps, or illustrations may have changed.

In this tutorial, learn how to quickly deploy a Kubeflow cluster in IBM Cloud for multiple users, then run a sample pipeline to understand how various Kubeflow components work together to support machine learning operations.

Prerequisites

To use this tutorial, you need to prepare the following in IBM Cloud:

An IBM Cloud account (with the ability to create an IBM Kubernetes cluster, either classic or VPC Gen2)
An IBM Cloud API key
An organization
A space

You use the following tools:

Estimated time

It should take you approximately 2 hours to complete the tutorial, including time for the automation to complete.

What does the automation do?

When deploying to a classic cluster, the script and description are in this folder. The automation will:

Create a classic cluster
Create an IBM AppID service instance
Deploy Kubeflow for multi-user into the classic cluster
Configure Kubeflow for multi-user with AppID service

When deploying to a VPC Gen2 cluster, the script and description are in this folder. The automation will:

Create a VPC
Create a VPC Gen2 cluster
Create a VPC subnet
Create a VPC public gateway for the VPC cluster and subnet
Create an IBM AppID service instance
Deploy Kubeflow for multi-user into the VPC Gen2 cluster
Configure Kubeflow for multi-user with AppID service
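
The two flows share the AppID and Kubeflow steps and differ only in the infrastructure they provision. As a rough illustration, the two lists above can be compared in plain Python. This is only a sketch of the lists as data, not the Terraform code itself; the names CLASSIC_STEPS, VPC_GEN2_STEPS, and normalize are hypothetical.

```python
# Steps copied from the two lists above; the variable names are illustrative only.
CLASSIC_STEPS = [
    "Create a classic cluster",
    "Create an IBM AppID service instance",
    "Deploy Kubeflow for multi-user into the classic cluster",
    "Configure Kubeflow for multi-user with AppID service",
]

VPC_GEN2_STEPS = [
    "Create a VPC",
    "Create a VPC Gen2 cluster",
    "Create a VPC subnet",
    "Create a VPC public gateway for the VPC cluster and subnet",
    "Create an IBM AppID service instance",
    "Deploy Kubeflow for multi-user into the VPC Gen2 cluster",
    "Configure Kubeflow for multi-user with AppID service",
]


def normalize(step: str) -> str:
    """Replace the cluster flavor with a neutral word so shared steps compare equal."""
    return step.replace("VPC Gen2 cluster", "cluster").replace("classic cluster", "cluster")


classic_normalized = {normalize(step) for step in CLASSIC_STEPS}

# Steps that exist in both flows once the cluster flavor is ignored.
shared = [step for step in VPC_GEN2_STEPS if normalize(step) in classic_normalized]

# Steps that only the VPC Gen2 flow performs (the networking resources).
vpc_only = [step for step in VPC_GEN2_STEPS if step not in shared]

for step in vpc_only:
    print("VPC Gen2 only:", step)
```

Under this comparison, the VPC Gen2 flow adds three networking steps (the VPC, the subnet, and the public gateway) on top of what the classic flow does.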

Steps

Step 1. Set up prerequisites

To set up the prerequisites:

Create an IBM Cloud account with the ability to create classic or VPC Gen2 clusters.
Log in to your account, and select Manage -> Access (IAM).
Create an IBM Cloud API key.
Select Manage -> Account, then click Cloud Foundry orgs to create an organization.
Name the organization, and click Save.
Add a space to your newly created organization.
Select a region, name the space, then click Save.

Step 2. Create and configure the Schematics workspace

Log in to your account.
From the IBM Cloud catalog, select the Schematics service.
Create a Schematics workspace, give it a name (keeping the default for the other fields), and click Create.
To deploy Kubeflow on a classic cluster, use the following settings to configure your workspace:
- GitHub resource URL: https://github.com/IBM/auto-kubeflow/tree/main/terraform/iks-classic/
- Terraform version: terraform_v0.14
- Save the template information. It should look similar to the following image.
- Update the variable values.
  - ibmcloud_api_key: The IBM Cloud API key created in Step 1.
  - org: The organization created in Step 1.
  - space: The space created in Step 1.
  - appid_plan (optional): graduated-tier.
  - cluster_name (optional): The prefix for your cluster name. The final cluster name is <prefix>-xxxx, where xxxx is four randomly generated characters.
  - zone: The zone where the cluster will be created. You can use ibmcloud ks zone ls --provider classic to get the available zones.
  - public_vlan_id: The public VLAN ID. You can use ibmcloud ks vlan ls --zone <zone> to list the available public VLANs.
  - private_vlan_id: The private VLAN ID. You can use ibmcloud ks vlan ls --zone <zone> to list the available private VLANs.
To deploy Kubeflow on a VPC Gen2 cluster, use the following settings to configure your workspace:
- GitHub resource URL: https://github.com/IBM/auto-kubeflow/tree/main/terraform/iks-vpc-gen2
- Terraform version: terraform_v0.14
- Save the template information.
- Update the variable values.
  - ibmcloud_api_key: The IBM Cloud API key created in Step 1.
  - org: The organization created in Step 1.
  - space: The space created in Step 1.
  - appid_plan (optional): graduated-tier.
  - cluster_name (optional): The prefix for your cluster name.
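
Before pasting values into the workspace, it can help to check that nothing required is missing. The following is a minimal sketch, assuming that every variable not marked "(optional)" in the lists above is required. The helper check_variables, the cluster-type labels "classic" and "vpc-gen2", and the sample values are hypothetical and are not part of Schematics or the auto-kubeflow repository.

```python
# Variable names come from the lists above; everything else is illustrative.
# The cluster-name prefix is spelled cluster_name here, as in the VPC Gen2 list.
REQUIRED = {
    "classic": [
        "ibmcloud_api_key",
        "org",
        "space",
        "zone",
        "public_vlan_id",
        "private_vlan_id",
    ],
    "vpc-gen2": ["ibmcloud_api_key", "org", "space"],
}
OPTIONAL = ["appid_plan", "cluster_name"]


def check_variables(cluster_type: str, values: dict) -> list:
    """Return a list of problems; an empty list means the values look complete."""
    if cluster_type not in REQUIRED:
        return [f"unknown cluster type: {cluster_type}"]

    problems = []
    for name in REQUIRED[cluster_type]:
        value = values.get(name)
        if value is None or not str(value).strip():
            problems.append(f"missing required variable: {name}")

    allowed = set(REQUIRED[cluster_type]) | set(OPTIONAL)
    for name in values:
        if name not in allowed:
            problems.append(f"unexpected variable: {name}")
    return problems


# Placeholder values only; never hard-code a real API key.
example = {"ibmcloud_api_key": "<your-api-key>", "org": "my-org", "space": "dev"}
print(check_variables("vpc-gen2", example))  # → []
print(check_variables("classic", example))   # reports the missing zone and VLAN IDs
```

The same three values are enough for the VPC Gen2 workspace, while the classic workspace additionally needs the zone and the two VLAN IDs.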

Step 3. Generate and execute deployment plan

After the workspace is configured, you generate and run the Terraform plan.

Click Generate plan, and watch the generated Terraform execution plan using Activity -> View log.
Click Apply plan, and watch the deployment progress using Activity -> View log.
- Note that it takes approximately 35 minutes to deploy and configure the entire Kubeflow stack in a classic cluster, or approximately 75 minutes in a VPC Gen2 cluster.
- You should see eight outputs and a Command finished successfully message at the bottom of the log, which should look similar to the following image.
Verify the deployment by checking the deployed resources on the Resources page. You can access the Kubeflow UI at https://<cluster_hostname>, where <cluster_hostname> is one of the outputs in the Apply plan log.
Log in as User A by using an existing Google or Facebook account. Click Login with Google or Login with Facebook.
Click Start Setup on the Kubeflow Welcome page.
Keep the default namespace, and click Finish. The initial UI should look similar to the following image.
Log in as User B by using a new account with a different email.
1. Click Sign up!
2. Enter an email and password for the new account.
3. Open the notification email that is sent to your email account, and click Verify.
4. Log in with the email and password.
5. Click Start Setup on the Kubeflow Welcome page.
6. Keep the default namespace, which should be different from User A, and click Finish.
Run the demo pipeline. You can experiment with various Kubeflow features in your own namespace. The following is a pipeline example.
1. Create an experiment for Kubeflow Pipelines in Experiments (KFP), and skip the step when prompted to start a run.
2. Click [Demo] flip-coin, observe the flow, and click Create run.
3. Choose the experiment that you created earlier, and click Start.
4. Check the run, Run of [Demo] flip-coin (xxxxx), and observe the run graph, which should look similar to the following image.
5. Click each graph node to see more details and to understand how a pipeline works.
6. Explore other Kubeflow features as desired.
Clean up the resources (optional). After you are done with the Kubeflow experiment, you can clean up the cluster and other resources by selecting Action... -> Delete and choosing the Delete all associated resources option. As long as the workspace is not deleted, you can always go back and deploy a new cluster.
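
The Kubeflow UI address used in the verification step is simply the cluster_hostname output prefixed with https://. A minimal sketch, assuming you have copied the outputs from the Apply plan log into a Python dictionary yourself: the function kubeflow_url and the sample hostname are hypothetical, and only the output name cluster_hostname comes from this tutorial.

```python
def kubeflow_url(outputs: dict) -> str:
    """Build the Kubeflow UI address from the cluster_hostname output."""
    hostname = str(outputs.get("cluster_hostname") or "").strip()
    if not hostname:
        raise ValueError("cluster_hostname output is missing or empty")

    # Tolerate a hostname that was pasted with a scheme or a trailing slash.
    for prefix in ("https://", "http://"):
        if hostname.startswith(prefix):
            hostname = hostname[len(prefix):]
    return f"https://{hostname.rstrip('/')}"


# Placeholder hostname for illustration; use the value from your own log.
print(kubeflow_url({"cluster_hostname": "kubeflow.example.com"}))
# → https://kubeflow.example.com
```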

Summary

With this automation, it's easy to deploy a stack of Kubeflow components into an IBM Kubernetes cluster, classic or VPC Gen2, for multiple users. You can see the demo pipeline running, and you can look at other Kubeflow features such as Notebook Servers and Katib (hyperparameter tuning) to become more familiar with this cloud-native platform for machine learning operations.