Use OpenShift templates to install a data and AI platform

To bring many of IBM’s AI and data products into your container environment, either on-premises or in the cloud, you can deploy IBM Cloud Pak for Data on OpenShift. It includes (but is not limited to) Watson Studio, Watson Machine Learning, Db2 Warehouse, and Watson Assistant.

Why is it installed on OpenShift? OpenShift is a hybrid-cloud, enterprise Kubernetes application platform. IBM Cloud now offers it as a hosted solution, called Red Hat® OpenShift® on IBM Cloud™, or an on-premises platform as a service (PaaS). It is built around containers, orchestrated and managed by Kubernetes, on a foundation of Red Hat Enterprise Linux. Deploying Cloud Pak for Data onto OpenShift is IBM’s recommended approach.

In our work, we have found two ways to install IBM Cloud Pak for Data:

  1. Leverage IBM’s public Container Registry (icr.io). Access is password protected, but a few shell commands, which we’ll provide in a shell script, kick off IBM Cloud Pak for Data’s base install. The main installer image (cp4d-installer in the script in Step 3) pulls in all other images and sets up new pods and deployments as necessary.

  2. The second way is a bit more complex. Several binary files need to be downloaded (to a Linux environment). Then, using IBM Cloud Pak for Data tools, you push the files to a container registry that you define. There are a few other steps involved, and we’ll cover that in a separate tutorial.

After completing this tutorial, you’ll understand how to use OpenShift templates to install IBM Cloud Pak for Data on Red Hat OpenShift on IBM Cloud. Then you can access IBM AI and data tools either on-premises or in the cloud.

Prerequisites

  1. Access to the IBM Cloud Pak for Data installer on IBM’s Container Registry (icr.io).

  2. The OpenShift CLI (oc) at v3.11

    • Navigate to the OKD page and choose to download the oc Client Tools, following the install instructions on the page.

    • Alternatively, download and install the CLI with a few shell commands:

      # wget https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz
      # gzip -d openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz
      # tar -xvf openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar
      # cd openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit
      # cp kubectl oc /usr/local/bin/
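Once the client is installed, a quick check like the following can confirm that oc is on your PATH and is a v3.11 build. This is a minimal sketch: the helper name is_oc_311 is hypothetical, and it assumes that `oc version` prints a first line of the form `oc v3.11.0+0cbc58b`, as the 3.11 client typically does.

```shell
# Return success if the given "oc version" output reports a v3.11 client.
# Assumption: the output contains a line that starts with "oc v3.11.".
is_oc_311() {
  printf '%s\n' "$1" | grep -Eq '^oc v3\.11\.'
}

if command -v oc >/dev/null 2>&1; then
  if is_oc_311 "$(oc version 2>/dev/null)"; then
    echo "oc v3.11 client found"
  else
    echo "oc found, but it does not look like a v3.11 client"
  fi
else
  echo "oc not found on PATH"
fi
```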
      

Estimated time

Completing this tutorial should take about 45-60 minutes.

Step 1: Getting an OpenShift cluster

The first thing we need to do to install Cloud Pak for Data is to get an OpenShift cluster.

You can provision an OpenShift cluster from IBM Cloud in a few minutes. Ensure the cluster has at least 3 workers, 16 vCPUs, and 64 GB RAM. We used 3 workers, 16 vCPUs, and 128 GB RAM.
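After the CLI is logged in to the cluster (see Step 2), you can compare the cluster against these numbers from a terminal. The sketch below is illustrative only: summarize_nodes is a hypothetical helper, and it assumes that each node reports its CPU capacity as a whole number and its memory capacity in Ki. It prints totals across all visible nodes rather than deciding pass or fail, because nodes usually report slightly less memory than their nominal size.

```shell
# Summarize "cpu memoryKi" lines (one per node) read from stdin.
# Assumption: cpu is a whole number and memory ends in "Ki".
summarize_nodes() {
  awk '{ sub(/Ki$/, "", $2); n++; cpu += $1; mem += $2 }
       END { printf "nodes=%d vcpus=%d memory_gib=%d\n", n, cpu, mem / 1048576 }'
}

# Query the cluster only when oc is installed and logged in.
if command -v oc >/dev/null 2>&1 && oc whoami >/dev/null 2>&1; then
  oc get nodes -o jsonpath='{range .items[*]}{.status.capacity.cpu}{" "}{.status.capacity.memory}{"\n"}{end}' \
    | summarize_nodes
fi
```

The printed line has the form `nodes=N vcpus=N memory_gib=N`, with memory rounded down to whole GiB.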

[Figures: OpenShift Cluster Options, OpenShift Cluster Overview]

Step 2: Setting up OpenShift CLI

Now that we have our cluster provisioned and the CLI installed, we can log in to our cluster. Copy the oc login command by launching the OpenShift console and selecting Copy Login Command from the user profile menu.

[Figures: OpenShift Launch Console, Copy Login Command]
  1. Run the copied oc login command in a terminal:

    $ oc login https://c100-e.us-south.containers.cloud.ibm.com:30258 --token=some_token_from_the_openshift_console
    Logged into "https://c100-e.us-south.containers.cloud.ibm.com:30258" as "IAM#stevemar@ca.ibm.com" using the token provided.
    
    You have access to the following projects and can switch between them with 'oc project <projectname>':
    
      * default
        ibm-cert-store
        ibm-system
        kube-proxy-and-dns
        kube-public
        kube-service-catalog
        kube-system
        openshift
        openshift-ansible-service-broker
        openshift-console
        openshift-infra
        openshift-monitoring
        openshift-node
        openshift-template-service-broker
        openshift-web-console
    
    Using project "default".
    
  2. Try running oc get pods to ensure everything is working as expected.

    $ oc get pods
    NAME                                READY     STATUS    RESTARTS   AGE
    docker-registry-55c45555f8-7nwgn    1/1       Running   0          2h
    docker-registry-55c45555f8-wsxgf    1/1       Running   0          2h
    registry-console-584bc4cdb5-k6fp4   1/1       Running   0          1h
    router-64d5df8b-mgkvr               1/1       Running   0          2h
    router-64d5df8b-n6z76               1/1       Running   0          2h
    $
    

Step 3: Creating an IBM Cloud Pak for Data OpenShift template

  1. Create a new file called install-cp4data.sh and make it executable by running chmod 775 install-cp4data.sh or chmod +x install-cp4data.sh.

  2. Copy and paste this code into the file.

    #!/bin/bash
    
    NAMESPACE=zen
    
    DOCKER_REGISTRY="us.icr.io/release2_1_0_1_base"
    DOCKER_REGISTRY_USER="iamapikey"
    DOCKER_REGISTRY_PASS="an_icp_for_data_key"
    
    oc create ns ${NAMESPACE}
    oc project ${NAMESPACE}
    
    oc create sa -n ${NAMESPACE} tiller
    oc create sa -n ${NAMESPACE} icpd-anyuid-sa
    
    # Add `deployer` serviceaccount to `system:deployer` role to allow the template kickstart
    oc -n ${NAMESPACE} adm policy add-role-to-user -z deployer system:deployer
    
    # Create the secrets to pull images from the docker repository
    oc create secret docker-registry icp4d-anyuid-docker-pull -n "${NAMESPACE}" --docker-server="${DOCKER_REGISTRY}" --docker-username="${DOCKER_REGISTRY_USER}" --docker-password="${DOCKER_REGISTRY_PASS}" --docker-email=cp4data@ibm.com
    oc secrets -n ${NAMESPACE} link default icp4d-anyuid-docker-pull --for=pull
    oc secrets -n ${NAMESPACE} link tiller icp4d-anyuid-docker-pull --for=pull
    oc secrets -n ${NAMESPACE} link icpd-anyuid-sa icp4d-anyuid-docker-pull --for=pull
    
    # Set the security context constraints (SCC); one SCC is created for every namespace
    cat << EOF | oc apply -f -
    allowHostDirVolumePlugin: true
    allowHostIPC: true
    allowHostNetwork: false
    allowHostPID: false
    allowHostPorts: false
    allowPrivilegedContainer: false
    allowedCapabilities:
    - '*'
    allowedFlexVolumes: null
    apiVersion: v1
    defaultAddCapabilities: []
    fsGroup:
      type: RunAsAny
    groups:
    - cluster-admins
    kind: SecurityContextConstraints
    metadata:
      annotations:
        kubernetes.io/description: zenuid provides all features of the restricted SCC but allows users to run with any UID and any GID.
      name: ${NAMESPACE}-zenuid
    priority: 10
    readOnlyRootFilesystem: false
    requiredDropCapabilities: []
    runAsUser:
      type: RunAsAny
    seLinuxContext:
      type: MustRunAs
    supplementalGroups:
      type: RunAsAny
    users:
    - system:serviceaccount:${NAMESPACE}:default
    - system:serviceaccount:${NAMESPACE}:icpd-anyuid-sa
    volumes:
    - configMap
    - downwardAPI
    - emptyDir
    - persistentVolumeClaim
    - projected
    - secret
    EOF
    
    # Give cluster-admin permission to the service accounts used on the installation
    oc adm policy add-cluster-role-to-user cluster-admin "system:serviceaccount:${NAMESPACE}:tiller"
    oc adm policy add-cluster-role-to-user cluster-admin "system:serviceaccount:${NAMESPACE}:default"
    oc adm policy add-cluster-role-to-user cluster-admin "system:serviceaccount:${NAMESPACE}:icpd-anyuid-sa"
    
    # Set the template for the catalog
    cat << EOF | oc apply -f -
    ---
    apiVersion: template.openshift.io/v1
    kind: Template
    message: |-
      The following service(s) have been created in your project: ${NAMESPACE}.
            Username: admin
            Password: password
            Go to the *Applications* menu and select *Routes* to view the Cloud Pak for Data URL.
      For more information, see https://docs-icpdata.mybluemix.net/home.
    metadata:
      name: cp4data
      annotations:
        description: |-
          IBM Cloud Pak for Data is a native cloud solution that enables you to put your data to work quickly and efficiently.
        openshift.io/display-name: Cloud Pak for Data
        openshift.io/documentation-url: https://docs-icpdata.mybluemix.net/home
        openshift.io/long-description: IBM Cloud Pak for Data is composed of pre-configured microservices that run on a multi-node IBM Cloud Private cluster. The microservices enable you to connect to your data sources so that you can catalog and govern, explore and profile, transform, and analyze your data from a single web application.
        openshift.io/provider-display-name: Red Hat, Inc.
        openshift.io/support-url: https://access.redhat.com
        tags: AI, Machine Learning, Data Management, IBM
    objects:
    - apiVersion: v1
      kind: DeploymentConfig
      metadata:
        name: cp4data-installer
        annotations:
          template.alpha.openshift.io/wait-for-ready: "true"
      spec:
        replicas: 1
        selector:
          name: cp4data-installer
        strategy:
          type: Recreate
        template:
          metadata:
            labels:
              name: cp4data-installer
          spec:
            containers:
            - env:
              - name: NAMESPACE
                value: \${NAMESPACE}
              - name: TILLER_NAMESPACE
                value: \${NAMESPACE}
              - name: INSTALL_TILLER
                value: "1"
              - name: TILLER_IMAGE
                value: "${DOCKER_REGISTRY}/cp4d-tiller:v1"
              - name: TILLER_TLS
                value: "0"
              - name: STORAGE_CLASS
                value: \${STORAGE_CLASS}
              - name: DOCKER_REGISTRY
                value: ${DOCKER_REGISTRY}
              - name: DOCKER_REGISTRY_USER
                value: ${DOCKER_REGISTRY_USER}
              - name: DOCKER_REGISTRY_PASS
                value: \${DOCKER_REGISTRY_PASS}
              - name: NGINX_PORT_NUMBER
                value: \${NGINX_PORT_NUMBER}
              - name: CONSOLE_ROUTE_PREFIX
                value: \${CONSOLE_ROUTE_PREFIX}
              name: cp4data-installer
              image: "${DOCKER_REGISTRY}/cp4d-installer:v1"
              imagePullPolicy: Always
              resources:
                limits:
                  memory: "200Mi"
                  cpu: 1
              command: [ "/bin/sh", "-c" ]
              args: [ "./deploy-cp4data.sh; sleep 3000000" ]
            imagePullSecrets:
            - name: icp4d-anyuid-docker-pull
    parameters:
    - description: Namespace where to install Cloud Pak for Data.
      displayName: Namespace
      name: NAMESPACE
      required: true
      value: ${NAMESPACE}
    - description: Docker registry user with permission to pull images.
      displayName: Docker Registry User
      name: DOCKER_REGISTRY_USER
      value: ${DOCKER_REGISTRY_USER}
      required: true
    - description: Docker registry password.
      displayName: Docker Registry Password
      name: DOCKER_REGISTRY_PASS
      required: true
      value: ${DOCKER_REGISTRY_PASS}
    - description: Hostname for the external route.
      displayName: Cloud Pak route hostname
      name: CONSOLE_ROUTE_PREFIX
      required: true
      value: "cp4data-console"
    - description: Storage class name.
      displayName: StorageClass
      name: STORAGE_CLASS
      value: "ibmc-file-retain-custom"
      required: true
    
    EOF
    
  3. Replace the placeholder value of the DOCKER_REGISTRY_PASS variable with your own registry key.

  4. Run the script. The script will create a new namespace called zen and add a new Cloud Pak for Data template to the OpenShift catalog.

    ./install-cp4data.sh
    namespace/zen created
    Now using project "zen" on server "https://c100-e.us-south.containers.cloud.ibm.com:30258".
    serviceaccount/tiller created
    serviceaccount/default created
    serviceaccount/icpd-anyuid-sa created
    role "system:deployer" added: "deployer"
    secret/icp4d-anyuid-docker-pull created
    securitycontextconstraints.security.openshift.io/zen-zenuid created
    cluster role "cluster-admin" added: "system:serviceaccount:zen:tiller"
    cluster role "cluster-admin" added: "system:serviceaccount:zen:default"
    cluster role "cluster-admin" added: "system:serviceaccount:zen:icpd-anyuid-sa"
    template.template.openshift.io/cp4data created
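One detail of the script in this step is easy to miss: inside the unquoted heredocs, a plain ${NAME} is expanded by the shell when the script runs, while an escaped \${NAME} is written out literally so that OpenShift can treat it as a template parameter later. The following minimal sketch shows the difference without touching a cluster. The render function and the two key names are made up for illustration.

```shell
NAMESPACE=zen

# Print a two-line document from an unquoted heredoc.
# The first reference is expanded by the shell; the escaped one is not.
render() {
  cat << EOF
shell_expanded: ${NAMESPACE}
template_param: \${NAMESPACE}
EOF
}

render
```

The first line comes out with the value zen filled in, while the second still contains the literal text ${NAMESPACE}. This is why values such as DOCKER_REGISTRY are baked into the template when the script runs, whereas STORAGE_CLASS and CONSOLE_ROUTE_PREFIX can still be changed when the template is instantiated.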
    

Step 4: Installing IBM Cloud Pak for Data by running the new template

  1. In the OpenShift console, choose to view all projects.

    openshift-view-all-projects

  2. Choose the newly created zen project.

    openshift-project-zen

  3. Click on Browse Catalog.

    openshift-browse-catalog

  4. Select the Other tab, and choose Cloud Pak for Data.

    openshift-catalog-cp4d

  5. On the first panel, click Next to start the configuration.

    cp4d-install-1

  6. On the second panel, enter your Docker registry password. We recommend choosing ibmc-file-retain-custom as the Storage Class.

    cp4d-install-2

  7. The last panel should start the deployment.

    cp4d-install-3

  8. To follow the installation logs, go to Applications > Pods and find the pod prefixed with cp4data-installer.

    icp4d-all-pods

Important: If the deployment fails for any reason, delete the cp4data-installer pod. The pod will restart, check the installed modules, and continue the installation.
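If you prefer the terminal to the web console, the same two actions (following the installer logs and deleting the installer pod) might look like the sketch below. It assumes the zen namespace and the cp4data-installer DeploymentConfig and pod label defined in the template from Step 3. The run wrapper and the DRY_RUN switch are hypothetical conveniences: by default the commands are only printed, not executed.

```shell
NAMESPACE=zen
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stream the installer logs through its DeploymentConfig.
follow_installer_logs() {
  run oc logs -f dc/cp4data-installer -n "$NAMESPACE"
}

# Delete the installer pod by its template label so that it is recreated.
restart_installer() {
  run oc delete pod -l name=cp4data-installer -n "$NAMESPACE"
}

follow_installer_logs
restart_installer
```

Selecting the pod by its label avoids having to look up the generated pod name first.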

Step 5: Log into the IBM Cloud Pak for Data console

  1. Ensure that all pods are up and running or have completed successfully. In particular, check the cp4data-installer pod’s Logs and Events.

    icp4d-pod-overview

  2. Once the deployment is complete, go to Applications > Routes to see the URL to access the Cloud Pak for Data console.

    icp4d-route

  3. Click the route to launch Cloud Pak for Data! By default the credentials are admin/password.

    icp4d-login
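The pod check and the route lookup in this step can also be done from the CLI. The following is a rough sketch rather than an official procedure: unhealthy_pods is a hypothetical helper that assumes the usual `oc get pods --no-headers` column order (NAME, READY, STATUS, RESTARTS, AGE), and zen is the namespace created in Step 3.

```shell
# Print the names of pods whose STATUS is neither Running nor Completed.
# Expects `oc get pods --no-headers` lines on stdin.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Query the cluster only when oc is installed and logged in.
if command -v oc >/dev/null 2>&1 && oc whoami >/dev/null 2>&1; then
  # No output from this pipeline means every pod is Running or Completed.
  oc get pods -n zen --no-headers | unhealthy_pods

  # List the route hostnames that expose the Cloud Pak for Data console.
  oc get routes -n zen -o jsonpath='{range .items[*]}{.spec.host}{"\n"}{end}'
fi
```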

Summary

Now that you’ve installed IBM Cloud Pak for Data on Red Hat OpenShift on IBM Cloud with OpenShift templates, you can try out our IBM Cloud Pak for Data code pattern to get started on your AI journey.

Steve Martinelli
Scott D’Angelo