
Introduction to IBM Cloud Pak for Data

This article is part of the Getting started with IBM Cloud Pak for Data learning path.

For many industries, the journey to AI is a long-term strategy that’s only beginning. In this learning path, we’ll examine the case of a Telco company. We’ll look at the process of collecting data, which can reside on multiple clouds, in various database formats, and with various needs for access control. In our Telco, we’ll show how to organize the data with visualizations and other tools. Next, we’ll look at the case of customer churn, and create a machine learning model that helps us to predict the risk that our Telco’s clients will leave. Finally, we’ll analyze the Telco’s deployment of the machine learning model by looking at the model’s performance, explainability, and fairness.

IBM Cloud Pak for Data is a unified, pre-integrated data and AI platform that runs natively on Red Hat OpenShift Container Platform. Its services are delivered on an open, extensible, cloud-native platform for collecting, organizing, and analyzing data. It provides a single interface for end-to-end analytics with built-in governance, and it supports and governs the end-to-end AI workflow.

Collect your data

  • Make all your data accessible, securely and at its source, without the need for migration.
  • Connect to all data and eliminate data silos.

Organize your data

  • Create a trusted, business-ready analytics foundation that can simplify data preparation, policy, security and compliance.
  • Govern and automate data and the AI lifecycle.

Analyze your data

  • Build, deploy and manage AI and machine-learning capabilities that scale consistently throughout your organization.

Infuse AI

  • Operationalize AI throughout your business with trust and transparency.

  • Run anywhere with agility and avoid vendor lock-in.

IBM Cloud Pak for Data offers a prescriptive approach to accelerate the journey to AI: the AI Ladder, developed to help clients drive digital transformation in their business, no matter where they are on their journey. IBM Cloud Pak for Data brings together all the critical cloud, data, and AI capabilities as containerized microservices to deliver the AI Ladder in a multi-cloud platform.

Take a product walkthrough

IBM Cloud Pak for Data can help you unlock the value of your data and create an information architecture for AI. This product walkthrough offers step-by-step demonstrations of how to collect, organize, and analyze your data, and how to infuse AI into it, on a scalable Kubernetes platform.


IBM Cloud Pak for Data is composed of pre-configured microservices that run on a multi-node IBM Cloud Private cluster. The microservices enable you to connect to your data sources so that you can catalog and govern, explore and profile, transform, and analyze your data from a single web application.

IBM Cloud Pak for Data is deployed on a multi-node Kubernetes cluster, using Red Hat OpenShift. Although you can deploy IBM Cloud Pak for Data on a 3-node cluster, it is strongly recommended that you deploy your production environment on a cluster with at least 6 nodes for better performance, better cluster stability, and increased ease of scaling the cluster to support workload growth.
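
The sizing guidance above can be expressed as a small check. This is only an illustrative sketch: the function name, the constants, and the returned labels are hypothetical and are not part of any IBM Cloud Pak for Data tooling. The only facts taken from the text are the 3-node and 6-node figures. Treating anything smaller than 3 nodes as outside the guidance is an assumption, since the article does not discuss smaller clusters.

```python
# Hypothetical helper: thresholds come from the article text,
# everything else (names, labels) is an illustrative assumption.
SMALLEST_CLUSTER_MENTIONED = 3      # "you can deploy ... on a 3-node cluster"
RECOMMENDED_PRODUCTION_NODES = 6    # "at least 6 nodes" for production


def assess_cluster_size(node_count: int) -> str:
    """Classify a planned cluster size against the article's guidance."""
    if node_count < 1:
        raise ValueError("node_count must be a positive integer")
    if node_count >= RECOMMENDED_PRODUCTION_NODES:
        return "meets production recommendation"
    if node_count >= SMALLEST_CLUSTER_MENTIONED:
        return "deployable, below production recommendation"
    # Assumption: the article does not cover clusters smaller than 3 nodes.
    return "smaller than any size discussed"


if __name__ == "__main__":
    for nodes in (2, 3, 6):
        print(nodes, "->", assess_cluster_size(nodes))
```

A check like this could run before provisioning to flag a production plan that falls short of the 6-node recommendation.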


This article introduced IBM Cloud Pak for Data, some terms and concepts, a product walkthrough, and an architectural overview. This article is part of the Getting started with IBM Cloud Pak for Data learning path. To continue the series and learn more about IBM Cloud Pak for Data, take a look at the next tutorial, Virtualizing Db2 Warehouse data with data virtualization.

Scott Dangelo
Clarinda Mascarenhas