Article
Kubeflow Pipelines Overview
Create, deploy, and manage machine learning workflows on Kubernetes
Archived content
Archive date: 2024-06-25
This content is no longer being updated or maintained. The content is provided “as is.” Given the rapid evolution of technology, some content, steps, or illustrations may have changed.

Kubeflow Pipelines on Tekton is an open-source platform that allows users to create, deploy, and manage machine learning workflows on Kubernetes. In Kubeflow Pipelines, a pipeline is a definition of a workflow that composes one or more components together to form a computational directed acyclic graph (DAG). The Kubeflow Pipelines SDK provides a set of Python packages that you can use to compose ML tasks and operations into an ML pipeline.
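To make the DAG idea concrete, here is a minimal pure-Python sketch (standard library only, not the Kubeflow Pipelines SDK itself) of how pipeline steps with dependencies resolve to a valid execution order; the step names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical pipeline steps mapped to their upstream dependencies,
# mirroring how Kubeflow Pipelines composes components into a DAG.
steps = {
    "prepare_data": set(),
    "train_model": {"prepare_data"},
    "evaluate_model": {"train_model"},
    "deploy_model": {"evaluate_model"},
}

# A topological sort yields an order in which each step runs
# only after all of its dependencies have completed.
order = list(TopologicalSorter(steps).static_order())
print(order)
# → ['prepare_data', 'train_model', 'evaluate_model', 'deploy_model']
```

Kubeflow Pipelines infers exactly this kind of ordering from the data dependencies between components, then schedules each step as a container on Kubernetes.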
Kubeflow Pipelines provides a way to define and execute complex pipelines consisting of interconnected steps, enabling users to orchestrate the entire machine learning process from data preparation to model training, evaluation, and deployment. Kubeflow Pipelines is the core automation tool for connecting the whole AI development lifecycle to production, and it offers the following benefits:
- For DevOps professionals, Kubeflow Pipelines taps into the Kubernetes and OpenShift ecosystem, leveraging its scalability, security, and containerization principles.
- For data scientists and MLOps practitioners, Kubeflow Pipelines offers a Python interface to define and deploy pipelines, enabling data passing, metadata collection, and lineage tracking.
- For DataOps professionals, Kubeflow Pipelines supports multiple ETL components and use cases, enabling closer collaboration with data science and engineering peers.
Kubeflow Pipelines provides:
- A user interface (UI) for managing and tracking experiments, jobs, and runs
- An engine for scheduling multi-step ML workflows
- An SDK for defining and manipulating pipelines and components
Additionally, Kubeflow Pipelines provides easy ways to integrate ML services and components such as PyTorch, Hugging Face models, and KServe ModelMesh for production AI workflows. Using Kubeflow Pipelines, AI developers can collaborate, adopt the latest AI technologies such as large language models, and implement AI capabilities like prompt tuning.
Kubeflow Pipelines capabilities
Kubeflow Pipelines provides workflow orchestration that's highly customizable, portable, and scalable. Customizability allows developers to run arbitrary code on the platform; that code is then portable to anyone using the same platform, and workloads scale with native Kubernetes on any cloud.
Kubeflow Pipelines gives developers an intuitive way to program their workflows using the Python SDK and to manage workflows with a visual dashboard that supports experiment tracking, pipeline versioning, and metadata with lineage tracking. It supports complex logic such as parallel loops, recursion, caching, and asynchronous waits to model the complex AI development cycle in a cloud-native environment.
Large Language Models and Prompt Tuning
Large Language Models (LLMs) are language models consisting of artificial neural networks with billions of parameters to generate human-like text. LLMs have gained much popularity because they can accomplish various tasks across industries.
Because LLMs must be trained on large quantities of data, it takes significant resources to build or modify them for a specific use case. As a result, many people use LLMs without changing the core model, instead supplying a carefully crafted prompt. This approach is called prompt engineering, a largely manual process of trial and error. Newer techniques such as prompt tuning, p-tuning, and prefix tuning have therefore been developed to steer model behavior more systematically.
While Kubeflow Pipelines is a powerful tool for connecting all of an organization's AI services in one place, its true potential lies in letting teams experiment with the latest AI trends without platform restrictions. LLMs and prompt tuning are still very new concepts, and many of the underlying technologies change daily, so Kubeflow Pipelines is a natural fit: multiple teams can plug in new code and dependencies at the same time in an automated fashion.
Even a simple LLM development cycle could involve multiple technology stacks with several dedicated teams. With the power of Kubeflow Pipelines, all these technologies can be connected together, which helps organizations to track AI projects in one place easily.
Summary and next steps
With its visual dashboard, versioning capabilities, and integration with Kubernetes, Kubeflow Pipelines simplifies the development and deployment of reproducible and scalable machine learning workflows.
IBM has invested heavily in making Kubeflow Pipelines compatible with Tekton and OpenShift Pipelines, providing additional security for enterprise users running on OpenShift. Watson Pipelines and Red Hat OpenShift Data Science are the two main products running production Kubeflow Pipelines on OpenShift. You can use Watson Pipelines on watsonx.ai to orchestrate the entire AI lifecycle in production. Watsonx.ai brings together new generative AI capabilities, powered by foundation models, and traditional machine learning into a powerful AI platform that spans the AI lifecycle. With watsonx.ai, you can train, validate, tune, and deploy models with ease and build AI applications in a fraction of the time with a fraction of the data.
Try watsonx.ai, the next-generation studio for AI builders. Explore more articles and tutorials about watsonx on IBM Developer.