IBM watsonx is a portfolio of AI products designed for enterprise AI builders. Built with the open source hybrid cloud AI tools that form the foundation of IBM’s generative AI ecosystem, the watsonx portfolio includes the following products:
IBM watsonx.ai is a state-of-the-art foundation model studio with which users can build generative AI-powered applications. Watsonx.ai also lets users train classical machine learning models and provides all the necessary tools to cleanse and prepare data before building models.
IBM watsonx.data is IBM’s data lakehouse offering, which allows users to store data at scale and to query it efficiently and quickly when it is needed for AI training. As any data scientist or AI engineer can tell you, the quality of an AI model’s outputs is only as strong as the quality of the data used to train or prompt it. IBM watsonx.data helps store and retrieve that quality data at scale.
IBM watsonx.governance empowers enterprises with tools to minimize the impact of AI bias. Once an AI model is trained, enterprises require confidence that the model's outputs are not biased or harmful, and that they will be able to test and validate future models in an organized, compliant and trustworthy way. Additionally, watsonx.governance monitors deployed models in production, ensuring that model outputs do not drift too far from desired responses.
The IBM watsonx AI assistants embed AI into the customer service experience across a variety of business processes and applications. While these AI assistants are part of the watsonx portfolio of products, they are part of the top layer of the generative AI tech stack. The watsonx products for AI assistants include watsonx Orchestrate, watsonx Assistant, watsonx Code Assistant, and watsonx Orders. (We provide more detail on these watsonx products in the article, "Enterprise generative AI virtual assistants.")
This article will elaborate on the features and use cases of the IBM watsonx product suite that make up the middle layer of IBM's generative AI tech stack to show how users can manage the full AI lifecycle when integrating all parts of the watsonx portfolio together.
The AI studio: IBM watsonx.ai
IBM watsonx.ai is a foundation model studio for next-generation AI builders. It contains most of the tools you might be familiar with from IBM Cloud Pak for Data. This includes all of the tools necessary to train classical machine learning models and to perform natural language processing, such as Watson Studio Notebooks and AutoAI, which builds ML pipelines with no code.
IBM watsonx.ai also natively supports a synthetic data generator and automated model workflow deployment. The AutoAI pipeline builder will automatically select relevant model features and perform feature engineering. AutoAI supports a variety of standard regression, classification and time series forecasting models. The pipeline builder will evaluate multiple models and allow the user to choose and deploy the best ones, and it is a useful first step to see which models are likely to perform well when building an ML pipeline.
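The evaluate-and-rank flow described above can be sketched in plain Python. This is a conceptual illustration of the idea, not the AutoAI API: the candidate "models" and scoring function here are toy stand-ins.

```python
# Conceptual sketch of AutoAI-style pipeline ranking (not the AutoAI API):
# several candidate models are scored on held-out data and ranked,
# letting the user inspect and deploy the best one.

def evaluate_candidates(candidates, score_fn, X_val, y_val):
    """Score each candidate pipeline and return them ranked best-first."""
    scored = [(score_fn(model, X_val, y_val), name)
              for name, model in candidates.items()]
    return sorted(scored, reverse=True)

# Toy "models": simple prediction functions standing in for real pipelines.
def always_mean(x):
    return 3.0

def linear_guess(x):
    return 0.5 * x + 1.0

def accuracy_like(model, X, y, tol=0.5):
    """Fraction of validation points predicted within a tolerance."""
    hits = sum(abs(model(xi) - yi) <= tol for xi, yi in zip(X, y))
    return hits / len(X)

X_val = [1, 2, 3, 4]
y_val = [1.5, 2.0, 2.5, 3.0]
candidates = {"baseline": always_mean, "linear": linear_guess}
leaderboard = evaluate_candidates(candidates, accuracy_like, X_val, y_val)
print(leaderboard)   # best-scoring pipeline listed first
```

In AutoAI, the equivalent leaderboard is produced automatically across many generated pipelines, with feature engineering and hyperparameter search included.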
The Data Refinery tool in watsonx.ai allows users to clean and prepare data for analysis. It contains simple tools to create data visualizations, replace missing values, set column data types, and even modify strings and substrings. When building machine learning pipelines, data scientists and machine learning engineers often spend most of their time cleaning and understanding data before fitting models. Using the Data Refinery tool can drastically cut down the total time spent on data preparation. For developers who perform NLP tasks, the Data Refinery tool comes with automated stop word removal and tokenization.
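For readers unfamiliar with these two NLP preprocessing steps, here is a minimal sketch of what tokenization and stop word removal do. Data Refinery automates this for you; the stop word list below is a toy one for illustration.

```python
# Illustrative sketch of the NLP preprocessing Data Refinery automates:
# tokenization and stop word removal. The stop word list is a toy subset.

STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to"}

def tokenize(text):
    """Lowercase and split text into word tokens, stripping punctuation."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return cleaned.lower().split()

def remove_stop_words(tokens):
    """Drop high-frequency function words that carry little signal."""
    return [t for t in tokens if t not in STOP_WORDS]

tokens = tokenize("The quality of an AI model is tied to the data.")
print(remove_stop_words(tokens))
# → ['quality', 'ai', 'model', 'tied', 'data']
```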
For those interested in generative AI, watsonx.ai comes with numerous options for foundation models. Users have access to many models in the model library, including the popular Granite models from IBM, FLAN models from Google, and the Llama 2 model from Meta. Clicking on a model displays the model card, which provides detail about each model’s architecture, characteristics, and which use cases it performs best on.
Watsonx.ai also includes a UI-based Prompt Lab. This tool allows users to quickly test out different prompts and models before fully tuning a model or deploying a full app. The sample prompts section on the left contains preconfigured prompts and inputs for a range of common use cases, including summarization, classification, entity extraction, Q&A, and even Python and C++ code generation.
Users who want full customization can save their prompts to edit them in a Jupyter notebook with Python, allowing for full customization and integration into apps with the Python SDK.
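A saved prompt can be treated as a parameterized template in application code. The sketch below is generic Python, not the watsonx.ai SDK itself; the template text and variable name are hypothetical.

```python
# Generic sketch of working with a saved prompt as a parameterized template.
# The template wording and the `message` variable are hypothetical examples,
# not taken from the watsonx.ai Prompt Lab.

from string import Template

SUMMARIZE_PROMPT = Template(
    "Summarize the following customer message in one sentence.\n\n"
    "Message: $message\n\nSummary:"
)

def build_prompt(message: str) -> str:
    """Fill the saved template with a concrete input before sending to a model."""
    return SUMMARIZE_PROMPT.substitute(message=message)

prompt = build_prompt("My order arrived late and the box was damaged.")
print(prompt)
```

In a notebook, the filled prompt string would then be passed to a foundation model through the Python SDK's inference call.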
Soon, watsonx.ai will also support model fine-tuning. Unlike prompting, fine-tuning changes the weights of the foundation model by retraining the final layers of the model’s neural network on labelled data. Foundation models are engineered to perform multiple tasks, but fine-tuning increases a model’s performance on one specific task. Tuning will require the user to upload a training dataset with labelled examples of desired outputs, and watsonx.ai will fine-tune the model for the user.
If you want to try IBM watsonx.ai for free, as a fully managed service on IBM Cloud, sign up for a free trial.
The data lakehouse: IBM watsonx.data
Training enterprise-grade AI requires having access to enterprise-quality data, and being able to access it fast and at scale. IBM watsonx.data’s data lakehouse architecture combines the query speed of a data warehouse with the storage costs of a data lake. This lakehouse architecture is particularly applicable to large enterprises that likely use multiple cloud providers and complex data storage solutions, making watsonx a truly hybrid cloud-native platform.
Data lakes and data warehouses, like databases, are types of data storage, and data can be retrieved or queried from them when needed. In addition to these, watsonx.data can also connect to cloud storage buckets, including the popular IBM Cloud Object Storage and Amazon S3 services. Different storage formats are applicable depending on the type of data being stored (structured versus unstructured) and the frequency at which the data needs to be accessed.
A query engine is used to retrieve data from a data store. Watsonx.data supports the Spark and Presto query engines, both of which query data using SQL. The following figure shows the IBM watsonx.data Infrastructure Manager, which visualizes the connections between different query engines, catalogs, databases, and cloud storage buckets. Multiple query engines can be connected to the same data source, and users can add more data sources as needed.
Query engines sit on top of a data table, which is the organized and queryable representation of unstructured and structured data in the data lakehouse. To make the data queryable, attributes about the data’s characteristics called metadata must be stored. Watsonx.data supports the open-source Apache Iceberg table format and stores metadata in the Apache Hive metadata store. Using this metadata store and the Iceberg table format, watsonx.data users can rollback to earlier snapshots of a data table at a specified time.
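The snapshot mechanism behind this rollback can be illustrated with a small in-memory model: each commit records a new snapshot, and a rollback simply re-points the table at an earlier one. This is a conceptual sketch only, not the Iceberg or Hive metastore APIs.

```python
# Conceptual model of Iceberg-style snapshot metadata: every commit appends a
# snapshot, and rolling back just re-points the table at an older snapshot.
# Illustrative only; real Iceberg stores snapshots in table metadata files.

class SnapshotTable:
    def __init__(self):
        self.snapshots = []          # ordered list of (snapshot_id, rows)
        self.current = None          # id of the snapshot the table points at

    def commit(self, rows):
        """Record a new snapshot of the table's contents."""
        snapshot_id = len(self.snapshots)
        self.snapshots.append((snapshot_id, list(rows)))
        self.current = snapshot_id
        return snapshot_id

    def rollback(self, snapshot_id):
        """Time-travel: point the table back at an earlier snapshot."""
        self.current = snapshot_id

    def read(self):
        return self.snapshots[self.current][1]

table = SnapshotTable()
table.commit([("alice", 100)])
table.commit([("alice", 100), ("bob", 200)])
table.rollback(0)
print(table.read())                  # → [('alice', 100)]
```

Because older snapshots are never overwritten, rollback is a cheap metadata operation rather than a data copy.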
One of the key features of watsonx.data is that multiple query engines can access the same data sources at once. Large enterprises are often collecting new data and consistently updating their data stores. Using multiple query engines thus enables users to view and query the data at the same time it is being modified or updated. This is useful for customers, because it enables a consistent stream of real-time data to be fed into data pipelines and AI models. Querying large amounts of data is compute intensive, and watsonx.data users can scale their compute dynamically by adding or removing query engines as their needs evolve.
After query engines are connected to data sources in the Infrastructure Manager, users can use simple SQL to query all connected data sources using the watsonx.data Query Workspace. With SQL queries, data can easily be modified, copied, or moved between data stores. Enterprise users find this functionality an attractive way to save on data storage costs. Data that needs to be accessed less frequently can be moved out of data warehouses into cheaper cloud object storage, cold storage or other forms of storage.
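The "move cold data to cheaper storage" pattern is plain SQL. The sketch below uses Python's built-in sqlite3 as a stand-in for the warehouse and object store; in watsonx.data the same kind of SQL would run through Presto or Spark against real data sources, and the table and column names here are invented for illustration.

```python
# Illustrative tiering of cold data with SQL, using sqlite3 as a stand-in for
# a warehouse and a cheaper cold store. Table/column names are hypothetical;
# watsonx.data would run comparable SQL through Presto or Spark.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS cold")   # stand-in for cheap storage
conn.execute("CREATE TABLE warehouse_events (id INTEGER, year INTEGER)")
conn.executemany("INSERT INTO warehouse_events VALUES (?, ?)",
                 [(1, 2020), (2, 2023), (3, 2019)])

# Copy rarely accessed rows to the cold store, then delete them from the warehouse.
conn.execute("CREATE TABLE cold.archived_events AS "
             "SELECT * FROM warehouse_events WHERE year < 2021")
conn.execute("DELETE FROM warehouse_events WHERE year < 2021")

remaining = conn.execute("SELECT COUNT(*) FROM warehouse_events").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM cold.archived_events").fetchone()[0]
print(remaining, archived)           # → 1 2
```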
If you want to try IBM watsonx.data for free, as a fully managed service on IBM Cloud or AWS, sign up for a free trial.
AI model lifecycle monitoring and management: IBM watsonx.governance
After enterprises successfully extract their data and train AI models, they need to ensure that the models are trustworthy and free from unwanted bias, and they need to monitor model performance throughout the lifecycle. Biased models are ones that make predictions based on undesired variables. For example, an AI model that predicts someone’s eligibility for a loan would be biased if it used their race or religion as a factor to determine their creditworthiness. IBM watsonx.governance is IBM's AI governance solution designed to help businesses reduce risk and minimize the impact of AI bias. It does this through automatic model monitoring, explainability of model outputs, streamlined report generation, and application of controls through an integrated governance, risk, and compliance (GRC) platform.
To help enterprises organize the training and deployment of their models, watsonx.governance includes the model inventory feature from Watson OpenScale. All user models are stored in the model inventory, and users can define specific model use cases to track the lifecycle of model development. This helps data scientists and machine learning engineers track models in the different stages of model building and deployment. As shown below, these models can be evaluated and assigned to specific users for review before deployment into production:
Out of the box, the workflow is separated into develop, test, validate and operate stages, but it can be customized if users want to implement their own governance framework. Users set the rules needed to move models from one stage to another. The inventory can include models developed both on IBM and on third-party platforms, such as Amazon SageMaker and Microsoft Azure ML.
All models in the inventory will be automatically tracked by Watson OpenScale, including models developed in AutoAI, Jupyter Notebooks, and the watsonx.ai Prompt Lab. Applicable metadata will be recorded in individual AI Model FactSheets to give users a snapshot of the model’s health and performance. For generative models like the ones displayed in the following figure, the FactSheet includes information about the prompts used and model hyperparameter information.
Users who want to measure how much an LLM-generated output differs compared to a pre-defined human output can configure ROUGE metric tracking and view it in the AI FactSheet. For predictive models, the metadata reported includes the features used to train the model, predictive accuracy on the test dataset, F1 score, log loss, precision, and recall. Users can also define their own custom metrics to track for both generative and predictive models.
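To make the ROUGE comparison concrete, here is a minimal ROUGE-1 recall computation: the fraction of reference unigrams that also appear in the generated output. Real ROUGE implementations add stemming and other variants (ROUGE-2, ROUGE-L); this toy version just illustrates the idea behind the metric.

```python
# Minimal ROUGE-1 recall: overlap of unigrams between a reference (human)
# text and a generated text, divided by the reference length. Counts are
# clipped with min() so repeated words are not over-credited.

from collections import Counter

def rouge1_recall(reference: str, generated: str) -> float:
    ref_counts = Counter(reference.lower().split())
    gen_counts = Counter(generated.lower().split())
    overlap = sum(min(ref_counts[w], gen_counts[w]) for w in ref_counts)
    return overlap / sum(ref_counts.values())

score = rouge1_recall("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 3))   # → 0.833 (5 of 6 reference unigrams matched)
```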
Watson OpenScale can help users automatically monitor model health by alerting users when key model health metric levels fall below specified thresholds. Many users are interested in monitoring model drift, which refers to the gradual decrease in model output quality over time. With predictive models, drift can be caused by a change in the underlying relationship between the model’s predictor variables and the outcome. Sometimes after a model is trained, a new, unaccounted-for variable can significantly affect predictions, resulting in drift.
Like predictive models, generative models are also trained on data snapshots at a specific time, and are not always updated with technological advancements or current news. Changes in human language patterns and slang can also affect a generative model’s ability to understand inputs. To mitigate the impact of drift, watsonx.governance monitors how far LLM and predictive model outputs drift from ideal ones, and alerts users should drift metrics fall below specified thresholds. Users may then take appropriate action such as retraining models or changing prompts to mitigate the impact of the drift:
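The alerting logic described above amounts to tracking a rolling quality metric against a configured threshold. The following sketch shows that pattern in plain Python; the metric values, window size, and threshold are hypothetical, not watsonx.governance defaults.

```python
# Sketch of threshold-based drift alerting: track a rolling quality metric
# and flag when it drops below a configured threshold. The scores, window
# size, and threshold below are hypothetical illustrations.

def check_drift(recent_scores, threshold=0.8, window=5):
    """Return True (alert) when the rolling mean quality drops below threshold."""
    window_scores = recent_scores[-window:]
    rolling_mean = sum(window_scores) / len(window_scores)
    return rolling_mean < threshold

healthy = [0.92, 0.90, 0.91, 0.89, 0.93]    # stable model quality
drifting = [0.92, 0.85, 0.78, 0.74, 0.70]   # quality degrading over time
print(check_drift(healthy), check_drift(drifting))   # → False True
```

An alert like this would be the trigger for the remediation steps mentioned above, such as retraining the model or revising prompts.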
IBM watsonx.governance includes built-in model fairness tracking to ensure that model outputs are not unfairly biased. The platform is pre-configured to track the ability of the model to deliver different outputs for standard protected classes like gender, religion and race. Users can also set up their own custom fairness monitors and receive alerts for violations. When monitoring predictive models, the “feature importance” functionality in Watson OpenScale uses proprietary and open-source algorithms to measure how much individual model variables impact predictions. This feature can be useful for demonstrating to regulators how model predictions are generated and that outputs are not significantly different because of membership in a protected class.
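One widely used fairness measure of this kind is the disparate impact ratio, which compares the rate of favorable outcomes between a monitored group and a reference group. The sketch below illustrates the calculation with toy loan-approval data; the specific alert threshold and data are illustrative, not taken from watsonx.governance.

```python
# Sketch of a simple fairness check: the disparate impact ratio compares
# favorable-outcome rates between a monitored group and a reference group.
# A ratio of 1.0 means parity; the toy data below is invented.

def favorable_rate(outcomes):
    """Fraction of favorable outcomes (1 = favorable, 0 = unfavorable)."""
    return sum(outcomes) / len(outcomes)

def disparate_impact(monitored, reference):
    """Ratio of favorable-outcome rates between two groups."""
    return favorable_rate(monitored) / favorable_rate(reference)

# 1 = loan approved, 0 = denied (toy data)
group_a = [1, 0, 1, 1, 0, 1, 1, 1]   # reference group: 75% approved
group_b = [1, 0, 0, 1, 0, 0, 1, 0]   # monitored group: 37.5% approved

ratio = disparate_impact(group_b, group_a)
print(round(ratio, 2))               # → 0.5
```

A monitor configured on a metric like this would raise a violation alert when the ratio drops below its configured lower bound.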
To facilitate easy reporting and tracking, watsonx.governance integrates with IBM OpenPages, which is a highly scalable, AI-powered GRC platform. Data is automatically shared between IBM OpenScale and IBM OpenPages. Users can make custom dashboards tracking metrics of interest or model fairness, training stage and performance.
IBM OpenPages also contains custom control templates and risk assessments for common regulatory frameworks like the European Union AI Act. Model use cases that are deployed and tracked with IBM OpenScale can be integrated into risk assessments, and then have applicable controls and risk mitigation procedures applied to them.
As you can see, the IBM watsonx portfolio of products, as the foundation for the generative AI tech stack, fully supports all parts of the AI lifecycle, and is a suitable platform for building AI at the scale of the enterprise.
With watsonx.data, users can store and retrieve enterprise data, taking full advantage of its data lakehouse architecture. After the data is extracted, users can build generative AI- and machine learning (ML)-powered apps with watsonx.ai, which includes a full suite of tools for cleansing and transforming data before training AI models. After models are deployed, users also have a full range of model governance and monitoring capabilities with watsonx.governance to minimize AI bias and ensure that their AI is trustworthy.
IBM watsonx leverages several key AI open source tools and technologies and combines them with IBM research innovations. Learn more about the foundation of IBM's generative AI tech stack, the hybrid cloud AI tools, in the article, "The open source ecosystem of watsonx." Learn more about the top layer of IBM's generative AI tech stack in the article, "Enterprise generative AI virtual assistants."