Management and governance of AI frameworks
Successfully manage and operationalize your AI applications
In this six-part series about cognitive computing in the telecommunications and media and entertainment (TME) industries, we went through key areas to consider when integrating cognitive computing into TME enterprises. We discussed several topics, including key cognitive computing concepts, relevant industry use cases, and how to implement portions of use cases. In this final article, we focus on tying it all together to successfully manage and operationalize your AI applications. The benefits of AI systems are widely recognized: the differentiation they provide and their positive impact on profitability. The challenge is to successfully operationalize AI systems. Recent advancements have raised concerns about security, bias, and trust issues in AI. We discuss these challenges and the best practices for operationalizing successful AI systems.
Challenges and need for management and governance of AI frameworks
In this section, we revisit some of the use cases we discussed in earlier articles and identify the need for strong management and governance of AI frameworks. There are numerous challenges for the safe adoption of AI into our businesses, including the following:
- Data accessibility and governance: My data exists in a variety of forms and in multiple locations. How can I catalog and provide authorized access to this information?
- Model provenance and audit: Where did this model come from, what has happened to it, and what has it done?
- Trust in AI: Do I understand how this model works and can it be explained? Does this model discriminate unfairly or use irrelevant features?
- Ethics and security: Am I using the data and model responsibly? How do I securely share data without losing insights?
We begin by focusing on previous use cases to discuss what could go wrong if our AI systems are not operationalized and governed correctly:
Self-service agent – Device Doctor: A self-service agent lets a customer interact with a communications service provider (CSP) to help resolve specific issues related to the devices a customer owns. If such systems are not managed and governed correctly, they will be ineffective and even detrimental for the business. After 12 months of operation, the administrator becomes aware that training data used to develop and enhance the model was incorrect. Carefully manipulated input data can force the model to output an incorrect classification. The questions the administrator will have are:
- How do we find all models that might have been trained using this data?
- How accurate are the results of models trained with the incorrect data?
- Have customers been adversely affected, and if so, which ones?
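The first question above is a lineage query. A minimal sketch, assuming a hypothetical in-memory registry in which every model version records the datasets it was trained on and the version it was retrained from (the `ModelRecord` fields and the `name:version` key format are illustrative, not part of any IBM product):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ModelRecord:
    """One registered model version (hypothetical registry entry)."""
    name: str
    version: int
    training_datasets: Set[str] = field(default_factory=set)
    parent: Optional[str] = None  # "name:version" of the model this one was retrained from


def model_key(record: ModelRecord) -> str:
    """Build the registry key for a model version."""
    return f"{record.name}:{record.version}"


def find_affected_models(registry: List[ModelRecord], bad_dataset: str) -> List[str]:
    """Return keys of models trained on bad_dataset, directly or through an ancestor."""
    by_key: Dict[str, ModelRecord] = {model_key(r): r for r in registry}
    affected = []
    for record in registry:
        current: Optional[ModelRecord] = record
        visited: Set[str] = set()
        # Walk up the retraining chain; the visited set guards against cycles.
        while current is not None and model_key(current) not in visited:
            visited.add(model_key(current))
            if bad_dataset in current.training_datasets:
                affected.append(model_key(record))
                break
            current = by_key.get(current.parent) if current.parent else None
    return sorted(affected)
```

Answering the remaining two questions would additionally require stored predictions per model version, which this sketch does not model.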
Network operations agent: A network operations agent enables a network engineer to more effectively troubleshoot issues within the network. Troubleshooting the network can be complex, and cognitive computing can help the engineer reduce escalations and more quickly resolve issues. Such systems can be compromised if not governed carefully. For example, a CSP is using an AI system to optimize operation of its 5G networks. A network outage occurs, impacting communications at national scale, including emergency services. A malicious exploit of the AI systems is suspected. The government demands an explanation for the outage and demonstration of appropriate controls for ensuring safe and correct operation of AI for network control. Key questions include:
- How can the organization demonstrate proper governance of the AI lifecycle, including acceptance into service?
- What was the chain of custody for the AI system from development into operation? Who did what and when?
- How does the organization provide explanations for the actions of a black-box AI system?
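One common way to make a chain of custody tamper-evident is a hash-chained audit log, where each entry's hash covers the previous entry. The sketch below is an illustration under that assumption, not a description of how any particular platform records lineage; the field names and the caller-supplied timestamp are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry
FIELDS = ("actor", "action", "artifact", "timestamp", "prev_hash")


def _digest(body):
    """Hash the entry body deterministically (sorted keys)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log, actor, action, artifact, timestamp):
    """Append who did what to which artifact and when, linked to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {
        "actor": actor,
        "action": action,
        "artifact": artifact,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
    }
    log.append({**body, "hash": _digest(body)})
    return log


def verify_chain(log):
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {name: entry[name] for name in FIELDS}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on the one before it, editing an early entry invalidates every later one, which is what lets an auditor trust the recorded sequence.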
Capitalizing on data: A data scientist is asked to create a system that will enable a CSP to capitalize on the data it holds to retain customers. To do this, the data scientist needs to access data from a variety of sources; identify what assets are available; create various models, including a churn model to determine how likely the customer is to churn; monitor the models; detect bias in the models; correct the models; and various other activities. The data scientist would like to use tools from vendors and open source offerings to create the solution. Key questions include:
- How can all the relevant AI assets and data be identified?
- How does one detect bias in models and correct them?
- How does one quickly create the relevant models and effectively monitor and manage them?
To address these challenges, you need a platform to collect, organize, and analyze data using data science, and to infuse trusted AI. The platform needs to deliver a broad range of core data microservices, with the option to add others from a growing services catalog. The platform should enable you to get the benefits of cloud while keeping data where it is. It should be a governed multi-cloud platform that delivers the flexibility you want, with the security and control you need. The key features of an AI platform are described below.
Figure 1: Key features of a governed and managed AI platform
We will now examine each of the previous steps to manage the governance of AI data. To do this, we use the capitalizing on data use case as an example.
Data is everywhere and growing exponentially in relevance and volume. The best businesses in the world today are data-driven. Businesses are collecting data from more and increasingly diverse sources to analyze and run their operations. By 2025, IDC predicts enterprises will produce around 60 percent of global data. Eighty percent of companies are committing to multi-cloud, and about 71 percent use three or more clouds. This blend of public cloud, private cloud, and even on-premises data solutions is helping companies take advantage of the benefits of public cloud where they can, while keeping more sensitive and regulated data on premises behind firewalls. By diversifying, enterprises can minimize risk while remaining agile, increase innovation, and optimize costs. But figuring out how to best manage your corporate databases in cloud and hybrid-cloud environments can be a challenge. Data accessibility using a common federation engine for the data is a key aspect. What is required is a robust, flexible data repository that can ingest and persist massive volumes of data and data types while navigating the challenges of seamless accessibility.
Both IBM Watson™ Studio and IBM Cloud Pak for Data provide a platform to manage all data types across various sources, incorporating all forms of data management (SQL, NoSQL) and all flavors of techniques (row store, column store, document store, Hadoop). Hybrid data management provides the ability to:
- Easily connect to your data, no matter where it lives
- Publish data to the enterprise catalog
- Create a single point of entry for all your data
Let’s go through an example of the collecting phase within the capitalizing on data use case mentioned above. The data scientist has a Db2® database on the cloud with some marketing data used for the customer churn model and another SQL database on premises behind a firewall with sensitive subscriber information. You will be able to collect and connect all of these databases in different locations.
Figure 2: Connecting to a Db2 database on the cloud
The following steps connect to data sources from the IBM Cloud Pak for Data web client:
- From the main menu, click Connections to open the Connections page.
- If you need to upload a driver for a connection, such as for a Teradata data source, click Upload driver and follow the instructions.
- From the Connections page, click Add connection.
- In the Add connection window, specify the information for your connection. You must specify the name of the connection and a description, then select the connection type. Depending on the connection type you specify, you will need to provide a specific set of additional information – a host name, port, and credentials, for example. For additional security, you can optionally enable SSL and Kerberos. If you use Kerberos, you must specify the service principal name and a keytab.
- Click Add to add the connection to the Connections list. The new connection appears in the Connections list.
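The rules in the Add connection step can be summarized as a small validation routine. This is a sketch of the rules as stated above only; the `ConnectionSpec` class and its field names are hypothetical and do not correspond to an IBM Cloud Pak for Data API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConnectionSpec:
    """Hypothetical stand-in for the fields of the Add connection window."""
    name: str
    description: str
    connection_type: str
    host: str = ""
    port: int = 0
    use_ssl: bool = False
    use_kerberos: bool = False
    service_principal_name: Optional[str] = None
    keytab_path: Optional[str] = None


def validate(spec: ConnectionSpec) -> List[str]:
    """Return a list of problems; an empty list means the spec satisfies the stated rules."""
    problems = []
    # Name, description, and connection type are always required.
    for field_name in ("name", "description", "connection_type"):
        if not getattr(spec, field_name).strip():
            problems.append(f"{field_name} is required")
    # Kerberos is optional, but when enabled it needs both extra values.
    if spec.use_kerberos:
        if not spec.service_principal_name:
            problems.append("service_principal_name is required with Kerberos")
        if not spec.keytab_path:
            problems.append("keytab_path is required with Kerberos")
    return problems
```

The type-specific requirements (host name, port, credentials) vary by connection type, so the sketch leaves them unchecked.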
As mentioned, the subscriber data behind the firewall is highly sensitive data and a copy of it cannot be made. There are times when regulations, compliance, or other reasons do not allow data to be transferred. In other cases, it might not make sense to move large amounts of data. This is where data virtualization, a relatively new technology that connects various data sources into a single collection of data sources or databases – referred to as a constellation – comes in handy. Data virtualization eliminates the need to run analytics queries on data copied and stored in a centralized location. The analytics application submits a query processed on the server where the data source exists. Results of the query are consolidated within the constellation and returned to the original application. Data exists only at the source and is not copied.
IBM Cloud Pak for Data supports data virtualization where applications connect as if they are connecting to a single Db2 database. One can consolidate data that is available in multiple data sets into a single table by virtualizing the data and by joining the tables. When connected, applications can submit queries against the system as if they were querying a single data source database. The workload will be collaboratively distributed and computed by all participating data sources that have data relevant to the query.
Popular Db2 connection clients and applications can attach to IBM data virtualization and work without modification. This is the case even if the collection of data sources under query includes a mix of many types of data sources, such as PostgreSQL, Oracle, Netezza, Microsoft SQL Server.
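To illustrate the application's point of view (one connection, one joined table, several underlying sources), here is a small local analogy using SQLite's `ATTACH DATABASE`. It is only an analogy: both databases live in the same process, nothing is pushed down to remote servers, and the table names, columns, and sample rows are invented for the example.

```python
import sqlite3


def open_virtualized_connection():
    """Expose two separate databases through one connection and one joined view."""
    conn = sqlite3.connect(":memory:")  # stands in for the cloud marketing database
    conn.execute("ATTACH DATABASE ':memory:' AS subscribers")  # stands in for the on-premises source

    conn.execute("CREATE TABLE main.marketing (customer_id INTEGER, campaign TEXT)")
    conn.execute("CREATE TABLE subscribers.profile (customer_id INTEGER, plan TEXT)")
    conn.executemany(
        "INSERT INTO main.marketing VALUES (?, ?)", [(1, "spring"), (2, "summer")]
    )
    conn.executemany(
        "INSERT INTO subscribers.profile VALUES (?, ?)", [(1, "prepaid"), (2, "postpaid")]
    )

    # A TEMP view may span attached databases; the application sees a single table.
    conn.execute(
        """
        CREATE TEMP VIEW churn_features AS
        SELECT m.customer_id, m.campaign, p.plan
        FROM main.marketing AS m
        JOIN subscribers.profile AS p ON p.customer_id = m.customer_id
        """
    )
    return conn
```

An application would then run an ordinary query such as `SELECT * FROM churn_features` without knowing that the rows come from two sources.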
The data scientist working on the capitalizing on data use case needs to ensure that the data is well organized. The data scientist needs to know what is in the tables and what policies are applied to the data. Creating a trusted business-ready analytics foundation with unified governance and integration that comply with organizational and regulatory requirements is essential. Having an integrated unified governance makes data easily and securely available across cloud platforms, enabling robust data preparation and model building. With proper governance, businesses can trust the data to make reliable decisions and drive good insights. Traditional metadata management systems, often referred to as the Information Governance Catalog, have focused on the collection of information for self-service access and consumption of data for enterprise governance and IT. However, in order to progress to self-service analytics and make the leap into AI, data catalogs, such as IBM Watson Knowledge Catalog, are critical because they provide self-service access and consumption of data for AI. Data catalogs enable enterprises to:
- Discover data assets using auto-discovery – IBM Cloud Pak for Data enables you to run automated discovery to start exploring and profiling your data from new data connections. The profiling process includes a quality analysis to determine the quality of the data, automatic term assignment to help assess and classify the data, and the publication of analysis results. After the data is discovered and imported into the catalog, the relationship graph shows how the data is connected. Data scientists and data engineers can rate assets and add comments about assets to enhance the information catalog and help colleagues understand the data better.
- Profile and categorize data assets – A data dictionary with common business vocabulary helps define all important aspects of an enterprise and the industry it operates in. Categories provide the logical structure for the glossary so you can browse and understand the relationships among terms and categories in the glossary. Categories can be organized in a hierarchy based on their meaning and relationships to one another. A term is a word or phrase that describes a characteristic of the enterprise. Each term has a parent category, but it can also be referenced by other categories.
- Apply policy-based enforcement to access – Information governance rules and policies ensure compliance with business objectives. Data protection policies control access to data based on the content of the data. For example, as shown in the screenshot below, a rule is created such that data sets with the “Customer” tag are anonymized – redacted in this case (original data replaced with X) to the users listed. This means that if user Sharath Prasad tries to access a database tagged with the “Customer” tag, the data will be replaced with X when he tries to access the data.
Figure 3: Information governance rules and policies to ensure compliance
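The redaction rule described above can be sketched in a few lines. The policy dictionary layout and function name are assumptions made for illustration; the catalog's actual rule engine is configured through the user interface shown in the figure.

```python
def apply_policy(rows, dataset_tags, user, policy):
    """Return the rows the user is allowed to see.

    When the data set carries the policy's tag and the user is listed in the
    policy, every value is replaced by X characters; otherwise rows pass through.
    """
    if policy["tag"] in dataset_tags and user in policy["users"]:
        return [
            {column: "X" * len(str(value)) for column, value in row.items()}
            for row in rows
        ]
    return [dict(row) for row in rows]


# Hypothetical policy mirroring the example in the text.
customer_policy = {"tag": "Customer", "users": {"Sharath Prasad"}}
```

Keying the decision on the data set's tag rather than on a table name means newly cataloged data is protected as soon as it is tagged.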
The data scientist working on capitalizing on data needs to design, build, and train data science and AI models, as well as seamlessly deploy, run, and retrain AI and machine learning models. The data scientist also needs to ensure that the data from the above phases is properly used to create machine learning models. In this use case, the data scientist has to create machine learning models like customer churn and next-best action, etc. Let’s go through an example of the analyze phase within the capitalizing on data use case mentioned above.
Build, train, and deploy AI and machine learning models
With Watson Studio, you can use open source tools like Jupyter Notebook to programmatically create models using Python, R, or Scala, or you can use drag-and-drop tooling from IBM SPSS® Modeler. It’s easy to choose between code or no-code tools to build and train your own models or easily retrain and customize pre-trained Watson APIs.
Using SPSS Modeler flows, you can build various machine learning models by just dragging and dropping the nodes onto the model editor. The image below shows a customer churn model used in the capitalizing on data use case developed using an SPSS Modeler flow.
Figure 4: Churn model developed using SPSS Modeler
After the churn model is built and trained, it can easily be published and deployed using the machine learning service integrated into Watson Studio.
Monitor and manage machine learning models
A unique Watson Studio feature called continuous learning enables you to automate the retraining of models with new data and to monitor how the performance of those models evolves over time. Threshold values can be assigned to specific metrics that can be used to trigger automatic retraining and redeployment activities. Models have to be dynamic and need to be updated periodically. Continuous learning also provides version control, so you can roll back to previous versions. The screenshot below describes performance monitoring options enabled for a churn model used in the capitalizing on data use case. According to the options set up here, the threshold value for accuracy is 0.9, and the model retrains itself whenever model performance goes below the threshold and auto-deploys whenever model performance is better than that of the previous version.
Figure 5: Continuous learning model evaluation of a machine learning model
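The retrain-and-redeploy rule configured above reduces to a short decision procedure. The following is a minimal sketch of that logic only; the function name, the `retrain` callable, and the returned action strings are hypothetical and not Watson Studio APIs.

```python
def continuous_learning_step(current_accuracy, retrain, threshold=0.9):
    """Decide what to do after one evaluation of the deployed model.

    retrain is a callable that trains a candidate on new data and returns its
    accuracy. Returns (action, accuracy_of_model_left_in_production).
    """
    if current_accuracy >= threshold:
        return "keep", current_accuracy  # performance is acceptable, nothing to do
    candidate_accuracy = retrain()  # below threshold: trigger retraining
    if candidate_accuracy > current_accuracy:
        return "deploy_retrained", candidate_accuracy  # auto-deploy the better version
    return "keep_previous", current_accuracy  # the retrained model did not improve
```

For example, `continuous_learning_step(0.85, lambda: 0.93)` deploys the retrained model, while a retrain that scores 0.80 leaves the previous version in place.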
The data scientist needs to infuse AI throughout its full lifecycle with trust and transparency, and ensure that AI models are fair, explainable and compliant, for AI models built and running anywhere. The data scientist needs to ensure that AI models are not deteriorating in fairness and accuracy, and are not biased against any gender or specific age group.
Let’s go through an example of the infuse phase within the capitalizing on data use case mentioned above using AI OpenScale.
IBM AI OpenScale allows enterprises to automate and operationalize the AI lifecycle in business applications, ensuring that AI models are free from bias and can be easily explained and understood by business users. AI OpenScale can monitor and track AI models, providing visibility into how AI is being built and used. It supports AI models built and run in the tools and frameworks of your choice. It provides businesses with confidence in AI decisions. Available on IBM Cloud and IBM Cloud Pak for Data, OpenScale infuses AI throughout its full lifecycle with trust and transparency, explains outcomes, and automatically mitigates bias.
AI OpenScale can monitor machine learning model deployments on private cloud and public cloud, whether deployed within Watson Machine Learning, Azure, or Amazon. It constantly tracks all the transactions predicted by the machine learning models and monitors for fairness and accuracy in the model, raising alerts as soon as fairness or accuracy values go below certain threshold values for any machine learning model. Based on the data it monitors at runtime, AI OpenScale can create a de-biased model, which mitigates the detected bias. This de-biased model can be used through the de-biased endpoint.
Figure 6: AI OpenScale dashboard monitoring machine learning deployments
The previous image shows the AI OpenScale dashboard monitoring five machine learning deployments, including one to determine the credit-worthiness of a customer. Having accurate credit-worthiness models is important for the capitalizing on data use case because the model output often determines what offers to provide to the customer. Biases in such models can result in retaining the wrong customers and not targeting the key ones. We can see OpenScale has detected and triggered fairness and accuracy alerts for our models, which need to be examined.
Managing AI at scale
Scaling AI with trust and transparency: As more applications make use of AI, businesses need visibility into the recommendations made by their AI applications. In the case of certain industries like telecommunications, finance, and healthcare, in which adherence to GDPR and other comprehensive regulations presents significant barriers to widespread AI adoption, AI models must explain their outcomes in order to be used in production situations. It is critical to ensure that AI recommendations or decisions are fully traceable, enabling enterprises to audit the lineage of the models and the associated training data, along with the inputs and outputs for each AI recommendation.
AI OpenScale significantly expands the trust and transparency capabilities by introducing explainability for black-box models and functions, automatic bias detection and mitigation, auditability, and traceability on AI applications – regardless of whether they run on a company’s private cloud, IBM Cloud, or on another cloud environment.
The screenshot below shows the explainability feature of AI OpenScale. With the help of the explainability feature, OpenScale can explain the outcomes of every transaction for the models, which is important for the capitalizing on data use case. According to the screenshot below, the credit risk model prediction used in capitalizing on data is heavily dependent on gender and age. With a detailed explanation of a model in hand, the data scientist can make an informed business decision that aligns with business policies.
Figure 7: Explainability of AI OpenScale explaining a machine learning prediction of a transaction
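To give a feel for what a per-transaction explanation contains, the sketch below attributes a prediction to its input features by replacing one feature at a time with a baseline value and measuring how much the score changes. This is a generic perturbation technique chosen for illustration; the source does not state how OpenScale computes its explanations, and the logistic model, weights, and feature names here are all made up.

```python
import math


def risk_score(features, weights, bias):
    """Toy logistic model standing in for a black-box credit risk model."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def explain(features, baseline, weights, bias):
    """Rank features by how much the score changes when each is set to its baseline."""
    full_score = risk_score(features, weights, bias)
    contributions = {}
    for name in features:
        perturbed = dict(features)
        perturbed[name] = baseline[name]  # neutralize one feature at a time
        contributions[name] = full_score - risk_score(perturbed, weights, bias)
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical weights and one transaction to explain.
WEIGHTS = {"age": -0.05, "loan_amount": 0.0002, "is_female": 0.4}
TRANSACTION = {"age": 22, "loan_amount": 3000, "is_female": 1}
BASELINE = {"age": 40, "loan_amount": 2500, "is_female": 0}
```

With these invented numbers, `explain(TRANSACTION, BASELINE, WEIGHTS, 0.0)` ranks age first and gender second, the kind of finding that would prompt a fairness review.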
The data scientist working on the capitalizing on data use case has successfully completed collect, organize, analyze, and infuse phases, so let’s discuss an important concept called automation. So far, we’ve used AI to develop various models, but AI automating AI is a cutting-edge technology where AI develops AI. Let’s talk about it briefly using tools IBM developed.
The AutoAI graphical tool in Watson Studio automatically analyzes your data and generates candidate model pipelines customized for your classification or regression problem. Using AutoAI, you can build and deploy a machine learning model with sophisticated training features and no coding. The tool does most of the work for you.
With AutoAI, uploading a dataset and selecting an output column is all that needs to be done. The tool does the remaining steps: data pre-processing, automated model selection, feature engineering, and hyperparameter optimization. Once the model pipelines are generated, you can save one and deploy it as a machine learning model. Let’s discuss an example of automating AI using the capitalizing on data use case.
Figure 8: AutoAI process steps
The data scientist can quickly upload a churn dataset and select churn as the output column. AutoAI then refines and prepares the data, compares various machine learning models, and generates a ranked set of model pipelines to choose from. The screenshot below shows the output of the AutoAI tool and the ranked pipeline models generated. The data scientist can save the best model to a machine learning service, which can then be monitored using OpenScale, or it can be integrated in applications.
Figure 9: AutoAI tool displaying steps performed and models that were created
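The idea of generating candidates, tuning each, and ranking them on held-out data can be shown with a deliberately tiny example. The sketch below is not AutoAI: each "pipeline" is just a one-feature threshold rule, the threshold search stands in for hyperparameter optimization, and the feature names are invented.

```python
def accuracy(predict, rows, target):
    """Fraction of rows for which predict(row) matches the target column."""
    return sum(predict(row) == row[target] for row in rows) / len(rows)


def fit_stump(rows, feature, target):
    """Search thresholds and directions for the best one-feature rule on the training rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for above_label in (0, 1):
            def predict(row, t=threshold, a=above_label):
                return a if row[feature] >= t else 1 - a
            score = accuracy(predict, rows, target)
            if best is None or score > best[0]:
                best = (score, predict)
    return best[1]


def rank_pipelines(train, holdout, features, target):
    """Fit one candidate per feature and rank the candidates by holdout accuracy."""
    leaderboard = []
    for feature in features:
        model = fit_stump(train, feature, target)
        leaderboard.append((feature, accuracy(model, holdout, target)))
    return sorted(leaderboard, key=lambda item: item[1], reverse=True)
```

Ranking on held-out rows rather than training rows is what keeps the leaderboard from rewarding candidates that merely memorize the training data.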
Open source topics
The data scientist working on the capitalizing on data use case might have some assets that can’t be brought into IBM Cloud or IBM Cloud Pak for Data. Let’s discuss how open source tools can be leveraged to provide additional help in operationalizing the AI lifecycle.
AI Fairness 360 is an open source software toolkit that can help detect and remove bias in machine learning models. Containing more than 70 fairness metrics and 10 state-of-the-art bias-mitigation algorithms developed by the research community, it enables AI developers and data scientists to easily check for biases at multiple points along their machine learning pipeline, using the appropriate bias metric for their circumstances. It enables the developer or data scientist to reduce any discovered bias. These bias-detection techniques can be deployed automatically to enable an AI development team to perform systematic checking for biases similar to checks for development bugs or security violations in a continuous integration pipeline.
Bias might exist in the initial training data, in the algorithm that creates the classifier, or in the predictions the classifier makes. The AI Fairness 360 toolkit can measure and mitigate bias in all three stages of the machine learning pipeline. The steps involved in the process are:
- Step 1 – Setting bias detection options and loading dataset
- Step 2 – Computing fairness metric on original training dataset
- Step 3 – Mitigating bias by transforming the original dataset
- Step 4 – Computing fairness metric on transformed training dataset
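The four steps can be walked through in plain Python without the toolkit. The sketch below uses statistical parity difference as the fairness metric and reweighing as the mitigation, assigning each row the weight P(group) × P(label) / P(group, label). It is written from scratch for illustration and does not use the AI Fairness 360 API; the column names and the choice of privileged group are assumptions.

```python
from collections import Counter


def statistical_parity_difference(rows, weights, group_key, label_key, privileged):
    """Weighted favorable-outcome rate of the unprivileged group minus the privileged group."""
    def favorable_rate(want_privileged):
        total = favorable = 0.0
        for row, weight in zip(rows, weights):
            if (row[group_key] == privileged) == want_privileged:
                total += weight
                favorable += weight * row[label_key]  # label 1 is the favorable outcome
        return favorable / total
    return favorable_rate(False) - favorable_rate(True)


def reweigh(rows, group_key, label_key):
    """Return one weight per row so that group and label become statistically independent."""
    n = len(rows)
    group_counts = Counter(row[group_key] for row in rows)
    label_counts = Counter(row[label_key] for row in rows)
    joint_counts = Counter((row[group_key], row[label_key]) for row in rows)
    return [
        group_counts[row[group_key]] * label_counts[row[label_key]]
        / (n * joint_counts[(row[group_key], row[label_key])])
        for row in rows
    ]
```

A value of 0 for the metric means both groups receive the favorable outcome at the same weighted rate; the reweighed data can then be passed to any learner that accepts per-row sample weights.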
The Model Asset eXchange is a one-stop place for developers to find and use free open source, state-of-the-art deep learning models for common application domains, such as text, image, audio, and video processing. It can enable data scientists and AI developers to easily discover, rate, train, and deploy machine learning and deep learning models in their AI applications. This includes deployable models that you can run as a microservice locally or in the cloud on Docker or Kubernetes, and trainable models where you can use your own data to train the models.
The available models cover machine learning and deep learning domains, including images, audio, text, and time series. Examples include Facial Age Estimator, Image Segmenter, Audio Sample Generator, and Named Entity Tagger. To use these models:
- Models can easily be deployed by publishing pre-built Docker Hub images on a Kubernetes cluster.
- API endpoints of deployable models can be integrated in web applications.
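Integrating a deployed model's endpoint into an application usually comes down to an HTTP POST. The sketch below uses only the Python standard library and assumes a text model running locally; the `/model/predict` path, the port in the usage note, and the `{"text": ...}` payload shape are assumptions to verify against the API documentation served by the specific model container you deploy.

```python
import json
import urllib.request


def build_predict_request(base_url, text):
    """Build (but do not send) a JSON POST request for a deployed text model."""
    payload = json.dumps({"text": text}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/model/predict",  # assumed endpoint path
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, text, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(base_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a container listening on a hypothetical `http://localhost:5000`, a web application would call `predict("http://localhost:5000", "some customer message")` and use the returned JSON.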
This concludes our final article on cognitive computing in the telecommunications and media industries. As discussed in these articles, there are key areas you need to consider when integrating AI into your enterprise solutions:
- What is the AI platform that the solution will be built on?
- What are the key use cases to consider and the business benefits for implementing the use cases?
- How does your AI system learn from the vast amount of unstructured data?
- What is the interface you will provide so that users can effectively interact with what your system has been trained on?
- What are the AI models, including machine learning and deep learning models needed for the above, and how do you implement them?
- How do you manage and govern the above?
Implementing AI in the enterprise clearly involves much more than the previous information, but we hope it helps you get started or progress in a specific domain. This is an exciting area with continuous advancement in technology, and you can find additional information at Transformative technologies for your Telecommunications, Media & Entertainment business, IBM Developer, and The Data and AI Forum: Accelerate your journey to AI.