Instilling trust in AI

Artificial intelligence (AI) is everywhere. From Alexa in our homes to recommendations from Netflix to Google’s predictive search engine to surgical robots, these AI systems can sense, reason, understand, learn and interact.

With AI providing benefits across industries, enterprises and start-ups are constantly innovating on how to make applications smarter and better. The AI market is projected to become a $190 billion (USD) industry by 2025. However, advancements in AI can create new challenges and raise legal and ethical questions.

Today, AI is seen as a black box spitting out mysterious decisions, and businesses can struggle to explain or understand how these outcomes are reached.

For example, major cities in China are using smart cameras and AI-based facial recognition to detect and identify jaywalkers (people who walk in or cross a roadway that has traffic, other than at a suitable crossing point). The jaywalker’s partially obscured name and face then show up on a public display screen. This system made an error in the city of Ningbo, where it falsely “recognized” a photo of Chinese billionaire Mingzhu Dong on an advertisement on a passing bus as a jaywalker. In this case, Dong called it a “trivial matter” because live detection and recognition are challenging.


But is it a trivial matter when the results of an AI system might help determine which criminal sentence is applied? In the United States, courtroom judges could base their decisions, including bail amounts and sentences, for defendants and people convicted of a crime, on the results of COMPAS. COMPAS is AI-powered risk assessment software that is used to forecast which defendants are most likely to reoffend. The accuracy of this software was questioned when it showed bias in its results.

It is definitely not a trivial matter when human safety is at stake. In 2018, one of Uber’s self-driving cars, which was in autonomous mode and had a human safety driver at the wheel, struck and killed a female pedestrian in Arizona. The vehicle’s sensors detected the pedestrian, but the system did not brake or take any other action. This led Uber, and other companies like Toyota and NVIDIA, to suspend their self-driving vehicle testing in the US.

As AI continues to expand its role in our lives and make decisions for us and on our behalf, an important question comes to mind:

What level of trust can – and should – we place in these AI systems?

To try to answer this question, in April 2019, the European Union published its ethics guidelines for artificial intelligence, in which it outlined seven essentials for achieving trustworthy AI. These guidelines include requirements such as privacy and data governance, diversity, and safety.

Earlier this year, the Smart Dubai Government announced the Dubai AI Principles, which lay out the city’s aspirations and provide a roadmap for the behavior of AI systems. They also announced the Dubai AI Ethics Guidelines to provide detailed guidance on crucial issues like accountability, and others relating to AI algorithms. You can try out the Dubai AI Ethics self-assessment tool to self-evaluate the ethics level of an AI system using Dubai’s AI Ethics Guidelines.

Four pillars of trusted AI

Ethical AI is a major area of research at IBM. This paper describes four elements, or pillars, that form the basis for trusted AI systems.

  1. Explainability: Knowing and understanding how AI models arrive at specific decisions. It means having visibility into how they work. Right now, many machine learning models are black boxes. We don’t really know why they make the choices they do.

    Although it is a key fundamental of trusted AI, explainability can be challenging to implement, and it often comes at the cost of model accuracy. This is because simpler models that are easy to interpret (for example, linear regression or decision trees) are limited in their predictive capacity. More powerful and complex algorithms such as neural networks or random forests tend to be highly accurate but difficult to understand.

    Another factor to consider is that explainability isn’t one-dimensional. Different stakeholders want insights into different aspects of the inner workings of an algorithm, depending on their purposes and objectives. Therefore, explanations must be tailored to the audience.

  2. Fairness: Ensuring fair AI systems means getting rid of (or at least minimizing) bias in the model or data. Bias can be described as the mismatch between the training data distribution and a desired fair distribution. Bias can easily enter AI systems through the training data you create, collect, or process. Machine learning algorithms learn from the training data they are given, so a system can pick up these biases, encode them, and apply them at scale, which can produce unfair results (as with COMPAS). As they say: garbage in, garbage out.

    To encourage the adoption of AI, you must ensure that it does not take on and amplify biases, and you must use equitable training data and models to avoid unfair treatment. Establishing tests for identifying and minimizing bias in training data sets, and for curating those data sets, should be a key element of establishing fairness in AI systems.

  3. Robustness: Robustness consists of two factors, safety and security.

    AI safety is typically associated with the ability of an AI model to build knowledge that incorporates societal norms, policies, or regulations that correspond to well-established safe behaviors.

    AI security is defending AI systems from malicious attacks. Like any software system, AI systems are vulnerable to adversarial attacks. This raises security concerns as models can be tampered with or data can be compromised or poisoned.

    Attackers can steal AI models by studying their outputs or fool the algorithm by introducing noise or adversarial perturbations. If you think this is difficult, check out One Pixel Attack for Fooling Deep Neural Networks, which describes how a deep neural network misclassifies an image when one pixel – that’s right, only one pixel – is modified in the image.

    Security can be improved by exposing and fixing vulnerabilities in the system, identifying new attacks and defenses, designing new adversarial training methods to strengthen against attacks, and developing new metrics to evaluate robustness.

  4. Lineage: AI models are constantly evolving. Lineage means tracking and maintaining the provenance of data sets, metadata, models along with their hyperparameters, and test results. Traceability is crucial for regulators, appropriate organizations, third parties, and users to audit the system and be able to reproduce past outputs and track outcomes.

    Lineage for AI systems can help determine the exact version of the service that was deployed at any point in the past, how many times the service was retrained, the details of each training episode (such as the hyperparameters and the training data set used), how accuracy and safety metrics have evolved over time, the feedback data received by the service, and the triggers for retraining and improvement.
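To make the lineage pillar a little more concrete, here is a minimal Python sketch of what a lineage log could look like. It is an illustration under stated assumptions, not how any particular product stores lineage: the names LineageRecord, LineageLog, register, and version_at are hypothetical, and the training data, hyperparameters, and metric values are made up for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def fingerprint(obj) -> str:
    """Return a stable SHA-256 hex digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class LineageRecord:
    """One training-and-deployment episode of a hypothetical AI service."""
    version: int
    deployed_at: datetime
    data_fingerprint: str  # hash of the training data, not the data itself
    hyperparameters: dict
    metrics: dict
    retrain_trigger: str


@dataclass
class LineageLog:
    """Append-only history of lineage records (hypothetical helper)."""
    records: list = field(default_factory=list)

    def register(self, deployed_at, training_data, hyperparameters,
                 metrics, retrain_trigger) -> LineageRecord:
        record = LineageRecord(
            version=len(self.records) + 1,
            deployed_at=deployed_at,
            data_fingerprint=fingerprint(training_data),
            hyperparameters=dict(hyperparameters),
            metrics=dict(metrics),
            retrain_trigger=retrain_trigger,
        )
        self.records.append(record)
        return record

    def version_at(self, moment) -> Optional[LineageRecord]:
        """Return the record that was live at `moment`, or None if none was."""
        live = [r for r in self.records if r.deployed_at <= moment]
        return max(live, key=lambda r: r.deployed_at) if live else None


# Made-up history: an initial release followed by one retraining.
log = LineageLog()
log.register(datetime(2019, 1, 10, tzinfo=timezone.utc),
             [[0.1, 1], [0.9, 0]],
             {"learning_rate": 0.1}, {"accuracy": 0.81},
             "initial release")
log.register(datetime(2019, 3, 2, tzinfo=timezone.utc),
             [[0.1, 1], [0.9, 0], [0.4, 1]],
             {"learning_rate": 0.05}, {"accuracy": 0.86},
             "accuracy drift on feedback data")

# Which version was deployed on 1 February 2019, and how was it trained?
live = log.version_at(datetime(2019, 2, 1, tzinfo=timezone.utc))
print(live.version, live.hyperparameters, live.retrain_trigger)
```

Storing a fingerprint of the training data rather than the data itself keeps the log small while still letting an auditor confirm that a retained data set is the one a given version was trained on.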


Now that you know the importance of engendering trust in AI systems and the factors that influence it, let’s take a look at some tools that can help you build that trust.

Watson OpenScale

Watson OpenScale is an enterprise-grade environment for AI-infused applications that gives enterprises visibility into how AI is being built and used, and into the ROI it delivers. OpenScale is open by design and can detect and mitigate bias, help explain AI outcomes, scale AI usage, and give insights into the health of the AI system – all within a unified management console.

AI Explainability 360

The AI Explainability 360 toolkit is an open source Python library that can help you understand, by various means, how machine learning models predict labels throughout the AI application lifecycle. It includes algorithms that support interpretability and explainability of data sets and machine learning models.
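To give a feel for what explaining a black-box prediction can mean, the sketch below uses a simple perturbation (ablation) approach in plain Python: reset one feature at a time to a baseline value and record how much the model's output changes. This is not an algorithm taken from the toolkit and does not use its API. The function ablation_attributions, the toy credit_score model, and all of the numbers are hypothetical.

```python
from typing import Callable, Dict


def ablation_attributions(model: Callable[[Dict[str, float]], float],
                          instance: Dict[str, float],
                          baseline: Dict[str, float]) -> Dict[str, float]:
    """Attribute a prediction to features by resetting each one to a baseline.

    A positive value means the feature pushed the score up relative to the
    baseline; a negative value means it pushed the score down.
    """
    full_score = model(instance)
    attributions = {}
    for name in instance:
        perturbed = dict(instance)
        perturbed[name] = baseline[name]  # ablate a single feature
        attributions[name] = full_score - model(perturbed)
    return attributions


def credit_score(applicant: Dict[str, float]) -> float:
    """Hypothetical opaque model; the explainer only ever calls it."""
    score = 0.5 * applicant["income"] - 0.25 * applicant["debt"]
    if applicant["late_payments"] > 2:
        score -= 20.0
    return score


applicant = {"income": 60.0, "debt": 40.0, "late_payments": 3.0}
baseline = {"income": 50.0, "debt": 50.0, "late_payments": 0.0}

attributions = ablation_attributions(credit_score, applicant, baseline)

# List features from most to least influential for this one applicant.
for name, value in sorted(attributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.1f}")
```

For this made-up applicant, the late payments dominate the explanation, which is the kind of insight a loan officer might want, while a data scientist might prefer a global view of the model instead.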

Adversarial Robustness 360 Toolbox

The Adversarial Robustness 360 Toolbox is an open source Python library for adversarial machine learning and supports defending deep neural networks against adversarial attacks, making AI systems more secure. Its purpose is to allow rapid crafting and analysis of attacks and defense methods for machine learning models. The Adversarial Robustness 360 Toolbox provides an implementation for many state-of-the-art methods for attacking and defending classifiers.
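The idea behind a single-pixel attack can be sketched in a few lines of plain Python. The example below does not use the toolbox's API and is not the method from the One Pixel Attack paper: it swaps in an exhaustive search over a tiny 3x3 image, and the classifier, its WEIGHTS, and the THRESHOLD are hypothetical stand-ins for a real neural network.

```python
import copy

# Hypothetical toy classifier: a weighted sum of pixel intensities in [0, 1].
WEIGHTS = [
    [0.05, 0.05, 0.05],
    [0.05, 0.60, 0.05],
    [0.05, 0.05, 0.05],
]
THRESHOLD = 0.5


def classify(image):
    """Label an image 'bright' if its weighted intensity exceeds THRESHOLD."""
    score = sum(w * p
                for w_row, p_row in zip(WEIGHTS, image)
                for w, p in zip(w_row, p_row))
    return "bright" if score > THRESHOLD else "dark"


def one_pixel_attack(image, candidate_values=(0.0, 1.0)):
    """Search for a single-pixel change that flips the predicted label.

    Returns (row, col, new_value) for the first change found, or None.
    Exhaustive search is only practical here because the image is tiny.
    """
    original_label = classify(image)
    for row in range(len(image)):
        for col in range(len(image[row])):
            for value in candidate_values:
                perturbed = copy.deepcopy(image)
                perturbed[row][col] = value
                if classify(perturbed) != original_label:
                    return row, col, value
    return None


image = [
    [0.2, 0.2, 0.2],
    [0.2, 0.6, 0.2],
    [0.2, 0.2, 0.2],
]

attack = one_pixel_attack(image)
print("original label:", classify(image))
if attack is not None:
    row, col, value = attack
    adversarial = copy.deepcopy(image)
    adversarial[row][col] = value
    print("changed pixel:", (row, col), "->", value)
    print("new label:", classify(adversarial))
```

Here the model leans heavily on one pixel, so changing that pixel alone is enough to flip the label. Defenses such as adversarial training aim to reduce this kind of sensitivity.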

AI Fairness 360

The AI Fairness 360 toolkit is an open source Python library to help detect and remove bias in machine learning models. The AI Fairness 360 Python package includes a comprehensive set of metrics for data sets and models to test for biases, explanations for these metrics, and algorithms to mitigate bias in data sets and models.
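As a rough illustration of the kind of metric such a toolkit computes, the sketch below calculates two widely used group fairness measures, statistical parity difference and disparate impact, in plain Python. It does not use the AI Fairness 360 API. The helper names, the group labels "a" and "b", and the outcome data are all hypothetical.

```python
def favorable_rate(outcomes, groups, group):
    """Fraction of favorable outcomes (1s) among members of `group`."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(outcomes, groups, unprivileged, privileged):
    """Unprivileged rate minus privileged rate; 0.0 means parity."""
    return (favorable_rate(outcomes, groups, unprivileged)
            - favorable_rate(outcomes, groups, privileged))


def disparate_impact(outcomes, groups, unprivileged, privileged):
    """Unprivileged rate divided by privileged rate; 1.0 means parity."""
    privileged_rate = favorable_rate(outcomes, groups, privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return favorable_rate(outcomes, groups, unprivileged) / privileged_rate


# Made-up decisions from a model: 1 = favorable (for example, loan approved).
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]

spd = statistical_parity_difference(outcomes, groups,
                                    unprivileged="b", privileged="a")
di = disparate_impact(outcomes, groups, unprivileged="b", privileged="a")

print(f"statistical parity difference: {spd:.2f}")
print(f"disparate impact: {di:.2f}")
```

In this toy data set, group "a" receives a favorable outcome three times as often as group "b". A disparate impact well below 1.0 (a common rule of thumb flags values under 0.8) suggests that the data or model deserves a closer look before deployment.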


This article explained how bias and attacks can affect AI systems. It gave several examples of how this can happen and has happened, and it introduced some tools to help mitigate the issues.