The human race continually faces a fundamental question: what values are we going to instill in the next generation?  Experts generally agree that instilling positive values in our kids at a young age helps them grow into successful, responsible individuals who can distinguish right from wrong. These kids have the best chance to develop a sense of justice, fairness, and unbiased thinking, and they have increased capacity to form meaningful and trusting relationships.

 

At this pivotal point in our history, the values we’re passing on to the next generation are once again in the spotlight. But this time, we’re trying to understand how to pass on the concept of fairness to the next generation of artificial intelligence systems.

Growing up fast

AI is experiencing a renaissance and, according to Gartner, it’s vital that we “build AI right,” “use AI right,” and “keep AI right.” The values adopted to build today’s AI systems will be reflected in the decisions those systems make for a decade or more.

The application of AI algorithms in domains such as criminal justice, credit scoring, and hiring holds unlimited promise. At the same time, it raises legitimate concerns about algorithmic fairness: AI systems are deciding everything from which resumes are considered, to which insurance claims are accepted, to whose mortgage loans are approved, and even to who receives parole. We need to guard against the misapplication of race, gender, religion, or other characteristics in the decisions that AI systems make. It’s often a question of legality, but it’s also a question of basic fairness.

Accordingly, there’s now a growing demand for fairness, accountability, and transparency from machine learning (ML) systems. And we need to remember that training data isn’t the only source of possible bias: bias can also be introduced through inappropriate data handling, inappropriate model selection, or incorrect algorithm design, and it can creep into the usage data a deployed system collects.

What we need is a “comprehensive bias pipeline” that fully integrates into the AI lifecycle. Such a pipeline requires a robust set of checkers, “de-biasing” algorithms, and bias explanations.

Announcing the AI Fairness 360 toolkit!

We’re happy to announce the launch of the open source AI Fairness 360 toolkit. The toolkit is designed to help address problems of bias through fairness metrics and bias mitigators: its fairness metrics can be used to check a machine learning workflow for bias, while its bias mitigators can be used to overcome bias in that workflow and produce a fairer outcome.
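To make that pattern concrete, here’s a minimal sketch of “check, mitigate, check again,” using a tiny hand-built toy dataset rather than a real one. The column names and values are purely illustrative, and the classes shown (BinaryLabelDataset, BinaryLabelDatasetMetric, and the Reweighing pre-processing mitigator) are examples of the building blocks the toolkit provides; in practice you would point them at your own data.

    # Minimal, hypothetical example: measure bias, mitigate it, measure again.
    # Assumes the toolkit is installed (pip install aif360) along with pandas.
    import pandas as pd
    from aif360.datasets import BinaryLabelDataset
    from aif360.metrics import BinaryLabelDatasetMetric
    from aif360.algorithms.preprocessing import Reweighing

    # Toy data: 'prot' is a binary protected attribute, 'label' is the outcome.
    df = pd.DataFrame({
        'feature': [0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6],
        'prot':    [0,   1,   0,   1,   0,   1,   0,   1],
        'label':   [0,   1,   0,   1,   1,   1,   0,   1],
    })
    dataset = BinaryLabelDataset(df=df, label_names=['label'],
                                 protected_attribute_names=['prot'])

    unprivileged = [{'prot': 0}]
    privileged = [{'prot': 1}]

    # Fairness metric: difference in favorable-outcome rates between groups.
    metric = BinaryLabelDatasetMetric(dataset,
                                      unprivileged_groups=unprivileged,
                                      privileged_groups=privileged)
    print('Mean difference before mitigation:', metric.mean_difference())

    # Bias mitigator: Reweighing adjusts instance weights so that outcomes
    # become statistically independent of the protected attribute.
    rw = Reweighing(unprivileged_groups=unprivileged,
                    privileged_groups=privileged)
    dataset_transf = rw.fit_transform(dataset)

    metric_transf = BinaryLabelDatasetMetric(dataset_transf,
                                             unprivileged_groups=unprivileged,
                                             privileged_groups=privileged)
    print('Mean difference after mitigation:', metric_transf.mean_difference())

A mean difference near zero after mitigation indicates that favorable outcomes are no longer skewed toward the privileged group in the reweighted data.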

Let’s take a closer look at what we mean by bias. The loan scenario mentioned above, where AI might decide which mortgages to approve, is an example of outright illegal bias, assuming those decisions are somehow being made based on race, religion, or gender. However, not all undesirable bias in machine learning is illegal; it can also show up in more subtle ways. For example, a loan company might want a diverse portfolio of customers across all income levels, and could therefore deem it undesirable to be making more loans to high-income applicants than to low-income ones. Although that scenario is neither illegal nor unethical, it works against the company’s strategy.

Bias can enter the system anywhere in the data-gathering, model-training, and model-serving phases. The training data set might be biased towards particular types of instances. The algorithm that creates the model might be biased, producing models that are weighted towards particular variables in the input. The test data set might be biased in that its expectations about correct answers are themselves biased. Testing for and mitigating bias should take place at each of these three steps in the machine learning process. In the AI Fairness 360 toolkit codebase, we call these points pre-processing, in-processing, and post-processing.
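The toolkit’s package layout mirrors this split. As an illustrative, non-exhaustive sampling, one example algorithm from each category looks like this (the specific classes named here are just representative choices, and some algorithms may have extra dependencies):

    # Pre-processing: transform the training data before a model is fit,
    # for example by reweighting or relabeling instances.
    from aif360.algorithms.preprocessing import Reweighing

    # In-processing: build the fairness objective into model training itself.
    from aif360.algorithms.inprocessing import PrejudiceRemover

    # Post-processing: adjust the predictions of an already-trained model.
    from aif360.algorithms.postprocessing import EqOddsPostprocessing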

Diagram of AI workflow


Available as open source for maximum flexibility

IBM is committed to open source. Our contributions are unrivaled in the industry. We’ve worked extremely hard over the years to establish a respected reputation in open source circles, especially in those communities where we invest strategically. IBMers serve on major open source foundation boards, including Linux, Eclipse, Apache, CNCF, Node.js, Hyperledger, and many others, and thousands of IBMers contribute regularly to open source projects. We value and work toward open governance because we think it’s the best way to ensure the long-term success and viability of open source projects. 

IBM Research has a long history of releasing open source technologies in cutting-edge innovation areas. In partnership with IBM’s Center for Open Source Data and AI Technologies (CODAIT), IBM Research recently released FfDL (Fabric for Deep Learning), which provides a consistent way to deploy, train, and visualize deep learning jobs across multiple frameworks such as TensorFlow, Caffe, PyTorch, and Keras. Additionally, we’ve launched the Adversarial Robustness Toolbox (ART) to help detect and fix security vulnerabilities in our models.

Carrying this partnership forward, we’re proud to be launching the AI Fairness 360 toolkit as an open source project. The toolkit’s Python package includes a comprehensive set of metrics for testing data sets and models for bias, explanations for these metrics, and algorithms to mitigate bias in data sets and models. The AI Fairness 360 interactive demo provides a gentle introduction to the project’s concepts and capabilities. The tutorials and other notebooks offer a deeper, data scientist-oriented introduction. The complete API is also available.

As an example, if you run the AI Fairness 360 toolkit on the Statlog (German Credit Data) Data Set, which classifies people described by a set of attributes as good or bad credit risks, you’ll discover bias on a set of metrics with respect to age. Under the “80% rule” for adverse impact used in U.S. law, a metric is marked as biased when the favorable-outcome rate for the unprivileged group falls below 80% of the rate for the privileged group. Four out of five metrics for the age attribute are marked biased by the toolkit, which quickly shows the power and utility of AIF360.

Image of graphs from Statlog Data Set
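If you’d like to try a check like this yourself, the following rough sketch follows the spirit of the toolkit’s credit-scoring tutorial. It assumes you’ve installed aif360 and downloaded the raw German credit data file into the toolkit’s data directory as its documentation describes; treating people aged 25 and over as the privileged group simply follows the tutorial’s convention, and the explainer line just prints a plain-language description of the metric.

    # Sketch: check the German credit data for age bias and apply the 80% rule.
    from aif360.datasets import GermanDataset
    from aif360.metrics import BinaryLabelDatasetMetric
    from aif360.explainers import MetricTextExplainer

    # 'age' is the protected attribute; people aged 25+ form the privileged group.
    dataset = GermanDataset(protected_attribute_names=['age'],
                            privileged_classes=[lambda x: x >= 25],
                            features_to_drop=['personal_status', 'sex'])

    metric = BinaryLabelDatasetMetric(dataset,
                                      unprivileged_groups=[{'age': 0}],
                                      privileged_groups=[{'age': 1}])

    # Disparate impact: favorable-outcome rate of the unprivileged group divided
    # by that of the privileged group. Values below 0.8 fail the 80% rule.
    di = metric.disparate_impact()
    print(MetricTextExplainer(metric).disparate_impact())
    print('Flagged as biased under the 80% rule:', di < 0.8)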


Let’s raise our AI systems right!

With the launch of the AI Fairness 360 toolkit, we’ve taken another step in our mission to democratize AI and bring it closer to developers. The AI Fairness 360 toolkit, together with the Adversarial Robustness Toolbox (ART), Fabric for Deep Learning (FfDL), and the Model Asset Exchange (MAX), provides a powerful set of tools to optimize your AI implementations. Each of them is available now on GitHub to deploy, use, and extend as you see fit.

It’s an exciting time. Our mission to grow the next generation of AI systems continues. We hope you’ll try the AI Fairness 360 toolkit today, and we look forward to hearing your experiences and feedback. Join us, and together let’s raise AI right!

Additional reading