TakeTwo: A quick and simple tool to help detect and eliminate racial bias

TakeTwo is a quick and simple tool that helps detect and eliminate racial bias, both overt and subtle, in written content. By flagging phrases and words that can be seen as racially biased, it helps content creators proactively mitigate potential bias as they write. Content creators can check their content before publishing it, currently through an online text editor.

As you compose social media text, paragraphs, essays, and papers, the TakeTwo API scans the content for potentially racially biased language. The API works by flagging and classifying phrases and words that tend to be perceived as racially biased within the United States. These phrases and words are then categorized by common types of detectable racially biased language.

TakeTwo is built using open source technologies. The API is built using Python and FastAPI, and the Chrome extension uses JavaScript. The tutorial provides instructions on running the application on the cloud using Docker for containerization. The racially biased terms are vetted and loaded into a back-end database. The code is set up to run the API locally with a CouchDB back-end database or an IBM Cloudant® database. There is a front-end HTML page that serves as an example text editor.
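
To get a feel for the CouchDB back end, you can talk to a local instance directly from Python. The snippet below is a minimal sketch only: it assumes a local CouchDB server with default credentials and port, and a hypothetical database name and document shape; the actual database names and schema are defined in the project readme.

```python
import couchdb  # pip install CouchDB

# Connect to a local CouchDB instance (assumed credentials and default port).
server = couchdb.Server("http://admin:password@localhost:5984/")

# Hypothetical database name; the real names come from the project's readme.
db_name = "categories"
db = server[db_name] if db_name in server else server.create(db_name)

# Store an example category document and read it back.
doc_id, rev = db.save({"name": "example-category", "description": "placeholder"})
print(db[doc_id])
```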

Prerequisites

To follow this tutorial and use the solution starter, you need an understanding of:

  • Python
  • FastAPI
  • JavaScript
  • Chrome Extension API
  • CouchDB
  • Docker

You also need the following to run the services on IBM Cloud®:

  • An IBM Cloud account. When you register, you can also join the Call for Code community of over 400,000 developers to build new skills and contribute to racial justice open source projects.
  • IBM Cloud CLI installed and configured
  • The App ID Service

Estimated time

It should take you approximately 60 minutes to complete this tutorial.

Architecture diagram

The TakeTwo solution consists of multiple components, each of which has its own GitHub repository:

  • The TakeTwo Chrome extension lets authorized users highlight racially biased text in their browser and report it back to the database.
  • The TakeTwo web API provides endpoints for authenticated users to submit racially biased words and phrases to the database.
  • The TakeTwo data science workstream is a natural language processing model that predicts whether a word or phrase contains racial bias and, if so, returns the category of racial bias. The model is trained using data crowdsourced from the Chrome plug-in.

Architecture

  1. The Chrome extension enables an authenticated user to highlight content on the fly within their browser and categorize it as racially biased.
  2. An authenticated user can mark text and tag racially biased terms.
  3. The information is submitted to the TakeTwo API.
  4. The TakeTwo API writes data to the back-end database.
  5. The machine learning model reads data from the database to train and refine the model.
  6. The client app sends the content as a request to the model, and the model responds by flagging any text that could contain racially biased terms, as sketched below.
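
For illustration, here is roughly what step 6 could look like from a client's perspective. This is a minimal sketch, not the project's documented API: the endpoint path, payload fields, and response format are assumptions, and the real contract is described in the web API readme.

```python
import requests

API_URL = "http://localhost:8000"  # assumed local instance of the TakeTwo API

# Hypothetical detection request: send draft content, receive flagged terms.
draft = "Example sentence to check before publishing."
response = requests.post(
    f"{API_URL}/detect",                          # assumed endpoint path
    json={"text": draft},                         # assumed payload shape
    headers={"Authorization": "Bearer <token>"},  # OAuth 2.0 bearer token
    timeout=10,
)
response.raise_for_status()

# Assumed response shape: a list of flagged terms with their categories.
for flag in response.json().get("flagged", []):
    print(flag["term"], "->", flag["category"])
```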

TakeTwo web back-end API

The web API is built in Python and handles the following:

  • Captures the data crowdsourced by contributors through the TakeTwo Chrome extension tool
    • Authenticates users that are allowed to create new highlights using OAuth 2.0
    • Allows authenticated users to post highlighted text back to the CouchDB database (see the sketch after this list)
    • Allows authenticated users to remove highlighted texts
  • Provides endpoints to populate the TakeTwo Chrome extension tool
  • Provides training data for the machine learning model that can detect racial bias
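
The sketch below shows roughly how such an authenticated endpoint can be put together with FastAPI and its built-in OAuth 2.0 bearer support. It is an illustration only, with hypothetical field names and placeholder token validation; the actual routes, models, and authentication flow are in the web API repository.

```python
from typing import Optional

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")


class Highlight(BaseModel):
    # Hypothetical fields for a crowdsourced highlight.
    text: str
    category: str
    url: Optional[str] = None


def is_valid(token: str) -> bool:
    # Placeholder: a real implementation would verify the OAuth 2.0 token,
    # for example against IBM Cloud App ID.
    return bool(token)


@app.post("/highlights")
async def create_highlight(highlight: Highlight, token: str = Depends(oauth2_scheme)):
    if not is_valid(token):
        raise HTTPException(status_code=401, detail="Invalid token")
    # In the real API, this document would be written to CouchDB or Cloudant.
    return {"status": "created", "highlight": highlight.dict()}
```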

To set up the web back-end API, follow the instructions in the readme file.

TakeTwo browser extension

The TakeTwo Chrome JavaScript extension uses the Highlighter Chrome extension library to allow authenticated users to select and highlight text. The extension is a plug-in that makes it easy to capture and categorize potentially racially biased words and phrases directly in the browser.

This extension enables the crowdsourcing of data for use in training a machine learning model. The extension aims to make it as easy as possible for community members who would like to contribute to this initiative to do so quickly and privately.

To install the Chrome extension, follow the instructions in the readme file.

TakeTwo data science

The TakeTwo data science workstream uses the data that is crowdsourced through the Chrome extension and stored in the back-end database. The machine learning model code is written in Python and runs in a Jupyter Notebook.

To build a machine learning model to predict whether a word or phrase contains racial bias, follow the instructions in the readme file.
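
As a rough illustration of this kind of model, the sketch below trains a simple text classifier with scikit-learn on a handful of made-up (phrase, category) pairs. It is not the project's actual model or data; the real notebook, training data, and categories live in the data science repository.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training data; the real model is trained on crowdsourced highlights.
phrases = [
    "example phrase one",
    "example phrase two",
    "example phrase three",
    "example phrase four",
]
categories = ["category-a", "category-b", "category-a", "category-b"]

# Character n-grams help a classifier generalize to short phrases.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(phrases, categories)

# Predict the category of a new, unseen phrase.
print(model.predict(["another example phrase"]))
```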

Steps

To build and use the TakeTwo API:

  1. Clone the TakeTwo web API repository, and follow the instructions.
  2. Clone the TakeTwo Chrome extension repository, and follow the instructions.
  3. Clone the TakeTwo data science workstream repository, and follow the instructions.

Summary

In this tutorial, you learned how the TakeTwo API provides a quick and simple tool to help detect and eliminate racial bias. Using TakeTwo to detect phrases and words that can be seen as racially biased can assist you in proactively mitigating potential bias as you write. We welcome your further involvement by becoming an active contributor to the GitHub repository for this open source project.

As a developer, you can take a stand and apply your skills and ingenuity to make a difference. Learn how you can be a part of a motivated community of developers and supporters working to evolve the Call for Code for Racial Justice open source solutions.