
Open Sentencing: Identifying discrepancies in prison sentencing based on race


People in the Black community are faced with harsher downstream effects in the criminal justice system. They are charged at higher rates, assigned more significant charges, convicted at higher rates, given longer sentences, and denied parole more often than people of other races for similar offenses. This systemic bias in the justice system has a deep and lasting impact on Black families, Black communities, and the country. Open Sentencing exposes bias and empowers public defenders to directly address racial disparities in the judicial system.

The solution includes a user interface (UI) that feeds into a pretrained Bias & Disparity Detection Engine. This engine analyzes fact patterns and rapidly provides statistical analysis that highlights deviations from guidelines by race throughout an accused person’s judicial process. The reports from Open Sentencing give the public defender clear insights to aid in defending against detected bias and, ultimately, to reduce unjust incarceration of members of the Black community.

Learning objectives

In this tutorial, get a copy of the solution running on your local system, and explore the data science that goes into identifying disparity in prison sentencing for different racial groups. You see how the model was developed, including standard machine learning practices such as feature correlation studies and hyperparameter optimization. Finally, you understand how the model can be packaged to provide a REST endpoint and function as an independent microservice in a larger data management system.


To follow this tutorial and use the solution, you need an understanding of:

  • Python
  • pandas
  • scikit-learn
  • scikit-optimize
  • Flask
  • Docker
  • Swagger

Estimated time

It should take you approximately 30 minutes to complete this tutorial.

Architecture diagram

The full solution includes several microservices: a front end and an IBM® Cloudant® database for persistent data, with a central service called the aggregator that routes requests to various locations as needed. There are currently two machine learning microservice modules in development, each with its own repository, data sources, and modeling approach. This tutorial focuses on these two prediction services and discusses the unique approach each model takes.

Architecture Diagram

The project is discussed in the Open Sentencing solution starter GitHub repository.

The complete system runs four microservices and one JSON store; each microservice has its own repository.

Running locally

Downloading repositories and launching servers

Launching Flask: Open Sentencing Model

  1. Download the Open Sentencing Model repository. You can install the required libraries by using the following command:

     pipenv install
  2. Launch the Flask server running the model with the following command. It serves on localhost:3000.

     python manage.py run

    Use the following curl command to do a quick test of the server.


    The command returns the following result.

       "model_name": "sentence_pipe_mae1.555_2020-10-10_02h46m24s",
       "sentencing_discrepency": 0.211,
       "severity": 0.555

The model_name is the file name (with a .pkl extension) of the model file used to make the prediction. Discrepancy and Severity are discussed in the notebook. The following examples can be run on your local machine using jupyter-notebook.
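The save-and-serve pattern can be sketched in a few lines. This is a minimal illustration, not the repository's actual pipeline: the `LinearRegression` stand-in and the reuse of the model file name are for demonstration only.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression

# Stand-in for the real sentencing pipeline: any fitted scikit-learn
# estimator can be serialized and reloaded the same way.
model = LinearRegression().fit(
    np.array([[0.0], [1.0], [2.0]]), np.array([1.0, 3.0, 5.0])
)

# File name following the convention shown above; the metric and
# timestamp embedded in the name are just labels.
model_name = "sentence_pipe_mae1.555_2020-10-10_02h46m24s"
with open(f"{model_name}.pkl", "wb") as f:
    pickle.dump(model, f)

# At serving time, the Flask app can load the pickle once and reuse it
# for every prediction request.
with open(f"{model_name}.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded.predict(np.array([[3.0]]))[0])  # exact fit of y = 2x + 1, so 7.0
```

Only unpickle model files from sources you trust; `pickle.load` can execute arbitrary code.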

Launching Flask: Bias Detection Engine

  1. Download the Bias Detection Engine repository. You can install the required libraries by using the following command:

     pipenv install
  2. Launch the Flask server running the model with the following command. It serves on localhost:5000.

     python manage.py run

    Use the following curl command to do a quick test of the server.

     curl -X POST -H "Content-Type: application/json" -d '{"charge_code": "Drug trafficking", "race": "Black", "gender": "Male", "controlled_substance_quantity_level": 6}' localhost:5000/sentencing-disparity

    The command returns the following result.

     {'charge_code': 'Drug trafficking',
      'controlled_substance_quantity_level': 6,
      'deviations': [{'charge_code': 'Drug trafficking',
        'sentence_deviations': [{'commitmentTerm': 39.72043010752688,
          'commitmentUnit': 'Months',
          'sentence_type': 'Prison Only'}]}],
      'gender': 'Male',
      'race': 'Black',
      'success': True}

Understanding the APIs with notebooks

If you want to gain a deeper understanding of how the model is loaded, processed, and served, each project provides a notebook that illustrates the usage of the model from within Python. For the Bias Detection Engine, look at Calling the REST API. This notebook calls the live services on localhost:5000.

The Loading Data and Making Predictions Notebook guides you through the steps of loading the Cook County Sentencing Data, cleaning it as needed, and calling the Python functions directly to make predictions using the pretrained model.
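The cleaning steps in that notebook follow a familiar pandas pattern. The frame below is a tiny synthetic stand-in for the Cook County data, and the column names are simplified for the sketch; the notebook uses the real columns.

```python
import pandas as pd

# Illustrative slice of sentencing data; the real CSV is far larger and
# these column names are simplified stand-ins.
raw = pd.DataFrame({
    "RACE": ["Black", "White", "Black", "White", None],
    "SENTENCE_TYPE": ["Prison", "Prison", "Probation", "Prison", "Prison"],
    "COMMITMENT_TERM": [60.0, 48.0, None, 36.0, 24.0],
})

# Typical pre-modeling cleanup: drop rows with missing key fields and
# keep only the sentence type being modeled.
clean = (
    raw.dropna(subset=["RACE", "COMMITMENT_TERM"])
       .query("SENTENCE_TYPE == 'Prison'")
       .reset_index(drop=True)
)
print(clean)
```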

This notebook is meant to be run locally while the Flask server is serving on localhost:5000 of the same machine. In it, you see how a pretrained model is loaded along with the raw input data used to train it, how the model is invoked to reproduce the results provided by the REST endpoint, and how to explore the intermediate variables. You also see how the Python requests library is used to make REST calls, so you can compare the server’s results to those you computed directly.
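The requests call mirrors the earlier curl example against the Bias Detection Engine. The guard below simply keeps the snippet from raising when no local server is running.

```python
import requests

# Same request body as the curl example above.
payload = {
    "charge_code": "Drug trafficking",
    "race": "Black",
    "gender": "Male",
    "controlled_substance_quantity_level": 6,
}

try:
    resp = requests.post(
        "http://localhost:5000/sentencing-disparity", json=payload, timeout=5
    )
    result = resp.json()
except requests.exceptions.RequestException:
    result = None  # the local Flask server is not running

print(result)
```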

Understanding the model training and development with a notebook

Now, let’s take a look at how the model is trained.

The Bias Detection Engine

The criminal justice system is a complex system of states, as shown in the following figure. While the REST endpoint provides a simple measure, it covers only people who are actually convicted and sentenced. In reality, a person might pass through many other states after arrest, and decisions made along the way (such as plea bargains) affect the person’s fate. An analysis of this process reveals that Black people and other people of color are treated less favorably at nearly every step.


To understand more about this analysis, review the Bias Detection Engine Demo, which is developed from a simulated version of data obtained from the United States Sentencing Commission. The guidelines are generated by the recommendations of the Sentencing Commission and are determined by the nature and severity of the crime. See Drug Trafficking Data for a discussion of how the data set is generated, along with the sentencing guidelines.

For cases where the individual is sentenced, you can compare the actual sentence with the guidelines, and that comparison shows that Black people are often given harsher sentences than White people for similar crimes.
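That comparison reduces to a simple groupby over the deviation from the guideline term. The numbers below are synthetic and for illustration only; the engine works from the simulated Sentencing Commission data described above.

```python
import pandas as pd

# Illustrative sentenced cases: actual prison terms (in months) alongside
# the guideline term for the same offense level. Values are synthetic.
cases = pd.DataFrame({
    "race": ["Black", "Black", "White", "White"],
    "actual_months": [80.0, 96.0, 62.0, 70.0],
    "guideline_months": [60.0, 60.0, 60.0, 60.0],
})

# Deviation from the guideline, averaged by race, surfaces the disparity.
cases["deviation"] = cases["actual_months"] - cases["guideline_months"]
by_race = cases.groupby("race")["deviation"].mean()
print(by_race)  # Black: 28.0, White: 6.0 for this synthetic sample
```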

The Open Sentencing Model

While both of these services return similar predictions, they follow very different threads of data discovery and model development. The previous model uses formal legal guidelines to generate the “ground truth” estimate of an appropriate sentence. In this model, we instead predict the sentence for a given feature set (including race) using a model trained on a large set of example data. We can then estimate disparity, or difference, by comparing the model predictions when we change only race and keep all other features constant. In machine learning terminology, we refer to this as modifying a protected attribute.
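The protected-attribute comparison can be sketched end to end. Everything below is synthetic: the features, the built-in disparity, and the pipeline are stand-ins for the real model, chosen only to show the counterfactual flip.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic training data standing in for the real feature set.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "race": rng.choice(["Black", "White"], size=n),
    "offense_severity": rng.integers(1, 10, size=n),
})
# Simulated sentences (months) with a built-in racial disparity of 12
# months, so the counterfactual has something to detect.
y = 6 * X["offense_severity"] + np.where(X["race"] == "Black", 12, 0)

pipe = Pipeline([
    ("encode", ColumnTransformer(
        [("race", OneHotEncoder(), ["race"])], remainder="passthrough")),
    ("model", LinearRegression()),
]).fit(X, y)

# Counterfactual: the same case with only the protected attribute changed.
case = pd.DataFrame({"race": ["Black"], "offense_severity": [5]})
flipped = case.assign(race="White")
discrepancy = pipe.predict(case)[0] - pipe.predict(flipped)[0]
print(round(discrepancy, 2))  # recovers the simulated 12-month disparity
```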

You can look at the notebook used to build the Open Sentencing Model. This model predicts the difference between sentencing for a conviction if the person were of a different race. You can open the notebook with the command jupyter-notebook and explore the data on your own machine.


Discrepancies in Prison Sentencing by Race, as produced in the notebook

You can use a pretrained model or train your own. Training a model from scratch takes roughly 30 minutes on a typical system.
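The training loop pairs a regressor with a hyperparameter search. The prerequisites list scikit-optimize for Bayesian search; the sketch below substitutes scikit-learn’s plain grid search on synthetic data to keep the example small and dependency-light.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the sentencing features.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 4))
y = X @ np.array([3.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=120)

# Cross-validated search over a small hyperparameter grid, scored by
# mean absolute error (the metric embedded in the model file name).
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    cv=3,
    scoring="neg_mean_absolute_error",
).fit(X, y)

print(search.best_params_, round(-search.best_score_, 3))
```

Swapping in scikit-optimize’s `BayesSearchCV` keeps the same fit/score interface while sampling the hyperparameter space more efficiently.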


This solution uses two models that directly and quantitatively address the question of systemic bias in sentencing, with recommendations for mitigating bias in future sentences. Beyond the data science, we explored the steps required to make these predictions available to a complete system that supports report generation and database maintenance, with a UI that non-technical legal staff can use to gain important insights.

As a developer, you can take a stand and apply your skills and ingenuity to make a difference. Learn how you can be a part of a motivated community of developers and supporters working to evolve the Call for Code for Racial Justice open source solutions.