
Airline Reporting Carrier On-Time Performance Dataset

Overview

The Reporting Carrier On-Time Performance Dataset contains information on approximately 194 million domestic US flights reported to the United States Bureau of Transportation Statistics. Each record holds basic information about the flight (such as date, time, departure airport, and arrival airport) and, if applicable, how long the flight was delayed and the reason for the delay. This dataset can be used to predict the likelihood of a flight arriving on time.

Dataset Metadata

Field                    Value
Format                   CSV
License                  CDLA-Sharing
Domain                   Time Series
Number of Records        194,385,636 flights
Data Split               NA
Size                     81 GB
Dataset Origin           Bureau of Transportation Statistics
Dataset Version Update   Version 1 – June 25, 2020
Data Coverage            Location: United States; Dates: 1987 through 2020
Business Use Case        Aviation: Predict which flights are likely to arrive on time

Dataset Archive Contents

File or Folder    Description
airline.csv       All records
airline_2m.csv    Random 2 million record sample (approximately 1%) of the full dataset
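
Because the full file is far larger than typical memory, one way to work with either file is to stream it one record at a time. The following is a minimal sketch using only the Python standard library. The helper names `iter_flights` and `count_flights` are hypothetical, and the `utf-8` default is an assumption: the archive's actual text encoding is not stated here, so pass a different `encoding` if decoding looks wrong.

```python
import csv
from typing import Dict, Iterator


def iter_flights(path: str, encoding: str = "utf-8") -> Iterator[Dict[str, str]]:
    """Yield one flight record at a time as a dict keyed by the CSV header."""
    # newline="" is what the csv module expects for files it parses itself.
    # errors="replace" keeps the scan going if a byte fails to decode; the
    # file's real encoding is an assumption, not something stated by the dataset.
    with open(path, newline="", encoding=encoding, errors="replace") as handle:
        yield from csv.DictReader(handle)


def count_flights(path: str, encoding: str = "utf-8") -> int:
    """Count records while holding only one row in memory at a time."""
    return sum(1 for _ in iter_flights(path, encoding))
```

Starting with `count_flights("airline_2m.csv")` on the sample file is a cheap way to check that the file parses before committing to a pass over `airline.csv`.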

Data Glossary and Preview

Explore the data glossary, sample records, and additional dataset metadata.

Use the Dataset

This dataset is complemented by Python notebooks for data exploration, data analysis, and modeling to help you get started.
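
Independently of those notebooks, a first step toward the on-time prediction use case is deriving an on-time label from the arrival delay. The sketch below rests on two assumptions that should be checked against the data glossary: that the arrival delay in minutes is stored in a column named `ArrDelay`, and that a flight counts as on time when it arrives less than 15 minutes late. The function name `on_time_rate` is hypothetical.

```python
import csv
import io
from typing import Iterable, Mapping, Optional

DELAY_COLUMN = "ArrDelay"      # assumed column name; confirm in the data glossary
ON_TIME_THRESHOLD_MIN = 15.0   # assumed cutoff: under 15 minutes late is on time


def on_time_rate(rows: Iterable[Mapping[str, Optional[str]]],
                 column: str = DELAY_COLUMN) -> float:
    """Return the fraction of records with a usable delay that arrived on time."""
    on_time = 0
    total = 0
    for row in rows:
        raw = (row.get(column) or "").strip()
        if not raw:
            continue  # no recorded arrival delay for this record
        try:
            delay = float(raw)
        except ValueError:
            continue  # skip values that are not numeric
        total += 1
        if delay < ON_TIME_THRESHOLD_MIN:
            on_time += 1
    if total == 0:
        raise ValueError(f"no usable values found in column {column!r}")
    return on_time / total


# Tiny in-memory example with made-up rows, not real dataset records.
sample = io.StringIO("Origin,ArrDelay\nSFO,-5\nJFK,14\nORD,15\nLAX,\nBOS,42\n")
print(on_time_rate(csv.DictReader(sample)))
```

Records with an empty delay are skipped rather than counted as late. That is a design choice for this sketch, and a model for the stated use case may need to treat such records differently.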

Citation

This dataset was compiled from data available on the Bureau of Transportation Statistics website and is a US Government work not subject to copyright.
