Overview
The TensorFlow Speech Commands dataset is a set of one-second .wav audio files, each containing a single spoken English word. The words come from a small set of commands and are spoken by a variety of speakers. Twenty of the words are core words, while 10 are auxiliary words that can be used to test whether an algorithm ignores speech that does not contain a trigger word. A collection of background noise audio files is included along with the 30 words. The dataset was originally designed for limited-vocabulary speech recognition tasks. The audio clips were collected by Google and recorded by volunteers in uncontrolled locations around the world.
Dataset Metadata
Field | Value |
---|---|
Format | WAV |
License | CC BY 4.0 |
Domain | Audio |
Number of Records | About 65,000 WAV files (64,727 across the three splits) |
Data Split | Train – 51,094 audio clips, Validation – 6,798 audio clips, Test – 6,835 audio clips |
Size | 1.49 GB |
Dataset Origin | The audio clips were collected by Google and recorded by volunteers in uncontrolled locations around the world. |
Dataset Version | Version 1 – March 17, 2020 |
Data Coverage | Core words: Yes, No, Up, Down, Left, Right, On, Off, Stop, Go, Zero, One, Two, Three, Four, Five, Six, Seven, Eight, and Nine. Auxiliary words: Bed, Bird, Cat, Dog, Happy, House, Marvin, Sheila, Tree, and Wow. Background noise: doing_the_dishes, dude_miaowing, exercise_bike, pink_noise, running_tap, and white_noise. For more about the data collection process, see README.md in the data archive. |
Business Use Case | Build voice recognition systems of the kind widely used in the Internet of Things, automotive, security, and UX/UI applications. Build voice-based search applications and voice-activated assistants. |
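The core and auxiliary word lists in the Data Coverage row can be turned into a small lookup, which is handy when a model should treat auxiliary words as "not a trigger". The sketch below is only an illustration: the function name `word_category` and the `"other"` fallback are assumptions, and it assumes labels match the listed words after lowercasing.

```python
# Word lists copied from the Data Coverage row above.
CORE_WORDS = {
    "yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go",
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
}
AUXILIARY_WORDS = {
    "bed", "bird", "cat", "dog", "happy", "house", "marvin", "sheila", "tree", "wow",
}


def word_category(label: str) -> str:
    """Return 'core', 'auxiliary', or 'other' for a word label (hypothetical helper)."""
    normalized = label.strip().lower()
    if normalized in CORE_WORDS:
        return "core"
    if normalized in AUXILIARY_WORDS:
        return "auxiliary"
    # Anything else (for example background noise) is left uncategorized.
    return "other"
```

For example, `word_category("Stop")` gives `"core"` and `word_category("Marvin")` gives `"auxiliary"`.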
Dataset Archive Contents
File or Folder | Description |
---|---|
31 Audio clip folders | Folders containing audio clips |
testing_list.txt | Path to all the files in the test set. |
validation_list.txt | Path to all the files in the validation set. |
LICENSE.txt | Terms of Use |
README.md | Explains data collection, processing details, and steps for splitting dataset |
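Since the archive ships lists only for the test and validation sets, one plausible way to recover all three splits is to treat every clip that appears in neither list as training data. The sketch below assumes that rule, and also assumes each list file holds one relative path per line (such as `yes/a.wav`, a made-up name); the README.md in the archive is the authoritative description of the splitting steps. The helper names `read_list` and `assign_splits` are hypothetical.

```python
from pathlib import Path


def read_list(path):
    """Read a split list file into a set of relative clip paths (one per line, assumed)."""
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def assign_splits(all_clips, validation, testing):
    """Partition clip paths into train/validation/test.

    Assumption: clips listed in neither set belong to the training split.
    """
    splits = {"train": [], "validation": [], "test": []}
    for clip in all_clips:
        if clip in testing:
            splits["test"].append(clip)
        elif clip in validation:
            splits["validation"].append(clip)
        else:
            splits["train"].append(clip)
    return splits
```

In practice, `validation` and `testing` would come from `read_list("validation_list.txt")` and `read_list("testing_list.txt")`, and `all_clips` from walking the word folders while leaving out the background noise files.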
Use the Dataset
This dataset is complemented by starter notebooks that will help you get started.
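Independently of those notebooks, a quick sanity check on a downloaded clip can be done with Python's standard `wave` module. The sketch below is a minimal example, not the notebooks' method: the helper name `clip_info` is hypothetical, and it simply reports whatever the file header says rather than assuming a particular sample rate.

```python
import wave


def clip_info(path):
    """Return basic header information for a WAV clip (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            # Duration in seconds; clips in this dataset are described as one second long.
            "duration_s": frames / rate,
        }
```

Calling `clip_info` on a few files from different word folders is a cheap way to confirm that the clips are roughly one second long before building a training pipeline.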
Related Links
- Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition. Describes how the data was collected and verified, what it contains, and its previous versions and properties.
Citation
@article{speechcommands,
  title={Speech Commands: A public dataset for single-word speech recognition.},
  author={Warden, Pete},
  journal={Dataset available from http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz},
  year={2017}
}