The dataset contains:
1. 132 concepts
2. 4603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts

The released data file has 4 columns:
Column A: the label
Column B: the concept
Column C: the page title of the category or list in Wikipedia
Column D: the URL of the category/list page
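
As a rough illustration of the column layout above, the following Python sketch parses rows into named fields. It assumes the file is comma-separated with no header row, which this description does not confirm. The `Record` type, the `read_records` helper, and the sample rows are hypothetical, and the sample rows are not taken from the dataset.

```python
import csv
import io
from typing import NamedTuple


class Record(NamedTuple):
    label: str    # Column A: the label
    concept: str  # Column B: the concept
    title: str    # Column C: Wikipedia page title of the category or list
    url: str      # Column D: URL of the category/list page


def read_records(handle):
    """Parse four-column rows (A-D) into Record tuples.

    Assumes comma-separated values without a header row; adjust the
    reader if the released file uses a different layout.
    """
    records = []
    for row in csv.reader(handle):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        records.append(Record(*(cell.strip() for cell in row)))
    return records


# Made-up rows for illustration only; not taken from the dataset.
SAMPLE = (
    "P,Example concept,Category:Example supporters,https://example.org/a\n"
    "C,Example concept,Category:Example critics,https://example.org/b\n"
)

records = read_records(io.StringIO(SAMPLE))
for record in records:
    print(record.label, record.concept, record.title)
```

To read the actual file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the data file was saved.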

For each category, the label is one of the following:
1. “-” – The category is not a person-group category
2. “P” – Pro stance (supporting the concept)
3. “C” – Con stance (opposing the concept)
4. “?” – The stance cannot be determined based on the category name, or the category is not relevant
5. “X” – Unresolved case: each of the 3 annotators gave a different label
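
The label scheme above can be sketched in Python as a lookup table, with a helper that counts labels and another that keeps only rows with a resolved Pro or Con stance. The names `LABEL_MEANINGS`, `count_labels`, and `stance_only` are hypothetical. The sketch also assumes that labels are stored in the file as the plain characters shown here, without the surrounding quotation marks.

```python
from collections import Counter

# Label values from the list above, mapped to short descriptions.
LABEL_MEANINGS = {
    "-": "not a person-group category",
    "P": "pro stance (supporting the concept)",
    "C": "con stance (opposing the concept)",
    "?": "stance undetermined or category not relevant",
    "X": "unresolved: the 3 annotators each gave a different label",
}


def count_labels(labels):
    """Count occurrences of each label, rejecting unknown values."""
    counts = Counter()
    for label in labels:
        if label not in LABEL_MEANINGS:
            raise ValueError(f"unexpected label: {label!r}")
        counts[label] += 1
    return counts


def stance_only(rows):
    """Keep only (label, title) pairs with a resolved Pro or Con stance."""
    return [(label, title) for label, title in rows if label in ("P", "C")]


# Made-up rows for illustration only; not taken from the dataset.
rows = [
    ("P", "Category:Example supporters"),
    ("C", "Category:Example critics"),
    ("?", "Category:Example topics"),
    ("P", "List of example advocates"),
]

counts = count_labels(label for label, _ in rows)
resolved = stance_only(rows)
```

Filtering to “P” and “C” is one plausible preprocessing step for stance classification; whether the other labels should be dropped or kept as a separate class depends on the intended use.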

Dataset Metadata

License: CC-BY-SA 3.0
Domain: Natural Language Processing
Number of Records: 4,603
Size: 525KB
Originally Published: August 30, 2016

  • Project Debater: the first AI system that can debate humans on complex topics. The goal is to help people build persuasive arguments and make well-informed decisions. This dataset contributed to training the models in Project Debater.