The dataset contains:
1. 132 concepts
2. 4,603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts

The released data file has 4 columns:
Column A: the label
Column B: the concept
Column C: the page title of the category or list in Wikipedia
Column D: the URL of the category/list page
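
The column layout above can be read into typed records along the following lines. This is a minimal sketch, assuming the file is available as comma-separated text with the four columns in the order A to D; the source does not state the actual file format. The `StanceRecord` class, the `read_records` function, and the sample rows are hypothetical names and made-up values introduced for illustration only.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class StanceRecord:
    label: str    # Column A: the label
    concept: str  # Column B: the concept
    title: str    # Column C: Wikipedia page title of the category or list
    url: str      # Column D: URL of the category/list page


def read_records(lines: Iterable[str]) -> Iterator[StanceRecord]:
    """Yield one StanceRecord per row that has exactly four cells."""
    for row in csv.reader(lines):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        yield StanceRecord(*(cell.strip() for cell in row))


# Made-up rows for illustration only; they are NOT taken from the dataset.
sample = [
    "P,Example concept,Category:Example supporters,https://example.org/supporters",
    "C,Example concept,List of example critics,https://example.org/critics",
]
records = list(read_records(sample))
print(records[0].concept)
```

To read from disk, pass an open file object instead of the list, for example `open("stance_data.csv", newline="", encoding="utf-8")`, where the file name is a placeholder. If the file starts with a header row, that row would be returned as a record and should be skipped by the caller.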

Each category or list is assigned one of the following labels:
1. “-” – The category is not a person group category
2. “P” – Pro stance (supporting the concept)
3. “C” – Con stance (opposing the concept)
4. “?” – The stance cannot be determined from the category name, or the category is not relevant
5. “X” – Unresolved case: each of the three annotators gave a different label
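
The label scheme above can be applied as in the sketch below, which keeps only the Pro and Con labels and counts them per concept. It assumes the labels appear in the file as the plain characters `-`, `P`, `C`, `?`, and `X`; the name `stance_counts` and the sample pairs are hypothetical and not part of the dataset.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Label meanings as described in the list above.
LABEL_MEANINGS = {
    "-": "not a person group category",
    "P": "Pro stance (supporting the concept)",
    "C": "Con stance (opposing the concept)",
    "?": "stance undetermined or category not relevant",
    "X": "unresolved: each annotator gave a different label",
}
STANCE_LABELS = {"P", "C"}


def stance_counts(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count Pro and Con categories per concept, ignoring other labels."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for label, concept in pairs:
        if label not in LABEL_MEANINGS:
            raise ValueError(f"unexpected label: {label!r}")
        if label in STANCE_LABELS:
            counts[concept][label] += 1
    return dict(counts)


# Made-up (label, concept) pairs for illustration only.
pairs = [
    ("P", "Example concept"),
    ("C", "Example concept"),
    ("P", "Example concept"),
    ("?", "Example concept"),
    ("-", "Other concept"),
]
result = stance_counts(pairs)
print(result["Example concept"]["P"], result["Example concept"]["C"])
```

Raising an error on unknown labels is a design choice of this sketch: it surfaces encoding differences (for example, a different dash character) instead of silently dropping rows.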

Dataset Metadata

Format: (unspecified)
License: CC-BY-SA 3.0
Domain: Natural Language Processing
Number of Records: 4,603 records
Size: 525KB
Originally Published: August 30, 2016

title = "Expert Stance Graphs for Computational Argumentation",
author = "Toledo-Ronen, Orith and
    Bar-Haim, Roy and
    Slonim, Noam",
booktitle = "Proceedings of the Third Workshop on Argument Mining ({A}rg{M}ining2016)",
month = aug,
year = "2016",
address = "Berlin, Germany",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/W16-2814",
doi = "10.18653/v1/W16-2814",
pages = "119--123",

Project Debater is the first AI system that can debate humans on complex topics. The goal is to help people build persuasive arguments and make well-informed decisions. This dataset contributed to training the models in Project Debater.