
IBM Debater® Wikipedia Category Stance

Overview

The dataset contains:

  1. 132 concepts
  2. 4,603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts

The released data file has 4 columns:

  • Column A: the label
  • Column B: the concept
  • Column C: the page title of the category or list in Wikipedia
  • Column D: the URL of the category/list page

For each category, the label is one of the following:

  1. “-” – The category is not a person group category
  2. “P” – Pro stance (supporting the concept)
  3. “C” – Con stance (opposing the concept)
  4. “?” – The stance cannot be determined based on the category name, or the category is not relevant.
  5. “X” – Unresolved case: each of the 3 annotators gave a different label
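
The column layout and label set above can be sketched in Python as follows. This is a minimal sketch, not an official loader: the field names (`label`, `concept`, `page_title`, `url`) and the helper functions are our own choices, and whether the released file starts with a header row is an assumption we avoid by skipping any row whose first column is not one of the five labels.

```python
import csv
import io
from collections import Counter

# Columns A-D in the order described above; the field names are hypothetical.
FIELDS = ["label", "concept", "page_title", "url"]

# The five labels listed above, with short descriptions of our own wording.
LABELS = {
    "-": "not a person group category",
    "P": "pro stance",
    "C": "con stance",
    "?": "stance undetermined or category not relevant",
    "X": "unresolved (each annotator gave a different label)",
}


def read_records(handle):
    """Yield one dict per data row of a 4-column CSV stream.

    Rows with the wrong column count or an unknown label are skipped,
    which also drops a header row if the file happens to have one.
    """
    for row in csv.reader(handle):
        if len(row) != len(FIELDS):
            continue
        record = dict(zip(FIELDS, (cell.strip() for cell in row)))
        if record["label"] in LABELS:
            yield record


def label_counts(handle):
    """Count how many records carry each label."""
    return Counter(record["label"] for record in read_records(handle))


# Usage with two in-memory rows; replace with open(<path>, newline="") for a real file.
sample = io.StringIO(
    "P,Abortion,Category:American_pro-choice_activists,"
    "https://en.wikipedia.org/wiki/Category:American_pro-choice_activists\n"
    "C,Abortion,Category:American_pro-life_activists,"
    "https://en.wikipedia.org/wiki/Category:American_pro-life_activists\n"
)
counts = label_counts(sample)
```

Skipping unknown labels rather than raising an error is a design choice for exploratory use; a stricter loader might prefer to fail on unexpected rows.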

Dataset Metadata

  • Format: CSV
  • License: CC-BY-SA 3.0
  • Domain: Natural Language Processing
  • Number of Records: 4,603 records
  • Size: 525KB
  • Originally Published: 2016-08-30

Example Records

P,Abortion,Category:American_pro-choice_activists,https://en.wikipedia.org/wiki/Category:American_pro-choice_activists
C,Abortion,Category:American_pro-life_activists,https://en.wikipedia.org/wiki/Category:American_pro-life_activists
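
As an illustration of how such records might be used, the sketch below groups page titles by concept and stance, keeping only the Pro and Con labels. The function name `group_by_stance` and the shape of the resulting dictionary are assumptions made for this example and are not part of the dataset.

```python
import csv
import io
from collections import defaultdict

# The two example records shown above, embedded so the sketch is self-contained.
EXAMPLE = """\
P,Abortion,Category:American_pro-choice_activists,https://en.wikipedia.org/wiki/Category:American_pro-choice_activists
C,Abortion,Category:American_pro-life_activists,https://en.wikipedia.org/wiki/Category:American_pro-life_activists
"""


def group_by_stance(handle):
    """Map each concept to {"P": [page titles], "C": [page titles]}.

    Rows labeled "-", "?" or "X" carry no usable stance and are ignored.
    """
    grouped = defaultdict(lambda: {"P": [], "C": []})
    for row in csv.reader(handle):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        label, concept, title, _url = (cell.strip() for cell in row)
        if label in ("P", "C"):
            grouped[concept][label].append(title)
    return dict(grouped)


stances = group_by_stance(io.StringIO(EXAMPLE))
print(stances["Abortion"]["P"])  # ['Category:American_pro-choice_activists']
print(stances["Abortion"]["C"])  # ['Category:American_pro-life_activists']
```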

Citation

@inproceedings{toledo-ronen-etal-2016-expert,
    title = "Expert Stance Graphs for Computational Argumentation",
    author = "Toledo-Ronen, Orith and
      Bar-Haim, Roy and
      Slonim, Noam",
    booktitle = "Proceedings of the Third Workshop on Argument Mining ({A}rg{M}ining2016)",
    month = aug,
    year = "2016",
    address = "Berlin, Germany",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W16-2814",
    doi = "10.18653/v1/W16-2814",
    pages = "119--123",
}

  • Project Debater is the first AI system that can debate humans on complex topics. The goal is to help people build persuasive arguments and make well-informed decisions. This dataset contributed to training the models in Project Debater.