Overview
The dataset contains:
- 132 concepts
- 4603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts
The released data file has 4 columns:
- Column A: the label
- Column B: the concept
- Column C: the page title of the category or list in Wikipedia
- Column D: the URL of the category/list page
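As a minimal sketch of reading this four-column layout, the snippet below parses records with Python's standard `csv` module. The field names (`label`, `concept`, `title`, `url`) are our own names for columns A to D, the helper `read_rows` is hypothetical, and the sample rows are made up for illustration. Whether the released file begins with a header row is not stated here, so inspect the file before relying on this.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class CategoryRow(NamedTuple):
    label: str    # column A: the label
    concept: str  # column B: the concept
    title: str    # column C: Wikipedia category or list page title
    url: str      # column D: URL of the category/list page


def read_rows(lines: Iterable[str]) -> Iterator[CategoryRow]:
    """Yield one CategoryRow per 4-column CSV record, skipping blank lines."""
    for record in csv.reader(lines):
        if not record:
            continue  # csv.reader returns [] for an empty line
        if len(record) != 4:
            raise ValueError(f"expected 4 columns, got {len(record)}: {record!r}")
        yield CategoryRow(*(field.strip() for field in record))


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = io.StringIO(
    "P,Example concept,Category:Example supporters,https://example.org/a\n"
    "\n"
    'C,Example concept,"List of Example critics, by country",https://example.org/b\n'
)

rows = list(read_rows(SAMPLE))
for row in rows:
    print(row.label, row.concept, row.title, sep=" | ")
```

To read the released file instead, pass an open file handle, for example `open("WikipediaCategoriesResults.csv", newline="", encoding="utf-8")`; the encoding is an assumption.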
For each category, the label is one of the following:
- “-” – The category is not a person group category
- “P” – Pro stance (supporting the concept)
- “C” – Con stance (opposing the concept)
- “?” – The stance cannot be determined based on the category name, or the category is not relevant.
- “X” – Unresolved case: each of the 3 annotators gave a different label
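One way to use this label scheme is to keep only the rows with a resolved stance and group them by concept. The sketch below assumes rows are available as `(label, concept, title)` tuples and that the labels appear in the file as the plain characters listed above. The function name `stance_index` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Only "P" and "C" carry a usable stance; "-", "?" and "X" are skipped.
STANCE_LABELS = {"P": "pro", "C": "con"}


def stance_index(rows):
    """Map each concept to its Pro and Con category titles."""
    index = defaultdict(lambda: {"pro": [], "con": []})
    for label, concept, title in rows:
        stance = STANCE_LABELS.get(label)
        if stance is None:
            continue  # not a person group, undetermined, or unresolved
        index[concept][stance].append(title)
    return dict(index)


# Made-up rows for illustration only; they are not taken from the dataset.
sample_rows = [
    ("P", "Example concept", "Category:Example supporters"),
    ("C", "Example concept", "Category:Example critics"),
    ("?", "Example concept", "Category:Example historians"),
    ("-", "Other concept", "Category:Example buildings"),
    ("X", "Other concept", "Category:Example writers"),
]

index = stance_index(sample_rows)
print(index)
```

Concepts whose rows are all labeled "-", "?" or "X" do not appear in the result, which keeps the index limited to categories with an agreed stance.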
Dataset Metadata
Field | Value |
---|---|
Format | CSV |
License | CC BY 3.0 |
Domain | Natural Language Processing |
Number of Records | 4,603 records |
Data Split | NA |
Size | 525 KB |
Authors | Orith Toledo-Ronen, Roy Bar-Haim |
Dataset Origin | IBM Research |
Dataset Version | Version 2 – August 1, 2019; Version 1 – August 30, 2016 |
Data Coverage | 132 concepts, 4603 Wikipedia categories and lists annotated for stance (Pro/Con) towards the concepts |
Business Use Case | Government – Analyze sentiment of political topics and conversations. |
Dataset Archive Contents
File or Folder | Description |
---|---|
WikipediaCategoriesResults.csv | The dataset |
WikipediaCategoriesLabeling.docx | The guidelines used for labeling the data |
LICENSE.txt | Terms of Use |
ReleaseNotes.txt | Release notes file describing the data |
Use the Dataset
This dataset is complemented by starter notebooks that will help you get started.
Related Links
- Project Debater: the first AI system that can debate humans on complex topics. Its goal is to help people build persuasive arguments and make well-informed decisions. This dataset contributed to training the models in Project Debater.
Citation
@inproceedings{toledo-ronen-etal-2016-expert,
title = "Expert Stance Graphs for Computational Argumentation",
author = "Toledo-Ronen, Orith and
Bar-Haim, Roy and
Slonim, Noam",
booktitle = "Proceedings of the Third Workshop on Argument Mining ({A}rg{M}ining2016)",
month = aug,
year = "2016",
address = "Berlin, Germany",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/W16-2814",
doi = "10.18653/v1/W16-2814",
pages = "119--123",
}