Overview

SLIDE (Sentiment Lexicon of IDiomatic Expressions) is a resource for sentiment analysis, created via crowdsourcing. The lexicon includes 5,000 frequently occurring idioms, as estimated from a large English corpus. The idioms were selected from Wiktionary, and over 40% of them were labeled as sentiment-bearing. Each idiom was annotated as positive, negative, neutral or inappropriate by at least ten annotators. The lexicon includes a sentiment label along with the distribution of sentiment annotations. Our labels are assigned by taking the label with the greatest number of votes from the crowdsourced annotation. In the case of ties between positive (or negative) and neutral, the label is positive (resp. negative). In the rare cases of ties between positive and negative, we use the neutral label. The resulting lexicon has 946 positive idioms, 1,108 negative, 2,945 neutral, and 1 inappropriate.
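The labeling rule described above can be sketched in a few lines of Python. This is an illustrative reading of the prose, not the authors' code: the function name `assign_label` is hypothetical, and the handling of ties that the text does not mention (for example, ties involving "inappropriate" or three-way ties) is an assumption that falls back to neutral.

```python
def assign_label(positive: int, negative: int, neutral: int, inappropriate: int = 0) -> str:
    """Return the majority label, applying the tie-breaking rules from the overview."""
    counts = {
        "positive": positive,
        "negative": negative,
        "neutral": neutral,
        "inappropriate": inappropriate,
    }
    top = max(counts.values())
    winners = {label for label, n in counts.items() if n == top}

    # Clear majority: a single label has the most votes.
    if len(winners) == 1:
        return next(iter(winners))

    # Ties described in the text.
    if winners == {"positive", "neutral"}:
        return "positive"
    if winners == {"negative", "neutral"}:
        return "negative"
    if winners == {"positive", "negative"}:
        return "neutral"

    # Assumption: the text does not cover other ties, so fall back to neutral.
    return "neutral"
```

For the example record further down (10 positive votes out of 10), this sketch returns "positive", which matches the label in the released row.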

The released data file has 12 columns:
Column A: Idiom expression
Column B: Link to idiom in Wiktionary
Column C: Count of positive annotations
Column D: Count of negative annotations
Column E: Count of neutral annotations
Column F: Count of annotations in which the expression was deemed vulgar or inappropriate
Column G: Total annotations
Column H: Percent positive
Column I: Percent negative
Column J: Percent neutral
Column K: Sentiment label
Column L: Ambiguous expression filter — ‘X’ indicates removal (see paper, Section 4)
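The column layout above can be read with Python's standard `csv` module. The sketch below makes several assumptions: the field names in `FIELDS` are hypothetical labels for columns A to L, the file is assumed to have no header row unless `has_header=True` is passed, and the second sample row is invented purely to illustrate the column L filter.

```python
import csv
import io

# Hypothetical names for columns A-L; the released file may use other headers or none.
FIELDS = [
    "idiom", "wiktionary_url", "pos_count", "neg_count", "neu_count",
    "inappropriate_count", "total", "pct_pos", "pct_neg", "pct_neu",
    "label", "ambiguous_filter",
]
INT_FIELDS = ["pos_count", "neg_count", "neu_count", "inappropriate_count", "total"]
FLOAT_FIELDS = ["pct_pos", "pct_neg", "pct_neu"]


def read_slide(stream, skip_flagged=True, has_header=False):
    """Parse tab-separated SLIDE rows from a text stream into a list of dicts."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # discard the header row
    records = []
    for row in reader:
        if not row:
            continue
        # Pad short rows so an empty trailing column L still gets a value.
        row = row + [""] * (len(FIELDS) - len(row))
        rec = dict(zip(FIELDS, (cell.strip() for cell in row)))
        # An 'X' in column L marks an ambiguous expression flagged for removal.
        if skip_flagged and rec["ambiguous_filter"].upper() == "X":
            continue
        for key in INT_FIELDS:
            rec[key] = int(rec[key])
        for key in FLOAT_FIELDS:
            rec[key] = float(rec[key])
        records.append(rec)
    return records


# First row is the example record from this page; the second row is made up.
sample = (
    "alive and kicking\thttps://en.wiktionary.org/wiki/alive_and_kicking"
    "\t10\t0\t0\t0\t10\t1.000\t0.000\t0.000\tpositive\t\n"
    "placeholder idiom\thttps://example.org/placeholder"
    "\t4\t4\t2\t0\t10\t0.400\t0.400\t0.200\tneutral\tX\n"
)
records = read_slide(io.StringIO(sample))
print(records[0]["idiom"], records[0]["label"])
```

To read an actual file, open it with `open(path, encoding="utf-8", newline="")` and pass the file object to `read_slide`; the path and encoding are up to the reader, since neither is stated here.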

Dataset Metadata

Format: TSV
License: CC-BY-SA 3.0
Domain: Natural Language Processing
Number of Records: 5000 sentiment-annotated idioms
Size: 67KB

Example Records

alive and kicking    https://en.wiktionary.org/wiki/alive_and_kicking    10    0    0    0    10    1.000    0.000    0.000    positive   

Citation

@inproceedings{jochim-etal-2018-slide,
title = "{SLIDE}---a Sentiment Lexicon of Common Idioms",
author = "Jochim, Charles and Bonin, Francesca and Bar-Haim, Roy and Slonim, Noam",
booktitle = "Proceedings of the Eleventh International Conference on Language Resources and Evaluation ({LREC}-2018)",
month = may,
year = "2018",
address = "Miyazaki, Japan",
publisher = "European Language Resources Association (ELRA)",
url = "https://www.aclweb.org/anthology/L18-1379",
}
  • Project Debater: the first AI system that can debate humans on complex topics. The goal is to help people build persuasive arguments and make well-informed decisions. This dataset contributed to training the models in Project Debater.