WebQSP Relation Detection

Overview

The WebQSP Relation Detection dataset is a set of relation extraction annotations derived from the WebQuestionsSP dataset. Entries follow the order of the questions in WebQuestionsSP, and each entry has three tab-separated fields: gold_relations \t negative_relation_pool \t question. Relation ids are mapped to relation names in a separate file, relations.txt, in which the ids are indexed starting at 1. The dataset is split into train and test sets that match the split used by WebQuestionsSP.
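As a minimal sketch of reading this format, the Python snippet below splits one record into its three fields. The names RelationRecord and parse_record are hypothetical, not part of the dataset. It assumes that the ids inside the first two fields are whitespace-separated integers, as the example record suggests; a non-numeric token would raise a ValueError.

```python
from typing import List, NamedTuple


class RelationRecord(NamedTuple):
    """One dataset entry (hypothetical container, not defined by the dataset)."""
    gold_relations: List[int]
    negative_relation_pool: List[int]
    question: str


def parse_record(line: str) -> RelationRecord:
    """Split a tab-separated record into gold ids, pool ids, and question text."""
    # At most two splits, so the question field is kept whole.
    gold, pool, question = line.rstrip("\n").split("\t", 2)
    return RelationRecord(
        # Assumption: ids within a field are separated by whitespace.
        gold_relations=[int(tok) for tok in gold.split()],
        negative_relation_pool=[int(tok) for tok in pool.split()],
        question=question.strip(),
    )


# Shortened sample line in the documented field order.
sample = "150\t3330 3341 3533\t$ARG1 where is the <e> located $ARG2\n"
record = parse_record(sample)
print(record.gold_relations, record.negative_relation_pool, record.question)
```

A line with fewer than three tab-separated fields fails at the unpacking step, which makes malformed rows easy to spot while loading.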

The relation extraction task deals with identifying semantic relationships between entities in a text. A relationship generally connects two entities via a certain affiliation. Entities can be, for example, people, organizations, or locations, while the relationships among them can be spatial, social, or hierarchical. The entities “Steve Jobs” and “Apple”, for instance, may have the relation “Founder”. Relation extraction is important in the field of machine reading and provides a necessary input to more complicated tasks such as answering questions, acting as a conversational agent, or summarizing text.

The original WebQuestionsSP dataset was developed by Microsoft and consists of full semantic parses, expressed as SPARQL queries, for 4,737 questions, as well as partial annotations for 1,073 questions. For more information, or for access to the WebQuestionsSP dataset, visit the dataset’s homepage linked below in the Related Links section.

Dataset Metadata

Format: TSV, TXT
License: CDLA-Permissive
Domain: Natural Language Processing
Number of Records: 1,649 questions
Size: 2.3MB
Originally Published: 2017-05-26

Example Records

150        3330 3341 3533 150 3534 101 3535 2368 3339 102 159 30 158 160        $ARG1 where is the <e> located $ARG2

location.location.containedby
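
To resolve the ids in a record to relation names, the mapping in relations.txt can be loaded with a 1-based index. The sketch below assumes that relations.txt holds one relation name per line and that an id equals the line number; the function name build_relation_index and the three-line excerpt are hypothetical placeholders, not the real file contents.

```python
from typing import Dict, Iterable


def build_relation_index(lines: Iterable[str]) -> Dict[int, str]:
    """Map 1-based relation ids to names, assuming one name per line."""
    # enumerate(..., start=1) matches the documented 1-based id index.
    return {i: line.strip() for i, line in enumerate(lines, start=1)}


# Placeholder excerpt standing in for relations.txt (not the real contents).
excerpt = [
    "placeholder.relation.first\n",
    "placeholder.relation.second\n",
    "placeholder.relation.third\n",
]
relation_index = build_relation_index(excerpt)
print(relation_index[2])
```

With the real file, the same function can be given an open file handle, for example open("relations.txt", encoding="utf-8"), since a text file iterates line by line.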

Citation

@inproceedings{yu2017improved,
 title={Improved Neural Relation Detection for Knowledge Base Question Answering},
 author={Yu, Mo and Yin, Wenpeng and Hasan, Kazi Saidul and dos Santos, Cicero and Xiang, Bing and Zhou, Bowen},
 booktitle={Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
 pages={571--581},
 year={2017}
}
Legend