Introduction to the Microsoft WikiQA corpus

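This post introduces the WikiQA corpus, a new publicly available collection of question and sentence pairs in which a large number of open-domain questions and their candidate answers have been collected and annotated. Using Bing query logs as the question source and Wikipedia pages as the source of candidate answers, the corpus contains 3,047 questions and 29,258 candidate answer sentences, of which 1,473 are labeled as correct answers.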

The original description is simple enough that it is left untranslated.


The WikiQA corpus is a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering. Last published: August 28, 2015.

Details
  • Version: 1.0
  • File Name: WikiQACorpus.zip
  • Date Published: 7/14/2016
  • File Size: 6.8 MB

The WikiQA corpus is a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering. To reflect the true information needs of general users, we used Bing query logs as the question source. Each question is linked to a Wikipedia page that potentially has the answer. Because the summary section of a Wikipedia page provides the basic and usually most important information about the topic, we used sentences in this section as the candidate answers. With the help of crowdsourcing, we included 3,047 questions and 29,258 sentences in the dataset, where 1,473 sentences were labeled as answer sentences to their corresponding questions. More details on this corpus can be found in our EMNLP-2015 paper, "WikiQA: A Challenge Dataset for Open-Domain Question Answering" [Yang et al. 2015]. In addition, this download also includes the experimental results in the paper, an evaluation script for judging the "answer triggering" task, as well as the answer phrases labeled by the authors of the paper.
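To make the counts in the description concrete, here is a minimal Python sketch that tallies questions, candidate sentences, and labeled answer sentences from tab-separated data. The column names (`QuestionID`, `Question`, `SentenceID`, `Sentence`, `Label`) are an assumption about the file layout and are not confirmed by the text above, and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows in the ASSUMED column layout (not real corpus data).
SAMPLE_TSV = (
    "QuestionID\tQuestion\tSentenceID\tSentence\tLabel\n"
    "Q1\texample question one ?\tD1-0\tFirst candidate sentence .\t0\n"
    "Q1\texample question one ?\tD1-1\tSecond candidate sentence .\t1\n"
    "Q1\texample question one ?\tD1-2\tThird candidate sentence .\t0\n"
    "Q2\texample question two ?\tD2-0\tFirst candidate sentence .\t0\n"
    "Q2\texample question two ?\tD2-1\tSecond candidate sentence .\t0\n"
)


def summarize(tsv_file):
    """Count questions, sentences, and answer sentences in a TSV stream."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    labels_by_question = defaultdict(list)
    for row in reader:
        # Label is assumed to be 1 for an answer sentence and 0 otherwise.
        labels_by_question[row["QuestionID"]].append(int(row["Label"]))
    return {
        "questions": len(labels_by_question),
        "sentences": sum(len(v) for v in labels_by_question.values()),
        "answer_sentences": sum(sum(v) for v in labels_by_question.values()),
        # Questions whose candidate list contains no labeled answer sentence.
        "questions_without_answer": sum(
            1 for v in labels_by_question.values() if not any(v)
        ),
    }


stats = summarize(io.StringIO(SAMPLE_TSV))
print(stats)
```

If the real files follow this layout, passing an opened file handle in place of the `io.StringIO` sample should yield totals that can be compared against the 3,047 questions, 29,258 sentences, and 1,473 answer sentences quoted above.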