NLP: Commonly Used MRC Datasets

A benchmark

A collection of the following datasets:

A collection of 22 datasets

ORB (An Open Reading Benchmark) is an evaluation server that tests a single reading comprehension model's performance on diverse datasets. It contains a suite of seven existing datasets (DROP, ROPES, SQuAD1.1, SQuAD2.0, Quoref, NewsQA, NarrativeQA) and synthetic augmentations from various adversarial models, which test a model's capability to learn various linguistic artifacts in a single unified model.

 
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.
Context:Huguenot numbers peaked near an estimated two million by 1562, concentrated mainly in the southern and central parts of France, about one-eighth the number of French Catholics. As Huguenots gained influence and more openly displayed their faith, Catholic hostility grew, in spite of increasingly liberal political concessions and edicts of toleration from the French crown. A series of religious conflicts followed, known as the Wars of Religion, fought intermittently from 1562 to 1598. The wars finally ended with the granting of the Edict of Nantes, which granted the Huguenots substantial religious, political and military autonomy.
Question: Where was France's Huguenot population largely centered?

Answer: Ground Truth Answers: the southern and central parts of France; southern and central parts of France, about one-eighth
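Because every answerable SQuAD question has an answer that is a span of the passage, an answer can be located by character offsets. The following is a minimal sketch of that idea, not the official SQuAD tooling; the helper name `find_answer_spans` is hypothetical, the context is shortened from the example above, and splitting the ground-truth line into two answers is an assumption.

```python
from typing import List, Tuple


def find_answer_spans(context: str, answer: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of every occurrence of answer."""
    spans = []
    start = context.find(answer)
    while start != -1:
        spans.append((start, start + len(answer)))
        # Continue searching one character further to catch later matches.
        start = context.find(answer, start + 1)
    return spans


# Shortened context from the example above.
context = (
    "Huguenot numbers peaked near an estimated two million by 1562, "
    "concentrated mainly in the southern and central parts of France, "
    "about one-eighth the number of French Catholics."
)

# Assumed split of the ground-truth answers into two separate spans.
answers = [
    "the southern and central parts of France",
    "southern and central parts of France, about one-eighth",
]

for answer in answers:
    for start, end in find_answer_spans(context, answer):
        # The slice of the context must reproduce the answer exactly.
        print(start, end, repr(context[start:end]))
```

An empty result for a given answer string would indicate that it is not a literal span of the passage, which is one quick sanity check when preparing span-extraction data.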

 
 
CoQA is a large-scale dataset for building Conversational Question Answering systems. The goal of the CoQA challenge is to measure the ability of machines to understand a text passage and answer a series of interconnected questions that appear in a conversation.
Context:Once upon a time, in a barn near a farm house, there lived a little white kitten named Cotton. Cotton lived high up in a nice warm place above the barn where all of the farmer's horses slept. But Cotton wasn't alone in her little home above the barn, oh no. She shared her hay bed with her mommy and 5 other sisters. All of her sisters were cute and fluffy, like Cotton. But she was the only white one in the bunch. The rest of her sisters were all orange with beautiful white tiger stripes like Cotton's mommy. Being different made Cotton quite sad. She often wished she looked like the rest of her family. So one day, when Cotton found a can of the old farmer's orange paint, she used it to paint herself like them. When her mommy and sisters found her they started laughing. \n\n\"What are you doing, Cotton?!\" \n\n\"I only wanted to be more like you\". \n\nCotton's mommy rubbed her face on Cotton's and said \"Oh Cotton, but your fur is so pretty and special, like you. We would never want you to be any other way\". And with that, Cotton's mommy picked her up and dropped her into a big bucket of water. When Cotton came out she was herself again. Her sisters licked her face until Cotton's fur was all all dry. \n\n\"Don't ever do that again, Cotton!\" they all cried. \"Next time you might mess up that pretty white fur of yours and we wouldn't want that!\" \n\nThen Cotton thought, \"I change my mind. I like being special
Question:What color was Cotton?
Answer:white
 
 
 

Who Did What: A Large-Scale Person-Centered Cloze Dataset. We have constructed a new "Who-did-What" dataset of over 200,000 fill-in-the-gap (cloze) multiple choice reading comprehension problems constructed from the LDC English Gigaword newswire corpus. The WDW dataset has a variety of novel features. First, in contrast with the CNN and Daily Mail datasets (Hermann et al., 2015) we avoid using article summaries for question formation. Instead, each problem is formed from two independent articles: an article given as the passage to be read and a separate article on the same events used to form the question. Second, we avoid anonymization: each choice is a person named entity. Third, the problems have been filtered to remove a fraction that are easily solved by simple baselines, while remaining 84% solvable by humans. We report performance benchmarks of standard systems and propose the WDW dataset as a challenge task for the community.

Context:Britain's decision on Thursday to drop extradition proceedings against Gen. Augusto Pinochet and allow him to return to Chile is understandably frustrating ... Jack Straw, the home secretary, said the 84-year-old former dictator's ability to understand the charges against him and to direct his defense had been seriously impaired by a series of strokes. ... Chile's president-elect, Ricardo Lagos, has wisely pledged to let justice run its course. But the outgoing government of President Eduardo Frei is pushing a constitutional reform that would allow Pinochet to step down from the Senate and retain parliamentary immunity from prosecution. ...
Question:Sources close to the presidential palace said that Fujimori declined at the last moment to leave the country and instead he will send a high level delegation to the ceremony , at which Chilean President Eduardo Frei will pass the mandate to XXX.
Options:(1) Augusto Pinochet (2) Jack Straw (3) Ricardo Lagos
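A cloze problem of this kind can be turned into one candidate statement per option by substituting each option for the placeholder. The sketch below only illustrates that data handling step; the function name `fill_cloze` is hypothetical, the question is shortened from the example above, and no scoring model is implied.

```python
from typing import List


def fill_cloze(question: str, candidate: str, placeholder: str = "XXX") -> str:
    """Substitute one candidate answer for the cloze placeholder."""
    return question.replace(placeholder, candidate)


# Shortened question from the example above; "XXX" marks the gap.
question = (
    "Chilean President Eduardo Frei will pass the mandate to XXX."
)
options: List[str] = ["Augusto Pinochet", "Jack Straw", "Ricardo Lagos"]

# One filled-in statement per option; a reader model would score each one
# against the passage and pick the best.
candidates = [fill_cloze(question, option) for option in options]
for number, statement in enumerate(candidates, start=1):
    print(number, statement)
```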
 
 
 
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Context:"Ed Wood is a 1994 American biographical period comedy-drama film directed and produced by Tim Burton, and starring Johnny Depp as cult filmmaker Ed Wood."," The film concerns the period in Wood's life when he made his best-known films as well as his relationship with actor Bela Lugosi, played by Martin Landau."," Sarah Jessica Parker, Patricia Arquette, Jeffrey Jones, Lisa Marie, and Bill Murray are among the supporting cast."]],["Scott Derrickson",["Scott Derrickson (born July 16, 1966) is an American director, screenwriter and producer."," He lives in Los Angeles, California."," He is best known for directing horror films such as \"Sinister\", \"The Exorcism of Emily Rose\", and \"Deliver Us From Evil\", as well as the 2016 Marvel Cinematic Universe installment, \"Doctor Strange.\""]],["Woodson, Arkansas",["Woodson is a census-designated place (CDP) in Pulaski County, Arkansas, in the United States."," Its population was 403 at the 2010 census."......
Question:Were Scott Derrickson and Ed Wood of the same nationality?
Answer(s):yes
 
 
 
The MS MARCO datasets are intended for non-commercial research purposes only, to promote advancement in the field of artificial intelligence and related areas, and are made available free of charge without extending any license or other intellectual property rights. The dataset is provided “as is” without warranty, and usage of the data has risks since we may not own the underlying rights in the documents. We are not liable for any damages related to use of the dataset. Feedback is voluntarily given and can be used as we see fit. Upon violation of any of these terms, your rights to use the dataset will end automatically.
 
 
 
TriviaQA is a reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions. The details can be found in our ACL 2017 paper, "TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension".
 
 
 

GLUE Tasks:

Name                                    Metric
The Corpus of Linguistic Acceptability  Matthews Corr
The Stanford Sentiment Treebank         Accuracy
Microsoft Research Paraphrase Corpus    F1 / Accuracy
Semantic Textual Similarity Benchmark   Pearson-Spearman Corr
Quora Question Pairs                    F1 / Accuracy
MultiNLI Matched                        Accuracy
MultiNLI Mismatched                     Accuracy
Question NLI                            Accuracy
Recognizing Textual Entailment          Accuracy
Winograd NLI                            Accuracy
Diagnostics Main                        Matthews Corr
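Two rows in the GLUE list report a Matthews correlation rather than accuracy. As a rough illustration of what that metric measures for binary labels, here is a minimal pure-Python sketch; it is not the GLUE evaluation code, and returning 0.0 when the denominator is zero is a convention assumed here.

```python
import math
from typing import Sequence


def matthews_corrcoef(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy labels, invented for illustration only.
gold = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
print(round(matthews_corrcoef(gold, predicted), 3))
```

Unlike accuracy, the score stays at zero for a model that always predicts the same class, which is why it suits imbalanced tasks.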

SuperGLUE:https://super.gluebenchmark.com/

SuperGLUE Tasks:

Name                                              Identifier  Metric
Broadcoverage Diagnostics                         AX-b        Matthews Corr
CommitmentBank                                    CB          Avg. F1 / Accuracy
Choice of Plausible Alternatives                  COPA        Accuracy
Multi-Sentence Reading Comprehension              MultiRC     F1a / EM
Recognizing Textual Entailment                    RTE         Accuracy
Words in Context                                  WiC         Accuracy
The Winograd Schema Challenge                     WSC         Accuracy
BoolQ                                             BoolQ       Accuracy
Reading Comprehension with Commonsense Reasoning  ReCoRD      F1 / Accuracy
Winogender Schema Diagnostics                     AX-g        Gender Parity / Accuracy
 
 
Baidu's Chinese dataset
DuReader 2.0 is a large-scale open-domain Chinese dataset for Machine Reading Comprehension (MRC) and Question Answering (QA). It contains more than 300K questions, 1.4M evidence documents and corresponding human-generated answers.
example:
{
  "question_type": "YES_NO",
  "question": "上海迪士尼可以带吃的进去吗",
  "documents": [
    {
      "paragraphs": ["text paragraph 1", "text paragraph 2"]
    }
  ],
  "answers": [
    "完全密封的可以,其它不可以。",                                  // answer 1
    "可以的,不限制的。只要不是易燃易爆的危险物品,一般都可以带进去的。",  // answer 2
    "罐装婴儿食品、包装完好的果汁、水等饮料及包装完好的食物都可以带进乐园,但游客自己在家制作的食品是不能入园,因为自制食品有一定的安全隐患。"        // answer 3
  ],
  "yesno_answers": [
    "Depends",                      // corresponding to answer 1
    "Yes",                          // corresponding to answer 2
    "Depends"                       // corresponding to answer 3
  ]
}
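In the example, the i-th entry of "yesno_answers" labels the i-th entry of "answers". A minimal sketch of reading such a record and pairing the two lists is given below; it assumes the record is stored as strict JSON (the `//` comments above are annotations, not part of the data), and the variable names are illustrative.

```python
import json
from collections import Counter

# The example record from above, written as strict JSON without comments.
raw_record = """
{
  "question_type": "YES_NO",
  "question": "上海迪士尼可以带吃的进去吗",
  "documents": [
    {"paragraphs": ["text paragraph 1", "text paragraph 2"]}
  ],
  "answers": [
    "完全密封的可以,其它不可以。",
    "可以的,不限制的。只要不是易燃易爆的危险物品,一般都可以带进去的。",
    "罐装婴儿食品、包装完好的果汁、水等饮料及包装完好的食物都可以带进乐园,但游客自己在家制作的食品是不能入园,因为自制食品有一定的安全隐患。"
  ],
  "yesno_answers": ["Depends", "Yes", "Depends"]
}
"""

record = json.loads(raw_record)

# The two lists are aligned by position, so they must have equal length.
if len(record["answers"]) != len(record["yesno_answers"]):
    raise ValueError("answers and yesno_answers are not aligned")

labelled_answers = list(zip(record["yesno_answers"], record["answers"]))
for label, answer in labelled_answers:
    print(label, answer[:15])

# Distribution of yes/no labels for this question.
label_counts = Counter(record["yesno_answers"])
print(dict(label_counts))
```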

 

CJRC:http://cogskl.iflytek.com/2019/11/25/ccl-2019-%E4%B8%AD%E6%96%87%E6%B3%95%E5%BE%8B%E9%98%85%E8%AF%BB%E7%90%86%E8%A7%A3%E6%95%B0%E6%8D%AE%E9%9B%86cjrc/

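This paper presents the first Chinese legal reading comprehension dataset. It contains about 10,000 documents, mainly first-instance civil judgments and first-instance criminal judgments, sourced from China Judgments Online (中国裁判文书网). Questions were annotated against the fact-description portion of each judgment (the "经审理查明", i.e. "facts found at trial", or "原告诉称", i.e. "the plaintiff claims", section), yielding about 50,000 question-answer pairs. The dataset covers several question types, including span-extraction questions (Span-Extraction), yes/no questions (YES/NO), and unanswerable questions (Unanswerable), with the aim of covering most question types found in real-world scenarios. We hope that this dataset will further promote research on related tasks in the legal domain, such as element extraction, question answering systems, and recommender systems. Taking element extraction as an example: traditional element extraction requires a large number of predefined labels, and the variety of judgment types and causes of action (charges) makes defining those labels laborious; reading comprehension techniques can avoid this problem to some extent.
The main contributions of this paper are as follows:

  1. It presents the first Chinese legal reading comprehension dataset, filling a gap in reading comprehension research for the legal domain;
  2. The dataset is broad in scope, covering about 188 civil causes of action and 138 criminal charges; it includes several question types (span, yes/no, and unanswerable); and it has wide application prospects, such as element extraction, information retrieval, and question answering systems;
  3. A comparison of the baseline system, the participating systems, and human performance shows that there is still considerable room for improvement on this dataset.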

 

A large dataset of natural language queries with corresponding SPARQL queries for Wikidata and DBpedia 2018
example:
{
        "NNQT_question": "What is the {periodical literature} for {mouthpiece} of {Delta Air Lines}",
        "uid": 19719,
        "subgraph": "simple question right",
        "template_index": 65,
        "question": "What periodical literature does Delta Air Lines use as a moutpiece?",
        "sparql_wikidata": " select distinct ?obj where { wd:Q188920 wdt:P2813 ?obj . ?obj wdt:P31 wd:Q1002697 } ",
        "sparql_dbpedia18": "select distinct ?obj where { ?statement <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://wikidata.dbpedia.org/resource/Q188920> . ?statement <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <http://www.wikidata.org/entity/P2813> . ?statement <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> ?obj . ?obj <http://www.wikidata.org/entity/P31> <http://wikidata.dbpedia.org/resource/Q1002697> } ",
        "template": " <S P ?O ; ?O instanceOf Type>",
        "answer": [],
        "template_id": 1,
        "paraphrased_question": "What is Delta Air Line's periodical literature mouthpiece?"
    }
 
 
QALD is a series of evaluation campaigns on question answering over linked data. So far, it has been organized as an ESWC workshop and an ISWC workshop as well as a part of the Question Answering lab at CLEF.
 
 
 

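A collection of ready-to-use datasets.

TensorFlow Datasets is a collection of datasets that can be used with TensorFlow or other Python machine learning frameworks such as Jax. All datasets are exposed as tf.data.Datasets, enabling easy-to-use, high-performance input pipelines. To get started, see the guide and our list of datasets.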

All Datasets

Note: The datasets documented here are from HEAD and so not all are available in the current tensorflow-datasets package. They are all accessible in our nightly package tfds-nightly.

Audio

Image

Image classification

Object detection

Question answering

Structured

Summarization

Text

Translate

Video

 
Given the first sentence and the subject of the next sentence, choose the second half of the next sentence from four options.
Context:Members of the procession walk down the street holding small horn brass instruments.
Question:A drum line
Options:
passes by walking down the street playing their instruments.
has heard approaching them.
arrives and they're outside dancing and asleep.
turns the lead singer watches the performance.
Answer(s):0
 
Similar to a combination of English reading comprehension and cloze: choose the appropriate one of four options to fill in each "_".
Context:
Drinking water is good for your health. There are some scientific ways of drinking water.\n1. It is the best medicine to drink two glasses of water in the morning.\n2. Drink clean water.\n3. Drink the water that has not been boiled.\nMany people think boiled water is safe and good to people's health. In fact, it is not true. The boiling point of water is 100degC. By boiling it, most bacteria in water can be killed. In the past, the water was less polluted. So boiling was a good way to make clean water. But heavy metals and other dangerous things in today's water are much more terrible than bacteria. Boiling doesn't fix that problem. And boiling water may give us more of the dangerous things in our glass.\n4. Never use soft drinks to take the place of water.\n5. Water is also needed in winter.\n6. Drink water at the right time.\n1) After getting up in the morning, you have less water in your body, because you weren't drinking for the whole night. So you should drink some water to keep your health after getting up in the morning. That can prevent high blood pressure, cerebral hemorrhages and so on.\n2) Drinking water at about 10 am helps your body keep enough water.\n3) Drinking water at about 3 pm can clean out the wastes in your body.\n4) About eight o'clock in the evening is the best time to drink water. Your blood gets thicker when you sleep. Water will make your blood less thick.\nBesides, we should drink 2L of water every day. Water is so important for our life. We should drink water often.
Question:
["According to the passage,  _  in the morning is the best  _  .",
"_  can prevent high blood pressure, cerebral hemorrhages and so on.",
"At about 3 pm, drinking water can clean out the  _  in your body.",
"From the passage, we can see the best time to drink water is  _  .",
"The best title to this passage is  _  ."]
Options:
[["drinking some hot soup; medicine",
"drinking some porridge; breakfast",
"drinking some water; medicine",
"Drinking some soft drinks; medicine"],
["Drinking some water after getting up in the morning",
"Drinking some water before going to bed",
"Drinking some soft drinks after getting up",
"Drinking some milk before going to bed"],
["oil", "food", "wastes", "fat"],
["about eight o'clock in the morning",
"about eight o'clock in the evening",
"before supper",
"at night"],
["Drink clean water",
"Don't drink the boiled water",
"The use of water",
"Scientific water drinking"]]
Answer(s):["C", "A", "C", "B", "D"]
 
A typical cloze test.
"article": "From Monday to Friday most people are busy working or studying, but in the evenings and weekends they are free and _ themselves. Some watch television or go to the movies, others take part in sports. This is decided by their own _ .There are many different ways to spend our _ time. Almost everyone has some kind of _ : it may be something from collecting stamps to _ model planes. Some hobbies are very _ , but others don't cost anything at all. Some collections are worth a lot of money, others are valuable only to their owners. I know a man who has a coin collection worth several thousand dollars. A short time ago he bought a rare fifty-cent piece which _ him $250!He was very happy about this collection and thought the price was all right . On the other hand, my youngest brother collects match boxes. He has almost 600 of them, but I wonder _ they are worth any money. However, _ my brother they are quite valuable . _ makes him happier than to find a new match box for his collection. That's what a hobby means, I think. It is something we _ to do in our free time just for the _ of it . The value in dollars is not important. but the pleasure it gives us is."
"options": [
["love", "work", "enjoy", "play"],
["lives", "interests", "jobs", "things"],
["working", "free", "own", "whole"],
["hobby", "thing", "job", "way"],
["driving", "making", "buying", "selling"],
["interesting", "exciting", "cheap", "expensive"],
["paid", "cost", "took", "spent"],
["that", "if", "what", "why"],
["to", "on", "with", "in"],
["Everything", "Anything", "Nothing", "Something"],
["have", "need", "refuse", "like"],
["money", "work", "fun", "time"]
],
"answers": ["C", "B", "B", "A", "B", "D", "B", "B", "A", "C", "D", "C"]
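The three fields fit together as follows: the k-th "_" in the article is filled with the option in the k-th option list that the k-th answer letter selects. A minimal sketch of that mapping is shown below, using only the first two blanks of the example; the helper `fill_blanks` is hypothetical and assumes "_" never appears in the article for any other purpose.

```python
from typing import List


def fill_blanks(article: str, options: List[List[str]], answers: List[str],
                blank: str = "_") -> str:
    """Fill each blank with the option selected by the matching answer letter."""
    parts = article.split(blank)
    if not (len(parts) - 1 == len(options) == len(answers)):
        raise ValueError("number of blanks, option lists and answers differ")
    filled = [parts[0]]
    for text_after, choices, letter in zip(parts[1:], options, answers):
        # "A" selects index 0, "B" index 1, and so on.
        filled.append(choices[ord(letter) - ord("A")])
        filled.append(text_after)
    return "".join(filled)


# First two blanks of the example above, with the text in between shortened.
article = ("in the evenings and weekends they are free and _ themselves. "
           "This is decided by their own _ .")
options = [
    ["love", "work", "enjoy", "play"],
    ["lives", "interests", "jobs", "things"],
]
answers = ["C", "B"]

completed = fill_blanks(article, options, answers)
print(completed)
```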
 
 
Single-choice questions requiring inference.
Context: none
Question:
Which factor will most likely cause a person to develop a fever? 
Options:
(A) a leg muscle relaxing after exercise
(B) a bacterial population in the bloodstream
(C) several viral particles on the skin
(D) carbohydrates being digested in the stomach
Answer(s):B
 
Multiple-select questions (one or more options may be correct).
Context:
<b>Sent 1: </b>Animated history of the US.<br><b>Sent 2: </b>Of course the cartoon is highly oversimplified, and most critics consider it one of the weakest parts of the film.<br><b>Sent 3: </b>But it makes a valid claim which you ignore entirely: That the strategy to promote \"gun rights\" for white people and to outlaw gun possession by black people was a way to uphold racism without letting an openly terrorist organization like the KKK flourish.<br><b>Sent 4: </b>Did the 19th century NRA in the southern states promote gun rights for black people?<br><b>Sent 5: </b>I highly doubt it.<br><b>Sent 6: </b>But if they didn't, one of their functions was to continue the racism of the KKK.<br><b>Sent 7: </b>This is the key message of this part of the animation, which is again being ignored by its critics.<br><b>Sent 8: </b>Buell shooting in Flint.<br><b>Sent 9: </b>You write: \"Fact: The little boy was the class thug, already suspended from school for stabbing another kid with a pencil, and had fought with Kayla the day before\".<br><b>Sent 10: </b>This characterization of a six-year-old as a pencil-stabbing thug is exactly the kind of hysteria that Moore's film warns against.<br><b>Sent 11: </b>It is the typical right-wing reaction which looks for simple answers that do not contradict the Republican mindset.<br><b>Sent 12: </b>The kid was a little bastard, and the parents were involved in drugs -- case closed.<br><b>Sent 13: </b>But why do people deal with drugs?<br><b>Sent 14: </b>Because it's so much fun to do so?<br><b>Sent 15: </b>It is by now well documented that the CIA tolerated crack sales in US cities to fund the operation of South American \"contras\" It is equally well known that the so-called \"war on drugs\" begun under the Nixon administration is a failure which has cost hundreds of billions and made America the world leader in prison population (both in relative and absolute numbers).<br>
Questions and Answers:
[{ "question":"Does the author claim the animated films message is that the NRA upholds racism?","sentences_used":[1,2,3,5],
"answers":[
{ "text":"Yes","isAnswer":true,"scores":{}},
{ "text":"Uphold,andcontinue","isAnswer":true,"scores":{}},
{ "text":"No","isAnswer":false,"scores":{}}
],
"idx":"0","multisent":true},
{ "question":"Which key message(s) do(es) this passage say the critics ignored?","sentences_used":[2,6],
"answers":[
{ "text":"The strategy to promote \"gun rights\" for white people while outlawing it for black people allowed racisim to continue without allowing to KKK to flourish","isAnswer":true,"scores":{}},
{ "text":"That it antagonized the KKK","isAnswer":false,"scores":{}},
{ "text":"That the KKK was a terrorist organization","isAnswer":false,"scores":{}},
{ "text":"The strategy to promote the KKK","isAnswer":false,"scores":{}}
],
"idx":"1","multisent":true},
{ "question":"What type of the film is being discussed and what is on of the key messages?","sentences_used":[0,5],
"answers":[{"text":"Animated history of the US and one of the key messages is continuing the .......
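The context in this example is a single string in which each sentence is wrapped as `<b>Sent N: </b>...<br>`. A minimal sketch of recovering the numbered sentences with a regular expression follows; the pattern is inferred from the example above rather than from any official loader, and the helper name `split_sentences` is hypothetical.

```python
import re
from typing import Dict

# Pattern inferred from the example: "<b>Sent N: </b>text<br>".
SENTENCE_PATTERN = re.compile(r"<b>Sent (\d+): </b>(.*?)<br>", re.DOTALL)


def split_sentences(paragraph: str) -> Dict[int, str]:
    """Map each sentence number to its text, with the markup removed."""
    return {
        int(number): text.strip()
        for number, text in SENTENCE_PATTERN.findall(paragraph)
    }


# First two sentences of the example context above.
paragraph = (
    "<b>Sent 1: </b>Animated history of the US.<br>"
    "<b>Sent 2: </b>Of course the cartoon is highly oversimplified, and most "
    "critics consider it one of the weakest parts of the film.<br>"
)

sentences = split_sentences(paragraph)
for number in sorted(sentences):
    print(number, sentences[number])
```

Keeping the sentence numbers makes it possible to look up the evidence listed under "sentences_used" for each question, once the indexing convention of that field has been checked against the data.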
 
Choose one of two alternatives.
Premise: The man broke his toe. What was the CAUSE of this?
Alternative 1: He got a hole in his sock.
Alternative 2: He dropped a hammer on his foot.
 
Premise: I tipped the bottle. What happened as a RESULT?
Alternative 1: The liquid in the bottle froze.
Alternative 2: The liquid in the bottle poured out.
 
Premise: I knocked on my neighbor's door. What happened as a RESULT?
Alternative 1: My neighbor invited me in.
Alternative 2: My neighbor left his house.
 
The source above does not seem to provide the data; it can be downloaded here: https://download.csdn.net/download/weixin_43975374/12678029
True/false judgment.
{
   "question": "is france the same timezone as the uk",
   "passage": "At the Liberation of France in the summer of 1944, Metropolitan France kept GMT+2 as it was the time then used by the Allies (British Double Summer Time). In the winter of 1944--1945, Metropolitan France switched to GMT+1, same as in the United Kingdom, and switched again to GMT+2 in April 1945 like its British ally. In September 1945, Metropolitan France returned to GMT+1 (pre-war summer time), which the British had already done in July 1945. Metropolitan France was officially scheduled to return to GMT+0 on November 18, 1945 (the British returned to GMT+0 in on October 7, 1945), but the French government canceled the decision on November 5, 1945, and GMT+1 has since then remained the official time of Metropolitan France.",
   "answer": false,
   "title": "Time in France"
}
 
A machine comprehension task that requires commonsense reasoning.
T I wanted to plant a tree. I went to the home and garden store and picked a nice oak. Afterwards, I planted it in my garden.
Q1 What was used to dig the hole?
a. a shovel b. his bare hands
Q2 When did he plant the tree?
a. after watering it  b. after taking it home
 
Typical reading comprehension.
James the Turtle was always getting in trouble. Sometimes he'd reach into the freezer and empty out all the food. Other times he'd sled on the deck and get a splinter. His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky and got into lots of trouble behind her back.
One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home.
His aunt was waiting for him in his room. She told James that she loved him, but he would have to start acting like a well-behaved turtle.
After about a month, and after getting into lots of trouble, James finally made up his mind to be a better turtle.
1) What is the name of the trouble making turtle?
A) Fries
B) Pudding
C) James
D) Jane
2) What did James pull off of the shelves in the grocery store?
A) pudding
B) fries
C) food
D) splinters
3) Where did James go after he went to the grocery store?
A) his deck
B) his freezer
C) a fast food restaurant
D) his room
4) What did James do after he ordered the fries?
A) went to the grocery store
B) went home without paying
C) ate them
D) made up his mind to be a better turtle
 
Dialogue.
M: I am considering dropping my dancing class. I am not making any progress.
W: If I were you, I stick with it. It's definitely worth time and effort.
Question: What does the man suggest the woman do?
Options:
Consult her dancing teacher.
Take a more interesting class.
Continue her dancing class.
Answer: Continue her dancing class.
 
 