NLP (Natural Language Processing) Datasets: A Rough List

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/DarrenXf/article/details/87939033

This list was compiled in a hurry, so its accuracy is not guaranteed.

dataset

| Index | Dataset | Abbreviation | Task | Note |
|---|---|---|---|---|
| 1 | LibriSpeech | | Automatic speech recognition | |
| 2 | WSJ | | Automatic speech recognition | |
| 3 | Hub5’00 Evaluation | | Automatic speech recognition | |
| 4 | Rich Transcriptions | | Automatic speech recognition | |
| 5 | Fisher | RT03S FSH | Automatic speech recognition | |
| 6 | TED-LIUM | | Automatic speech recognition | |
| 7 | CHiME | CHiME | Automatic speech recognition | noisy speech |
| 8 | TIMIT | | Automatic speech recognition | |
| 9 | CCGBank | | CCG supertagging | |
| 10 | Event2Mind | | Common sense | |
| 11 | Situations with Adversarial Generations | SWAG | Common sense | |
| 12 | Winograd Schema Challenge | | Common sense | |
| 13 | Visual Commonsense Reasoning | VCR | Common sense | |
| 14 | Penn Treebank | | Constituency parsing | |
| 15 | CoNLL 2012 | | Coreference resolution | |
| 16 | Penn Treebank | | Dependency parsing | |
| 17 | Penn Treebank | | Unsupervised dependency parsing | |
| 18 | Switchboard corpus | | Dialogue | Dialogue act classification |
| 19 | Switchboard Dialogue Act Corpus | SwDA | Dialogue | Dialogue act classification |
| 20 | ICSI Meeting Recorder Dialog Act corpus | MRDA | Dialogue | Dialogue act classification |
| 21 | Second dialogue state tracking challenge | DSTC2 | Dialogue | Dialogue state tracking |
| 22 | Wizard-of-Oz | | Dialogue | Dialogue state tracking |
| 23 | Ubuntu Corpus | | Dialogue | Retrieval-based Chatbot |
| 24 | Multi-Domain Sentiment Dataset | | Domain adaptation | Sentiment analysis |
| 25 | AIDA CoNLL-YAGO Dataset | | Entity Linking | |
| 26 | TAC KBP English Entity Linking Comprehensive and Evaluation Data 2010 | | Entity Linking | |
| 27 | CoNLL-2014 Shared Task | | Grammatical Error Correction | |
| 28 | CoNLL-2014 10 Annotations | | Grammatical Error Correction | |
| 29 | JFLEG | | Grammatical Error Correction | |
| 30 | Base | | Information Extraction | Open Knowledge Graph Canonicalization |
| 31 | Ambiguous | | Information Extraction | Open Knowledge Graph Canonicalization |
| 32 | ReVerb45K | | Information Extraction | Open Knowledge Graph Canonicalization |
| 33 | Penn Treebank | | Language modeling | Word Level Models |
| 34 | WikiText-2 | | Language modeling | Word Level Models |
| 35 | WikiText-103 | | Language modeling | Word Level Models |
| 36 | 1B Words / Google Billion Word benchmark | | Language modeling | Word Level Models |
| 37 | Hutter Prize | | Language modeling | Character Level Models |
| 38 | Text8 | | Language modeling | Character Level Models |
| 39 | Penn Treebank | | Language modeling | Character Level Models |
| 40 | LexNorm | | Lexical Normalization | |
| 41 | LexNorm2015 | | Lexical Normalization | |
| 42 | WMT 2014 EN-DE | | Machine translation | |
| 43 | WMT 2014 EN-FR | | Machine translation | |
| 44 | decaNLP | | Multi-task learning | |
| 45 | GLUE | | Multi-task learning | |
| 46 | IEMOCAP | | Multimodal | Multimodal Emotion Recognition |
| 47 | | | Multimodal | Multimodal Metaphor Recognition |
| 48 | MOSI | | Multimodal | Multimodal Sentiment Analysis |
| 49 | CoNLL 2003 (English) | | Named entity recognition | |
| 50 | Long-tail emerging entities | | Named entity recognition | |
| 51 | OntoNotes v5 | | Named entity recognition | |
| 52 | Stanford Natural Language Inference Corpus | SNLI | Natural language inference | |
| 53 | Multi-Genre Natural Language Inference corpus | MultiNLI | Natural language inference | |
| 54 | SciTail | | Natural language inference | |
| 55 | Penn Treebank | | Part-of-speech tagging | |
| 56 | Social media | | Part-of-speech tagging | |
| 57 | Universal Dependencies | | Part-of-speech tagging | |
| 58 | AI2 Reasoning Challenge | ARC | Question answering | |
| 59 | ShARC | ShARC | Question answering | |
| 60 | CLiCR | CLiCR | Question answering | Reading comprehension |
| 61 | CNN/Daily Mail | | Question answering | Reading comprehension |
| 62 | CoQA | | Question answering | Reading comprehension |
| 63 | HotpotQA | | Question answering | Reading comprehension |
| 64 | MS MARCO | | Question answering | Reading comprehension |
| 65 | MultiRC | | Question answering | Reading comprehension |
| 66 | NewsQA | | Question answering | Reading comprehension |
| 67 | QAngaroo | | Question answering | Reading comprehension |
| 68 | QuAC | | Question answering | Reading comprehension |
| 69 | RACE | | Question answering | Reading comprehension |
| 70 | Stanford Question Answering Dataset | SQuAD | Question answering | Reading comprehension |
| 71 | Story Cloze Test | | Question answering | Reading comprehension |
| 72 | RecipeQA | | Question answering | Reading comprehension |
| 73 | NarrativeQA | | Question answering | Reading comprehension |
| 74 | DuoRC | | Question answering | Reading comprehension |
| 75 | DuReader | | Question answering | Open-domain Question Answering |
| 76 | Quasar | | Question answering | Open-domain Question Answering |
| 77 | SearchQA | | Question answering | Open-domain Question Answering |
| 78 | Freebase-15K-237 | FB15K-237 | Relation Prediction | |
| 79 | WordNet-18-RR | WN18RR | Relation Prediction | |
| 80 | New York Times Corpus | | Relationship Extraction | |
| 81 | SemEval-2010 Task 8 | | Relationship Extraction | |
| 82 | TACRED | TACRED | Relationship Extraction | |
| 83 | Few-Shot Relation Classification Dataset | FewRel | Relationship Extraction | |
| 84 | SentEval | | Semantic textual similarity | |
| 85 | Quora Question Pairs | | Semantic textual similarity | Paraphrase identification |
| 86 | LDC2014T12 | | Semantic parsing | AMR parsing |
| 87 | LDC2015E86 | | Semantic parsing | AMR parsing |
| 88 | LDC2016E25 | | Semantic parsing | AMR parsing |
| 89 | ATIS | | Semantic parsing | SQL parsing |
| 90 | Advising | | Semantic parsing | SQL parsing |
| 91 | GeoQuery | | Semantic parsing | SQL parsing |
| 92 | Scholar | | Semantic parsing | SQL parsing |
| 93 | Spider | | Semantic parsing | SQL parsing |
| 94 | WikiSQL | | Semantic parsing | SQL parsing |
| 95 | Smaller Datasets | | Semantic parsing | SQL parsing |
| 96 | OntoNotes | | Semantic role labeling | |
| 97 | IMDb | | Sentiment analysis | |
| 98 | Stanford Sentiment Treebank | SST | Sentiment analysis | |
| 99 | Yelp Review dataset | Yelp | Sentiment analysis | |
| 100 | SemEval | | Sentiment analysis | |
| 101 | Sentihood | | Sentiment analysis | Aspect-based sentiment analysis |
| 102 | SemEval-2014 Task 4 | | Sentiment analysis | Aspect-based sentiment analysis |
| 103 | Subjectivity dataset | SUBJ | Sentiment analysis | Subjectivity analysis |
| 104 | Penn Treebank | | Shallow syntax | Chunking |
| 105 | Main-Simple English Wikipedia | | Simplification | Sentence Simplification |
| 106 | PWKP/WikiSmall | | Simplification | Sentence Simplification |
| 107 | Coster and Kauchak | | Simplification | Sentence Simplification |
| 108 | Turk Corpus | | Simplification | Sentence Simplification |
| 109 | Newsela | | Simplification | Sentence Simplification |
| 110 | RumourEval | | Stance detection | |
| 111 | CNN/Daily Mail | | Summarization | |
| 112 | Gigaword | | Summarization | |
| 113 | DUC 2004 Task 1 | | Summarization | |
| 114 | Webis-TLDR-17 Corpus | | Summarization | |
| 115 | Google Dataset | | Summarization | Sentence Compression |
| 116 | SemEval 2018 | | Taxonomy Learning | Hypernym Discovery |
| 117 | APW | | Temporal Processing | Document Dating (Time-stamping) |
| 118 | NYT | | Temporal Processing | Document Dating (Time-stamping) |
| 119 | TimeBank | | Temporal Processing | Temporal Information Extraction |
| 120 | TempEval-3 | | Temporal Processing | Temporal Information Extraction |
| 121 | TimeBank | | Temporal Processing | Timex normalisation |
| 122 | PNT | | Temporal Processing | Timex normalisation |
| 123 | AG News corpus | | Text classification | |
| 124 | DBpedia | | Text classification | |
| 125 | TREC | | Text classification | |
| 126 | Fine-grained WSD | | Word Sense Disambiguation | |
| 127 | AIDA CoNLL-YAGO Dataset | | Entity linking | |
| 128 | Chinese Treebank 6 | | Chinese Word Segmentation | |
| 129 | Chinese Treebank 7 | | Chinese Word Segmentation | |
| 130 | AS | | Chinese Word Segmentation | |
| 131 | CityU | | Chinese Word Segmentation | |
| 132 | PKU | | Chinese Word Segmentation | |
| 133 | MSR | | Chinese Word Segmentation | |
