Benchmarks: GLUE, CoQA, SQuAD

===================================================================

General Language Understanding Evaluation (GLUE) benchmark

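STS-B is a regression task; all the other tasks are single-sentence or sentence-pair classification tasks.

MNLI is a three-class task; the remaining classification tasks are binary.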
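The task layout above can be written down as a small lookup table. This is only a sketch: the dictionary name `GLUE_TASKS`, the helper `task_kind`, and the convention that `num_labels == 1` marks regression are choices made here for illustration, not part of the GLUE release.

```python
# The nine GLUE tasks, with input type and number of output labels.
# Convention assumed for this sketch: num_labels == 1 means regression.
GLUE_TASKS = {
    "CoLA":  {"input": "single", "num_labels": 2},
    "SST-2": {"input": "single", "num_labels": 2},
    "MRPC":  {"input": "pair",   "num_labels": 2},
    "QQP":   {"input": "pair",   "num_labels": 2},
    "STS-B": {"input": "pair",   "num_labels": 1},  # similarity score, regression
    "MNLI":  {"input": "pair",   "num_labels": 3},  # entailment / neutral / contradiction
    "QNLI":  {"input": "pair",   "num_labels": 2},
    "RTE":   {"input": "pair",   "num_labels": 2},
    "WNLI":  {"input": "pair",   "num_labels": 2},
}


def task_kind(name):
    """Return a short description of the prediction problem for a GLUE task."""
    n = GLUE_TASKS[name]["num_labels"]
    if n == 1:
        return "regression"
    if n == 2:
        return "binary classification"
    return f"{n}-class classification"


if __name__ == "__main__":
    for task, spec in GLUE_TASKS.items():
        print(f"{task:6s} {spec['input']:6s} {task_kind(task)}")
```

A table like this is handy when sizing the output layer of a model per task: one output unit for STS-B, three for MNLI, two for everything else.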

 

Ref

GLUE: A MULTI-TASK BENCHMARK AND ANALYSIS PLATFORM FOR NATURAL LANGUAGE UNDERSTANDING
https://openreview.net/pdf?id=rJ4km2R5t7

Tasks        https://gluebenchmark.com/tasks
Leaderboard  https://gluebenchmark.com/leaderboard

GLUE: The Benchmark for Natural Language Understanding
https://blog.csdn.net/weixin_43269174/article/details/106382651

===================================================================

CoQA: A Conversational Question Answering Challenge (question answering dataset)
paper    https://arxiv.org/pdf/1808.07042v1.pdf

website  https://stanfordnlp.github.io/coqa/

CoQA: a conversation-based question answering system
https://blog.csdn.net/cindy_1102/article/details/88560048
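Because CoQA questions are asked in a conversation, a later question such as "Where to?" only makes sense together with the earlier turns. One common way to handle this, sketched below, is to prepend the last few question/answer pairs to the current question. The field names `input_text` and `turn_id` are assumed to match the released CoQA JSON; the helper `build_turn_inputs`, the `history_turns` parameter, and the toy conversation are hypothetical and only for illustration.

```python
def build_turn_inputs(questions, answers, history_turns=2):
    """Build one (model_input, gold_answer) pair per conversation turn.

    Each model input is the current question preceded by up to
    `history_turns` earlier question/answer pairs.
    """
    examples = []
    for i, question in enumerate(questions):
        start = max(0, i - history_turns)
        history = [
            f"Q: {questions[j]['input_text']} A: {answers[j]['input_text']}"
            for j in range(start, i)
        ]
        model_input = " ".join(history + [f"Q: {question['input_text']}"])
        examples.append((model_input, answers[i]["input_text"]))
    return examples


# Toy conversation (made up for this sketch, not taken from CoQA).
story = "Alice moved to Paris in 2010."
questions = [
    {"turn_id": 1, "input_text": "Who moved?"},
    {"turn_id": 2, "input_text": "Where to?"},
]
answers = [
    {"turn_id": 1, "input_text": "Alice"},
    {"turn_id": 2, "input_text": "Paris"},
]

if __name__ == "__main__":
    for model_input, gold in build_turn_inputs(questions, answers):
        print(model_input, "->", gold)
```

The passage (`story`) would normally be passed to the model alongside each input; it is left out of the helper to keep the history handling in focus.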

===================================================================

SQuAD2.0: The Stanford Question Answering Dataset (reading comprehension dataset)

The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset.
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions
written adversarially by crowdworkers to look similar to answerable ones.
https://rajpurkar.github.io/SQuAD-explorer/
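To see the answerable/unanswerable split in practice, the sketch below walks a SQuAD2.0-style JSON structure and counts both kinds of question. It assumes the nested `data` / `paragraphs` / `qas` layout and the `is_impossible` flag used in the SQuAD2.0 files; the function name `count_questions` and the tiny `SAMPLE` record are made up for illustration.

```python
# A tiny hand-written record in SQuAD2.0 style (not real dataset content).
SAMPLE = {
    "version": "v2.0",
    "data": [
        {
            "title": "Toy article",
            "paragraphs": [
                {
                    "context": "Alice moved to Paris in 2010.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "Where did Alice move?",
                            "is_impossible": False,
                            "answers": [{"text": "Paris", "answer_start": 15}],
                        },
                        {
                            "id": "q2",
                            "question": "Where did Alice move in 2015?",
                            "is_impossible": True,
                            "answers": [],
                        },
                    ],
                }
            ],
        }
    ],
}


def count_questions(squad):
    """Return (answerable, unanswerable) question counts."""
    answerable = unanswerable = 0
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # SQuAD1.1 files have no is_impossible field; treat those as answerable.
                if qa.get("is_impossible", False):
                    unanswerable += 1
                else:
                    answerable += 1
    return answerable, unanswerable


if __name__ == "__main__":
    print(count_questions(SAMPLE))
```

To run it on a downloaded file, load it with `json.load` and pass the result to `count_questions`.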
