NLP Question Answering Leaderboards

Actively maintained

A/B: Leaderboard

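Scores are listed in the original column order: EM (exact match), F1, acc, MRR, score. A slash separates values from different columns. Blank cells are omitted, so a value's position alone does not identify its metric.

| Leaderboard | Subset / setting | Top-1 model | Scores |
|---|---|---|---|
| GrailQA | Overall | ReTraCk | 58.136 / 65.285 |
| - | Compositional Generalization | ReTraCk | 61.499 / 70.911 |
| - | Zero-shot Generalization | ArcaneQA | 49.964 / 58.844 |
| PubMedQA | - | Baseline Model | 52.72 / 68.08 |
| AmbigQA | Standard setting | Refuel | 44.3(all) 34.8(multi) 15.9(bleu) 10.1 |
| - | Zero-shot setting | SpanSeqGen | 42.2 / 30.8(all) 20.7(multi) |
| DREAM | - | ALBERT-xxlarge + DUMA + Multi-Task Learning | 91.8 |
| MathQA | - | Seq2Prog+Cat | 37.4 |
| LC-QuAD 2.0 | | | |
| ComQA | - | | 22.4 |
| QASC | - | UnifiedQA | 0.8957 |
| Quoref | - | CorefRoBERTa | 0.8061 / 0.8670 |
| Physical IQA | - | UNICORN | 0.9013 |
| Social IQA | - | UNICORN | 0.8315 |
| CoQA | - | RoBERTa + AT + KD | 91.4(in-domain) 89.2(out-of-domain) 90.7(overall) |
| DROP | - | QDGAT - ALBERT | 0.8704 / 0.9010 |
| ARC | - | UnifiedQA + ARC MC/DA + IR | 0.8140 |
| CommonsenseQA | | | |
| ComplexWebQuestions | | | |
| HotpotQA | Distractor Setting | S2G+ | 70.72(ans) 64.30(sup) 48.60(joint) / 83.53(ans) 88.72(sup) 75.45(joint) |
| - | Fullwiki Setting | TPRR | 66.95(ans) 59.43(sup) 44.37(joint) / 79.50(ans) 84.25(sup) 70.83(joint) |
| OpenBookQA | - | UnifiedQA | 0.872 |
| ProPara Dataset | - | KOALA | 0.704 / 0.777 |
| QuAC | - | RoR | 74.9 |
| RACE | - | ALBERT-SingleChoice + transfer learning | 91.4 |
| ReCoRD | - | LUKE | 90.64 / 91.21 |
| QAngaroo | WikiHop | RealFormer-large | 84.4 |
| - | MedHop | MedKGQA | 64.8 |
| ShARC | End-to-end Task | DGM | 0.774(micro) 0.812(macro) |
| SWAG | - | DeBERTa | 0.9171 |
| SQuAD | 2.0 | FPNet | 90.871 / 93.183 |
| - | 1.1 | LUKE | 90.202 / 95.379 |
| TriviaQA | | | |
| Who-did-What | - | GA with word features | 0.712(who-did-what) 0.77(cnn) |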
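
For readers unfamiliar with the EM (exact match) and F1 metrics named in the table header, here is a minimal sketch of how they are commonly computed for extractive QA, in the style of SQuAD-like evaluation. It is an illustration only: each leaderboard above has its own official scorer, and the normalization rules (lowercasing, removing punctuation and English articles) and the function names `normalize`, `exact_match`, and `token_f1` are assumptions made for this example.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    These rules are an assumption for this sketch; real scorers may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower.", "eiffel tower"))
    print(token_f1("Eiffel Tower in Paris", "the Eiffel Tower"))
```

A leaderboard figure is then typically the average of these per-question scores over the test set, often taking the maximum over several gold answers for each question.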