Perplexity as a metric: a critical look at evaluation metrics for machine translation

Title: On The Evaluation of Machine Translation Systems Trained With Back-Translation

Abstract: Back-translation is a widely used data augmentation technique which leverages target monolingual data. However, its effectiveness has been challenged since automatic metrics such as BLEU only show significant improvements for test examples where the source itself is a translation, or translationese. This is believed to be due to translationese inputs better matching the back-translated training data. In this work, we show that this conjecture is not empirically supported and that backtranslation improves translation quality of both naturally occurring text as well as translationese according to professional human translators. We provide empirical evidence to support the view that back-translation is preferred by humans because it produces more fluent outputs. BLEU cannot capture human preferences because references are translationese when source sentences are natural text. We recommend complementing BLEU with a language model score to measure fluency.

Link:

https://arxiv.org/pdf/1908.05204.pdf

Key points:

1. In NLP, back-translation (BT) is a data augmentation technique, as illustrated in the figure below.

e0a177f831b6f649009b234f334283aa.png
X is the source, Y is the target, and each * denotes one pass of translation.

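2. BLEU is a quantitative metric for machine translation quality, built on n-gram overlap with a reference. The authors find that using BT for data augmentation improves BLEU on X*->Y, the reverse direction where the source is itself a translation (Table 1). They also find that translationese is easier to translate than original text (Table 2).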

655365cb3398af3447ec81769d249b12.png
OP stands for "on parallel data only", a mix of direct and reverse data.

3. Human evaluators prefer the output of the BT system, but this preference does not show up in BLEU.

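4. The authors look into why BLEU fails: a possible cause is that translationese texts resemble one another. They measure this with perplexity (explained at https://www.zhihu.com/question/58482430); lower perplexity means the model predicts the text better.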

64b0d8df55c9866151a8f8104ec4a250.png

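5. Because translationese is not fluent, native-level speakers will certainly not prefer it. BLEU as currently used, however, cannot reflect this preference for fluency.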
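To make the two quantities in the points above concrete, here is a minimal Python sketch of clipped n-gram precision (the building block of BLEU) and of perplexity computed from per-token log-probabilities. It is not the paper's evaluation code. The function names `modified_precision` and `perplexity` are hypothetical, the sentences are toy examples, and the log-probabilities are supplied by hand, whereas in practice they would come from a language model trained on natural target-language text.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(hypothesis, reference, n):
    """Clipped n-gram precision against a single reference.

    Each hypothesis n-gram is credited at most as many times as it
    occurs in the reference. Full BLEU combines these precisions for
    several n and applies a brevity penalty, both omitted here.
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return clipped / total


def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref = "the cat is on the mat".split()
    print(modified_precision(hyp, ref, 1))
    print(modified_precision(hyp, ref, 2))

    # Assumed toy scores: a model assigning probability 0.25 to each token.
    print(perplexity([math.log(0.25)] * 4))
```

On the toy pair, the unigram precision is 5/6 and the bigram precision is 3/5, and a model that gives every token probability 0.25 has a perplexity of 4 (up to floating-point rounding). The precision depends only on overlap with the reference, while the perplexity depends only on how well a language model predicts the output, which is why the abstract suggests reporting a language model score alongside BLEU.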
