[Paper Reading] A Study of Translation Edit Rate with Targeted Human Annotation

A Study of Translation Edit Rate with Targeted Human Annotation


Matthew Snover and Bonnie Dorr
Institute for Advanced Computer Studies
University of Maryland
College Park, MD 20742
{snover,bonnie}@umiacs.umd.edu


Key points from the paper:

1、Translation Edit Rate (TER) measures the amount of editing that a human would have to perform to change a system output so it exactly matches a reference translation.

2、Automatic evaluation metrics for machine translation include BLEU, METEOR, NIST, and TER, among others.

3、The authors define a new, more intuitive measure of “goodness” of MT output: specifically, the number of edits needed to fix the output so that it semantically matches a correct translation.

4、Recently the GALE (Olive, 2005) (Global Autonomous Language Exploitation) research program introduced a new error measure called Translation Edit Rate (TER) that was originally designed to count the number of edits (including phrasal shifts) performed by a human to change a hypothesis so that it is both fluent and has the correct meaning. This was then decomposed into two steps: defining a new reference, and finding the minimum number of edits so that the hypothesis exactly matches one of the references. The measure was defined such that all edits, including shifts, have a cost of one. Finding only the minimum number of edits, without generating a new reference, is the measure defined as TER; finding the minimum number of edits to a new targeted reference is defined as human-targeted TER (or HTER).

5、BLEU (Papineni et al., 2002) calculates the score of a translation by measuring the number of n-grams, of varying length, of the system output that occur within the set of references.

6、METEOR (Banerjee and Lavie, 2005) is an evaluation measure that counts the number of exact word matches between the system output and reference. Unmatched words are then stemmed and matched. Additional penalties are assessed for reordering the words between the hypothesis and reference. This method has been shown to correlate very well with human judgments.

7、TER is defined as the minimum number of edits needed to change a hypothesis so that it exactly matches one of the references, normalized by the average length of the references.

8、Possible edits include the insertion, deletion, and substitution of single words as well as shifts of word sequences.

9、 

10、The number of insertions, deletions, and substitutions is calculated using dynamic programming. A greedy search is used to find the set of shifts, by repeatedly selecting the shift that most reduces the number of insertions, deletions and substitutions, until no more beneficial shifts remain. 

11、

12、In both TER and HTER, the majority of the edits were substitutions and deletions.

13、In an analysis of shift size and distance, the authors found that most shifts are short (1 word long) and move the shifted words by fewer than 7 word positions.
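
The definition in item 7 can be written compactly as a ratio. The block below only restates that item in notation; the edit count in the numerator covers the insertions, deletions, substitutions, and shifts from item 8, each with a cost of one as noted in item 4.

```latex
\mathrm{TER} = \frac{\text{number of edits}}{\text{average number of reference words}}
```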

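To make items 7 and 10 concrete, here is a minimal Python sketch of the dynamic-programming part only. It is a simplification under stated assumptions, not the authors' implementation: it tokenizes on whitespace, it skips the greedy shift search entirely, and the function names `word_edit_distance` and `ter_without_shifts` are hypothetical. Because shifts are not considered, the value it returns can only be equal to or higher than the real TER.

```python
from typing import List, Sequence


def word_edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Minimum number of word insertions, deletions, and substitutions
    (each with cost 1) needed to turn hyp into ref."""
    # prev[j] = edits needed to turn the hypothesis prefix seen so far
    # into the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # turning i hypothesis words into nothing = i deletions
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def ter_without_shifts(hypothesis: str, references: List[str]) -> float:
    """Simplified TER: edits to the closest reference, divided by the
    average reference length. Shifts are NOT modeled (assumption)."""
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    min_edits = min(word_edit_distance(hyp, ref) for ref in refs)
    avg_ref_len = sum(len(ref) for ref in refs) / len(refs)
    return min_edits / avg_ref_len


if __name__ == "__main__":
    print(ter_without_shifts("the cat sat on mat",
                             ["the cat sat on the mat"]))
```

In this usage example the hypothesis is one inserted word away from the six-word reference, so the score is 1/6. A fuller implementation would add the greedy shift search from item 10 on top of this edit-distance routine.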