[Computer Science] [2015] Using Machine Learning Methods to Evaluate the Quality of Translated Technical Documents

This is a master's thesis in computer science by Michael LUCKERT, 102 pages in total.


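In an increasingly networked world, international competition is intensifying, and high-quality document translation is critical to success. Large international companies and medium-sized companies alike are required to provide their customers with well-translated technical documentation, not only to succeed in the market but also to comply with legal regulations and avoid lawsuits.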

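This thesis therefore focuses on the evaluation of translation quality, specifically for technical documentation, and answers two central questions:

  1. How can the translation quality of technical documents be evaluated when the original document is available?

  2. How can the translation quality of technical documents be evaluated when the original document is not available?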

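These questions are answered using state-of-the-art machine learning algorithms and translation evaluation metrics in the context of a knowledge discovery process. Quality is evaluated at the sentence level by binary classification of each sentence as either automated translation or professional translation, and the sentence results are then recombined at the document level. The research is based on a database of 22,327 sentences and 32 translation evaluation attributes, which are used to optimize five different machine learning approaches. An optimization process consisting of 795,000 evaluations shows a prediction accuracy of up to 72.24% for the binary classification. Building on the sentence-based classification systems, documents are classified by recombining their constituent sentences, and a framework for rating document quality is introduced. The approach thus yields a classification and evaluation system for translated documents.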

In the context of an increasingly networked world, the availability of high quality translations is critical for success in the context of the growing international competition. Large international companies as well as medium sized companies are required to provide well translated, high quality technical documentation for their customers not only to be successful in the market but also to meet legal regulations and to avoid lawsuits.

Therefore, this thesis focuses on the evaluation of translation quality, specifically concerning technical documentation, and answers two central questions:
• How can the translation quality of technical documents be evaluated, given the original document is available?
• How can the translation quality of technical documents be evaluated, given the original document is not available?

These questions are answered using state-of-the-art machine learning algorithms and translation evaluation metrics in the context of a knowledge discovery process. The evaluations are done on a sentence level and recombined on a document level by binarily classifying sentences as automated translation and professional translation. The research is based on a database containing 22,327 sentences and 32 translation evaluation attributes, which are used for optimizations of five different machine learning approaches. An optimization process consisting of 795,000 evaluations shows a prediction accuracy of up to 72.24% for the binary classification. Based on the developed sentence-based classification systems, documents are classified using recombination of the affiliated sentences and a framework for rating document quality is introduced. Therefore, the taken approach successfully creates a classification and evaluation system.
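
The abstract does not say how sentence-level predictions are recombined into a document-level result. As a rough illustration only, the sketch below assumes the simplest possible rule: compute the share of sentences labeled as professional translation and compare it with a threshold. The names `SentencePrediction`, `professional_share`, `classify_document`, and the 0.5 threshold are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentencePrediction:
    """One sentence plus the label a (hypothetical) sentence classifier gave it."""
    text: str
    is_professional: bool  # True = professional translation, False = automated


def professional_share(predictions: List[SentencePrediction]) -> float:
    """Return the fraction of sentences labeled as professional translation."""
    if not predictions:
        raise ValueError("a document needs at least one sentence")
    return sum(p.is_professional for p in predictions) / len(predictions)


def classify_document(predictions: List[SentencePrediction],
                      threshold: float = 0.5) -> str:
    """Label a document by thresholding its professional share.

    The threshold rule is an assumption made for this sketch; the thesis
    may recombine sentence results differently.
    """
    share = professional_share(predictions)
    return "professional" if share >= threshold else "automated"


if __name__ == "__main__":
    doc = [
        SentencePrediction("Press the red button to stop the machine.", True),
        SentencePrediction("The device must be disconnected before cleaning.", True),
        SentencePrediction("Machine the stop for button red press.", False),
        SentencePrediction("Keep the manual for future reference.", True),
    ]
    print(professional_share(doc), classify_document(doc))
```

The share itself could also serve as a coarse document quality score, which is one way to read the "rating framework" mentioned in the abstract, though that reading is likewise an assumption.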


Download link for the original thesis:

http://page2.dfpan.com/fs/2l2cdj52722172d9160/
