【cs224n-16】Low Resource Machine Translation

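The success of neural machine translation (NMT) usually depends on large amounts of high-quality bilingual corpora as training data. Low-resource languages such as Mongolian or Nepali cannot supply enough bilingual data. In the more extreme real-world case, some languages have almost no bilingual corpus at all, and NMT is then of little use.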

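Loose definition: a language pair can be considered low-resource when it has 10,000 or fewer parallel sentences. Note: modern NMT systems have hundreds of millions of parameters, so the training data is tiny relative to model size.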
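The loose definition above can be sketched as a simple check. This is a minimal illustration only: the function names and the choice to skip pairs with an empty side are assumptions, not part of the lecture, and the 10,000 cutoff is the rough figure from the notes rather than a hard rule.

```python
# Loose cutoff taken from the notes; treat it as a rule of thumb.
LOW_RESOURCE_THRESHOLD = 10_000


def count_parallel_sentences(pairs):
    """Count (source, target) pairs where both sides are non-empty.

    Skipping pairs with an empty side is an assumption made for this sketch.
    """
    return sum(1 for src, tgt in pairs if src.strip() and tgt.strip())


def is_low_resource(pairs, threshold=LOW_RESOURCE_THRESHOLD):
    """Return True when the corpus has `threshold` or fewer usable pairs."""
    return count_parallel_sentences(pairs) <= threshold
```

For example, calling `is_low_resource(corpus)` on a list of `(source, target)` string tuples returns `True` whenever the corpus holds 10,000 or fewer usable sentence pairs.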

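    Challenges:

            Data: sourcing training data, building evaluation datasets

            Modeling: unclear learning paradigm, domain adaptation, model generalization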

  Why Is Low Resource MT Interesting?

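  •    It is about learning from less labeled data.
  •    It is about modeling structured outputs and compositional learning.
  •    It is a real problem that needs solving.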

  Challenges in data collection

  •  Collection is very expensive and slow.
  •  High-quality translations are hard to produce.

 Supervised learning

 Semi-supervised learning
