[zz]Statistical Machine Translation Tutorial Reading(new)

Reposted: 2006-05-17 08:46:00

Statistical Machine Translation Tutorial Reading

The following is a list of papers that I think are worth reading for our
discussion of machine translation. I've tried to give a short blurb about
each of the papers to put them in context. I've included a number of papers
that I marked "OPTIONAL" that I think are interesting, but are either
supplementary or the material is more or less covered in the other papers.

If anyone would like more information on a particular topic or would
like to discuss any of these papers, feel free to e-mail me.

Part 1 (Jan. 19)
A Statistical MT Tutorial Workbook. Kevin Knight. 1999.
Very good introduction to word-based statistical machine translation.
Written in an informal, understandable, tutorial oriented style.

Automating Knowledge Acquisition for Machine Translation.
Kevin Knight. 1997.
(OPTIONAL) Another tutorial-oriented paper that steps through
how one can learn from bilingual data. Also introduces a number of
important concepts for MT.

Foundations of Statistical NLP, chapter 13. Manning and Schutze. 1999.
(OPTIONAL) Must be accessed from UCSD. Overview of statistical MT.
Spends a lot of time on sentence and word alignment of bilingual data.

Foundations of Statistical NLP, chapter 6. Manning and Schutze. 1999.
(OPTIONAL) Must be accessed from UCSD. Discusses n-gram language
modeling. Language modeling is crucial for SMT and many other natural
language applications. I won't spend much time discussing language
modeling, but for those that are interested this is a good introduction.
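To make the idea of n-gram language modeling concrete, here is a minimal sketch of a bigram model with add-one smoothing. It is not taken from the chapter; the function name `train_bigram_lm`, the boundary markers, and the toy sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Return a function that scores a token list with an add-one bigram model.

    Hypothetical helper for illustration; real systems use better smoothing.
    """
    history_counts, bigram_counts = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        history_counts.update(tokens[:-1])            # words used as a history
        bigram_counts.update(zip(tokens, tokens[1:]))
    # Everything that can be predicted: the words plus the end marker.
    vocab = {w for sent in sentences for w in sent} | {"</s>"}

    def logprob(sent):
        tokens = ["<s>"] + sent + ["</s>"]
        return sum(
            math.log((bigram_counts[(a, b)] + 1) / (history_counts[a] + len(vocab)))
            for a, b in zip(tokens, tokens[1:])
        )

    return logprob

# Toy data (assumed): fluent word order should score higher than scrambled order.
lm = train_bigram_lm([["the", "house", "is", "small"],
                      ["the", "book", "is", "small"]])
fluent = lm(["the", "house", "is", "small"])
scrambled = lm(["small", "is", "house", "the"])
```

In an SMT system this kind of score is what rewards fluent output over word salad among candidate translations.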

Part 2 (Jan. 26)
Word models:
The Mathematics of Statistical Machine Translation: Parameter Estimation.
P. F. Brown, S. A. Della Pietra, V. J. Della Pietra and R. L. Mercer. 1993.
(OPTIONAL) All you ever wanted to know about word level
models. Describes IBM models 1-5 and parameter estimation
for these models. It's about 50 pages and contains a lot of
material for the interested reader.
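For a feel of what parameter estimation means here, the following is a minimal sketch of EM training for IBM Model 1, the simplest of the five models. It is an illustration under assumptions, not the paper's notation: the function name `train_ibm_model1`, the `"NULL"` token string, and the three-sentence corpus are all made up for the example.

```python
from collections import defaultdict

def train_ibm_model1(bitext, iterations=10):
    """Estimate t[(f, e)] ~ P(f | e) with EM.

    bitext: list of (foreign_tokens, english_tokens) pairs (toy format, assumed).
    """
    # A NULL word on the English side lets foreign words align to nothing.
    bitext = [(f_sent, ["NULL"] + e_sent) for f_sent, e_sent in bitext]
    f_vocab = {f for f_sent, _ in bitext for f in f_sent}
    # Start from a uniform translation table.
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (f, e) link
        total = defaultdict(float)  # expected count of each e
        for f_sent, e_sent in bitext:
            for f in f_sent:
                # E-step: spread one unit of mass for f over the words e
                # in the paired sentence, in proportion to t(f | e).
                norm = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Toy corpus (assumed) in which "das" co-occurs with "the" most consistently.
corpus = [(["das", "Haus"], ["the", "house"]),
          (["das", "Buch"], ["the", "book"]),
          (["ein", "Buch"], ["a", "book"])]
table = train_ibm_model1(corpus)
```

Models 2 through 5 add distortion and fertility parameters on top of this translation table, which is where most of the 50 pages go.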

Word model decoding:
Decoding Algorithm in Statistical Machine Translation.
Ye-Yi Wang and Alex Waibel. 1997.
Early paper discussing decoding of IBM model 2. The paper
provides a fairly good introduction to word-level decoding
including multi-stack search (i.e. multiple beams) and rest
cost estimation (heuristic functions).

An Efficient A* Search Algorithm for Statistical Machine Translation.
Franz Josef Och, Nicola Ueffing, Hermann Ney. 2001.
(OPTIONAL) One of many papers on decoding with word-based SMT. They
discuss the basic idea of viewing decoding as state space search and
provide one method for doing this. They describe decoding for Model 3
and suggest a few different heuristics that are admissible, leading to few search errors.

Phrase based statistical MT:
Statistical Phrase-Based Translation.
Philipp Koehn, Franz Josef Och and Daniel Marcu. 2003.
Good, short overview of phrase-based systems. If you want more
details, see the paper below.

The Alignment Template Approach to Statistical Machine Translation.
Franz Josef Och and Hermann Ney. 2004.
(OPTIONAL) This is a journal paper discussing one phrase-based statistical system,
including decoding. This is more or less the system used at ISI and
is probably the best current system (though syntax-based systems may beat
these in the next few years). Requires Acrobat 5 and must be accessed from UCSD.

Part 3 (Feb. 2)
Phrase-based decoding:
See the previous paper.

Syntax based translation:
What's in a Translation Rule? Galley, Hopkins, Knight and Marcu. 2004.
This is the current system being investigated at ISI and the hope is that
these syntax based systems will perform better than phrase based systems.
The paper is a bit tough to read since it's a conference paper.

A Syntax-Based Statistical Translation Model. Yamada and Knight. 2001.
(OPTIONAL) Predecessor model to Galley et al., but similar.

Syntax based decoding:
Foundations of Statistical NLP, chapter 12. Manning and Schutze. 1999.
Must be on campus. This is a chapter on parsing (not actually decoding).
However, since the above rules are very similar to PCFGs, decoding
is very similar to parsing... just with more complications.

A Decoder for Syntax-Based Statistical MT. Kenji Yamada and Kevin Knight. 2001.
(OPTIONAL) Decoder for the above Yamada and Knight model.

Part 4 (Feb. 9)
Discriminative Training:
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation.
Och and Ney. 2002.
Learning how best to combine the different models (translation
model, language model, etc.) using maximum entropy parameter estimation.
This line of research is still very important and may be interesting to
many of you since it's very machine learningy.
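The core idea of combining models can be sketched as a weighted sum of feature scores, with the highest-scoring candidate winning. The snippet below is only an illustration of that decision rule: the feature names (`tm`, `lm`, `length`), the weights, and the candidate scores are invented, and it does not show how the weights are trained.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values; features missing a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def best_hypothesis(candidates, weights):
    """Pick the candidate translation with the highest combined score."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))

# Hypothetical candidates with log-probabilities from a translation model (tm)
# and a language model (lm), plus a length feature.
candidates = [
    {"text": "the house is small",
     "features": {"tm": -2.0, "lm": -4.0, "length": 4.0}},
    {"text": "small the house is",
     "features": {"tm": -1.5, "lm": -9.0, "length": 4.0}},
]

# With the language model switched on, the fluent candidate wins;
# with tm alone, the other candidate would be chosen.
balanced = best_hypothesis(candidates, {"tm": 1.0, "lm": 1.0, "length": 0.0})
tm_only = best_hypothesis(candidates, {"tm": 1.0, "lm": 0.0, "length": 0.0})
```

Discriminative training is then the problem of choosing the weights so that the winning candidate is as good as possible on held-out data.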

Another paper: Minimum Error Rate Training in Statistical Machine Translation.
Och. 2003. (ACL'03)

Discriminative Reranking for Machine Translation.
Shen, Sarkar and Och. 2004. (HLT/NAACL'04)
(OPTIONAL) Given a ranked output of possible translations from the
translation system, this paper uses the perceptron algorithm to learn
a reranking of the sentences to improve the top translation.
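As a rough picture of perceptron-based reranking, here is a minimal sketch using a plain perceptron update on n-best lists. This is a simplification and not the specific algorithm variants from the paper; the data layout, feature names, and quality scores are assumptions for the example.

```python
def feature_score(weights, feats):
    """Dot product between a weight dict and a sparse feature dict."""
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

def rerank(nbest, weights):
    """Return the index of the highest-scoring candidate in an n-best list."""
    return max(range(len(nbest)), key=lambda i: feature_score(weights, nbest[i][0]))

def train_reranker(nbest_lists, epochs=10):
    """Plain perceptron: nbest_lists is a list of lists of (features, quality).

    'quality' stands in for any sentence-level metric (assumed, e.g. BLEU-like).
    """
    weights = {}
    for _ in range(epochs):
        for nbest in nbest_lists:
            oracle = max(range(len(nbest)), key=lambda i: nbest[i][1])
            predicted = rerank(nbest, weights)
            if predicted != oracle:
                # Move the weights toward the oracle and away from the mistake.
                for k, v in nbest[oracle][0].items():
                    weights[k] = weights.get(k, 0.0) + v
                for k, v in nbest[predicted][0].items():
                    weights[k] = weights.get(k, 0.0) - v
    return weights

# Toy n-best list (assumed): the decoder ranked a worse candidate first.
toy = [[({"tm": 1.0, "lm": 0.0}, 0.2),
        ({"tm": 0.0, "lm": 1.0}, 0.9)]]
learned = train_reranker(toy)
top = rerank(toy[0], learned)
```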

MT Evaluation:
BLEU: A Method for Automatic Evaluation of Machine Translation.
Papineni, Roukos, Ward and Zhu. 2001.
Foundational method for evaluating MT systems, and still in use.
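The two ingredients of BLEU, clipped n-gram precision and a brevity penalty, can be sketched in a few lines. This is a sentence-level simplification for illustration (the metric is normally computed over a whole test corpus), and the function names and example sentences are assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch with uniform weights and no smoothing."""
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        max_ref_counts = Counter()
        for ref in references:
            max_ref_counts |= ngrams(ref, n)  # element-wise maximum over references
        # Clip each candidate n-gram count by its maximum count in any reference.
        clipped = sum(min(c, max_ref_counts[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0  # unsmoothed: one zero precision zeroes the score
        log_precision += math.log(clipped / sum(cand_counts.values())) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_precision)

reference = ["the", "cat", "is", "on", "the", "mat"]
perfect = bleu(reference, [reference])
# Repeating a common word is not rewarded: only 2 of the 7 "the" tokens count.
degenerate = bleu(["the"] * 7, [reference], max_n=1)
```

Clipping is what stops a system from gaming the metric by repeating frequent words, and the brevity penalty stops it from gaming precision with very short output.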



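It seems that only Och and a few people associated with him are trying this, so it should count as a fairly new direction. Also, in EBMT and RBMT, nobody seems to have tried introducing discriminative training methods yet. When we read these papers, the key thing is to look at how an idea gets modeled into an existing framework. For example, if we now want to try applying discriminative training to EBMT, where is the entry point, how do we model it, and how do we run the experiments? And how do we measure performance? These are all questions we should think about.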
