1. extract-rules
1.1 Span Size Limit: The limit on span sizes can be set with max-chart-span. In fact, its default is 10, which is not a useful setting for syntax models.
from http://www.statmt.org/moses/?n=Moses.SyntaxTutorial
1.2 max-phrase-length
from http://comments.gmane.org/gmane.comp.nlp.moses.user/4194
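1.3 When extracting phrases for chart models, max-phrase-length refers to the initial phrase length
With max-phrase-length set to 5 in training, extract rule shows:
while the arguments passed to extract-rules are:
Defaults: --MaxSpan[10] --MinWords[1] | --MaxSymbolsTarget[999] | --MaxSymbolsSource[5] | --MaxNonTerm[2]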
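As a rough illustration of what these limits constrain, the sketch below checks one rule against the defaults listed above. How extract-rules really counts symbols is an assumption here (terminals plus non-terminals for the MaxSymbols limits, terminals only for MinWords). `within_limits` and `is_nonterminal` are hypothetical helpers, not part of Moses. MaxSpan is left out because it limits the span in the sentence, which cannot be read off the rule alone.

```python
# Default values copied from the extract-rules defaults in these notes.
DEFAULT_LIMITS = {
    "MinWords": 1,
    "MaxSymbolsTarget": 999,
    "MaxSymbolsSource": 5,
    "MaxNonTerm": 2,
}


def is_nonterminal(token):
    # Assumption: any fully bracketed token such as [X][X] is a non-terminal.
    return token.startswith("[") and token.endswith("]")


def within_limits(source_rhs, target_rhs, limits=DEFAULT_LIMITS):
    """Return True if a rule's right-hand sides respect the given limits.

    Assumed counting: symbols = terminals + non-terminals,
    words = terminals on the source side.
    """
    nonterms = sum(1 for tok in source_rhs if is_nonterminal(tok))
    words = len(source_rhs) - nonterms
    return (
        len(source_rhs) <= limits["MaxSymbolsSource"]
        and len(target_rhs) <= limits["MaxSymbolsTarget"]
        and nonterms <= limits["MaxNonTerm"]
        and words >= limits["MinWords"]
    )


# Hypothetical rule: two non-terminals around one source word.
ok = within_limits(["[X][X]", "de", "[X][X]"], ["[X][X]", "of", "[X][X]"])
# Six source symbols exceed MaxSymbolsSource = 5.
too_long = within_limits(["a", "b", "c", "d", "e", "f"], ["x"])
```

Passing a modified copy of `DEFAULT_LIMITS` makes it easy to see which limit rejects a given rule.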
1.4 Glue rules
<s> [X] ||| <s> [S] ||| 1 ||| ||| 0
[X][S] </s> [X] ||| [X][S] </s> [S] ||| 1 ||| 0-0 ||| 0
[X][S] [X][X] [X] ||| [X][S] [X][X] [S] ||| 2.718 ||| 0-0 1-1 ||| 0
For the meaning of these rules, see: http://comments.gmane.org/gmane.comp.nlp.moses.user/9253
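To make the structure of the glue rules easier to read, here is a minimal parsing sketch. It assumes the field layout source ||| target ||| scores ||| alignment ||| counts, and that the last token on each side is the left-hand-side label. `parse_rule` is a hypothetical helper, not a Moses API.

```python
def parse_rule(line):
    """Split one rule-table line into its parts (assumed field layout)."""
    fields = [field.strip() for field in line.split("|||")]
    source, target, scores, alignment = fields[:4]
    src_tokens = source.split()
    tgt_tokens = target.split()
    return {
        # The last token on each side is taken to be the LHS label.
        "source_rhs": src_tokens[:-1],
        "source_lhs": src_tokens[-1],
        "target_rhs": tgt_tokens[:-1],
        "target_lhs": tgt_tokens[-1],
        "scores": [float(s) for s in scores.split()],
        # Each "i-j" pair links source position i to target position j.
        "alignment": [
            tuple(int(i) for i in pair.split("-"))
            for pair in alignment.split()
        ],
    }


# The three glue rules from the notes above.
glue_rules = [
    "<s> [X] ||| <s> [S] ||| 1 ||| ||| 0",
    "[X][S] </s> [X] ||| [X][S] </s> [S] ||| 1 ||| 0-0 ||| 0",
    "[X][S] [X][X] [X] ||| [X][S] [X][X] [S] ||| 2.718 ||| 0-0 1-1 ||| 0",
]
parsed = [parse_rule(rule) for rule in glue_rules]
```

In the third rule, `parsed[2]["source_rhs"]` holds the two non-terminals `[X][S]` and `[X][X]`, and the target left-hand side is `[S]`.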
1.5 Rule format
http://www.statmt.org/moses/manual/manual.pdf
Key point: the correspondence between non-terminals is determined by the alignment, not by their numbering.
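The key point can be sketched in a few lines: non-terminals are paired by following the alignment, not by their order of appearance. The rule below is a made-up example with reordered non-terminals. `nonterminal_links` is a hypothetical helper, and the pattern for recognising a right-hand-side non-terminal (two bracketed labels, as in `[X][X]`) is an assumption.

```python
import re

# Assumption: a right-hand-side non-terminal looks like [SRC][TGT].
NONTERM = re.compile(r"^\[[^\[\]]+\]\[[^\[\]]+\]$")


def nonterminal_links(source_rhs, target_rhs, alignment):
    """Return the (source_pos, target_pos) pairs that link two non-terminals."""
    links = []
    for src_pos, tgt_pos in alignment:
        if NONTERM.match(source_rhs[src_pos]) and NONTERM.match(target_rhs[tgt_pos]):
            links.append((src_pos, tgt_pos))
    return links


# Hypothetical rule: the two non-terminals swap places on the target side.
source_rhs = ["[X][X]", "de", "[X][X]"]
target_rhs = ["[X][X]", "of", "[X][X]"]
alignment = [(0, 2), (1, 1), (2, 0)]

links = nonterminal_links(source_rhs, target_rhs, alignment)
```

Here the first source non-terminal maps to the last target non-terminal, which a reader pairing them by position would get wrong.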