- Installation steps
- Install the prerequisite packages
sudo apt-get install build-essential git-core pkg-config automake libtool wget zlib1g-dev python-dev libbz2-dev
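After the packages are installed, a quick check that the main build tools are on the PATH can save a failed build later. The following is a minimal sketch; the helper name missing_tools and the list of commands checked are assumptions chosen for illustration, not part of Moses.

```shell
# Print every command from the argument list that is not found on PATH.
# (missing_tools is a hypothetical helper, not a Moses script.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# An illustrative subset of the commands provided by the packages above.
missing=$(missing_tools git make g++ pkg-config automake wget)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All checked tools found."
fi
```

If anything is reported as missing, re-run the apt-get command before continuing.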
- Clone the Moses code
git clone https://github.com/moses-smt/mosesdecoder.git
cd mosesdecoder
- Install the latest versions of Boost, cmph, irstlm, xmlrpc-c
By default, Boost, cmph, irstlm, and xmlrpc-c are installed into the mosesdecoder/opt directory.
make -f contrib/Makefiles/install-dependencies.gmake
- Compile Moses
Build with compile.sh. Common parameters:
--prefix=/destination/path --install-scripts, sets the installation path
--with-mm, enables suffix array-based phrase tables
For example:
./compile.sh --prefix=/mnt/share --install-scripts --with-mm
If compile.sh fails, or you need more build options, use the bjam command directly.
For example:
./bjam --with-boost=$HOME/workspace/temp/boost_1_55_0 -j8
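The -j value in the bjam example is the number of parallel build jobs. A minimal sketch of choosing it from the machine's CPU count is shown below. The variable names JOBS and BOOST_DIR are assumptions for illustration, and the Boost path is the example path from these notes, so adjust it to your own installation.

```shell
# Number of parallel jobs: use nproc if it is available, otherwise fall back to 1.
JOBS=$(nproc 2>/dev/null || echo 1)

# Boost location (example path from these notes; override by setting BOOST_DIR).
BOOST_DIR="${BOOST_DIR:-$HOME/workspace/temp/boost_1_55_0}"

# Run the build only when started from the mosesdecoder directory.
if [ -x ./bjam ]; then
    ./bjam --with-boost="$BOOST_DIR" -j"$JOBS"
else
    echo "bjam not found; run this from the mosesdecoder directory" >&2
fi
```

Using $HOME instead of ~ inside the option keeps the path expansion independent of the shell in use.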
- Additional configuration
2.1 Word Alignment
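The word alignment software is not included in the automatic build and install scripts, so it has to be installed manually. Commonly used word alignment tools are GIZA++, MGIZA, and Fast Align.
MGIZA is used here. Install it as follows:
- Install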
git clone https://github.com/moses-smt/mgiza.git
cd mgiza/mgizapp
cmake .
make
make install
- compile giza
manual-compile/compile.sh
- Copy the mgiza binaries and the merge_alignment.py script to the Moses folder
export BINDIR=~/workspace/bin/training-tools
cp bin/* $BINDIR/mgizapp
cp scripts/merge_alignment.py $BINDIR
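The cp commands above assume that the target directories already exist. A minimal sketch that creates them first is shown below; the function name install_mgiza_tools is hypothetical, and the directory layout simply mirrors the commands above.

```shell
# Copy the MGIZA binaries and merge_alignment.py into the training-tools folder,
# creating the target directories first.
# Arguments: <mgizapp source dir> <BINDIR>. (Hypothetical helper, not part of MGIZA.)
install_mgiza_tools() {
    src=$1
    dest=$2
    mkdir -p "$dest/mgizapp" || return 1
    cp "$src"/bin/* "$dest/mgizapp/" || return 1
    cp "$src/scripts/merge_alignment.py" "$dest/" || return 1
}

# Example call from inside mgiza/mgizapp; skipped if nothing has been built here yet.
if [ -d bin ] && [ -f scripts/merge_alignment.py ]; then
    install_mgiza_tools . "${BINDIR:-$HOME/workspace/bin/training-tools}"
fi
```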
- Use mgiza with train-model.perl
MGIZA works with the training script train-model.perl. You indicate its use (as opposed to regular GIZA++) with the switch -mgiza. The switch -mgiza-cpus NUMBER allows you to specify the number of CPUs.
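As a rough illustration, a training call with these switches might be assembled as follows. Every path, the language pair (fr, en), the language model file, and the CPU count are placeholder assumptions, not values from these notes. The sketch only prints the command so that it can be checked before it is run.

```shell
# All paths and language codes below are hypothetical placeholders.
MOSES_DIR="${MOSES_DIR:-$HOME/mosesdecoder}"
BINDIR="${BINDIR:-$HOME/workspace/bin/training-tools}"
CORPUS="${CORPUS:-$HOME/corpus/train.clean}"   # expects train.clean.fr and train.clean.en
LM_FILE="${LM_FILE:-$HOME/lm/train.blm.en}"
CPUS=4

# Assemble the command as positional parameters so it can be inspected first.
set -- "$MOSES_DIR/scripts/training/train-model.perl" \
    -root-dir train \
    -corpus "$CORPUS" -f fr -e en \
    -lm "0:3:$LM_FILE:8" \
    -external-bin-dir "$BINDIR" \
    -mgiza -mgiza-cpus "$CPUS"

# Dry run: show the command that would be executed.
echo "$@"
# Uncomment to start training:
# "$@"
```

The -external-bin-dir value should point at the folder that holds the copied MGIZA tools.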