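I installed Kaldi on Ubuntu 18.04 by following the steps in the book 《Kaldi 语音识别实战》 (Kaldi Speech Recognition in Practice), and ran into problems at almost every step. This post collects the pitfalls I hit, in case it helps others.
(Before starting, make sure the GPU driver and CUDA are already installed.)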
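A quick way to confirm that prerequisite is to check that the driver and CUDA command-line tools are on the PATH. This is a minimal sketch; missing_tools is a hypothetical helper name, and nvcc may live under a CUDA bin directory that is not on your PATH even when CUDA is installed.

```shell
#!/bin/sh
# missing_tools: print each named command that is not found on PATH.
missing_tools() {
  local t
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || echo "$t"
  done
}

# nvidia-smi ships with the NVIDIA driver; nvcc ships with the CUDA toolkit.
missing=$(missing_tools nvidia-smi nvcc)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "GPU driver and CUDA compiler found"
fi
```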
- Download the Kaldi source code from GitHub: Kaldi
- Change into the downloaded kaldi folder and check whether the dependencies are installed:
cd kaldi
tools/extras/check_dependencies.sh
On a first install most of the dependencies will be missing, so you will see output like this:
tools/extras/check_dependencies.sh: zlib is not installed.
tools/extras/check_dependencies.sh: automake is not installed.
tools/extras/check_dependencies.sh: autoconf is not installed.
tools/extras/check_dependencies.sh: sox is not installed.
tools/extras/check_dependencies.sh: gfortran is not installed.
tools/extras/check_dependencies.sh: subversion is not installed
tools/extras/check_dependencies.sh: WARNING python 2.7 is not the default python. We fixed this by adding a correct symlink more prominently on the path.
... If you really want to use python 3.7.6 as default, add an empty file /home/haoliang/code_pro/kaldi/python/.use_default_python and run this script again.
tools/extras/check_dependencies.sh: Intel MKL is not installed. Run extras/install_mkl.sh to install it.
... You can also use other matrix algebra libraries. For information, see:
... http://kaldi-asr.org/doc/matrixwrap.html
tools/extras/check_dependencies.sh: Some prerequisites are missing; install them using the command:
sudo apt-get install zlib1g-dev automake autoconf sox gfortran subversion
The last line tells us to install the missing packages with sudo apt-get install zlib1g-dev automake autoconf sox gfortran subversion.
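If you prefer to script this step, the suggested package list can be pulled straight out of the checker's output. The sketch below assumes the output format shown above (a line starting with "sudo apt-get install"); suggested_packages is a hypothetical helper name.

```shell
#!/bin/sh
# suggested_packages: read check_dependencies.sh output on stdin and print
# the package names from its last "sudo apt-get install ..." line.
suggested_packages() {
  sed -n 's/^[[:space:]]*sudo apt-get install[[:space:]]\{1,\}//p' | tail -n 1
}

# Example use from the kaldi directory (left as a comment so nothing is
# installed just by sourcing this file):
#   pkgs=$(tools/extras/check_dependencies.sh 2>&1 | suggested_packages)
#   [ -n "$pkgs" ] && sudo apt-get install -y $pkgs
```

Review the extracted list before installing it; the checker's wording may differ between Kaldi versions.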
After installing these libraries, run tools/extras/check_dependencies.sh again. It now reports:
tools/extras/check_dependencies.sh: Intel MKL is not installed. Run extras/install_mkl.sh to install it.
... You can also use other matrix algebra libraries. For information, see:
... http://kaldi-asr.org/doc/matrixwrap.html
Again following the prompt, run tools/extras/install_mkl.sh. The script refuses to run without root privileges and suggests rerunning it with sudo and the -sp debian option:
tools/extras/install_mkl.sh: You must be root to install MKL.
Restart this script using the 'sudo' command, as:
sudo tools/extras/install_mkl.sh -sp debian intel-mkl-64bit-2020.0-088
Running that command prints a long log, and the final section reports that the installation failed:
E: The repository 'http://ppa.launchpad.net/webupd8team/sublime-text-3/ubuntu bionic Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
tools/extras/install_mkl.sh: MKL package intel-mkl-64bit-2020.0-088 installation FAILED.
Please open an issue with us at https://github.com/kaldi-asr/kaldi/ if you
believe this is a bug.
I searched online for a long time; the fix that finally worked was:
- Open Software & Updates, switch to the Other Software tab, find the entry http://ppa.launchpad.net/webupd8team/sublime-text-3/ubuntu/dists/ and untick the box in front of it
- After saving, you will be prompted to Reload
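Some people online consider this a workaround rather than a real fix and worry that it may cause other problems. The failing entry is a Sublime Text PPA unrelated to MKL, though, and disabling it did resolve my installation. I found no better method, so I went with it.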
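The same diagnosis can be done from the terminal. The sketch below extracts the offending repository URL from apt's error lines; broken_repos is a hypothetical helper name, and it assumes apt's messages are in English and match the format shown above.

```shell
#!/bin/sh
# broken_repos: read apt output on stdin and print the URL of every
# repository reported as lacking a Release file.
broken_repos() {
  sed -n "s/^E: The repository '\([^ ']*\).*does not have a Release file.*/\1/p"
}

# Typical use (commented out because it needs root and network access):
#   sudo apt-get update 2>&1 | broken_repos
# To see which sources file defines the entry:
#   grep -rn webupd8team /etc/apt/sources.list /etc/apt/sources.list.d/
# Commenting out that "deb" line with a leading "#" should have the same
# effect as unticking the box in Software & Updates.
```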
Run sudo tools/extras/install_mkl.sh -sp debian intel-mkl-64bit-2020.0-088 again; this time it reports that MKL was installed successfully.
At this point the dependencies are all installed.
- Install the third-party tools
- OpenFst
Kaldi uses FSTs to represent its state graphs. Install OpenFst as follows:
cd tools
make openfst
This fails with:
openfst-1.6.7.tar.gz: Permission denied
Cannot write to ‘openfst-1.6.7.tar.gz’ (Success).
Makefile:97: recipe for target 'openfst-1.6.7.tar.gz' failed
make: *** [openfst-1.6.7.tar.gz] Error 3
Run it with administrator privileges:
sudo make openfst
This reports:
extras/check_dependencies.sh: Some prerequisites are missing; install them using the command:
sudo apt-get install libtool
Makefile:38: recipe for target 'check_required_programs' failed
make: *** [check_required_programs] Error 1
Following the prompt, run:
sudo apt-get install libtool
Once that is installed, run again:
sudo make openfst
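The Permission denied error above means the downloaded archive could not be written into the current directory. A quick check of whether the tools directory is writable tells you in advance whether sudo will be needed. This is a minimal sketch; dir_status is a hypothetical helper name.

```shell
#!/bin/sh
# dir_status: report whether a directory exists and is writable by the
# current user.
dir_status() {
  if [ ! -d "$1" ]; then
    echo "missing"
  elif [ -w "$1" ]; then
    echo "writable"
  else
    echo "read-only"
  fi
}

# Run from kaldi/tools: "read-only" suggests that plain `make` will hit
# the same Permission denied error.
echo "current directory: $(dir_status .)"
```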
- cub
cub is NVIDIA's official library for writing CUDA kernels and is currently a required tool for building Kaldi. Install it as follows:
sudo make cub
- sclite
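sclite is part of the NIST SCTK scoring toolkit and generates statistics files that follow NIST evaluation conventions. If you only need to compute recognition accuracy, this tool is optional, because Kaldi itself includes a simple WER tool, compute-wer. The basic command to install sclite is: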
sudo make sclite
- sph2pipe
sph2pipe converts audio in the SPH format; every recipe that uses LDC data needs it. Install it as follows:
sudo make sph2pipe
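The four make targets in this section (openfst, cub, sclite, sph2pipe) can also be built in one loop that stops at the first failure. This is a sketch only; build_targets is a hypothetical helper, and the command to run is passed in as the first argument so the loop can be tried with echo before using sudo make.

```shell
#!/bin/sh
# build_targets RUNNER TARGET...: run "RUNNER TARGET" for each target in
# order and stop at the first one that fails.
build_targets() {
  local runner="$1" t
  shift
  for t in "$@"; do
    echo "==> $t"
    # $runner is intentionally unquoted so "sudo make" splits into two words.
    $runner "$t" || { echo "failed: $t" >&2; return 1; }
  done
}

# Dry run: prints each target instead of building it.
build_targets echo openfst cub sclite sph2pipe

# Real run from kaldi/tools (commented out because it needs root):
#   build_targets "sudo make" openfst cub sclite sph2pipe
```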
- Install the three language-model toolkits
First install irstlm by running:
sudo extras/install_irstlm.sh
If you see:
****() Installing IRSTLM
Cloning into 'irstlm'...
remote: Enumerating objects: 1781, done.
error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
****() Error getting the IRSTLM sources. The server hosting it
****() might be down.
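the connection is probably too slow; accessing GitHub from networks in mainland China can be sluggish. Either retry a few times or go through a proxy.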
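Retrying by hand gets tedious, so a small retry wrapper can help. This is a sketch under assumptions: retry is a hypothetical helper, and RETRY_DELAY is an environment variable name made up here for the pause between attempts (default 5 seconds).

```shell
#!/bin/sh
# retry N CMD...: run CMD up to N times, pausing RETRY_DELAY seconds
# between attempts; succeed as soon as one attempt succeeds.
retry() {
  local attempts="$1" delay="${RETRY_DELAY:-5}" i=1
  shift
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    echo "attempt $i/$attempts failed" >&2
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example from kaldi/tools (commented out because it needs root and network):
#   retry 5 sudo extras/install_irstlm.sh
```

If a failed attempt leaves a partial irstlm directory behind, you may need to remove it before the next try.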
Next install srilm by running:
sudo extras/install_srilm.sh
The script builds liblbfgs and then stops with the following message:
make[2]: Leaving directory '/home/haoliang/code_pro/kaldi/tools/liblbfgs-1.10'
make[1]: Leaving directory '/home/haoliang/code_pro/kaldi/tools/liblbfgs-1.10'
This script cannot install SRILM in a completely automatic
way because you need to put your address in a download form.
Please download SRILM from http://www.speech.sri.com/projects/srilm/download.html
put it in ./srilm.tgz, then run this script.
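This happens because SRILM is not free for commercial use: you have to register on the SRILM website and accept the license agreement before you can download the source package. Download it from the site shown in the terminal and save it in the tools directory. The file I downloaded was named srilm-1.73.tar.gz; rename it to srilm.tgz and rerun sudo extras/install_srilm.sh, as the message instructs. Then install the third toolkit, kaldi_lm: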
sudo extras/install_kaldi_lm.sh
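The SRILM staging step (save the download into tools under the name srilm.tgz) can be scripted with a sanity check that the file really is a gzip-compressed tar archive. This is a sketch; stage_srilm is a hypothetical helper, and the download path in the usage comment is only an example.

```shell
#!/bin/sh
# stage_srilm SRC TOOLS_DIR: verify that SRC is a readable .tar.gz archive,
# then copy it to TOOLS_DIR/srilm.tgz, the name install_srilm.sh asks for.
stage_srilm() {
  local src="$1" tools_dir="$2"
  [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
  tar -tzf "$src" >/dev/null 2>&1 || {
    echo "not a gzip tar archive: $src" >&2
    return 1
  }
  cp "$src" "$tools_dir/srilm.tgz"
}

# Example from kaldi/tools (the download location is an assumption):
#   stage_srilm ~/Downloads/srilm-1.73.tar.gz . && sudo extras/install_srilm.sh
```

If the tools directory is owned by root after the earlier sudo commands, the copy itself may also need sudo.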
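With that, the Kaldi setup covered here is complete. The default matrix library is MKL; if you want a different one, you can install it manually. For example, to install OpenBLAS as well: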
sudo extras/install_openblas.sh
Other libraries, such as ATLAS or CLAPACK, are also options.