

1. Download

Download it straight from GitHub: https://github.com/kaldi-asr/kaldi. You can either git clone the repository or download the zip file to your machine.
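Fetching the source from the command line might look like the sketch below. The function name fetch_kaldi and the default destination directory are assumptions for illustration; --depth 1 is optional and only shortens the download by skipping the history.

```shell
# fetch_kaldi [DEST]: clone the Kaldi repository into DEST (default: ./kaldi).
# Does nothing if DEST already exists, so it is safe to rerun.
fetch_kaldi() {
    dest=${1:-kaldi}
    if [ -e "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    git clone --depth 1 https://github.com/kaldi-asr/kaldi.git "$dest"
}

# Usage (needs git and network access):
#   fetch_kaldi kaldi
```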

2. Prerequisite packages










make -j 4

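Some tutorials tell you to run make -j 4 to build on four cores in parallel, but then a missing package may go unreported. To be safe, run plain make here. If a dependency such as OpenFst did not install successfully, running ./configure in the src directory fails like this: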
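If you still want a parallel build, one option is to save the full output to a log and check the exit status, so a failure cannot scroll past unnoticed. This is a minimal sketch; the function name run_logged and the log file name are assumptions for illustration, not part of Kaldi.

```shell
# run_logged LOGFILE CMD [ARGS...]
# Runs CMD, saves its stdout and stderr to LOGFILE, and returns CMD's exit status.
run_logged() {
    log=$1
    shift
    if "$@" >"$log" 2>&1; then
        echo "OK: $* (log: $log)"
    else
        status=$?
        echo "FAILED with status $status: $* (last lines of $log follow)"
        tail -n 20 "$log"
        return "$status"
    fi
}

# Intended use in the tools or src directory (not executed here):
#   run_logged build.log make -j 4
```

Searching the saved log for "error" or "undefined reference" afterwards is usually quicker than rerunning the whole build serially.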

$ ./configure
Configuring KALDI to use MKL.
Checking compiler g++ ...
Checking OpenFst library in  ...
***configure failed: Could not find file /include/fst/fst.h:
  you may not have installed OpenFst. See ../tools/INSTALL ***


3. Install Kaldi


$ ./configure  --mathlib=ATLAS
Configuring KALDI to use ATLAS.
Backing up kaldi.mk to kaldi.mk.bak ...
Checking compiler g++ ...
Checking OpenFst library in /home/kkm/work/kaldi2/tools/openfst ...
Checking cub library in /home/kkm/work/kaldi2/tools/cub ...
Doing OS specific configurations ...
On Linux: Checking for linear algebra header files ...
Using ATLAS as the linear algebra library.
Could not find libatlas.a in any of the generic-Linux places, but we'll try other stuff...
** Failed to configure ATLAS libraries ***
**  ERROR   **
** Configure cannot proceed automatically.
**  If you know that you have ATLAS installed somewhere on your machine, you
** may be able to proceed by replacing [somewhere] in kaldi.mk with a directory.
**  If you have sudo (root) access you could install the ATLAS package on your
** machine, e.g. 'sudo apt-get install libatlas-dev libatlas-base-dev' or
** 'sudo yum install atlas.x86_64' or 'sudo zypper install libatlas3-devel',
** or on cygwin, install atlas from the installer GUI; and then run ./configure
** again.
**  Otherwise (or if you prefer OpenBLAS for speed), you could go the OpenBLAS
** route: cd to ../tools, type 'extras/install_openblas.sh', cd back to here,
** and type './configure  --openblas-root=../tools/OpenBLAS/install'

The message itself suggests the fix: install the ATLAS package and run ./configure again.


$ sudo apt install -y libatlas-base-dev
. . .
$ ./configure  --mathlib=ATLAS
Configuring KALDI to use ATLAS.
Backing up kaldi.mk to kaldi.mk.bak ...
Checking compiler g++ ...
Checking OpenFst library in /data00/home/liuwenchuang/cpphere/kaldi/source/kaldi-master/tools/openfst-1.6.7 ...
Checking cub library in /data00/home/liuwenchuang/cpphere/kaldi/source/kaldi-master/tools/cub-1.8.0 ...
Doing OS specific configurations ...
On Linux: Checking for linear algebra header files ...
Using ATLAS as the linear algebra library.
Successfully configured ATLAS with ATLASLIBS=/usr/lib/libatlas.so.3 /usr/lib/libf77blas.so.3 /usr/lib/libcblas.so.3 /usr/lib/liblapack_atlas.so.3
WARNING: CUDA will not be used! If you have already installed cuda drivers
         and CUDA toolkit, try using the --cudatk-dir= option. A GPU and CUDA
         are required to run neural net experiments in a realistic time.
INFO: Configuring Kaldi not to link with Speex. Don't worry, it's only needed if
      you intend to use 'compress-uncompress-speex', which is very unlikely.
Kaldi has been successfully configured. To compile:

  make -j clean depend; make -j <NCPU>

where <NCPU> is the number of parallel builds you can afford to do. If unsure,
use the smaller of the number of CPUs or the amount of RAM in GB divided by 2,
to stay within safe limits. 'make -j' without the numeric value may not limit
the number of parallel jobs at all, and overwhelm even a powerful workstation,
since Kaldi build is highly parallelized.

$ make -j clean depend; make -j 4
. . .
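
The configure output above recommends taking the smaller of the CPU count and half the RAM in GB as the job count. A small sketch of that rule follows; the function name safe_jobs is an assumption for illustration, and the commented usage assumes a Linux machine with nproc and /proc/meminfo.

```shell
# safe_jobs CPUS RAM_GB: prints min(CPUS, RAM_GB / 2), but never less than 1.
safe_jobs() {
    cpus=$1
    half_ram=$(( $2 / 2 ))
    n=$cpus
    if [ "$half_ram" -lt "$n" ]; then
        n=$half_ram
    fi
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Intended use on Linux (not executed here):
#   cpus=$(nproc)
#   ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
#   make -j "$(safe_jobs "$cpus" "$ram_gb")"
```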

$ sudo apt remove -y libatlas-base-dev --auto-remove

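With make -j 4, however, it is hard to tell from the output what went wrong. Running a test under egs then fails because fstaddselfloops cannot be found; a search shows that this means Kaldi did not build successfully: https://github.com/uhh-lt/kaldi-tuda-de/issues/10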
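
A quick way to catch this before running a recipe is to check that the binaries it needs can be found. This is a minimal sketch; the function name missing_tools is an assumption, and the commented usage assumes the recipe directory provides a path.sh that puts the Kaldi binaries on PATH.

```shell
# missing_tools NAME...: prints each NAME that is not found on PATH;
# returns non-zero if at least one is missing.
missing_tools() {
    rc=0
    for t in "$@"; do
        if ! command -v "$t" >/dev/null 2>&1; then
            echo "missing: $t"
            rc=1
        fi
    done
    return "$rc"
}

# Intended use from a recipe directory such as egs/yesno/s5 (not executed here):
#   . ./path.sh && missing_tools fstaddselfloops
```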

Going back and running sudo make exposed the problem.

make[2]: Entering directory '/home/srinivas/Downloads/kaldi/src/online2'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/srinivas/Downloads/kaldi/src/online2'
make -C bin
make[2]: Entering directory '/home/srinivas/Downloads/kaldi/src/bin'
g++-4.9  -Wl,-rpath=/home/srinivas/Downloads/kaldi/tools/openfst/lib -rdynamic  align-equal.o ../decoder/kaldi-decoder.a ../lat/kaldi-lat.a ../lm/kaldi-lm.a ../fstext/kaldi-fstext.a ../hmm/kaldi-hmm.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../thread/kaldi-thread.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a   /home/srinivas/Downloads/kaldi/tools/openfst/lib/libfst.so /usr/lib/libatlas.so.3 /usr/lib/libf77blas.so.3 /usr/lib/libcblas.so.3 /usr/lib/liblapack_atlas.so.3 -lm -lpthread -ldl -o align-equal
align-equal.o: In function `fst::internal::FstImpl<fst::ArcTpl<fst::TropicalWeightTpl<float> > >::WriteFstHeader(fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&, std::ostream&, fst::FstWriteOptions const&, int, std::string const&, unsigned long, fst::FstHeader*)':
/home/srinivas/Downloads/kaldi/tools/openfst/include/fst/fst.h:745: undefined reference to `fst::FstHeader::Write(std::ostream&, std::string const&) const'
../decoder/kaldi-decoder.a(training-graph-compiler.o): In function `fst::internal::FstImpl<fst::ReverseArc<fst::ArcTpl<fst::TropicalWeightTpl<float> > > >::WriteFstHeader(fst::Fst<fst::ReverseArc<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > const&, std::ostream&, fst::FstWriteOptions const&, int, std::string const&, unsigned long, fst::FstHeader*)':
/home/srinivas/Downloads/kaldi/tools/openfst/include/fst/fst.h:745: undefined reference to `fst::FstHeader::Write(std::ostream&, std::string const&) const'
../decoder/kaldi-decoder.a(training-graph-compiler.o): In function `fst::internal::FstImpl<fst::ArcTpl<fst::LogWeightTpl<float> > >::WriteFstHeader(fst::Fst<fst::ArcTpl<fst::LogWeightTpl<float> > > const&, std::ostream&, fst::FstWriteOptions const&, int, std::string const&, unsigned long, fst::FstHeader*)':
/home/srinivas/Downloads/kaldi/tools/openfst/include/fst/fst.h:745: undefined reference to `fst::FstHeader::Write(std::ostream&, std::string const&) const'
../fstext/kaldi-fstext.a(kaldi-fst-io.o): In function `fst::ReadFstKaldi(std::string)':
/home/srinivas/Downloads/kaldi/src/fstext/kaldi-fst-io.cc:34: undefined reference to `fst::FstHeader::Read(std::istream&, std::string const&, bool)'
/home/srinivas/Downloads/kaldi/src/fstext/kaldi-fst-io.cc:37: undefined reference to `fst::FstReadOptions::FstReadOptions(std::string const&, fst::FstHeader const*, fst::SymbolTable const*, fst::SymbolTable const*)'
../fstext/kaldi-fstext.a(kaldi-fst-io.o): In function `fst::internal::FstImpl<fst::ArcTpl<fst::TropicalWeightTpl<float> > >::ReadHeader(std::istream&, fst::FstReadOptions const&, int, fst::FstHeader*)':
/home/srinivas/Downloads/kaldi/tools/openfst/include/fst/fst.h:796: undefined reference to `fst::FstHeader::Read(std::istream&, std::string const&, bool)'
collect2: error: ld returned 1 exit status
<builtin>: recipe for target 'align-equal' failed
make[2]: *** [align-equal] Error 1
make[2]: Leaving directory '/home/srinivas/Downloads/kaldi/src/bin'
Makefile:142: recipe for target 'bin' failed
make[1]: *** [bin] Error 2
make[1]: Leaving directory '/home/srinivas/Downloads/kaldi/src'
Makefile:35: recipe for target 'all' failed
make: *** [all] Error 2




$ g++
g++      g++-4.9


Here I forced CXX to g++ (g++-4.9 works as well; I tested both, and Kaldi built successfully with either).

$ export CXX=g++

Rerunning the build commands after that, to my surprise, ran all the way through and echoed Done.

4. A simple test

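1. Go back to the kaldi-master directory and into egs, which holds the sample recipes. The yesno example makes a quick check. Go into its s5 directory (I don't know why it is named s5, so I'll leave that aside for now) and run: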

$ sudo ./run.sh


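2. For something more involved, see http://xuxping.com/2019/06/16/20190616_ASR_from_begin_to_abandoned/ and try the Chinese thchs30 recipe. You have to download the data from openslr.org yourself and extract it, then set the thchs variable in run.sh to the directory you extracted it to. Even then, the run still fails: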

creating data/{train,dev,test}
cleaning data/train
preparing scps and text in data/train
cleaning data/dev
preparing scps and text in data/dev
cleaning data/test
preparing scps and text in data/test
creating test_phone for phone decoding
steps/make_mfcc.sh --nj 8 --cmd queue.pl data/mfcc/train exp/make_mfcc/train mfcc/train
utils/validate_data_dir.sh: Successfully validated data-directory data/mfcc/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
queue.pl: Error submitting jobs to queue (return status was 32512)
queue log file is exp/make_mfcc/train/q/make_mfcc_train.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* -o exp/make_mfcc/train/q/make_mfcc_train.log   -t 1:8 /data00/home/liuwenchuang/cpphere/kaldi/source/kaldi-master/egs/thchs30/s5/exp/make_mfcc/train/q/make_mfcc_train.sh >>exp/make_mfcc/train/q/make_mfcc_train.log 2>&1
Output of qsub was: sh: 1: qsub: not found

qsub is not installed on this machine, so there is no queue to submit to. The fix is to edit cmd.sh so that it uses run.pl instead of queue.pl:


# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
# can change all instances 'queue.pl' to run.pl (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine).  queue.pl works with GridEngine (qsub).  slurm.pl works
# with slurm.  Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration.  Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

#export train_cmd=queue.pl
#export decode_cmd="queue.pl --mem 4G"
#export mkgraph_cmd="queue.pl --mem 8G"
#export cuda_cmd="queue.pl --gpu 1"
export train_cmd=run.pl
export decode_cmd=run.pl
export mkgraph_cmd="run.pl"
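
The same switch can be scripted instead of edited by hand. This is a minimal sketch; the function name use_run_pl is an assumption for illustration. Unlike the hand-edited file above, it leaves options such as --mem 4G in place and simply rewrites every occurrence of queue.pl, including those in comments.

```shell
# use_run_pl FILE: rewrite every 'queue.pl' in FILE to 'run.pl',
# keeping the original as FILE.bak.
use_run_pl() {
    cp "$1" "$1.bak" &&
    sed 's/queue\.pl/run.pl/g' "$1.bak" >"$1"
}

# Intended use in a recipe directory (not executed here):
#   use_run_pl cmd.sh
```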


