Kaldi单步完美运行AIShell v1 S5之三:三音tri1 2 3 4 5
致谢
感谢AIShell在商业化道路上的探索。期待着v3的到来。
机器配置
sv@HP:~$ sudo lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic
sv@HP:~$ cat /proc/cpuinfo | grep model\ name
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
sv@HP:~$ cat /proc/meminfo | grep MemTotal
MemTotal: 16321360 kB
sv@HP:~$ lspci | grep 'VGA'
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
Kaldi下AIShell v1详细输出之三:三音triphone
一网打尽。
第五部分:三音结果更新
先看结果。tri4a的时候,还没上DNN之前,错词率已经从Mono的37%降到14%了。
sv@HP:~/lkaldi/egs/aishell/s5$ for x in exp/ */decode_test; do [ -d $x ] && grep WER $x/cer_* | utils/best_wer.sh; done 2>/dev/null
%WER 36.59 [ 38335 / 104765, 849 ins, 3183 del, 34303 sub ] exp/mono/decode_test/cer_10_0.0
%WER 18.83 [ 19727 / 104765, 971 ins, 1161 del, 17595 sub ] exp/tri1/decode_test/cer_13_0.5
%WER 18.79 [ 19684 / 104765, 957 ins, 1142 del, 17585 sub ] exp/tri2/decode_test/cer_14_0.5
%WER 16.84 [ 17643 / 104765, 791 ins, 991 del, 15861 sub ] exp/tri3a/decode_test/cer_14_0.5
%WER 13.63 [ 14277 / 104765, 762 ins, 639 del, 12876 sub ] exp/tri4a/decode_test/cer_13_0.5
%WER 12.02 [ 12594 / 104765, 745 ins, 496 del, 11353 sub ] exp/tri5a/decode_test/cer_13_0.5
第六部分:三音Tri1训练、解码、校准
sv@HP:~/lkaldi/egs/aishell/s5$ # train tri1 [first triphone pass]
sv@HP:~/lkaldi/egs/aishell/s5$ steps/train_deltas.sh --cmd "$train_cmd" 2500 20000 data/train data/lang exp/mono_ali exp/tri1 || exit 1;
steps/train_deltas.sh --cmd run.pl --mem 8G 2500 20000 data/train data/lang exp/mono_ali exp/tri1
steps/train_deltas.sh: accumulating tree stats
steps/train_deltas.sh: getting questions for tree-building, via clustering
steps/train_deltas.sh: building the tree
steps/train_deltas.sh: converting alignments from exp/mono_ali to use current tree
steps/train_deltas.sh: compiling graphs of transcripts
steps/train_deltas.sh: training pass 1
steps/train_deltas.sh: training pass 2
steps/train_deltas.sh: training pass 3
steps/train_deltas.sh: training pass 4
steps/train_deltas.sh: training pass 5
steps/train_deltas.sh: training pass 6
steps/train_deltas.sh: training pass 7
steps/train_deltas.sh: training pass 8
steps/train_deltas.sh: training pass 9
steps/train_deltas.sh: training pass 10
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 11
steps/train_deltas.sh: training pass 12
steps/train_deltas.sh: training pass 13
steps/train_deltas.sh: training pass 14
steps/train_deltas.sh: training pass 15
steps/train_deltas.sh: training pass 16
steps/train_deltas.sh: training pass 17
steps/train_deltas.sh: training pass 18
steps/train_deltas.sh: training pass 19
steps/train_deltas.sh: training pass 20
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 21
steps/train_deltas.sh: training pass 22
steps/train_deltas.sh: training pass 23
steps/train_deltas.sh: training pass 24
steps/train_deltas.sh: training pass 25
steps/train_deltas.sh: training pass 26
steps/train_deltas.sh: training pass 27
steps/train_deltas.sh: training pass 28
steps/train_deltas.sh: training pass 29
steps/train_deltas.sh: training pass 30
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 31
steps/train_deltas.sh: training pass 32
steps/train_deltas.sh: training pass 33
steps/train_deltas.sh: training pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl --mem 8G data/lang exp/tri1
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri1/log/analyze_alignments.log
877 warnings in exp/tri1/log/acc.*.*.log
1 warnings in exp/tri1/log/build_tree.log
1 warnings in exp/tri1/log/compile_questions.log
3146 warnings in exp/tri1/log/align.*.*.log
exp/tri1: nj=10 align prob=-79.45 over 150.17h [retry=0.5%, fail=0.0%] states=2080 gauss=20046 tree-impr=4.49
steps/train_deltas.sh: Done training system with delta+delta-delta features in exp/tri1
sv@HP:~/lkaldi/egs/aishell/s5$
sv@HP:~/lkaldi/egs/aishell/s5$ # decode tri1
sv@HP:~/lkaldi/egs/aishell/s5$ utils/mkgraph.sh data/lang_test exp/tri1 exp/tri1/graph || exit 1;
tree-info exp/tri1/tree
tree-info exp/tri1/tree
fstcomposecontext --context-size=3 --central-position=1 --read-disambig-syms=data/lang_test/phones/disambig.int --write-disambig-syms=data/lang_test/tmp/disambig_ilabels_3_1.int data/lang_test/tmp/ilabels_3_1.10620 data/lang_test/tmp/LG.fst
fstisstochastic data/lang_test/tmp/CLG_3_1.fst
0 -0.0666824
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/tri1/graph/disambig_tid.int --transition-scale=1.0 data/lang_test/tmp/ilabels_3_1 exp/tri1/tree exp/tri1/final.mdl
fstrmepslocal
fsttablecompose exp/tri1/graph/Ha.fst data/lang_test/tmp/CLG_3_1.fst
fstrmsymbols exp/tri1/graph/disambig_tid.int
fstminimizeencoded
fstdeterminizestar --use-log=true
fstisstochastic exp/tri1/graph/HCLGa.fst
0.000487832 -0.178947
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri1/final.mdl exp/tri1/graph/HCLGa.fst
sv@HP:~/lkaldi/egs/aishell/s5$ steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 exp/tri1/graph data/dev exp/tri1/decode_dev
steps/decode.sh --cmd run.pl --mem 8G --config conf/decode.config --nj 10 exp/tri1/graph data/dev exp/tri1/decode_dev
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl --mem 8G exp/tri1/graph exp/tri1/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,4,30) and mean=12.1
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_dev/log/analyze_lattice_depth_stats.log
+ steps/score_kaldi.sh --cmd 'run.pl --mem 8G' data/dev exp/tri1/graph exp/tri1/decode_dev
steps/score_kaldi.sh --cmd run.pl --mem 8G data/dev exp/tri1/graph exp/tri1/decode_dev
steps/score_kaldi.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ steps/scoring/score_kaldi_cer.sh --stage 2 --cmd 'run.pl --mem 8G' data/dev exp/tri1/graph exp/tri1/decode_dev
steps/scoring/score_kaldi_cer.sh --stage 2 --cmd run.pl --mem 8G data/dev exp/tri1/graph exp/tri1/decode_dev
steps/scoring/score_kaldi_cer.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ echo 'local/score.sh: Done'
local/score.sh: Done
sv@HP:~/lkaldi/egs/aishell/s5$ steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 exp/tri1/graph data/test exp/tri1/decode_test
steps/decode.sh --cmd run.pl --mem 8G --config conf/decode.config --nj 10 exp/tri1/graph data/test exp/tri1/decode_test
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl --mem 8G exp/tri1/graph exp/tri1/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,4,40) and mean=15.4
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_test/log/analyze_lattice_depth_stats.log
+ steps/score_kaldi.sh --cmd 'run.pl --mem 8G' data/test exp/tri1/graph exp/tri1/decode_test
steps/score_kaldi.sh --cmd run.pl --mem 8G data/test exp/tri1/graph exp/tri1/decode_test
steps/score_kaldi.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ steps/scoring/score_kaldi_cer.sh --stage 2 --cmd 'run.pl --mem 8G' data/test exp/tri1/graph exp/tri1/decode_test
steps/scoring/score_kaldi_cer.sh --stage 2