How the HMM in Kaldi's yesno recipe is used

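This post explains how the HMM FST (Ha.fst) is used, that is, how it takes part in mapping feature vectors to phones.

It covers usage only, not how the HMM and the model are generated.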

In the yesno/s5/exp/mono0a/graph_tgpr directory, print the contents of Ha.fst:
boystray@boystray-All-Series:~/kaldi/egs/yesno/s5/exp/mono0a/graph_tgpr$ fstprint  Ha.fst 
0	1	0	1
0	7	20	2
0	10	26	3
0	13	31	4
0
1	2	2	0	3.26078463
1	3	3	0	0.771307588
1	4	4	0	0.694674611
2	3	6	0	1.60320568
2	4	7	0	0.514737606
2	5	8	0	1.60320568
3	2	9	0	2.19029689
3	4	11	0	0.253189087
3	5	12	0	2.19029689
4	2	13	0	2.36310244
4	3	14	0	2.36310244
4	5	16	0	0.208477736
5	6	18	0
6	0	0	0
7	8	22	0
8	9	24	0
9	0	0	0
10	11	28	0
11	12	30	0	-2.38418579e-07
12	0	0	0
13	0	0	0

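Column 1 is the source state, column 2 the destination state, column 3 the transition-id (the arc's input label), and column 4 the phone id (the arc's output label). Where a fifth column is present, it is the arc weight. A line holding a single number, such as the lone 0, marks a final state. Strictly speaking, the output labels of Kaldi's H transducer stand for context-dependent phones; in this monophone setup they can be read directly as the phone ids in phones.txt.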
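
The column layout described above can be checked with a small parser. This is a minimal sketch, not part of Kaldi or OpenFst: the names `Arc` and `parse_fstprint` are hypothetical, and the sample is a few lines copied from the listing. The probability printed for each arc assumes the weight is a negative natural-log probability, which is the usual convention for Kaldi graphs but is not stated in the listing itself.

```python
import math
from typing import NamedTuple


class Arc(NamedTuple):
    src: int       # source state
    dst: int       # destination state
    ilabel: int    # input label: transition-id (0 means epsilon)
    olabel: int    # output label: phone id (0 means epsilon)
    weight: float  # arc cost; 0.0 when the field is omitted


def parse_fstprint(text):
    """Split fstprint text output into a list of arcs and a dict of final states."""
    arcs, finals = [], {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) >= 4:
            weight = float(fields[4]) if len(fields) > 4 else 0.0
            arcs.append(Arc(int(fields[0]), int(fields[1]),
                            int(fields[2]), int(fields[3]), weight))
        elif fields:
            # "state [weight]" marks a final state
            finals[int(fields[0])] = float(fields[1]) if len(fields) > 1 else 0.0
    return arcs, finals


# A few lines copied from the Ha.fst listing above.
SAMPLE = """\
0 1 0 1
0 7 20 2
0
1 2 2 0 3.26078463
1 3 3 0 0.771307588
1 4 4 0 0.694674611
"""

arcs, finals = parse_fstprint(SAMPLE)
for arc in arcs:
    # Assumption: weight = -ln(probability).
    print(arc, "prob=%.4f" % math.exp(-arc.weight))
print("final states:", sorted(finals))
```

Under that assumption, the three arcs leaving state 1 have probabilities that sum to about 1, which suggests the weights are the normalized non-self-loop transition probabilities of the first SIL state.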

The transition-ids can be listed with show-transitions:
boystray@boystray-All-Series:~/kaldi/egs/yesno/s5/exp/mono0a$ ~/kaldi/src/bin/show-transitions phones.txt 0.mdl
/home/boystray/kaldi/src/bin/show-transitions phones.txt 0.mdl 
Transition-state 1: phone = SIL hmm-state = 0 pdf = 0
 Transition-id = 1 p = 0.25 [self-loop]
 Transition-id = 2 p = 0.25 [0 -> 1]
 Transition-id = 3 p = 0.25 [0 -> 2]
 Transition-id = 4 p = 0.25 [0 -> 3]
Transition-state 2: phone = SIL hmm-state = 1 pdf = 1
 Transition-id = 5 p = 0.25 [self-loop]
 Transition-id = 6 p = 0.25 [1 -> 2]
 Transition-id = 7 p = 0.25 [1 -> 3]
 Transition-id = 8 p = 0.25 [1 -> 4]
Transition-state 3: phone = SIL hmm-state = 2 pdf = 2
 Transition-id = 9 p = 0.25 [2 -> 1]
 Transition-id = 10 p = 0.25 [self-loop]
 Transition-id = 11 p = 0.25 [2 -> 3]
 Transition-id = 12 p = 0.25 [2 -> 4]
Transition-state 4: phone = SIL hmm-state = 3 pdf = 3
 Transition-id = 13 p = 0.25 [3 -> 1]
 Transition-id = 14 p = 0.25 [3 -> 2]
 Transition-id = 15 p = 0.25 [self-loop]
 Transition-id = 16 p = 0.25 [3 -> 4]
Transition-state 5: phone = SIL hmm-state = 4 pdf = 4
 Transition-id = 17 p = 0.75 [self-loop]
 Transition-id = 18 p = 0.25 [4 -> 5]
Transition-state 6: phone = Y hmm-state = 0 pdf = 5
 Transition-id = 19 p = 0.75 [self-loop]
 Transition-id = 20 p = 0.25 [0 -> 1]
Transition-state 7: phone = Y hmm-state = 1 pdf = 6
 Transition-id = 21 p = 0.75 [self-loop]
 Transition-id = 22 p = 0.25 [1 -> 2]
Transition-state 8: phone = Y hmm-state = 2 pdf = 7
 Transition-id = 23 p = 0.75 [self-loop]
 Transition-id = 24 p = 0.25 [2 -> 3]
Transition-state 9: phone = N hmm-state = 0 pdf = 8
 Transition-id = 25 p = 0.75 [self-loop]
 Transition-id = 26 p = 0.25 [0 -> 1]
Transition-state 10: phone = N hmm-state = 1 pdf = 9
 Transition-id = 27 p = 0.75 [self-loop]
 Transition-id = 28 p = 0.25 [1 -> 2]
Transition-state 11: phone = N hmm-state = 2 pdf = 10
 Transition-id = 29 p = 0.75 [self-loop]
 Transition-id = 30 p = 0.25 [2 -> 3]
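
To look up a transition-id programmatically, the show-transitions text can be turned into a dictionary. The sketch below is only an illustration: `parse_show_transitions` is a hypothetical helper, and the regular expressions assume the exact line layout shown in the output above.

```python
import re

STATE_RE = re.compile(
    r"Transition-state (\d+): phone = (\S+) hmm-state = (\d+) pdf = (\d+)")
TRANS_RE = re.compile(r"Transition-id = (\d+) p = (\S+) \[(.+)\]")


def parse_show_transitions(text):
    """Map each transition-id to its phone, hmm-state, pdf, probability and arc."""
    table = {}
    current = None  # (phone, hmm-state, pdf) of the enclosing transition-state
    for line in text.splitlines():
        m = STATE_RE.search(line)
        if m:
            current = (m.group(2), int(m.group(3)), int(m.group(4)))
            continue
        m = TRANS_RE.search(line)
        if m and current is not None:
            phone, hmm_state, pdf = current
            table[int(m.group(1))] = {
                "phone": phone,
                "hmm_state": hmm_state,
                "pdf": pdf,
                "prob": float(m.group(2)),
                "arc": m.group(3),
            }
    return table


# Lines copied from the show-transitions output above.
SAMPLE = """\
Transition-state 6: phone = Y hmm-state = 0 pdf = 5
 Transition-id = 19 p = 0.75 [self-loop]
 Transition-id = 20 p = 0.25 [0 -> 1]
Transition-state 9: phone = N hmm-state = 0 pdf = 8
 Transition-id = 25 p = 0.75 [self-loop]
 Transition-id = 26 p = 0.25 [0 -> 1]
"""

table = parse_show_transitions(SAMPLE)
print(table[20])
print(table[26])
```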


The phone ids are defined in phones.txt, which reads as follows:
<eps> 0
SIL 1
Y 2
N 3
#0 4
#1 5
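
The file is a plain symbol table with one "symbol id" pair per line, so it can be loaded with a few lines of Python. `read_symbol_table` is a hypothetical helper written for this post, not a Kaldi or OpenFst function.

```python
def read_symbol_table(text):
    """Parse 'symbol id' lines into a dict from symbol to integer id."""
    sym2id = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            sym2id[parts[0]] = int(parts[1])
    return sym2id


# Contents of phones.txt as listed above.
PHONES_TXT = """\
<eps> 0
SIL 1
Y 2
N 3
#0 4
#1 5
"""

sym2id = read_symbol_table(PHONES_TXT)
id2sym = {i: s for s, i in sym2id.items()}
print(id2sym[2], id2sym[3])  # prints: Y N
```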

With this background, look again at the first few lines of Ha.fst:

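source state   destination state   transition-id   phone id

0              1                   0               1    recognizes SIL (the input is <eps>; the SIL transition-ids follow on states 1 to 6)
0              7                   20              2    recognizes Y
0              10                  26              3    recognizes N
0              13                  31              4    recognizes #0 (31 is not among the 30 transition-ids listed by show-transitions)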

This is how the phone is recognized. HCLG then recognizes words and sentences in turn.
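
The lookup walked through above can be sketched in a few lines. Everything here is illustrative: the three tables are typed in by hand from the listings in this post, `describe` is a hypothetical helper, and the output label is looked up in phones.txt following this post's reading of column 4.

```python
# Arcs leaving state 0 in Ha.fst: (source, destination, input label, output label).
START_ARCS = [(0, 1, 0, 1), (0, 7, 20, 2), (0, 10, 26, 3), (0, 13, 31, 4)]

# transition-id -> (phone, hmm-state, pdf), taken from the show-transitions output.
TRANSITIONS = {20: ("Y", 0, 5), 26: ("N", 0, 8)}

# phone id -> symbol, taken from phones.txt.
PHONES = {0: "<eps>", 1: "SIL", 2: "Y", 3: "N", 4: "#0", 5: "#1"}


def describe(arc):
    """Return a one-line, human-readable reading of an Ha.fst arc."""
    src, dst, ilabel, olabel = arc
    out = PHONES.get(olabel, "?")
    if ilabel == 0:
        inp = "<eps> (no transition-id consumed)"
    elif ilabel in TRANSITIONS:
        phone, state, pdf = TRANSITIONS[ilabel]
        inp = f"transition-id {ilabel} ({phone}, hmm-state {state}, pdf {pdf})"
    else:
        inp = f"label {ilabel} (not a transition-id listed by show-transitions)"
    return f"{src} -> {dst}: {inp} => {out}"


for arc in START_ARCS:
    print(describe(arc))
```

The pdf number attached to each transition-id is the link back to the feature vectors: it selects the acoustic model distribution that scores a frame.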

 
