Kaldi Study Notes

会飞行的小蜗牛

Category: Speech Recognition

Article link: https://blog.csdn.net/dearwind153/article/details/55261649

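0. How to interpret "word" when reading English-language literature on speech recognition

1. Where HMMs and GMMs are used in acoustic model training

2. A closer look at P(W|O)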

A simple way to understand likelihood:

P(O|W): the observation O is fixed; vary W so that P(O|W) is maximized
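To connect the likelihood P(O|W) back to P(W|O): by Bayes' rule, P(W|O) = P(O|W) P(W) / P(O), and since P(O) is the same for every candidate W, maximizing P(W|O) amounts to maximizing P(O|W) P(W). A minimal sketch of that selection follows; the candidate sentences and all scores are made-up numbers for illustration, not the output of any real model.

```python
import math

def best_word_sequence(acoustic_loglik, lm_logprob):
    """Return argmax over W of log P(O|W) + log P(W) for the given candidates."""
    return max(acoustic_loglik, key=lambda w: acoustic_loglik[w] + lm_logprob[w])

# Hypothetical scores for one fixed observation sequence O.
acoustic_loglik = {
    "recognize speech": math.log(0.020),    # log P(O|W), assumed value
    "wreck a nice beach": math.log(0.025),  # log P(O|W), assumed value
}
lm_logprob = {
    "recognize speech": math.log(0.010),    # log P(W), assumed value
    "wreck a nice beach": math.log(0.001),  # log P(W), assumed value
}

# The acoustic likelihood alone and the combined score can disagree.
acoustic_only = max(acoustic_loglik, key=acoustic_loglik.get)
best = best_word_sequence(acoustic_loglik, lm_logprob)
print(acoustic_only)
print(best)
```

With these assumed numbers the acoustic likelihood alone prefers one candidate, while adding the prior P(W) from the language model changes the choice.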

3. Understanding the speech recognition process

The linked article covers this well: click to open the link

3.1 Summary of the decoding stage

The decoding stage can be summarized as:

Summary from the textbook

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Focus only on the highlighted parts

In the decoding phase, we take the acoustic model (AM), which consists of this sequence of acoustic likelihoods, plus an HMM dictionary of word pronunciations, combined with the language model (LM) (generally an N-gram grammar), and output the most likely sequence of words.

acoustic model (AM): consists of this sequence of acoustic likelihoods

HMM dictionary of word pronunciations: this is the lexicon

An HMM dictionary is a list of word pronunciations.

Each pronunciation is represented by a string of phones.

Each word can then be thought of as an HMM, where the phones (or sometimes subphones) are states in the HMM, and the Gaussian likelihood estimators supply the HMM output likelihood function for each state.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
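The idea that each word can be thought of as an HMM whose states are phones or subphones can be sketched as follows. This is only an illustration: the lexicon entries, the `word_hmm` helper, the three-subphone split, and the 0.5 self-loop probability are all assumptions, and the Gaussian output likelihoods attached to each state are left out.

```python
# Hypothetical pronunciation lexicon; the entries are illustrative only.
LEXICON = {
    "six": ["s", "ih", "k", "s"],
    "one": ["w", "ah", "n"],
}

def word_hmm(word, lexicon, subphones_per_phone=1, self_loop=0.5):
    """Build a left-to-right HMM for one word.

    Returns (states, transitions). states is a list of labels, one per
    phone or subphone. transitions is a list of (from_index, to_index, prob);
    the index len(states) stands for leaving the word.
    """
    states = [
        f"{phone}.{k}"
        for phone in lexicon[word]
        for k in range(subphones_per_phone)
    ]
    transitions = []
    for i in range(len(states)):
        transitions.append((i, i, self_loop))            # stay in the same state
        transitions.append((i, i + 1, 1.0 - self_loop))  # advance to the next state
    return states, transitions

states, transitions = word_hmm("six", LEXICON, subphones_per_phone=3)
print(states)
print(transitions[:4])
```

States are addressed by index rather than by label because the same phone can occur more than once in a word, as "s" does in this assumed pronunciation of "six".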

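4. What is the purpose of the self-loop state in HMMs for speech recognition?

The speech to be recognized is split into frames (roughly 20 ms per frame), and adjacent frames overlap (by 10 ms), so the same phone may appear in several consecutive frames. Phones / subphones are the states of the HMM, and this is where self-loops come in: while consecutive frames belong to the same phone or subphone, the HMM stays in the corresponding state by taking its self-loop transition.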
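A rough calculation makes the need for self-loops concrete. The sketch below uses the framing figures from the paragraph above (20 ms frames overlapping by 10 ms, so a 10 ms shift); the function names and the example durations and probabilities are assumptions chosen for illustration.

```python
def num_frames(duration_ms, frame_ms=20, shift_ms=10):
    """Number of complete frames that fit into a segment of the given length."""
    if duration_ms < frame_ms:
        return 0
    return 1 + (duration_ms - frame_ms) // shift_ms

def expected_frames_in_state(self_loop_prob):
    """Mean number of frames spent in a state whose self-loop has this probability.

    The stay time is geometrically distributed, so the mean is 1 / (1 - p).
    """
    return 1.0 / (1.0 - self_loop_prob)

# A phone lasting 100 ms (an assumed duration) covers several frames,
# so a state must be able to emit more than one frame before moving on.
print(num_frames(100))
print(expected_frames_in_state(0.75))
```

A state without a self-loop could account for exactly one frame, which is far shorter than a typical phone; a higher self-loop probability corresponds to a longer expected stay in that state.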

5. Why can simple speech recognition tasks "build an HMM whose states correspond to entire words"?