CrossWord of AM training

小排_611

Category: Speech Recognition

Article link: https://blog.csdn.net/weixin_37355348/article/details/77415724

General Framework for Acoustic Modeling

Building an ASR system incrementally:

Context-independent ➔ Context-dependent modeling
Mono-phone ➔ Tri-phone HMM
Single Gaussian per state ➔ Multiple-component Gaussian mixture per state
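
To make the mono-phone ➔ tri-phone step concrete, here is a minimal sketch that expands per-word phone sequences into context-dependent labels. It assumes the HTK-style `l-p+r` naming (left context, phone, right context); the function name `to_triphones` and the example words are hypothetical, and silence/short-pause handling is left out. With `cross_word=True` the context spans word boundaries, otherwise it stops at them.

```python
def to_triphones(word_phones, cross_word=True):
    """Expand phone sequences into l-p+r style context-dependent labels.

    word_phones: list of words, each a list of mono-phone names.
    cross_word:  if True, context crosses word boundaries (cross-word
                 triphones); if False, each word is expanded on its own
                 (word-internal triphones).
    """
    if cross_word:
        # Treat the whole utterance as one phone sequence.
        units = [[p for word in word_phones for p in word]]
    else:
        units = word_phones

    labels = []
    for seq in units:
        for i, phone in enumerate(seq):
            left = seq[i - 1] + "-" if i > 0 else ""
            right = "+" + seq[i + 1] if i < len(seq) - 1 else ""
            labels.append(left + phone + right)
    return labels


# Illustrative utterance "the cat" with assumed pronunciations.
utterance = [["dh", "ax"], ["k", "ae", "t"]]
cross = to_triphones(utterance, cross_word=True)
internal = to_triphones(utterance, cross_word=False)
```

In the cross-word case the last phone of "the" and the first phone of "cat" see each other as context (`dh-ax+k`, `ax-k+ae`); in the word-internal case they become the biphones `dh-ax` and `k+ae`.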

Context-independent Modeling

Flowchart for Crossword Modeling:

Forced Alignment:

Input:
Word-level transcription
Lexicon/dictionary
Multiple pronunciations per word, e.g. Z (z eh d vs. z iy)
HMMs

Output:
Phoneme-level transcription of the actual pronunciation, with time boundaries

To deal with the issue of imprecise transcription:

Initially, the HMMs are trained on the basis of one fixed pronunciation per word.

To determine the actual pronunciations in the utterances used to train the HMM system:
HVite is used in forced alignment mode to select the best-matching pronunciations.
The new phone-level transcriptions can then be used to retrain the HMMs.
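
The idea behind forced alignment can be sketched with a toy Viterbi pass. This is not HVite itself: it assumes one state per phone, no transition probabilities, and hand-made per-frame log-likelihoods in place of real HMM output densities. The names `force_align` and `pick_pronunciation` are hypothetical. Each pronunciation variant is aligned to the frames, and the best-scoring variant is kept together with its time boundaries.

```python
import math


def force_align(phones, frame_scores):
    """Viterbi-align a left-to-right phone sequence to frames.

    frame_scores[t][p] is a toy log-likelihood of phone p at frame t.
    Returns (log_score, [(phone, start_frame, end_frame), ...]).
    """
    T, N = len(frame_scores), len(phones)
    if N == 0 or T < N:
        return -math.inf, []
    best = [[-math.inf] * N for _ in range(T)]
    entered = [[False] * N for _ in range(T)]  # True: phone i starts at frame t
    best[0][0] = frame_scores[0][phones[0]]
    for t in range(1, T):
        for i in range(N):
            stay = best[t - 1][i]
            move = best[t - 1][i - 1] if i > 0 else -math.inf
            prev = stay
            if move > stay:
                prev, entered[t][i] = move, True
            if prev > -math.inf:
                best[t][i] = prev + frame_scores[t][phones[i]]

    # Backtrace from the last phone at the last frame to recover boundaries.
    segments, i, end = [], N - 1, T - 1
    for t in range(T - 1, 0, -1):
        if entered[t][i]:
            segments.append((phones[i], t, end))
            end, i = t - 1, i - 1
    segments.append((phones[i], 0, end))
    segments.reverse()
    return best[T - 1][N - 1], segments


def pick_pronunciation(variants, frame_scores):
    """Align every variant and return (score, segments) of the best one."""
    return max((force_align(v, frame_scores) for v in variants),
               key=lambda result: result[0])


# Toy data: six frames in which the speaker actually said "z iy".
spoken = ["z", "z", "iy", "iy", "iy", "iy"]
phone_set = ["z", "eh", "d", "iy"]
frame_scores = [{p: (-1.0 if p == truth else -5.0) for p in phone_set}
                for truth in spoken]

variants = [["z", "eh", "d"], ["z", "iy"]]  # lexicon entries for "Z"
score, segments = pick_pronunciation(variants, frame_scores)
```

Here the variant `z iy` wins because every frame matches one of its phones, and `segments` carries the frame range of each phone, which corresponds to the phone-level transcription with time boundaries described above.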

Transcription snippets: