Crossword Acoustic Model (AM) Training

General Framework for Acoustic Modeling


Building an ASR system incrementally:

Context-independent ➔ Context-dependent modeling
Mono-phone ➔ Tri-phone HMM
Single Gaussian mixture per state ➔ Multiple Gaussian mixtures per state
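
The monophone-to-triphone step can be illustrated with a small sketch. It assumes that "crossword" here carries its usual meaning, namely that the left and right phone context extends across word boundaries, and it uses the HTK-style `l-c+r` label notation. The function name `triphone_labels` and the toy phone sequences are hypothetical, chosen only for illustration.

```python
def triphone_labels(words, cross_word=True):
    """Expand monophones into l-c+r context-dependent labels.

    words: list of phone lists, one list per word (toy input).
    cross_word=True lets context span word boundaries;
    cross_word=False restricts context to within each word.
    """
    if cross_word:
        # Treat the whole utterance as one phone string.
        groups = [[p for w in words for p in w]]
    else:
        # Each word is expanded independently.
        groups = words

    labels = []
    for group in groups:
        for i, centre in enumerate(group):
            label = centre
            if i > 0:
                label = f"{group[i - 1]}-{label}"   # left context
            if i < len(group) - 1:
                label = f"{label}+{group[i + 1]}"   # right context
            labels.append(label)
    return labels


# Toy utterance of two words (phone symbols are illustrative).
utterance = [["dh", "ax"], ["k", "ae", "t"]]
cross = triphone_labels(utterance, cross_word=True)
internal = triphone_labels(utterance, cross_word=False)
```

With cross-word expansion the last phone of the first word gets a right context from the next word (`dh-ax+k`), whereas the word-internal variant stops at the boundary (`dh-ax`). Silence and short-pause handling is left out of this sketch.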

Context-independent Modeling



Flowchart for Crossword Modeling:


Forced Alignment:



Input:
Word-level transcription
Lexicon/dictionary
Multiple pronunciations
e.g. "Z" (z eh d vs. z iy)
HMMs

Output:
Phoneme-level transcription of the actual pronunciation, with time boundaries
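
As a rough illustration of what such time-aligned output looks like, the sketch below parses label lines of the form `start end phone`. It assumes the HTK label convention in which times are integers in units of 100 ns; the helper name `parse_htk_labels` and the sample times are made up for illustration and do not come from a real alignment.

```python
def parse_htk_labels(text):
    """Parse 'start end label' lines into (label, start_s, end_s) tuples.

    Assumes HTK-style times expressed in 100 ns units.
    """
    units_per_second = 1e7  # 100 ns units in one second
    segments = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip lines that carry no time boundaries
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        segments.append((label, start / units_per_second, end / units_per_second))
    return segments


# Made-up alignment for the pronunciation "z iy".
sample = """
0 2500000 z
2500000 4000000 iy
"""
segments = parse_htk_labels(sample)
```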


To deal with the issue of imprecise transcription:

Initially, HMMs are trained on the basis of one fixed pronunciation per word.

To determine the actual pronunciations in the utterances used to train the HMM system, HVite is used in forced alignment mode to select the best-matching pronunciations.
The new phone-level transcriptions can then be used to retrain the HMMs.
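
The idea of selecting the best-matching pronunciation can be sketched as follows. This is not how HVite works internally (it runs Viterbi decoding over a pronunciation network built from the HMMs); the sketch simply enumerates every pronunciation combination and keeps the highest-scoring one. The lexicon entries, the `toy_score` function and the "observed" phone string are all hypothetical stand-ins for real acoustic likelihoods.

```python
from itertools import product

# Toy lexicon: "z" has the two pronunciations mentioned above;
# the entry for "is" is illustrative.
LEXICON = {
    "z": [["z", "eh", "d"], ["z", "iy"]],
    "is": [["ih", "z"]],
}


def select_pronunciations(words, lexicon, score):
    """Return the pronunciation per word that maximises score(phones)."""
    candidates = product(*(lexicon[w] for w in words))

    def flat_score(combo):
        # Score the concatenated phone string of this combination.
        return score([p for pron in combo for p in pron])

    return list(max(candidates, key=flat_score))


def toy_score(phones, observed=("z", "iy", "ih", "z")):
    """Stand-in for an acoustic score: fewer mismatches is better."""
    mismatches = sum(a != b for a, b in zip(phones, observed))
    return -(mismatches + abs(len(phones) - len(observed)))


best = select_pronunciations(["z", "is"], LEXICON, toy_score)
```

Here `best` holds the chosen pronunciation for each word; flattening it gives the new phone-level transcription that would be used for retraining.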


Transcription snippets:


Workflow of Crossword Acoustic Modeling in Autotrain

Input of Crossword Training:



Stage 1 - Generate phone-based transcriptions:



Stage 2 - Generate monophone HMMs:


Stage 3 - Generate triphone HMMs and transcriptions:




Stage 4 - Build fully trained triphone HMMs:


Stage 5 - TrainingPriors:



Stage 6 - Gender-specific HMMs:



Output of Crossword Training






