1. Main topics covered in this part
1.1 WFST Key Concepts
- determinization
- minimization
- composition
- equivalent
- epsilon-free
- functional
- on-demand algorithm
- weight-pushing
- epsilon removal
1.2 HMM Key Concepts
- Markov Chain
- Hidden Markov Model
- Forward-backward algorithm
- Viterbi algorithm
- E-M for mixture of Gaussians
2. HCLG
L.fst: The Phonetic Dictionary FST
L maps monophone sequences to words.
The file L.fst is the Finite State Transducer form of the lexicon with phone symbols on the input and word symbols on the output.
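To make the shape of L more concrete, here is a minimal sketch that writes a toy lexicon transducer in OpenFst text format (one arc per line as "src dst input output", then the final state). It assumes a simplified layout with no optional silence, no pronunciation weights and no disambiguation symbols; the function name MakeToyLexiconFst and the LexiconEntry alias are made up for illustration and are not part of Kaldi.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One lexicon entry: a word and its pronunciation as a phone sequence.
using LexiconEntry = std::pair<std::string, std::vector<std::string>>;

// Emits a toy lexicon transducer in OpenFst text format.
// State 0 is both the start state and the only final state. Each
// pronunciation is a chain of arcs that leaves state 0 and returns to it.
// The word is emitted on the first arc, <eps> on the remaining arcs.
// Simplifying assumption: no silence, no weights, no disambiguation symbols.
std::vector<std::string> MakeToyLexiconFst(
    const std::vector<LexiconEntry>& lexicon) {
  std::vector<std::string> lines;
  int next_state = 1;  // state 0 is reserved for the loop state
  for (const LexiconEntry& entry : lexicon) {
    const std::vector<std::string>& phones = entry.second;
    assert(!phones.empty());
    int src = 0;
    for (std::size_t i = 0; i < phones.size(); ++i) {
      const bool last = (i + 1 == phones.size());
      const int dst = last ? 0 : next_state++;
      const std::string out = (i == 0) ? entry.first : std::string("<eps>");
      lines.push_back(std::to_string(src) + " " + std::to_string(dst) + " " +
                      phones[i] + " " + out);
      src = dst;
    }
  }
  lines.push_back("0");  // final state
  return lines;
}
```

For a lexicon with "cat" pronounced k ae t, this produces the arcs "0 1 k cat", "1 2 ae <eps>" and "2 0 t <eps>": phones are consumed on the input side and the word appears once on the output side.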
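L_disambig.fst: The Phonetic Dictionary with Disambiguation Symbols FST
The lexicon with disambiguation symbols (#1, #2, ...) added, so that the composed graph stays determinizable, for example when two words share a pronunciation.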
G.fst: The Language Model FST
FSA grammar (can be built from an n-gram grammar).
C.fst: The Context FST
C maps triphone sequences to monophones.
Expands the phones into context-dependent phones.
H.fst: The HMM FST
H maps multiple HMM states (a.k.a. transition-ids in Kaldi-speak) to context-dependent triphones.
Expands out the HMMs. The output side carries the context-dependent phones and the input side carries the transition-ids, which encode the pdf-ids.
HCLG.fst: final graph
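To summarize:
Graph construction order: G -> L -> C -> H
G: an acceptor (its input and output symbols are identical) that encodes the grammar or language model.
L: the lexicon; its output symbols are words and its input symbols are phones.
C: context dependency; its output symbols are phones and its input symbols represent context-dependent phones.
For example: vector<int32> ctx_window = { 12, 15, 21 };
Meaning: the phone with id = 15 is the central phone, the left phone has id = 12, and the right phone has id = 21.
H: contains the HMM definitions; its output symbols are context-dependent phones and its input symbols are transition-ids (ids that encode the pdf-id and other information).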
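The ctx_window example under C can be unpacked as in the following minimal sketch. It assumes a triphone setup, that is, a context width of 3 with the central phone at 0-based position 1 (what Kaldi calls N = 3 and P = 1). The Triphone struct and the SplitWindow helper are hypothetical names used only for illustration, and int32 is aliased to std::int32_t as a stand-in for Kaldi's typedef.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using int32 = std::int32_t;  // stand-in for Kaldi's int32 typedef

// Hypothetical helper type: the three phone ids of a triphone window.
struct Triphone {
  int32 left;
  int32 center;
  int32 right;
};

// Splits a width-3 context window into left, central and right phone ids.
// Assumes the central position is 1 (0-based), as in a standard triphone.
Triphone SplitWindow(const std::vector<int32>& ctx_window) {
  assert(ctx_window.size() == 3);  // N = 3 for triphones
  const int central_position = 1;  // P = 1
  return Triphone{ctx_window[central_position - 1],
                  ctx_window[central_position],
                  ctx_window[central_position + 1]};
}
```

Calling SplitWindow on { 12, 15, 21 } gives left = 12, center = 15 and right = 21, matching the reading given above.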
In the formula below, det is determinization, min is minimization and o is composition; asl means "add-self-loops", rds means "remove-disambiguation-symbols", and H' is H without the self-loops:
HCLG = asl(min(rds(det(H' o min(det(C o min(det(L o G))))))))
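The o operator in this formula is transducer composition. The sketch below shows the idea on toy, epsilon-free transducers whose arc costs add along a path (tropical-semiring style). It is only an illustration under those assumptions: the Arc and Fst structs and the Compose function are invented for this example, and real Kaldi graphs rely on OpenFst, which also handles epsilon labels.

```cpp
#include <algorithm>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// One arc of a toy transducer; costs are added along a path.
struct Arc {
  int src;
  int dst;
  std::string in;
  std::string out;
  double cost;
};

// A toy epsilon-free transducer with integer states.
struct Fst {
  int start = 0;
  std::vector<int> finals;
  std::vector<Arc> arcs;
};

bool IsFinal(const Fst& fst, int state) {
  return std::find(fst.finals.begin(), fst.finals.end(), state) !=
         fst.finals.end();
}

// Composes a and b: an arc of a is paired with an arc of b whenever the
// output label of the first equals the input label of the second.
// Only pair states reachable from the paired start state are created.
Fst Compose(const Fst& a, const Fst& b) {
  std::map<std::pair<int, int>, int> ids;  // (state of a, state of b) -> id
  std::queue<std::pair<int, int>> todo;
  auto state_id = [&ids, &todo](int sa, int sb) -> int {
    const std::pair<int, int> key(sa, sb);
    auto it = ids.find(key);
    if (it != ids.end()) return it->second;
    const int id = static_cast<int>(ids.size());
    ids[key] = id;
    todo.push(key);
    return id;
  };

  Fst result;
  result.start = state_id(a.start, b.start);
  while (!todo.empty()) {
    const std::pair<int, int> cur = todo.front();
    todo.pop();
    const int src = ids[cur];
    // A pair state is final only if both component states are final.
    if (IsFinal(a, cur.first) && IsFinal(b, cur.second)) {
      result.finals.push_back(src);
    }
    for (const Arc& x : a.arcs) {
      if (x.src != cur.first) continue;
      for (const Arc& y : b.arcs) {
        if (y.src != cur.second || x.out != y.in) continue;
        const int dst = state_id(x.dst, y.dst);
        result.arcs.push_back({src, dst, x.in, y.out, x.cost + y.cost});
      }
    }
  }
  return result;
}
```

The result reads the input labels of the first transducer and writes the output labels of the second, which is why chaining H', C, L and G yields a single graph from transition-ids to words.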
Reposted from: http://blog.csdn.net/dearwind153/article/details/70053704