1. Jensen's inequality
If $f$ is a convex function and $X$ is a random variable, then $E[f(X)] \ge f(E[X])$.
If $f$ is continuous and twice differentiable, then $f$ is convex if and only if $f''(x) \ge 0$ (in the case of vector-valued inputs, the Hessian matrix $H$ must be positive semi-definite, $H \succeq 0$).
In the following, the function we will use is $f(x) = \log x$. It is a concave function, since $f''(x) = -1/x^2 < 0$. Using Jensen's inequality (which reverses direction for concave functions), we get: $\log E[X] \ge E[\log X]$.
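As a quick numerical sanity check of that inequality (a sketch, assuming NumPy is available; the exponential distribution and sample size are arbitrary choices, not part of the derivation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # samples of a positive random variable X

# For the concave function log, Jensen's inequality gives log E[X] >= E[log X].
print(np.log(np.mean(x)))  # estimate of log E[X] (about 0.69 here)
print(np.mean(np.log(x)))  # estimate of E[log X] (about 0.12, smaller as expected)
```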
2. EM algorithm
Suppose the probability density function of $x$ is parameterized by $\theta$, i.e. $p(x;\theta)$. We want to estimate the value of $\theta$.
Given a training set $\{x^{(1)}, \dots, x^{(m)}\}$, the log-likelihood function is $\ell(\theta) = \sum_{i=1}^{m} \log p(x^{(i)};\theta) = \sum_{i=1}^{m} \log \sum_{z^{(i)}} p(x^{(i)}, z^{(i)};\theta)$, where $z$ is the latent random variable.
Because $z$ is not observed, explicitly finding the maximum likelihood estimate of the parameters is very hard, so we will use the EM algorithm. The related derivation is as follows:
$z^{(i)}$ is another random variable, and it has some distribution $Q_i$ satisfying $\sum_{z} Q_i(z) = 1$ and $Q_i(z) \ge 0$. Multiplying and dividing each term by $Q_i(z^{(i)})$, and then applying Jensen's inequality to the concave function $\log$, gives:

$$\ell(\theta) = \sum_{i=1}^{m} \log \sum_{z^{(i)}} Q_i(z^{(i)}) \frac{p(x^{(i)}, z^{(i)};\theta)}{Q_i(z^{(i)})} \ge \sum_{i=1}^{m} \sum_{z^{(i)}} Q_i(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)};\theta)}{Q_i(z^{(i)})}$$

The inequality holds because the inner sum is an expectation $E_{z^{(i)} \sim Q_i}\!\left[\frac{p(x^{(i)}, z^{(i)};\theta)}{Q_i(z^{(i)})}\right]$, and $\log E[X] \ge E[\log X]$.
So far, the point of this derivation is to construct a lower bound for $\ell(\theta)$, the function we want to maximize.
To make the bound tight at a particular $\theta$, we need the Jensen step to hold with equality, which happens when the term inside the expectation is a constant-valued random variable. I.e., let

$$\frac{p(x^{(i)}, z^{(i)};\theta)}{Q_i(z^{(i)})} = c$$

for some constant $c$ that does not depend on $z^{(i)}$. We get $Q_i(z^{(i)}) \propto p(x^{(i)}, z^{(i)};\theta)$. Further, since $\sum_{z} Q_i(z) = 1$, we can get:

$$Q_i(z^{(i)}) = \frac{p(x^{(i)}, z^{(i)};\theta)}{\sum_{z} p(x^{(i)}, z;\theta)} = \frac{p(x^{(i)}, z^{(i)};\theta)}{p(x^{(i)};\theta)} = p(z^{(i)} \mid x^{(i)};\theta)$$

That is, the tight bound is achieved by setting $Q_i$ to the posterior distribution of $z^{(i)}$ given $x^{(i)}$.
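To make this concrete, here is a small sketch of the $Q_i$ computation for a toy two-component 1-D Gaussian mixture, where $p(x, z;\theta) = \phi_z \, \mathcal{N}(x; \mu_z, \sigma_z^2)$ (assuming NumPy and SciPy; all parameter values and data points below are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

# Toy parameters theta = (phi, mu, sigma) of a two-component mixture.
phi = np.array([0.3, 0.7])      # mixing weights p(z)
mu = np.array([-2.0, 3.0])      # component means
sigma = np.array([1.0, 1.5])    # component standard deviations

x = np.array([0.5, -1.0, 4.0])  # observed points x^(i)

# Joint p(x^(i), z; theta) for each data point and each value of z.
joint = phi * norm.pdf(x[:, None], loc=mu, scale=sigma)  # shape (m, 2)

# Q_i(z) = p(z | x^(i); theta): normalize the joint over z.
Q = joint / joint.sum(axis=1, keepdims=True)
print(Q)  # each row is a posterior over z and sums to 1
```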
3. The process of EM
Repeat until convergence{
E-step: For each $i$, set $Q_i(z^{(i)}) := p(z^{(i)} \mid x^{(i)};\theta)$
M-step: Set $\theta := \arg\max_{\theta} \sum_{i} \sum_{z^{(i)}} Q_i(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)};\theta)}{Q_i(z^{(i)})}$
}
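Putting the two steps together, here is a minimal runnable sketch of this loop for a two-component 1-D Gaussian mixture (the model I want to study next), assuming NumPy and SciPy; the function name `em_gmm_1d`, the initialization, and the fixed iteration count are my own illustrative choices:

```python
import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, n_iter=50, seed=0):
    """EM for a two-component 1-D Gaussian mixture (a sketch, not a robust implementation)."""
    rng = np.random.default_rng(seed)
    # Crude initialization of theta = (phi, mu, sigma).
    phi = np.array([0.5, 0.5])
    mu = rng.choice(x, size=2, replace=False)
    sigma = np.array([x.std(), x.std()])

    for _ in range(n_iter):
        # E-step: Q_i(z) = p(z | x^(i); theta), the posterior responsibilities.
        joint = phi * norm.pdf(x[:, None], loc=mu, scale=sigma)
        Q = joint / joint.sum(axis=1, keepdims=True)

        # M-step: maximize the lower bound over theta; for a Gaussian
        # mixture this has the closed-form weighted-average updates below.
        Nk = Q.sum(axis=0)
        phi = Nk / len(x)
        mu = (Q * x[:, None]).sum(axis=0) / Nk
        sigma = np.sqrt((Q * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

    return phi, mu, sigma

# Usage: sample from a known mixture and check that EM roughly recovers it.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.5, 700)])
print(em_gmm_1d(x))  # roughly phi=(0.3, 0.7), mu=(-2, 3), sigma=(1, 1.5), up to component ordering
```

Note that the E-step here is exactly the posterior computation from the previous section, and the M-step maximization happens to have a closed form for a Gaussian mixture; in general the M-step may itself require numerical optimization.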
PS: The key points of this article have finally come together. To understand the algorithm more deeply, I still need to work through it in the context of the mixture of Gaussians and the mixture of naive Bayes models.
As for this particular post, though, the writing is a mess. This is my first attempt at writing anything in English; I have produced essentially no English before, and I can really feel how weak my English is. But so much first-hand, cutting-edge material is in English, and I will need to publish papers eventually, so I have to keep practicing hard. Keep going!