1.16 The EM Algorithm (Expectation-Maximization)

1. Jensen's inequality

If $f$ is a convex function and $X$ is a random variable, then $E[f(X)]\geq f(E[X])$.

If $f$ is twice differentiable, then $f$ is convex if and only if $f''\geq 0$ (in the case of $f$ taking vector-valued inputs, the condition is that the Hessian matrix $H$ is positive semi-definite).

In the following, the function we will use is $f(x)=\log x$. It is a concave function, since $f''=-\frac{1}{x^2}<0$. For concave functions Jensen's inequality reverses, so we get $\log(E[X])\geq E[\log X]$.
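As a quick sanity check (not part of the derivation), here is a small Python snippet that estimates both sides of $\log(E[X])\geq E[\log X]$ by Monte Carlo; the exponential distribution and sample size are arbitrary choices for illustration.

```python
import numpy as np

# Numerical check of Jensen's inequality for the concave function log:
# log(E[X]) >= E[log(X)] for a positive random variable X.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # positive samples (arbitrary choice)

lhs = np.log(np.mean(x))     # Monte Carlo estimate of log(E[X])
rhs = np.mean(np.log(x))     # Monte Carlo estimate of E[log(X)]
print(lhs, rhs, lhs >= rhs)  # expect lhs >= rhs, i.e. True
```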

2. EM algorithm

Suppose the probability density function of $x$ is $p(x)$, parameterized by $\theta$, i.e. $p(x;\theta)$. We want to estimate the value of $\theta$.

The log-likelihood function is $l(\theta)=\sum_{i=1}^{m}\log p(x^i;\theta)=\sum_{i=1}^{m}\log\left(\sum_{z} p(x^i,z;\theta)\right)$, where $z$ is a latent random variable.
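As a concrete (hypothetical) example of such a likelihood, take a two-component 1-D Gaussian mixture, where the latent $z\in\{0,1\}$ indexes the component and $p(x,z;\theta)=\pi_z\,\mathcal{N}(x;\mu_z,\sigma_z)$; the sketch below just evaluates $l(\theta)$ for some toy data and parameters.

```python
import numpy as np
from scipy.stats import norm

# Marginal log-likelihood l(theta) = sum_i log( sum_z pi_z * N(x_i; mu_z, sigma_z) )
# for a hypothetical two-component 1-D Gaussian mixture.
def mixture_log_likelihood(x, pi, mu, sigma):
    per_component = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)  # p(x_i, z; theta), shape (m, K)
    return np.sum(np.log(per_component.sum(axis=1)))                # marginalize over z, then log

x = np.array([-1.2, 0.3, 2.5, 3.1])  # toy observations
print(mixture_log_likelihood(x,
                             pi=np.array([0.4, 0.6]),
                             mu=np.array([0.0, 3.0]),
                             sigma=np.array([1.0, 1.0])))
```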

Because $z$ is not observed, explicitly finding the maximum likelihood estimate of the parameters $\theta$ is very hard, so we use the EM algorithm. The derivation is as follows:

Let $Q_i$ be some distribution over the latent variable $z^i$, so that $\sum_{z}Q_i(z)=1$ and $Q_i(z)\geq 0$.

$\sum_i \log p(x^i;\theta)=\sum_i \log \sum_{z^i} p(x^i,z^i;\theta)=\sum_i \log \sum_{z^i} Q_i(z^i)\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}$

$=\sum_i \log\left(E_{z^i\sim Q_i}\!\left[\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}\right]\right) \geq \sum_i E_{z^i\sim Q_i}\!\left[\log\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}\right] = \sum_i \sum_{z^i} Q_i(z^i)\log\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}$,

where the inequality follows from Jensen's inequality applied to the concave function $\log$.

So far, the aim of this derivation has been to construct a lower bound on $l(\theta)$, the function we want to maximize, that holds for any choice of the distributions $Q_i$.

To make the bound tight (hold with equality), the Jensen's-inequality step must become an equality, which happens when $\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}$ is a constant-valued random variable, i.e. $\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}=c$ for some constant $c$ that does not depend on $z^i$.

This gives $Q_i(z^i)\propto p(x^i,z^i;\theta)$. Further, since $\sum_{z}Q_i(z)=1$, we get $Q_i(z^i)=\frac{p(x^i,z^i;\theta)}{\sum_{z}p(x^i,z;\theta)}=\frac{p(x^i,z^i;\theta)}{p(x^i;\theta)}=p(z^i|x^i;\theta)$, i.e. the posterior distribution of $z^i$ given $x^i$.
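For the same hypothetical Gaussian mixture as above, this choice of $Q_i$ is simply the normalized joint, i.e. the posterior "responsibility" of each component for each data point; a minimal sketch:

```python
import numpy as np
from scipy.stats import norm

# E-step for the toy two-component Gaussian mixture:
# Q_i(z) = p(z | x_i; theta) = p(x_i, z; theta) / sum_z' p(x_i, z'; theta)
def e_step(x, pi, mu, sigma):
    joint = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)  # p(x_i, z; theta), shape (m, K)
    return joint / joint.sum(axis=1, keepdims=True)         # normalize so each row sums to 1

x = np.array([-1.2, 0.3, 2.5, 3.1])
Q = e_step(x, pi=np.array([0.4, 0.6]), mu=np.array([0.0, 3.0]), sigma=np.array([1.0, 1.0]))
print(Q)  # Q[i, z] = posterior probability that x_i came from component z
```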

3. The process of EM

Repeat until convergence {

E-step: For each $i$, set $Q_i(z^i)=p(z^i|x^i;\theta)$.

M-step: Set $\theta := \arg\max_{\theta}\sum_i \sum_{z^i} Q_i(z^i)\log\frac{p(x^i,z^i;\theta)}{Q_i(z^i)}$.

}
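Putting the two steps together, here is a minimal, illustrative EM loop for the same hypothetical two-component 1-D Gaussian mixture. The closed-form M-step updates (mixing weights, weighted means, and weighted standard deviations) are specific to that model, not part of the general derivation above, and this sketch omits convergence checks and numerical safeguards.

```python
import numpy as np
from scipy.stats import norm

def em_gaussian_mixture(x, pi, mu, sigma, n_iter=100):
    """Illustrative EM for a K-component 1-D Gaussian mixture (no convergence check)."""
    for _ in range(n_iter):
        # E-step: Q[i, z] = p(z | x_i; theta)
        joint = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)
        Q = joint / joint.sum(axis=1, keepdims=True)

        # M-step: closed-form maximizers of the lower bound for this particular model
        Nk = Q.sum(axis=0)                                              # effective counts per component
        pi = Nk / len(x)                                                # mixing weights
        mu = (Q * x[:, None]).sum(axis=0) / Nk                          # weighted means
        sigma = np.sqrt((Q * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)  # weighted std devs
    return pi, mu, sigma

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(4.0, 0.5, 300)])
print(em_gaussian_mixture(x, pi=np.array([0.5, 0.5]),
                          mu=np.array([-1.0, 1.0]),
                          sigma=np.array([1.0, 1.0])))
```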

 

PS: I have finally pieced together the key points of this article. I still need to study the Gaussian mixture model and the Bayes mixture model to understand this algorithm more deeply.

As for this particular post, though, the writing is a mess. This is my first attempt at writing something in English, and I have never really produced English before, so I feel keenly how weak my English is. But a lot of first-hand, cutting-edge material is in English, and I will need to publish papers eventually, so I have to keep practicing my English. Keep going!

 
