EM Algorithm

Expectation Maximization Algorithm

The EM algorithm is quite different from everything covered so far; it is more of a general idea than a single procedure, so the sections below explain it through several examples, with particular focus on the Gaussian Mixture Model (GMM).

① Maximum Likelihood Estimation

Maximum likelihood estimation is used heavily here. Suppose we want to know the distribution of our students' heights. First assume the heights follow a Gaussian distribution $N(\mu, \sigma^2)$; what we need to do is estimate these two parameters. With so many students, measuring everyone one by one is clearly impractical, so naturally we sample.
To survey the heights, we draw a sample of 200 students.
To estimate the parameters, first write down the probability of drawing these 200 people. The probability of drawing student A is $p(x_A; \theta)$ and that of drawing student B is $p(x_B; \theta)$, so the probability of drawing both at the same time is $p(x_A; \theta)\,p(x_B; \theta)$, and the probability of drawing all 200 students together is the likelihood $L(\theta) = \prod_{i=1}^{200} p(x_i; \theta)$.
Finally, just take a logarithm: $\ell(\theta) = \sum_{i=1}^{200} \log p(x_i; \theta)$.

Notation and the log

One of the formulas above contains both ';' and '|', and the difference between the two symbols is actually significant. '|' is normally used for conditional probability: for example, $P(x \mid \theta)$ is the probability of x occurring given the condition $\theta$.
The semicolon ';' indicates that what follows it is a parameter to be estimated: $P(x; \theta)$ means $\theta$ is an unknown parameter rather than a condition. '|' can carry this second meaning as well: when it is not being used for conditional probability, it likewise marks the parameters to be estimated after it.

The two notations come from the differing views of two schools of thought:
Frequentists hold that a parameter is a fixed value: in the real world, the parameter simply is some definite number.
Bayesians hold that a parameter is a random variable: each value it takes has some probability. Either way, whether written with ';' or '|', both express the same conditioning relationship; the two schools just interpret it differently.
With the notation settled, on to the log. Why take logarithms? Strictly speaking, you do not have to, but the log brings several benefits (illustrated in the sketch after this list):
1. Simpler computation: long products are hard to work with, and the log turns them into sums.
2. A product of probabilities quickly becomes an extremely small number; taking the log avoids this numerical underflow.
3. The log rescales the gradient by the positive factor $1/L(\theta)$, so the ascent direction and the location of the maximum are unchanged; the optimization is essentially unaffected.
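
To make points 1 and 2 concrete, here is a minimal sketch (assuming NumPy and SciPy are available; the 200 heights are simulated) comparing the raw product of densities with the log-likelihood:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
heights = rng.normal(loc=170.0, scale=8.0, size=200)  # simulated sample of 200 heights (cm)

mu, sigma = 170.0, 8.0                       # candidate parameters theta = (mu, sigma)
densities = norm.pdf(heights, loc=mu, scale=sigma)

likelihood = np.prod(densities)              # L(theta) = prod_i p(x_i; theta)
log_likelihood = np.sum(np.log(densities))   # l(theta) = sum_i log p(x_i; theta)

print(likelihood)      # astronomically small (around 1e-300); a few hundred more samples and it hits 0.0
print(log_likelihood)  # an ordinary finite number, around -700
```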

The steps for maximum likelihood estimation (carried out in the sketch below):
1. Write down the likelihood function.
2. Take the log and simplify.
3. Differentiate and set the derivative to zero.
4. Solve the equations to obtain the estimate.
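
A sketch of these four steps on the height example, under the same simulated-data assumption as above. Writing the Gaussian log-likelihood (steps 1 and 2) and setting its derivatives to zero (steps 3 and 4) yields the closed forms below; a numerical optimizer cross-checks them:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.normal(170.0, 8.0, size=200)  # simulated heights

# Steps 3-4 in closed form: solving dl/dmu = 0 and dl/dsigma = 0 gives
mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))  # note: divides by n, not n - 1

# Cross-check by minimizing the negative log-likelihood numerically
# (the constant n/2 * log(2*pi) is dropped since it does not affect the optimum):
def neg_log_likelihood(theta):
    mu, sigma = theta
    if sigma <= 0:                    # keep the optimizer in the valid region
        return np.inf
    return len(x) * np.log(sigma) + np.sum((x - mu) ** 2) / (2 * sigma**2)

res = minimize(neg_log_likelihood, x0=[150.0, 5.0], method="Nelder-Mead")
print(mu_hat, sigma_hat)  # roughly 170 and 8
print(res.x)              # should agree to several decimals
```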

② Jensen's Inequality

First, the notion of a convex function: if $f''(x) \ge 0$ everywhere (for vector inputs, if the Hessian is positive semi-definite), then $f$ is convex. Its graph is shaped like a check mark, opening upward; it looks like what some textbooks call a concave function, but under the convention used here it is convex.

[Figure: graph of a convex function]

The key property of convex functions is then immediate: for a random variable $X$ and a convex $f$, $E[f(X)] \ge f(E[X])$ (Jensen's inequality).

[Figure: Jensen's inequality on the chord of a convex function between a and b]

This is clear from the figure: $E(X)$ is just the average over $[a, b]$, and since the chord lies above the curve, the conclusion above is easy to verify. When does equality hold? Equality means the chord is tangent to the curve; tangency means $a = b$; and once $a = b$, $X$ is naturally a constant. So $E[f(X)] = f(E[X])$ if and only if $X$ is constant (with probability 1), as the quick check below confirms numerically.
If $f$ is concave instead, the inequality simply flips direction: $E[f(X)] \le f(E[X])$.
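
A quick numerical check of the inequality and its equality condition (NumPy assumed), using the convex function $f(t) = t^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1.0, 5.0, size=100_000)  # X ~ Uniform[a, b] with a = 1, b = 5

# f(t) = t^2 is convex (f'' = 2 >= 0), so Jensen gives E[f(X)] >= f(E[X]):
print((x ** 2).mean())  # E[X^2]  -- about 10.33
print(x.mean() ** 2)    # E[X]^2  -- about 9.00

# Equality case: a constant "random" variable makes both sides coincide.
c = np.full_like(x, 3.0)
print((c ** 2).mean(), c.mean() ** 2)  # both exactly 9.0
```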

③ Derivation of the EM Algorithm

The natural starting point is a maximum likelihood objective, $\ell(\theta) = \sum_i \log P(x_i \mid \theta)$. But this is actually very hard to solve: the unknowns are completely entangled inside $P(x \mid \theta)$ and nothing comes out in closed form, so we introduce an auxiliary variable $z$.

The latent variable Z

A latent variable is one we cannot observe. For example, run a sampling experiment that draws balls from 3 bags, where the drawing itself is invisible: the drawing process is the latent variable, and that process tells us which bag each x came from, which is how the samples are distinguished. Note that this latent variable is not the same thing as the hidden sequence in an HMM: the hidden sequence is the state underlying the current x, not the process of drawing x. So here, putting it informally, z serves to identify the group an x belongs to; z tells each x which component it came from.
Also note that a latent variable cannot be added arbitrarily: adding it must not change the marginal probability. That is, going from $P(x)$ to the joint $P(x, z)$, we must have $P(x) = \sum_z P(x, z)$, as the tiny check below illustrates.
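
A tiny check of this constraint on the three-bags example (the probabilities here are made-up illustration numbers, not from the original text):

```python
import numpy as np

p_z = np.array([0.2, 0.3, 0.5])            # hypothetical P(z): which bag we draw from
p_red_given_z = np.array([0.9, 0.5, 0.1])  # hypothetical P(x = red | z) for each bag

# Marginalizing the latent variable back out must recover P(x):
p_red = np.sum(p_z * p_red_given_z)        # P(x) = sum_z P(z) * P(x | z)
print(p_red)                               # 0.38 -- introducing z left the marginal intact
```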

So the reason we introduce the latent variable is to rewrite the objective in terms of the individual Gaussian components; otherwise there is nowhere to start. Expanding the objective gives $\ell(\theta) = \sum_i \log \sum_{z_i} P(x_i, z_i; \theta)$. Since z can pin down which component x came from, we only need to solve for z.
This is where the expectation property noted in the Jensen section applies. Even after this rewriting the expression still cannot be maximized directly, because the variable z is too abstract: there is no suitable closed-form expression for it. EM's approach is to reach the goal of optimizing the objective function by optimizing a lower-bound function instead.
Since z is so abstract, we need a change of viewpoint. Every sample $x_i$ has a corresponding $z_i$; assume a distribution $Q_i(z)$ over that $z_i$, where $Q_i$ must satisfy $\sum_z Q_i(z) = 1$ and $Q_i(z) \ge 0$. So every x comes with its own distribution Q. This is a little intricate, so in more detail: one x corresponds to a set of possible z values, and each of those z values gets a probability under $Q_i$. The result is not a hard label but a probability: $Q_i(z)$ is the probability that sample $x_i$ belongs to component z. And the number of possible z values is simply the number of distributions mixed together.
To recap: the samples are the $x_i$; each $x_i$ corresponds to a set of z values, and each $x_i$ likewise corresponds to its own distribution $Q_i(z_i)$. Multiplying and dividing by $Q_i$ inside the log and applying Jensen's inequality (log is concave, so the direction flips) yields the lower bound

$$\ell(\theta) = \sum_i \log \sum_{z_i} Q_i(z_i)\,\frac{P(x_i, z_i; \theta)}{Q_i(z_i)} \;\ge\; \sum_i \sum_{z_i} Q_i(z_i) \log \frac{P(x_i, z_i; \theta)}{Q_i(z_i)},$$

which is exactly the quantity EM maximizes, as the end-to-end sketch below shows.
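
Putting the pieces together, here is a minimal EM sketch for a 1-D mixture of two Gaussians (NumPy/SciPy assumed; the data and initial values are illustrative). The E-step computes exactly the $Q_i(z)$ above, and the M-step maximizes the Jensen lower bound via weighted maximum likelihood:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(160, 5, 300), rng.normal(180, 5, 200)])

# Initial guesses for theta = (pi, mu, sigma) of the two components:
pi = np.array([0.5, 0.5])
mu = np.array([150.0, 190.0])
sigma = np.array([10.0, 10.0])

for _ in range(50):
    # E-step: Q_i(z) = P(z | x_i; theta), a soft assignment of each x_i to each component.
    weighted = pi * norm.pdf(x[:, None], mu, sigma)    # shape (n, 2)
    q = weighted / weighted.sum(axis=1, keepdims=True)

    # M-step: maximize the lower bound, i.e. weighted MLE for each component.
    n_k = q.sum(axis=0)
    pi = n_k / len(x)
    mu = (q * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((q * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)

print(pi, mu, sigma)  # should approach (0.6, 0.4), (160, 180), (5, 5)
```

Each iteration provably does not decrease $\ell(\theta)$, which is the monotonicity guarantee that makes this lower-bound strategy work.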
