1. Review
This section reviews and supplements the original post, https://blog.csdn.net/qq_16600319/article/details/121880698:
- Goal: assuming the observed data follow some distribution, estimate the parameters of that distribution.
- Input: $X=\{x_1,x_2,\dots,x_n\}$
- Output: $\theta$
- M.L.E.: $\hat\theta=\arg\max_\theta\log P(X\vert\theta)$
- Difficulty: an analytical (closed-form) solution for $\hat\theta$ is hard to obtain.
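To make the difficulty concrete, here is a minimal sketch of the log-likelihood that M.L.E. would have to maximize. It assumes a 1-D Gaussian mixture, the example this review uses; the function names `normal_pdf` and `mixture_log_likelihood` and the sample values are illustrative, not from the original post. The `log` is applied to a sum over components, so setting the derivative to zero does not separate into one equation per parameter.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Density of N(x | mu, sigma2), where sigma2 is the variance."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def mixture_log_likelihood(xs, lam, mu, sigma2):
    """log P(X | theta) = sum_i log( sum_k lam[k] * N(x_i | mu[k], sigma2[k]) )."""
    total = 0.0
    for x in xs:
        # The inner sum sits inside the log; this is what blocks a closed form.
        mixture_density = sum(
            l * normal_pdf(x, m, s2) for l, m, s2 in zip(lam, mu, sigma2)
        )
        total += math.log(mixture_density)
    return total


# Hypothetical data and parameters, for illustration only.
xs = [-2.1, -1.9, 2.0, 2.2]
good = mixture_log_likelihood(xs, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
poor = mixture_log_likelihood(xs, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

With these illustrative values, `good` is larger than `poor`, since the first parameter set places the component means near the two clusters of samples.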
- Related concepts:
  - $Z=\{z_1,z_2,\dots,z_n\}$: the latent variables
  - each $z_i$, $i=1,2,\dots,n$, is a $K$-dimensional vector whose components are the proportions of the $K$ distributions
- EM: $\theta^{(g+1)}=\arg\max_\theta\int_Z\log P(X,Z\vert\theta)\,P(Z\vert X,\theta^{(g)})\,\mathrm{d}Z$
- Gaussian mixture example:
- Find $P(X,Z\vert\theta)$:

  $$P(X,Z\vert\theta)=\prod_{i=1}^nP(x_i,z_i\vert\theta)=\prod_{i=1}^nP(x_i\vert z_i,\theta)P(z_i\vert\theta)=\prod_{i=1}^n\lambda_{z_i}N(x_i\vert\mu_{z_i},\sigma^2_{z_i})$$

  Here $z_i\in\{1,\dots,K\}$ is used as the index of the component that generated $x_i$, and $\lambda_{z_i}=P(z_i\vert\theta)$ is that component's mixing weight.
- Find $P(Z\vert X,\theta^{(g)})$:

  $$\begin{aligned}P(Z\vert X,\theta^{(g)})&=\prod_{i=1}^nP(z_i\vert x_i,\theta^{(g)})\\&\overset{**}{=}\prod_{i=1}^n\frac{\lambda_{z_i}N(x_i\vert\mu_{z_i},\sigma^2_{z_i})}{\sum_{j=1}^K\lambda_{j}N(x_i\vert\mu_{j},\sigma^2_{j})}\end{aligned}$$

  where the parameters on the right-hand side are those of $\theta^{(g)}$.
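The two quantities above can be computed directly. The following is a minimal sketch for a 1-D Gaussian mixture using only the Python standard library; the names `normal_pdf`, `posterior`, and `log_joint` and the sample values are illustrative assumptions, not part of the original post. Components are indexed from 0 in the code, whereas the formulas index them from 1.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Density of N(x | mu, sigma2), where sigma2 is the variance."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def posterior(xs, lam, mu, sigma2):
    """Return r with r[i][k] = P(z_i = k | x_i, theta) for each sample and component."""
    result = []
    for x in xs:
        # Numerator of Bayes' rule for every component: lam[k] * N(x | mu[k], sigma2[k]).
        joint = [l * normal_pdf(x, m, s2) for l, m, s2 in zip(lam, mu, sigma2)]
        total = sum(joint)  # denominator: sum over all components
        result.append([j / total for j in joint])
    return result


def log_joint(xs, zs, lam, mu, sigma2):
    """log P(X, Z | theta) = sum_i log( lam[z_i] * N(x_i | mu[z_i], sigma2[z_i]) )."""
    return sum(
        math.log(lam[z] * normal_pdf(x, mu[z], sigma2[z])) for x, z in zip(xs, zs)
    )


# Hypothetical current parameters theta^(g) and data, for illustration only.
xs = [-2.0, 0.0, 2.0]
lam, mu, sigma2 = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
r = posterior(xs, lam, mu, sigma2)
```

Each row of `r` sums to 1. Because $P(Z\vert X,\theta^{(g)})$ factorizes over samples, the posterior of a full assignment $Z$ is the product of the selected entries `r[i][z_i]`.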
-