Gaussian Mixture Model using EM Algorithm
A classic application of the EM algorithm is the Gaussian mixture model (GMM), which can be used to cluster data drawn from several Gaussian distributions.
Gaussian Mixture Model (k-mixture):
$$p(X|\theta)=\sum_{l=1}^{k}\alpha_l\,\mathcal{N}(X|\mu_l,\Sigma_l)\tag{14}$$
$$\sum_{l=1}^{k}\alpha_l=1\tag{15}$$
$$\theta=\{\alpha_1,\dots,\alpha_k,\ \mu_1,\dots,\mu_k,\ \Sigma_1,\dots,\Sigma_k\}\tag{16}$$
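To make the model concrete, here is a minimal numpy sketch that evaluates the mixture density of Eq. (14); the function names `mvn_pdf` and `gmm_density` are illustrative, not from the text:

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density of a multivariate normal N(x | mu, Sigma)."""
    d = mu.shape[0]
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def gmm_density(x, alphas, mus, sigmas):
    """p(x|theta) = sum_l alpha_l * N(x | mu_l, Sigma_l), as in Eq. (14)."""
    return sum(a * mvn_pdf(x, mu, s) for a, mu, s in zip(alphas, mus, sigmas))

# Two 1-D components; the weights sum to 1 as required by Eq. (15)
alphas = [0.4, 0.6]
mus = [np.array([0.0]), np.array([3.0])]
sigmas = [np.eye(1), np.eye(1)]
p = gmm_density(np.array([1.0]), alphas, mus, sigmas)
```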
From the EM algorithm we have:
$$\theta^{(g+1)}=\arg\max_\theta\int_z \log\big(P(X,Z|\theta)\big)\cdot P(Z|X,\theta^{(g)})\,dz\tag{17}$$
And from the Gaussian mixture model we have:
$$P(X,Z|\theta)=\prod_{i=1}^{N}P(x_i,z_i|\theta)=\prod_{i=1}^{N}P(x_i|z_i,\theta)\cdot P(z_i|\theta)=\prod_{i=1}^{N}\alpha_{z_i}\,\mathcal{N}(x_i|\mu_{z_i},\Sigma_{z_i})\tag{18}$$
$$P(Z|X,\theta)=\prod_{i=1}^{N}P(z_i|x_i,\theta)=\prod_{i=1}^{N}\frac{P(x_i|z_i,\theta)\,P(z_i|\theta)}{\sum_{l=1}^{k}P(x_i|z_i=l,\theta)\,P(z_i=l|\theta)}=\prod_{i=1}^{N}\frac{\alpha_{z_i}\,\mathcal{N}(x_i|\mu_{z_i},\Sigma_{z_i})}{\sum_{l=1}^{k}\alpha_l\,\mathcal{N}(x_i|\mu_l,\Sigma_l)}\tag{19}$$
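Eq. (19) is exactly what the E-step computes: the posterior $P(l|x_i,\theta)$ is the "responsibility" of component $l$ for point $x_i$. A minimal numpy sketch (the names `mvn_pdf` and `e_step` are my own, not from the text):

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density of a multivariate normal N(x | mu, Sigma)."""
    d = mu.shape[0]
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def e_step(X, alphas, mus, sigmas):
    """Responsibilities gamma[i, l] = P(z_i = l | x_i, theta), per Eq. (19)."""
    N, k = X.shape[0], len(alphas)
    gamma = np.zeros((N, k))
    for i in range(N):
        for l in range(k):
            gamma[i, l] = alphas[l] * mvn_pdf(X[i], mus[l], sigmas[l])
        gamma[i] /= gamma[i].sum()  # normalize: the denominator of Eq. (19)
    return gamma
```

Each row of `gamma` sums to 1, since it is a posterior distribution over the $k$ components.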
Note that although the latent variable $Z$ has been introduced, $P(X,Z|\theta)$ takes a simpler form than $P(X|\theta)$: the sum over components is gone, so the log of the joint factorizes cleanly.
Substituting the two expressions above into the EM iteration gives:
$$\theta^{(g+1)}=\arg\max_\theta\sum_{i=1}^{N}\sum_{z_i=1}^{k}\log P(x_i,z_i|\theta)\cdot P(z_i|x_i,\theta^{(g)})=\arg\max_\theta\sum_{l=1}^{k}\sum_{i=1}^{N}\log\big[\alpha_l\,\mathcal{N}(x_i|\mu_l,\Sigma_l)\big]\cdot P(l|x_i,\theta^{(g)})\tag{20}$$
Substituting the Gaussian mixture densities, taking the derivative with respect to each parameter, and setting it to zero (using a Lagrange multiplier for the constraint $\sum_l\alpha_l=1$) yields the following update formulas:
$$\alpha_l^{(g+1)}=\frac{1}{N}\sum_{i=1}^{N}P(l|x_i,\theta^{(g)})\tag{21}$$
$$\mu_l^{(g+1)}=\frac{\sum_{i=1}^{N}x_i\,P(l|x_i,\theta^{(g)})}{\sum_{i=1}^{N}P(l|x_i,\theta^{(g)})}\tag{22}$$
$$\Sigma_l^{(g+1)}=\frac{\sum_{i=1}^{N}\big[x_i-\mu_l^{(g+1)}\big]\big[x_i-\mu_l^{(g+1)}\big]^{T}P(l|x_i,\theta^{(g)})}{\sum_{i=1}^{N}P(l|x_i,\theta^{(g)})}\tag{23}$$
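The three update formulas above constitute the M-step. A minimal numpy sketch (the name `m_step` is illustrative) that takes the responsibilities computed in the E-step and returns the updated parameters:

```python
import numpy as np

def m_step(X, gamma):
    """M-step: update weights, means, covariances from responsibilities.

    X: (N, d) data; gamma: (N, k) with gamma[i, l] = P(l | x_i, theta^(g)).
    """
    N, _ = X.shape
    Nl = gamma.sum(axis=0)             # effective count of each component
    alphas = Nl / N                    # weight update: mean responsibility
    mus = (gamma.T @ X) / Nl[:, None]  # mean update: responsibility-weighted average
    sigmas = []
    for l in range(gamma.shape[1]):
        diff = X - mus[l]              # (N, d) deviations from the new mean
        # covariance update: responsibility-weighted outer products
        sigmas.append((gamma[:, l, None] * diff).T @ diff / Nl[l])
    return alphas, mus, sigmas
```

Alternating the E-step (computing the responsibilities) with this M-step until the log-likelihood stabilizes gives the complete EM procedure for fitting a GMM.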