This is a probability density. $p\left(x_{i} \mid z_{i}, \Theta\right)$ is the distribution of $x_{i}$ under the $z_{i}$-th model, so the $z_{i}$-th model is the Gaussian $\mathcal{N}\left(\mu_{z_{i}}, \Sigma_{z_{i}}\right)$.
$p\left(z_{i} \mid \Theta\right)$ is the probability of selecting the $z_{i}$-th model given the parameters. This is clearly the mixing weight $\alpha$ of that model, which we write as $\alpha_{z_{i}}$.
This gives:
$$
p(X, Z \mid \Theta)=\prod_{i=1}^{n} p\left(x_{i}, z_{i} \mid \Theta\right)=\prod_{i=1}^{n} \underbrace{p\left(x_{i} \mid z_{i}, \Theta\right)}_{\mathcal{N}\left(\mu_{z_{i}}, \Sigma_{z_{i}}\right)} \underbrace{p\left(z_{i} \mid \Theta\right)}_{\alpha_{z_{i}}}=\prod_{i=1}^{n} \alpha_{z_{i}} \mathcal{N}\left(\mu_{z_{i}}, \Sigma_{z_{i}}\right)
$$
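As a minimal sketch of the complete-data likelihood, the snippet below evaluates $p(X, Z \mid \Theta)$ for a one-dimensional mixture, both as a direct product and in log form. The helper names (`normal_pdf`, `complete_log_likelihood`) and all numeric values are hypothetical and chosen only for illustration; the 1-D case stands in for the general $\mathcal{N}(\mu, \Sigma)$.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of a 1-D Gaussian with mean mu and variance sigma2, evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def complete_log_likelihood(xs, zs, alphas, mus, sigma2s):
    """ln p(X, Z | Theta) = sum_i [ln alpha_{z_i} + ln N(x_i | mu_{z_i}, sigma2_{z_i})]."""
    total = 0.0
    for x, z in zip(xs, zs):
        total += math.log(alphas[z]) + math.log(normal_pdf(x, mus[z], sigma2s[z]))
    return total

# Hypothetical toy parameters: two components (0-based indices).
alphas = [0.3, 0.7]      # mixing weights, sum to 1
mus = [0.0, 5.0]         # component means
sigma2s = [1.0, 2.0]     # component variances
xs = [0.2, 4.8, 5.5]     # observed data
zs = [0, 1, 1]           # assumed component assignment for each data point

log_joint = complete_log_likelihood(xs, zs, alphas, mus, sigma2s)

# The same quantity as the direct product prod_i alpha_{z_i} * N(x_i | mu_{z_i}, sigma2_{z_i}).
direct = 1.0
for x, z in zip(xs, zs):
    direct *= alphas[z] * normal_pdf(x, mus[z], sigma2s[z])

print(log_joint, math.log(direct))
```

Working in log space is the usual choice once $n$ grows, because the raw product of many small densities underflows quickly.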
Next, write out $p(Z \mid X, \Theta)$:
$$
p(Z \mid X, \Theta)=\prod_{i=1}^{n} p\left(z_{i} \mid x_{i}, \Theta\right)=\prod_{i=1}^{n} \frac{\alpha_{z_{i}} \mathcal{N}\left(\mu_{z_{i}}, \Sigma_{z_{i}}\right)}{\sum_{l=1}^{k} \alpha_{l} \mathcal{N}\left(\mu_{l}, \Sigma_{l}\right)}
$$
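This is the probability of the component assignment $Z$ given the data and the parameters. For the $i$-th data point, the numerator is the probability that it came from component $z_{i}$, and the denominator sums that same quantity over all $k$ components. The product over $i$ then combines these per-point probabilities into the probability of the whole assignment $Z$.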
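The per-point factor in this posterior can be sketched in a few lines. The example below computes $p(z_i = l \mid x_i, \Theta)$ for every component $l$ of a one-dimensional mixture; the function names and the toy parameters are assumptions made for illustration, not part of the original notes.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of a 1-D Gaussian with mean mu and variance sigma2, evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def responsibilities(x, alphas, mus, sigma2s):
    """Return [p(z = l | x, Theta) for each component l]."""
    # Numerators: alpha_l * N(x | mu_l, sigma2_l) for every component l.
    weighted = [a * normal_pdf(x, m, s2) for a, m, s2 in zip(alphas, mus, sigma2s)]
    # Denominator: the same quantity summed over all k components.
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical toy parameters: two components.
alphas = [0.3, 0.7]
mus = [0.0, 5.0]
sigma2s = [1.0, 2.0]

resp = responsibilities(0.2, alphas, mus, sigma2s)
print(resp)
```

Because the point 0.2 lies close to the first mean, almost all of the posterior mass goes to the first component even though its mixing weight is smaller.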
Now write out the Q function. It is an integral against a multivariate density. The product $\prod_{i=1}^{n} p\left(z_{i} \mid x_{i}, \Theta^{(g)}\right)$ can be written as $p\left(z_{1}, \ldots, z_{n}\right)$. Since the logarithm of a product is a sum of logarithms, $\ln p(X, Z \mid \Theta)=\sum_{i=1}^{n} \ln p\left(z_{i}, x_{i} \mid \Theta\right)$, and each term can be denoted $f(z_{i})=\ln p\left(z_{i}, x_{i} \mid \Theta\right)$. So the Q function can be written as
$$
\int_{z_{1}} \cdots \int_{z_{n}}\left(\left(f(z_{1})+f(z_{2})+\cdots+f(z_{n})\right) \prod_{i=1}^{n} p\left(z_{i} \mid x_{i}, \Theta^{(g)}\right)\right) d z_{1} \cdots d z_{n}
$$
which can therefore be written as
$$
\int_{z_{1}} \cdots \int_{z_{n}}\left(\left(f(z_{1})+f(z_{2})+\cdots+f(z_{n})\right) p\left(z_{1}, \ldots, z_{n}\right)\right) d z_{1} \cdots d z_{n}
$$
By the properties of integration, the integral with respect to $z_{2}$ involves only the terms that depend on $z_{2}$. So:
Image source: https://github.com/roboticcam/machine-learning-notes/blob/master/files/em.pdf
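The marginalization argument can be checked numerically. Since each $z_i$ is a discrete component index in a mixture model, the integrals become sums, and the sketch below compares a brute-force sum over every joint assignment $(z_1, \ldots, z_n)$ with the factored form $\sum_{i} \sum_{l} p(z_i = l \mid x_i, \Theta^{(g)}) \, f(z_i = l)$. The factored form is an assumption about where the derivation is heading, and all function names and parameter values are hypothetical.

```python
import itertools
import math

def normal_pdf(x, mu, sigma2):
    """Density of a 1-D Gaussian with mean mu and variance sigma2, evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def posteriors(xs, alphas, mus, sigma2s):
    """post[i][l] = p(z_i = l | x_i, Theta_g)."""
    post = []
    for x in xs:
        weighted = [a * normal_pdf(x, m, s2) for a, m, s2 in zip(alphas, mus, sigma2s)]
        total = sum(weighted)
        post.append([w / total for w in weighted])
    return post

def f(x, z, alphas, mus, sigma2s):
    """f(z_i) = ln p(z_i, x_i | Theta) = ln(alpha_z * N(x | mu_z, sigma2_z))."""
    return math.log(alphas[z] * normal_pdf(x, mus[z], sigma2s[z]))

def q_brute_force(xs, theta, theta_g):
    """Sum over all k**n joint assignments of [sum_i f(z_i)] * prod_i p(z_i | x_i, Theta_g)."""
    post = posteriors(xs, *theta_g)
    k = len(theta[0])
    total = 0.0
    for zs in itertools.product(range(k), repeat=len(xs)):
        weight = math.prod(post[i][z] for i, z in enumerate(zs))
        total += weight * sum(f(x, z, *theta) for x, z in zip(xs, zs))
    return total

def q_factored(xs, theta, theta_g):
    """Sum over i and l of p(z_i = l | x_i, Theta_g) * f(z_i = l): only n*k terms."""
    post = posteriors(xs, *theta_g)
    k = len(theta[0])
    return sum(post[i][l] * f(x, l, *theta) for i, x in enumerate(xs) for l in range(k))

# Hypothetical toy values; each theta is (alphas, mus, sigma2s).
xs = [0.2, 4.8, 5.5]
theta_g = ([0.5, 0.5], [1.0, 4.0], [1.0, 1.0])   # current estimate Theta^(g)
theta = ([0.3, 0.7], [0.0, 5.0], [1.0, 2.0])     # candidate Theta

brute = q_brute_force(xs, theta, theta_g)
factored = q_factored(xs, theta, theta_g)
print(brute, factored)
```

The two values agree up to floating-point error because each posterior factor sums to 1 over its own $z_j$, so every factor not paired with $f(z_i)$ drops out of the sum.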