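Let $x$ be the observed data and $z$ the latent (unobserved) variable, with model parameter $\theta$. By the definition of conditional probability,

$$\frac{p(x,z\mid\theta)}{p(x\mid\theta)} = p(z\mid x,\theta)$$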
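Take the logarithm of both sides:

$$\begin{aligned}
\Rightarrow\ & \log p(x,z\mid\theta) - \log p(x\mid\theta) = \log p(z\mid x,\theta) \\
\Rightarrow\ & \log p(x\mid\theta) = \log p(x,z\mid\theta) - \log p(z\mid x,\theta)
\end{aligned}$$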
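Take the conditional expectation with respect to $z \mid \theta', x$ on both sides:

$$\Rightarrow\ \varepsilon[\log p(x\mid\theta) \mid \theta', x] = \varepsilon[\log p(x,z\mid\theta) \mid \theta', x] - \varepsilon[\log p(z\mid x,\theta) \mid \theta', x]$$

Since $\log p(x\mid\theta)$ does not depend on $z$, the left-hand side is simply $\log p(x\mid\theta)$:

$$\log p(x\mid\theta) = \varepsilon[\log p(x,z\mid\theta) \mid \theta', x] - \varepsilon[\log p(z\mid x,\theta) \mid \theta', x]$$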
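The identity holds for any pair $\theta$, $\theta'$, which can be checked numerically. The sketch below assumes a toy model that is not part of the derivation: $z \in \{0, 1\}$ selects one of two coins and $x \in \{0, 1\}$ is a single toss. The function names and parameter values are hypothetical, chosen only for illustration.

```python
import math

def joint(x, z, theta):
    """p(x, z | theta) for the assumed toy model."""
    w, a, b = theta  # w = p(z=0), a = p(x=1 | z=0), b = p(x=1 | z=1)
    pz = w if z == 0 else 1.0 - w
    head = a if z == 0 else b
    return pz * (head if x == 1 else 1.0 - head)

def marginal(x, theta):
    """p(x | theta), summing the joint over the latent z."""
    return sum(joint(x, z, theta) for z in (0, 1))

def posterior(z, x, theta):
    """p(z | x, theta) = p(x, z | theta) / p(x | theta)."""
    return joint(x, z, theta) / marginal(x, theta)

def decomposition(x, theta, theta_prime):
    """Return the two conditional expectations, both taken under z | theta', x."""
    q = sum(posterior(z, x, theta_prime) * math.log(joint(x, z, theta))
            for z in (0, 1))
    h = sum(posterior(z, x, theta_prime) * math.log(posterior(z, x, theta))
            for z in (0, 1))
    return q, h

x = 1
theta = (0.3, 0.8, 0.4)        # arbitrary illustrative values
theta_prime = (0.6, 0.5, 0.9)  # a different, equally arbitrary parameter
q, h = decomposition(x, theta, theta_prime)
lhs = math.log(marginal(x, theta))
print(abs(lhs - (q - h)) < 1e-12)
```

The check passes regardless of which `theta_prime` is used, which is what lets the next step pick $\theta' = \theta^{(i)}$ freely.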
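Choose the next iterate by maximizing the first term with $\theta' = \theta^{(i)}$:

$$\begin{aligned}
\Rightarrow\ & \theta^{(i+1)} = \arg\max_{\theta}\ \varepsilon[\log p(x,z\mid\theta) \mid \theta^{(i)}, x] \\
& \theta^{(i+1)} = \arg\max_{\theta}\ \sum_{z} p(z\mid\theta^{(i)}, x)\, \log p(x,z\mid\theta)
\end{aligned}$$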
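As an illustration of this update, here is a minimal sketch for an assumed two-coin mixture: each observation is the number of heads in `n` tosses of a coin chosen by a hidden $z$. The model, the data `xs`, the starting point, and all function names are hypothetical. For this model the argmax has a closed form (responsibility-weighted averages), which is what `em_step` uses.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability of k heads in n tosses with head probability p."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def log_likelihood(xs, n, theta):
    """log p(x | theta) for independent observations under the assumed mixture."""
    w, a, b = theta  # w = p(z=0), a and b = head probabilities of the two coins
    return sum(math.log(w * binom_pmf(x, n, a) + (1.0 - w) * binom_pmf(x, n, b))
               for x in xs)

def em_step(xs, n, theta):
    """One update theta^(i) -> theta^(i+1)."""
    w, a, b = theta
    # E-step: r[j] = p(z_j = 0 | x_j, theta^(i))
    r = []
    for x in xs:
        p0 = w * binom_pmf(x, n, a)
        p1 = (1.0 - w) * binom_pmf(x, n, b)
        r.append(p0 / (p0 + p1))
    # M-step: closed-form maximizer of the expected complete-data log-likelihood
    s0 = sum(r)
    s1 = len(xs) - s0
    new_w = s0 / len(xs)
    new_a = sum(rj * x for rj, x in zip(r, xs)) / (n * s0)
    new_b = sum((1.0 - rj) * x for rj, x in zip(r, xs)) / (n * s1)
    return (new_w, new_a, new_b)

xs = [5, 9, 8, 4, 7]      # made-up head counts out of n = 10 tosses
n = 10
theta = (0.5, 0.6, 0.5)   # arbitrary starting point
history = [log_likelihood(xs, n, theta)]
for _ in range(20):
    theta = em_step(xs, n, theta)
    history.append(log_likelihood(xs, n, theta))

# The log-likelihood should never decrease (small tolerance for rounding).
monotone = all(later >= earlier - 1e-9
               for earlier, later in zip(history, history[1:]))
print(monotone)
```

Tracking `history` this way is also a practical sanity check: a decrease in the log-likelihood between iterations usually signals a bug in the E-step or M-step.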
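Prove that $p(x\mid\theta^{(i)})$ is non-decreasing as $i$ increases, i.e.,

$$p(x\mid\theta^{(i+1)}) \ge p(x\mid\theta^{(i)})$$

or equivalently $\log p(x\mid\theta^{(i+1)}) \ge \log p(x\mid\theta^{(i)})$. Writing each side as the difference of two conditional expectations taken under $z \mid \theta^{(i)}, x$, it suffices to compare the two terms separately: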
- Because of the choice of $\theta^{(i+1)}$, we have

$$\varepsilon[\log p(x,z\mid\theta^{(i+1)}) \mid \theta^{(i)}, x] \ge \varepsilon[\log p(x,z\mid\theta^{(i)}) \mid \theta^{(i)}, x]$$
- We only need to show that

$$\varepsilon[\log p(z\mid x,\theta^{(i+1)}) \mid \theta^{(i)}, x] \le \varepsilon[\log p(z\mid x,\theta^{(i)}) \mid \theta^{(i)}, x]$$
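This holds because of the following fact, applied with $p = p(z\mid x,\theta^{(i)})$ and $p' = p(z\mid x,\theta^{(i+1)})$.

If $\varepsilon$ is taken with respect to $p(x)$, we have $\varepsilon[\log p(x)] \ge \varepsilon[\log p'(x)]$, where $p'(x)$ is any pdf (possibly different from $p(x)$).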
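Proof.

$$\begin{aligned}
& \varepsilon\left[\log\frac{p'(x)}{p(x)}\right] \le \log \varepsilon\left[\frac{p'(x)}{p(x)}\right] \quad \text{(by Jensen's inequality)} \\
\Rightarrow\ & \varepsilon[\log p'(x)] - \varepsilon[\log p(x)] \le \log \underbrace{\int \frac{p'(x)}{p(x)} \cdot p(x)\,dx}_{=1} \\
\Rightarrow\ & \varepsilon[\log p'(x)] - \varepsilon[\log p(x)] \le 0 \\
\Rightarrow\ & \varepsilon[\log p'(x)] \le \varepsilon[\log p(x)]
\end{aligned}$$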
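The inequality can be spot-checked numerically. The sketch below uses discrete distributions on five points instead of pdfs, which is an assumption made to keep the check exact and dependency-free; the helper names are hypothetical. It draws one fixed `p` and many random candidates `q` (playing the role of $p'$) and records the gap between the two expectations.

```python
import math
import random

def expected_log(p, q):
    """E_p[log q(x)] for discrete distributions given as probability lists."""
    return sum(pi * math.log(qi) for pi, qi in zip(p, q))

def random_pmf(k, rng):
    """A random strictly positive probability vector of length k."""
    weights = [rng.random() + 1e-3 for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
p = random_pmf(5, rng)

gaps = []
for _ in range(1000):
    q = random_pmf(5, rng)
    # gap = E_p[log p] - E_p[log q], which the proof says is never negative
    gaps.append(expected_log(p, p) - expected_log(p, q))

print(min(gaps) >= -1e-12)
```

The gap computed here is the Kullback-Leibler divergence from `q` to `p`, so the check is a numerical restatement of its non-negativity.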
P.S. Jensen's inequality:

For a convex function $\phi$,

$$\varepsilon[\phi(x)] \ge \phi(\varepsilon[x])$$

and, letting $\phi = -\log$ (with $x > 0$),

$$\varepsilon[\log(x)] \le \log(\varepsilon[x])$$
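A quick empirical check of the logarithmic form, using sample averages in place of expectations. The uniform range and sample size are arbitrary assumptions. For a finite sample the inequality holds exactly, since it is the arithmetic-geometric mean inequality in disguise.

```python
import math
import random

rng = random.Random(1)
# Positive samples, so that log is defined everywhere.
samples = [rng.uniform(0.1, 5.0) for _ in range(10000)]

mean_of_log = sum(math.log(s) for s in samples) / len(samples)
log_of_mean = math.log(sum(samples) / len(samples))

print(mean_of_log <= log_of_mean)
```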