EM for PCA
With complete information
- If we knew $z$ for each $x$, estimating $A$ and $D$ would be simple
$$x = Az + E$$
$$P(x \mid z) = N(Az, D)$$
- Given complete information $(x_1, z_1), (x_2, z_2), \ldots$
$$\underset{A, D}{\operatorname{argmax}} \sum_{(x, z)} \log P(x, z) = \underset{A, D}{\operatorname{argmax}} \sum_{(x, z)} \log P(x \mid z)$$
$$= \underset{A, D}{\operatorname{argmax}} \sum_{(x, z)} \log \frac{1}{\sqrt{(2\pi)^{d}\,|D|}} \exp\left(-0.5\,(x - Az)^{T} D^{-1} (x - Az)\right)$$
- We can get a closed-form solution: $A = XZ^{+}$, where $Z^{+}$ is the pseudoinverse of $Z$
- But we don't have $Z$, so the latent variables are missing
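The closed-form step can be sketched with NumPy; the dimensions and data below are hypothetical, chosen only to show that $A = XZ^{+}$ recovers the mixing matrix when $Z$ is known:

```python
import numpy as np

# Hypothetical setup: d-dimensional observations from a k-dimensional latent z.
rng = np.random.default_rng(0)
d, k, n = 5, 2, 100

A_true = rng.standard_normal((d, k))
Z = rng.standard_normal((k, n))                       # columns are the latent vectors z
X = A_true @ Z + 0.01 * rng.standard_normal((d, n))   # x = Az + small noise

# Closed-form estimate with complete information: A = X Z^+
A_hat = X @ np.linalg.pinv(Z)
```

With enough samples and small noise, `A_hat` is close to `A_true`; the pseudoinverse solves the least-squares problem over all columns at once.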
With incomplete information
- Initialize the plane
- Complete the data by computing the appropriate $z$ for the plane
- $P(z \mid x; A)$ is a delta function, because $E$ is orthogonal to the plane spanned by $A$
- Re-estimate the plane using the completed $z$
- Iterate
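The iteration above can be sketched in a few lines. This is an illustrative toy (dimensions, data, and iteration count are assumptions): the E-step projects each $x$ onto the current plane to get $z$, and the M-step re-fits $A$ with the closed-form $XZ^{+}$:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n = 5, 2, 200
# Toy data lying exactly on a k-dimensional plane through the origin
X = rng.standard_normal((d, k)) @ rng.standard_normal((k, n))

A = rng.standard_normal((d, k))      # initialize the plane
for _ in range(20):
    Z = np.linalg.pinv(A) @ X        # complete the data: z for each x (delta posterior)
    A = X @ np.linalg.pinv(Z)        # re-estimate the plane from the completed data

# After convergence, projecting X onto the plane spanned by A leaves ~zero residual
residual = X - A @ np.linalg.pinv(A) @ X
```

Because the toy data is exactly rank $k$, the recovered plane reproduces it; with noisy data the same loop converges to the principal subspace.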
Linear Gaussian Model
- PCA assumes the noise is always orthogonal to the data
- Not always true
- In general, the noise added to the output of the encoder can lie in any direction (its components are merely uncorrelated, not orthogonal to the plane)
- We want a generative model: to generate any point
- Take a Gaussian step on the hyperplane
- Add full-rank Gaussian uncorrelated noise that is independent of the position on the hyperplane
- Uncorrelated: diagonal covariance matrix
- Direction of noise is unconstrained
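The generative recipe above can be written out directly. All parameter values here are hypothetical placeholders; the point is the two-step sampling: a Gaussian step on the hyperplane, then independent noise with a diagonal (uncorrelated) full-rank covariance:

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 5, 2
A = rng.standard_normal((d, k))             # basis of the hyperplane (assumed)
D = np.diag(rng.uniform(0.5, 1.5, size=d))  # diagonal, full-rank noise covariance

# Gaussian step on the hyperplane: z ~ N(0, I)
z = rng.standard_normal(k)
# Full-rank noise, independent of z, direction unconstrained: e ~ N(0, D)
e = rng.multivariate_normal(np.zeros(d), D)

x = A @ z + e                               # a generated point in d dimensions
```

Because $D$ is full rank, any point in the $d$-dimensional space can be generated, unlike PCA, where the noise is confined to the orthogonal complement of the plane.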
With complete information
$$x = Az + e$$