Machine Learning - Whiteboard Derivation P5_6 (P-PCA)


P-PCA

$x \in R^p, \quad z \in R^q, \quad q < p$

$x$: observed data
$z$: latent variable

The goal of dimensionality reduction is to go from $p$ dimensions down to $q$ dimensions.

Prior on $z$:
$$z \sim N(0_q, I_q)$$
$$x = wz + \mu + \epsilon$$
$$\epsilon \sim N(0, \sigma^2 I_p)$$

$$\sigma^2 I_p = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} \qquad \text{(isotropic)}$$

This is a linear Gaussian model.

$$\text{P-PCA} = \begin{cases} \text{Inference:} & p(z \mid x) \\ \text{Learning:} & w, \mu, \sigma^2 \;\rightarrow\; \text{EM} \end{cases}$$
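Before deriving the inference step, it can help to see the generative side of the model in code. Below is a minimal NumPy sketch; the dimensions and parameter values (`p`, `q`, `W`, `mu`, `sigma`) are made-up placeholders, not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

p, q = 5, 2                          # observed / latent dimensions (illustrative)
W = rng.normal(size=(p, q))          # loading matrix w (placeholder values)
mu = rng.normal(size=p)              # offset mu
sigma = 0.1                          # noise std, so Var[eps] = sigma^2 I_p

def sample_x(n):
    """Draw n samples from the generative model x = W z + mu + eps."""
    z = rng.normal(size=(n, q))              # z ~ N(0, I_q)
    eps = sigma * rng.normal(size=(n, p))    # eps ~ N(0, sigma^2 I_p)
    return z @ W.T + mu + eps                # x | z ~ N(W z + mu, sigma^2 I_p)

X = sample_x(1000)                           # X has shape (1000, p)
```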


$$\begin{cases} z \sim N(0, I) \\ x = wz + \mu + \epsilon \\ \epsilon \sim N(0, \sigma^2 I) \\ \epsilon \perp z \end{cases}$$

$$E[x \mid z] = E[wz + \mu + \epsilon] = wz + \mu$$
$$Var[x \mid z] = Var[wz + \mu + \epsilon] = \sigma^2 I$$
$$x \mid z \sim N(wz + \mu, \sigma^2 I)$$

$$E[x] = E[wz + \mu + \epsilon] = E[wz + \mu] + E[\epsilon] = \mu$$
$$Var[x] = Var[wz + \mu + \epsilon] = Var[wz] + Var[\epsilon] = wIw^T + \sigma^2 I = ww^T + \sigma^2 I$$
$$x \sim N(\mu, ww^T + \sigma^2 I)$$
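These marginal moments can be sanity-checked numerically; the snippet below is an optional check that reuses `W`, `mu`, `sigma`, and `sample_x` from the sketch above:

```python
# Monte Carlo check of E[x] = mu and Var[x] = W W^T + sigma^2 I_p.
X = sample_x(200_000)

emp_mean = X.mean(axis=0)
emp_cov = np.cov(X, rowvar=False)
theory_cov = W @ W.T + sigma**2 * np.eye(p)

print(np.abs(emp_mean - mu).max())         # close to 0
print(np.abs(emp_cov - theory_cov).max())  # close to 0
```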

Recall the earlier formulas. Given $x \sim N(\mu, \Sigma)$ with the partition

$$x = \begin{bmatrix} x_a \\ x_b \end{bmatrix} \qquad \mu = \begin{bmatrix} \mu_a \\ \mu_b \end{bmatrix} \qquad \Sigma = \begin{bmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{bmatrix}$$

$$x_{b.a} = x_b - \Sigma_{ba}\Sigma_{aa}^{-1}x_a$$
$$\mu_{b.a} = \mu_b - \Sigma_{ba}\Sigma_{aa}^{-1}\mu_a$$
$$\Sigma_{bb.a} = \Sigma_{bb} - \Sigma_{ba}\Sigma_{aa}^{-1}\Sigma_{ab} \qquad \text{(Schur complement)}$$

$$x_b = x_{b.a} + \Sigma_{ba}\Sigma_{aa}^{-1}x_a$$

$$E[x_b \mid x_a] = \mu_{b.a} + \Sigma_{ba}\Sigma_{aa}^{-1}x_a$$
$$Var[x_b \mid x_a] = Var[x_{b.a}] = \Sigma_{bb.a}$$

$$x_b \mid x_a \sim N(\mu_{b.a} + \Sigma_{ba}\Sigma_{aa}^{-1}x_a, \; \Sigma_{bb.a})$$
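This conditioning rule is easy to turn into a small helper. The function below is an illustrative sketch (the name and argument layout are my own, not from the notes): it takes the partitioned mean and covariance blocks and returns the conditional mean and covariance of $x_b \mid x_a$.

```python
import numpy as np

def condition_gaussian(mu_a, mu_b, S_aa, S_ab, S_ba, S_bb, x_a):
    """Mean and covariance of x_b | x_a for a jointly Gaussian (x_a, x_b)."""
    K = S_ba @ np.linalg.inv(S_aa)            # Sigma_ba Sigma_aa^{-1}
    cond_mean = mu_b + K @ (x_a - mu_a)       # = mu_{b.a} + Sigma_ba Sigma_aa^{-1} x_a
    cond_cov = S_bb - K @ S_ab                # Schur complement Sigma_{bb.a}
    return cond_mean, cond_cov
```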

Derivation:
Write the joint distribution of $x$ and $z$; from the results above, the only unknown block is $\Delta = Cov(x, z)$:

$$\begin{bmatrix} x \\ z \end{bmatrix} \sim N\left( \begin{bmatrix} \mu \\ 0 \end{bmatrix}, \begin{bmatrix} ww^T + \sigma^2 I & \Delta \\ \Delta^T & I \end{bmatrix} \right)$$

$$\begin{aligned} \Delta &= Cov(x, z) \\ &= E[(x - \mu)(z - 0)^T] \\ &= E[(x - \mu)z^T] \\ &= E[(wz + \epsilon)z^T] \\ &= E[wzz^T + \epsilon z^T] \\ &= wE[zz^T] + E[\epsilon]\,E[z^T] \\ &= w \cdot I + 0 \\ &= w \end{aligned}$$

Substituting $\Delta = w$:

$$\begin{bmatrix} x \\ z \end{bmatrix} \sim N\left( \begin{bmatrix} \mu \\ 0 \end{bmatrix}, \begin{bmatrix} ww^T + \sigma^2 I & w \\ w^T & I \end{bmatrix} \right)$$

$$\begin{bmatrix} x \\ z \end{bmatrix} \sim N(\hat{\mu}, \hat{\Sigma})$$
With the earlier conditioning formulas, $z \mid x$ can now be obtained directly.
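Concretely, conditioning with $x_a = x$ and $x_b = z$ (so $\Sigma_{aa} = ww^T + \sigma^2 I$, $\Sigma_{ba} = w^T$, $\Sigma_{bb} = I$) gives

$$z \mid x \sim N\left( w^T(ww^T + \sigma^2 I)^{-1}(x - \mu), \; I - w^T(ww^T + \sigma^2 I)^{-1}w \right)$$

which is the posterior needed for the inference step; numerically, this is what the `condition_gaussian` sketch above would return for these blocks.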
