Gaussian Distribution Fundamentals

Contents

Gaussian distribution

Linear Gaussian model

$z_{t}=Az_{t-1}+B+\epsilon$, where $\epsilon$ is the noise term


Maximum Likelihood Estimation

  • Given: $X=(x_{1},x_{2},\cdots,x_{N})^{T}$, with each $x_{i}\in\mathbb{R}$ drawn i.i.d. from $\mathcal{N}(\mu,\sigma^{2})$
  • Goal: derive the MLE of $\mu$ and $\sigma^{2}$ (the multivariate case $x_{i}\in\mathbb{R}^{p}$ is treated below)

$\log P(X|\theta)=\log\prod_{i=1}^{N}P(x_{i}|\theta)=\sum_{i=1}^{N}\log P(x_{i}|\theta)=\sum_{i=1}^{N}\left[\log\frac{1}{\sqrt{2\pi}}+\log\frac{1}{\sigma}-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\right]$

$\mu_{MLE}=\underset{\mu}{\arg\max}\,\log P(X|\theta)=\underset{\mu}{\arg\min}\,\sum_{i=1}^{N}(x_{i}-\mu)^{2}$

Setting the derivative to zero gives $\mu_{MLE}=\frac{1}{N}\sum_{i=1}^{N}x_{i}$

Similarly, $\sigma_{MLE}^{2}=\frac{1}{N}\sum_{i=1}^{N}(x_{i}-\mu_{MLE})^{2}$

  • $\mu_{MLE}$ is an unbiased estimator
  • $\sigma_{MLE}^{2}$ is a biased estimator: $E[\sigma_{MLE}^{2}]=\frac{N-1}{N}\sigma^{2}$
  • The unbiased version is $\hat{\sigma}^{2}=\frac{1}{N-1}\sum_{i=1}^{N}(x_{i}-\mu_{MLE})^{2}$
    • Derivation: $E[\sigma_{MLE}^{2}]=E\left[\frac{1}{N}\sum_{i=1}^{N}(x_{i}^{2}-2x_{i}\mu_{MLE}+\mu_{MLE}^{2})\right]=E\left[\frac{1}{N}\sum_{i=1}^{N}x_{i}^{2}-\mu_{MLE}^{2}\right]=E\left[\left(\frac{1}{N}\sum_{i=1}^{N}x_{i}^{2}-\mu^{2}\right)-\left(\mu_{MLE}^{2}-\mu^{2}\right)\right]=\sigma^{2}-\left(E[\mu_{MLE}^{2}]-E^{2}[\mu_{MLE}]\right)=\sigma^{2}-Var[\mu_{MLE}]=\sigma^{2}-\frac{\sigma^{2}}{N}=\frac{N-1}{N}\sigma^{2}$
  • In other words, the maximum likelihood estimate of the variance is systematically too small.
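The bias factor $(N-1)/N$ above can be checked numerically. A minimal sketch, with illustrative values for $\mu$, $\sigma$, $N$ and the number of simulated datasets:

```python
import numpy as np

# Simulate many datasets of size N from N(mu, sigma^2) and compare the
# average of the MLE variance (divide by N) with the unbiased estimator
# (divide by N-1). All constants here are illustrative choices.
rng = np.random.default_rng(0)
mu, sigma, N, trials = 0.0, 2.0, 5, 200_000

x = rng.normal(mu, sigma, size=(trials, N))
mean_mle = x.mean(axis=1, keepdims=True)                    # mu_MLE per dataset
var_mle = ((x - mean_mle) ** 2).mean(axis=1)                # divide by N   (biased)
var_unbiased = ((x - mean_mle) ** 2).sum(axis=1) / (N - 1)  # divide by N-1 (unbiased)

print(var_mle.mean() / sigma**2)       # close to (N-1)/N = 0.8
print(var_unbiased.mean() / sigma**2)  # close to 1.0
```

With $N=5$ the bias is large (the MLE variance averages about 80% of the true $\sigma^{2}$), which is why the small-sample correction matters.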

Multivariate Gaussian Distribution

  • $x\sim\mathcal{N}(\mu,\Sigma)=\frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$
    where $\mu=(\mu_{1},\mu_{2},\cdots,\mu_{p})^{T}$ and $\Sigma=\begin{pmatrix}\sigma_{11}&\cdots&\sigma_{1p}\\\vdots&\ddots&\vdots\\\sigma_{p1}&\cdots&\sigma_{pp}\end{pmatrix}$ is the covariance matrix, which is symmetric and (in general) positive semi-definite

  • $(x-\mu)^{T}\Sigma^{-1}(x-\mu)$: the Mahalanobis distance between $x$ and $\mu$

    • When $\Sigma=I$, the Mahalanobis distance reduces to the (squared) Euclidean distance
    • Example:
      • For $z_{1}=(z_{11},z_{12})$, $z_{2}=(z_{21},z_{22})$ and $\Sigma=I$: $(z_{1}-z_{2})^{T}\Sigma^{-1}(z_{1}-z_{2})=(z_{11}-z_{21})^{2}+(z_{12}-z_{22})^{2}$, the squared Euclidean distance
    • Eigendecomposition $\Sigma=U\Lambda U^{T}$, where $UU^{T}=U^{T}U=I$ and $\Lambda=\mathrm{diag}(\lambda_{1},\cdots,\lambda_{p})$ is the eigenvalue matrix
      • $\Sigma=U\Lambda U^{T}=\sum_{i=1}^{p}u_{i}\lambda_{i}u_{i}^{T}$
      • $\Sigma^{-1}=(U\Lambda U^{T})^{-1}=U\Lambda^{-1}U^{T}=\sum_{i=1}^{p}u_{i}\lambda_{i}^{-1}u_{i}^{T}$
    • $\Delta=(x-\mu)^{T}\Sigma^{-1}(x-\mu)=\sum_{i=1}^{p}(x-\mu)^{T}u_{i}\lambda_{i}^{-1}u_{i}^{T}(x-\mu)=\sum_{i=1}^{p}\frac{y_{i}^{2}}{\lambda_{i}}$, where $y_{i}=u_{i}^{T}(x-\mu)$
      • For $p=2$: $\Delta=\frac{y_{1}^{2}}{\lambda_{1}}+\frac{y_{2}^{2}}{\lambda_{2}}=r$ describes an ellipse in the rotated coordinates $(y_{1},y_{2})$; varying $r$ traces out the contour lines of the two-dimensional Gaussian density.
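The eigenbasis identity for $\Delta$ above can be verified numerically. A small sketch, using an illustrative $2\times 2$ covariance:

```python
import numpy as np

# Check that (x-mu)^T Sigma^{-1} (x-mu) equals sum_i y_i^2 / lambda_i with
# y_i = u_i^T (x - mu), using the eigendecomposition Sigma = U Lambda U^T.
# Sigma, mu, and x are illustrative values.
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
mu = np.array([1.0, -1.0])
x = np.array([2.5, 0.5])

lam, U = np.linalg.eigh(Sigma)        # Sigma is symmetric, so eigh applies
d_direct = (x - mu) @ np.linalg.inv(Sigma) @ (x - mu)
y = U.T @ (x - mu)                    # coordinates y_i = u_i^T (x - mu)
d_eigen = np.sum(y**2 / lam)

print(d_direct, d_eigen)              # the two values agree
```

The eigenbasis form is also how the elliptical contours are drawn in practice: the axes point along the $u_{i}$ with lengths proportional to $\sqrt{\lambda_{i}}$.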
  • $\Sigma_{p\times p}$

    • has $\frac{p^{2}-p}{2}+p=\frac{p(p+1)}{2}=O(p^{2})$ free parameters
  • Limitation: if the samples are actually better described by two Gaussians, fitting a single large Gaussian to all of them (as the plain model does) can incur a large error; a mixture of Gaussians fits such data more accurately.

Marginal and Conditional Distributions of a Joint Gaussian

  • Given $x=\begin{pmatrix}x_{a}\\x_{b}\end{pmatrix}$, $\mu=\begin{pmatrix}\mu_{a}\\\mu_{b}\end{pmatrix}$, $\Sigma=\begin{pmatrix}\Sigma_{aa}&\Sigma_{ab}\\\Sigma_{ba}&\Sigma_{bb}\end{pmatrix}$, where the block dimensions satisfy $a+b=p$. Find $P(x_{a})$, $P(x_{b}|x_{a})$, $P(x_{b})$, $P(x_{a}|x_{b})$
  • One route is completing the square (the approach taken in PRML)
  • Theorem: if $x\sim\mathcal{N}(\mu,\Sigma)$ and $y=Ax+B$, then $y\sim\mathcal{N}(A\mu+B,\,A\Sigma A^{T})$
    • $x_{a}=(I_{m},0)\begin{pmatrix}x_{a}\\x_{b}\end{pmatrix}$
      • $E(x_{a})=(I_{m},0)\begin{pmatrix}\mu_{a}\\\mu_{b}\end{pmatrix}=\mu_{a}$
      • $var(x_{a})=(I_{m},0)\begin{pmatrix}\Sigma_{aa}&\Sigma_{ab}\\\Sigma_{ba}&\Sigma_{bb}\end{pmatrix}\begin{pmatrix}I_{m}\\0\end{pmatrix}=\Sigma_{aa}$
      • hence $x_{a}\sim\mathcal{N}(\mu_{a},\Sigma_{aa})$
    • $P(x_{b}|x_{a})$
      • Construct $x_{b\cdot a}=x_{b}-\Sigma_{ba}\Sigma_{aa}^{-1}x_{a}$, $\mu_{b\cdot a}=\mu_{b}-\Sigma_{ba}\Sigma_{aa}^{-1}\mu_{a}$, $\Sigma_{bb\cdot a}=\Sigma_{bb}-\Sigma_{ba}\Sigma_{aa}^{-1}\Sigma_{ab}$; one can check that $x_{b\cdot a}$ is uncorrelated with $x_{a}$, hence independent of it (for a joint Gaussian)
      • Since $x_{b}=x_{b\cdot a}+\Sigma_{ba}\Sigma_{aa}^{-1}x_{a}$: $E(x_{b}|x_{a})=\mu_{b\cdot a}+\Sigma_{ba}\Sigma_{aa}^{-1}x_{a}=\mu_{b}+\Sigma_{ba}\Sigma_{aa}^{-1}(x_{a}-\mu_{a})$ and $var(x_{b}|x_{a})=\Sigma_{bb\cdot a}$, so $x_{b}|x_{a}\sim\mathcal{N}\left(\mu_{b}+\Sigma_{ba}\Sigma_{aa}^{-1}(x_{a}-\mu_{a}),\,\Sigma_{bb\cdot a}\right)$
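The marginal and conditional formulas above can be sketched numerically. The 3-D mean, covariance, and the observed value of $x_{a}$ below are illustrative choices; $x_{a}$ is the first two components and $x_{b}$ the last one.

```python
import numpy as np

# Partition a 3-D Gaussian into x_a (dims 0-1) and x_b (dim 2) and apply
# the block formulas: marginal x_a ~ N(mu_a, Sigma_aa), and the conditional
# mean/covariance of x_b given x_a = xa_obs.
mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
a, b = slice(0, 2), slice(2, 3)
Saa, Sab = Sigma[a, a], Sigma[a, b]
Sba, Sbb = Sigma[b, a], Sigma[b, b]

# Marginal: just read off the corresponding blocks.
mu_a, cov_a = mu[a], Saa

# Conditional p(x_b | x_a = xa_obs):
xa_obs = np.array([0.5, 0.8])
mean_cond = mu[b] + Sba @ np.linalg.inv(Saa) @ (xa_obs - mu[a])
cov_cond = Sbb - Sba @ np.linalg.inv(Saa) @ Sab

# Sanity check the marginal blocks against samples of the full Gaussian.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
emp_cov_a = np.cov(samples[:, a].T)   # should approach Sigma_aa

print(mean_cond, cov_cond)
```

Note that $\Sigma_{bb\cdot a}\le\Sigma_{bb}$: conditioning on $x_{a}$ never increases the uncertainty about $x_{b}$.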

Joint Distribution from a Marginal and a Conditional

  • Given $p(x)=\mathcal{N}(x|\mu,\Lambda^{-1})$ and $p(y|x)=\mathcal{N}(y|Ax+b,L^{-1})$, where $\Lambda$ and $L$ are precision matrices. Find $p(y)$ and $p(x|y)$
    • This is a linear Gaussian model: $y=Ax+b+\epsilon$, $\epsilon\sim\mathcal{N}(0,L^{-1})$, with $\epsilon$ independent of $x$
    • $E[y]=E[Ax+b+\epsilon]=A\mu+b$ and $var[y]=var[Ax+b+\epsilon]=var[Ax+b]+var[\epsilon]=A\Lambda^{-1}A^{T}+L^{-1}$, giving $y\sim\mathcal{N}(A\mu+b,\,A\Lambda^{-1}A^{T}+L^{-1})$
    • $z=\begin{pmatrix}x\\y\end{pmatrix}\sim\mathcal{N}\left(\begin{pmatrix}\mu\\A\mu+b\end{pmatrix},\begin{pmatrix}\Lambda^{-1}&\Delta\\\Delta^{T}&A\Lambda^{-1}A^{T}+L^{-1}\end{pmatrix}\right)$, where $\Delta=cov(x,y)=E[(x-\mu)(y-E[y])^{T}]=E[(x-\mu)(Ax-A\mu+\epsilon)^{T}]=E[(x-\mu)(x-\mu)^{T}A^{T}]=var[x]\,A^{T}=\Lambda^{-1}A^{T}$
    • $p(x|y)$ then follows by applying the conditional-of-a-joint formula to $z$: $x|y\sim\mathcal{N}\left(\mu+\Delta\,(A\Lambda^{-1}A^{T}+L^{-1})^{-1}(y-A\mu-b),\ \Lambda^{-1}-\Delta(A\Lambda^{-1}A^{T}+L^{-1})^{-1}\Delta^{T}\right)$
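The construction of the joint $z$ can be checked by simulation. A sketch, where $A$, $b$, $\Lambda^{-1}$, and $L^{-1}$ are illustrative values:

```python
import numpy as np

# Sample x ~ N(mu, Lam_inv) and y = A x + b + eps with eps ~ N(0, L_inv),
# then compare the closed-form joint moments E[y], var[y], cov(x, y)
# against Monte Carlo estimates. All matrices below are illustrative.
rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
Lam_inv = np.array([[1.0, 0.3],
                    [0.3, 2.0]])       # var[x] = Lambda^{-1}
A = np.array([[2.0, 0.0],
              [1.0, 1.0]])
b = np.array([0.5, 0.0])
L_inv = 0.25 * np.eye(2)               # var[eps] = L^{-1}

n = 200_000
x = rng.multivariate_normal(mu, Lam_inv, size=n)
eps = rng.multivariate_normal(np.zeros(2), L_inv, size=n)
y = x @ A.T + b + eps

mean_y = A @ mu + b                    # E[y] = A mu + b
var_y = A @ Lam_inv @ A.T + L_inv      # var[y] = A Lambda^{-1} A^T + L^{-1}
Delta = Lam_inv @ A.T                  # cov(x, y) = Lambda^{-1} A^T

emp = np.cov(np.hstack([x, y]).T)      # empirical 4x4 joint covariance
print(emp[:2, 2:], Delta)              # the two blocks agree (up to noise)
```

The empirical top-right block of the joint covariance matches $\Delta=\Lambda^{-1}A^{T}$, and the bottom-right block matches $A\Lambda^{-1}A^{T}+L^{-1}$, as derived above.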