[Machine Learning Algorithms] Gaussian Discriminant Analysis (GDA)

Gaussian Discriminant Analysis

  Gaussian discriminant analysis (GDA) is a fairly intuitive model. It is a generative model and takes a soft-classification approach: the class of a sample is decided through a probabilistic model rather than by a function that maps the input directly to a class. A generative model obtains $P(y \mid x)$ by modelling the joint probability $P(x, y)$. GDA assumes
$$
\begin{aligned}
y &\sim \mathrm{Bernoulli}(\phi) \\
x \mid y=1 &\sim N(\mu_1,\Sigma) \\
x \mid y=0 &\sim N(\mu_2,\Sigma)
\end{aligned}
$$
  so that
$$
\begin{aligned}
P(y) &= \phi^{y}(1-\phi)^{1-y} \\
P(x \mid y) &= N(\mu_1,\Sigma)^{y} \cdot N(\mu_2,\Sigma)^{1-y}
\end{aligned}
$$
  The model parameters are
$$
\theta=(\mu_1,\mu_2,\Sigma,\phi)
$$
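  The generative assumptions above can be made concrete by sampling from them. The following is a minimal sketch, assuming NumPy is available; the function name `sample_gda` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_gda(n, phi, mu1, mu2, sigma, seed=0):
    """Draw n labelled points from the GDA generative model (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # y ~ Bernoulli(phi)
    y = (rng.random(n) < phi).astype(int)
    # Both classes share one covariance, so draw zero-mean noise once
    # and shift each point by the mean of its own class.
    noise = rng.multivariate_normal(np.zeros(len(mu1)), sigma, size=n)
    means = np.where(y[:, None] == 1, mu1, mu2)
    return noise + means, y

# Hypothetical parameter values, for illustration only.
phi = 0.3
mu1 = np.array([2.0, 2.0])
mu2 = np.array([-1.0, 0.0])
sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
X, y = sample_gda(1000, phi, mu1, mu2, sigma)
```

  Because the covariance is shared, the two classes differ only in where their clouds are centred, not in their shape.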
  For a generative model, the prediction we want to compute is
$$
\hat y=\arg \max_{y \in \{0,1\}}p(y \mid x)=\arg \max_{y \in \{0,1\}}p(y)\,p(x \mid y)
$$
  The second equality follows from Bayes' rule: the denominator $p(x)$ does not depend on $y$, so it can be dropped from the maximization.
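  The decision rule above can be sketched in a few lines, working in log space to avoid underflow. This is only an illustration under the shared-covariance assumption; the names `log_gaussian` and `predict` and the example parameters are hypothetical.

```python
import numpy as np

def log_gaussian(X, mu, sigma):
    """Log density of N(mu, sigma) evaluated at each row of X."""
    p = X.shape[1]
    diff = X - mu
    # Solve sigma @ Z = diff^T instead of forming the inverse explicitly.
    maha = np.einsum("ij,ji->i", diff, np.linalg.solve(sigma, diff.T))
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (p * np.log(2 * np.pi) + logdet + maha)

def predict(X, phi, mu1, mu2, sigma):
    """arg max over y in {0, 1} of log p(y) + log p(x | y)."""
    score1 = np.log(phi) + log_gaussian(X, mu1, sigma)
    score0 = np.log(1 - phi) + log_gaussian(X, mu2, sigma)
    return (score1 > score0).astype(int)

# Hypothetical parameters: each query point sits exactly on one class mean.
mu1 = np.array([2.0, 2.0])
mu2 = np.array([-1.0, 0.0])
X_new = np.array([[2.0, 2.0], [-1.0, 0.0]])
labels = predict(X_new, 0.5, mu1, mu2, np.eye(2))
```

  Comparing log scores gives the same arg max as comparing the products $p(y)\,p(x \mid y)$, since the logarithm is monotonic.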
  Define the log-likelihood $l(\theta)$ over the $N$ training pairs $(x_i, y_i)$, writing $N(\mu_k,\Sigma)$ as shorthand for the Gaussian density evaluated at $x_i$. The maximum-likelihood estimate is then
$$
\begin{aligned}
\hat \theta &=\arg \max_\theta l(\theta) \\
&=\arg \max_\theta \log \prod_{i=1}^N p(x_i,y_i) \\
&=\arg \max_\theta \log \prod_{i=1}^N p(y_i)\,p(x_i \mid y_i) \\
&=\arg \max_\theta \sum_{i=1}^N \left(\log N(\mu_1,\Sigma)^{y_i}+\log N(\mu_2,\Sigma)^{1-y_i}+\log \phi^{y_i}(1-\phi)^{1-y_i}\right)
\end{aligned}
$$
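  As a sanity check on the expression above, the log-likelihood can be evaluated numerically. The sketch below assumes NumPy; the function name `log_likelihood` and the tiny one-dimensional data set are hypothetical.

```python
import numpy as np

def log_likelihood(X, y, phi, mu1, mu2, sigma):
    """Sum over samples of log p(x_i | y_i) + log p(y_i) for the GDA model."""
    n, p = X.shape
    sigma_inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)
    # Each sample is measured against the mean of its own class.
    diff = X - np.where(y[:, None] == 1, mu1, mu2)
    maha = np.einsum("ij,jk,ik->i", diff, sigma_inv, diff)
    log_px_given_y = -0.5 * (p * np.log(2 * np.pi) + logdet + maha)
    log_py = y * np.log(phi) + (1 - y) * np.log(1 - phi)
    return float(np.sum(log_px_given_y + log_py))

# Two 1-D points, each lying exactly on its class mean (illustrative values).
X = np.array([[0.0], [1.0]])
y = np.array([1, 0])
ll = log_likelihood(X, y, 0.5, np.array([0.0]), np.array([1.0]), np.array([[1.0]]))
```

  In this toy case both quadratic terms vanish, so the value reduces to $2\log 0.5 - \log(2\pi)$, which makes the function easy to check by hand.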

  • $\phi$: only the last term of $l(\theta)$ depends on $\phi$. Setting its derivative to zero gives
    $$
    \begin{aligned}
    &\frac{\partial l(\theta)}{\partial \phi}=\sum_{i=1}^N\left(y_i\frac{1}{\phi}-(1-y_i)\frac{1}{1-\phi}\right) = 0 \\
    &\iff \sum_{i=1}^N\left(y_i(1-\phi)-(1-y_i)\phi\right)=0 \\
    &\iff \sum_{i=1}^N(y_i-\phi)=0 \\
    &\iff \sum_{i=1}^Ny_i-N\phi=0 \\
    &\iff \hat \phi =\frac{1}{N}\sum_{i=1}^Ny_i =\frac{N_1}{N}
    \end{aligned}
    $$
    where $N_1=\sum_{i=1}^N y_i$ is the number of samples with $y_i=1$.
  • $\mu_1,\mu_2$
      The two derivations are identical, so we solve for $\mu_1$ only. Keeping just the terms that involve $\mu_1$, the objective reduces to
    $$
    \begin{aligned}
    &\sum_{i=1}^Ny_i\log \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x_i-\mu_1)^T\Sigma^{-1}(x_i-\mu_1)\right) \\
    &=\sum_{i=1}^Ny_i\log \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x_i^T\Sigma^{-1}-\mu_1^T\Sigma^{-1})(x_i-\mu_1)\right)\\
    &=\sum_{i=1}^Ny_i\log \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x_i^T\Sigma^{-1}x_i-2\mu_1^T\Sigma^{-1}x_i+\mu_1^T\Sigma^{-1}\mu_1)\right)
    \end{aligned}
    $$
      The last step uses the symmetry of $\Sigma^{-1}$, so that $x_i^T\Sigma^{-1}\mu_1=\mu_1^T\Sigma^{-1}x_i$. Differentiating with respect to $\mu_1$ and setting the derivative to zero gives
    $$
    \begin{aligned}
    &-\frac{1}{2}\sum_{i=1}^Ny_i(-2\Sigma^{-1}x_i+2\Sigma^{-1}\mu_1)=0 \\
    &\iff \sum_{i=1}^Ny_i(\Sigma^{-1}\mu_1-\Sigma^{-1}x_i)=0 \\
    &\iff \sum_{i=1}^Ny_i(\mu_1-x_i)=0 \\
    &\iff \sum_{i=1}^Ny_i\mu_1=\sum_{i=1}^Ny_ix_i \\
    &\iff \hat \mu_1=\frac{\sum\limits_{i=1}^Ny_ix_i}{\sum\limits_{i=1}^Ny_i}=\frac{\sum\limits_{i=1}^Ny_ix_i}{N_1}
    \end{aligned}
    $$
      By the same argument, with $N_2=N-N_1$ the number of samples with $y_i=0$,
    $$
    \hat \mu_2=\frac{\sum\limits_{i=1}^N(1-y_i)x_i}{\sum\limits_{i=1}^N(1-y_i)}=\frac{\sum\limits_{i=1}^N(1-y_i)x_i}{N_2}
    $$
  • $\Sigma$:
      First simplify the generic term $\sum_{i=1}^N\log N(\mu,\Sigma)$:
    $$
    \begin{aligned}
    \sum_{i=1}^N\log N(\mu,\Sigma)
    &=\sum_{i=1}^N \log \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp \left(-\frac{1}{2}(x_i-\mu)^T\Sigma^{-1}(x_i-\mu)\right) \\
    &=\sum_{i=1}^N\left(\log \frac{1}{(2\pi)^{\frac{p}{2}}}+\log|\Sigma|^{-\frac{1}{2}}-\frac{1}{2}(x_i-\mu)^T\Sigma^{-1}(x_i-\mu)\right) \\
    &=\sum_{i=1}^N\left(C-\frac{1}{2}\log|\Sigma|-\frac{1}{2}(x_i-\mu)^T\Sigma^{-1}(x_i-\mu)\right)\\
    &=C-\frac{1}{2}N\log |\Sigma|-\frac{1}{2}tr\left(\sum_{i=1}^N(x_i-\mu)^T\Sigma^{-1}(x_i-\mu)\right)\\
    &=C-\frac{1}{2}N\log |\Sigma|-\frac{1}{2}tr\left(\sum_{i=1}^N(x_i-\mu)(x_i-\mu)^T\Sigma^{-1}\right)\\
    &=-\frac{1}{2}N\log |\Sigma|-\frac{1}{2}N\,tr(S\Sigma^{-1})+C
    \end{aligned}
    $$
      Here $C$ collects every term that does not depend on $\Sigma$ (its value may change from line to line), and $S=\frac{1}{N}\sum_{i=1}^N(x_i-\mu)(x_i-\mu)^T$ is the sample covariance. The trace appears because a scalar equals its own trace, and the reordering inside it uses $tr(AB)=tr(BA)$.
      Since we only need to solve for $\Sigma$, the log-likelihood reduces to
    $$
    \begin{aligned}
    &\sum_{i=1}^N\left(y_i\log N(\mu_1,\Sigma) +(1-y_i)\log N(\mu_2,\Sigma) \right) \\
    &=\sum_{x_i \in c_1}\log N(\mu_1,\Sigma)+\sum_{x_i \in c_2}\log N(\mu_2,\Sigma) \\
    &=-\frac{1}{2}N_1\log |\Sigma|-\frac{1}{2}N_1tr(S_1\Sigma^{-1})-\frac{1}{2}N_2\log |\Sigma|-\frac{1}{2}N_2tr(S_2\Sigma^{-1})+C \\
    &=-\frac{1}{2}\left(N_1\log |\Sigma|+N_1tr(S_1\Sigma^{-1})+N_2\log |\Sigma|+N_2tr(S_2\Sigma^{-1})\right)+C
    \end{aligned}
    $$
      where $c_1$ and $c_2$ are the sets of samples with $y_i=1$ and $y_i=0$, and $S_k=\frac{1}{N_k}\sum_{x_i\in c_k}(x_i-\mu_k)(x_i-\mu_k)^T$ is the sample covariance of class $c_k$.
      We use the following trace and determinant identities:
    $$
    \begin{aligned}
    &\frac{\partial tr(AB)}{\partial A}=B^{T}\\
    &\frac{\partial |A|}{\partial A}=|A|\cdot A^{-T} \quad (=|A|\cdot A^{-1} \text{ for symmetric } A) \\
    &tr(AB)=tr(BA)
    \end{aligned}
    $$
      For symmetric $\Sigma$ and $S$, these together with $d(\Sigma^{-1})=-\Sigma^{-1}(d\Sigma)\Sigma^{-1}$ give $\frac{\partial \log|\Sigma|}{\partial \Sigma}=\frac{1}{|\Sigma|}|\Sigma|\Sigma^{-1}=\Sigma^{-1}$ and $\frac{\partial tr(S\Sigma^{-1})}{\partial \Sigma}=-\Sigma^{-1}S\Sigma^{-1}$.
      Differentiating the simplified expression with respect to $\Sigma$, using $N=N_1+N_2$, and setting the derivative to zero gives
    $$
    \begin{aligned}
    &-\frac{1}{2}\left(N\frac{1}{|\Sigma|}|\Sigma|\Sigma^{-1}-N_1\Sigma^{-1}S_1\Sigma^{-1}-N_2\Sigma^{-1}S_2\Sigma^{-1}\right)=0 \\
    &\iff N\Sigma^{-1}-N_1\Sigma^{-1}S_1\Sigma^{-1}-N_2\Sigma^{-1}S_2\Sigma^{-1}=0\\
    &\iff N\Sigma-N_1S_1-N_2S_2=0 \\
    &\iff \hat \Sigma =\frac{1}{N}(N_1S_1+N_2S_2)
    \end{aligned}
    $$
      The third line follows by multiplying the second by $\Sigma$ on both the left and the right.
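  The closed-form estimates derived in the list above translate almost directly into code. The following is a minimal sketch, assuming NumPy and labels in {0, 1}; the function name `fit_gda` and the synthetic data are hypothetical and serve only to illustrate the formulas.

```python
import numpy as np

def fit_gda(X, y):
    """Closed-form maximum-likelihood estimates for the shared-covariance model."""
    n = len(y)
    X1, X2 = X[y == 1], X[y == 0]
    n1, n2 = len(X1), len(X2)
    phi = n1 / n                          # phi_hat = N_1 / N
    mu1 = X1.mean(axis=0)                 # mu_1_hat: mean of the y = 1 samples
    mu2 = X2.mean(axis=0)                 # mu_2_hat: mean of the y = 0 samples
    d1, d2 = X1 - mu1, X2 - mu2
    s1 = d1.T @ d1 / n1                   # S_1, normalised by N_1
    s2 = d2.T @ d2 / n2                   # S_2, normalised by N_2
    sigma = (n1 * s1 + n2 * s2) / n       # Sigma_hat = (N_1 S_1 + N_2 S_2) / N
    return phi, mu1, mu2, sigma

# Synthetic two-class data with identity covariance (illustrative values only).
rng = np.random.default_rng(0)
y = (rng.random(500) < 0.4).astype(int)
X = rng.normal(size=(500, 2)) + np.where(y[:, None] == 1, [2.0, 2.0], [-1.0, 0.0])
phi_hat, mu1_hat, mu2_hat, sigma_hat = fit_gda(X, y)
```

  Note that the pooled covariance is a weighted average of the per-class covariances, with weights proportional to the class sizes; no iterative optimization is needed.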