Gaussian Discriminant Analysis (GDA): Deriving the Formulas

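This article works through maximum likelihood estimation for the GDA model. It writes down the log-likelihood, differentiates it, and solves for the unknown parameters ψ, μ0, μ1 and the covariance matrix Σ. Setting the derivative or gradient with respect to each parameter to zero yields closed-form solutions for the class prior ψ, the means of the two Gaussian distributions, and their shared covariance matrix.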

Solution: substitute the probability distributions into the log-likelihood function:

$$
\begin{aligned}
l(\psi,\mu_0,\mu_1,\Sigma)
&=\sum_{i=1}^{m}\log p_{X|Y}\left(x^{(i)}\mid y^{(i)};\mu_0,\mu_1,\Sigma\right)+\sum_{i=1}^{m}\log p_Y\left(y^{(i)};\psi\right)\\
&=\sum_{i=1}^{m}\left(1-y^{(i)}\right)\log\left[\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_0\right)\right)\right]\\
&\quad+\sum_{i=1}^{m}y^{(i)}\log\left[\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}\left(x^{(i)}-\mu_1\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_1\right)\right)\right]\\
&\quad+\sum_{i=1}^{m}\log\left[\psi^{y^{(i)}}(1-\psi)^{1-y^{(i)}}\right]
\end{aligned}
$$
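The log-likelihood can also be evaluated numerically, which is handy for checking the closed-form estimates. The following is a minimal NumPy sketch, not part of the original derivation: the function name `gda_log_likelihood` and the toy data are assumptions made for illustration, with labels taken to be in {0, 1} and Σ shared by both classes.

```python
import numpy as np

def gda_log_likelihood(X, y, psi, mu0, mu1, Sigma):
    """Joint log-likelihood l(psi, mu0, mu1, Sigma) for labels y in {0, 1}."""
    m, n = X.shape
    Sigma_inv = np.linalg.inv(Sigma)
    _, logdet = np.linalg.slogdet(Sigma)          # log|Sigma|
    # Select mu_{y^(i)} for every sample (broadcasts to shape (m, n)).
    mus = np.where(y[:, None] == 1, mu1, mu0)
    diff = X - mus
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), one value per sample.
    quad = np.einsum("ij,jk,ik->i", diff, Sigma_inv, diff)
    log_px_given_y = -0.5 * n * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * quad
    log_py = y * np.log(psi) + (1 - y) * np.log(1 - psi)
    return float(np.sum(log_px_given_y + log_py))

# Hypothetical toy data: three points per class in two dimensions.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
mu0_hat = X[y == 0].mean(axis=0)
mu1_hat = X[y == 1].mean(axis=0)

# With psi and Sigma held fixed, the class sample means should give a
# higher log-likelihood than any shifted means.
ll_at_means = gda_log_likelihood(X, y, 0.5, mu0_hat, mu1_hat, np.eye(2))
ll_shifted = gda_log_likelihood(X, y, 0.5, mu0_hat + 0.5, mu1_hat - 0.5, np.eye(2))
print(ll_at_means > ll_shifted)
```

Using `slogdet` rather than `log(det(...))` is a design choice that avoids overflow or underflow of the determinant in higher dimensions.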

To maximize $l(\psi,\mu_0,\mu_1,\Sigma)$, set the derivative or gradient with respect to each parameter to zero:

$$\frac{\partial}{\partial\psi}l(\psi,\mu_0,\mu_1,\Sigma)=0 \tag{1}$$

$$\nabla_{\mu_0}l(\psi,\mu_0,\mu_1,\Sigma)=0 \tag{2}$$

$$\nabla_{\mu_1}l(\psi,\mu_0,\mu_1,\Sigma)=0 \tag{3}$$

$$\nabla_{\Sigma}l(\psi,\mu_0,\mu_1,\Sigma)=0 \tag{4}$$

For equation (1), only the terms containing ψ matter:

$$\frac{\partial}{\partial\psi}\sum_{i=1}^{m}\left[y^{(i)}\log\psi+\left(1-y^{(i)}\right)\log(1-\psi)\right]=0$$

$$\sum_{i=1}^{m}\left[\frac{y^{(i)}}{\psi}-\frac{1-y^{(i)}}{1-\psi}\right]=0$$

Multiplying through by $\psi(1-\psi)$:

$$\sum_{i=1}^{m}\left[y^{(i)}(1-\psi)-\left(1-y^{(i)}\right)\psi\right]=0$$

$$\sum_{i=1}^{m}y^{(i)}=m\psi$$

$$\psi=\frac{1}{m}\sum_{i=1}^{m}1\{y^{(i)}=1\}$$

For equation (2), only the terms containing $\mu_0$ matter. The constant factor $-\frac{1}{2}$ is dropped because it does not change where the gradient vanishes:

$$\nabla_{\mu_0}\sum_{i=1}^{m}\left(1-y^{(i)}\right)\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_0\right)=0$$

Taking the differential with respect to $\mu_0$ and using the symmetry of $\Sigma^{-1}$:

$$\sum_{i=1}^{m}\left(1-y^{(i)}\right)\left[-d\mu_0^T\,\Sigma^{-1}\left(x^{(i)}-\mu_0\right)-\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}\,d\mu_0\right]=-2\sum_{i=1}^{m}\left(1-y^{(i)}\right)\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}\,d\mu_0$$

so the gradient condition becomes

$$\sum_{i=1}^{m}\left(1-y^{(i)}\right)\Sigma^{-1}\left(x^{(i)}-\mu_0\right)=0$$

Left-multiplying by $\Sigma$:

$$\sum_{i=1}^{m}\left(1-y^{(i)}\right)\left(x^{(i)}-\mu_0\right)=0$$

$$\sum_{i=1}^{m}\left(1-y^{(i)}\right)x^{(i)}=\sum_{i=1}^{m}\left(1-y^{(i)}\right)\mu_0$$

Since $1-y^{(i)}=1\{y^{(i)}=0\}$:

$$\mu_0=\frac{\sum_{i=1}^{m}1\{y^{(i)}=0\}\,x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=0\}}$$

For equation (3), the same argument as for equation (2) gives:

$$\mu_1=\frac{\sum_{i=1}^{m}1\{y^{(i)}=1\}\,x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=1\}}$$
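The estimates of ψ and the two class means are just a label frequency and two per-class averages. As a small illustration, the sketch below evaluates the indicator-sum formulas directly with NumPy; the five data points are hypothetical and chosen so the results are easy to check by hand.

```python
import numpy as np

# Hypothetical toy data: three samples with y = 0 and two with y = 1.
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0],
              [4.0, 4.0], [6.0, 4.0]])
y = np.array([0, 0, 0, 1, 1])
m = len(y)

ind0 = (y == 0).astype(float)   # indicator 1{y^(i) = 0}
ind1 = (y == 1).astype(float)   # indicator 1{y^(i) = 1}

psi = ind1.sum() / m                               # fraction of samples with y = 1
mu0 = (ind0[:, None] * X).sum(axis=0) / ind0.sum() # mean of class-0 samples
mu1 = (ind1[:, None] * X).sum(axis=0) / ind1.sum() # mean of class-1 samples

print(psi)   # 2 of 5 labels are 1, so psi is 0.4
print(mu0)   # class-0 mean is (1, 1)
print(mu1)   # class-1 mean is (5, 4)
```

In practice `X[y == 0].mean(axis=0)` gives the same class mean more concisely; the indicator form is used here only to mirror the formulas.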

For equation (4), only the terms containing $\Sigma$ matter:

$$\nabla_{\Sigma}\left[-\frac{m}{2}\log|\Sigma|-\frac{1}{2}\sum_{i=1}^{m}\left(1-y^{(i)}\right)\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_0\right)-\frac{1}{2}\sum_{i=1}^{m}y^{(i)}\left(x^{(i)}-\mu_1\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_1\right)\right]=0$$

Multiplying by $-2$:

$$\nabla_{\Sigma}\left(m\log|\Sigma|\right)+\nabla_{\Sigma}\sum_{i=1}^{m}\left(1-y^{(i)}\right)\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_0\right)+\nabla_{\Sigma}\sum_{i=1}^{m}y^{(i)}\left(x^{(i)}-\mu_1\right)^T\Sigma^{-1}\left(x^{(i)}-\mu_1\right)=0$$

Let $S_i$ denote the sample covariance matrix of class $i\in\{0,1\}$ and $m_i$ the number of samples in that class:

$$S_i=\frac{1}{m_i}\sum_{k:\,y^{(k)}=i}\left(x^{(k)}-\mu_i\right)\left(x^{(k)}-\mu_i\right)^T,\qquad m_i=\sum_{k=1}^{m}1\{y^{(k)}=i\}$$

$S_i$ simplifies the expression above. A scalar equals its own trace, and $\mathrm{tr}(AB)=\mathrm{tr}(BA)$, so for each class $i$:

$$
\begin{aligned}
\nabla_{\Sigma}\sum_{k:\,y^{(k)}=i}\left(x^{(k)}-\mu_i\right)^T\Sigma^{-1}\left(x^{(k)}-\mu_i\right)
&=\nabla_{\Sigma}\,\mathrm{tr}\left(\sum_{k:\,y^{(k)}=i}\left(x^{(k)}-\mu_i\right)^T\Sigma^{-1}\left(x^{(k)}-\mu_i\right)\right)\\
&=\nabla_{\Sigma}\,\mathrm{tr}\left(\sum_{k:\,y^{(k)}=i}\left(x^{(k)}-\mu_i\right)\left(x^{(k)}-\mu_i\right)^T\Sigma^{-1}\right)\\
&=\nabla_{\Sigma}\,\mathrm{tr}\left(m_iS_i\Sigma^{-1}\right)
\end{aligned}
$$

Because $\Sigma$ and $S_i$ are symmetric, the two gradients are

$$\nabla_{\Sigma}\,\mathrm{tr}\left(m_iS_i\Sigma^{-1}\right)=-m_i\,\Sigma^{-1}S_i\Sigma^{-1}$$

$$\nabla_{\Sigma}\left(m\log|\Sigma|\right)=m\frac{1}{|\Sigma|}|\Sigma|\,\Sigma^{-1}=m\Sigma^{-1}$$
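Matrix-gradient identities like these are easy to get wrong, so a numerical check is useful. The sketch below compares central finite differences against the identities $\nabla_{\Sigma}\log|\Sigma|=\Sigma^{-1}$ and $\nabla_{\Sigma}\,\mathrm{tr}(S\Sigma^{-1})=-\Sigma^{-1}S\Sigma^{-1}$ at a symmetric positive definite $\Sigma$. The helper `numerical_gradient` and the random test matrices are assumptions for illustration, and the entries of the matrix are perturbed independently.

```python
import numpy as np

def numerical_gradient(f, A, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at matrix A."""
    G = np.zeros_like(A)
    for r in range(A.shape[0]):
        for c in range(A.shape[1]):
            E = np.zeros_like(A)
            E[r, c] = eps
            G[r, c] = (f(A + E) - f(A - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))
Sigma = B @ B.T + 3 * np.eye(3)   # symmetric positive definite
C = rng.normal(size=(3, 3))
S = C @ C.T                       # symmetric, plays the role of S_i

grad_logdet = numerical_gradient(lambda A: np.linalg.slogdet(A)[1], Sigma)
grad_trace = numerical_gradient(lambda A: np.trace(S @ np.linalg.inv(A)), Sigma)

Sigma_inv = np.linalg.inv(Sigma)
logdet_ok = np.allclose(grad_logdet, Sigma_inv, atol=1e-5)
trace_ok = np.allclose(grad_trace, -Sigma_inv @ S @ Sigma_inv, atol=1e-5)
print(logdet_ok, trace_ok)
```

The same check shows why the order of the factors matters: $\Sigma^{-1}S\Sigma^{-1}$ generally differs from $S\Sigma^{-2}$ because $S$ and $\Sigma^{-1}$ need not commute.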

Substituting the two gradients, equation (4) simplifies to

$$m\Sigma^{-1}-\sum_{i=0}^{1}m_i\,\Sigma^{-1}S_i\Sigma^{-1}=0$$

Multiplying on the left and on the right by $\Sigma$:

$$\Sigma=\frac{1}{m}\sum_{i=0}^{1}m_iS_i$$

$$\Sigma=\frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu_{y^{(i)}}\right)\left(x^{(i)}-\mu_{y^{(i)}}\right)^T$$
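The two expressions for Σ, the count-weighted average of the per-class covariances and the single sum of outer products around each sample's own class mean, can be compared numerically. This is a minimal sketch on hypothetical toy data; the variable names are illustrative.

```python
import numpy as np

# Hypothetical toy data: three samples with y = 0 and two with y = 1.
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0],
              [4.0, 4.0], [6.0, 4.0]])
y = np.array([0, 0, 0, 1, 1])
m = len(y)

# Row k of mu holds the sample mean of class k.
mu = np.stack([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])

# Form 1: (1/m) * sum_i (x_i - mu_{y_i})(x_i - mu_{y_i})^T.
diff = X - mu[y]                  # subtract each sample's own class mean
Sigma = diff.T @ diff / m

# Form 2: (1/m) * sum_k m_k S_k with per-class covariances S_k.
Sigma_weighted = np.zeros((2, 2))
for k in (0, 1):
    Xk = X[y == k] - mu[k]
    m_k = Xk.shape[0]
    S_k = Xk.T @ Xk / m_k         # class-k covariance, normalized by m_k
    Sigma_weighted += m_k * S_k
Sigma_weighted /= m

print(Sigma)                                  # [[0.8, 0], [0, 1.2]] for this data
print(np.allclose(Sigma, Sigma_weighted))
```

Note that `diff.T @ diff` sums the outer products $(x-\mu)(x-\mu)^T$, giving an n×n matrix; the inner product $(x-\mu)^T(x-\mu)$ would give a scalar instead.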
