Gaussian Discriminant Analysis (GDA)

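This post works through Gaussian discriminant analysis (GDA), a classic generative learning model and supervised classification algorithm. GDA models the joint distribution $P(x,y)$. The post states the distributional assumptions on $y$, $x \mid y=0$ and $x \mid y=1$, then derives the maximum-likelihood estimates of the parameters $\phi$, $\mu_0$, $\mu_1$ and $\Sigma$ by writing out the log-likelihood and setting its partial derivatives to zero.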

Gaussian discriminant analysis (GDA) is a classic generative learning model and a supervised classification algorithm.
Suppose we have a training set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_m,y_m)\}$ with $x_i \in \mathbb{R}^d$ and $y_i \in \{0,1\}$. Like every generative learning algorithm, GDA models the joint probability $P(x,y) = P(x \mid y)\,P(y)$. The model assumes:
$$
\begin{aligned}
y &\sim \mathrm{Bernoulli}(\phi) \\
x \mid y=0 &\sim N(\mu_0,\Sigma) \\
x \mid y=1 &\sim N(\mu_1,\Sigma)
\end{aligned}
$$
The corresponding probability mass and density functions are:
$$
\begin{aligned}
p(y) &= \phi^{y}(1-\phi)^{1-y} \\
p(x \mid y=0) &= \frac{1}{(2\pi)^{\frac d2}|\Sigma|^{\frac12}} \exp\left(-\frac12 (x-\mu_0)^T \Sigma^{-1} (x-\mu_0)\right) \\
p(x \mid y=1) &= \frac{1}{(2\pi)^{\frac d2}|\Sigma|^{\frac12}} \exp\left(-\frac12 (x-\mu_1)^T \Sigma^{-1} (x-\mu_1)\right)
\end{aligned}
$$
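Because GDA is generative, these assumptions fully describe how a labelled point is produced: draw $y$ from the Bernoulli prior, then draw $x$ from the Gaussian of that class. The following is a minimal NumPy sketch of that process. The helper name `sample_gda` and all parameter values are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def sample_gda(m, phi, mu0, mu1, sigma, seed=0):
    """Draw m labelled points from the GDA generative model (illustrative helper)."""
    rng = np.random.default_rng(seed)
    y = rng.binomial(1, phi, size=m)               # y ~ Bernoulli(phi)
    means = np.where(y[:, None] == 1, mu1, mu0)    # class mean per sample, shape (m, d)
    noise = rng.multivariate_normal(np.zeros(len(mu0)), sigma, size=m)  # N(0, Sigma)
    return means + noise, y                        # x | y ~ N(mu_y, Sigma)

# Hypothetical parameters chosen only for demonstration.
mu0 = np.array([0.0, 0.0])
mu1 = np.array([2.0, 2.0])
sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
X, y = sample_gda(1000, phi=0.4, mu0=mu0, mu1=mu1, sigma=sigma)
print(X.shape, y.mean())
```

Both classes share the same `sigma`, which is exactly the shared-covariance assumption of the model.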
The log-likelihood on the training set $D$ is:
$$
\begin{aligned}
l(\phi,\mu_0,\mu_1,\Sigma) &= \log \prod_{i=1}^m P(x_i,y_i;\phi,\mu_0,\mu_1,\Sigma) \\
&= \log \prod_{i=1}^m P(x_i \mid y_i;\mu_0,\mu_1,\Sigma)\,P(y_i;\phi) \\
&= \sum_{i=1}^m \left[ \log P(x_i \mid y_i;\mu_0,\mu_1,\Sigma) + \log P(y_i;\phi) \right] \\
&= \sum_{i=1}^m \left[ \log \left( P(x_i \mid y_i=0;\mu_0,\Sigma)^{1-y_i}\, P(x_i \mid y_i=1;\mu_1,\Sigma)^{y_i} \right) + \log P(y_i;\phi) \right] \\
&= \sum_{i=1}^m \left[ (1-y_i)\log P(x_i \mid y_i=0;\mu_0,\Sigma) + y_i \log P(x_i \mid y_i=1;\mu_1,\Sigma) + \log P(y_i;\phi) \right] \\
&= \sum_{i=1}^m \Big\{ (1-y_i)\left[ -\frac d2 \log 2\pi - \frac12 \log|\Sigma| - \frac12 (x_i-\mu_0)^T \Sigma^{-1} (x_i-\mu_0) \right] \\
&\qquad + y_i\left[ -\frac d2 \log 2\pi - \frac12 \log|\Sigma| - \frac12 (x_i-\mu_1)^T \Sigma^{-1} (x_i-\mu_1) \right] + \log \left( \phi^{y_i}(1-\phi)^{1-y_i} \right) \Big\}
\end{aligned}
$$
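The last line of the log-likelihood can be evaluated directly. The sketch below does so with NumPy; the function name `gda_log_likelihood` and the toy data are assumptions made for illustration.

```python
import numpy as np

def gda_log_likelihood(X, y, phi, mu0, mu1, sigma):
    """Joint log-likelihood l(phi, mu0, mu1, Sigma) of labelled data under GDA."""
    m, d = X.shape
    sigma_inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)           # log|Sigma|, computed stably
    mu = np.where(y[:, None] == 1, mu1, mu0)       # class mean for each sample
    diff = X - mu
    # Quadratic form (x_i - mu_{y_i})^T Sigma^{-1} (x_i - mu_{y_i}) for every i.
    quad = np.einsum("ij,jk,ik->i", diff, sigma_inv, diff)
    log_px_given_y = -0.5 * d * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * quad
    log_py = y * np.log(phi) + (1 - y) * np.log(1 - phi)
    return float(np.sum(log_px_given_y + log_py))

# Tiny hand-made dataset (hypothetical values): each point sits on its class mean.
X = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1])
ll = gda_log_likelihood(X, y, phi=0.5, mu0=np.zeros(2), mu1=np.ones(2), sigma=np.eye(2))
print(ll)
```

In this example the quadratic terms and $\log|\Sigma|$ are all zero, so each point contributes $-\log 2\pi + \log\frac12$ and the total equals $-2\log(4\pi)$.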
Before maximizing the likelihood, we collect a few matrix-calculus identities:

$$
\begin{aligned}
\mathrm{tr}(ABC) &= \mathrm{tr}(CAB) = \mathrm{tr}(BCA) \\
\frac{\partial\, \mathrm{tr}(AX)}{\partial X} &= \frac{\partial\, \mathrm{tr}(XA)}{\partial X} = A^T \\
\frac{\partial (u^T v)}{\partial x} &= \frac{\partial (v^T u)}{\partial x} = \frac{\partial u}{\partial x} v + \frac{\partial v}{\partial x} u \\
\frac{\partial \log|X|}{\partial X} &= \frac{1}{|X|}|X|(X^{-1})^T = (X^{-1})^T \\
\frac{\partial |X|}{\partial X} &= |X|(X^{-1})^T \\
\frac{\partial\, \mathrm{tr}(X^{-1}A)}{\partial X} &= -(X^{-1})^T A^T (X^{-1})^T
\end{aligned}
$$

Here $\frac{\partial u}{\partial x}$ denotes the matrix whose $(j,k)$ entry is $\frac{\partial u_k}{\partial x_j}$, so that $\frac{\partial (Ax)}{\partial x} = A^T$.
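Two of these identities, the gradient of $\log|X|$ and the gradient of $\mathrm{tr}(X^{-1}A)$, carry the covariance derivation, so a numerical sanity check is worthwhile. The sketch below compares them against central finite differences. The helper `numerical_grad` and the random test matrices are assumptions made for this check only.

```python
import numpy as np

def numerical_grad(f, X, eps=1e-6):
    """Central finite-difference gradient of a scalar function f with respect to matrix X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
# Non-symmetric, strictly diagonally dominant test matrix, so det(X) > 0.
X = rng.uniform(-0.5, 0.5, size=(3, 3)) + 5 * np.eye(3)
A = rng.normal(size=(3, 3))
X_inv_T = np.linalg.inv(X).T

grad_logdet = numerical_grad(lambda M: np.log(np.linalg.det(M)), X)
grad_trace = numerical_grad(lambda M: np.trace(np.linalg.inv(M) @ A), X)

logdet_ok = np.allclose(grad_logdet, X_inv_T, atol=1e-5)
trace_ok = np.allclose(grad_trace, -X_inv_T @ A.T @ X_inv_T, atol=1e-5)
print(logdet_ok, trace_ok)
```

The test matrix is deliberately non-symmetric so that a missing transpose in either identity would show up as a mismatch.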

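We estimate each parameter by maximum likelihood, setting the corresponding partial derivative of the log-likelihood to zero: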
$$
\begin{aligned}
\frac{\partial l(\phi,\mu_0,\mu_1,\Sigma)}{\partial \phi} &= \frac{\partial}{\partial \phi} \sum_{i=1}^m \log \left( \phi^{y_i}(1-\phi)^{1-y_i} \right) \\
&= \frac{\partial}{\partial \phi} \sum_{i=1}^m \left[ y_i \log\phi + (1-y_i)\log(1-\phi) \right] \\
&= \sum_{i=1}^m \left( \frac{y_i}{\phi} - \frac{1-y_i}{1-\phi} \right) = \sum_{i=1}^m \frac{y_i-\phi}{\phi(1-\phi)} = 0 \\
&\Rightarrow \sum_{i=1}^m (y_i - \phi) = 0 \Rightarrow \sum_{i=1}^m y_i = \sum_{i=1}^m \phi = m\phi \\
\phi &= \frac{\sum_{i=1}^m I(y_i=1)}{m}
\end{aligned}
$$
$$
\begin{aligned}
\frac{\partial l(\phi,\mu_0,\mu_1,\Sigma)}{\partial \mu_0} &= \frac{\partial}{\partial \mu_0} \sum_{i=1}^m (1-y_i)\left[ -\frac12 (x_i-\mu_0)^T \Sigma^{-1} (x_i-\mu_0) \right] \\
&= \sum_{i=1}^m -\frac12 (1-y_i) \left[ \frac{\partial (x_i-\mu_0)}{\partial \mu_0} \Sigma^{-1}(x_i-\mu_0) + \frac{\partial\, \Sigma^{-1}(x_i-\mu_0)}{\partial \mu_0} (x_i-\mu_0) \right] \\
&= \sum_{i=1}^m -\frac12 (1-y_i) \left[ -\Sigma^{-1}(x_i-\mu_0) - (\Sigma^{-1})^T (x_i-\mu_0) \right] \\
&= \sum_{i=1}^m (1-y_i)\, \Sigma^{-1}(x_i-\mu_0) = 0 \\
&\Rightarrow \sum_{i=1}^m (1-y_i)\, \Sigma\Sigma^{-1}(x_i-\mu_0) = 0 \Rightarrow \sum_{i=1}^m (1-y_i)(x_i-\mu_0) = 0 \\
\mu_0 &= \frac{\sum_{i=1}^m I(y_i=0)\, x_i}{\sum_{i=1}^m I(y_i=0)}
\end{aligned}
$$

The fourth line uses the symmetry of $\Sigma^{-1}$. Solving the last condition for $\mu_0$ divides by $\sum_{i=1}^m (1-y_i)$, the number of class-0 samples, not by $m$.
$$
\begin{aligned}
\frac{\partial l(\phi,\mu_0,\mu_1,\Sigma)}{\partial \mu_1} &= \frac{\partial}{\partial \mu_1} \sum_{i=1}^m y_i\left[ -\frac12 (x_i-\mu_1)^T \Sigma^{-1} (x_i-\mu_1) \right] \\
&= \sum_{i=1}^m -\frac12 y_i \left[ \frac{\partial (x_i-\mu_1)}{\partial \mu_1} \Sigma^{-1}(x_i-\mu_1) + \frac{\partial\, \Sigma^{-1}(x_i-\mu_1)}{\partial \mu_1} (x_i-\mu_1) \right] \\
&= \sum_{i=1}^m -\frac12 y_i \left[ -\Sigma^{-1}(x_i-\mu_1) - (\Sigma^{-1})^T (x_i-\mu_1) \right] \\
&= \sum_{i=1}^m y_i\, \Sigma^{-1}(x_i-\mu_1) = 0 \\
&\Rightarrow \sum_{i=1}^m y_i\, \Sigma\Sigma^{-1}(x_i-\mu_1) = 0 \Rightarrow \sum_{i=1}^m y_i (x_i-\mu_1) = 0 \\
\mu_1 &= \frac{\sum_{i=1}^m I(y_i=1)\, x_i}{\sum_{i=1}^m I(y_i=1)}
\end{aligned}
$$
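Solving the stationarity conditions $\sum_i (1-y_i)(x_i-\mu_0)=0$ and $\sum_i y_i(x_i-\mu_1)=0$ gives the per-class sample means, that is, the sum over a class divided by the number of points in that class, while $\phi$ is the fraction of positive labels. The following sketch checks this on a small hand-made dataset; the data values are hypothetical.

```python
import numpy as np

# Hypothetical toy data: three class-0 points and two class-1 points in R^2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 3.0], [4.0, 4.0], [6.0, 8.0]])
y = np.array([0, 0, 0, 1, 1])

phi = np.mean(y == 1)            # fraction of positive labels
mu0 = X[y == 0].mean(axis=0)     # average of the class-0 points
mu1 = X[y == 1].mean(axis=0)     # average of the class-1 points

# Stationarity conditions from the derivation: both sums should vanish.
residual0 = ((1 - y)[:, None] * (X - mu0)).sum(axis=0)
residual1 = (y[:, None] * (X - mu1)).sum(axis=0)

print(phi, mu0, mu1)
print(residual0, residual1)
```

Here $\phi = 2/5$, $\mu_0 = (1, 1)$ and $\mu_1 = (5, 6)$, and both residual sums are zero. Dividing the class sums by the total sample count $m = 5$ instead would not satisfy these conditions.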

$$
\begin{aligned}
\frac{\partial l(\phi,\mu_0,\mu_1,\Sigma)}{\partial \Sigma} &= \frac{\partial}{\partial \Sigma} \sum_{i=1}^m y_i\left[ -\frac12 (x_i-\mu_1)^T \Sigma^{-1} (x_i-\mu_1) \right] + \frac{\partial}{\partial \Sigma} \sum_{i=1}^m (1-y_i)\left[ -\frac12 (x_i-\mu_0)^T \Sigma^{-1} (x_i-\mu_0) \right] + \frac{\partial}{\partial \Sigma} \sum_{i=1}^m \left( -\frac12 \log|\Sigma| \right) \\
&= \sum_{i=1}^m \left\{ y_i \frac{\partial\, \mathrm{tr}\left[ -\frac12 (x_i-\mu_1)^T \Sigma^{-1} (x_i-\mu_1) \right]}{\partial \Sigma} + (1-y_i) \frac{\partial\, \mathrm{tr}\left[ -\frac12 (x_i-\mu_0)^T \Sigma^{-1} (x_i-\mu_0) \right]}{\partial \Sigma} \right\} + \sum_{i=1}^m \frac{\partial \left( -\frac12 \log|\Sigma| \right)}{\partial \Sigma} \\
&= \sum_{i=1}^m \left\{ -\frac12 y_i \frac{\partial\, \mathrm{tr}\left[ \Sigma^{-1} (x_i-\mu_1)(x_i-\mu_1)^T \right]}{\partial \Sigma} - \frac12 (1-y_i) \frac{\partial\, \mathrm{tr}\left[ \Sigma^{-1} (x_i-\mu_0)(x_i-\mu_0)^T \right]}{\partial \Sigma} \right\} + \sum_{i=1}^m -\frac12 \frac{1}{|\Sigma|}|\Sigma| (\Sigma^{-1})^T \\
&= \sum_{i=1}^m \left\{ -\frac12 y_i \left[ -(\Sigma^{-1})^T \left( (x_i-\mu_1)(x_i-\mu_1)^T \right)^T (\Sigma^{-1})^T \right] - \frac12 (1-y_i) \left[ -(\Sigma^{-1})^T \left( (x_i-\mu_0)(x_i-\mu_0)^T \right)^T (\Sigma^{-1})^T \right] \right\} - \frac12 m (\Sigma^{-1})^T \\
&= \sum_{i=1}^m \left\{ -\frac12 y_i \left[ -\Sigma^{-1} (x_i-\mu_1)(x_i-\mu_1)^T \Sigma^{-1} \right] - \frac12 (1-y_i) \left[ -\Sigma^{-1} (x_i-\mu_0)(x_i-\mu_0)^T \Sigma^{-1} \right] \right\} - \frac12 m \Sigma^{-1} = 0 \\
&\Rightarrow \sum_{i=1}^m \left\{ -\frac12 y_i \left[ -\Sigma\Sigma^{-1} (x_i-\mu_1)(x_i-\mu_1)^T \Sigma^{-1}\Sigma \right] - \frac12 (1-y_i) \left[ -\Sigma\Sigma^{-1} (x_i-\mu_0)(x_i-\mu_0)^T \Sigma^{-1}\Sigma \right] \right\} - \frac12 m\, \Sigma\Sigma^{-1}\Sigma = 0 \\
&\Rightarrow \sum_{i=1}^m \left\{ -\frac12 y_i \left[ -(x_i-\mu_1)(x_i-\mu_1)^T \right] - \frac12 (1-y_i) \left[ -(x_i-\mu_0)(x_i-\mu_0)^T \right] \right\} - \frac12 m \Sigma = 0 \\
&\Rightarrow \sum_{i=1}^m \left\{ y_i \left[ (x_i-\mu_1)(x_i-\mu_1)^T \right] + (1-y_i) \left[ (x_i-\mu_0)(x_i-\mu_0)^T \right] \right\} - m\Sigma = 0 \\
\Sigma &= \frac{\sum_{i=1}^m \left\{ y_i \left[ (x_i-\mu_1)(x_i-\mu_1)^T \right] + (1-y_i) \left[ (x_i-\mu_0)(x_i-\mu_0)^T \right] \right\}}{m}
\end{aligned}
$$
In summary, the maximum-likelihood estimates are:
$$
\begin{aligned}
\phi &= \frac{\sum_{i=1}^m I(y_i=1)}{m} \\
\mu_0 &= \frac{\sum_{i=1}^m I(y_i=0)\, x_i}{\sum_{i=1}^m I(y_i=0)} \\
\mu_1 &= \frac{\sum_{i=1}^m I(y_i=1)\, x_i}{\sum_{i=1}^m I(y_i=1)} \\
\Sigma &= \frac{\sum_{i=1}^m \left\{ y_i \left[ (x_i-\mu_1)(x_i-\mu_1)^T \right] + (1-y_i) \left[ (x_i-\mu_0)(x_i-\mu_0)^T \right] \right\}}{m}
\end{aligned}
$$
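The closed-form estimates translate almost line by line into NumPy. The sketch below is one possible implementation, not a reference one. The function names `fit_gda` and `predict_gda`, the synthetic data, and its parameters are all assumptions. The prediction step is not derived in this post: it simply applies Bayes' rule, choosing the class with the larger $\log p(x \mid y) + \log p(y)$.

```python
import numpy as np

def fit_gda(X, y):
    """Closed-form maximum-likelihood estimates for GDA with a shared covariance."""
    m = X.shape[0]
    phi = np.mean(y == 1)
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    centered = X - np.where(y[:, None] == 1, mu1, mu0)   # x_i - mu_{y_i}
    sigma = centered.T @ centered / m                    # pooled covariance, divided by m
    return phi, mu0, mu1, sigma

def predict_gda(X, phi, mu0, mu1, sigma):
    """Pick the class with the larger log p(x|y) + log p(y) (assumed Bayes-rule step)."""
    sigma_inv = np.linalg.inv(sigma)

    def score(mu, prior):
        diff = X - mu
        # The (2*pi) and |Sigma| terms are shared by both classes and cancel.
        return -0.5 * np.einsum("ij,jk,ik->i", diff, sigma_inv, diff) + np.log(prior)

    return (score(mu1, phi) > score(mu0, 1 - phi)).astype(int)

# Synthetic data with hypothetical parameters, used only to exercise the functions.
rng = np.random.default_rng(0)
m = 2000
y = rng.binomial(1, 0.3, size=m)
true_mu0, true_mu1 = np.array([0.0, 0.0]), np.array([3.0, 3.0])
true_sigma = np.array([[1.0, 0.2], [0.2, 1.0]])
noise = rng.multivariate_normal(np.zeros(2), true_sigma, size=m)
X = np.where(y[:, None] == 1, true_mu1, true_mu0) + noise

phi, mu0, mu1, sigma = fit_gda(X, y)
accuracy = np.mean(predict_gda(X, phi, mu0, mu1, sigma) == y)
print(phi, mu0, mu1, sigma, accuracy, sep="\n")
```

With a few thousand samples the estimates should land close to the generating parameters. Note that `sigma` is divided by `m`, matching the maximum-likelihood formula, rather than by `m - 2` as an unbiased pooled estimator would be.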
