Deriving the Gradient Descent Update for Logistic Regression

The cost function for logistic regression:

$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_{\theta}\left(x^{(i)}\right)+\left(1-y^{(i)}\right)\log\left(1-h_{\theta}\left(x^{(i)}\right)\right)\right]$$

Note that the log here is the natural logarithm (base $e$).
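As a sanity check, the cost function above is straightforward to compute with NumPy. This is just an illustrative sketch; the helper names `sigmoid` and `cost` are my own, not from the post:

```python
import numpy as np

def sigmoid(z):
    """h_theta(x) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Cross-entropy cost J(theta); X is (m, n), labels y are in {0, 1}."""
    h = sigmoid(X @ theta)
    # Mean over the m samples of the bracketed term, negated
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
```

With `theta = 0` every prediction is $h_\theta(x)=0.5$, so the cost is exactly $\log 2$ regardless of the labels, which makes a handy spot check.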

The gradient descent update for logistic regression:

$$\theta_{j}:=\theta_{j}-\alpha\frac{\partial}{\partial\theta_{j}}J(\theta)$$

After taking the partial derivative, the update becomes:

$$\theta_{j}:=\theta_{j}-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right)x_{j}^{(i)}$$
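The update above can be sketched as a small NumPy loop. This is only an illustration; the learning rate, iteration count, and dataset in the test are my own choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Repeat theta_j := theta_j - alpha * (1/m) * sum_i (h - y) * x_j."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)        # h_theta(x^(i)) for all i at once
        grad = (X.T @ (h - y)) / m    # (1/m) * sum_i (h - y) * x_j^(i)
        theta -= alpha * grad
    return theta
```

On a small linearly separable dataset (with $x_0=1$ as the bias column), the learned `theta` separates the two classes after a few thousand iterations.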

I kept wondering: why does the gradient descent update for logistic regression look exactly the same as the one for linear regression?

After puzzling over it for a while, I decided I had to work through the derivation myself, and having done so, it turns out that it really is the same. Here is the derivation:

$$z=\theta_0x_0+\theta_1x_1+\cdots+\theta_nx_n$$

$$h_\theta(x)=\operatorname{sigmoid}(z)=\frac{1}{1+e^{-z}}$$

Then, by the chain rule (the $(i)$ superscripts are omitted inside the sums below for brevity):

$$\frac{\partial J(\theta)}{\partial\theta_0}=\frac{dJ(\theta)}{dh_\theta(x)}\cdot\frac{dh_\theta(x)}{dz}\cdot\frac{\partial z}{\partial\theta_0}$$

$$\frac{dJ(\theta)}{dh_\theta(x)}=-\frac{1}{m}\sum_{i=1}^{m}\left[y\cdot\frac{1}{h_\theta(x)}+(-1)(1-y)\frac{1}{1-h_\theta(x)}\right]$$

$$\frac{dh_\theta(x)}{dz}=\left(\frac{1}{1+e^{-z}}\right)'=\left(\frac{e^{z}}{1+e^{z}}\right)'=\frac{e^{z}\left(1+e^{z}\right)-\left(e^{z}\right)^{2}}{\left(1+e^{z}\right)^{2}}=\frac{e^{z}}{\left(1+e^{z}\right)^{2}}=\frac{1}{e^{z}+2+e^{-z}}$$

$$=\frac{e^{-z}}{\left(e^{-z}\right)^{2}+2e^{-z}+1}=\frac{e^{-z}}{\left(e^{-z}+1\right)^{2}}=\frac{e^{-z}}{e^{-z}+1}\cdot\frac{1}{e^{-z}+1}$$

$$=\left(1-\frac{1}{e^{-z}+1}\right)\cdot\frac{1}{e^{-z}+1}=\left(1-h_\theta(x)\right)h_\theta(x)$$
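The identity $h'(z)=(1-h(z))\,h(z)$ is easy to confirm numerically with a central finite difference. A quick sketch (not part of the original derivation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Analytic derivative from the derivation: (1 - h(z)) * h(z)."""
    h = sigmoid(z)
    return (1.0 - h) * h

def numeric_prime(z, eps=1e-6):
    """Central finite-difference approximation of d sigmoid / dz."""
    return (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
```

The two agree to well within the $O(\varepsilon^2)$ error of the central difference at any $z$.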

$$\frac{\partial z}{\partial\theta_0}=x_0$$

Therefore,

$$\frac{\partial J(\theta)}{\partial\theta_0}=\frac{dJ(\theta)}{dh_\theta(x)}\cdot\frac{dh_\theta(x)}{dz}\cdot\frac{\partial z}{\partial\theta_0}=x_0\cdot\left[\left(1-h_\theta(x)\right)\cdot h_\theta(x)\right]\cdot\left(-\frac{1}{m}\right)\sum_{i=1}^{m}\left[y\cdot\frac{1}{h_\theta(x)}+(-1)(1-y)\frac{1}{1-h_\theta(x)}\right]$$

$$=x_0\left(-\frac{1}{m}\right)\sum_{i=1}^{m}\left[y\left(1-h_\theta(x)\right)+(-1)(1-y)h_\theta(x)\right]$$

$$=x_0\left(-\frac{1}{m}\right)\sum_{i=1}^{m}\left(y-h_\theta(x)\right)$$

$$=x_0\left(\frac{1}{m}\right)\sum_{i=1}^{m}\left(h_\theta(x)-y\right)$$

For the other features $x_1, x_2, \cdots$, the same reasoning applies.
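The final result — $\frac{\partial J}{\partial\theta_j}=\frac{1}{m}\sum_i\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$ for every $j$ — can also be checked against a numerical gradient of $J(\theta)$. A sketch with made-up random data:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def analytic_grad(theta, X, y):
    """The derived gradient: (1/m) * sum_i (h - y) * x_j, all j at once."""
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

def numeric_grad(theta, X, y, eps=1e-6):
    """Central finite differences of J(theta), one coordinate at a time."""
    g = np.zeros_like(theta)
    for j in range(len(theta)):
        e = np.zeros_like(theta)
        e[j] = eps
        g[j] = (cost(theta + e, X, y) - cost(theta - e, X, y)) / (2 * eps)
    return g
```

If the derivation is right, the two gradients agree to finite-difference precision at any $\theta$, which is exactly what happens.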



(End)