Vectorizing Gradient Descent for Logistic Regression (Detailed Derivation)

The gradient descent formula for logistic regression

The per-parameter update rule is:

$$\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta\!\left(x^{(i)}\right)-y^{(i)}\right)x_j^{(i)}$$

where:
$$h_\theta\!\left(x^{(i)}\right) = g\!\left(\theta^T x^{(i)}\right) = \frac{1}{1+e^{-\theta^T x^{(i)}}}$$
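Before vectorizing, it may help to see the element-wise update as code. A minimal NumPy sketch (the function and variable names here are illustrative, not from the original):

```python
import numpy as np

def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def step_elementwise(theta, X, y, alpha):
    # One gradient-descent step computed feature by feature.
    # X is m x (n+1) with the bias column x_0 = 1; y has length m.
    m, n_plus_1 = X.shape
    theta_new = theta.copy()  # all theta_j must be updated simultaneously
    for j in range(n_plus_1):
        total = 0.0
        for i in range(m):
            h_i = sigmoid(X[i] @ theta)       # h_theta(x^(i)), using the old theta
            total += (h_i - y[i]) * X[i, j]   # (h_theta(x^(i)) - y^(i)) * x_j^(i)
        theta_new[j] = theta[j] - alpha * total / m
    return theta_new
```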

The vectorized form is:

$$\theta := \theta - \frac{\alpha}{m}\,X^T\!\left(g(X\theta)-\vec{y}\,\right)$$

where:

$$\vec{y}=\begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{pmatrix} \qquad \theta=\begin{pmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{pmatrix} \qquad X=\begin{bmatrix} x_0^{(1)} & x_1^{(1)} & \cdots & x_n^{(1)} \\ x_0^{(2)} & x_1^{(2)} & \cdots & x_n^{(2)} \\ \vdots & & & \vdots \\ x_0^{(m)} & x_1^{(m)} & \cdots & x_n^{(m)} \end{bmatrix}_{m\times(n+1)}$$

$$X\theta=\begin{bmatrix} \theta_0 x_0^{(1)}+\theta_1 x_1^{(1)}+\theta_2 x_2^{(1)}+\cdots+\theta_n x_n^{(1)} \\ \theta_0 x_0^{(2)}+\theta_1 x_1^{(2)}+\theta_2 x_2^{(2)}+\cdots+\theta_n x_n^{(2)} \\ \vdots \\ \theta_0 x_0^{(m)}+\theta_1 x_1^{(m)}+\theta_2 x_2^{(m)}+\cdots+\theta_n x_n^{(m)} \end{bmatrix} \qquad g(X\theta)=\begin{bmatrix} h_\theta\!\left(x^{(1)}\right) \\ h_\theta\!\left(x^{(2)}\right) \\ \vdots \\ h_\theta\!\left(x^{(m)}\right) \end{bmatrix}$$
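To make the shapes concrete, here is a minimal sketch of building $X$ with the bias column $x_0 = 1$ and evaluating $g(X\theta)$ for all $m$ examples at once (the toy values are made up for illustration):

```python
import numpy as np

# Toy data: m = 4 examples, n = 2 features (values made up for illustration).
X_raw = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [0.0, 1.5],
                  [3.0, 1.0]])
y = np.array([0.0, 1.0, 0.0, 1.0])

# Prepend the bias column x_0 = 1 to form the m x (n+1) design matrix X.
X = np.hstack([np.ones((X_raw.shape[0], 1)), X_raw])

theta = np.zeros(X.shape[1])      # (theta_0, theta_1, ..., theta_n)

z = X @ theta                     # X theta: one theta^T x^(i) per row
h = 1.0 / (1.0 + np.exp(-z))      # g(X theta): all m predictions at once
print(X.shape, z.shape, h.shape)  # (4, 3) (4,) (4,)
```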

The vectorization, step by step

$$\begin{aligned}
&\sum_{i=1}^{m}\left(h_\theta\!\left(x^{(i)}\right)-y^{(i)}\right)x_j^{(i)} \\
={}&\left[h_\theta\!\left(x^{(1)}\right)-y^{(1)}\right]x_j^{(1)}+\left[h_\theta\!\left(x^{(2)}\right)-y^{(2)}\right]x_j^{(2)}+\cdots+\left[h_\theta\!\left(x^{(m)}\right)-y^{(m)}\right]x_j^{(m)} \\
={}&\left(x_j^{(1)}, x_j^{(2)}, \cdots, x_j^{(m)}\right)\cdot\begin{pmatrix} h_\theta\!\left(x^{(1)}\right)-y^{(1)} \\ h_\theta\!\left(x^{(2)}\right)-y^{(2)} \\ \vdots \\ h_\theta\!\left(x^{(m)}\right)-y^{(m)} \end{pmatrix} \\
={}&\left(x_j^{(1)}, x_j^{(2)}, \cdots, x_j^{(m)}\right)\cdot\left[\begin{pmatrix} h_\theta\!\left(x^{(1)}\right) \\ h_\theta\!\left(x^{(2)}\right) \\ \vdots \\ h_\theta\!\left(x^{(m)}\right) \end{pmatrix}-\begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{pmatrix}\right] \\
={}&\; x_j\cdot\left[g(X\theta)-\vec{y}\,\right]
\end{aligned}$$

Here $x_j = \left(x_j^{(1)}, x_j^{(2)}, \cdots, x_j^{(m)}\right)$ is the $j$-th feature collected across all $m$ examples, i.e. the transpose of the $j$-th column of $X$.
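Before moving on, this identity is easy to check numerically; a minimal sketch on random data (all names and values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])  # bias column x_0 = 1
y = rng.integers(0, 2, size=m).astype(float)
theta = rng.normal(size=n + 1)

h = 1.0 / (1.0 + np.exp(-(X @ theta)))                     # g(X theta)

j = 2  # any feature index
loop_sum = sum((h[i] - y[i]) * X[i, j] for i in range(m))  # the summation form
vector_form = X[:, j] @ (h - y)                            # x_j . (g(X theta) - y)

assert np.isclose(loop_sum, vector_form)
```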

Hence, for each $j$:
$$\theta_j := \theta_j - \frac{\alpha}{m}\,x_j\left[g(X\theta)-\vec{y}\,\right]$$

$$\begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} := \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} - \frac{\alpha}{m}\begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix}\left[g(X\theta)-\vec{y}\,\right]$$

The matrix whose rows are $x_0, x_1, \cdots, x_n$ is exactly $X^T$, so we finally obtain:

$$\theta := \theta - \frac{\alpha}{m}\,X^T\!\left(g(X\theta)-\vec{y}\,\right)$$
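Putting it together, the whole training loop reduces to two matrix products per step. A minimal sketch, assuming a design matrix `X` that already includes the bias column (the function name and defaults are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    # Repeated vectorized update: theta := theta - (alpha/m) X^T (g(X theta) - y).
    # X: m x (n+1) design matrix with x_0 = 1; y: length-m label vector in {0, 1}.
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m  # (1/m) X^T (g(X theta) - y)
        theta -= alpha * grad                      # simultaneous update of all theta_j
    return theta
```

Note that this updates every $\theta_j$ simultaneously with no explicit loop over examples or features, which is exactly what the vectorized formula buys us.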
