Computing Backpropagation: A BP Neural Network by Hand


Personal homepage: https://yang1he.gitee.io
More useful content is on the way; feel free to visit.


Computing Backpropagation: A BP Neural Network by Hand (1)

For fundamentals, working things out by hand is the right attitude. The concept of backpropagation is not hard to grasp, but the mechanics of the computation are easy to forget. This article takes a two-layer neural network and walks through its backpropagation step by step.

Problem Description

[Figure: a two-layer network with one input, one hidden unit, and one output unit; sigmoid activations in both layers. From the computation below: input $x=1$, target $y=0.8$, initial parameters $w_h=0.2$, $b_h=0.1$, $w_o=0.3$, $b_o=0.2$.]

Method

The computation breaks into four steps:

  1. Forward pass: compute the prediction
  2. Compute the error
  3. Compute the gradients via the chain rule
  4. Backpropagate the error to update the parameters

  1. Forward pass: compute the prediction

$$
\begin{aligned}
z_h &= w_h x + b_h = 0.2 \times 1 + 0.1 = 0.3\\
y_h &= \frac{1}{1+e^{-z_h}} = \frac{1}{1+e^{-0.3}} \approx 0.57\\
z_o &= w_o y_h + b_o = 0.3 \times 0.57 + 0.2 = 0.371\\
y_o &= \frac{1}{1+e^{-z_o}} = \frac{1}{1+e^{-0.371}} \approx 0.59
\end{aligned}
$$
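The forward pass can be sketched in a few lines of Python (a minimal sketch; the variable names are my own, and since the article rounds intermediates to two decimal places, the code yields slightly more precise values):

```python
import math

def sigmoid(z):
    """Logistic activation used in both layers."""
    return 1.0 / (1.0 + math.exp(-z))

# Initial parameters and data from the worked example
x, y = 1.0, 0.8        # input and target
w_h, b_h = 0.2, 0.1    # hidden-layer weight and bias
w_o, b_o = 0.3, 0.2    # output-layer weight and bias

z_h = w_h * x + b_h    # 0.3
y_h = sigmoid(z_h)     # ~0.574 (the article rounds to 0.57)
z_o = w_o * y_h + b_o  # ~0.372
y_o = sigmoid(z_o)     # ~0.592 (the article rounds to 0.59)
```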

  2. Compute the error

Here we use the most common squared-error loss:
$$Loss=\frac{1}{2}(y-y_o)^2=\frac{1}{2}(0.8-0.59)^2=0.02205$$

  3. Compute the gradients via the chain rule

Take $w_o$ as an example:
$$\frac{\partial Loss}{\partial w_o}=\frac{\partial Loss}{\partial y_o} \cdot \frac{\partial y_o}{\partial z_o} \cdot \frac{\partial z_o}{\partial w_o}=-0.21 \times 0.2419 \times 0.57=-0.02895543$$

| Formula | Derivative |
| --- | --- |
| $Loss=\frac{1}{2}(y-y_o)^2$ | $\frac{\partial Loss}{\partial y_o}=-2 \times \frac{1}{2} \times (y-y_o)=-(0.8-0.59)=-0.21$ |
| $y_o=\frac{1}{1+e^{-z_o}}$ | $\frac{\partial y_o}{\partial z_o}=y_o(1-y_o)=0.59 \times (1-0.59)=0.2419$ |
| $z_o=w_o y_h+b_o$ | $\frac{\partial z_o}{\partial w_o}=y_h=0.57$ |

Next, take $w_h$ as an example:
$$\frac{\partial Loss}{\partial w_h}=\frac{\partial Loss}{\partial y_o} \cdot \frac{\partial y_o}{\partial z_o} \cdot \frac{\partial z_o}{\partial y_h} \cdot \frac{\partial y_h}{\partial z_h} \cdot \frac{\partial z_h}{\partial w_h}=-0.21 \times 0.2419 \times 0.3 \times 0.2451 \times 1=-0.00373525$$
Expanding the product above, its five factors are listed in the table below:

| Formula | Derivative |
| --- | --- |
| $Loss=\frac{1}{2}(y-y_o)^2$ | $\frac{\partial Loss}{\partial y_o}=-2 \times \frac{1}{2} \times (y-y_o)=-(0.8-0.59)=-0.21$ |
| $y_o=\frac{1}{1+e^{-z_o}}$ | $\frac{\partial y_o}{\partial z_o}=y_o(1-y_o)=0.59 \times (1-0.59)=0.2419$ |
| $z_o=w_o y_h+b_o$ | $\frac{\partial z_o}{\partial y_h}=w_o=0.3$ |
| $y_h=\frac{1}{1+e^{-z_h}}$ | $\frac{\partial y_h}{\partial z_h}=y_h(1-y_h)=0.57 \times (1-0.57)=0.2451$ |
| $z_h=w_h x+b_h$ | $\frac{\partial z_h}{\partial w_h}=x=1$ |

By this point you have probably got the idea, but to reinforce it, here is one more example with $b_h$:

$$\frac{\partial Loss}{\partial b_h}=\frac{\partial Loss}{\partial y_o} \cdot \frac{\partial y_o}{\partial z_o} \cdot \frac{\partial z_o}{\partial y_h} \cdot \frac{\partial y_h}{\partial z_h} \cdot \frac{\partial z_h}{\partial b_h}=-0.21 \times 0.2419 \times 0.3 \times 0.2451 \times 1=-0.00373525$$

| Formula | Derivative |
| --- | --- |
| $Loss=\frac{1}{2}(y-y_o)^2$ | $\frac{\partial Loss}{\partial y_o}=-2 \times \frac{1}{2} \times (y-y_o)=-(0.8-0.59)=-0.21$ |
| $y_o=\frac{1}{1+e^{-z_o}}$ | $\frac{\partial y_o}{\partial z_o}=y_o(1-y_o)=0.59 \times (1-0.59)=0.2419$ |
| $z_o=w_o y_h+b_o$ | $\frac{\partial z_o}{\partial y_h}=w_o=0.3$ |
| $y_h=\frac{1}{1+e^{-z_h}}$ | $\frac{\partial y_h}{\partial z_h}=y_h(1-y_h)=0.57 \times (1-0.57)=0.2451$ |
| $z_h=w_h x+b_h$ | $\frac{\partial z_h}{\partial b_h}=1$ |
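The chain-rule products in the tables can be checked numerically. Below is a sketch using the article's two-decimal rounded activations; `grad_bo` is not tabulated in the article, but it follows the same chain, simply without the $\partial z_o/\partial w_o$ factor:

```python
# Chain-rule gradients, using the rounded activations from the tables
x, y = 1.0, 0.8          # input and target
w_o = 0.3                # output-layer weight
y_h, y_o = 0.57, 0.59    # forward-pass activations, rounded to 2 decimals

dL_dyo  = -(y - y_o)           # dLoss/dy_o = -0.21
dyo_dzo = y_o * (1.0 - y_o)    # sigmoid derivative at the output: 0.2419
dyh_dzh = y_h * (1.0 - y_h)    # sigmoid derivative at the hidden unit: 0.2451

grad_wo = dL_dyo * dyo_dzo * y_h                # dLoss/dw_o ~ -0.02895543
grad_bo = dL_dyo * dyo_dzo                      # dLoss/db_o ~ -0.0507990
grad_wh = dL_dyo * dyo_dzo * w_o * dyh_dzh * x  # dLoss/dw_h ~ -0.00373525
grad_bh = dL_dyo * dyo_dzo * w_o * dyh_dzh      # dLoss/db_h ~ -0.00373525
```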

With the gradient of every parameter in hand, the next step is to update the parameters.

  4. Backpropagate the error

Each parameter is optimized by subtracting the learning rate $\eta$ (a hyperparameter) times its computed gradient from the old value:
$$
\begin{aligned}
w_h^{(k+1)}&=w_h^{(k)}-\eta\frac{\partial Loss}{\partial w_h}\\
w_o^{(k+1)}&=w_o^{(k)}-\eta\frac{\partial Loss}{\partial w_o}\\
b_h^{(k+1)}&=b_h^{(k)}-\eta\frac{\partial Loss}{\partial b_h}\\
b_o^{(k+1)}&=b_o^{(k)}-\eta\frac{\partial Loss}{\partial b_o}
\end{aligned}
$$

Substituting the gradients, with learning rate $\eta=0.5$:
$$
\begin{aligned}
w_h^{(1)}&=w_h^{(0)}-\eta \frac{\partial Loss}{\partial w_h}=0.201867625\\
b_h^{(1)}&=b_h^{(0)}-\eta \frac{\partial Loss}{\partial b_h}=0.101867625\\
w_o^{(1)}&=w_o^{(0)}-\eta \frac{\partial Loss}{\partial w_o}=0.314477715\\
b_o^{(1)}&=b_o^{(0)}-\eta \frac{\partial Loss}{\partial b_o}=0.2253995
\end{aligned}
$$
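One full gradient-descent step can be sketched as follows (a minimal sketch; the gradients are taken as literals from the derivation above, and $\eta=0.5$ is the learning-rate value consistent with the updated results, as it is not stated explicitly):

```python
eta = 0.5                # learning rate implied by the updated values

w_h, b_h = 0.2, 0.1      # initial hidden-layer parameters
w_o, b_o = 0.3, 0.2      # initial output-layer parameters

# Gradients computed via the chain rule above
grad_wh = grad_bh = -0.00373525
grad_wo, grad_bo = -0.02895543, -0.0507990

# Gradient-descent update: subtract eta times the gradient
w_h -= eta * grad_wh     # 0.201867625
b_h -= eta * grad_bh     # 0.101867625
w_o -= eta * grad_wo     # 0.314477715
b_o -= eta * grad_bo     # 0.2253995
```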

At this point, one round of parameter updates for the network is complete. This example makes a good entry point; you will surely meet more complicated loss functions later, and if you ever get lost, come back and look at this simple exercise again.
