Backpropagation

Backpropagation, often abbreviated BP, is a method for training artificial neural networks that is used together with an optimization method such as gradient descent. It computes the gradient of the loss function with respect to every weight in the network, and these gradients are then used to update the weights so as to minimize the loss function.
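Concretely, when combined with gradient descent, each weight is nudged in the direction that reduces the loss (here $\eta$ denotes the learning rate, a symbol introduced for illustration):

$$
w_i \leftarrow w_i - \eta \frac{\partial L}{\partial w_i}
$$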
[Figure: a fully connected network with inputs $i_1, i_2$, hidden neurons $h_1, h_2$, and outputs $o_1, o_2$, connected by weights $w_1$–$w_8$ with biases $b_1$ (hidden layer) and $b_2$ (output layer).]

$$
\begin{aligned}
h_1 &= i_1 w_1 + i_2 w_2 + b_1 \\
h_2 &= i_1 w_3 + i_2 w_4 + b_1 \\
o_1 &= h_1 w_5 + h_2 w_6 + b_2 \\
o_2 &= h_1 w_7 + h_2 w_8 + b_2
\end{aligned}
$$
Each neuron's output is passed through a sigmoid activation function, which is not shown in the figure. The above is the forward pass: the first pass uses randomly initialized parameters, after which we compute the total error $L$ between the network's output and the target values (here a squared-error loss is used).
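As a minimal sketch, the forward pass for this network might look like the following Python, where the input values and initial weights are made-up placeholders standing in for the random initialization:

```python
import math

def sigmoid(x):
    """Logistic activation applied after each neuron."""
    return 1.0 / (1.0 + math.exp(-x))

# Placeholder inputs and randomly initialized parameters
# (values assumed for illustration; the post fixes no numbers).
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer: weighted sum, then the sigmoid that the text
# says is omitted from the figure.
h1 = sigmoid(i1 * w1 + i2 * w2 + b1)
h2 = sigmoid(i1 * w3 + i2 * w4 + b1)

# Output layer: same pattern.
o1 = sigmoid(h1 * w5 + h2 * w6 + b2)
o2 = sigmoid(h1 * w7 + h2 * w8 + b2)
```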
$$
L = \text{LossFunction} = \sum_{i=1}^{2} \frac{1}{2}\left(target_{o_i} - o_i\right)^2
= \frac{1}{2}\left(target_{o_1} - o_1\right)^2 + \frac{1}{2}\left(target_{o_2} - o_2\right)^2
$$
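Continuing the sketch above, the total error for assumed target values could be computed as:

```python
# Assumed target values for the two outputs.
target_o1, target_o2 = 0.01, 0.99

L = 0.5 * (target_o1 - o1) ** 2 + 0.5 * (target_o2 - o2) ** 2
```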
Once the total error is known, we need to determine how much each weight contributes to the error and correct the weights accordingly. By the chain rule:
$$
\begin{aligned}
\frac{\partial L}{\partial w_1} &= \frac{\partial h_1}{\partial w_1}\frac{\partial o_1}{\partial h_1}\frac{\partial L}{\partial o_1} + \frac{\partial h_1}{\partial w_1}\frac{\partial o_2}{\partial h_1}\frac{\partial L}{\partial o_2} \\
\frac{\partial L}{\partial w_2} &= \frac{\partial h_1}{\partial w_2}\frac{\partial o_1}{\partial h_1}\frac{\partial L}{\partial o_1} + \frac{\partial h_1}{\partial w_2}\frac{\partial o_2}{\partial h_1}\frac{\partial L}{\partial o_2} \\
\frac{\partial L}{\partial w_3} &= \frac{\partial h_2}{\partial w_3}\frac{\partial o_1}{\partial h_2}\frac{\partial L}{\partial o_1} + \frac{\partial h_2}{\partial w_3}\frac{\partial o_2}{\partial h_2}\frac{\partial L}{\partial o_2} \\
\frac{\partial L}{\partial w_4} &= \frac{\partial h_2}{\partial w_4}\frac{\partial o_1}{\partial h_2}\frac{\partial L}{\partial o_1} + \frac{\partial h_2}{\partial w_4}\frac{\partial o_2}{\partial h_2}\frac{\partial L}{\partial o_2} \\
\frac{\partial L}{\partial w_5} &= \frac{\partial o_1}{\partial w_5}\frac{\partial L}{\partial o_1} \qquad
\frac{\partial L}{\partial w_6} = \frac{\partial o_1}{\partial w_6}\frac{\partial L}{\partial o_1} \\
\frac{\partial L}{\partial w_7} &= \frac{\partial o_2}{\partial w_7}\frac{\partial L}{\partial o_2} \qquad
\frac{\partial L}{\partial w_8} = \frac{\partial o_2}{\partial w_8}\frac{\partial L}{\partial o_2}
\end{aligned}
$$
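As a sketch of how two of these chains evaluate in the running Python example (with the sigmoid folded into $h$ and $o$, so that, e.g., $\partial o_1 / \partial w_5 = o_1(1-o_1)\,h_1$; `sigmoid_prime` is a helper named here for illustration):

```python
def sigmoid_prime(y):
    # Sigmoid derivative written in terms of the sigmoid's output y.
    return y * (1.0 - y)

# An output-layer weight, e.g. w5: a single chain of factors.
dL_do1 = -(target_o1 - o1)           # dL/do1
do1_dw5 = sigmoid_prime(o1) * h1     # do1/dw5
dL_dw5 = do1_dw5 * dL_do1

# A hidden-layer weight, e.g. w1: two chains, one through each output.
dL_do2 = -(target_o2 - o2)
do1_dh1 = sigmoid_prime(o1) * w5
do2_dh1 = sigmoid_prime(o2) * w7
dh1_dw1 = sigmoid_prime(h1) * i1
dL_dw1 = dh1_dw1 * (do1_dh1 * dL_do1 + do2_dh1 * dL_do2)
```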
Computing each weight's effect on the loss function with a separate forward pass would be far too expensive, so backpropagation instead works from the back of the network to the front. The only partial derivatives that actually need to be evaluated are:
$$
\begin{gathered}
\frac{\partial L}{\partial o_1} \quad \frac{\partial L}{\partial o_2} \\[6pt]
\frac{\partial o_1}{\partial h_1} \quad \frac{\partial o_1}{\partial h_2} \quad \frac{\partial o_2}{\partial h_1} \quad \frac{\partial o_2}{\partial h_2} \quad \frac{\partial o_1}{\partial w_5} \quad \frac{\partial o_1}{\partial w_6} \quad \frac{\partial o_2}{\partial w_7} \quad \frac{\partial o_2}{\partial w_8} \\[6pt]
\frac{\partial h_1}{\partial w_1} \quad \frac{\partial h_1}{\partial w_2} \quad \frac{\partial h_2}{\partial w_3} \quad \frac{\partial h_2}{\partial w_4}
\end{gathered}
$$
With backpropagation, only the gradients listed above need to be computed; compared with the forward-pass approach this avoids a great deal of repeated gradient computation and greatly reduces the computational cost. Note, for example, that $\partial L / \partial o_1$, $\partial L / \partial o_2$, and the $\partial o_j / \partial h_i$ terms each appear in several of the weight gradients but are evaluated only once.
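A sketch of the full backward pass in the same hypothetical setup, showing each partial being computed once and then reused across all eight weight gradients (`eta` is an assumed learning rate):

```python
eta = 0.5  # assumed learning rate

# Each partial is computed once...
dL_do1 = -(target_o1 - o1)
dL_do2 = -(target_o2 - o2)
do1_dh1 = sigmoid_prime(o1) * w5
do1_dh2 = sigmoid_prime(o1) * w6
do2_dh1 = sigmoid_prime(o2) * w7
do2_dh2 = sigmoid_prime(o2) * w8

# ...and reused: the error signal reaching each hidden unit is
# shared by every weight feeding that unit.
dL_dh1 = do1_dh1 * dL_do1 + do2_dh1 * dL_do2
dL_dh2 = do1_dh2 * dL_do1 + do2_dh2 * dL_do2

grads = {
    "w1": sigmoid_prime(h1) * i1 * dL_dh1,
    "w2": sigmoid_prime(h1) * i2 * dL_dh1,
    "w3": sigmoid_prime(h2) * i1 * dL_dh2,
    "w4": sigmoid_prime(h2) * i2 * dL_dh2,
    "w5": sigmoid_prime(o1) * h1 * dL_do1,
    "w6": sigmoid_prime(o1) * h2 * dL_do1,
    "w7": sigmoid_prime(o2) * h1 * dL_do2,
    "w8": sigmoid_prime(o2) * h2 * dL_do2,
}

# Gradient-descent update, as in the rule given at the top.
w1 -= eta * grads["w1"]   # likewise for w2 ... w8
```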
