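Derivation of the BP (Backpropagation) Algorithm

Example 1:

This example is a multi-class classification problem.
[Figure: problem statement (image not included)]
Solution:

First, note the following results: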

$$\color{green} E_k = \frac{1}{2}\sum_{k=1}^{n} \left( z_k - f_k(x_k) \right)^2$$

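$$z_k = s\left(\sum_{i=1}^{3} v_{ki}\, y_i\right), \quad k = 1, 2$$

Here $s$ is the sigmoid activation and $v_{ki}$ is the weight from hidden unit $y_i$ to output $z_k$, which is the indexing used in the gradients derived in this example.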

$$\frac{\partial E_k}{\partial z_k} = z_k - f_k$$

$$\color{red} \text{Incorrect: } E_k = S\left(\sum_{i=1}^{n} z_i\right) = S\left(\sum_{i=1}^{n} \sum_{j=1}^{n} v_{ij}\, y_{ij}\right)$$
The following identity for the sigmoid function $s$ can also be verified by hand:
$$s'(u) = s(u)\,(1 - s(u))$$
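As a quick numerical sanity check of the identity $s'(u) = s(u)(1 - s(u))$, the following Python sketch compares the closed form against a central finite difference. The function names and the sample points are illustrative choices, not part of the original exercise.

```python
import math


def sigmoid(u: float) -> float:
    """Logistic sigmoid s(u) = 1 / (1 + e^(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def sigmoid_prime(u: float) -> float:
    """Closed-form derivative s'(u) = s(u) * (1 - s(u))."""
    s = sigmoid(u)
    return s * (1.0 - s)


def central_difference(f, u: float, h: float = 1e-6) -> float:
    """Numerical derivative (f(u + h) - f(u - h)) / (2h)."""
    return (f(u + h) - f(u - h)) / (2.0 * h)


# Largest gap between the two derivatives over a few arbitrary sample points.
max_gap = max(
    abs(sigmoid_prime(u) - central_difference(sigmoid, u))
    for u in (-3.0, -1.0, 0.0, 0.5, 2.8)
)
print(max_gap)
```

The gap should be tiny (limited by floating-point rounding), which is consistent with the identity.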

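$E_k$ depends on $v_{11}$ only through $z_1$, not through $z_2$. Likewise, inside $z_1$ the weight $v_{11}$ multiplies only $y_1$, so differentiating $z_1$ with respect to $v_{11}$ brings out the factor $y_1$ alone; $y_2$, $y_3$ and the other weights ($v_{21}$, $v_{12}$, and so on) contribute no additional factors.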

$$\frac{\partial E_k}{\partial v_{11}} = \frac{\partial E_k}{\partial z_1}\frac{\partial z_1}{\partial v_{11}} = (z_1 - f_1)\,\frac{\partial}{\partial v_{11}}\, s\left(\sum_{i=1}^{3} v_{1i}\, y_i\right) \\ = (z_1 - f_1)\, s\left(\sum_{i=1}^{3} v_{1i}\, y_i\right)\left(1 - s\left(\sum_{i=1}^{3} v_{1i}\, y_i\right)\right) y_1 = (z_1 - f_1)\, z_1 (1 - z_1)\, y_1$$

$$\frac{\partial E_k}{\partial v_{21}} = \frac{\partial E_k}{\partial z_2}\frac{\partial z_2}{\partial v_{21}} = (z_2 - f_2)\,\frac{\partial}{\partial v_{21}}\, s\left(\sum_{i=1}^{3} v_{2i}\, y_i\right) \\ = (z_2 - f_2)\, s\left(\sum_{i=1}^{3} v_{2i}\, y_i\right)\left(1 - s\left(\sum_{i=1}^{3} v_{2i}\, y_i\right)\right) y_1 = (z_2 - f_2)\, z_2 (1 - z_2)\, y_1$$
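The two output-layer gradients above can be checked numerically. The sketch below assumes the structure implied by the derivation (three hidden activations, two sigmoid outputs, squared error). Every numeric value and every name (`V`, `y`, `f`, `analytic_grad`, `numeric_grad`) is a hypothetical choice for illustration, since the values from the original figure are not available.

```python
import math


def s(u):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-u))


# Hypothetical values, chosen only for illustration.
y = [0.2, 0.7, 0.5]        # hidden activations y_1..y_3
V = [[0.1, -0.3, 0.4],     # V[k][i] is the weight from y_(i+1) to z_(k+1)
     [0.6, 0.2, -0.5]]
f = [1.0, 0.0]             # targets f_1, f_2


def outputs(weights):
    """z_k = s(sum_i v_ki * y_i) for k = 1, 2."""
    return [s(sum(weights[k][i] * y[i] for i in range(3))) for k in range(2)]


def error(weights):
    """Squared error E = 1/2 * sum_k (z_k - f_k)^2."""
    z = outputs(weights)
    return 0.5 * sum((z[k] - f[k]) ** 2 for k in range(2))


def analytic_grad(k, i):
    """dE/dv_ki = (z_k - f_k) * z_k * (1 - z_k) * y_i (0-based k, i)."""
    z = outputs(V)
    return (z[k] - f[k]) * z[k] * (1.0 - z[k]) * y[i]


def numeric_grad(k, i, h=1e-6):
    """Central finite difference of E with respect to one weight."""
    plus = [row[:] for row in V]
    minus = [row[:] for row in V]
    plus[k][i] += h
    minus[k][i] -= h
    return (error(plus) - error(minus)) / (2.0 * h)


grad_v11 = analytic_grad(0, 0)   # weight from y_1 to z_1
grad_v21 = analytic_grad(1, 0)   # weight from y_1 to z_2
print(grad_v11, numeric_grad(0, 0))
print(grad_v21, numeric_grad(1, 0))
```

Each printed pair should agree to many decimal places if the closed-form expressions are right.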

$$\frac{\partial E_k}{\partial u_{11}} = \frac{\partial E_k}{\partial z_2}\frac{\partial z_2}{\partial y_1}\frac{\partial y_1}{\partial u_{11}} + \frac{\partial E_k}{\partial z_1}\frac{\partial z_1}{\partial y_1}\frac{\partial y_1}{\partial u_{11}} \\ = \left[(z_2 - f_2)\, z_2(1 - z_2)\, v_{21} + (z_1 - f_1)\, z_1(1 - z_1)\, v_{11}\right]\frac{\partial y_1}{\partial u_{11}} \\ = (z_2 - f_2)\, z_2(1 - z_2)\, v_{21}\, y_1(1 - y_1)\, x_1 + (z_1 - f_1)\, z_1(1 - z_1)\, v_{11}\, y_1(1 - y_1)\, x_1$$
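The hidden-layer gradient sums over both outputs because $y_1$ feeds into $z_1$ and $z_2$. A minimal sketch of that sum, checked against a finite difference, is given below. It assumes two inputs and sigmoid hidden units $y_i = s(\sum_m u_{im} x_m)$; the input size, all numeric values, and the names `U`, `V`, `x`, `f` are hypothetical.

```python
import math


def s(u):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-u))


# Hypothetical values, chosen only for illustration.
x = [0.5, -1.0]                              # inputs x_1, x_2 (input size assumed)
U = [[0.3, -0.2], [0.1, 0.4], [-0.5, 0.2]]   # U[i][m]: weight from x_(m+1) to y_(i+1)
V = [[0.1, -0.3, 0.4], [0.6, 0.2, -0.5]]     # V[k][i]: weight from y_(i+1) to z_(k+1)
f = [1.0, 0.0]                               # targets f_1, f_2


def forward(hidden_weights):
    """Return hidden activations y and outputs z."""
    y = [s(sum(hidden_weights[i][m] * x[m] for m in range(2))) for i in range(3)]
    z = [s(sum(V[k][i] * y[i] for i in range(3))) for k in range(2)]
    return y, z


def error(hidden_weights):
    """Squared error E = 1/2 * sum_k (z_k - f_k)^2."""
    _, z = forward(hidden_weights)
    return 0.5 * sum((z[k] - f[k]) ** 2 for k in range(2))


y, z = forward(U)

# Error signal reaching y_1: one term per output, each routed through v_k1.
back_to_y1 = sum((z[k] - f[k]) * z[k] * (1.0 - z[k]) * V[k][0] for k in range(2))
# dy_1/du_11 = y_1 * (1 - y_1) * x_1
grad_u11 = back_to_y1 * y[0] * (1.0 - y[0]) * x[0]

# Central finite difference on u_11 for comparison.
h = 1e-6
U_plus = [row[:] for row in U]
U_minus = [row[:] for row in U]
U_plus[0][0] += h
U_minus[0][0] -= h
numeric_u11 = (error(U_plus) - error(U_minus)) / (2.0 * h)

print(grad_u11, numeric_u11)
```

Dropping either of the two terms in `back_to_y1` would make the analytic value disagree with the finite difference, which is the point of the two-term sum in the derivation.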

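Example 2:

This example is a binary classification problem.
[Figure: problem statement (image not included)]

Solution:

First, note that the training sample for this round is [(2,3),0]; that is, the input is (2,3) and the target output is 0.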

Second, all logarithms are natural logarithms (base $e$). Treating $y$ as a constant and differentiating with respect to $a$:
$$\frac{\partial L(a,y)}{\partial a} = \frac{\partial}{\partial a}\left[-y\ln a - (1-y)\ln(1-a)\right] = \frac{-y}{a} - \frac{-(1-y)}{1-a} = \frac{-y}{a} + \frac{1-y}{1-a}$$

First training iteration:

$$z = \sum_{i=1}^{2} w_i x_i + b = 0.6 + 1.2 + 1 = 2.8$$

$$a = \frac{1}{1+e^{-z}} \approx 0.94$$

$$L(a,y)\big|_{y=0} = -\ln(1-a) \approx 2.86$$
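The forward pass can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the weights $w = (0.3, 0.4)$ and bias $b = 1$ are inferred from the terms $0.6 + 1.2 + 1$ and the sample $(2, 3)$, because the original figure with the initial parameters is not available.

```python
import math

x = (2.0, 3.0)   # input from the sample [(2,3),0]
y = 0.0          # target from the sample
w = (0.3, 0.4)   # assumption: inferred from 0.6 = w_1 * 2 and 1.2 = w_2 * 3
b = 1.0          # assumption: the constant term in z = 0.6 + 1.2 + 1

# Linear part, sigmoid activation, then cross-entropy loss.
z = w[0] * x[0] + w[1] * x[1] + b
a = 1.0 / (1.0 + math.exp(-z))
loss = -y * math.log(a) - (1.0 - y) * math.log(1.0 - a)

print(round(z, 2), round(a, 2), round(loss, 2))
```

With these assumed parameters the three values come out at about 2.8, 0.94 and 2.86.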

After the first forward pass, apply the chain rule backward to obtain the gradients:

$$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial a}\frac{\partial a}{\partial z}\frac{\partial z}{\partial w_1} = \left(\frac{-y}{a} + \frac{1-y}{1-a}\right) a(1-a)\, x_1 = (a - y)\, x_1 \approx 0.94 \times 2 = 1.88$$

$$\frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial a}\frac{\partial a}{\partial z}\frac{\partial z}{\partial w_2} = \left(\frac{-y}{a} + \frac{1-y}{1-a}\right) a(1-a)\, x_2 = (a - y)\, x_2 \approx 0.94 \times 3 = 2.82$$

$$\frac{\partial L}{\partial b} = \frac{\partial L}{\partial a}\frac{\partial a}{\partial z} = \left(\frac{-y}{a} + \frac{1-y}{1-a}\right) a(1-a) = a - y \approx 0.94$$
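The chain-rule products above can be evaluated factor by factor and compared with a finite difference of the loss. The sketch assumes $w = (0.3, 0.4)$ and $b = 1$ (inferred from the forward pass, not stated in the surviving text); the helper name `loss_fn` is hypothetical.

```python
import math

x1, x2, y = 2.0, 3.0, 0.0    # sample [(2,3),0]
w1, w2, b = 0.3, 0.4, 1.0    # assumption: inferred from z = 0.6 + 1.2 + 1


def loss_fn(w1, w2, b):
    """Cross-entropy loss of a single sigmoid unit on the sample."""
    z = w1 * x1 + w2 * x2 + b
    a = 1.0 / (1.0 + math.exp(-z))
    return -y * math.log(a) - (1.0 - y) * math.log(1.0 - a)


a = 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))

# The three chain-rule factors, written out separately.
dL_da = -y / a + (1.0 - y) / (1.0 - a)
da_dz = a * (1.0 - a)
grad_w1 = dL_da * da_dz * x1
grad_w2 = dL_da * da_dz * x2
grad_b = dL_da * da_dz

# Finite-difference check on w1.
h = 1e-6
numeric_w1 = (loss_fn(w1 + h, w2, b) - loss_fn(w1 - h, w2, b)) / (2.0 * h)

print(grad_w1, grad_w2, grad_b, numeric_w1)
```

The product `dL_da * da_dz` collapses to `a - y`, which is why the bias gradient equals the activation itself when the target is 0.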

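Update the parameters $w$ and $b$ by gradient descent. The forward pass implies $w_1 = 0.3$, $w_2 = 0.4$ and $b = 1$, and the updated weights correspond to a learning rate of $\eta = 0.1$:

$$w_1 \leftarrow w_1 - \eta\,\nabla_{w_1} = 0.3 - 0.1 \times 1.88 \approx 0.11$$

$$w_2 \leftarrow w_2 - \eta\,\nabla_{w_2} = 0.4 - 0.1 \times 2.82 \approx 0.12$$

$$b \leftarrow b - \eta\,\nabla_{b} = 1 - 0.1 \times 0.94 \approx 0.91$$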

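The mistake to avoid here: gradient descent subtracts the learning rate times the gradient, not the $x$ value times the learning rate times the gradient. Make sure this is properly understood rather than muddled through.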
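A minimal sketch of one update step follows. It assumes $w = (0.3, 0.4)$, $b = 1$ and a learning rate of 0.1; these values are inferred from the numbers in this example rather than read from the original figure, and `sgd_step` is a hypothetical helper name.

```python
import math


def sgd_step(params, grads, lr):
    """One gradient-descent step: each parameter moves by -lr * gradient.

    Nothing else multiplies the step: no input value and no parameter value.
    """
    return [p - lr * g for p, g in zip(params, grads)]


x = [2.0, 3.0]   # sample input
y = 0.0          # sample target
w = [0.3, 0.4]   # assumption: inferred from the forward pass
b = 1.0          # assumption: inferred from the forward pass
lr = 0.1         # assumption: learning rate consistent with the updated weights

# Forward pass and gradients (dL/dz simplifies to a - y for sigmoid + cross-entropy).
z = w[0] * x[0] + w[1] * x[1] + b
a = 1.0 / (1.0 + math.exp(-z))
dz = a - y
grads = [dz * x[0], dz * x[1], dz]

new_w1, new_w2, new_b = sgd_step(w + [b], grads, lr)
print(round(new_w1, 2), round(new_w2, 2), round(new_b, 2))
```

Under these assumptions the updated values round to 0.11, 0.12 and 0.91.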

Example 3:

[Figure: problem statement (image not included)]

Solution:
