Deep Learning with PyTorch, Notes 15

MLP Backpropagation

(Figure: a multilayer perceptron with input nodes $i\in I$, hidden nodes $j\in J$, and output nodes $k\in K$.)
This note derives the gradient of the final loss with respect to a weight $W_{ij}$ of the second-to-last layer. Start from the output-layer result: for a weight $W_{jk}$,

$\frac{\partial E}{\partial W_{jk}}=(O_{k}-t_{k})O_{k}(1-O_{k})O^{J}_{j}$

Define $\delta^{K}_{k}=(O_{k}-t_{k})O_{k}(1-O_{k})$; there is one $\delta^{K}_{k}$ for each output node $k$. The gradient then becomes

$\frac{\partial E}{\partial W_{jk}}=\delta^{K}_{k}O^{J}_{j}$
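To make the recalled formula concrete, here is a minimal sketch (the layer sizes, random inputs, and variable names are my own toy setup, not from the note) that checks $\frac{\partial E}{\partial W_{jk}}=\delta^{K}_{k}O^{J}_{j}$ against PyTorch autograd for a single sigmoid output layer with $E=\frac{1}{2}\sum_{k\in K}(O_{k}-t_{k})^{2}$:

```python
# Minimal sketch (toy sizes/data): check dE/dW_jk = delta_k * O_j against autograd.
import torch

torch.manual_seed(0)
J, K = 3, 2                                  # hidden and output layer sizes (assumed)
Oj  = torch.rand(J)                          # hidden activations O_j, treated as given
t   = torch.rand(K)                          # targets t_k
Wjk = torch.rand(J, K, requires_grad=True)   # hidden -> output weights

Ok = torch.sigmoid(Oj @ Wjk)                 # O_k = sigmoid(x_k), x_k = sum_j O_j W_jk
E  = 0.5 * torch.sum((Ok - t) ** 2)          # E = 1/2 * sum_k (O_k - t_k)^2
E.backward()

delta_k = (Ok - t) * Ok * (1 - Ok)           # one delta per output node k
manual_grad = torch.outer(Oj, delta_k)       # entry (j, k) equals O_j * delta_k
print(torch.allclose(manual_grad, Wjk.grad)) # expected: True
```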
For $W_{ij}$, expand $E$ and apply the chain rule step by step:

$\frac{\partial E}{\partial W_{ij}}=\frac{\partial }{\partial W_{ij}}\frac{1}{2}\sum_{k\in K}(O_{k}-t_{k})^{2}$

$\frac{\partial E}{\partial W_{ij}}=\sum_{k\in K}(O_{k}-t_{k})\frac{\partial }{\partial W_{ij}}O_{k}$

$\frac{\partial E}{\partial W_{ij}}=\sum_{k\in K}(O_{k}-t_{k})\frac{\partial }{\partial W_{ij}}\sigma(x_{k})$

$\frac{\partial E}{\partial W_{ij}}=\sum_{k\in K}(O_{k}-t_{k})\sigma(x_{k})(1-\sigma(x_{k}))\frac{\partial x_{k}}{\partial W_{ij}}$

$\frac{\partial E}{\partial W_{ij}}=\sum_{k\in K}(O_{k}-t_{k})O_{k}(1-O_{k})\frac{\partial x_{k}}{\partial O_{j}}\cdot\frac{\partial O_{j}}{\partial W_{ij}}$

Since $x_{k}=\sum_{j}W_{jk}O_{j}$, we have $\frac{\partial x_{k}}{\partial O_{j}}=W_{jk}$:

$\frac{\partial E}{\partial W_{ij}}=\sum_{k\in K}(O_{k}-t_{k})O_{k}(1-O_{k})W_{jk}\frac{\partial O_{j}}{\partial W_{ij}}$

$\frac{\partial E}{\partial W_{ij}}=\frac{\partial O_{j}}{\partial W_{ij}}\sum_{k\in K}(O_{k}-t_{k})O_{k}(1-O_{k})W_{jk}$

$\frac{\partial E}{\partial W_{ij}}=O_{j}(1-O_{j})\frac{\partial x_{j}}{\partial W_{ij}}\sum_{k\in K}(O_{k}-t_{k})O_{k}(1-O_{k})W_{jk}$

With $x_{j}=\sum_{i}W_{ij}O_{i}$, so that $\frac{\partial x_{j}}{\partial W_{ij}}=O_{i}$:

$\frac{\partial E}{\partial W_{ij}}=O_{j}(1-O_{j})O_{i}\sum_{k\in K}(O_{k}-t_{k})O_{k}(1-O_{k})W_{jk}$

$\frac{\partial E}{\partial W_{ij}}=O_{i}O_{j}(1-O_{j})\sum_{k\in K}\delta^{K}_{k}W_{jk}$
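The same kind of check can be run on the full two-layer case to confirm the result just derived for $W_{ij}$. Another minimal sketch, again with made-up sizes and random data, comparing the manual $\delta$-based gradients to PyTorch autograd:

```python
# Minimal sketch (toy sizes/data): verify the derived dE/dW_ij and dE/dW_jk
# against autograd for a two-layer sigmoid MLP without biases.
import torch

torch.manual_seed(0)
I, J, K = 4, 3, 2                            # input, hidden, output sizes (assumed)

Oi  = torch.rand(I)                          # input activations O_i
t   = torch.rand(K)                          # targets t_k
Wij = torch.rand(I, J, requires_grad=True)   # input  -> hidden weights
Wjk = torch.rand(J, K, requires_grad=True)   # hidden -> output weights

Oj = torch.sigmoid(Oi @ Wij)                 # hidden outputs O_j
Ok = torch.sigmoid(Oj @ Wjk)                 # network outputs O_k
E  = 0.5 * torch.sum((Ok - t) ** 2)          # E = 1/2 * sum_k (O_k - t_k)^2
E.backward()

with torch.no_grad():
    delta_k = (Ok - t) * Ok * (1 - Ok)             # output-layer deltas, shape (K,)
    delta_j = Oj * (1 - Oj) * (Wjk @ delta_k)      # hidden-layer deltas: sum_k delta_k W_jk
    manual_Wjk = torch.outer(Oj, delta_k)          # entry (j, k) equals O_j * delta_k
    manual_Wij = torch.outer(Oi, delta_j)          # entry (i, j) equals O_i * delta_j

print(torch.allclose(manual_Wjk, Wjk.grad))        # expected: True
print(torch.allclose(manual_Wij, Wij.grad))        # expected: True
```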
Summary:

For an output-layer node $k\in K$:
$\frac{\partial E}{\partial W_{jk}}=O_{j}\delta_{k}$
where $\delta_{k}=(O_{k}-t_{k})O_{k}(1-O_{k})$.

For a hidden-layer node $j\in J$:
$\frac{\partial E}{\partial W_{ij}}=O_{i}\delta_{j}$
where $\delta_{j}=O_{j}(1-O_{j})\sum_{k\in K}\delta_{k}W_{jk}$.

In both cases the gradient of a weight is the activation entering that weight times the $\delta$ of the node it feeds, and each hidden-layer $\delta_{j}$ is assembled from the $\delta_{k}$ of the layer above, which is what lets the backward pass propagate layer by layer.
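As a rough sketch of how these two rules drive learning, the loop below (hypothetical sizes, learning rate, and step count) runs plain gradient descent using only the $\delta$-based gradients, with no autograd:

```python
# Rough sketch (assumed sizes/learning rate): manual gradient descent using only
# the two summary rules, dE/dW_jk = O_j * delta_k and dE/dW_ij = O_i * delta_j.
import torch

torch.manual_seed(0)
I, J, K, lr = 4, 3, 2, 0.5

x   = torch.rand(I)          # input activations O_i
t   = torch.rand(K)          # targets t_k
Wij = torch.rand(I, J)       # input  -> hidden weights
Wjk = torch.rand(J, K)       # hidden -> output weights

for step in range(200):
    # forward pass
    Oj = torch.sigmoid(x @ Wij)
    Ok = torch.sigmoid(Oj @ Wjk)

    # backward pass via the delta recursion
    delta_k = (Ok - t) * Ok * (1 - Ok)          # output-layer deltas
    delta_j = Oj * (1 - Oj) * (Wjk @ delta_k)   # hidden-layer deltas

    # gradient-descent updates
    Wjk -= lr * torch.outer(Oj, delta_k)        # dE/dW_jk = O_j * delta_k
    Wij -= lr * torch.outer(x,  delta_j)        # dE/dW_ij = O_i * delta_j

E = 0.5 * torch.sum((torch.sigmoid(torch.sigmoid(x @ Wij) @ Wjk) - t) ** 2)
print(E.item())   # the loss should end up much smaller than at initialization
```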
