References
cs231n Course Materials: Backprop
Derivatives, Backpropagation, and Vectorization
cs231n Lecture 4: Neural Networks and Backpropagation
cs231n Assignment 2
Notes: Batch Normalization and Its Backpropagation
2. ReLU
Forward pass
$$
Y=\max{(0,X)}\tag{2.1}
$$
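As a minimal sketch of the forward pass in (2.1), assuming `X` is a NumPy array of shape `(N, D)`: the function name `relu_forward` and the choice to return the input as a cache for the backward pass are illustrative assumptions, not something fixed by the notes.

```python
import numpy as np


def relu_forward(x):
    """Elementwise ReLU, Y = max(0, X).

    Returns the output together with the input, which the backward
    pass needs in order to know where X was positive.
    (The name and the cache convention are assumptions for this sketch.)
    """
    out = np.maximum(0, x)  # compares every entry with 0, keeps the larger
    cache = x
    return out, cache


# Small example with N = 2 samples and D = 2 features.
X = np.array([[-1.0, 2.0],
              [0.0, 3.5]])
Y, cache = relu_forward(X)
print(Y)
```

Negative entries and zeros map to 0, while positive entries pass through unchanged, so `Y` has the same shape as `X`.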
Backward pass
$$
\begin{aligned}
\frac{\partial{L}}{\partial{X_{n,d}}}&=\frac{\partial{L}}{\partial{Y_{n,d}}}\frac{\partial{Y_{n,d}}}{\partial{X_{n,d}}}\\
&=\frac{\partial{L}}{\partial{Y_{n,d}}}\mathbf{1}\{X_{n,d}>0\}
\end{aligned}\tag{2.2}
$$
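Equation (2.2) says the upstream gradient is passed through wherever the input was strictly positive and zeroed everywhere else. A minimal NumPy sketch, assuming `dout` holds $\partial L/\partial Y$ with the same shape as `X`; the names `relu_backward`, `dout`, and `dX` are assumptions made for illustration.

```python
import numpy as np


def relu_backward(dout, x):
    """Gradient of the loss with respect to the ReLU input.

    dout: upstream gradient dL/dY, same shape as x.
    x:    the input that was fed to the forward pass.
    Implements dL/dX = dL/dY * 1{X > 0}, elementwise.
    (Names are assumptions for this sketch.)
    """
    mask = x > 0        # boolean indicator 1{X > 0}
    return dout * mask  # True acts as 1, False as 0


X = np.array([[-1.0, 2.0],
              [0.0, 3.5]])
dY = np.array([[10.0, 20.0],
               [30.0, 40.0]])
dX = relu_backward(dY, X)
print(dX)
```

Because the indicator uses a strict inequality, the entry with $X_{n,d}=0$ receives a zero gradient, matching (2.2).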