The softmax function: $a_{i}=\frac{e^{z_{i}}}{\sum_{k} e^{z_{k}}}$
The cross-entropy loss function: $C=-\sum_{i} y_{i} \ln a_{i}$
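As a minimal sketch of the two definitions above, the following Python uses only the standard library. The function names `softmax` and `cross_entropy`, the example values, and the max-subtraction step are assumptions made for illustration and do not come from the original text. Subtracting the maximum leaves the result unchanged and only guards against overflow in `exp`.

```python
import math

def softmax(z):
    """Return a_i = exp(z_i) / sum_k exp(z_k) for a list of floats z."""
    m = max(z)  # assumption: shift by the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y, a):
    """Return C = -sum_i y_i * ln(a_i)."""
    return -sum(yi * math.log(ai) for yi, ai in zip(y, a))

z = [1.0, 2.0, 3.0]   # example inputs z_i (made up for illustration)
y = [0.0, 0.0, 1.0]   # one-hot target y_i
a = softmax(z)        # outputs a_i, which sum to 1
loss = cross_entropy(y, a)
```

With a one-hot target, only the term for the true class survives, so `loss` reduces to `-ln a_i` for that class.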
By the chain rule for composite functions: $\frac{\partial C}{\partial z_{i}}=\sum_{j}\left(\frac{\partial C_{j}}{\partial a_{j}} \frac{\partial a_{j}}{\partial z_{i}}\right)$, where $C_{j}=-y_{j} \ln a_{j}$ is the $j$-th term of the loss.
Computing the first factor: $\frac{\partial C_{j}}{\partial a_{j}}=\frac{\partial\left(-y_{j} \ln a_{j}\right)}{\partial a_{j}}=-y_{j} \frac{1}{a_{j}}$
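The first factor and the chain-rule sum can be checked numerically. The sketch below is only a sanity check under stated assumptions: the helper names (`loss_from_a`, `loss_from_z`, `central_diff`, `da_dz`), the step size `h`, and the sample values of `y` and `z` are all hypothetical. The factor $\frac{\partial a_{j}}{\partial z_{i}}$ is approximated here by central differences rather than by a closed form.

```python
import math

def softmax(z):
    m = max(z)  # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def loss_from_a(y, a):
    """C as a function of the outputs a: C = -sum_j y_j * ln(a_j)."""
    return -sum(yj * math.log(aj) for yj, aj in zip(y, a))

def loss_from_z(y, z):
    """C as a function of the inputs z, via a = softmax(z)."""
    return loss_from_a(y, softmax(z))

def central_diff(f, x, idx, h=1e-6):
    """Approximate the partial derivative of f at x with respect to x[idx]."""
    xp, xm = list(x), list(x)
    xp[idx] += h
    xm[idx] -= h
    return (f(xp) - f(xm)) / (2 * h)

y = [0.2, 0.3, 0.5]    # sample target values (made up)
z = [0.5, -1.0, 2.0]   # sample inputs (made up)
a = softmax(z)
n = len(z)

# First factor: analytic -y_j / a_j versus a finite-difference estimate.
analytic_dC_da = [-yj / aj for yj, aj in zip(y, a)]
numeric_dC_da = [central_diff(lambda v: loss_from_a(y, v), a, j) for j in range(n)]

def da_dz(j, i):
    """Finite-difference estimate of the partial derivative of a_j w.r.t. z_i."""
    return central_diff(lambda v: softmax(v)[j], z, i)

# Chain rule: sum over j of (dC_j/da_j) * (da_j/dz_i), versus differentiating C directly.
chain = [sum(analytic_dC_da[j] * da_dz(j, i) for j in range(n)) for i in range(n)]
direct = [central_diff(lambda v: loss_from_z(y, v), z, i) for i in range(n)]
```

If the derivation holds, `analytic_dC_da` should agree with `numeric_dC_da`, and `chain` should agree with `direct`, up to finite-difference error.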