Backpropagation
- Loss function:
$$L(\theta)=\sum_{i}C_i^n(\theta)=\sum_{i}\left(o_i^2-\hat{o}_i^2\right)$$
- How to compute the partial derivative of the cost with respect to a parameter $\theta$:
$$\frac{\partial C_i}{\partial \theta}=\frac{\partial o_i}{\partial \theta}\cdot 2o_i$$
- For example, to compute the partial derivative with respect to the weight $w_{11}$ in layer $n-1$, apply the chain rule through the pre-activation $y^{n-1}_1$:
$$\frac{\partial o_i}{\partial w^{n-1}_{11}}=\frac{\partial o_i}{\partial y^{n-1}_{1}}\frac{\partial y^{n-1}_{1}}{\partial w^{n-1}_{11}}=\frac{\partial o_i}{\partial y^{n-1}_{1}}x_1$$
Here $x_1$ is the first input to layer $n-1$.
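The two chain-rule steps so far (cost to output, output to weight) can be checked numerically. The following is a minimal sketch under assumptions that are not in the notes: a single layer with $y_j=\sum_i w_{ij}x_i$, a hypothetical scalar output $o = c\cdot\tanh(y)$, and made-up values for `W`, `x`, `c`, and `o_hat`. It compares the analytic derivative for $w_{11}$ with a central finite difference.

```python
import numpy as np

def forward(W, x, c):
    # Assumed toy layer: y_j = sum_i W[i, j] * x[i], so dy_1/dw_11 = x_1.
    y = W.T @ x
    # Hypothetical scalar output o = c . tanh(y) (an assumption for illustration).
    return y, c @ np.tanh(y)

def output_weight_grad(W, x, c):
    y, _ = forward(W, x, c)
    delta = c * (1.0 - np.tanh(y) ** 2)  # do/dy_j for the assumed output
    return np.outer(x, delta)            # do/dw_ij = x_i * do/dy_j

def cost(W, x, c, o_hat):
    # C = o^2 - o_hat^2, with o_hat treated as a constant target.
    _, o = forward(W, x, c)
    return o ** 2 - o_hat ** 2

def cost_weight_grad(W, x, c, o_hat):
    # dC/dw = (do/dw) * 2 * o
    _, o = forward(W, x, c)
    return output_weight_grad(W, x, c) * 2.0 * o

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
x = rng.normal(size=3)
c = rng.normal(size=3)
o_hat = 0.5

# Central finite difference with respect to w_11 only.
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += eps
W_minus[0, 0] -= eps
numeric = (cost(W_plus, x, c, o_hat) - cost(W_minus, x, c, o_hat)) / (2 * eps)
analytic = cost_weight_grad(W, x, c, o_hat)[0, 0]
print(analytic, numeric)
```

The two printed numbers should agree to several decimal places. The outer product gives all nine weight derivatives at once; the $w_{11}$ entry is just its top-left element.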
- Backward pass: the problem reduces to computing $\partial o_i / \partial y^{n-1}_1$ from the derivatives at layer $n$:
$$\frac{\partial o_i}{\partial y^{n-1}_{1}}=\frac{\partial o_i}{\partial x^{n}_{1}}\frac{\partial x^n_1}{\partial y^{n-1}_{1}}=\frac{\partial o_i}{\partial x^{n}_{1}}\phi'(y^{n-1}_1)$$
$$\frac{\partial o_i}{\partial x^{n}_{1}}=\frac{\partial o_i}{\partial y^{n}_{1}}w^n_{11}+\frac{\partial o_i}{\partial y^{n}_{2}}w^n_{12}+\frac{\partial o_i}{\partial y^{n}_{3}}w^n_{13}$$
Therefore, we have
$$\begin{bmatrix} \frac{\partial o_i}{\partial y^{n-1}_{1}} \\ \frac{\partial o_i}{\partial y^{n-1}_{2}} \\ \frac{\partial o_i}{\partial y^{n-1}_{3}} \end{bmatrix}=\begin{bmatrix} \phi'(y^{n-1}_1) & 0 & 0 \\ 0 & \phi'(y^{n-1}_2) & 0 \\ 0 & 0 & \phi'(y^{n-1}_3) \end{bmatrix}\begin{bmatrix} w^n_{11} & w^n_{12} & w^n_{13} \\ w^n_{21} & w^n_{22} & w^n_{23} \\ w^n_{31} & w^n_{32} & w^n_{33} \end{bmatrix}\begin{bmatrix} \frac{\partial o_i}{\partial y^{n}_{1}} \\ \frac{\partial o_i}{\partial y^{n}_{2}} \\ \frac{\partial o_i}{\partial y^{n}_{3}} \end{bmatrix}$$
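The matrix recursion translates almost directly into NumPy. This is a minimal sketch, assuming $\phi=\tanh$, the index convention $y^n_j=\sum_i w^n_{ij}x^n_i$, and a hypothetical linear readout $o=c\cdot y^n$, so that $\partial o/\partial y^n=c$. None of these choices come from the notes. The function name `backward_step` and all numeric values are illustrative.

```python
import numpy as np

def phi(y):
    # Assumed activation; the notes leave phi unspecified.
    return np.tanh(y)

def phi_prime(y):
    return 1.0 - np.tanh(y) ** 2

def backward_step(y_prev, W_n, delta_n):
    # delta^{n-1} = diag(phi'(y^{n-1})) @ W^n @ delta^n
    return np.diag(phi_prime(y_prev)) @ W_n @ delta_n

def output_from_y_prev(y_prev, W_n, c):
    # Forward pass from y^{n-1} to the hypothetical output o = c . y^n.
    x_n = phi(y_prev)   # x^n = phi(y^{n-1})
    y_n = W_n.T @ x_n   # y^n_j = sum_i W_n[i, j] * x^n_i
    return c @ y_n

rng = np.random.default_rng(0)
y_prev = rng.normal(size=3)
W_n = rng.normal(size=(3, 3))
c = rng.normal(size=3)

delta_n = c  # do/dy^n for the assumed linear readout
delta_prev = backward_step(y_prev, W_n, delta_n)

# Central finite differences with respect to each component of y^{n-1}.
eps = 1e-6
numeric = np.zeros(3)
for k in range(3):
    step = np.zeros(3)
    step[k] = eps
    numeric[k] = (output_from_y_prev(y_prev + step, W_n, c)
                  - output_from_y_prev(y_prev - step, W_n, c)) / (2 * eps)

print(np.allclose(delta_prev, numeric, atol=1e-6))
```

Because the first factor is diagonal, the same result can be computed as the elementwise product `phi_prime(y_prev) * (W_n @ delta_n)`, which avoids building the diagonal matrix for wide layers.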