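My math is pretty weak; even taking a derivative takes me a while… Below are my notes on working out the derivative of a convolution while reading this blog post: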
https://blog.csdn.net/xiaojiajia007/article/details/75041651
Suppose $\mathbf y_{4\times 1}=C_{4\times 16}\,\mathbf x_{16\times 1}$. Given $\frac{\partial Loss}{\partial \mathbf y}$, find $\frac{\partial Loss}{\partial \mathbf x}$.
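To make the shapes concrete, here is a minimal sketch of one way a $4\times 16$ matrix $C$ can arise. It assumes a 3×3 kernel sliding over a 4×4 input with stride 1 and no padding, which gives a 2×2 output, flattened to 4 entries. The post itself only states the shapes, so the kernel values and the helper names `conv_matrix` and `matvec` are hypothetical, and the "convolution" is the unflipped cross-correlation that deep learning frameworks usually compute.

```python
def conv_matrix(kernel, in_size):
    """Build C so that C times the row-major flattened input equals the
    flattened stride-1, no-padding cross-correlation with `kernel`."""
    k = len(kernel)
    out_size = in_size - k + 1
    C = [[0.0] * (in_size * in_size) for _ in range(out_size * out_size)]
    for oi in range(out_size):
        for oj in range(out_size):
            row = oi * out_size + oj          # index of this output pixel
            for ki in range(k):
                for kj in range(k):
                    col = (oi + ki) * in_size + (oj + kj)  # input pixel it touches
                    C[row][col] = kernel[ki][kj]
    return C


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Hypothetical kernel and input, chosen only for illustration.
kernel = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]
C = conv_matrix(kernel, in_size=4)      # 4 rows, 16 columns
x = [float(i) for i in range(16)]       # flattened 4x4 input
y = matvec(C, x)                        # flattened 2x2 output
print(len(C), len(C[0]), len(y))        # 4 16 4
```

Each row of `C` holds the nine kernel weights at the positions of the input pixels that one output pixel sees, and zeros everywhere else.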
$$
\frac{\partial Loss}{\partial \mathbf x}
=\begin{bmatrix}\frac{\partial Loss}{\partial x_1}\\ \vdots\\ \frac{\partial Loss}{\partial x_{16}}\end{bmatrix}
=\begin{bmatrix}\sum_{i=1}^{4}\frac{\partial Loss}{\partial y_i}\frac{\partial y_i}{\partial x_1}\\ \vdots\\ \sum_{i=1}^{4}\frac{\partial Loss}{\partial y_i}\frac{\partial y_i}{\partial x_{16}}\end{bmatrix}
$$
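The chain-rule sum for each component can be checked numerically. The sketch below uses only the standard library and makes two assumptions that are not in the post: a random $C$, and a hypothetical loss $Loss=\frac12\sum_i y_i^2$ (so that $\partial Loss/\partial y_i = y_i$). It estimates each $\partial y_i/\partial x_j$ by central differences, forms the sum over $i$, and compares it with a direct finite-difference estimate of $\partial Loss/\partial x_j$.

```python
import random

random.seed(0)
N_OUT, N_IN = 4, 16
EPS = 1e-6
C = [[random.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_OUT)]
x = [random.uniform(-1, 1) for _ in range(N_IN)]


def forward(x):
    """y = C x, written out element by element."""
    return [sum(C[i][j] * x[j] for j in range(N_IN)) for i in range(N_OUT)]


def loss(y):
    # Hypothetical loss (an assumption, not from the post): 0.5 * sum(y_i^2).
    return 0.5 * sum(v * v for v in y)


def bumped(x, j, delta):
    """Copy of x with component j shifted by delta."""
    out = list(x)
    out[j] += delta
    return out


def dy_dxj(x, j):
    """Central-difference estimate of the column vector dy/dx_j."""
    y_plus = forward(bumped(x, j, EPS))
    y_minus = forward(bumped(x, j, -EPS))
    return [(p - m) / (2 * EPS) for p, m in zip(y_plus, y_minus)]


y = forward(x)
dL_dy = list(y)  # gradient of the assumed loss with respect to y

# Chain rule: dLoss/dx_j = sum_i dLoss/dy_i * dy_i/dx_j
chain = [sum(g * d for g, d in zip(dL_dy, dy_dxj(x, j))) for j in range(N_IN)]

# Direct estimate of dLoss/dx_j, without going through dy/dx_j
direct = [
    (loss(forward(bumped(x, j, EPS))) - loss(forward(bumped(x, j, -EPS)))) / (2 * EPS)
    for j in range(N_IN)
]

max_gap = max(abs(a - b) for a, b in zip(chain, direct))
print(max_gap < 1e-6)
```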
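Multiplying element by element and then summing can be written as a matrix product (a row vector times a column vector):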
$$
=\begin{bmatrix}
\begin{bmatrix}\frac{\partial Loss}{\partial y_1} & \dots & \frac{\partial Loss}{\partial y_4}\end{bmatrix}\times\begin{bmatrix}\frac{\partial y_1}{\partial x_1}\\ \vdots\\ \frac{\partial y_4}{\partial x_1}\end{bmatrix}\\
\vdots\\
\begin{bmatrix}\frac{\partial Loss}{\partial y_1} & \dots & \frac{\partial Loss}{\partial y_4}\end{bmatrix}\times\begin{bmatrix}\frac{\partial y_1}{\partial x_{16}}\\ \vdots\\ \frac{\partial y_4}{\partial x_{16}}\end{bmatrix}
\end{bmatrix}
$$
$$
=\begin{bmatrix}
\left(\frac{\partial Loss}{\partial \mathbf y}\right)^T\times\frac{\partial \mathbf y}{\partial x_1}\\
\vdots\\
\left(\frac{\partial Loss}{\partial \mathbf y}\right)^T\times\frac{\partial \mathbf y}{\partial x_{16}}
\end{bmatrix}
$$
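In this form the common factor cannot be pulled out. Each $\frac{\partial \mathbf y}{\partial x_j}$ is a $4\times 1$ column vector, so once $\left(\frac{\partial Loss}{\partial \mathbf y}\right)^T$ is factored out on the left, what remains on the right is a vertical stack of column vectors (a $64\times 1$ column), not a matrix that the $1\times 4$ row vector can multiply.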
So transpose each entry. In general $B^TA=(A^TB)^T$, and since $\left(\frac{\partial Loss}{\partial \mathbf y}\right)^T\times\frac{\partial \mathbf y}{\partial x_1}$ is a $1\times 1$ number, which equals its own transpose, we get $B^TA=(A^TB)^T=A^TB$.
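A small sketch of why the flip helps, using made-up numbers (three short vectors instead of sixteen, purely for illustration; `dot` and `matvec` are hypothetical helpers). Each scalar $b^Ta_j$ equals $a_j^Tb$, and once every entry has the same right-hand factor $b$, the row vectors $a_j^T$ can be stacked into a matrix that multiplies $b$ once.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(u, v))


def matvec(M, v):
    """Matrix-vector product: one inner product per row of M."""
    return [dot(row, v) for row in M]


# Made-up values. b plays the role of dLoss/dy (a 4 x 1 column),
# and each entry of `cols` plays the role of one column vector dy/dx_j.
b = [1.0, 2.0, 3.0, 4.0]
cols = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
]

before_flip = [dot(b, a) for a in cols]  # entries of the form b^T a_j
after_flip = [dot(a, b) for a in cols]   # entries of the form a_j^T b

# With b on the right in every entry, stack the rows a_j^T into one matrix.
stacked = matvec(cols, b)

print(before_flip == after_flip == stacked)
```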
$$
=\begin{bmatrix}
\left(\frac{\partial \mathbf y}{\partial x_1}\right)^T\times\frac{\partial Loss}{\partial \mathbf y}\\
\vdots\\
\left(\frac{\partial \mathbf y}{\partial x_{16}}\right)^T\times\frac{\partial Loss}{\partial \mathbf y}
\end{bmatrix}
$$
$$
=\begin{bmatrix}
\left(\frac{\partial \mathbf y}{\partial x_1}\right)^T\\
\vdots\\
\left(\frac{\partial \mathbf y}{\partial x_{16}}\right)^T
\end{bmatrix}\times\frac{\partial Loss}{\partial \mathbf y}
$$
$$
=\begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \dots & \frac{\partial y_4}{\partial x_1}\\
\vdots & & \vdots\\
\frac{\partial y_1}{\partial x_{16}} & \dots & \frac{\partial y_4}{\partial x_{16}}
\end{bmatrix}\times\begin{bmatrix}\frac{\partial Loss}{\partial y_1}\\ \vdots\\ \frac{\partial Loss}{\partial y_4}\end{bmatrix}
$$
$$
=\begin{bmatrix}
c_{1,1} & \dots & c_{4,1}\\
\vdots & & \vdots\\
c_{1,16} & \dots & c_{4,16}
\end{bmatrix}\times\begin{bmatrix}\frac{\partial Loss}{\partial y_1}\\ \vdots\\ \frac{\partial Loss}{\partial y_4}\end{bmatrix}
$$
The last step uses $y_i=\sum_{j=1}^{16}c_{i,j}x_j$, which gives $\frac{\partial y_i}{\partial x_j}=c_{i,j}$. The $16\times 4$ matrix on the left is therefore $C^T$, so $\frac{\partial Loss}{\partial \mathbf x}=C^T\frac{\partial Loss}{\partial \mathbf y}$.
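The final matrix form can be checked against both the element-wise sum and a finite-difference estimate. This is a minimal sketch using only the standard library. The random $C$ and upstream gradient are made up, and `surrogate_loss` is a hypothetical stand-in: it is linear in $\mathbf y$ and chosen only because its gradient with respect to $\mathbf y$ is exactly the given `dL_dy`.

```python
import random

random.seed(1)
C = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
dL_dy = [random.uniform(-1, 1) for _ in range(4)]   # given upstream gradient
x = [random.uniform(-1, 1) for _ in range(16)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]


# Matrix form from the derivation: the 16 x 4 matrix is C transposed.
dL_dx = matvec(transpose(C), dL_dy)

# Element-wise form: dLoss/dx_j = sum_i dLoss/dy_i * c_{i,j}
elementwise = [sum(dL_dy[i] * C[i][j] for i in range(4)) for j in range(16)]


def surrogate_loss(x):
    # Assumed loss, linear in y, whose gradient with respect to y is dL_dy.
    y = matvec(C, x)
    return sum(g * v for g, v in zip(dL_dy, y))


# Central-difference estimate of dLoss/dx_j for every j.
eps = 1e-6
numeric = []
for j in range(16):
    x_plus = list(x)
    x_plus[j] += eps
    x_minus = list(x)
    x_minus[j] -= eps
    numeric.append((surrogate_loss(x_plus) - surrogate_loss(x_minus)) / (2 * eps))

max_err = max(abs(a - b) for a, b in zip(dL_dx, numeric))
print(len(dL_dx), max_err < 1e-6)
```

Multiplying by $C^T$ maps a length-4 gradient back to length 16, which is the shape change a transposed convolution performs.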