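There are far too many formulas involved in vector differentiation... I keep getting stuck halfway through a derivation, so here is a summary in one place.
0. Introduction
Throughout this post, scalars (single elements) are written $a, b, c$; vectors are written in lowercase $x, y, z$ and are column vectors by default; matrices are written in uppercase $A, B, C$.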
1. Derivative of a vector with respect to a scalar
- Row vector with respect to a scalar
$$\frac{\partial x^T}{\partial a}= \begin{bmatrix} \frac{\partial x_1}{\partial a} & \frac{\partial x_2}{\partial a} & \cdots & \frac{\partial x_n}{\partial a} \end{bmatrix}$$
- Column vector with respect to a scalar
$$\frac{\partial x}{\partial a}= \begin{bmatrix} \frac{\partial x_1}{\partial a} \\ \frac{\partial x_2}{\partial a} \\ \vdots \\ \frac{\partial x_n}{\partial a} \end{bmatrix}$$
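The two formulas above contain the same entries and differ only in shape. Here is a minimal sketch that checks this numerically with central differences. The function `x_of_a` and the step size `h` are made up for illustration; they are not part of the original formulas.

```python
import math


def x_of_a(a):
    """Hypothetical vector-valued function x(a) with three components."""
    return [a, a * a, math.sin(a)]


def d_vector_d_scalar(f, a, h=1e-6):
    """Central-difference estimate of [dx_1/da, ..., dx_n/da]."""
    plus, minus = f(a + h), f(a - h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]


a0 = 0.5
components = d_vector_d_scalar(x_of_a, a0)

# Same entries, two layouts.
row_derivative = [components]                    # dx^T/da, shape 1 x n
column_derivative = [[c] for c in components]    # dx/da,   shape n x 1

# Analytic derivatives of the hypothetical x(a), for comparison.
exact = [1.0, 2 * a0, math.cos(a0)]
max_error = max(abs(c - e) for c, e in zip(components, exact))
print(row_derivative)
print(column_derivative)
print("max error:", max_error)
```

Nested lists stand in for matrices here, so the row form is a list containing one row and the column form is a list of one-element rows.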
2. Derivative of a vector with respect to a vector
- Row vector with respect to a column vector
$$\frac{\partial y^T}{\partial x}= \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_n}{\partial x_1} \\ \frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_n}{\partial x_2} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_n}{\partial x_n} \end{bmatrix}$$
- Column vector with respect to a row vector
$$\frac{\partial y}{\partial x^T}= \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_n}{\partial x_1} & \frac{\partial y_n}{\partial x_2} & \cdots & \frac{\partial y_n}{\partial x_n} \end{bmatrix}$$
- Row vector with respect to a row vector
$$\frac{\partial y^T}{\partial x^T}= \begin{bmatrix} \frac{\partial y^T}{\partial x_1} & \frac{\partial y^T}{\partial x_2} & \cdots & \frac{\partial y^T}{\partial x_n} \end{bmatrix}$$
- Column vector with respect to a column vector
$$\frac{\partial y}{\partial x}= \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \frac{\partial y_2}{\partial x} \\ \vdots \\ \frac{\partial y_n}{\partial x} \end{bmatrix}$$
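To make the four layouts concrete, the sketch below builds all of them for one small example. The map `y_of_x` (from two inputs to three outputs) and the helper name `jacobian_rows_follow_y` are assumptions made for illustration. The point is that $\partial y/\partial x^T$ and $\partial y^T/\partial x$ are transposes of each other, while the row-by-row and column-by-column forms are flattened versions of the same numbers.

```python
def y_of_x(x):
    """Hypothetical map from R^2 to R^3, chosen only for illustration."""
    x1, x2 = x
    return [x1 * x2, x1 + 2.0 * x2, x1 * x1]


def jacobian_rows_follow_y(f, x, h=1e-6):
    """Central-difference estimate of dy/dx^T: entry (i, j) is dy_i/dx_j."""
    columns = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        # columns[j][i] approximates dy_i/dx_j
        columns.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    # Transpose so that rows are indexed by y and columns by x.
    return [list(row) for row in zip(*columns)]


x0 = [1.5, -2.0]
dy_dxT = jacobian_rows_follow_y(y_of_x, x0)     # column by row: 3 x 2
dyT_dx = [list(col) for col in zip(*dy_dxT)]    # row by column: 2 x 3

# Row by row: [dy^T/dx_1, dy^T/dx_2] laid out as one long row.
dyT_dxT = [v for row in dyT_dx for v in row]
# Column by column: dy_1/dx, dy_2/dx, dy_3/dx stacked into one long column.
dy_dx = [v for row in dy_dxT for v in row]

for row in dy_dxT:
    print(row)
```

For this example the analytic value of $\partial y/\partial x^T$ is the matrix with rows $(x_2, x_1)$, $(1, 2)$, $(2x_1, 0)$, which the finite differences should reproduce up to rounding.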
3. Derivative of a matrix with respect to a vector
- Matrix with respect to a row vector
$$\frac{\partial A}{\partial x^T}= \begin{bmatrix} \frac{\partial A}{\partial x_1} & \frac{\partial A}{\partial x_2} & \cdots & \frac{\partial A}{\partial x_n} \end{bmatrix}$$
- Matrix with respect to a column vector
$$\frac{\partial A}{\partial x}= \begin{bmatrix} \frac{\partial A_{11}}{\partial x} & \frac{\partial A_{12}}{\partial x} & \cdots & \frac{\partial A_{1n}}{\partial x} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial A_{n1}}{\partial x} & \frac{\partial A_{n2}}{\partial x} & \cdots & \frac{\partial A_{nn}}{\partial x} \end{bmatrix}$$
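Both matrix-by-vector layouts are block matrices, which is easier to see on a small case. The sketch below assumes a hypothetical $2\times 2$ matrix function `A_of_x` of a 2-vector: $\partial A/\partial x^T$ places the blocks $\partial A/\partial x_k$ side by side, while $\partial A/\partial x$ replaces every entry $A_{ij}$ with the column $\partial A_{ij}/\partial x$.

```python
def A_of_x(x):
    """Hypothetical 2 x 2 matrix-valued function of a 2-vector."""
    x1, x2 = x
    return [[x1, x1 * x2],
            [x2 * x2, 3.0]]


def partials(F, x, h=1e-6):
    """Return [dA/dx_1, ..., dA/dx_n]; each block has the shape of A."""
    blocks = []
    for k in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        A_plus, A_minus = F(x_plus), F(x_minus)
        blocks.append([[(p - m) / (2 * h) for p, m in zip(rp, rm)]
                       for rp, rm in zip(A_plus, A_minus)])
    return blocks


x0 = [2.0, 3.0]
dA = partials(A_of_x, x0)
n = len(x0)          # length of x
m = len(dA[0])       # rows of A
p = len(dA[0][0])    # columns of A

# dA/dx^T: blocks dA/dx_1, dA/dx_2 side by side, shape m x (n * p).
dA_dxT = [[v for k in range(n) for v in dA[k][i]] for i in range(m)]

# dA/dx: each entry A_ij becomes the column dA_ij/dx, shape (m * n) x p.
dA_dx = [[dA[k][i][j] for j in range(p)] for i in range(m) for k in range(n)]

print(dA_dxT)
print(dA_dx)
```

With these inputs the blocks are $\partial A/\partial x_1 = \begin{bmatrix}1 & 3\\ 0 & 0\end{bmatrix}$ and $\partial A/\partial x_2 = \begin{bmatrix}0 & 2\\ 6 & 0\end{bmatrix}$, so the two layouts are a $2\times 4$ and a $4\times 2$ arrangement of the same eight numbers.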
4. Derivatives of expressions combining matrices and vectors
- $\frac{d}{dx}x^TA=A$
- $\frac{d}{dx^T}Ax=A$
- $\frac{d}{dx^T}x^TA=A^T$ (for a column vector $x$ the product has to be $x^TA$; taking it with respect to $x^T$ transposes the first identity)
- $\frac{d}{dx}Ax=A^T$
- $\frac{d}{dx}x^T=I$
- $\frac{d}{dx^T}x=I$
- $\frac{d}{dx}x^Ty=\frac{d}{dx}y^Tx=y$
- $\frac{d}{dA}x^TAy=xy^T$ (differentiating with respect to $x$ instead gives $\frac{d}{dx}x^TAy=Ay$)
- $\frac{d}{dA}x^TAx=xx^T$
- $\frac{d}{dA}x^TA^Ty=yx^T$
- $\frac{d}{dx}x^TAx=(A+A^T)x=2Ax$ (the second equality holds when $A$ is symmetric)
- $\frac{d}{dx}x^Tx=2x$
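Identities like these are easy to misremember, so a quick numerical check can help. The sketch below compares a few of the scalar-valued ones against central differences. The matrix `A`, the vectors `x0` and `y0`, and all helper names are arbitrary choices for illustration, and `A` is deliberately non-symmetric so that $(A+A^T)x$ and $2Ax$ actually differ.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at the list x."""
    g = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        g.append((f(x_plus) - f(x_minus)) / (2 * h))
    return g


def grad_wrt_matrix(f, M, h=1e-6):
    """Central-difference derivative of a scalar f(M) w.r.t. each entry of M."""
    G = []
    for i in range(len(M)):
        row = []
        for j in range(len(M[0])):
            M_plus = [list(r) for r in M]
            M_minus = [list(r) for r in M]
            M_plus[i][j] += h
            M_minus[i][j] -= h
            row.append((f(M_plus) - f(M_minus)) / (2 * h))
        G.append(row)
    return G


A = [[1.0, 2.0, 0.5],
     [-1.0, 3.0, 4.0],
     [0.0, 1.5, -2.0]]          # not symmetric on purpose
x0 = [0.3, -1.2, 2.0]
y0 = [1.0, 0.5, -0.7]
A_T = [list(col) for col in zip(*A)]

# d/dx (x^T A x) = (A + A^T) x
num_quad = grad(lambda x: dot(x, matvec(A, x)), x0)
exact_quad = [a + b for a, b in zip(matvec(A, x0), matvec(A_T, x0))]

# d/dx (x^T y) = y   and   d/dx (x^T x) = 2x
num_inner = grad(lambda x: dot(x, y0), x0)
num_square = grad(lambda x: dot(x, x), x0)

# d/dA (x^T A x) = x x^T
num_dA = grad_wrt_matrix(lambda M: dot(x0, matvec(M, x0)), A)
exact_dA = [[xi * xj for xj in x0] for xi in x0]

err_quad = max(abs(a - b) for a, b in zip(num_quad, exact_quad))
err_inner = max(abs(a - b) for a, b in zip(num_inner, y0))
err_square = max(abs(a - 2 * b) for a, b in zip(num_square, x0))
err_dA = max(abs(a - b) for ra, rb in zip(num_dA, exact_dA) for a, b in zip(ra, rb))
print(err_quad, err_inner, err_square, err_dA)
```

All four errors should be tiny, limited only by floating-point rounding, because central differences are exact for linear and quadratic functions apart from that rounding.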