Matrix Derivatives, Part 5: Common Formulas in Machine Learning (Part 2)

Result 8

$$
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}}=\boldsymbol{ba}^T
$$
where $\boldsymbol{X}$ is an $m\times n$ matrix and $\boldsymbol{a}$, $\boldsymbol{b}$ are constant vectors:
$$
\begin{aligned}
\boldsymbol{a}&=\left[ \begin{matrix} a_1& a_2& \cdots& a_n\\ \end{matrix} \right] ^T \\
\boldsymbol{b}&=\left[ \begin{matrix} b_1& b_2& \cdots& b_m\\ \end{matrix} \right] ^T
\end{aligned}
$$
Proof

A scalar equals its own transpose, so:
$$
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}}=\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\boldsymbol{b} \right) ^T}{\partial \boldsymbol{X}}=\frac{\partial \left( \boldsymbol{b}^T\boldsymbol{Xa} \right)}{\partial \boldsymbol{X}}
$$
Applying Result 7 to the right-hand side gives:
$$
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}}=\boldsymbol{ba}^T
$$
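As a sanity check on Result 8, the sketch below compares $\boldsymbol{ba}^T$ with a central-difference estimate of the gradient. It is only an illustration: the sizes `m = 3`, `n = 4`, the random test data, and the helper names `f` and `numerical_gradient` are assumptions made here, not part of the derivation.

```python
import random

def f(X, a, b):
    """Scalar a^T X^T b = sum_i sum_j b_i * X[i][j] * a_j, with X of shape m x n."""
    return sum(b[i] * X[i][j] * a[j] for i in range(len(b)) for j in range(len(a)))

def numerical_gradient(func, X, eps=1e-6):
    """Central-difference estimate of d func / d X, one entry at a time."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + eps
            f_plus = func(X)
            X[i][j] = old - eps
            f_minus = func(X)
            X[i][j] = old  # restore the entry before moving on
            grad[i][j] = (f_plus - f_minus) / (2 * eps)
    return grad

random.seed(0)
m, n = 3, 4  # assumed sizes for the test: X is m x n, a has length n, b has length m
a = [random.uniform(-1, 1) for _ in range(n)]
b = [random.uniform(-1, 1) for _ in range(m)]
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# Analytic gradient from Result 8: the outer product b a^T (shape m x n).
analytic = [[b[i] * a[j] for j in range(n)] for i in range(m)]
numeric = numerical_gradient(lambda M: f(M, a, b), X)

max_err = max(abs(analytic[i][j] - numeric[i][j]) for i in range(m) for j in range(n))
print("max abs error:", max_err)
```

Because the function is linear in $\boldsymbol{X}$, the finite-difference estimate should agree with $\boldsymbol{ba}^T$ up to floating-point rounding.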

Result 9

$$
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{XX}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}}=\boldsymbol{ab}^T\boldsymbol{X}+\boldsymbol{ba}^T\boldsymbol{X}
$$
where $\boldsymbol{X}$ is an $m\times n$ matrix and $\boldsymbol{a}$, $\boldsymbol{b}$ are constant vectors:
$$
\begin{aligned}
\boldsymbol{a}&=\left[ \begin{matrix} a_1& a_2& \cdots& a_m\\ \end{matrix} \right] ^T \\
\boldsymbol{b}&=\left[ \begin{matrix} b_1& b_2& \cdots& b_m\\ \end{matrix} \right] ^T
\end{aligned}
$$
Proof

$$
f\left( \boldsymbol{X} \right) = \boldsymbol{a}^T\boldsymbol{XX}^T\boldsymbol{b} =\left( \begin{array}{c}
[(a_1b_1)(x_{11}x_{11}+x_{12}x_{12}+\cdots +x_{1n}x_{1n})]+\\
[(a_1b_2)(x_{11}x_{21}+x_{12}x_{22}+\cdots +x_{1n}x_{2n})]+\\
\cdots +\\
[(a_1b_m)(x_{11}x_{m1}+x_{12}x_{m2}+\cdots +x_{1n}x_{mn})]+\\
[(a_2b_1)(x_{21}x_{11}+x_{22}x_{12}+\cdots +x_{2n}x_{1n})]+\\
[(a_2b_2)(x_{21}x_{21}+x_{22}x_{22}+\cdots +x_{2n}x_{2n})]+\\
\cdots +\\
[(a_2b_m)(x_{21}x_{m1}+x_{22}x_{m2}+\cdots +x_{2n}x_{mn})]+\\
\cdots +\\
[(a_mb_1)(x_{m1}x_{11}+x_{m2}x_{12}+\cdots +x_{mn}x_{1n})]+\\
[(a_mb_2)(x_{m1}x_{21}+x_{m2}x_{22}+\cdots +x_{mn}x_{2n})]+\\
\cdots +\\
[(a_mb_m)(x_{m1}x_{m1}+x_{m2}x_{m2}+\cdots +x_{mn}x_{mn})]\\
\end{array} \right)
$$
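The same expansion can be written compactly in summation notation, which makes the entrywise derivative easier to read off. The indices $i, j, k, p, q$ are introduced here only for illustration:

$$
f\left( \boldsymbol{X} \right) =\sum_{i=1}^m{\sum_{j=1}^m{a_ib_j\sum_{k=1}^n{x_{ik}x_{jk}}}}
$$

The entry $x_{pq}$ appears in the terms with $(i,k)=(p,q)$ and in the terms with $(j,k)=(p,q)$, so the product rule gives two sums:

$$
\frac{\partial f}{\partial x_{pq}}=\sum_{j=1}^m{a_pb_jx_{jq}}+\sum_{i=1}^m{b_pa_ix_{iq}}
$$

The first sum is the $(p,q)$ entry of $\boldsymbol{ab}^T\boldsymbol{X}$ and the second is the $(p,q)$ entry of $\boldsymbol{ba}^T\boldsymbol{X}$.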

$$
\begin{aligned}
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{XX}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}}&=\left[ \begin{matrix}
\frac{\partial f}{\partial x_{11}}& \frac{\partial f}{\partial x_{12}}& \cdots& \frac{\partial f}{\partial x_{1n}}\\
\frac{\partial f}{\partial x_{21}}& \frac{\partial f}{\partial x_{22}}& \cdots& \frac{\partial f}{\partial x_{2n}}\\
\vdots& \vdots& \ddots& \vdots\\
\frac{\partial f}{\partial x_{m1}}& \frac{\partial f}{\partial x_{m2}}& \cdots& \frac{\partial f}{\partial x_{mn}}\\
\end{matrix} \right] _{m\times n} \\
\\
&=\left[ \begin{matrix}
(a_1b_1x_{11}+a_1b_2x_{21}+\cdots +a_1b_mx_{m1})+(b_1a_1x_{11}+b_1a_2x_{21}+\cdots +b_1a_mx_{m1})& (a_1b_1x_{12}+a_1b_2x_{22}+\cdots +a_1b_mx_{m2})+(b_1a_1x_{12}+b_1a_2x_{22}+\cdots +b_1a_mx_{m2})& \cdots& (a_1b_1x_{1n}+a_1b_2x_{2n}+\cdots +a_1b_mx_{mn})+(b_1a_1x_{1n}+b_1a_2x_{2n}+\cdots +b_1a_mx_{mn})\\
(a_2b_1x_{11}+a_2b_2x_{21}+\cdots +a_2b_mx_{m1})+(b_2a_1x_{11}+b_2a_2x_{21}+\cdots +b_2a_mx_{m1})& (a_2b_1x_{12}+a_2b_2x_{22}+\cdots +a_2b_mx_{m2})+(b_2a_1x_{12}+b_2a_2x_{22}+\cdots +b_2a_mx_{m2})& \cdots& (a_2b_1x_{1n}+a_2b_2x_{2n}+\cdots +a_2b_mx_{mn})+(b_2a_1x_{1n}+b_2a_2x_{2n}+\cdots +b_2a_mx_{mn})\\
\vdots& \vdots& \ddots& \vdots\\
(a_mb_1x_{11}+a_mb_2x_{21}+\cdots +a_mb_mx_{m1})+(b_ma_1x_{11}+b_ma_2x_{21}+\cdots +b_ma_mx_{m1})& (a_mb_1x_{12}+a_mb_2x_{22}+\cdots +a_mb_mx_{m2})+(b_ma_1x_{12}+b_ma_2x_{22}+\cdots +b_ma_mx_{m2})& \cdots& (a_mb_1x_{1n}+a_mb_2x_{2n}+\cdots +a_mb_mx_{mn})+(b_ma_1x_{1n}+b_ma_2x_{2n}+\cdots +b_ma_mx_{mn})\\
\end{matrix} \right] \\
\\
&=\left[ \begin{matrix}
a_1b_1x_{11}+a_1b_2x_{21}+\cdots +a_1b_mx_{m1}& a_1b_1x_{12}+a_1b_2x_{22}+\cdots +a_1b_mx_{m2}& \cdots& a_1b_1x_{1n}+a_1b_2x_{2n}+\cdots +a_1b_mx_{mn}\\
a_2b_1x_{11}+a_2b_2x_{21}+\cdots +a_2b_mx_{m1}& a_2b_1x_{12}+a_2b_2x_{22}+\cdots +a_2b_mx_{m2}& \cdots& a_2b_1x_{1n}+a_2b_2x_{2n}+\cdots +a_2b_mx_{mn}\\
\vdots& \vdots& \ddots& \vdots\\
a_mb_1x_{11}+a_mb_2x_{21}+\cdots +a_mb_mx_{m1}& a_mb_1x_{12}+a_mb_2x_{22}+\cdots +a_mb_mx_{m2}& \cdots& a_mb_1x_{1n}+a_mb_2x_{2n}+\cdots +a_mb_mx_{mn}\\
\end{matrix} \right] \\
&\quad +\left[ \begin{matrix}
b_1a_1x_{11}+b_1a_2x_{21}+\cdots +b_1a_mx_{m1}& b_1a_1x_{12}+b_1a_2x_{22}+\cdots +b_1a_mx_{m2}& \cdots& b_1a_1x_{1n}+b_1a_2x_{2n}+\cdots +b_1a_mx_{mn}\\
b_2a_1x_{11}+b_2a_2x_{21}+\cdots +b_2a_mx_{m1}& b_2a_1x_{12}+b_2a_2x_{22}+\cdots +b_2a_mx_{m2}& \cdots& b_2a_1x_{1n}+b_2a_2x_{2n}+\cdots +b_2a_mx_{mn}\\
\vdots& \vdots& \ddots& \vdots\\
b_ma_1x_{11}+b_ma_2x_{21}+\cdots +b_ma_mx_{m1}& b_ma_1x_{12}+b_ma_2x_{22}+\cdots +b_ma_mx_{m2}& \cdots& b_ma_1x_{1n}+b_ma_2x_{2n}+\cdots +b_ma_mx_{mn}\\
\end{matrix} \right] \\
\\
&=\left[ \begin{matrix}
a_1b_1& a_1b_2& \cdots& a_1b_m\\
a_2b_1& a_2b_2& \cdots& a_2b_m\\
\vdots& \vdots& \ddots& \vdots\\
a_mb_1& a_mb_2& \cdots& a_mb_m\\
\end{matrix} \right] \left[ \begin{matrix}
x_{11}& x_{12}& \cdots& x_{1n}\\
x_{21}& x_{22}& \cdots& x_{2n}\\
\vdots& \vdots& \ddots& \vdots\\
x_{m1}& x_{m2}& \cdots& x_{mn}\\
\end{matrix} \right] +\left[ \begin{matrix}
b_1a_1& b_1a_2& \cdots& b_1a_m\\
b_2a_1& b_2a_2& \cdots& b_2a_m\\
\vdots& \vdots& \ddots& \vdots\\
b_ma_1& b_ma_2& \cdots& b_ma_m\\
\end{matrix} \right] \left[ \begin{matrix}
x_{11}& x_{12}& \cdots& x_{1n}\\
x_{21}& x_{22}& \cdots& x_{2n}\\
\vdots& \vdots& \ddots& \vdots\\
x_{m1}& x_{m2}& \cdots& x_{mn}\\
\end{matrix} \right] \\
\\
&=\left[ \begin{array}{c} a_1\\ a_2\\ \vdots\\ a_m\\ \end{array} \right] [b_1,b_2,\cdots ,b_m]\left[ \begin{matrix}
x_{11}& x_{12}& \cdots& x_{1n}\\
x_{21}& x_{22}& \cdots& x_{2n}\\
\vdots& \vdots& \ddots& \vdots\\
x_{m1}& x_{m2}& \cdots& x_{mn}\\
\end{matrix} \right] +\left[ \begin{array}{c} b_1\\ b_2\\ \vdots\\ b_m\\ \end{array} \right] [a_1,a_2,\cdots ,a_m]\left[ \begin{matrix}
x_{11}& x_{12}& \cdots& x_{1n}\\
x_{21}& x_{22}& \cdots& x_{2n}\\
\vdots& \vdots& \ddots& \vdots\\
x_{m1}& x_{m2}& \cdots& x_{mn}\\
\end{matrix} \right] \\
\\
&=\boldsymbol{ab}^T\boldsymbol{X}+\boldsymbol{ba}^T\boldsymbol{X}
\end{aligned}
$$
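Result 9 can also be checked numerically. The following is a minimal sketch in plain Python, assuming small test sizes (`m = 3`, `n = 4`) and random inputs. The helper names `f` and `numerical_gradient` are hypothetical and chosen only for this example.

```python
import random

def f(X, a, b):
    """Scalar a^T X X^T b, computed as the dot product of X^T a and X^T b."""
    m, n = len(X), len(X[0])
    Xt_a = [sum(X[i][k] * a[i] for i in range(m)) for k in range(n)]
    Xt_b = [sum(X[i][k] * b[i] for i in range(m)) for k in range(n)]
    return sum(u * v for u, v in zip(Xt_a, Xt_b))

def numerical_gradient(func, X, eps=1e-6):
    """Central-difference estimate of d func / d X, one entry at a time."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + eps
            f_plus = func(X)
            X[i][j] = old - eps
            f_minus = func(X)
            X[i][j] = old  # restore the entry before moving on
            grad[i][j] = (f_plus - f_minus) / (2 * eps)
    return grad

random.seed(1)
m, n = 3, 4  # assumed sizes for the test: X is m x n, a and b have length m
a = [random.uniform(-1, 1) for _ in range(m)]
b = [random.uniform(-1, 1) for _ in range(m)]
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# Analytic gradient from Result 9: (a b^T + b a^T) X, written entry by entry.
analytic = [
    [sum((a[p] * b[j] + b[p] * a[j]) * X[j][q] for j in range(m)) for q in range(n)]
    for p in range(m)
]
numeric = numerical_gradient(lambda M: f(M, a, b), X)

max_err = max(abs(analytic[p][q] - numeric[p][q]) for p in range(m) for q in range(n))
print("max abs error:", max_err)
```

Since $f$ is quadratic in the entries of $\boldsymbol{X}$, the central difference has no truncation error, and any discrepancy should come from rounding alone.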

Result 10

$$
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\boldsymbol{Xb} \right)}{\partial \boldsymbol{X}}=\boldsymbol{Xba}^T+\boldsymbol{Xab}^T
$$
where $\boldsymbol{X}$ is an $m\times n$ matrix and $\boldsymbol{a}$, $\boldsymbol{b}$ are constant vectors:
$$
\begin{aligned}
\boldsymbol{a}&=\left[ \begin{matrix} a_1& a_2& \cdots& a_n\\ \end{matrix} \right] ^T \\
\boldsymbol{b}&=\left[ \begin{matrix} b_1& b_2& \cdots& b_n\\ \end{matrix} \right] ^T
\end{aligned}
$$
Proof

Transposing both sides of Result 9 gives:
$$
\begin{aligned}
\left[ \frac{\partial \left( \boldsymbol{a}^T\boldsymbol{XX}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}} \right] ^T&=\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{XX}^T\boldsymbol{b} \right)}{\partial \boldsymbol{X}^T} \\
&=\left( \boldsymbol{ab}^T\boldsymbol{X}+\boldsymbol{ba}^T\boldsymbol{X} \right) ^T \\
&=\boldsymbol{X}^T\boldsymbol{ba}^T+\boldsymbol{X}^T\boldsymbol{ab}^T
\end{aligned}
$$
This identity holds for any matrix, so replacing $\boldsymbol{X}$ with $\boldsymbol{X}^T$ throughout (with $\boldsymbol{a}$ and $\boldsymbol{b}$ now of length $n$) gives:
$$
\begin{aligned}
\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\boldsymbol{Xb} \right)}{\partial \boldsymbol{X}}&=\frac{\partial \left( \boldsymbol{a}^T\boldsymbol{X}^T\left( \boldsymbol{X}^T \right) ^T\boldsymbol{b} \right)}{\partial \left( \boldsymbol{X}^T \right) ^T} \\
&=\left( \boldsymbol{X}^T \right) ^T\boldsymbol{ba}^T+\left( \boldsymbol{X}^T \right) ^T\boldsymbol{ab}^T \\
&=\boldsymbol{Xba}^T+\boldsymbol{Xab}^T
\end{aligned}
$$
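A finite-difference check gives some independent confidence in Result 10. The sketch below assumes test sizes `m = 3`, `n = 4` and random data. The helper names `f` and `numerical_gradient` are illustrative, not from the original derivation.

```python
import random

def f(X, a, b):
    """Scalar a^T X^T X b, computed as the dot product of X a and X b."""
    Xa = [sum(row[j] * a[j] for j in range(len(a))) for row in X]
    Xb = [sum(row[j] * b[j] for j in range(len(b))) for row in X]
    return sum(u * v for u, v in zip(Xa, Xb))

def numerical_gradient(func, X, eps=1e-6):
    """Central-difference estimate of d func / d X, one entry at a time."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + eps
            f_plus = func(X)
            X[i][j] = old - eps
            f_minus = func(X)
            X[i][j] = old  # restore the entry before moving on
            grad[i][j] = (f_plus - f_minus) / (2 * eps)
    return grad

random.seed(2)
m, n = 3, 4  # assumed sizes for the test: X is m x n, a and b have length n
a = [random.uniform(-1, 1) for _ in range(n)]
b = [random.uniform(-1, 1) for _ in range(n)]
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# Analytic gradient from Result 10: X b a^T + X a b^T.
Xa = [sum(X[p][k] * a[k] for k in range(n)) for p in range(m)]
Xb = [sum(X[p][k] * b[k] for k in range(n)) for p in range(m)]
analytic = [[Xb[p] * a[q] + Xa[p] * b[q] for q in range(n)] for p in range(m)]
numeric = numerical_gradient(lambda M: f(M, a, b), X)

max_err = max(abs(analytic[p][q] - numeric[p][q]) for p in range(m) for q in range(n))
print("max abs error:", max_err)
```

Note that the gradient has the same $m\times n$ shape as $\boldsymbol{X}$, which is a quick way to catch a misplaced transpose when applying the formula.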

