Matrix Differentiation Techniques

Computing Derivatives via Differentials

  • Scalar with respect to a vector: ${\rm d}f = \left(\frac{\partial f}{\partial \boldsymbol x}\right)^T{\rm d}\boldsymbol x$
  • Scalar with respect to a matrix: ${\rm d}f = {\rm tr}\left[\left(\frac{\partial f}{\partial X}\right)^T{\rm d}X\right]$
  • Vector with respect to a vector: ${\rm d}\boldsymbol f = \left(\frac{\partial \boldsymbol f}{\partial \boldsymbol x}\right)^T{\rm d}\boldsymbol x$
  • Matrix with respect to a matrix: ${\rm vec}({\rm d}F) = \left(\frac{\partial F}{\partial X}\right)^T{\rm vec}({\rm d}X)$
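The scalar-with-respect-to-a-matrix form can be checked numerically. The following is a minimal sketch, assuming the illustrative function $f(X) = {\rm tr}(X^TX)$ with candidate gradient $2X$ (both chosen here for the example, not taken from the list above); it compares the actual change in $f$ under a small perturbation with the value predicted by ${\rm tr}\left[\left(\frac{\partial f}{\partial X}\right)^T{\rm d}X\right]$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
# A small perturbation standing in for the differential dX
dX = 1e-6 * rng.standard_normal((3, 4))


def f(X):
    # Illustrative function: f(X) = tr(X^T X), the squared Frobenius norm
    return np.trace(X.T @ X)


# Candidate gradient df/dX for this particular f (an assumption of the example)
grad = 2 * X

actual_change = f(X + dX) - f(X)
predicted_change = np.trace(grad.T @ dX)  # tr[(df/dX)^T dX]

print(actual_change, predicted_change)
```

The two numbers should agree up to a term of second order in the perturbation, which is what identifies `grad` as the derivative.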

Common Rules for Matrix Differentials

  1. ${\rm d}(X \pm Y) = {\rm d}X \pm {\rm d}Y$
  2. ${\rm d}(XY) = {\rm d}(X)Y + X{\rm d}Y$
  3. ${\rm d}(X^T) = ({\rm d}X)^T$
  4. ${\rm d}{\rm tr}(X) = {\rm tr}({\rm d}X)$
  5. ${\rm d}X^{-1} = -X^{-1}{\rm d}(X)X^{-1}$
  6. ${\rm d}|X| = {\rm tr}(X^\#{\rm d}X) = |X|{\rm tr}(X^{-1}{\rm d}X)$
  7. ${\rm d}(X \odot Y) = {\rm d}X \odot Y + X \odot {\rm d}Y$
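  8. ${\rm d}f(X) = f^\prime(X)\odot {\rm d}X$
Notes:
  • ${\rm tr}(X)$ is the trace of $X$
  • $X^\#$ is the adjugate matrix of $X$; the second form in rule 6 requires $X$ to be invertible
  • $\odot$ is element-wise (Hadamard) multiplication
  • $f$ is an element-wise function, and $f^\prime$ is its derivative, also applied element-wise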
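Rules 5 and 6 are the least obvious of the list, so a finite-difference check may help. This is a sketch under stated assumptions: the test matrix is built as $MM^T + I$ purely to guarantee invertibility, and the perturbation size `1e-6` is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
# M M^T + I is symmetric positive definite, hence safely invertible
X = M @ M.T + np.eye(4)
# A small, non-symmetric perturbation standing in for dX
dX = 1e-6 * rng.standard_normal((4, 4))

X_inv = np.linalg.inv(X)
det_X = np.linalg.det(X)

# Rule 5: d(X^{-1}) = -X^{-1} dX X^{-1}
inv_change = np.linalg.inv(X + dX) - X_inv
inv_predicted = -X_inv @ dX @ X_inv

# Rule 6: d|X| = |X| tr(X^{-1} dX)
det_change = np.linalg.det(X + dX) - det_X
det_predicted = det_X * np.trace(X_inv @ dX)

print(np.max(np.abs(inv_change - inv_predicted)))
print(abs(det_change - det_predicted) / abs(det_X))
```

Both printed residuals should be of second order in the perturbation, far smaller than the first-order changes themselves.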

Trace Tricks

  1. $a = {\rm tr}(a)$, where $a$ is a scalar
  2. ${\rm tr}(A^T) = {\rm tr}(A)$
  3. ${\rm tr}(A \pm B) = {\rm tr}(A) \pm {\rm tr}(B)$
  4. ${\rm tr}(AB) = {\rm tr}(BA)$, where $A$ is $m \times n$ and $B$ is $n \times m$, so that $AB$ is square
  5. ${\rm tr}(A^T(B \odot C)) = {\rm tr}((A \odot B)^TC)$, where $A$, $B$, and $C$ have the same size
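To see how the trace tricks turn a differential into a derivative, consider a short worked example. The function $f = \boldsymbol a^T X \boldsymbol b$ is chosen here purely for illustration, with $\boldsymbol a$ of size $m \times 1$, $X$ of size $m \times n$, and $\boldsymbol b$ of size $n \times 1$, both vectors constant.

```latex
\begin{aligned}
{\rm d}f &= \boldsymbol a^T\,{\rm d}X\,\boldsymbol b
  && \text{(product rule; } \boldsymbol a, \boldsymbol b \text{ are constant)} \\
&= {\rm tr}\left(\boldsymbol a^T\,{\rm d}X\,\boldsymbol b\right)
  && \text{(trick 1: a scalar equals its own trace)} \\
&= {\rm tr}\left(\boldsymbol b\,\boldsymbol a^T\,{\rm d}X\right)
  && \text{(trick 4 with } A = \boldsymbol a^T{\rm d}X,\ B = \boldsymbol b\text{)} \\
&= {\rm tr}\left[\left(\boldsymbol a\,\boldsymbol b^T\right)^T{\rm d}X\right]
\end{aligned}
```

Matching the last line against ${\rm d}f = {\rm tr}\left[\left(\frac{\partial f}{\partial X}\right)^T{\rm d}X\right]$ gives $\frac{\partial f}{\partial X} = \boldsymbol a\,\boldsymbol b^T$.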

Vectorization Tricks

Define the (column-major) vectorization of an $m \times n$ matrix $X$ as the $mn \times 1$ vector
$${\rm vec}(X) = [X_{11},X_{21},\ldots,X_{m1},X_{12},\ldots,X_{mn}]^T$$

  1. ${\rm vec}(A \pm B) = {\rm vec}(A) \pm {\rm vec}(B)$
  2. ${\rm vec}(AXB) = (B^T \otimes A){\rm vec}(X)$
  3. ${\rm vec}(A^T) = K_{mn}{\rm vec}(A)$
  4. ${\rm vec}(A \odot X) = {\rm diag}(A){\rm vec}(X)$
Notes:
  • $\otimes$ is the Kronecker product: $A_{(m \times n)} \otimes B_{(p\times q)} = \left[\begin{array}{ccc} a_{11} B & \cdots & a_{1 n} B \\ \vdots & \ddots & \vdots \\ a_{m 1} B & \cdots & a_{m n} B \end{array}\right]_{(mp\times nq)}$
  • $K_{mn}$ is the commutation matrix of size $mn \times mn$, defined for an $m \times n$ matrix $A$; it converts the column-major vectorization into the row-major vectorization. For a $2 \times 2$ matrix $A$, for example, $K_{22}=\left[\begin{array}{cccc}1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1\end{array}\right], \operatorname{vec}\left(A^{T}\right)=\left[\begin{array}{c}A_{11} \\ A_{12} \\ A_{21} \\ A_{22}\end{array}\right], \operatorname{vec}(A)=\left[\begin{array}{c}A_{11} \\ A_{21} \\ A_{12} \\ A_{22}\end{array}\right]$
  • ${\rm diag}(A)$, of size $mn \times mn$, is the diagonal matrix whose diagonal holds the elements of $A$ in column-major order. For example, $A =\left[\begin{array}{cc}1 & 4 \\ 2 & 5 \\ 3 & 6\end{array}\right],\ {\rm diag}(A) = \left[\begin{array}{cccccc}1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 6\end{array}\right]$
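The vectorization identities map directly onto NumPy. The sketch below assumes that column-major `vec` can be written as a reshape with `order="F"`; the helper name `vec`, the matrix `C`, and the matrix sizes are choices made for this example.

```python
import numpy as np


def vec(M):
    """Column-major vectorization: stack the columns of M into one vector."""
    return M.reshape(-1, order="F")


rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((3, 4))  # same size as X, for the Hadamard identity

# Identity 2: vec(A X B) = (B^T kron A) vec(X)
lhs_kron = vec(A @ X @ B)
rhs_kron = np.kron(B.T, A) @ vec(X)

# Identity 4: vec(C * X) = diag(C) vec(X), with diag(C) built from vec(C)
lhs_had = vec(C * X)
rhs_had = np.diag(vec(C)) @ vec(X)

# The 3 x 2 example matrix from the notes, vectorized column by column
example = vec(np.array([[1, 4], [2, 5], [3, 6]]))

print(np.allclose(lhs_kron, rhs_kron), np.allclose(lhs_had, rhs_had))
```

In practice one would rarely form `np.kron(B.T, A)` or `np.diag(vec(C))` explicitly, since they are much larger than the matrices they come from; they are built here only to confirm the identities.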

Kronecker Product Tricks

  1. $(A \otimes B)^{T}=A^{T} \otimes B^{T}$
  2. $\operatorname{vec}(\boldsymbol{a} \boldsymbol{b}^{T})=\boldsymbol{b} \otimes \boldsymbol{a}$
  3. $(A \otimes B)(C \otimes D)=(A C) \otimes(B D)$
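These three identities can be confirmed with `np.kron`. This is a minimal sketch with arbitrary sizes, chosen so that the products $AC$ and $BD$ in the third identity are defined.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((3, 2))  # A @ C is defined
D = rng.standard_normal((5, 3))  # B @ D is defined
a = rng.standard_normal(3)
b = rng.standard_normal(4)

# 1. (A kron B)^T = A^T kron B^T
transpose_ok = np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))

# 2. vec(a b^T) = b kron a, using column-major vectorization of the outer product
outer_ok = np.allclose(np.outer(a, b).reshape(-1, order="F"), np.kron(b, a))

# 3. (A kron B)(C kron D) = (AC) kron (BD)
mixed_ok = np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

print(transpose_ok, outer_ok, mixed_ok)
```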

Commutation Matrix Tricks

  1. $K_{m n}=K_{n m}^{T},\quad K_{m n} K_{n m}=I$
  2. $K_{p m}(A_{(m \times n)} \otimes B_{(p \times q)}) K_{n q}=B \otimes A$
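NumPy has no built-in commutation matrix, so the sketch below constructs one from the defining property $K_{mn}{\rm vec}(A) = {\rm vec}(A^T)$ for an $m \times n$ matrix $A$. The helper names `vec` and `commutation_matrix` are hypothetical, introduced only for this example.

```python
import numpy as np


def vec(M):
    """Column-major vectorization."""
    return M.reshape(-1, order="F")


def commutation_matrix(m, n):
    """Return K_mn with K_mn @ vec(A) == vec(A.T) for any m x n matrix A."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] sits at index i + j*m in vec(A) and at j + i*n in vec(A.T)
            K[j + i * n, i + j * m] = 1.0
    return K


m, n, p, q = 2, 3, 4, 5
rng = np.random.default_rng(4)
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, q))

K_mn = commutation_matrix(m, n)
K_nm = commutation_matrix(n, m)

defining_ok = np.allclose(K_mn @ vec(A), vec(A.T))
transpose_ok = np.array_equal(K_mn, K_nm.T)              # K_mn = K_nm^T
inverse_ok = np.array_equal(K_mn @ K_nm, np.eye(m * n))  # K_mn K_nm = I
swap_ok = np.allclose(
    commutation_matrix(p, m) @ np.kron(A, B) @ commutation_matrix(n, q),
    np.kron(B, A),
)  # K_pm (A kron B) K_nq = B kron A

print(defining_ok, transpose_ok, inverse_ok, swap_ok)
```

Since $K_{mn}$ is a permutation matrix, applying it amounts to reordering entries; the dense matrix is built here only to make the identities easy to test.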
