Summary of Tensor Gradient Formulas

Result 1. If $L = ABCDEF$, then
$$\frac{\partial L}{\partial A}=(BCDEF)^T, \quad \frac{\partial L}{\partial B}=A^T(CDEF)^T$$
$$\frac{\partial L}{\partial C}=(AB)^T(DEF)^T, \quad \frac{\partial L}{\partial D}=(ABC)^T(EF)^T$$
$$\frac{\partial L}{\partial E}=(ABCD)^TF^T, \quad \frac{\partial L}{\partial F}=(ABCDE)^T$$
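The pattern is easy to spot: the partial derivative of $L$ with respect to any tensor in the product on the right-hand side equals the transpose of everything to its left times the transpose of everything to its right. For the leftmost or rightmost factor, only the side that exists appears. The formulas hold exactly as written when the shapes make $L$ a scalar, that is, when the leftmost factor is a row vector and the rightmost factor is a column vector.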
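The rule can be checked numerically. The sketch below assumes NumPy and picks shapes so that $L$ is a $1\times 1$ scalar, then compares the formulas against central finite differences. The helper `numerical_grad` and the specific shapes are illustrative choices, not part of the original derivation.

```python
import numpy as np

def numerical_grad(f, M, eps=1e-6):
    """Central finite-difference gradient of the scalar f() with respect to M.

    M is perturbed in place, one entry at a time, and restored afterwards.
    """
    grad = np.zeros_like(M)
    for idx in np.ndindex(*M.shape):
        old = M[idx]
        M[idx] = old + eps
        plus = f()
        M[idx] = old - eps
        minus = f()
        M[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
# Shapes are an assumption: a row vector on the left and a column vector
# on the right make L = ABCDEF a 1x1 scalar.
A = rng.standard_normal((1, 2))
B = rng.standard_normal((2, 3))
C = rng.standard_normal((3, 4))
D = rng.standard_normal((4, 3))
E = rng.standard_normal((3, 2))
F = rng.standard_normal((2, 1))

def loss():
    return (A @ B @ C @ D @ E @ F).item()

# Middle factor: (everything left of C)^T times (everything right of C)^T.
grad_C_formula = (A @ B).T @ (D @ E @ F).T
# Leftmost and rightmost factors: only one side exists.
grad_A_formula = (B @ C @ D @ E @ F).T
grad_F_formula = (A @ B @ C @ D @ E).T

rule_holds = (
    np.allclose(grad_C_formula, numerical_grad(loss, C), atol=1e-5)
    and np.allclose(grad_A_formula, numerical_grad(loss, A), atol=1e-5)
    and np.allclose(grad_F_formula, numerical_grad(loss, F), atol=1e-5)
)
print(rule_holds)
```

Note that each formula produces a gradient with the same shape as the tensor it differentiates, which is a quick sanity check when applying the rule by hand.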

Result 2. If $O_{p\times n}=V_{p\times m}H_{m\times n}$ and $Loss$ is a scalar, then
$$\frac{\partial Loss}{\partial H}=\frac{\partial O}{\partial H}\frac{\partial Loss}{\partial O}$$
$$\frac{\partial Loss}{\partial V}=\frac{\partial Loss}{\partial O}\frac{\partial O}{\partial V}$$
Proof:
Since $Loss \in \mathbb{R}$, let $Loss = A_{1\times p}O_{p\times n}B_{n\times 1}$. Combined with the given $O_{p\times n}=V_{p\times m}H_{m\times n}$, this yields
$$Loss=A_{1\times p}V_{p\times m}H_{m\times n}B_{n\times 1}$$
By Result 1,
$$\frac{\partial Loss}{\partial H}=(AV)^TB^T=V^TA^TB^T=V^T(A^TB^T)$$
Again by Result 1, applied to $O=VH$ and to $Loss=AOB$,
$$\frac{\partial O}{\partial H}=V^T, \quad \frac{\partial Loss}{\partial O}=A^TB^T$$
Therefore
$$\frac{\partial Loss}{\partial H}=\frac{\partial O}{\partial H}\frac{\partial Loss}{\partial O}$$
Similarly,
$$\frac{\partial Loss}{\partial V}=\frac{\partial Loss}{\partial O}\frac{\partial O}{\partial V}$$
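Result 2 is the rule behind backpropagation through a matrix product: the upstream gradient is multiplied by $V^T$ on the left for $H$ and by $H^T$ on the right for $V$. The following is a minimal sketch of a numerical check, assuming NumPy. The loss $\frac{1}{2}\sum O_{ij}^2$ (so that $\partial Loss/\partial O = O$), the name `G` for the upstream gradient, and the helper `numerical_grad` are all assumptions made for illustration; the proof above uses $Loss = AOB$ instead.

```python
import numpy as np

def numerical_grad(f, M, eps=1e-6):
    """Central finite-difference gradient of the scalar f() with respect to M."""
    grad = np.zeros_like(M)
    for idx in np.ndindex(*M.shape):
        old = M[idx]
        M[idx] = old + eps
        plus = f()
        M[idx] = old - eps
        minus = f()
        M[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
p, m, n = 3, 4, 2
V = rng.standard_normal((p, m))
H = rng.standard_normal((m, n))

def loss():
    O = V @ H
    # Illustrative scalar loss (an assumption): half the sum of squares of O.
    return 0.5 * np.sum(O ** 2)

# Upstream gradient dLoss/dO for this particular loss is O itself.
G = V @ H

grad_H = V.T @ G   # dO/dH = V^T multiplies on the left
grad_V = G @ H.T   # dO/dV = H^T multiplies on the right

h_ok = np.allclose(grad_H, numerical_grad(loss, H), atol=1e-5)
v_ok = np.allclose(grad_V, numerical_grad(loss, V), atol=1e-5)
print(h_ok, v_ok)
```

The order of the factors matters: swapping it would give products whose shapes do not match $H$ or $V$.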
Result 3.
$$\left(\frac{\partial C}{\partial A^T}\right)^T=\frac{\partial C}{\partial A}$$
Result 4. If $C=A^TB$, then Result 1 gives $\frac{\partial C}{\partial A^T}=B^T$, and transposing via Result 3 yields
$$\frac{\partial C}{\partial A} = B$$
Result 5. If $y=w^TXw$, then
$$\frac{\partial y}{\partial w}=(X+X^T)w$$
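One way to see where the two terms come from is to expand the quadratic form component by component. This is a short sketch, assuming $w$ is a column vector with entries $w_i$ and $X$ is a square matrix with entries $X_{ij}$:

$$
\begin{aligned}
y &= \sum_i \sum_j w_i X_{ij} w_j \\
\frac{\partial y}{\partial w_k} &= \sum_j X_{kj} w_j + \sum_i w_i X_{ik} = (Xw)_k + (X^Tw)_k
\end{aligned}
$$

Because $w$ appears twice, the product rule contributes one term for each occurrence, and stacking the components over $k$ gives $(X+X^T)w$.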
In particular, if $X$ is a real symmetric matrix, then $X=X^T$, so
$$\frac{\partial y}{\partial w}=2Xw$$
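Both forms of the quadratic-form gradient can be verified with finite differences. The sketch below assumes NumPy. The helper `numerical_grad`, the dimension `n = 4`, and the symmetric matrix `S` built as `X + X.T` are illustrative assumptions.

```python
import numpy as np

def numerical_grad(f, w, eps=1e-6):
    """Central finite-difference gradient of the scalar function f at the vector w."""
    basis = np.eye(w.size)  # each row is one unit direction
    return np.array([(f(w + eps * e) - f(w - eps * e)) / (2 * eps) for e in basis])

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))  # deliberately not symmetric
w = rng.standard_normal(n)

# General case: gradient of w^T X w is (X + X^T) w.
grad_general = (X + X.T) @ w
general_ok = np.allclose(
    grad_general, numerical_grad(lambda v: v @ X @ v, w), atol=1e-5
)

# Symmetric case: with S = S^T the gradient of w^T S w reduces to 2 S w.
S = X + X.T
grad_symmetric = 2 * S @ w
symmetric_ok = np.allclose(
    grad_symmetric, numerical_grad(lambda v: v @ S @ v, w), atol=1e-5
)
print(general_ok, symmetric_ok)
```

Using a non-symmetric `X` in the general case matters: with a symmetric matrix the check could not distinguish $(X+X^T)w$ from $2Xw$.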
