Matrix Operations

Reference:
Wikipedia – Matrix calculus
The Wikipedia article describes matrix differentiation in great detail.


Given two matrices $A=\begin{pmatrix} a_{ij}\end{pmatrix}_{m \times n}$ and $B=\begin{pmatrix} b_{ij}\end{pmatrix}_{m \times n}$, their Hadamard product and Kronecker product are defined as follows:
Hadamard product: $A \circ B=\begin{pmatrix} a_{ij} \cdot b_{ij} \end{pmatrix}_{m \times n}$, also known as the elementwise product.
Kronecker product: $A \otimes B=\begin{pmatrix} a_{11} B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{pmatrix}$
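
For a quick concrete check, here is a minimal NumPy sketch (the matrices `A` and `B` are made up): `*` computes the Hadamard (elementwise) product, and `np.kron` computes the Kronecker product.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 5],
              [6, 7]])

hadamard = A * B           # elementwise: (a_ij * b_ij)
kronecker = np.kron(A, B)  # block matrix with a_ij * B in block (i, j)

print(hadamard)
# [[ 0 10]
#  [18 28]]
print(kronecker.shape)     # (4, 4)
```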

Matrix derivatives:

1. Derivative of a matrix $Y$ with respect to a scalar $x_i$:

This amounts to differentiating each element and then transposing. Note: differentiating an $M \times N$ matrix yields an $N \times M$ matrix.

$$\frac{\partial Y}{\partial x_{i}}=\begin{bmatrix} \frac{\partial Y_{ij}}{\partial x_{i}}\end{bmatrix}^T$$
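
A small SymPy sketch of this convention, using a made-up $2 \times 3$ matrix $Y(t)$: differentiate every entry with respect to $t$ and then transpose, so the result is $3 \times 2$.

```python
import sympy as sp

t = sp.symbols('t')
Y = sp.Matrix([[t,         t**2,    sp.sin(t)],
               [sp.exp(t), 2*t + 1, t**3]])   # a made-up 2x3 matrix function of t

dY_dt = Y.diff(t).T   # elementwise derivative, then transpose (the convention used here)
print(dY_dt)          # a 3x2 matrix
print(dY_dt.shape)    # (3, 2)
```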

2. Derivative of a scalar $y_i$ with respect to a column vector $x$:

$$\frac{\partial y_i}{\partial x}=\begin{bmatrix} \frac{\partial y_{i}}{\partial x_{1}} \\ \frac{\partial y_{i}}{\partial x_{2}} \\ \vdots \end{bmatrix}$$
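
For concreteness, a SymPy sketch with a made-up scalar $y = x_1^2 + x_2 x_3$: the derivative with respect to the column vector $x$ is the column of partial derivatives.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
x = sp.Matrix([x1, x2, x3])          # column vector
y = x1**2 + x2*x3                    # a made-up scalar function of x

grad = sp.Matrix([sp.diff(y, xi) for xi in x])  # column of partial derivatives
print(grad)   # Matrix([[2*x1], [x3], [x2]])
```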

3. Derivative of a row vector $y^T$ with respect to a column vector $x$:

Note: differentiating a $1 \times M$ matrix with respect to an $N \times 1$ matrix yields an $N \times M$ matrix.

$$\frac{\partial y^T}{\partial x}=\frac{\partial \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix}}{\partial \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}}=\begin{bmatrix} \frac{\partial y_{1}}{\partial x} & \frac{\partial y_{2}}{\partial x} & \cdots & \frac{\partial y_{m}}{\partial x} \end{bmatrix}$$

This gives the following identities:

① $\frac{\partial x^T}{\partial x}=I$;  ② $\frac{\partial (Ax)^T}{\partial x}=A^T$
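
Both identities are easy to verify numerically. In this layout the $(i,j)$ entry of $\frac{\partial y^T}{\partial x}$ is $\frac{\partial y_j}{\partial x_i}$, so for $y = Ax$ the result is $A^T$. A finite-difference sketch with made-up sizes (the helper `dyT_dx` is ad hoc):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 3))   # maps R^3 -> R^2
x = rng.normal(size=3)

def dyT_dx(f, x, eps=1e-6):
    """(i, j) entry = d f_j / d x_i, i.e. the N x M layout used in this note."""
    y0 = f(x)
    J = np.zeros((x.size, y0.size))
    for i in range(x.size):
        xp = x.copy(); xp[i] += eps
        J[i] = (f(xp) - y0) / eps
    return J

print(np.allclose(dyT_dx(lambda v: v, x), np.eye(3), atol=1e-4))   # d x^T / d x = I
print(np.allclose(dyT_dx(lambda v: A @ v, x), A.T, atol=1e-4))     # d (Ax)^T / d x = A^T
```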

4. Derivative of a column vector $y$ with respect to a row vector $x^T$:

Note: differentiating an $M \times 1$ matrix with respect to a $1 \times N$ matrix yields an $M \times N$ matrix.

$$\frac{\partial y}{\partial x^T}=\left(\frac{\partial y^T}{\partial x}\right)^T$$
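
A small SymPy sketch of this transpose relation, using a made-up $3 \times 1$ vector function $y(x)$: the Jacobian with entries $\frac{\partial y_i}{\partial x_j}$ is exactly $\frac{\partial y}{\partial x^T}$, and its transpose is $\frac{\partial y^T}{\partial x}$ from item 3.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])
y = sp.Matrix([x1**2, x1*x2, sp.sin(x2)])   # a made-up 3x1 column vector function of x

dy_dxT = y.jacobian(x)   # 3x2, entry (i, j) = d y_i / d x_j, i.e. d y / d x^T
dyT_dx = dy_dxT.T        # 2x3, i.e. d y^T / d x from item 3
print(dy_dxT)
# Matrix([[2*x1, 0], [x2, x1], [0, cos(x2)]])
```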

5. Derivatives of vector products with respect to a column vector $x$:

$$\frac{\partial (uv^T)}{\partial x}=\left(\frac{\partial u}{\partial x}\right)v^T+u\left(\frac{\partial v^T}{\partial x}\right)$$

$$\frac{\partial (u^Tv)}{\partial x}=\left(\frac{\partial u^T}{\partial x}\right)v+\left(\frac{\partial v^T}{\partial x}\right)u$$

$$\frac{\partial (x^TA)}{\partial x}=\left(\frac{\partial x^T}{\partial x}\right)A+x^T\left(\frac{\partial A}{\partial x}\right)=IA+x^T\,0=A$$

$$\frac{\partial (Ax)}{\partial x^T}=\left[\frac{\partial (x^TA^T)}{\partial x}\right]^T=(A^T)^T=A$$

$$\frac{\partial (x^TAx)}{\partial x}=\left(\frac{\partial x^T}{\partial x}\right)Ax+\left[\frac{\partial (Ax)^T}{\partial x}\right]x=Ax+A^Tx$$
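
The quadratic-form result deserves a numerical sanity check. A sketch with a random, not necessarily symmetric $A$; the gradient is taken in the same column layout as item 2 (the helper `grad_fd` is ad hoc):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))   # not symmetric in general
x = rng.normal(size=4)

def grad_fd(f, x, eps=1e-6):
    """Numerical gradient of a scalar function, returned as a vector of partials."""
    g = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy(); xm = x.copy()
        xp[i] += eps; xm[i] -= eps
        g[i] = (f(xp) - f(xm)) / (2 * eps)
    return g

f = lambda v: v @ A @ v   # the scalar x^T A x
print(np.allclose(grad_fd(f, x), A @ x + A.T @ x, atol=1e-5))  # True
```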

6. Derivative of a matrix $Y$ with respect to a column vector $x$:

Y Y Y x x x 的每个分量求偏导构成一个超向量(该向量每个元素都为一个矩阵)

$$\frac{\partial \begin{bmatrix} y_{ij} \end{bmatrix}}{\partial \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}}=\begin{bmatrix} \frac{\partial [y_{ij}]}{\partial x_1} \\ \frac{\partial [y_{ij}]}{\partial x_2} \\ \vdots \\ \frac{\partial [y_{ij}]}{\partial x_n} \end{bmatrix}$$

Note: $\frac{\partial [y_{ij}]}{\partial x_n}$ is itself a matrix.
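
One concrete way to hold this hyper-vector in code is a plain list whose $k$-th element is the matrix $\frac{\partial Y}{\partial x_k}$. A SymPy sketch with a made-up $2 \times 2$ matrix $Y(x_1, x_2)$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = [x1, x2]
Y = sp.Matrix([[x1*x2,  x1],
               [x2**2,  x1 + x2]])   # a made-up 2x2 matrix function of x

# k-th element of the hyper-vector: the matrix dY/dx_k
dY_dx = [Y.diff(xk) for xk in x]
print(dY_dx[0])   # Matrix([[x2, 1], [0, 1]])
print(dY_dx[1])   # Matrix([[x1, 0], [2*x2, 1]])
```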

7. Derivative of a matrix product with respect to a column vector $x$:

$$\frac{\partial (uv)}{\partial x}=\left(\frac{\partial u}{\partial x}\right)v+u\left(\frac{\partial v}{\partial x}\right)$$

$$\frac{\partial (x^TA)}{\partial x}=\left(\frac{\partial x^T}{\partial x}\right)A+x^T\left(\frac{\partial A}{\partial x}\right)=IA+x^T\,0=A$$

8. Derivative of a scalar $y_i$ with respect to a matrix $X$:

y i y_i yi X X X 每个元素求导

$$\frac{\partial y_i}{\partial X}=\frac{\partial y_i}{\partial [x_{ij}]}$$

$$y_i=u^TXv=\sum_i\sum_j u_i x_{ij} v_j \;\Rightarrow\; \frac{\partial y_i}{\partial X}=uv^T$$

$$y_i=u^TX^TXu \;\Rightarrow\; \frac{\partial y_i}{\partial X}=2Xuu^T$$

$$y_i=(Xu-v)^T(Xu-v) \;\Rightarrow\; \frac{\partial y_i}{\partial X}=\frac{\partial (u^TX^TXu-2v^TXu+v^Tv)}{\partial X}=2Xuu^T-2vu^T+0=2(Xu-v)u^T$$
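
The last identity is the gradient of a least-squares objective in $X$, so a numerical check is worthwhile. A sketch with made-up sizes (the helper `matrix_grad_fd` is ad hoc):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 2))
u = rng.normal(size=2)
v = rng.normal(size=3)

def matrix_grad_fd(f, X, eps=1e-6):
    """Numerical d f / d X, entry (i, j) = d f / d X_ij."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Xp = X.copy(); Xm = X.copy()
            Xp[i, j] += eps; Xm[i, j] -= eps
            G[i, j] = (f(Xp) - f(Xm)) / (2 * eps)
    return G

f = lambda M: (M @ u - v) @ (M @ u - v)   # the scalar (Xu - v)^T (Xu - v)
analytic = 2 * np.outer(X @ u - v, u)     # 2 (Xu - v) u^T
print(np.allclose(matrix_grad_fd(f, X), analytic, atol=1e-5))  # True
```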

9. Derivative of a matrix $Y$ with respect to a matrix $X$:

Y Y Y 的每个元素对 X X X 求导,构成一个超级矩阵。
