6 Derivative of a Matrix with Respect to a Scalar
6.1 Definition
For an $m\times n$ matrix whose entries are functions of the scalar $x$,
$$
\boldsymbol{F}_{m\times n}\left( x \right) =\left[ \begin{matrix}
f_{11}\left( x \right)& f_{12}\left( x \right)& \cdots& f_{1n}\left( x \right)\\
f_{21}\left( x \right)& f_{22}\left( x \right)& \cdots& f_{2n}\left( x \right)\\
\vdots& \vdots& \ddots& \vdots\\
f_{m1}\left( x \right)& f_{m2}\left( x \right)& \cdots& f_{mn}\left( x \right)\\
\end{matrix} \right] _{m\times n}
$$
differentiating element by element gives:
$$
\frac{\partial \boldsymbol{F}}{\partial x}=\left[ \begin{matrix}
\frac{\partial f_{11}}{\partial x}& \frac{\partial f_{12}}{\partial x}& \cdots& \frac{\partial f_{1n}}{\partial x}\\
\frac{\partial f_{21}}{\partial x}& \frac{\partial f_{22}}{\partial x}& \cdots& \frac{\partial f_{2n}}{\partial x}\\
\vdots& \vdots& \ddots& \vdots\\
\frac{\partial f_{m1}}{\partial x}& \frac{\partial f_{m2}}{\partial x}& \cdots& \frac{\partial f_{mn}}{\partial x}\\
\end{matrix} \right]
$$
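As a small illustration of the definition, consider a $2\times 2$ matrix whose entries are chosen here purely for the example (they do not come from the original text). Each entry is differentiated on its own, and the result keeps the shape of the original matrix:
$$
\boldsymbol{F}\left( x \right) =\left[ \begin{matrix}
x^2& \sin x\\
e^x& 1\\
\end{matrix} \right] ,\qquad \frac{\partial \boldsymbol{F}}{\partial x}=\left[ \begin{matrix}
2x& \cos x\\
e^x& 0\\
\end{matrix} \right]
$$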
6.2 Rules of Differentiation
$$
\frac{\partial \left[ \boldsymbol{A}\left( t \right) \pm \boldsymbol{B}\left( t \right) \right]}{\partial t}=\frac{\partial \boldsymbol{A}\left( t \right)}{\partial t}\pm \frac{\partial \boldsymbol{B}\left( t \right)}{\partial t}
$$
$$
\frac{\partial \left[ \lambda \left( t \right) \cdot \boldsymbol{A}\left( t \right) \right]}{\partial t}=\frac{\partial \lambda \left( t \right)}{\partial t}\cdot \boldsymbol{A}\left( t \right) +\lambda \left( t \right) \cdot \frac{\partial \boldsymbol{A}\left( t \right)}{\partial t}
$$
where $\lambda \left( t \right)$ is a scalar function of the variable $t$.
$$
\frac{\partial \left[ \boldsymbol{A}\left( t \right) \cdot \boldsymbol{B}\left( t \right) \right]}{\partial t}=\frac{\partial \boldsymbol{A}\left( t \right)}{\partial t}\cdot \boldsymbol{B}\left( t \right) +\boldsymbol{A}\left( t \right) \cdot \frac{\partial \boldsymbol{B}\left( t \right)}{\partial t}
$$
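The product rule can be checked numerically. The sketch below is only an illustration: the matrices `A(t)` and `B(t)`, the point `t0`, and the helper names are assumptions made for this example, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matadd(P, Q):
    """Add two matrices of the same shape."""
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def d_dt(M, t, h=1e-6):
    """Element-wise central-difference approximation of dM/dt."""
    hi, lo = M(t + h), M(t - h)
    return [[(a - b) / (2 * h) for a, b in zip(ra, rb)]
            for ra, rb in zip(hi, lo)]

# Hypothetical example matrices: A(t) is 2x2, B(t) is 2x1.
def A(t):
    return [[t, t ** 2], [math.sin(t), 1.0]]

def B(t):
    return [[math.exp(t)], [t ** 3]]

t0 = 0.7
# Left-hand side: differentiate the product directly.
lhs = d_dt(lambda t: matmul(A(t), B(t)), t0)
# Right-hand side: A'(t) B(t) + A(t) B'(t), keeping the factor order.
rhs = matadd(matmul(d_dt(A, t0), B(t0)), matmul(A(t0), d_dt(B, t0)))

max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
print(max_gap)  # a small number, limited by finite-difference error
```

Note that the order of the factors is preserved on the right-hand side, since matrix multiplication does not commute in general.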
6.3 Examples
[Example 6.1] Find the derivative of $\boldsymbol{x}^T\boldsymbol{Ax}$ with respect to $t$, where
$$
\boldsymbol{x}=\left[ \begin{matrix} x_1\left( t \right)& x_2\left( t \right)& \cdots& x_n\left( t \right)\\ \end{matrix} \right] ^T
$$
$$
\boldsymbol{A}=\left[ \begin{matrix}
a_{11}& a_{12}& \cdots& a_{1n}\\
a_{12}& a_{22}& \cdots& a_{2n}\\
\vdots& \vdots& \ddots& \vdots\\
a_{1n}& a_{2n}& \cdots& a_{nn}\\
\end{matrix} \right]
$$
and $\boldsymbol{A}$ is a symmetric matrix.
[Solution] Write $\hat{\boldsymbol{x}}=\partial \boldsymbol{x}/\partial t$. The entries of $\boldsymbol{A}$ do not depend on $t$, so $\partial \boldsymbol{A}/\partial t=\boldsymbol{0}$, and the product rule gives:
$$
\begin{aligned}
\frac{\partial \left( \boldsymbol{x}^T\boldsymbol{Ax} \right)}{\partial t}&=\frac{\partial \boldsymbol{x}^T}{\partial t}\cdot \boldsymbol{Ax}+\boldsymbol{x}^T\cdot \frac{\partial \left( \boldsymbol{Ax} \right)}{\partial t} \\
&=\frac{\partial \boldsymbol{x}^T}{\partial t}\cdot \boldsymbol{Ax}+\boldsymbol{x}^T\cdot \left( \frac{\partial \boldsymbol{A}}{\partial t}\cdot \boldsymbol{x}+\boldsymbol{A}\frac{\partial \boldsymbol{x}}{\partial t} \right) \\
&=\hat{\boldsymbol{x}}^T\boldsymbol{Ax}+\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}}=\left( \hat{\boldsymbol{x}}^T\boldsymbol{Ax} \right) ^T+\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}} \\
&=\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}}+\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}}=2\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}}
\end{aligned}
$$
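[Note] $\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}}$ and $\hat{\boldsymbol{x}}^T\boldsymbol{Ax}$ are both scalar functions, so each equals its own transpose. Because $\boldsymbol{A}$ is symmetric, $\left( \hat{\boldsymbol{x}}^T\boldsymbol{Ax} \right) ^T=\boldsymbol{x}^T\boldsymbol{A}^T\hat{\boldsymbol{x}}=\boldsymbol{x}^T\boldsymbol{A}\hat{\boldsymbol{x}}$.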
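The result of Example 6.1 can be sanity-checked on a concrete case. The sketch below assumes a hypothetical $2\times 2$ symmetric matrix and the curve $\boldsymbol{x}(t)=[t,\ t^2]^T$, neither of which appears in the original text. Integer arithmetic is used so that the comparison is exact.

```python
def quad_form(x, A, y):
    """Return x^T A y for vectors x, y (lists) and matrix A (list of rows)."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Hypothetical symmetric matrix, constant in t.
A = [[1, 2],
     [2, 3]]

def x(t):
    """Hypothetical curve x(t) = [t, t^2]^T."""
    return [t, t ** 2]

def x_hat(t):
    """Derivative of x(t) with respect to t, written out by hand."""
    return [1, 2 * t]

def exact_derivative(t):
    """x^T A x = t^2 + 4 t^3 + 3 t^4, differentiated by hand."""
    return 2 * t + 12 * t ** 2 + 12 * t ** 3

t0 = 2
formula = 2 * quad_form(x(t0), A, x_hat(t0))  # 2 x^T A x_hat from Example 6.1
print(formula, exact_derivative(t0))  # prints: 148 148
```

If `A` were not symmetric, the two terms of the product rule would in general differ, and the factor of 2 would have to be replaced by $\boldsymbol{x}^T\left( \boldsymbol{A}+\boldsymbol{A}^T \right) \hat{\boldsymbol{x}}$.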
[Example 6.2] Prove that:
$$
\frac{\partial \left[ \boldsymbol{A}\left( t \right) \cdot \boldsymbol{B}\left( t \right) \right]}{\partial t}=\frac{\partial \boldsymbol{A}\left( t \right)}{\partial t}\cdot \boldsymbol{B}\left( t \right) +\boldsymbol{A}\left( t \right) \cdot \frac{\partial \boldsymbol{B}\left( t \right)}{\partial t}
$$
[Proof]
Let $\boldsymbol{A}\left( t \right)$ and $\boldsymbol{B}\left( t \right)$ be $n\times m$ and $m\times l$ matrices, respectively:
$$
\boldsymbol{A}\left( t \right) =\left[ \begin{matrix}
a_{11}\left( t \right)& a_{12}\left( t \right)& \cdots& a_{1m}\left( t \right)\\
a_{21}\left( t \right)& a_{22}\left( t \right)& \cdots& a_{2m}\left( t \right)\\
\vdots& \vdots& \ddots& \vdots\\
a_{n1}\left( t \right)& a_{n2}\left( t \right)& \cdots& a_{nm}\left( t \right)\\
\end{matrix} \right] =\left[ \begin{array}{c}
{\boldsymbol{\alpha }_1}^T\left( t \right)\\
{\boldsymbol{\alpha }_2}^T\left( t \right)\\
\vdots\\
{\boldsymbol{\alpha }_n}^T\left( t \right)\\
\end{array} \right]
$$
$$
\boldsymbol{B}\left( t \right) =\left[ \begin{matrix}
b_{11}\left( t \right)& b_{12}\left( t \right)& \cdots& b_{1l}\left( t \right)\\
b_{21}\left( t \right)& b_{22}\left( t \right)& \cdots& b_{2l}\left( t \right)\\
\vdots& \vdots& \ddots& \vdots\\
b_{m1}\left( t \right)& b_{m2}\left( t \right)& \cdots& b_{ml}\left( t \right)\\
\end{matrix} \right] =\left[ \begin{matrix} \boldsymbol{\beta }_1\left( t \right)& \boldsymbol{\beta }_2\left( t \right)& \cdots& \boldsymbol{\beta }_l\left( t \right)\\ \end{matrix} \right]
$$
$$
\begin{aligned}
\boldsymbol{A}\left( t \right) \cdot \boldsymbol{B}\left( t \right) &=\left[ \begin{matrix}
{\boldsymbol{\alpha }_1}^T\left( t \right) \boldsymbol{\beta }_1\left( t \right)& {\boldsymbol{\alpha }_1}^T\left( t \right) \boldsymbol{\beta }_2\left( t \right)& \cdots& {\boldsymbol{\alpha }_1}^T\left( t \right) \boldsymbol{\beta }_l\left( t \right)\\
{\boldsymbol{\alpha }_2}^T\left( t \right) \boldsymbol{\beta }_1\left( t \right)& {\boldsymbol{\alpha }_2}^T\left( t \right) \boldsymbol{\beta }_2\left( t \right)& \cdots& {\boldsymbol{\alpha }_2}^T\left( t \right) \boldsymbol{\beta }_l\left( t \right)\\
\vdots& \vdots& \ddots& \vdots\\
{\boldsymbol{\alpha }_n}^T\left( t \right) \boldsymbol{\beta }_1\left( t \right)& {\boldsymbol{\alpha }_n}^T\left( t \right) \boldsymbol{\beta }_2\left( t \right)& \cdots& {\boldsymbol{\alpha }_n}^T\left( t \right) \boldsymbol{\beta }_l\left( t \right)\\
\end{matrix} \right] \\
&=\left[ {\boldsymbol{\alpha }_i}^T\left( t \right) \boldsymbol{\beta }_j\left( t \right) \right] _{n\times l}
\end{aligned}
$$
Therefore:
$$
\begin{aligned}
\frac{\partial \left[ \boldsymbol{A}\left( t \right) \cdot \boldsymbol{B}\left( t \right) \right]}{\partial t}&=\frac{\partial \left[ {\boldsymbol{\alpha }_i}^T\left( t \right) \boldsymbol{\beta }_j\left( t \right) \right] _{n\times l}}{\partial t} \\
&=\left[ \frac{\partial {\boldsymbol{\alpha }_i}^T\left( t \right)}{\partial t}\cdot \boldsymbol{\beta }_j\left( t \right) +{\boldsymbol{\alpha }_i}^T\left( t \right) \cdot \frac{\partial \boldsymbol{\beta }_j\left( t \right)}{\partial t} \right] _{n\times l} \\
&=\frac{\partial \boldsymbol{A}\left( t \right)}{\partial t}\cdot \boldsymbol{B}\left( t \right) +\boldsymbol{A}\left( t \right) \cdot \frac{\partial \boldsymbol{B}\left( t \right)}{\partial t}
\end{aligned}
$$
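The middle step of the proof can be spelled out entry by entry. Since the $(i,j)$ entry of the product is the inner product of the $i$-th row of $\boldsymbol{A}$ with the $j$-th column of $\boldsymbol{B}$, the ordinary scalar product rule applies to each term of the sum (the summation index $k$ is introduced here for illustration):
$$
\frac{\partial}{\partial t}\sum_{k=1}^m{a_{ik}\left( t \right) b_{kj}\left( t \right)}=\sum_{k=1}^m{\frac{\partial a_{ik}\left( t \right)}{\partial t}b_{kj}\left( t \right)}+\sum_{k=1}^m{a_{ik}\left( t \right) \frac{\partial b_{kj}\left( t \right)}{\partial t}}
$$
The first sum is the $(i,j)$ entry of $\frac{\partial \boldsymbol{A}\left( t \right)}{\partial t}\cdot \boldsymbol{B}\left( t \right)$ and the second is the $(i,j)$ entry of $\boldsymbol{A}\left( t \right) \cdot \frac{\partial \boldsymbol{B}\left( t \right)}{\partial t}$, which gives the matrix identity.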
7 Derivative of a Matrix with Respect to a Vector
7.1 Definition
For an $m\times l$ matrix of scalar functions
$$
\boldsymbol{F}_{m\times l}\left( \boldsymbol{x} \right) =\left[ \begin{matrix}
f_{11}\left( \boldsymbol{x} \right)& f_{12}\left( \boldsymbol{x} \right)& \cdots& f_{1l}\left( \boldsymbol{x} \right)\\
f_{21}\left( \boldsymbol{x} \right)& f_{22}\left( \boldsymbol{x} \right)& \cdots& f_{2l}\left( \boldsymbol{x} \right)\\
\vdots& \vdots& \ddots& \vdots\\
f_{m1}\left( \boldsymbol{x} \right)& f_{m2}\left( \boldsymbol{x} \right)& \cdots& f_{ml}\left( \boldsymbol{x} \right)\\
\end{matrix} \right] _{m\times l}
$$
and an $n$-dimensional column vector $\boldsymbol{x}=\left[ x_1,x_2,\cdots ,x_n \right]^T$, we have:
$$
\frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial \boldsymbol{x}}=\left[ \begin{array}{c}
\frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_1}\\
\frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_2}\\
\vdots\\
\frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_n}\\
\end{array} \right] _{nm\times l}
$$
$$
\frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial \boldsymbol{x}^T}=\left[ \begin{matrix} \frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_1}& \frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_2}& \cdots& \frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_n}\\ \end{matrix} \right] _{m\times ln}
$$
where:
$$
\frac{\partial \boldsymbol{F}\left( \boldsymbol{x} \right)}{\partial x_i}=\left[ \begin{matrix}
\frac{\partial f_{11}}{\partial x_i}& \frac{\partial f_{12}}{\partial x_i}& \cdots& \frac{\partial f_{1l}}{\partial x_i}\\
\frac{\partial f_{21}}{\partial x_i}& \frac{\partial f_{22}}{\partial x_i}& \cdots& \frac{\partial f_{2l}}{\partial x_i}\\
\vdots& \vdots& \ddots& \vdots\\
\frac{\partial f_{m1}}{\partial x_i}& \frac{\partial f_{m2}}{\partial x_i}& \cdots& \frac{\partial f_{ml}}{\partial x_i}\\
\end{matrix} \right] _{m\times l}
$$
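The two layouts differ only in how the $n$ blocks $\partial \boldsymbol{F}/\partial x_i$ are arranged. The sketch below builds both for a hypothetical $\boldsymbol{F}:\mathbb{R}^3\rightarrow \mathbb{R}^{2\times 2}$ (so $n=3$, $m=2$, $l=2$). The function `F`, the point `x0`, and the helper names are assumptions for this example, and each block is approximated by central differences.

```python
def partial(F, x, i, h=1e-6):
    """Central-difference approximation of the m x l block dF/dx_i."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    Fp, Fm = F(xp), F(xm)
    return [[(a - b) / (2 * h) for a, b in zip(ra, rb)]
            for ra, rb in zip(Fp, Fm)]

def dF_dx(F, x):
    """Stack the n blocks vertically: shape (n*m) x l."""
    rows = []
    for i in range(len(x)):
        rows.extend(partial(F, x, i))
    return rows

def dF_dxT(F, x):
    """Place the n blocks side by side: shape m x (l*n)."""
    blocks = [partial(F, x, i) for i in range(len(x))]
    m = len(blocks[0])
    return [[v for blk in blocks for v in blk[r]] for r in range(m)]

# Hypothetical matrix-valued function of a 3-vector.
def F(x):
    x1, x2, x3 = x
    return [[x1 * x2, x3],
            [x2 ** 2, x1 + x3]]

x0 = [1.0, 2.0, 3.0]
col = dF_dx(F, x0)    # blocks stacked vertically
row = dF_dxT(F, x0)   # blocks placed side by side
print(len(col), len(col[0]), len(row), len(row[0]))  # prints: 6 2 2 6
```

Here `col` has shape $nm\times l=6\times 2$ and `row` has shape $m\times ln=2\times 6$, matching the subscripts in the definitions above.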
7.2 Rules of Differentiation
7.2.1 Addition
$$
\frac{\partial \left[ \boldsymbol{A}\left( \boldsymbol{x} \right) \pm \boldsymbol{B}\left( \boldsymbol{x} \right) \right]}{\partial \boldsymbol{x}}=\frac{\partial \boldsymbol{A}\left( \boldsymbol{x} \right)}{\partial \boldsymbol{x}}\pm \frac{\partial \boldsymbol{B}\left( \boldsymbol{x} \right)}{\partial \boldsymbol{x}}
$$
7.2.2 Scalar Multiplication
$$
\frac{\partial \left( \lambda \boldsymbol{A} \right)}{\partial \boldsymbol{x}}=\left[ \frac{\partial \lambda}{\partial \boldsymbol{x}} \right] \cdot \boldsymbol{A}+\lambda \cdot \frac{\partial \boldsymbol{A}}{\partial \boldsymbol{x}}
$$
where $\lambda =\lambda \left( \boldsymbol{x} \right)$ is a scalar function of $\boldsymbol{x}$, and the first term is the block column whose blocks are stacked in the same way as in the definition of $\partial \boldsymbol{F}/\partial \boldsymbol{x}$:
$$
\left[ \frac{\partial \lambda}{\partial \boldsymbol{x}} \right] \cdot \boldsymbol{A}=\left[ \begin{array}{c}
\frac{\partial \lambda}{\partial x_1}\cdot \boldsymbol{A}\\
\frac{\partial \lambda}{\partial x_2}\cdot \boldsymbol{A}\\
\vdots\\
\frac{\partial \lambda}{\partial x_n}\cdot \boldsymbol{A}\\
\end{array} \right]
$$
7.2.3 Matrix Multiplication
$$
\frac{\partial \left( \boldsymbol{AB} \right)}{\partial \boldsymbol{x}}=\frac{\partial \boldsymbol{A}}{\partial \boldsymbol{x}}\cdot \boldsymbol{B}+\boldsymbol{A}\cdot \frac{\partial \boldsymbol{B}}{\partial \boldsymbol{x}}
$$
where the second term is understood block by block, with the blocks stacked vertically:
$$
\boldsymbol{A}\cdot \frac{\partial \boldsymbol{B}}{\partial \boldsymbol{x}}=\left[ \begin{array}{c}
\boldsymbol{A}\cdot \frac{\partial \boldsymbol{B}}{\partial x_1}\\
\boldsymbol{A}\cdot \frac{\partial \boldsymbol{B}}{\partial x_2}\\
\vdots\\
\boldsymbol{A}\cdot \frac{\partial \boldsymbol{B}}{\partial x_n}\\
\end{array} \right]
$$
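A dimension check makes clear why the second term of the product rule needs its own block-wise definition. Assume, for illustration, that $\boldsymbol{A}$ is $m\times k$ and $\boldsymbol{B}$ is $k\times l$ (the symbol $k$ and the Kronecker-product notation $\otimes$ are introduced here and are not in the original text). Then $\partial \boldsymbol{A}/\partial \boldsymbol{x}$ is $nm\times k$ and multiplies $\boldsymbol{B}$ as an ordinary matrix product, while $\partial \boldsymbol{B}/\partial \boldsymbol{x}$ is $nk\times l$ and cannot be left-multiplied by the $m\times k$ matrix $\boldsymbol{A}$ directly. One way to write the block-wise product as an ordinary one is:
$$
\underset{nm\times l}{\underbrace{\frac{\partial \left( \boldsymbol{AB} \right)}{\partial \boldsymbol{x}}}}=\underset{nm\times k}{\underbrace{\frac{\partial \boldsymbol{A}}{\partial \boldsymbol{x}}}}\cdot \underset{k\times l}{\underbrace{\boldsymbol{B}}}+\underset{nm\times nk}{\underbrace{\left( \boldsymbol{I}_n\otimes \boldsymbol{A} \right) }}\cdot \underset{nk\times l}{\underbrace{\frac{\partial \boldsymbol{B}}{\partial \boldsymbol{x}}}}
$$
Here $\boldsymbol{I}_n\otimes \boldsymbol{A}$ is the block-diagonal matrix with $n$ copies of $\boldsymbol{A}$ on its diagonal, so its product with the stacked blocks $\partial \boldsymbol{B}/\partial x_i$ reproduces the column of blocks $\boldsymbol{A}\cdot \partial \boldsymbol{B}/\partial x_i$.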
8 Derivative of a Matrix with Respect to a Matrix
8.1 Definition
For an $n\times l$ matrix of scalar functions
$$
\boldsymbol{F}_{n\times l}\left( \boldsymbol{X} \right) =\left[ \begin{matrix}
f_{11}\left( \boldsymbol{X} \right)& f_{12}\left( \boldsymbol{X} \right)& \cdots& f_{1l}\left( \boldsymbol{X} \right)\\
f_{21}\left( \boldsymbol{X} \right)& f_{22}\left( \boldsymbol{X} \right)& \cdots& f_{2l}\left( \boldsymbol{X} \right)\\
\vdots& \vdots& \ddots& \vdots\\
f_{n1}\left( \boldsymbol{X} \right)& f_{n2}\left( \boldsymbol{X} \right)& \cdots& f_{nl}\left( \boldsymbol{X} \right)\\
\end{matrix} \right] _{n\times l}
$$
and a $p\times m$ matrix of scalar variables
$$
\boldsymbol{X}_{p\times m}=\left[ \begin{matrix}
x_{11}& x_{12}& \cdots& x_{1m}\\
x_{21}& x_{22}& \cdots& x_{2m}\\
\vdots& \vdots& \ddots& \vdots\\
x_{p1}& x_{p2}& \cdots& x_{pm}\\
\end{matrix} \right]
$$
we have:
$$
\frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial \boldsymbol{X}}=\left[ \begin{matrix}
\frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{11}}& \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{12}}& \cdots& \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{1m}}\\
\frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{21}}& \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{22}}& \cdots& \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{2m}}\\
\vdots& \vdots& \ddots& \vdots\\
\frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{p1}}& \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{p2}}& \cdots& \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{pm}}\\
\end{matrix} \right]
$$
Each block of this matrix is:
$$
\left[ \frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial x_{ij}} \right] =\left[ \begin{matrix}
\frac{\partial f_{11}\left( \boldsymbol{X} \right)}{\partial x_{ij}}& \frac{\partial f_{12}\left( \boldsymbol{X} \right)}{\partial x_{ij}}& \cdots& \frac{\partial f_{1l}\left( \boldsymbol{X} \right)}{\partial x_{ij}}\\
\frac{\partial f_{21}\left( \boldsymbol{X} \right)}{\partial x_{ij}}& \frac{\partial f_{22}\left( \boldsymbol{X} \right)}{\partial x_{ij}}& \cdots& \frac{\partial f_{2l}\left( \boldsymbol{X} \right)}{\partial x_{ij}}\\
\vdots& \vdots& \ddots& \vdots\\
\frac{\partial f_{n1}\left( \boldsymbol{X} \right)}{\partial x_{ij}}& \frac{\partial f_{n2}\left( \boldsymbol{X} \right)}{\partial x_{ij}}& \cdots& \frac{\partial f_{nl}\left( \boldsymbol{X} \right)}{\partial x_{ij}}\\
\end{matrix} \right]
$$
It follows that $\frac{\partial \boldsymbol{F}\left( \boldsymbol{X} \right)}{\partial \boldsymbol{X}}$ is a $pn\times ml$ matrix: a $p\times m$ grid of blocks, each of size $n\times l$.
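The block layout can be assembled mechanically. The sketch below does so for a hypothetical $\boldsymbol{F}:\mathbb{R}^{2\times 3}\rightarrow \mathbb{R}^{2\times 1}$ (so $p=2$, $m=3$, $n=2$, $l=1$). The function `F`, the point `X0`, and the helper names are assumptions for this example, and each block is approximated by central differences.

```python
def partial_ij(F, X, i, j, h=1e-6):
    """Central-difference approximation of the n x l block dF/dx_ij."""
    Xp = [row[:] for row in X]
    Xm = [row[:] for row in X]
    Xp[i][j] += h
    Xm[i][j] -= h
    Fp, Fm = F(Xp), F(Xm)
    return [[(a - b) / (2 * h) for a, b in zip(ra, rb)]
            for ra, rb in zip(Fp, Fm)]

def dF_dX(F, X):
    """Assemble the p x m grid of n x l blocks into a (p*n) x (m*l) matrix."""
    p, m = len(X), len(X[0])
    out = []
    for i in range(p):
        # Blocks of block-row i, ordered by j.
        blocks = [partial_ij(F, X, i, j) for j in range(m)]
        n = len(blocks[0])
        for r in range(n):
            # Row r of every block in this block-row, concatenated.
            out.append([v for blk in blocks for v in blk[r]])
    return out

# Hypothetical matrix-valued function of a 2x3 matrix.
def F(X):
    return [[X[0][0] * X[1][2]],
            [X[0][1] + X[1][1] ** 2]]

X0 = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]
D = dF_dX(F, X0)
print(len(D), len(D[0]))  # prints: 4 3
```

The result has $pn=4$ rows and $ml=3$ columns, in agreement with the dimension stated above.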