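1. Derivative of a matrix Y with respect to a scalar x
Differentiate every element, then transpose the result. Note that an M×N matrix therefore has an N×M derivative.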
$$Y = [y_{ij}] \;\longrightarrow\; \frac{dY}{dx} = \left[\frac{dy_{ji}}{dx}\right]$$
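As a small worked example of this rule, take a hypothetical 2×3 matrix (chosen only for illustration). Under the transposing convention stated above, its derivative is 3×2:

$$Y = \begin{bmatrix} x & x^2 & x^3 \\ 1 & \sin x & e^x \end{bmatrix} \;\longrightarrow\; \frac{dY}{dx} = \begin{bmatrix} 1 & 0 \\ 2x & \cos x \\ 3x^2 & e^x \end{bmatrix}$$

Many references define this derivative without the transpose, so it is worth checking which layout a given text uses.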
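2. Derivative of a scalar y with respect to a column vector X:
Unlike the case above, the entries in parentheses are partial derivatives and the result is not transposed: differentiating with respect to an N×1 vector gives an N×1 vector.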
$$y = f(x_1, x_2, \ldots, x_n) \;\longrightarrow\; dy/dX = \left(\partial y/\partial x_1,\ \partial y/\partial x_2,\ \ldots,\ \partial y/\partial x_n\right)'$$
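The definition above can be checked numerically. The following is a minimal sketch in plain Python that approximates dy/dX with central finite differences. The helper `numerical_gradient`, the example function `f`, and the step size `h` are all assumptions made for illustration and are not part of the original notes.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate dy/dX for a scalar-valued f at the point x (list of floats).

    Returns a list with the same length as x, i.e. an N x 1 column vector
    whose i-th entry is a central-difference estimate of the partial
    derivative of f with respect to x[i].
    """
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


# Hypothetical example function: y = x1^2 + 3*x1*x2 + x3
def f(x):
    return x[0] ** 2 + 3 * x[0] * x[1] + x[2]


point = [1.0, 2.0, 5.0]
approx = numerical_gradient(f, point)

# Analytic gradient for comparison: (2*x1 + 3*x2, 3*x1, 1)
exact = [2 * point[0] + 3 * point[1], 3 * point[0], 1.0]

print("finite differences:", approx)
print("analytic:          ", exact)
```

The two printed vectors should agree up to rounding error, and both have as many entries as X, matching the N×1 shape stated above.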
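3. Derivative of a row vector Y' with respect to a column vector X:
Note that differentiating a 1×M vector with respect to an N×1 vector gives an N×M matrix.
Differentiate each column (entry) of Y' with respect to X, and assemble the resulting columns into a matrix.
Important results: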
$$\begin{aligned}
dX'/dX &= I \\
d(AX)'/dX &= A'
\end{aligned}$$
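The two results above can be sanity-checked with finite differences. This is a sketch in plain Python under the layout described in this section, where entry (i, j) of the N×M result is the partial derivative of y_j with respect to x_i. The helper names `matvec`, `transpose` and `d_row_by_col`, and the sample values of `A` and `x0`, are assumptions introduced for illustration.

```python
def matvec(A, x):
    """Multiply an M x N matrix A (list of rows) by a length-N vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def d_row_by_col(func, x, h=1e-6):
    """Finite-difference estimate of dY'/dX in the N x M layout.

    func maps a length-N list to a length-M list. Row i of the result
    holds the partial derivatives of every y_j with respect to x[i].
    """
    result = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        y_fwd = func(forward)
        y_bwd = func(backward)
        result.append([(p - q) / (2 * h) for p, q in zip(y_fwd, y_bwd)])
    return result


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]      # M = 2, N = 3
x0 = [0.5, -1.0, 2.0]

jac = d_row_by_col(lambda x: matvec(A, x), x0)   # should be close to A'
identity = d_row_by_col(lambda x: list(x), x0)   # should be close to I

print(jac)
print(transpose(A))
print(identity)
```

Here `jac` is 3×2, the shape of A', which reflects the N×M layout rather than the M×N Jacobian used in many other texts.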
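4. Derivative of a column vector Y with respect to a row vector X':
Convert it to the derivative of the row vector Y' with respect to the column vector X, then transpose.
Note that differentiating an M×1 vector with respect to a 1×N vector gives an M×N matrix.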
$$dY/dX' = (dY'/dX)'$$
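5. Rules for differentiating vector products with respect to a column vector X:
Note that these differ slightly from the scalar product rule.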
$$\begin{aligned}
d(UV')/dX &= (dU/dX)V' + U(dV'/dX) \\
d(U'V)/dX &= (dU'/dX)V + (dV'/dX)U
\end{aligned}$$
Important results:
$$\begin{aligned}
d(X'A)/dX &= (dX'/dX)A + (dA'/dX)X = IA + 0X = A \\
d(AX)/dX' &= (d(X'A')/dX)' = (A')' = A \\
d(X'AX)/dX &= (dX'/dX)AX + (d(AX)'/dX)X = AX + A'X
\end{aligned}$$
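The last result, d(X'AX)/dX = AX + A'X, is easy to get wrong when A is not symmetric, so a numerical check can help. The sketch below uses plain Python and central finite differences. The helpers `matvec`, `transpose`, `quad_form` and `numerical_gradient`, and the sample matrix, are assumptions introduced for illustration.

```python
def matvec(A, x):
    """Multiply a matrix A (list of rows) by a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def quad_form(A, x):
    """Evaluate the scalar x' A x."""
    return sum(xi * yi for xi, yi in zip(x, matvec(A, x)))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function f."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


# A deliberately non-symmetric matrix, so that AX and A'X differ.
A = [[2.0, 1.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, 1.0]]
x0 = [1.0, -2.0, 0.5]

approx = numerical_gradient(lambda x: quad_form(A, x), x0)

# Closed form from the text: AX + A'X
exact = [p + q for p, q in zip(matvec(A, x0), matvec(transpose(A), x0))]

print("finite differences:", approx)
print("AX + A'X:          ", exact)
```

When A is symmetric the closed form reduces to 2AX, which is the version most often quoted.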
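6. Derivative of a matrix Y with respect to a column vector X:
Take the partial derivative of Y with respect to each component of X and stack the results into a super-vector.
Note that every element of this vector is itself a matrix.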
7. Rules for differentiating matrix products with respect to a column vector:
$$\begin{aligned}
d(uV)/dX &= (du/dX)V + u(dV/dX) \\
d(UV)/dX &= (dU/dX)V + U(dV/dX)
\end{aligned}$$
Important result:
$$d(X'A)/dX = (dX'/dX)A + X'(dA/dX) = IA + X'0 = A$$
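8. Derivative of a scalar y with respect to a matrix X:
As with the derivative of a scalar y with respect to a column vector X,
take the partial derivative of y with respect to each element of X, with no transpose.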
$$dy/dX = \left[\partial y/\partial x_{ij}\right]$$
Important results:
(1)
$$y = U'XV = \sum_i \sum_j u_i x_{ij} v_j,$$
then
$$dy/dX = [u_i v_j] = UV'$$
(2)
$$y = U'X'XU$$
then
$$dy/dX = 2XUU'$$
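Result (2) can be verified element by element. The short derivation below is a sketch that treats U as a column vector with entries u_j and writes out the sum explicitly; the index names k and l are introduced here for illustration.

$$\begin{aligned}
y &= (XU)'(XU) = \sum_i \Big(\sum_j x_{ij} u_j\Big)^2 \\
\partial y/\partial x_{kl} &= 2\Big(\sum_j x_{kj} u_j\Big) u_l = 2\,(XU)_k\, u_l
\end{aligned}$$

Since (XU)_k u_l is the (k, l) entry of XUU', collecting all entries gives the matrix 2XUU'.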
(3)
$$y = (XU-V)'(XU-V)$$
then
$$dy/dX = d(U'X'XU - 2V'XU + V'V)/dX = 2XUU' - 2VU' + 0 = 2(XU-V)U'$$
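Result (3) is the gradient of a squared residual, so it is a natural candidate for a numerical check. The sketch below perturbs each entry of X in turn and compares the finite-difference estimate with 2(XU−V)U'. It uses plain Python. The helpers `matvec`, `residual_norm_sq` and `numerical_matrix_gradient`, and the sample values, are assumptions introduced for illustration.

```python
def matvec(X, u):
    """Multiply a matrix X (list of rows) by a vector u."""
    return [sum(a * b for a, b in zip(row, u)) for row in X]


def residual_norm_sq(X, U, V):
    """Evaluate y = (XU - V)'(XU - V) for column vectors U and V."""
    r = [a - b for a, b in zip(matvec(X, U), V)]
    return sum(ri * ri for ri in r)


def numerical_matrix_gradient(f, X, h=1e-6):
    """Central-difference estimate of dy/dX, with the same shape as X."""
    rows, cols = len(X), len(X[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            forward = [row[:] for row in X]
            backward = [row[:] for row in X]
            forward[i][j] += h
            backward[i][j] -= h
            grad[i][j] = (f(forward) - f(backward)) / (2 * h)
    return grad


X0 = [[1.0, 2.0],
      [0.0, -1.0],
      [3.0, 0.5]]          # 3 x 2
U = [2.0, -1.0]            # 2 x 1
V = [1.0, 0.0, -2.0]       # 3 x 1

approx = numerical_matrix_gradient(lambda X: residual_norm_sq(X, U, V), X0)

# Closed form from the text: 2 (XU - V) U', an outer product of shape 3 x 2
r = [a - b for a, b in zip(matvec(X0, U), V)]
exact = [[2 * ri * uj for uj in U] for ri in r]

print(approx)
print(exact)
```

The gradient has the same 3×2 shape as X, in line with the rule that each element of X gets its own partial derivative without a transpose.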
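9. Derivative of a matrix Y with respect to a matrix X:
Differentiate each element of Y with respect to X, then arrange the results together into a super-matrix.
10. Derivative of a product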
$$d(f*g)/dx = (df'/dx)g + (dg/dx)f'$$
Result
$$d(x'Ax)/dx = (d(x'')/dx)Ax + (d(Ax)/dx)(x'') = Ax + A'x$$
(Note: '' denotes transposing twice.)
11. Let A be an m×n matrix and x an n-dimensional column vector (with m = n wherever x'A or x'Ax appears); then
(1)
$$\frac{d(Ax)}{dx} = A'$$
(2)
$$\frac{d(x'A)}{dx} = A$$
(3)
$$\frac{d(x'Ax)}{dx} = (A'+A)x$$
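Result (3) can also be derived component by component, which avoids relying on any product rule. The following sketch assumes A is square with entries a_ij; the index k is introduced here for illustration.

$$\begin{aligned}
x'Ax &= \sum_i \sum_j a_{ij}\, x_i x_j \\
\frac{\partial (x'Ax)}{\partial x_k} &= \sum_j a_{kj} x_j + \sum_i a_{ik} x_i = (Ax)_k + (A'x)_k
\end{aligned}$$

Stacking the components for k = 1, ..., n gives (A + A')x, which reduces to 2Ax when A is symmetric.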