Three formulas are especially handy when differentiating with respect to a vector.
They are:
1. Gradient with respect to the vector x
$$\nabla_x w^Tx = w$$
2. Gradient of a quadratic form with respect to the vector x
$$\nabla_x x^TAx = (A+A^T)x$$
where x is a vector and A is a (square) matrix.
3. Second derivative with respect to the vector x (the Hessian matrix)
$$\nabla^2 x^TAx = A+A^T$$
Detailed proofs
1. Gradient with respect to the vector x
$$\nabla_x w^Tx = w$$
Proof:
$$w^Tx = \begin{pmatrix}w_1 & w_2 & \cdots & w_n\end{pmatrix}\begin{pmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{pmatrix} = \sum_{i=1}^n w_ix_i$$
So the partial derivative with respect to $x_i$ is $w_i$. Stacking these partial derivatives for $i=1,\dots,n$ gives
$$\nabla_x w^Tx = w$$
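The result can be sanity-checked numerically. The following is only a minimal sketch: the helper names `linear_form` and `numerical_gradient` and the values of `w` and `x` are illustrative assumptions, not part of the derivation. It approximates each partial derivative with a central difference and compares the result with `w`.

```python
def linear_form(w, x):
    """Return w^T x for two equal-length lists of floats."""
    return sum(wi * xi for wi, xi in zip(w, x))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Arbitrary illustrative values (assumed for this sketch).
w = [2.0, -1.0, 0.5]
x = [1.0, 3.0, -2.0]

grad = numerical_gradient(lambda v: linear_form(w, v), x)
print(grad)  # each entry should be close to the matching entry of w
```

Because $w^Tx$ is linear in x, the approximation does not depend on the point `x` chosen; any other point should give the same gradient up to rounding.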
2. Gradient of a quadratic form with respect to the vector x
$$\nabla_x x^TAx = (A+A^T)x$$
where x is a vector and A is a (square) matrix.
Proof:
Expand the quadratic form $x^TAx$:
$$
\begin{aligned}
x^TAx &= \begin{pmatrix}x_1 & x_2 & \cdots & x_n\end{pmatrix}
\begin{pmatrix}a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn}\end{pmatrix}
\begin{pmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{pmatrix}\\
&= \begin{pmatrix}x_1 & x_2 & \cdots & x_n\end{pmatrix}
\begin{pmatrix}a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n\\ a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n\\ \vdots\\ a_{n1}x_1+a_{n2}x_2+\cdots+a_{nn}x_n\end{pmatrix}\\
&= a_{11}x_1x_1+a_{12}x_1x_2+\cdots+a_{1n}x_1x_n\\
&\quad + a_{21}x_2x_1+a_{22}x_2x_2+\cdots+a_{2n}x_2x_n\\
&\quad + \cdots + a_{n1}x_nx_1+a_{n2}x_nx_2+\cdots+a_{nn}x_nx_n\\
&= \sum_{i=1}^n\sum_{j=1}^n a_{ij}x_ix_j
\end{aligned}
$$
To differentiate with respect to $x_1$ alone, collect the terms of the double sum that contain $x_1$:
$$x^TAx = a_{11}x_1x_1 + \sum_{i=2}^n a_{i1}x_ix_1 + \sum_{j=2}^n a_{1j}x_jx_1 + c$$
where $c$ gathers all terms that do not depend on $x_1$.
Differentiating with respect to $x_1$ gives
$$
\begin{aligned}
\frac{\partial (x^TAx)}{\partial x_1} &= 2a_{11}x_1 + \sum_{i=2}^n a_{i1}x_i + \sum_{j=2}^n a_{1j}x_j\\
&= \sum_{i=1}^n a_{i1}x_i + \sum_{j=1}^n a_{1j}x_j\\
&= A^T[1,:]\cdot x + A[1,:]\cdot x
\end{aligned}
$$
where $A[1,:]$ denotes the first row of $A$, and $A^T[1,:]$ the first row of $A^T$ (that is, the first column of $A$).
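The same argument applies to every component $x_k$: the k-th partial derivative is the k-th row of $A+A^T$ times $x$. Hence the gradient with respect to x is
$$(A+A^T)\cdot x$$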
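As a rough check of this formula, the sketch below compares $(A+A^T)x$ with a central-difference gradient of $x^TAx$. The function names (`quadratic_form`, `analytic_gradient`, `numerical_gradient`) and the sample matrix and vector are assumptions made for illustration only. The matrix is deliberately non-symmetric.

```python
def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def analytic_gradient(A, x):
    """Return (A + A^T) x, the gradient given by the formula above."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Arbitrary non-symmetric example (assumed values, not from the text).
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [5.0, 0.0, -2.0]]
x = [1.0, -2.0, 0.5]

analytic = analytic_gradient(A, x)
numeric = numerical_gradient(lambda v: quadratic_form(A, v), x)
print(analytic)
print(numeric)  # should agree with the analytic gradient up to rounding
```

When $A$ is symmetric, $A+A^T=2A$ and the gradient reduces to $2Ax$; for a non-symmetric matrix such as the one above, $2Ax$ would not match the numerical gradient.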
3. Second derivative with respect to the vector x (the Hessian matrix)
$$\nabla^2 x^TAx = A+A^T$$
Proof:
Write the quadratic form as
$$x^TAx = \sum_{i=1}^n\sum_{j=1}^n x_ix_ja_{ij}$$
Each entry of the Hessian matrix of $f(x)=x^TAx$ is
$$H_{ij} = \frac{\partial^2 f}{\partial x_i\partial x_j}$$
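To compute $H_{ij}$, find the terms of the sum that contain the product $x_ix_j$. For $i\neq j$ these are $a_{ij}x_ix_j$ and $a_{ji}x_jx_i$; for $i=j$ the only such term is $a_{ii}x_i^2$, whose second derivative is $2a_{ii}$. In both cases
$$H_{ij} = \frac{\partial^2 f}{\partial x_i\partial x_j} = a_{ij}+a_{ji}$$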
Therefore
$$H = A+A^T$$
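This form of the second-order partial derivatives is also consistent with the second derivative of a single-variable function: for $n=1$, $f(x)=ax^2$ has $f''(x)=2a$, which is $A+A^T$ for the $1\times 1$ matrix $A=(a)$.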
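The Hessian formula can be checked in the same spirit. This is a sketch under stated assumptions: `numerical_hessian` is a hypothetical helper that uses a standard second-order central-difference stencil, and the matrix and point are arbitrary illustrative values.

```python
def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def numerical_hessian(f, x, h=1e-3):
    """Approximate each second partial derivative with central differences."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f at x + si*h*e_i + sj*h*e_j.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


# Arbitrary non-symmetric example (assumed values, not from the text).
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [5.0, 0.0, -2.0]]
x = [1.0, -2.0, 0.5]

H = numerical_hessian(lambda v: quadratic_form(A, v), x)
expected = [[A[i][j] + A[j][i] for j in range(3)] for i in range(3)]
for row in H:
    print(row)  # each row should be close to the matching row of A + A^T
```

The Hessian of a quadratic form is constant, so the comparison should hold at any point `x`, not just the one chosen here.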