1. Derivative of a function with respect to a vector
Definition. Let $f(x)=f(x_1, x_2, \cdots, x_n)$ be a multivariate function with $x\in \mathbb{R}^n$. The column vector

$$\left(\dfrac{\partial f(x)}{\partial x_1}, \dfrac{\partial f(x)}{\partial x_2}, \cdots, \dfrac{\partial f(x)}{\partial x_n}\right)^T$$

is called the derivative (or gradient) of $f(x)$ with respect to the vector $x$. It is written $\dfrac{df(x)}{dx}$ or $\nabla_x f(x)$, and also $\mathrm{grad}\,f(x)$ or $\nabla f(x)$.
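The definition can be checked numerically by approximating each partial derivative with a central difference. The following is a minimal sketch using only the Python standard library; the helper `numerical_gradient`, the step size `h`, and the example function `f` are assumptions made for illustration and do not come from the original text.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of a scalar function f at x (a list of floats).

    Component i is the central difference (f(x + h*e_i) - f(x - h*e_i)) / (2h),
    which approximates the partial derivative of f with respect to x_i.
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function: f(x) = x1^2 + 3*x1*x2,
    # whose exact gradient is (2*x1 + 3*x2, 3*x1).
    return x[0] ** 2 + 3 * x[0] * x[1]


g = numerical_gradient(f, [1.0, 2.0])
print(g)  # close to the exact gradient (8, 3), up to rounding error
```

The result is a list with one entry per coordinate of $x$, mirroring the column vector in the definition.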
(1) If $f(x)=Ax$, then $\nabla f(x)=A^T$. In the expansion below, the $\alpha_i$ are the column vectors of $A$. Here $f$ is vector-valued, so the definition is applied to each component: since $\dfrac{\partial f(x)}{\partial x_i}=\alpha_i$, row $i$ of the result is $\alpha_i^T$.

$$\begin{aligned} &f(x)=(\alpha_1,\alpha_2,\cdots,\alpha_n)(x_1,x_2,\cdots,x_n)^T =\alpha_1x_1+\alpha_2x_2+\cdots+\alpha_nx_n \\ &\nabla f(x)=(\alpha_1,\alpha_2,\cdots,\alpha_n)^T=A^T \end{aligned}$$
(2) If $f(x)=x^TA$, then $\nabla f(x)=A$. In the expansion below, the $\alpha_i$ are the row vectors of $A$, and $(\alpha_1,\alpha_2,\cdots,\alpha_n)^T$ denotes these rows stacked on top of one another.

$$\begin{aligned} &f(x)=(x_1,x_2,\cdots,x_n)(\alpha_1,\alpha_2,\cdots,\alpha_n)^T =\alpha_1x_1+\alpha_2x_2+\cdots+\alpha_nx_n \\ &\nabla f(x)=(\alpha_1,\alpha_2,\cdots,\alpha_n)^T=A \end{aligned}$$
(3) If $f(x)=y^TAx$, where $y$ does not depend on $x$, then $\nabla f(x)=A^Ty$. This follows from (1) with $A$ replaced by the row vector $y^TA$, whose transpose is $A^Ty$.
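Result (3) can be sanity-checked without any approximation: $y^TAx$ is linear in $x$, so its partial derivative with respect to $x_j$ equals the value of the function at the $j$-th unit vector. The sketch below compares those values with $A^Ty$. The helper names (`bilinear`, `transpose_times`, `unit_vector`) and the sample values of `A` and `y` are hypothetical and chosen only for illustration.

```python
def bilinear(y, A, x):
    """Return y^T A x for y of length m, A of shape m x n, and x of length n."""
    return sum(y[i] * A[i][j] * x[j]
               for i in range(len(y)) for j in range(len(x)))


def transpose_times(A, y):
    """Return the vector A^T y as a list of length n."""
    rows, cols = len(A), len(A[0])
    return [sum(A[i][j] * y[i] for i in range(rows)) for j in range(cols)]


def unit_vector(n, j):
    """Return the j-th standard basis vector of length n."""
    return [1.0 if k == j else 0.0 for k in range(n)]


# Assumed sample data: a 2 x 3 matrix and a vector y of length 2.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
y = [1.0, -1.0]

analytic = transpose_times(A, y)  # the claimed gradient A^T y
# Because f is linear in x, the partial derivative w.r.t. x_j equals f(e_j).
partials = [bilinear(y, A, unit_vector(3, j)) for j in range(3)]

print(analytic)  # [-3.0, -3.0, -3.0]
print(partials)  # [-3.0, -3.0, -3.0]
```

Both lists agree entry by entry, which is what (3) predicts for this particular choice of $A$ and $y$.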
(4) If $f(x)=x^TAx$, then $\nabla f(x)=(A^T+A)x$.
Method 1: expand the quadratic form into a double sum and differentiate term by term.

$$\begin{aligned} f(x)&= \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \\ &= \left(\sum_{i=1}^n x_ia_{i1},\sum_{i=1}^n x_ia_{i2},\cdots,\sum_{i=1}^n x_ia_{in}\right) (x_1,x_2,\cdots,x_n)^T \\ &=\sum_{j=1}^n\sum_{i=1}^n x_i a_{ij} x_j \end{aligned}$$

Differentiating with respect to $x_k$ gives $\dfrac{\partial f(x)}{\partial x_k}=\sum_{j=1}^n a_{kj}x_j+\sum_{i=1}^n x_ia_{ik}$, which is the $k$-th component of $Ax+A^Tx$. Therefore

$$\nabla f(x)=(A+A^T)x$$
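As a small worked example of the expansion above, take $n=2$ with a non-symmetric matrix. The particular matrix $A$ here is an assumption chosen for illustration, not part of the original derivation.

$$\begin{aligned} A&=\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},\qquad f(x)=x_1^2+2x_1x_2+3x_2x_1+4x_2^2=x_1^2+5x_1x_2+4x_2^2 \\ \nabla f(x)&=\begin{pmatrix} 2x_1+5x_2 \\ 5x_1+8x_2 \end{pmatrix} =\begin{pmatrix} 2 & 5 \\ 5 & 8 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}=(A+A^T)x \end{aligned}$$

Note that $2Ax$ would give $(2x_1+4x_2,\ 6x_1+8x_2)^T$, which differs from the gradient; the two coincide only when $A$ is symmetric.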
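Method 2: apply the product rule, differentiating each occurrence of $x$ in turn while holding the other fixed.

$$\nabla f(x)=\frac{d(x^TAx)}{dx}=\frac{(dx^T)Ax}{dx}+\frac{x^TA\,dx}{dx}=Ax+A^Tx=(A+A^T)x$$

The first term is result (2) with the fixed column vector $Ax$ in place of $A$, and the second term is result (3) with the fixed vector $x$ in place of $y$.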
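The identity $\nabla(x^TAx)=(A+A^T)x$ can also be checked numerically. The sketch below compares the formula with a central-difference approximation for a non-symmetric matrix. It uses only the Python standard library; the helper names, the step size `h`, and the sample values of `A` and `x` are assumptions made for illustration.

```python
def quadratic_form(A, x):
    """Return x^T A x as the double sum over x_i * a_ij * x_j."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def analytic_gradient(A, x):
    """Return (A + A^T) x, the claimed gradient of x^T A x."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]


def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Assumed sample data: a non-symmetric 2 x 2 matrix and a test point.
A = [[1.0, 2.0],
     [3.0, 4.0]]
x = [1.0, -1.0]

analytic = analytic_gradient(A, x)
numeric = numerical_gradient(lambda v: quadratic_form(A, v), x)

print(analytic)  # [-3.0, -3.0]
print(numeric)   # agrees with the line above up to rounding error
```

Using a non-symmetric $A$ matters here: with a symmetric matrix the check could not distinguish $(A+A^T)x$ from the shortcut $2Ax$.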