1. Gradient and Hessian matrix
Definition 1: Let $f(x)$ be a function of $n$ variables, and suppose its first-order partial derivatives with respect to each component $x_i$ of the argument $x=(x_1,x_2,\dots,x_n)^T$,
$$\frac{\partial f(x)}{\partial x_i},\quad i=1,2,\dots,n,$$
exist. Then the vector
$$\nabla f(x)=\left(\frac{\partial f(x)}{\partial x_1},\frac{\partial f(x)}{\partial x_2},\dots,\frac{\partial f(x)}{\partial x_n}\right)^T$$
is called the first derivative, or gradient, of $f(x)$ at $x$.
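Definition 1 can be checked numerically. The following is a minimal sketch in plain Python, not taken from the reference books: the names `numerical_gradient` and `f_example` are hypothetical, and the central-difference step `h` is an assumed value.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x (a list of floats) by central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        # i-th partial derivative: (f(x + h*e_i) - f(x - h*e_i)) / (2h)
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f_example(x):
    # Hypothetical test function: f(x) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


g = numerical_gradient(f_example, [1.0, 2.0])
# The analytic gradient is (2*x1 + 3*x2, 3*x1), i.e. (8, 3) at x = (1, 2).
print(g)
```

Each component of the result approximates one partial derivative, so the list `g` plays the role of $\nabla f(x)$.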
Definition 2: Let $f(x)$ be a function of $n$ variables, and suppose its second-order partial derivatives with respect to the components of $x=(x_1,x_2,\dots,x_n)^T$,
$$\frac{\partial^2 f(x)}{\partial x_i\partial x_j},\quad i,j=1,2,\dots,n,$$
exist. Then the matrix
$$\nabla^2 f(x)=\begin{pmatrix} \frac{\partial^2 f(x)}{\partial x_1^2}&\frac{\partial^2 f(x)}{\partial x_1\partial x_2}&\cdots&\frac{\partial^2 f(x)}{\partial x_1\partial x_n}\\ \frac{\partial^2 f(x)}{\partial x_2\partial x_1}&\frac{\partial^2 f(x)}{\partial x_2^2}&\cdots&\frac{\partial^2 f(x)}{\partial x_2\partial x_n}\\ \vdots&\vdots&&\vdots\\ \frac{\partial^2 f(x)}{\partial x_n\partial x_1}&\frac{\partial^2 f(x)}{\partial x_n\partial x_2}&\cdots&\frac{\partial^2 f(x)}{\partial x_n^2} \end{pmatrix}$$
is called the second-derivative matrix, or Hessian matrix, of $f(x)$ at $x$.
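The Hessian of Definition 2 can also be approximated entry by entry. This is only a sketch under assumptions: `numerical_hessian` and `f_example` are hypothetical names, the step `h` is an assumed value, and second-order central differences are just one of several possible schemes.

```python
def numerical_hessian(f, x, h=1e-4):
    """Approximate the Hessian of f at x (a list of floats) by central differences."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f at x + si*h*e_i + sj*h*e_j
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            # Central second difference for the (i, j) entry
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return hess


def f_example(x):
    # Hypothetical test function: f(x) = x1^2 * x2 + x2^3
    return x[0] ** 2 * x[1] + x[1] ** 3


H = numerical_hessian(f_example, [1.0, 2.0])
# Analytic Hessian: [[2*x2, 2*x1], [2*x1, 6*x2]], i.e. [[4, 2], [2, 12]] at (1, 2).
print(H)
```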
Definition 3: If every component function of the gradient of $f(x)$ is continuous at $x$, then $f(x)$ is said to be continuously differentiable at $x$; if every entry of the Hessian matrix of $f(x)$ is continuous at $x$, then $f(x)$ is said to be twice continuously differentiable at $x$.
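Definition 4: If $f$ is continuously differentiable at every point of an open set $D$, then $f$ is said to be continuously differentiable on $D$; if $f$ is twice continuously differentiable at every point of the open set $D$, then $f$ is said to be twice continuously differentiable on $D$.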
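Notes: (1) Definition 4 uses an open set $D$ rather than a closed set because every point of an open set has a neighborhood contained in $D$, so the limits that define the partial derivatives make sense there. At a boundary point of a closed set, $f$ need not be defined on a full neighborhood, so differentiability in the usual sense is not defined.
(2) If $f(x)$ is twice continuously differentiable at $x$, then
$$\frac{\partial^2 f(x)}{\partial x_i\partial x_j}=\frac{\partial^2 f(x)}{\partial x_j\partial x_i},\quad i,j=1,2,\dots,n,$$
which means that $\nabla^2 f(x)$ is a symmetric matrix.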
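As a small illustration of the symmetry in note (2), take the hypothetical function $f(x)=x_1^2x_2+x_2^3$ (chosen here only as an example; it does not come from the references). Its two mixed partial derivatives agree:

$$\frac{\partial^2 f}{\partial x_1\partial x_2}=\frac{\partial}{\partial x_1}\left(x_1^2+3x_2^2\right)=2x_1,\qquad \frac{\partial^2 f}{\partial x_2\partial x_1}=\frac{\partial}{\partial x_2}\left(2x_1x_2\right)=2x_1,$$

so the Hessian is symmetric:

$$\nabla^2 f(x)=\begin{pmatrix}2x_2&2x_1\\2x_1&6x_2\end{pmatrix}.$$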
Example 1: Let $A\in\mathbb{R}^{n\times n}$ and $b\in\mathbb{R}^n$. Find the gradient and the Hessian matrix of the quadratic function
$$f(x)=\frac{1}{2}x^TAx+b^Tx$$
at $x$.
Solution: Written out in components,
$$f(x)=\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_ix_j+\sum_{i=1}^{n}b_ix_i.$$
Then for $k=1,2,\cdots,n$,
$$\begin{aligned} \frac{\partial f(x)}{\partial x_k}&=\frac{1}{2}\sum_{j=1,\,j\neq k}^{n}a_{kj}x_j+\frac{1}{2}\sum_{i=1,\,i\neq k}^{n}a_{ik}x_i+a_{kk}x_k+b_k\\ &=\frac{1}{2}\sum_{j=1}^{n}a_{kj}x_j+\frac{1}{2}\sum_{i=1}^{n}a_{ik}x_i+b_k, \end{aligned}$$
where the term $a_{kk}x_k$ comes from differentiating $\frac{1}{2}a_{kk}x_k^2$ and is split evenly between the two sums in the second line.
The first sum is the $k$-th component of $Ax$ and the second is the $k$-th component of $A^Tx$, hence $\nabla f(x)=\frac{1}{2}(A+A^T)x+b$.
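By a similar argument, differentiating $\frac{\partial f(x)}{\partial x_k}$ once more with respect to $x_l$ gives $\frac{1}{2}(a_{kl}+a_{lk})$, so $\nabla^2f(x)=\frac{1}{2}(A+A^T)$. In particular, when $A$ is symmetric, $\nabla f(x)=Ax+b$ and $\nabla^2f(x)=A$.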
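The result of Example 1 can be sanity-checked on a small non-symmetric matrix. The sketch below uses plain Python lists; the helper names `matvec`, `quadratic` and `quadratic_gradient`, as well as the sample values of `A`, `b` and `x`, are assumptions made for illustration.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def quadratic(A, b, x):
    """f(x) = 1/2 x^T A x + b^T x."""
    Ax = matvec(A, x)
    n = len(x)
    return 0.5 * sum(x[i] * Ax[i] for i in range(n)) + sum(b[i] * x[i] for i in range(n))


def quadratic_gradient(A, b, x):
    """Gradient from Example 1: 1/2 (A + A^T) x + b."""
    n = len(x)
    S = [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    Sx = matvec(S, x)
    return [Sx[i] + b[i] for i in range(n)]


A = [[1.0, 2.0], [0.0, 3.0]]   # deliberately not symmetric
b = [1.0, -1.0]
x = [1.0, 2.0]
grad = quadratic_gradient(A, b, x)

# Independent check by central differences (step h is an assumed value)
h = 1e-6
fd_grad = []
for k in range(len(x)):
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    fd_grad.append((quadratic(A, b, x_plus) - quadratic(A, b, x_minus)) / (2 * h))

print(grad, fd_grad)
```

Using $A$ alone instead of $\frac{1}{2}(A+A^T)$ would give a different vector here, which is why the symmetrization matters when $A$ is not symmetric.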
2. Directional derivative
Definition 5: Let $f:\mathbb{R}^n\rightarrow\mathbb{R}$ be continuously differentiable on an open set $D$. For $x\in D$ and $d\in\mathbb{R}^n$, the directional derivative of $f$ at the point $x$ along the direction $d$ is defined as
$$\frac{\partial f}{\partial d}(x)=\lim_{\theta\rightarrow0}\frac{f(x+\theta d)-f(x)}{\theta}.$$
Because $f$ is continuously differentiable, this limit exists and equals $\nabla f(x)^Td$, where $\nabla f(x)$ is the gradient of $f$ at $x$ and $d$ is the direction.
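The identity between the limit and $\nabla f(x)^Td$ can be checked numerically. This is a sketch under assumptions: `directional_derivative_fd`, `f_example` and `grad_example` are hypothetical names, and the code uses a symmetric difference quotient with a small assumed `theta`, which has the same limit as the quotient in Definition 5 when the derivative exists.

```python
def directional_derivative_fd(f, x, d, theta=1e-6):
    """Symmetric difference quotient of f at x along d."""
    x_fwd = [xi + theta * di for xi, di in zip(x, d)]
    x_bwd = [xi - theta * di for xi, di in zip(x, d)]
    return (f(x_fwd) - f(x_bwd)) / (2 * theta)


def f_example(x):
    # Hypothetical test function: f(x) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


def grad_example(x):
    # Analytic gradient of f_example
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


x = [1.0, 2.0]
d = [1.0, -1.0]
fd_value = directional_derivative_fd(f_example, x, d)
formula_value = sum(g * di for g, di in zip(grad_example(x), d))  # grad^T d
print(fd_value, formula_value)
```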
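Notes: (1) The directional derivative clearly generalizes the partial derivative: a partial derivative is the derivative of the function along one coordinate direction, whereas the directional derivative is the derivative along an arbitrary direction.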
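(2) A remark on the definition of the directional derivative used here: it is the one adopted in some of the books listed in the references at the end. When I first read it, I thought something was off; I believed the directional derivative should be defined as
$$\frac{\partial f}{\partial d}(x)=\lim_{\theta\rightarrow0}\frac{f(x+\theta d)-f(x)}{\theta \Vert d\Vert},$$
where the norm is the Euclidean norm, or equivalently that the original definition should be restricted to unit directions. Later I found the definition of the directional derivative on Wikipedia, which treats both conventions as acceptable, and on reflection I realized I had been too narrow-minded. If anyone reads this post, I hope you will give this point some thought.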
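To make the comparison between the two conventions concrete, here is a short worked computation, assuming $f$ is continuously differentiable at $x$, $d\neq0$, and $c\neq0$ is a scalar. Under the unnormalized definition, scaling the direction scales the derivative, while the normalized variant depends only on the direction of $d$:

$$\frac{\partial f}{\partial (cd)}(x)=\nabla f(x)^T(cd)=c\,\nabla f(x)^Td,\qquad \lim_{\theta\rightarrow0}\frac{f(x+\theta d)-f(x)}{\theta\Vert d\Vert}=\frac{\nabla f(x)^Td}{\Vert d\Vert}.$$

The two conventions agree whenever $\Vert d\Vert=1$.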
3. Taylor's formula for multivariate functions
Definition 6: If $f(x)$ is continuously differentiable on $D$, then for any $x,x+d\in D$ such that the line segment joining them lies in $D$, we have
$$f(x+d)=f(x)+\nabla f(x)^Td+o(\Vert d\Vert)\qquad\text{(Peano remainder)}$$
$$f(x+d)=f(x)+\nabla f(x+td)^Td\ \text{ for some } t\in(0,1)\qquad\text{(Lagrange remainder)}$$
$$f(x+d)=f(x)+\int_{0}^{1}\nabla f(x+td)^Td\,dt\qquad\text{(integral remainder)}$$
Definition 7: If $f(x)$ is twice continuously differentiable on $D$, then for any $x,x+d\in D$ such that the line segment joining them lies in $D$, we have
$$f(x+d)=f(x)+\nabla f(x)^Td+\frac{1}{2}d^T\nabla^2f(x)d+o(\Vert d\Vert^2)\qquad\text{(Peano remainder)}$$
$$f(x+d)=f(x)+\nabla f(x)^Td+\frac{1}{2}d^T\nabla^2f(x+td)d\ \text{ for some } t\in(0,1)\qquad\text{(Lagrange remainder)}$$
$$f(x+d)=f(x)+\nabla f(x)^Td+\int_{0}^{1}(1-t)\left[d^T\nabla^2 f(x+td)d\right]dt\qquad\text{(integral remainder)}$$
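The $o(\Vert d\Vert^2)$ statement says that the error of the second-order model, divided by $\Vert d\Vert^2$, tends to zero as $d$ shrinks. The sketch below illustrates this on a hypothetical test function $f(x)=e^{x_1}\sin x_2$ with hand-coded derivatives; the function, the base point and the direction are all assumptions chosen for illustration.

```python
import math


def f(x):
    # Hypothetical smooth test function: f(x) = exp(x1) * sin(x2)
    return math.exp(x[0]) * math.sin(x[1])


def grad_f(x):
    e = math.exp(x[0])
    return [e * math.sin(x[1]), e * math.cos(x[1])]


def hess_f(x):
    e, s, c = math.exp(x[0]), math.sin(x[1]), math.cos(x[1])
    return [[e * s, e * c], [e * c, -e * s]]


def taylor2(x, d):
    """Second-order Taylor model f(x) + grad^T d + 1/2 d^T H d."""
    g, H = grad_f(x), hess_f(x)
    linear = sum(gi * di for gi, di in zip(g, d))
    quad = sum(d[i] * H[i][j] * d[j] for i in range(2) for j in range(2))
    return f(x) + linear + 0.5 * quad


x = [0.3, 0.7]
direction = [1.0, -2.0]
ratios = []
for scale in [1e-1, 1e-2, 1e-3]:
    d = [scale * di for di in direction]
    x_new = [xi + di for xi, di in zip(x, d)]
    remainder = abs(f(x_new) - taylor2(x, d))
    norm_sq = sum(di * di for di in d)
    ratios.append(remainder / norm_sq)  # should shrink as d shrinks

print(ratios)
```

For this smooth function the ratio shrinks roughly in proportion to $\Vert d\Vert$, consistent with a remainder of third order.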
Proof: The integral form is not very obvious, so we prove it using the Taylor expansion of a function of one variable. Let $\phi(t)=f(x+td)$. Then $\phi'(t)=\nabla f(x+td)^Td$ and $\phi''(t)=d^T\nabla^2f(x+td)d$. From $\phi(1)-\phi(0)=\int_{0}^{1}\phi'(t)\,dt$ and integration by parts we get
$$\begin{aligned} f(x+d)-f(x)&=\int_{0}^{1}\left[\nabla f(x+td)^Td\right]dt=-\int_{0}^{1}\left[\nabla f(x+td)^Td\right]d(1-t)\\ &=(t-1)\nabla f(x+td)^Td\,\Big|_0^1+\int_0^1(1-t)\,d\left[\nabla f(x+td)^Td\right]\\ &=\nabla f(x)^Td+\int_0^1(1-t)\left[d^T\nabla^2 f(x+td)d\right]dt. \end{aligned}$$
Here the boundary term vanishes at $t=1$ and equals $\nabla f(x)^Td$ at $t=0$.
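The integral form of the remainder can be verified numerically. The following sketch approximates the integral with a composite midpoint rule; the test function $f(x)=e^{x_1+2x_2}$, the points `x` and `d`, and the number of quadrature steps are assumptions chosen for illustration.

```python
import math


def f(x):
    # Hypothetical smooth test function: f(x) = exp(x1 + 2*x2)
    return math.exp(x[0] + 2 * x[1])


def grad_f(x):
    v = f(x)
    return [v, 2 * v]


def hess_f(x):
    v = f(x)
    return [[v, 2 * v], [2 * v, 4 * v]]


def integral_remainder(x, d, steps=1000):
    """Midpoint rule for the integral over [0, 1] of (1 - t) * d^T H(x + t*d) d."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        y = [xi + t * di for xi, di in zip(x, d)]
        H = hess_f(y)
        quad = sum(d[i] * H[i][j] * d[j] for i in range(2) for j in range(2))
        total += (1.0 - t) * quad
    return total / steps


x = [0.3, 0.7]
d = [0.5, -0.4]
lhs = f([xi + di for xi, di in zip(x, d)])
rhs = f(x) + sum(g * di for g, di in zip(grad_f(x), d)) + integral_remainder(x, d)
gap = abs(lhs - rhs)  # only quadrature and rounding error should remain
print(lhs, rhs, gap)
```

Unlike the Peano form, this identity is exact for a step $d$ of any size, which is why the check does not need a small $d$.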
4. Proofs of two basic formulas
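I added this section on the spur of the moment; you probably will not find it spelled out in many books. The main statement is: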
Definition 8: If $f(x)$ is twice continuously differentiable on an open set $D\subset\mathbb{R}^n$, then for any $x,x+td\in D$ we have
$$\frac{d f(x+td)}{d t}=\nabla f(x+td)^Td,$$
$$\frac{d^2 f(x+td)}{d t^2}=d^T\nabla^2 f(x+td)d.$$
We used these formulas in the proof above, but they do not look all that obvious, so let me prove them:
Proof: By the chain rule, using $\frac{d(x_i+td_i)}{dt}=d_i$ and writing $\frac{\partial f(x+td)}{\partial x_i}$ for the $i$-th partial derivative of $f$ evaluated at $x+td$,
$$\begin{aligned} \frac{d f(x+td)}{d t}&=\frac{d f(x_1+td_1,x_2+td_2,\cdots,x_n+td_n)}{dt}\\ &=\frac{\partial f(x+td)}{\partial (x_1+td_1)}d_1+\frac{\partial f(x+td)}{\partial (x_2+td_2)}d_2+\cdots+\frac{\partial f(x+td)}{\partial (x_n+td_n)}d_n\\ &=\left(\frac{\partial f(x+td)}{\partial (x_1+td_1)},\frac{\partial f(x+td)}{\partial (x_2+td_2)},\cdots,\frac{\partial f(x+td)}{\partial (x_n+td_n)}\right)\begin{pmatrix} d_1\\d_2\\\vdots\\d_n \end{pmatrix}\\ &=\left(\frac{\partial f(x+td)}{\partial x_1},\frac{\partial f(x+td)}{\partial x_2},\cdots,\frac{\partial f(x+td)}{\partial x_n}\right)\begin{pmatrix} d_1\\d_2\\\vdots\\d_n \end{pmatrix}\\ &=\nabla f(x+td)^Td. \end{aligned}$$
Differentiating the first formula once more with respect to $t$ and applying the chain rule to each term gives
$$\begin{aligned} \frac{d^2 f(x+td)}{d t^2}&=\frac{d^2f(x_1+td_1,x_2+td_2,\cdots,x_n+td_n)}{dt^2}\\ &=\frac{\partial^2 f(x+td)}{\partial (x_1+td_1)^2}d_1^2+\frac{\partial^2 f(x+td)}{\partial (x_1+td_1)\partial (x_2+td_2)}d_1d_2+\cdots+\frac{\partial^2 f(x+td)}{\partial (x_1+td_1)\partial (x_n+td_n)}d_1d_n\\ &\quad+\frac{\partial^2 f(x+td)}{\partial(x_2+td_2)\partial(x_1+td_1)}d_2d_1+\frac{\partial^2f(x+td)}{\partial (x_2+td_2)^2}d_2^2+\cdots+\frac{\partial^2 f(x+td)}{\partial(x_2+td_2)\partial(x_n+td_n)}d_2d_n\\ &\quad\ \ \vdots\\ &\quad+\frac{\partial^2 f(x+td)}{\partial(x_n+td_n)\partial(x_1+td_1)}d_nd_1+\frac{\partial^2 f(x+td)}{\partial(x_n+td_n)\partial(x_2+td_2)}d_nd_2+\cdots+\frac{\partial^2 f(x+td)}{\partial (x_n+td_n)^2}d_n^2\\ &=\begin{pmatrix} d_1&d_2&\cdots&d_n \end{pmatrix}\begin{pmatrix} \frac{\partial^2 f(x+td)}{\partial x_1^2}&\frac{\partial^2 f(x+td)}{\partial x_1\partial x_2}&\cdots&\frac{\partial^2 f(x+td)}{\partial x_1\partial x_n}\\ \frac{\partial^2 f(x+td)}{\partial x_2\partial x_1}&\frac{\partial^2 f(x+td)}{\partial x_2^2}&\cdots&\frac{\partial^2 f(x+td)}{\partial x_2\partial x_n}\\ \vdots&\vdots&&\vdots\\ \frac{\partial^2 f(x+td)}{\partial x_n\partial x_1}&\frac{\partial^2 f(x+td)}{\partial x_n\partial x_2}&\cdots&\frac{\partial^2 f(x+td)}{\partial x_n^2} \end{pmatrix}\begin{pmatrix} d_1\\d_2\\\vdots\\d_n \end{pmatrix}\\ &=d^T\nabla^2f(x+td)d. \end{aligned}$$
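Both formulas can be cross-checked by differentiating $\phi(t)=f(x+td)$ numerically. This is a sketch under assumptions: the test function $f(x)=x_1^2x_2+\sin x_2$, the values of `x`, `d`, `t`, and the difference steps `h1` and `h2` are all chosen here for illustration.

```python
import math


def f(x):
    # Hypothetical smooth test function: f(x) = x1^2 * x2 + sin(x2)
    return x[0] ** 2 * x[1] + math.sin(x[1])


def grad_f(x):
    return [2 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]


def hess_f(x):
    return [[2 * x[1], 2 * x[0]], [2 * x[0], -math.sin(x[1])]]


x = [1.0, 2.0]
d = [0.5, -1.0]


def phi(t):
    """Restriction of f to the line x + t*d."""
    return f([xi + t * di for xi, di in zip(x, d)])


t = 0.3
y = [xi + t * di for xi, di in zip(x, d)]

# Right-hand sides of the two formulas
phi1_formula = sum(g * di for g, di in zip(grad_f(y), d))
H = hess_f(y)
phi2_formula = sum(d[i] * H[i][j] * d[j] for i in range(2) for j in range(2))

# Left-hand sides by central differences in the scalar variable t
h1, h2 = 1e-5, 1e-4
phi1_fd = (phi(t + h1) - phi(t - h1)) / (2 * h1)
phi2_fd = (phi(t + h2) - 2 * phi(t) + phi(t - h2)) / (h2 * h2)

print(phi1_fd, phi1_formula)
print(phi2_fd, phi2_formula)
```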
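References for this post:
[1] 倪勤 (Ni Qin), 最优化方法与程序设计 (Optimization Methods and Programming)
[2] 袁亚湘 (Yuan Yaxiang), 孙文瑜 (Sun Wenyu), 最优化理论与方法 (Optimization Theory and Methods)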