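1. Taylor Series Expansion
The objective functions of practical optimization problems are often complicated. To simplify the problem, the objective function is usually expanded about some point into a Taylor polynomial, which approximates the original function near that point.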
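1.1 The concept of (first-order) partial derivatives
Take a function of two variables as an example.
Let $z=f(x,y)$ be a function of two variables. Hold $y$ fixed at $y_0$ and regard $f(x,y_0)$ as a function of $x$ alone. If the derivative
$$\left.\frac{d}{dx}f(x,y_0)\right|_{x=x_0}$$
exists, it is called the partial derivative (value) of $z=f(x,y)$ with respect to $x$ at the point $(x_0,y_0)$.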
It is denoted by $f'_x(x_0,y_0)$, $\frac{\partial f(x_0,y_0)}{\partial x}$, or $\left.\frac{\partial f}{\partial x}\right|_{(x_0,y_0)}$.
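If $z=f(x,y)$ has a partial derivative (value) at every point $(x,y)$ of a region $D$, these values are in general still functions of $x$ and $y$. They are called the partial derivative functions of $f(x,y)$, or simply its partial derivatives. The partial derivative with respect to $x$ is denoted by $f'_x(x,y)$ or $\frac{\partial f}{\partial x}$, and the one with respect to $y$ by $f'_y(x,y)$ or $\frac{\partial f}{\partial y}$.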
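The definition above (fix one variable, differentiate with respect to the other) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `partial_x` and `partial_y`, the step size `h`, and the example function are all assumptions chosen for illustration, and the derivative is estimated with a central difference rather than computed exactly.

```python
def partial_x(f, x0, y0, h=1e-5):
    """Central-difference estimate of the partial derivative of f
    with respect to x at (x0, y0); y is held fixed at y0."""
    return (f(x0 + h, y0) - f(x0 - h, y0)) / (2.0 * h)


def partial_y(f, x0, y0, h=1e-5):
    """Central-difference estimate of the partial derivative of f
    with respect to y at (x0, y0); x is held fixed at x0."""
    return (f(x0, y0 + h) - f(x0, y0 - h)) / (2.0 * h)


def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y + y^3,
    # whose exact partial derivatives are f'_x = 2xy and f'_y = x^2 + 3y^2.
    return x**2 * y + y**3


fx = partial_x(f, 1.0, 2.0)  # exact value: 2 * 1 * 2 = 4
fy = partial_y(f, 1.0, 2.0)  # exact value: 1 + 3 * 4 = 13
print(fx, fy)
```

The printed values should agree with the exact ones, 4 and 13, up to a small discretization and rounding error.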
1.2 Second-order and mixed partial derivatives
If the first-order partial derivative functions $\frac{\partial f}{\partial x}=f'_x$ and $\frac{\partial f}{\partial y}=f'_y$ of $z=f(x,y)$ themselves have partial derivatives with respect to $x$ and $y$, then these partial derivatives of the first-order partial derivatives are called the second-order partial derivatives of $z=f(x,y)$.
A function of two variables $z=f(x,y)$ has four second-order partial derivatives: $f''_{xx}(x,y)$, $f''_{xy}(x,y)$, $f''_{yx}(x,y)$, and $f''_{yy}(x,y)$.
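Third-order, fourth-order, and in general $n$-th order partial derivatives are defined similarly.
Higher-order partial derivatives taken with respect to different variables are called mixed partial derivatives, for example $f''_{xy}(x,y)$ and $f''_{yx}(x,y)$.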
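As an illustration of the four second-order partial derivatives, the sketch below estimates them by applying a central difference twice. It is an assumption-laden example rather than part of the original text: the function name `second_partials`, the step size `h`, and the test function are hypothetical, and the code assumes the convention that $f''_{xy}$ means differentiating first with respect to $x$ and then with respect to $y$. For functions whose mixed partial derivatives are continuous, the two mixed derivatives coincide, which the example lets you observe numerically.

```python
def second_partials(f, x0, y0, h=1e-4):
    """Return central-difference estimates (f_xx, f_xy, f_yx, f_yy) at (x0, y0)."""

    def fx(x, y):
        # First-order partial derivative with respect to x
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

    def fy(x, y):
        # First-order partial derivative with respect to y
        return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

    f_xx = (fx(x0 + h, y0) - fx(x0 - h, y0)) / (2.0 * h)
    f_xy = (fx(x0, y0 + h) - fx(x0, y0 - h)) / (2.0 * h)  # f'_x differentiated in y
    f_yx = (fy(x0 + h, y0) - fy(x0 - h, y0)) / (2.0 * h)  # f'_y differentiated in x
    f_yy = (fy(x0, y0 + h) - fy(x0, y0 - h)) / (2.0 * h)
    return f_xx, f_xy, f_yx, f_yy


def g(x, y):
    # Hypothetical example function: g(x, y) = x^2 * y + y^3,
    # with exact values g_xx = 2y, g_xy = g_yx = 2x, g_yy = 6y.
    return x**2 * y + y**3


g_xx, g_xy, g_yx, g_yy = second_partials(g, 1.0, 2.0)
print(g_xx, g_xy, g_yx, g_yy)
```

At the point $(1, 2)$ the exact values are 4, 2, 2, and 12; the estimates should match them closely, with the two mixed derivatives agreeing with each other.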
1.3 Taylor series expansion of a function
- The Taylor expansion of a function of one variable $f(x)$ about the point $x_k$ is:
$$f(x)=f(x_k)+(x-x_k)f'(x_k)+\frac{1}{2!}(x-x_k)^2f''(x_k)+\cdots+\frac{1}{n!}(x-x_k)^n f^{(n)}(x_k)+o\big((x-x_k)^n\big)$$
- The Taylor expansion of a function of two variables $f(x,y)$ about the point $(x_k,y_k)$ is:
$$\begin{aligned}
f(x,y)={}&f(x_k,y_k)+(x-x_k)f'_x(x_k,y_k)+(y-y_k)f'_y(x_k,y_k)\\
&+\frac{1}{2!}(x-x_k)^2 f''_{xx}(x_k,y_k)+\frac{1}{2!}(x-x_k)(y-y_k) f''_{xy}(x_k,y_k)\\
&+\frac{1}{2!}(y-y_k)(x-x_k) f''_{yx}(x_k,y_k)+\frac{1}{2!}(y-y_k)^2 f''_{yy}(x_k,y_k)\\
&+\cdots+o(\rho^n)
\end{aligned}$$
where the terms are carried to order $n$ and $\rho=\sqrt{(x-x_k)^2+(y-y_k)^2}$ is the distance from $(x,y)$ to $(x_k,y_k)$.
- The Taylor expansion of a function of $n$ variables $f(x^1,x^2,\cdots,x^n)$ about the point $(x^1_k,x^2_k,\cdots,x^n_k)$ is (superscripts index the variables; they are not exponents):
$$\begin{aligned}
f(x^1,x^2,\cdots,x^n)={}&f(x^1_k,x^2_k,\cdots,x^n_k)\\
&+\sum^n_{i=1}(x^i-x^i_k)f'_{x^i}(x^1_k,x^2_k,\cdots,x^n_k)\\
&+\frac{1}{2!}\sum^n_{i,j=1}(x^i-x^i_k)(x^j-x^j_k)f''_{x^i x^j}(x^1_k,x^2_k,\cdots,x^n_k)\\
&+\cdots
\end{aligned}$$
Truncating after the second-order terms leaves a remainder of $o(\rho^2)$, where $\rho=\sqrt{\sum_{i=1}^n (x^i-x^i_k)^2}$.
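To see how the one-variable expansion in this list behaves as more terms are kept, here is a small numerical sketch. It is an illustration only: the choice of $\exp$ as the example function, the helper name `taylor_exp`, and the points `x_k = 0.5` and `x = 0.8` are assumptions, picked because every derivative of $\exp$ at $x_k$ equals $\exp(x_k)$, which keeps the code short.

```python
import math


def taylor_exp(x, x_k, n):
    """Degree-n Taylor polynomial of exp about x_k, evaluated at x.

    Every derivative of exp at x_k equals exp(x_k), so the i-th term is
    exp(x_k) * (x - x_k)**i / i!.
    """
    return sum(
        math.exp(x_k) * (x - x_k) ** i / math.factorial(i)
        for i in range(n + 1)
    )


x_k, x = 0.5, 0.8
# Absolute approximation error for polynomial degrees 1 through 5
errors = [abs(math.exp(x) - taylor_exp(x, x_k, n)) for n in range(1, 6)]
for n, err in zip(range(1, 6), errors):
    print(n, err)
```

The error should shrink quickly as the degree grows, which is why a low-order Taylor polynomial is often an adequate local model of the objective function.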
The $n$-variable expansion above can be written in matrix form, as follows.
2. Taylor Series Expansion in Matrix Form
Let $X=[x^1,x^2,\cdots,x^n]^T$ and $X_k=[x^1_k,x^2_k,\cdots,x^n_k]^T$.
Then the Taylor expansion of the $n$-variable function $f(X)$ about the point $X_k$, truncated after the second-order term, is:
$$f(X)=f(X_k)+[\nabla f(X_k)]^T(X-X_k)+\frac{1}{2!}(X-X_k)^TH(X_k)(X-X_k)+o\big(\|X-X_k\|^2\big)$$
where
$$\nabla f(X_k)=\left[\frac{\partial f(X_k)}{\partial x^1},\frac{\partial f(X_k)}{\partial x^2},\cdots,\frac{\partial f(X_k)}{\partial x^n}\right]^T$$
is called the gradient (vector) of the $n$-variable function $f(X)$ at the point $X_k$;
$$H(X_k)=
\begin{bmatrix}
\frac{\partial^2 f(X_k)}{\partial x^1 \partial x^1} & \frac{\partial^2 f(X_k)}{\partial x^1 \partial x^2} & \cdots & \frac{\partial^2 f(X_k)}{\partial x^1 \partial x^n} \\
\frac{\partial^2 f(X_k)}{\partial x^2 \partial x^1} & \frac{\partial^2 f(X_k)}{\partial x^2 \partial x^2} & \cdots & \frac{\partial^2 f(X_k)}{\partial x^2 \partial x^n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f(X_k)}{\partial x^n \partial x^1} & \frac{\partial^2 f(X_k)}{\partial x^n \partial x^2} & \cdots & \frac{\partial^2 f(X_k)}{\partial x^n \partial x^n}
\end{bmatrix}$$
is the $n \times n$ matrix of second-order partial derivatives of $f(X)$ at the point $X_k$, known as the Hessian matrix.
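The matrix form can be evaluated directly once the gradient and the matrix of second-order partial derivatives are available. The sketch below is a hedged illustration, not the author's code: the example function $f(x^1,x^2)=e^{x^1}\sin x^2$, the helper names `gradient`, `hessian`, `quadratic_taylor`, and `approx_error`, and the expansion point are all assumptions. The derivatives are written out by hand for this one function, and plain Python lists stand in for vectors and matrices.

```python
import math


def f(X):
    # Hypothetical example function: f(x1, x2) = exp(x1) * sin(x2)
    return math.exp(X[0]) * math.sin(X[1])


def gradient(X):
    # Analytic gradient of the example function
    e, s, c = math.exp(X[0]), math.sin(X[1]), math.cos(X[1])
    return [e * s, e * c]


def hessian(X):
    # Analytic matrix of second-order partial derivatives of the example function
    e, s, c = math.exp(X[0]), math.sin(X[1]), math.cos(X[1])
    return [[e * s, e * c],
            [e * c, -e * s]]


def quadratic_taylor(X, X_k):
    """Evaluate f(X_k) + grad^T d + (1/2) d^T H d, where d = X - X_k."""
    d = [x - xk for x, xk in zip(X, X_k)]
    g = gradient(X_k)
    H = hessian(X_k)
    linear = sum(gi * di for gi, di in zip(g, d))
    quadratic = sum(
        d[i] * H[i][j] * d[j] for i in range(len(d)) for j in range(len(d))
    )
    return f(X_k) + linear + 0.5 * quadratic


def approx_error(step, X_k=(0.0, 1.0)):
    # Error of the quadratic model at a point displaced by `step` in each coordinate
    X = [X_k[0] + step, X_k[1] + step]
    return abs(f(X) - quadratic_taylor(X, list(X_k)))


for step in (0.1, 0.01):
    print(step, approx_error(step))
```

Because the neglected terms are of third order in the displacement, shrinking the step by a factor of 10 should reduce the error by roughly a factor of 1000, consistent with a remainder that is small compared with $\|X-X_k\|^2$.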