1.题目:
m
i
n
f
(
x
1
,
x
2
)
=
2
x
1
2
+
x
2
2
−
4
x
1
+
2
w
h
e
r
e
x
(
0
)
=
(
2
,
1
)
T
,
H
1
=
[
1
0
0
1
]
,
ϵ
=
0.01
\\ {\quad} {\quad} {\quad} {\quad}min {\quad} f(x_1, x_2) = 2x_1^2 + x_2^2-4x_1+2 \\ {\quad} {\quad} {\quad} {\quad} where {\quad}x^{(0)}=(2,1)^T,H_1= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix} ,\epsilon=0.01
minf(x1,x2)=2x12+x22−4x1+2wherex(0)=(2,1)T,H1=[1001],ϵ=0.01
2. DFP算法
{\quad}
step1:给定
x
0
∈
R
n
,
\pmb{x}^0\in\pmb{R}^n,
xxx0∈RRRn,允许误差
ϵ
>
0
。
\epsilon>0。
ϵ>0。
{\quad} step2:令 H 0 : = E n , \pmb{H}_0:=\pmb{E}_n, HHH0:=EEEn,计算 g 0 : = ∇ f ( x 0 ) , \pmb{g}^0:={\nabla}f(\pmb{x}^0), ggg0:=∇f(xxx0),令 k : = 0 。 k:=0。 k:=0。
{\quad} step3:令 p k : = − H k g k 。 p^k:=-\pmb{H}_k\pmb{g}^k。 pk:=−HHHkgggk。
{\quad} step4: 求 λ k : f ( x k + λ k p k ) = m i n λ > 0 f ( x k + λ p k ) , 令 x x + 1 : = x k + λ k p k , 计 算 g k + 1 : = ∇ f ( x k + 1 ) 。 求\lambda_k:f(\pmb{x}^k+\lambda_k\pmb{p}^k)=\underset {\lambda>0}{min}f(\pmb{x}^k+\lambda\pmb{p}^k),令\pmb{x}^{x+1}:=\pmb{x}^k+\lambda_k\pmb{p}^k,计算\pmb{g}^{k+1}:={\nabla}f(\pmb{x}^{k+1})。 求λk:f(xxxk+λkpppk)=λ>0minf(xxxk+λpppk),令xxxx+1:=xxxk+λkpppk,计算gggk+1:=∇f(xxxk+1)。
{\quad} step5:若 ∣ ∣ g k + 1 ∣ ∣ < ϵ , 则 停 止 , 打 印 x k + 1 ; 否 则 , 转 s t e p 6 。 ||\pmb{g}^{k+1}||<\epsilon,则停止,打印\pmb{x}^{k+1};否则,转step6。 ∣∣gggk+1∣∣<ϵ,则停止,打印xxxk+1;否则,转step6。
{\quad} step6:若 k = n − 1 k=n-1 k=n−1,则令 x 0 : = x k + 1 , 转 s t e p 2 ; 否 则 , 转 s t e p 7 。 \pmb{x}^0:=\pmb{x}^{k+1},转step2;否则,转step7。 xxx0:=xxxk+1,转step2;否则,转step7。
{\quad}
step7:计算
Δ
g
k
=
g
k
+
1
−
g
k
,
Δ
x
k
=
x
k
+
1
−
x
k
H
k
+
1
:
=
H
k
+
Δ
g
k
=
H
k
+
Δ
x
k
(
Δ
x
k
)
T
(
Δ
x
k
)
T
Δ
g
k
−
H
k
Δ
g
k
(
Δ
g
k
)
T
H
k
(
Δ
g
k
)
T
H
k
Δ
g
k
\Delta{\pmb{g}^k}=\pmb{g}^{k+1}-\pmb{g}^k,\Delta{\pmb{x}^k}=\pmb{x}^{k+1}-\pmb{x}^k\\{\pmb{H}_{k+1}:=\pmb{H}_k+\Delta{\pmb{g}^k}=\pmb{H}_k+\frac {\Delta{\pmb{x}^k}(\Delta{\pmb{x}^k})^T} {(\Delta{\pmb{x}^k})^T\Delta{\pmb{g}^k}}}-\frac {\pmb{H}_k\Delta{\pmb{g}^k(\Delta{\pmb{g}^k})^T\pmb{H}_k}} {(\Delta{\pmb{g}^k})^T\pmb{H}_k\Delta{\pmb{g}^k}}
Δgggk=gggk+1−gggk,Δxxxk=xxxk+1−xxxkHHHk+1:=HHHk+Δgggk=HHHk+(Δxxxk)TΔgggkΔxxxk(Δxxxk)T−(Δgggk)THHHkΔgggkHHHkΔgggk(Δgggk)THHHk
令
k
:
=
k
+
1
,
转
s
t
e
p
3
{\quad}{\quad}{\quad}{\quad}令k:=k+1,转step3
令k:=k+1,转step3
3. BFGS算法
{\quad}
仅需将DFP的step7中计算
H
k
+
1
H_{k+1}
Hk+1的方式改变即可。
H
k
+
1
=
H
k
+
β
k
Δ
x
k
(
Δ
x
k
)
T
(
Δ
x
k
)
T
Δ
g
k
−
Δ
x
k
(
Δ
g
k
)
T
H
k
+
H
k
Δ
g
k
(
Δ
x
k
)
T
(
Δ
x
k
)
T
Δ
g
k
其
中
,
β
k
=
1
+
(
Δ
g
k
)
T
H
k
Δ
g
k
(
Δ
x
k
)
T
Δ
g
k
H_{k+1} =H_{k}+\beta_k\frac {\Delta{\pmb{x}^k}(\Delta{\pmb{x}^k})^T} {(\Delta{\pmb{x}^k})^T\Delta{\pmb{g}^k}}-\frac {\Delta{\pmb{x}^k}(\Delta{\pmb{g}^k})^TH_{k}+H_{k}\Delta{\pmb{g}^k}(\Delta{\pmb{x}^k})^T} {(\Delta{\pmb{x}^k})^T\Delta{\pmb{g}^k}}\\其中,{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}{\quad}\\\beta_k=1+\frac {(\Delta{\pmb{g}^k})^TH_k\Delta{\pmb{g}^k}} {(\Delta{\pmb{x}^k})^T\Delta{\pmb{g}^k}}
Hk+1=Hk+βk(Δxxxk)TΔgggkΔxxxk(Δxxxk)T−(Δxxxk)TΔgggkΔxxxk(Δgggk)THk+HkΔgggk(Δxxxk)T其中,βk=1+(Δxxxk)TΔgggk(Δgggk)THkΔgggk4.python实现
{\quad}
4.1代码
import numpy as np
import math
def get_g(x):
a = [4*x[0,0]-4,2*x[1,0]]
b = math.sqrt(a[0]**2+a[1]**2)
c = np.mat([[a[0]],[a[1]]])
return [c,b]
def get_x(x,p):
a = p[0,0]*2-x[0,0]*p[0,0]*2-x[1,0]*p[1,0]
b = 2*p[0,0]**2+p[1,0]**2
r = a/b
return x+r*p
def get_H_DFP(H,x,g):
return H + (x*x.T)/(x.T*g) - (H*g*g.T*H)/(g.T*H*g)
def get_H_BFGS(H,x,g):
xg = x.T*g
b = 1+(g.T*H*g)/xg
H_ans = H + b[0,0]*(x*x.T)/xg - (x*g.T*H+H*g*x.T)/xg
return H_ans
def Quasi_Newton(get_H_method, method_name):
e = 0.01
n = 2
H = np.eye(2)
k = 0
xk = np.mat([[2],[1]])
gk = get_g(xk)
p = (-1)*H*gk[0]
xk_1 = get_x(xk,p)
gk_1 = get_g(xk_1)
while gk_1[1] >=e:
if k==n-1:
xk = xk_1
gk = gk_1
H = np.eye(2)
k=0
else:
dg = gk_1[0]-gk[0]
dx = xk_1-xk
H = get_H_method(H,dx,dg)
k = k+1
p = (-1)*H*gk_1[0]
xk_1 = get_x(xk_1, p)
gk_1 = get_g(xk_1)
print(method_name+"_answer:",[xk_1[0,0],xk_1[1,0]])
Quasi_Newton(get_H_DFP, "DFP")
Quasi_Newton(get_H_BFGS, "BFGS")
{\quad}
4.2结果