INTRODUCTION TO NONLINEAR OPTIMIZATION, Exercise 5.2: Freudenstein and Roth Test Function

Reference: Amir Beck, Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB, SIAM, 2014, Exercise 5.2.

This post covers the MATLAB part of item (ii).
Item (i): https://blog.csdn.net/IYXUAN/article/details/121454946

Objectives

  • Understand gradient descent with backtracking, damped Newton's method, and the hybrid Newton's method;
  • Learn to use MATLAB to solve the optimization problem with each of the three methods;
  • Understand the relative strengths and weaknesses of the three methods and be able to explain the observed results.

Environment

MATLAB R2021a

Problem statement

Consider the Freudenstein and Roth test function

$f(x) = f_1(x)^2 + f_2(x)^2, \quad x \in \mathbb{R}^2,$

where

$f_1(x) = -13 + x_1 + \left(\left(5 - x_2\right)x_2 - 2\right)x_2\,,$

$f_2(x) = -29 + x_1 + \left(\left(x_2 + 1\right)x_2 - 14\right)x_2\,.$

Use MATLAB to employ the following three methods on the problem of minimizing $f$:

  1. the gradient method with backtracking and parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$.
  2. the hybrid Newton's method with parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$.
  3. the damped Gauss–Newton method with a backtracking line search strategy with parameters $(s, \alpha, \beta) = (1, 0.5, 0.5)$.

All the algorithms should use the stopping criterion $\|\nabla f(x)\| \le \epsilon$ with $\epsilon = 10^{-5}$. Each algorithm should be employed four times on the following four starting points: $(-50, 7)^T$, $(20, 7)^T$, $(20, -18)^T$, $(5, -10)^T$. For each of the four starting points, compare the number of iterations and the point to which each method converged. If a method did not converge, explain why.

Algorithm descriptions

gradient_method_backtracking

Description of gradient descent with backtracking:

Choose a tolerance $\epsilon > 0$.

Initialize $x_0 \in \mathbb{R}^n$.

FOR k = 0, 1, 2, …

  1. Pick the descent direction $d_k = -g(x_k)$, where $g = \nabla f$.
  2. Pick a stepsize $t_k$ such that $f(x_k + t_k d_k) < f(x_k)$.
  3. Set $x_{k+1} = x_k + t_k d_k$.
  4. IF $\Vert g(x_{k+1}) \Vert \le \epsilon$ THEN STOP and OUTPUT $x_{k+1}$.

In step 2 of the loop, a constant-stepsize gradient method simply uses $t_k = \bar{t}$ at every iteration. With a backtracking (inexact) line search, the stepsize is initialized to $t_k = s$ with $s > 0$ and repeatedly shrunk, $t_k \leftarrow \beta t_k$ with $\beta \in (0,1)$, until the sufficient-decrease condition $f(x_k) - f(x_k + t_k d_k) \ge -\alpha t_k \nabla f(x_k)^T d_k$, $\alpha \in (0,1)$, holds. One can show that this inequality is always satisfied for all sufficiently small $t_k$, so the backtracking loop terminates after finitely many reductions.
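In particular, for the gradient direction $d_k = -g(x_k)$ the sufficient-decrease condition becomes

$f(x_k) - f\left(x_k - t_k g(x_k)\right) \ge \alpha t_k \Vert g(x_k) \Vert^2,$

which is exactly the test used in the gradient_method_backtracking listing below (fun_val - f(x-t*grad) compared against alpha*t*norm(grad)^2).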

newton_backtracking

Description of Newton's method with backtracking:

Backtracking Newton is the same as the gradient scheme above, except that in step 1 the descent direction is replaced by the Newton direction

$d_k = -\left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k).$
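In the newton_backtracking listing below, this direction is obtained by solving the linear system $\nabla^2 f(x_k)\, d = \nabla f(x_k)$ with the backslash operator rather than by forming the inverse; the relevant lines (comments added here) are:

gval=g(x);     % gradient at the current iterate
hval=h(x);     % Hessian at the current iterate
d=hval\gval;   % solve hval*d = gval rather than computing inv(hval)
% ... a backtracking line search then chooses the stepsize t ...
x=x-t*d;       % damped Newton update, i.e. x_{k+1} = x_k + t_k*d_k with d_k = -hval\gval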

newton_hybrid

Description of the hybrid Newton's method:

When choosing the descent direction $d_k$: if the Hessian $\nabla^2 f(x_k)$ is not positive definite, fall back to a gradient step, i.e. $d_k = -\nabla f(x_k)$; if $\nabla^2 f(x_k)$ is positive definite, take the Newton direction. Positive definiteness is tested via a Cholesky factorization, which succeeds if and only if the matrix is positive definite.
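Concretely, the direction-selection step in the newton_hybrid listing below reads (comments added here):

[L,p]=chol(hval,'lower');   % p == 0 exactly when hval is positive definite
if (p==0)
    d=L'\(L\gval);          % Newton direction, reusing the Cholesky factor L
else
    d=gval;                 % otherwise fall back to the gradient direction
end
% in both cases the update is x = x - t*d with a backtracked stepsize t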

Procedure

Computing $f(x)$, $g(x)$, and $h(x)$

$f(x) = 2x_1^2 + 12x_1 x_2^2 - 32x_1 x_2 - 84x_1 + 2x_2^6 - 8x_2^5 + 2x_2^4 - 80x_2^3 + 12x_2^2 + 864x_2 + 1010$

$g(x) = \left(12x_2^2 - 32x_2 + 4x_1 - 84,\; 24x_2 - 32x_1 + 24x_1 x_2 - 240x_2^2 + 8x_2^3 - 40x_2^4 + 12x_2^5 + 864\right)^T$

$h(x) = \begin{pmatrix} 4 & 24x_2 - 32 \\ 24x_2 - 32 & 60x_2^4 - 160x_2^3 + 24x_2^2 - 480x_2 + 24x_1 + 24 \end{pmatrix}$

clc;clear
syms f1 f2 x1 x2 f g h
f1=-13+x1+((5-x2)*x2-2)*x2;
f2=-29+x1+((x2+1)*x2-14)*x2;
f=expand(f1^2+f2^2)
% f = 2*x1^2 + 12*x1*x2^2 - 32*x1*x2 - 84*x1 + 2*x2^6 - 8*x2^5 + 2*x2^4 - 80*x2^3 + 12*x2^2 + 864*x2 + 1010
g=gradient(f)
% g = [12*x2^2 - 32*x2 + 4*x1 - 84; 24*x2 - 32*x1 + 24*x1*x2 - 240*x2^2 + 8*x2^3 - 40*x2^4 + 12*x2^5 + 864]
h=hessian(f)
% h = [4, 24*x2 - 32; 24*x2 - 32, 60*x2^4 - 160*x2^3 + 24*x2^2 - 480*x2 + 24*x1 + 24]
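
As a quick sanity check (an addition, not part of the exercise), the symbolic gradient can be compared against a central finite-difference approximation at a random point. This can be appended to the script above; the step size 1e-6 is an arbitrary choice.

ffun = matlabFunction(f,'Vars',{[x1;x2]});   % function handles taking a 2-vector
gfun = matlabFunction(g,'Vars',{[x1;x2]});
x0 = randn(2,1);
delta = 1e-6;
fd = zeros(2,1);
for i = 1:2
    e = zeros(2,1); e(i) = delta;
    fd(i) = (ffun(x0+e) - ffun(x0-e))/(2*delta);   % central difference
end
max(abs(fd - gfun(x0)))   % should be tiny, typically around 1e-6 or smaller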

Contour and surface plots of the function


Figure 1. Contour and surface plots of $f(x)$ around the global optimal solution $x^* = (5, 4)$.


Figure 2. Contour and surface plots of $f(x)$ around the local optimal solution $x^* = (11.4128, -0.8968)$.
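
As a quick numerical check (an addition, not in the original post), the two candidate minimizers can be plugged into the symbolic expressions derived above; f vanishes exactly at (5, 4), while at the rounded local minimizer the gradient is small but not exactly zero because its coordinates are quoted to only four decimals.

double(subs(f, [x1 x2], [5 4]))                      % 0 at the global minimizer
double(norm(subs(g, [x1 x2], [5 4])))                % 0
double(subs(f, [x1 x2], [11.4128 -0.8968]))          % approximately 48.9843 at the local minimizer
double(norm(subs(g, [x1 x2], [11.4128 -0.8968])))    % small but nonzero (rounded point)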

clc;clear
% local optimality
x1 = linspace(11.40,11.42,50);
x2 = linspace(-0.897,-0.8965,50);
% global optimality
% x1 = linspace(4.9,5.1,50);
% x2 = linspace(3.9,4.1,50);
[X1,X2] = meshgrid(x1,x2);
f = 2*X1.^2 + 12*X1.*X2.^2 - 32*X1.*X2 - 84*X1 + 2*X2.^6 - 8*X2.^5 ...
    + 2*X2.^4 - 80*X2.^3 + 12*X2.^2 + 864*X2 + 1010;
createfigure(X1,X2,f)
%% Generated by the surfc function:
function createfigure(xdata1, ydata1, zdata1)
%CREATEFIGURE(xdata1, ydata1, zdata1)
%  XDATA1:  surface xdata
%  YDATA1:  surface ydata
%  ZDATA1:  surface zdata

%  Auto-generated by MATLAB on 08-Dec-2021 15:44:51

% Create figure
figure1 = figure;

% Create axes
axes1 = axes('Parent',figure1);
hold(axes1,'on');

% Create surf
surf(xdata1,ydata1,zdata1,'Parent',axes1);

% Create contour
contour(xdata1,ydata1,zdata1,'ZLocation','zmin');

view(axes1,[-113 13]);
grid(axes1,'on');
axis(axes1,'tight');
hold(axes1,'off');
% Create colorbar
colorbar(axes1);

Driver script

clc;clear

%% init
s = 1;
alpha = .5;
beta = .5;
epsilon = 1e-5;
p1 = [-50;7];
p2 = [20;7];
p3 = [20;-18];
p4 = [5;-10];
f = @(x) 2*x(1)^2 + 12*x(1)*x(2)^2 - 32*x(1)*x(2) - 84*x(1) ...
    + 2*x(2)^6 - 8*x(2)^5 + 2*x(2)^4 - 80*x(2)^3 + 12*x(2)^2 ...
    + 864*x(2) + 1010;
g = @(x) [12*x(2)^2 - 32*x(2) + 4*x(1) - 84; 24*x(2) - 32*x(1)...
    + 24*x(1)*x(2) - 240*x(2)^2 + 8*x(2)^3 - 40*x(2)^4 + 12*x(2)^5 + 864];
h = @(x) [4, 24*x(2) - 32; 24*x(2) - 32, 60*x(2)^4 - 160*x(2)^3 ...
    + 24*x(2)^2 - 480*x(2) + 24*x(1) + 24];

%% call func
% gradient_method_backtracking
[x_gb1,fun_val_gb1] = gradient_method_backtracking(f,g,p1,s,alpha,...
beta,epsilon);
[x_gb2,fun_val_gb2] = gradient_method_backtracking(f,g,p2,s,alpha,...
beta,epsilon);
[x_gb3,fun_val_gb3] = gradient_method_backtracking(f,g,p3,s,alpha,...
beta,epsilon);
[x_gb4,fun_val_gb4] = gradient_method_backtracking(f,g,p4,s,alpha,...
beta,epsilon);

% newton_backtracking
x_nb1 = newton_backtracking(f,g,h,p1,alpha,beta,epsilon);
x_nb2 = newton_backtracking(f,g,h,p2,alpha,beta,epsilon);
x_nb3 = newton_backtracking(f,g,h,p3,alpha,beta,epsilon);
x_nb4 = newton_backtracking(f,g,h,p4,alpha,beta,epsilon);

% newton_hybrid
x_nh1 = newton_hybrid(f,g,h,p1,alpha,beta,epsilon);
x_nh2 = newton_hybrid(f,g,h,p2,alpha,beta,epsilon);
x_nh3 = newton_hybrid(f,g,h,p3,alpha,beta,epsilon);
x_nh4 = newton_hybrid(f,g,h,p4,alpha,beta,epsilon);
function [x,fun_val]=gradient_method_backtracking(f,g,x0,s,alpha,...
beta,epsilon)

% Gradient method with backtracking stepsize rule
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% x0......... initial point
% s ......... initial choice of stepsize
% alpha ..... tolerance parameter for the stepsize selection
% beta ...... the constant in which the stepsize is multiplied
% at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
% of min f(x)
% fun_val ... optimal function value

x=x0;
grad=g(x);
fun_val=f(x);
iter=0;
while (norm(grad)>epsilon)
    iter=iter+1;
    t=s;
    while (fun_val-f(x-t*grad)<alpha*t*norm(grad)^2)
        t=beta*t;
    end
    x=x-t*grad;
    fun_val=f(x);
    grad=g(x);
    fprintf('iter_number = %3d norm_grad = %2.6f fun_val = %2.6f \n',...
    iter,norm(grad),fun_val);
end
function x=newton_backtracking(f,g,h,x0,alpha,beta,epsilon)

% Newton’s method with backtracking
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion in which the stepsize is multiplied
% at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
% of min f(x)
% fun_val ... optimal function value

x=x0;
gval=g(x);
hval=h(x);
d=hval\gval;
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while(f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    d=hval\gval;
end
if (iter==10000)
    fprintf('did not converge\n')
end
function x=newton_hybrid(f,g,h,x0,alpha,beta,epsilon)

% Hybrid Newton’s method
%
% INPUT
%=======================================
% f ......... objective function
% g ......... gradient of the objective function
% h ......... hessian of the objective function
% x0......... initial point
% alpha ..... tolerance parameter for the stepsize selection strategy
% beta ...... the proportion in which the stepsize is multiplied
% at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
% of min f(x)
% fun_val ... optimal function value

x=x0;
gval=g(x);
hval=h(x);
[L,p]=chol(hval,'lower');
if (p==0)
    d=L'\(L\gval);
else
    d=gval;
end
iter=0;
while ((norm(gval)>epsilon)&&(iter<10000))
    iter=iter+1;
    t=1;
    while(f(x-t*d)>f(x)-alpha*t*gval'*d)
        t=beta*t;
    end
    x=x-t*d;
    fprintf('iter= %2d f(x)=%10.10f\n',iter,f(x))
    gval=g(x);
    hval=h(x);
    [L,p]=chol(hval,'lower');
    if (p==0)
        d=L'\(L\gval);
    else
        d=gval;
    end
end
if (iter==10000)
    fprintf('did not converge\n')
end

Results and analysis

$p_1 = (-50, 7)^T$, $p_2 = (20, 7)^T$, $p_3 = (20, -18)^T$, $p_4 = (5, -10)^T$

gradient_method_backtracking

| start | x* | fun_val | iter |
|---|---|---|---|
| p1 | (5.0, 4.0) | 0.000000 | - (did not terminate) |
| p2 | (5.0, 4.0) | 0.000000 | - (did not terminate) |
| p3 | (11.4128, -0.8968) | 48.984254 | - (did not terminate) |
| p4 | (5.0, 4.0) | 0.000000 | - (did not terminate) |
# p1
iter_number =   1 norm_grad = 16886.999242 fun_val = 11336.705024 
iter_number =   2 norm_grad = 5600.459983 fun_val = 5803.984203 
iter_number =   3 norm_grad = 1056.831987 fun_val = 4723.717473 
iter_number =   4 norm_grad = 434.668354 fun_val = 4676.341028 
iter_number =   5 norm_grad = 181.792854 fun_val = 4664.456052 
iter_number =   6 norm_grad = 481.845230 fun_val = 4645.697772 
iter_number =   7 norm_grad = 1450.431648 fun_val = 2784.316562 
iter_number =   8 norm_grad = 626.401853 fun_val = 2597.516663 
iter_number =   9 norm_grad = 220.402369 fun_val = 2565.305713 
iter_number =  10 norm_grad = 116.512243 fun_val = 2561.054079 
......
iter_number = 20849 norm_grad = 0.000092 fun_val = 0.000000 
iter_number = 20850 norm_grad = 0.000092 fun_val = 0.000000 
iter_number = 20851 norm_grad = 0.000092 fun_val = 0.000000 
iter_number = 20852 norm_grad = 0.000092 fun_val = 0.000000 
......

# p2
iter_number =   1 norm_grad = 20020.154130 fun_val = 11955.275024 
iter_number =   2 norm_grad = 8107.335439 fun_val = 3744.085975 
iter_number =   3 norm_grad = 2993.085982 fun_val = 1129.842099 
iter_number =   4 norm_grad = 949.370487 fun_val = 445.778197 
iter_number =   5 norm_grad = 192.946640 fun_val = 320.464548 
iter_number =   6 norm_grad = 86.882053 fun_val = 313.996463 
iter_number =   7 norm_grad = 41.236184 fun_val = 311.880377 
iter_number =   8 norm_grad = 223.087207 fun_val = 295.899225 
iter_number =   9 norm_grad = 91.972368 fun_val = 287.479235 
iter_number =  10 norm_grad = 40.293830 fun_val = 285.264643 
......
iter_number = 13961 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13962 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13963 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13964 norm_grad = 0.000151 fun_val = 0.000000 
iter_number = 13965 norm_grad = 0.000151 fun_val = 0.000000 
......

# p3
iter_number =   1 norm_grad = 10459534.883496 fun_val = 26901796.557585 
iter_number =   2 norm_grad = 4328861.498929 fun_val = 9350549.111728 
iter_number =   3 norm_grad = 1815010.486957 fun_val = 3306985.630845 
iter_number =   4 norm_grad = 764085.159510 fun_val = 1178877.486706 
iter_number =   5 norm_grad = 322588.854922 fun_val = 423840.683140 
iter_number =   6 norm_grad = 136745.858345 fun_val = 154327.967236 
iter_number =   7 norm_grad = 58355.714510 fun_val = 57260.204151 
iter_number =   8 norm_grad = 25173.561268 fun_val = 21780.708212 
iter_number =   9 norm_grad = 11034.676282 fun_val = 8504.776429 
iter_number =  10 norm_grad = 4932.916085 fun_val = 3367.298679 
......
iter_number = 11759 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11760 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11761 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11762 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11763 norm_grad = 0.000034 fun_val = 48.984254 
iter_number = 11764 norm_grad = 0.000034 fun_val = 48.984254 
......

# p4
iter_number =   1 norm_grad = 740484.580250 fun_val = 1125740.309089 
iter_number =   2 norm_grad = 318800.867908 fun_val = 410873.876977 
iter_number =   3 norm_grad = 135283.737154 fun_val = 147433.167712 
iter_number =   4 norm_grad = 57286.724183 fun_val = 52647.763355 
iter_number =   5 norm_grad = 24273.902315 fun_val = 18647.021114 
iter_number =   6 norm_grad = 10264.843110 fun_val = 6442.533586 
iter_number =   7 norm_grad = 4279.439838 fun_val = 2094.562330 
iter_number =   8 norm_grad = 1694.354839 fun_val = 604.979471 
iter_number =   9 norm_grad = 568.486420 fun_val = 158.093768 
iter_number =  10 norm_grad = 98.759692 fun_val = 69.674565 
......
iter_number = 6072 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6073 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6074 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6075 norm_grad = 0.000107 fun_val = 0.000000 
iter_number = 6076 norm_grad = 0.000107 fun_val = 0.000000 
......

newton_backtracking

| start | x* | fun_val | iter |
|---|---|---|---|
| p1 | (5.0, 4.0) | -0.000000 | 8 |
| p2 | (5.0, 4.0) | 0.000000 | 11 |
| p3 | (11.4128, -0.8968) | 48.984254 | 16 |
| p4 | (11.4128, -0.8968) | 48.984254 | 13 |
# p1
iter=  1 f(x)=17489.2000310639
iter=  2 f(x)=3536.1276294897
iter=  3 f(x)=556.9651095482
iter=  4 f(x)=50.4029326519
iter=  5 f(x)=1.2248344195
iter=  6 f(x)=0.0013284549
iter=  7 f(x)=0.0000000018
iter=  8 f(x)=-0.0000000000

# p2
iter=  1 f(x)=18114.8519548468
iter=  2 f(x)=3676.4191962662
iter=  3 f(x)=583.8090713510
iter=  4 f(x)=53.8298627069
iter=  5 f(x)=1.3684607619
iter=  6 f(x)=0.0016459572
iter=  7 f(x)=0.0000000027
iter=  8 f(x)=0.0000000007
iter=  9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000

# p3
iter=  1 f(x)=21356364.5665758252
iter=  2 f(x)=5561714.7070109081
iter=  3 f(x)=1444932.1368662491
iter=  4 f(x)=374422.3650773597
iter=  5 f(x)=96843.5106842914
iter=  6 f(x)=25078.5606496656
iter=  7 f(x)=6556.5446423615
iter=  8 f(x)=1764.6276022580
iter=  9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792

# p4
iter=  1 f(x)=695503.8833055196
iter=  2 f(x)=180005.8157903731
iter=  3 f(x)=46559.5630006172
iter=  4 f(x)=12100.5787363822
iter=  5 f(x)=3202.4785180644
iter=  6 f(x)=889.1211707798
iter=  7 f(x)=275.2063763915
iter=  8 f(x)=106.0983438392
iter=  9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792

newton_hybrid

| start | x* | fun_val | iter |
|---|---|---|---|
| p1 | (5.0, 4.0) | -0.000000 | 8 |
| p2 | (5.0, 4.0) | 0.000000 | 11 |
| p3 | (11.4128, -0.8968) | 48.984254 | 16 |
| p4 | (11.4128, -0.8968) | 48.984254 | 13 |
# p1
iter=  1 f(x)=17489.2000310639
iter=  2 f(x)=3536.1276294897
iter=  3 f(x)=556.9651095482
iter=  4 f(x)=50.4029326519
iter=  5 f(x)=1.2248344195
iter=  6 f(x)=0.0013284549
iter=  7 f(x)=0.0000000018
iter=  8 f(x)=-0.0000000000

# p2
iter=  1 f(x)=18114.8519548468
iter=  2 f(x)=3676.4191962662
iter=  3 f(x)=583.8090713510
iter=  4 f(x)=53.8298627069
iter=  5 f(x)=1.3684607619
iter=  6 f(x)=0.0016459572
iter=  7 f(x)=0.0000000027
iter=  8 f(x)=0.0000000007
iter=  9 f(x)=0.0000000002
iter= 10 f(x)=0.0000000000
iter= 11 f(x)=0.0000000000

# p3
iter=  1 f(x)=21356364.5665758252
iter=  2 f(x)=5561714.7070109081
iter=  3 f(x)=1444932.1368662491
iter=  4 f(x)=374422.3650773597
iter=  5 f(x)=96843.5106842914
iter=  6 f(x)=25078.5606496656
iter=  7 f(x)=6556.5446423615
iter=  8 f(x)=1764.6276022580
iter=  9 f(x)=509.7515417783
iter= 10 f(x)=171.5021447876
iter= 11 f(x)=76.9343325176
iter= 12 f(x)=52.3964690617
iter= 13 f(x)=49.0499328705
iter= 14 f(x)=48.9842770185
iter= 15 f(x)=48.9842536792
iter= 16 f(x)=48.9842536792

# p4
iter=  1 f(x)=695503.8833055196
iter=  2 f(x)=180005.8157903731
iter=  3 f(x)=46559.5630006172
iter=  4 f(x)=12100.5787363822
iter=  5 f(x)=3202.4785180644
iter=  6 f(x)=889.1211707798
iter=  7 f(x)=275.2063763915
iter=  8 f(x)=106.0983438392
iter=  9 f(x)=59.2021422471
iter= 10 f(x)=49.5512377643
iter= 11 f(x)=48.9860167506
iter= 12 f(x)=48.9842536959
iter= 13 f(x)=48.9842536792

Discussion

The gradient method with backtracking (gradient_method_backtracking) performs poorly on $f(x)$: from all four starting points (p1-p4) it never satisfies the stopping criterion $\|\nabla f(x)\| \le \epsilon$, $\epsilon = 10^{-5}$, so the loop does not terminate. Nevertheless, the iterates and the function values do converge: starting from p1, p2, and p4 the method approaches $\bar{x} \to (5.0, 4.0)$ with fun_val → 0.0, which is the global minimum of $f$, whereas from p3 it converges to the local minimum. The logs show that after a few thousand iterations the printed gradient norm stops decreasing and settles on a small but nonzero plateau (about 9.2e-5 for p1, 1.5e-4 for p2, 3.4e-5 for p3, and 1.1e-4 for p4), always above the tolerance $10^{-5}$; the run from p4 stalls somewhat farther from the stopping criterion than those from p1 and p2. In other words, the gradient method does not diverge but stagnates: near either minimizer the iterates essentially stop changing (the same gradient norm is printed for thousands of consecutive iterations), presumably because the backtracked steps become too small to produce any further measurable decrease in double precision, so $\|\nabla f(x_k)\|$ never drops below $10^{-5}$ and the method fails to terminate.

Backtracking Newton (newton_backtracking) and hybrid Newton (newton_hybrid) give essentially identical results, and both converge well from all four starting points. From p1 and p2 they reach the global minimizer $(5.0, 4.0)$ in 8 and 11 iterations, respectively; from p3 and p4 they reach the local minimizer $(11.4128, -0.8968)$ in 16 and 13 iterations. Comparing the logs of the two methods for all four starting points, the outputs are identical. This means the hybrid method selected the Newton direction at every iteration, i.e. the Cholesky factorization of $\nabla^2 f(x_k)$ succeeded (the Hessian was positive definite) at every iterate visited, so the gradient fallback branch was never executed and hybrid Newton coincides with backtracking Newton on these runs.
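
A quick supporting check (an addition for illustration, using the Hessian handle h from the driver script): the Hessian is positive definite at both minimizers, consistent with the Newton step always being accepted near the limit points (this by itself does not prove positive definiteness along the whole iteration path).

eig(h([5; 4]))                 % both eigenvalues positive at the global minimizer
eig(h([11.4128; -0.8968]))     % both eigenvalues positive at the local minimizer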

In summary, on $f(x)$ with the stopping criterion $\|\nabla f(x)\| \le \epsilon$, $\epsilon = 10^{-5}$, the Newton-type methods require far fewer iterations than the gradient method. For both approaches, the choice of starting point determines whether the iterates converge to the global minimizer or only to a local one.

©️ Sylvan Ding ❤️
