文章目录
1. 无约束优化问题的解法
在无约束优化问题中,有四种经典的迭代优化方法:Newton’s method(牛顿法)、Levenberg-Marquardt’s method(非线性最小二乘法,LM)、Broyden-Fletcher-Goldfarb-Shanno’s method(BFGS)、Davidon-Fletcher-Powell’s method(DFP)。
2. Matlab实现
假设有如下无约束优化问题: f ( x 1 , x 2 ) = ( x 1 − 3 ) 4 + ( x 1 − 3 x 2 ) 2 f(x_1,x_2)=(x_1-3)^4+(x_1-3x_2)^2 f(x1,x2)=(x1−3)4+(x1−3x2)2为了方便四种算法的比较,我们统一设置初始迭代点为 x 0 = [ 0 , 0 ] T x_0=[0,0]^T x0=[0,0]T ,则初始海森矩阵为: H 0 = ( 110 − 6 − 6 18 ) H_0=\begin{pmatrix} 110 & -6 &\\ -6 & 18 &\\ \end{pmatrix} H0=(110−6−618)
上述优化问题用matlab代码表示为:
syms x1; % 变量x1
syms x2; % 变量x2
f = (x1 - 3).^4 + (x1 - 3*x2).^2; $ 函数表达
x0=[0 0]'; % 初始迭代点
H0=[110 -6;-6 18]; % 初始海森矩阵
m=2; % 变量个数
k=30; % 迭代次数
该函数的最优 X = ( x 1 , x 2 ) X=(x_1,x_2) X=(x1,x2)以及 f ( x 1 , x 2 ) f(x_1,x_2) f(x1,x2)应该为: x 1 = 3 , x 2 = 1 x_1=3, x_2=1 x1=3,x2=1 f ( x 1 , x 2 ) = 0 f(x_1,x_2)=0 f(x1,x2)=0
下面展示一下如何用Matlab实现对函数的优化:
2.1. Newton’s method(牛顿法)
牛顿法的迭代公式为: x k + 1 = x k − H − 1 ( x k ) ∇ f ( x k ) x_{k+1}=x_k-H^{-1}(x_k)\nabla f(x_k) xk+1=xk−H−1(xk)∇f(xk)matlab实现函数实现为:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%文件名:Newton.m
%
%f表示函数表达式
%H0表示初始的海森矩阵
%x0表示初始的迭代点 为列向量
%m表示变量的个数
%k表示迭代次数
%X存储每次迭代的x,F为函数值,G为每次的梯度,H为海森阵,HN为海森矩阵的逆
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function[X, H, F, G,HN] = Newton(f,H0,x0,m,k)
x1 = sym('x',[1,m]); % [x1, x2]
%f = (x1(1) - 3)^4 + (x1(1) - 3*x1(2))^2;
c = num2cell(x1); % c=变量[x1, x2]
g = sym('x',[m,1]); % [x1, x2]^T
X = zeros(m, k+1); % x1、x2的迭代值
H = zeros( m, m, k+1); % hessian的迭代值
F = zeros(1, k+1); % function的迭代值
G = zeros(m, k+1); % function‘的迭代值
HN = zeros( m, m, k+1); % hessian的逆阵的迭代值
H(:,:,1) = H0; % hessian初始化
HN(:,:,1) = inv(H0); % hessian逆初始化
X(:,1) = x0; % X(x1, x2)初始化
F(1,1) = subs(f, c, {X(:,1)'}); % 初始X值赋予F
h = hessian(f,x1);%求海森矩阵
for n = 1:m % f对x1、x2分别求偏导
g(n) = diff(f,x1(n));
end
G(:,1) = subs(g,c,{X(:,1)'}); % 初始X导赋予G
% 迭代
for n = 1:k
X(:,n+1) = X(:,n) - (H(:,:,n))\G(:,n);
F(1,n+1) = subs(f,c,{X(:,n+1)'});
G(:,n+1) = subs(g,c,{X(:,n+1)'});
H(:,:,n+1) = subs(h,c,{X(:,n+1)'});
HN(:,:,n+1) = inv(H(:,:,n+1));
end
end
执行matlab代码:
[X, H, F, G, HN] = Newton(f,H0,x0,m,k);
即可得到优化结果,下表是迭代次数 k 分别为:0、1、2、3时的输出值:
k k k | x k x_k xk | f ( x k ) f(x_k) f(xk) | ∇ f ( x k ) \nabla f(x_k) ∇f(xk) | H ( x k ) H(x_k) H(xk) |
---|---|---|---|---|
0 | ( 0 , 0 ) (0,0) (0,0) | 81 | ( − 108 , 0 ) (-108, 0) (−108,0) | ( 110 − 6 − 6 18 ) \begin{pmatrix}110 & -6 &\\-6 & 18 &\\\end{pmatrix} (110−6−618) |
1 | ( 1 , 0.3333 ) (1, 0.3333) (1,0.3333) | 16 | ( − 32 , 0 ) (-32, 0) (−32,0) | ( 50 − 6 − 6 18 ) \begin{pmatrix}50 & -6 &\\-6 & 18 &\\\end{pmatrix} (50−6−618) |
2 | ( 1.6667 , 0.5556 ) (1.6667, 0.5556) (1.6667,0.5556) | 3.1605 | ( − 9.4815 , 0 ) (-9.4815, 0) (−9.4815,0) | ( 23.3333 − 6 − 6 18 ) \begin{pmatrix}23.3333 & -6 &\\-6 & 18 &\\\end{pmatrix} (23.3333−6−618) |
3 | ( 2.1111 , 0.7037 ) (2.1111, 0.7037) (2.1111,0.7037) | 0.6243 | ( − 2.8093 , 0 ) (-2.8093, 0) (−2.8093,0) | ( 11.4815 − 6 − 6 18 ) \begin{pmatrix}11.4815 & -6 &\\-6 & 18 &\\\end{pmatrix} (11.4815−6−618) |
2.2. Levenberg-Marquardt’s method(LM)
LM法的迭代公式为: x k + 1 = x k − ( λ I + H ( x k ) ) − 1 ∇ f ( x k ) , H ( x k ) > > λ , x_{k+1}=x_k-(\lambda I+H(x_k))^{-1}\nabla f(x_k),\quad H(x_k) >> \lambda, xk+1=xk−(λI+H(xk))−1∇f(xk),H(xk)>>λ,matlab实现函数实现为:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%文件名:LM.m
%
%f表示函数表达式
%H0表示初始的海森矩阵
%x0表示初始的迭代点 为列向量
%m表示变量的个数
%k表示迭代次数
%X存储每次迭代的x,F为函数值,G为每次的梯度,H为海森阵,HN为海森矩阵的逆
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function[X, H, F, G,HN] = LM(f,H0,x0,m,k)
x1 = sym('x',[1,m]); % [x1, x2]
%f = (x1(1) - 3)^4 + (x1(1) - 3*x1(2))^2;
c = num2cell(x1); % c=变量[x1, x2]
g = sym('x',[m,1]); % [x1, x2]^T
lambda = 0.000001; $ lambda值
X = zeros(m, k+1); % x1、x2的迭代值
H = zeros( m, m, k+1); % hessian的迭代值
lambda_I = lambda * eye(2); % Lambda * I
F = zeros(1, k+1); % function的迭代值
G = zeros(m, k+1); % function‘的迭代值
HN = zeros( m, m, k+1); % hessian的逆阵的迭代值
H(:, :, 1) = H0; % hessian初始化
HN(:,:,1) = inv(H0); % hessian逆初始化
X(:,1) = x0; % X(x1, x2)初始化
F(1,1) = subs(f, c, {X(:,1)'}); % 初始X值赋予F
h = hessian(f,x1);%求海森矩阵
for n = 1:m % f对x1、x2分别求偏导
g(n) = diff(f,x1(n));
end
G(:,1) = subs(g,c,{X(:,1)'}); % 初始X导赋予G
% 迭代
for n = 1:k
X(:,n+1) = X(:,n) - (lambda_I + H(:,:,n))\G(:,n);
F(1,n+1) = subs(f,c,{X(:,n+1)'});
G(:,n+1) = subs(g,c,{X(:,n+1)'});
H(:,:,n+1) = subs(h,c,{X(:,n+1)'});
HN(:,:,n+1) = inv(H(:,:,n+1));
end
end
执行matlab代码:
[X, H, F, G, HN] = LM(f,H0,x0,m,k);
即可得到优化结果,下表是迭代次数 k 分别为:0、1、2、3时的输出值:
k k k | x k x_k xk | f ( x k ) f(x_k) f(xk) | ∇ f ( x k ) \nabla f(x_k) ∇f(xk) | H ( x k ) H(x_k) H(xk) |
---|---|---|---|---|
0 | ( 0 , 0 ) (0,0) (0,0) | 81 | ( − 108 , 0 ) (-108, 0) (−108,0) | ( 110 − 6 − 6 18 ) \begin{pmatrix}110 & -6 &\\-6 & 18 &\\\end{pmatrix} (110−6−618) |
1 | ( 1 , 0.3333 ) (1, 0.3333) (1,0.3333) | 16 | ( − 32.0000 , − 3.3333 × 1 0 − 7 ) (-32.0000, -3.3333\times 10^{-7}) (−32.0000,−3.3333×10−7) | ( 50 − 6 − 6 18 ) \begin{pmatrix}50 & -6 &\\-6 & 18 &\\\end{pmatrix} (50−6−618) |
2 | ( 1.6667 , 0.5556 ) (1.6667, 0.5556) (1.6667,0.5556) | 3.1605 | ( − 9.4815 , − 2.2222 × 1 0 − 7 ) (-9.4815, -2.2222\times 10^{-7}) (−9.4815,−2.2222×10−7) | ( 23.3333 − 6 − 6 18 ) \begin{pmatrix}23.3333 & -6 &\\-6 & 18 &\\\end{pmatrix} (23.3333−6−618) |
3 | ( 2.1111 , 0.7037 ) (2.1111, 0.7037) (2.1111,0.7037) | 0.6243 | ( − 2.8093 , − 1.4815 × 1 0 − 7 ) (-2.8093, -1.4815\times 10^{-7}) (−2.8093,−1.4815×10−7) | ( 11.4815 − 6 − 6 18 ) \begin{pmatrix}11.4815 & -6 &\\-6 & 18 &\\\end{pmatrix} (11.4815−6−618) |
2.3. Broyden-Fletcher-Goldfarb-Shanno’s method(BFGS)
BFGS法的迭代公式为: x k + 1 = x k − H k ^ − 1 ∇ f ( x k ) x_{k+1} = x_k-\hat{H_k}^{-1}\nabla f(x_k) xk+1=xk−Hk^−1∇f(xk)其中: H k ^ = H ^ k − 1 + q k q k T q k T s k − H ^ k − 1 s k s k T H ^ k − 1 T s k T H ^ k − 1 s k \hat{H_{k}} = \hat{H}_{k-1} +\frac{q_kq_k^T}{q_k^Ts_k} - \frac{\hat{H}_{k-1}s_ks_k^T\hat{H}_{k-1}^T}{s_k^T\hat{H}_{k-1}s_k} Hk^=H^k−1+qkTskqkqkT−skTH^k−1skH^k−1skskTH^k−1T s k = x k − x k − 1 s_k=x_k-x_{k-1} sk=xk−xk−1 q k = ∇ f ( x k ) − ∇ f ( x k − 1 ) q_k=\nabla f(x_k)-\nabla f(x_{k-1}) qk=∇f(xk)−∇f(xk−1) H ^ 0 = H ( x 0 ) \hat{H}_0=H(x_0) H^0=H(x0)matlab实现函数实现为:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%文件名:BFGS.m
%
%f表示函数表达式
%H0表示初始的海森矩阵
%x0表示初始的迭代点 为列向量
%m表示变量的个数
%k表示迭代次数
%X存储每次迭代的x,F为函数值,G为每次的梯度,H为海森阵,HN为海森矩阵的逆
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function[X, H_hat, F, G, HN] = BFGS(f, H0, x0, m, k)
x1 = sym('x',[1,m]); % [x1, x2]
%f = (x1(1) - 3)^4 + (x1(1) - 3*x1(2))^2;
c = num2cell(x1); % c=变量[x1, x2]
g = sym('x',[m,1]); % [x1, x2]^T
X = zeros(m, k+1); % x1、x2的迭代值
H_hat = zeros( m, m, k+1); % 近似hessian的迭代值
F = zeros(1, k+1); % function的迭代值
G = zeros(m, k+1); % function‘的迭代值
HN = zeros( m, m, k+1); % hessian的逆阵的迭代值
H_hat(:, :, 1) = H0; % hessian初始化
HN(:,:,1) = inv(H0); % hessian逆初始化
X(:,1) = x0; % X(x1, x2)初始化
F(1,1) = subs(f, c, {X(:,1)'}); % 初始X值赋予F
for n = 1:m % f对x1、x2分别求偏导
g(n) = diff(f,x1(n));
end
G(:,1) = subs(g,c,{X(:,1)'}); % 初始X导赋予G
% 迭代
for n = 1:k
X(:,n+1) = X(:,n) - (H_hat(:,:,n))\G(:,n);
F(1,n+1) = subs(f,c,{X(:,n+1)'});
G(:,n+1) = subs(g,c,{X(:,n+1)'});
s_k = X(:, n+1) - X(:, n);
q_k = G(:, n+1) - G(:, n);
H_hat(:,:,n+1) = H_hat(:,:,n) +(q_k*q_k')/(q_k'*s_k) - (H_hat(:,:,n)*s_k*s_k'*H_hat(:,:,n)')/(s_k'*H_hat(:,:,n)*s_k);
HN(:,:,n+1) = inv(H_hat(:,:,n+1));
end
end
执行matlab代码:
[X, H_hat, F, G, HN] = BFGS(f,H0,x0,m,k);
即可得到优化结果,下表是迭代次数 k 分别为:0、1、2、3时的输出值:
k k k | x k x_k xk | f ( x k ) f(x_k) f(xk) | ∇ f ( x k ) \nabla f(x_k) ∇f(xk) | H ( x k ) H(x_k) H(xk) |
---|---|---|---|---|
0 | ( 0 , 0 ) (0,0) (0,0) | 81 | ( − 108 , 0 ) (-108, 0) (−108,0) | ( 110 − 6 − 6 18 ) \begin{pmatrix}110 & -6 &\\-6 & 18 &\\\end{pmatrix} (110−6−618) |
1 | ( 1 , 0.3333 ) (1, 0.3333) (1,0.3333) | 16 | ( − 32 , 0 ) (-32, 0) (−32,0) | ( 78 − 6 − 6 18 ) \begin{pmatrix}78 & -6 &\\-6 & 18 &\\\end{pmatrix} (78−6−618) |
2 | ( 1.4211 , 0.4737 ) (1.4211, 0.4737) (1.4211,0.4737) | 6.2154 | ( − 15.7457 , 0 ) (-15.7457, 0) (−15.7457,0) | ( 40.6039 − 6 − 6 18 ) \begin{pmatrix}40.6039 & -6 &\\-6 & 18 &\\\end{pmatrix} (40.6039−6−618) |
3 | ( 1.8289 , 0.6096 ) (1.8289, 0.6096) (1.8289,0.6096) | 1.8807 | ( − 6.4240 , 0 ) (-6.4240, 0) (−6.4240,0) | ( 24.8541 − 6 − 6 18 ) \begin{pmatrix}24.8541 & -6 &\\-6 & 18 &\\\end{pmatrix} (24.8541−6−618) |
2.4. Davidon-Fletcher-Powell’s method(DFP)
DFP法的迭代公式为: x k + 1 = x k − D k ^ ∇ f ( x k ) x_{k+1} = x_k-\hat{D_k}\nabla f(x_k) xk+1=xk−Dk^∇f(xk)其中: D k ^ = D ^ k − 1 + s k s k T q k T s k − D ^ k − 1 q k q k T D ^ k − 1 T q k T D ^ k − 1 q k \hat{D_{k}} = \hat{D}_{k-1} +\frac{s_ks_k^T}{q_k^Ts_k} - \frac{\hat{D}_{k-1}q_kq_k^T\hat{D}_{k-1}^T}{q_k^T\hat{D}_{k-1}q_k} Dk^=D^k−1+qkTskskskT−qkTD^k−1qkD^k−1qkqkTD^k−1T s k = x k − x k − 1 s_k=x_k-x_{k-1} sk=xk−xk−1 q k = ∇ f ( x k ) − ∇ f ( x k − 1 ) q_k=\nabla f(x_k)-\nabla f(x_{k-1}) qk=∇f(xk)−∇f(xk−1) D ^ 0 = H − 1 ( x 0 ) \hat{D}_0=H^{-1}(x_0) D^0=H−1(x0)matlab实现函数实现为:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%文件名:DFP.m
%
%f表示函数表达式
%H0表示初始的海森矩阵
%x0表示初始的迭代点 为列向量
%m表示变量的个数
%k表示迭代次数
%X存储每次迭代的x,F为函数值,G为每次的梯度,H为海森阵,HN为海森矩阵的逆
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function[X, D_hat, F, G] = DFP(f, H0, x0, m, k)
x1 = sym('x',[1,m]); % [x1, x2]
%f = (x1(1) - 3)^4 + (x1(1) - 3*x1(2))^2;
c = num2cell(x1); % c=变量[x1, x2]
g = sym('x',[m,1]); % [x1, x2]^T
X = zeros(m, k+1); % x1、x2的迭代值
D_hat = zeros( m, m, k+1); % hessian的逆阵的迭代值
F = zeros(1, k+1); % function的迭代值
G = zeros(m, k+1); % function‘的迭代值
D_hat(:, :, 1) = inv(H0); % hessian逆初始化
X(:,1) = x0; % X(x1, x2)初始化
F(1,1) = subs(f, c, {X(:,1)'}); % 初始X值赋予F
for n = 1:m % f对x1、x2分别求偏导
g(n) = diff(f, x1(n));
end
G(:,1) = subs(g, c, {X(:,1)'}); % 初始X导赋予G
% 迭代
for n = 1:k
X(:,n+1) = X(:,n) - (D_hat(:, :, n)) * G(:, n);
F(1,n+1) = subs(f, c, {X(:,n+1)'});
G(:,n+1) = subs(g, c, {X(:,n+1)'});
s_k = X(:, n+1) - X(:, n);
q_k = G(:, n+1) - G(:, n);
D_hat(:,:,n+1) = D_hat(:, :, n) +(s_k*s_k')/(q_k'*s_k) - (D_hat(:,:,n)*q_k*q_k'*D_hat(:,:,n)')/(q_k'*D_hat(:,:,n)*q_k);
end
end
执行matlab代码:
[X, D, F, G] = DFP(f, H0, x0, m, k);
即可得到优化结果,下表是迭代次数 k 分别为:0、1、2、3时的输出值:
k k k | x k x_k xk | f ( x k ) f(x_k) f(xk) | ∇ f ( x k ) \nabla f(x_k) ∇f(xk) | D ( x k ) D(x_k) D(xk) |
---|---|---|---|---|
0 | ( 0 , 0 ) (0,0) (0,0) | 81 | ( − 108 , 0 ) (-108, 0) (−108,0) | ( 0.0093 0.0031 0.0031 0.0566 ) \begin{pmatrix}0.0093 & 0.0031 &\\0.0031 & 0.0566 &\\\end{pmatrix} (0.00930.00310.00310.0566) |
1 | ( 1 , 0.3333 ) (1, 0.3333) (1,0.3333) | 16 | ( − 32 , 0 ) (-32, 0) (−32,0) | ( 0.0132 0.0044 0.0044 0.0570 ) \begin{pmatrix}0.0132 & 0.0044 &\\0.0044 & 0.0570 &\\\end{pmatrix} (0.01320.00440.00440.0570) |
2 | ( 1.4211 , 0.4737 ) (1.4211, 0.4737) (1.4211,0.4737) | 6.2154 | ( − 15.7457 , 0 ) (-15.7457, 0) (−15.7457,0) | ( 0.0259 0.0086 0.0086 0.0584 ) \begin{pmatrix}0.0259 & 0.0086 &\\0.0086 & 0.0584 &\\\end{pmatrix} (0.02590.00860.00860.0584) |
3 | ( 1.8289 , 0.6096 ) (1.8289, 0.6096) (1.8289,0.6096) | 1.8807 | ( − 6.4240 , 0 ) (-6.4240, 0) (−6.4240,0) | ( 0.0438 0.0146 0.0146 0.0604 ) \begin{pmatrix}0.0438 & 0.0146 &\\0.0146 & 0.0604 &\\\end{pmatrix} (0.04380.01460.01460.0604) |