单纯形法只需要计算目标函数值,是无需要一维搜索,也无需进行求导的一种直接法。其优点计算比较简单,几何概念清晰,适用于目标函数求导比较困难或不知道目标函数的具体表达式而仅知道其具体计算方法的情况。
一、基本思想
所谓n维欧氏空间中的单纯形,是指在n维空间中由n+1个顶点构成的简单图形或多面体。例如,在二维平面中的单纯形应具有三个顶点,即三角形。在三维空间中的单纯形应具有四个顶点,即四面体。当各顶点之间的距离相等时,这种几何图形就称为正规单纯形。
单纯形法的基本思想是:选n+1个顶点构成初始单纯形,计算并比较各顶点目标函数值的大小,确定它们当中函数值大的顶点及函数值的下降方向,再设法找到一个函数值较小的新点替换函数值最大的顶点,从而构成新的单纯形。随着这种取代过程不断进行,新的单纯形想着极小值点收缩逼近,从而求得极小值点。
迭代过程的主要步骤:反射、延伸、压缩。
二、单纯形法的方法原理
现在以二维问题为例说明单纯形方法的原理。
(1)构造初始单纯形
如图,设二维目标函数 f ( x ) = f ( x 1 , x 2 ) f(x)=f(x_1,x_2) f(x)=f(x1,x2),在二维平面上选三个线性无关的顶点 x 1 , x 2 , x 3 x_1,x_2,x_3 x1,x2,x3构成初始单纯形。计算这三个顶点处的目标函数值 f ( x 1 ) , f ( x 2 ) , f ( x 3 ) f(x_1),f(x_2),f(x_3) f(x1),f(x2),f(x3)并比较其大小。若 f ( x 1 ) > f ( x 2 ) > f ( x 3 ) f(x_1)>f(x_2)>f(x_3) f(x1)>f(x2)>f(x3),把目标函数值最小的点 x 3 x_3 x3成为最好点,用 x L x_L xL表示;把目标函数值最大的点 x 1 x_1 x1成为最坏点,用 x H x_H xH表示;把目标函数值次大的点 x 2 x_2 x2称为次坏点,用 x G x_G xG表示。求除 x H x_H xH以外的所有顶点的集合形心 x C x_C xC
(2)反射
一般来说,目标函数值下降方向在最坏点
x
H
x_H
xH关于形心
x
C
x_C
xC的对称位置的方向可能性最大。故首先求反射点,以探测目标函数值的变化趋向。
求
x
L
,
x
G
x_L,x_G
xL,xG的中心点
x
C
x_C
xC,在
x
H
x_H
xH与
x
C
x_C
xC延长线取一点
x
R
x_R
xR,使
x
R
=
x
C
+
α
(
x
C
−
x
H
)
x_R=x_C+\alpha(x_C-x_H)
xR=xC+α(xC−xH)
点
x
R
x_R
xR称为最坏点
x
H
x_H
xH关于形心
x
C
x_C
xC的反射点;
α
\alpha
α称为反射系数,一般取
α
=
1
\alpha=1
α=1,故上式变为
x
R
=
2
x
C
−
x
H
x_R = 2x_C-x_H
xR=2xC−xH
计算函数值
f
(
x
R
)
f(x_R)
f(xR),若
f
(
x
R
)
<
f
(
x
L
)
f(x_R)<f(x_L)
f(xR)<f(xL),说明所取的探索方向正确,可沿该方向进一步扩大效果。
(3)延伸(扩张)
将搜索点延伸到
x
E
x_E
xE,且
x
E
=
x
C
+
γ
(
x
R
−
x
C
)
x_E = x_C + \gamma(x_R-x_C)
xE=xC+γ(xR−xC)
式中,
γ
\gamma
γ为扩张系数,
γ
=
1.2
2
\gamma=1.2~2
γ=1.2 2,一般取
γ
=
2
\gamma=2
γ=2。
如果
f
(
x
E
)
<
f
(
x
R
)
f(x_E)<f(x_R)
f(xE)<f(xR),说明扩张有利,以
x
E
x_E
xE代替最坏点
x
H
x_H
xH得到新的单纯形顶点
{
x
E
,
x
G
,
x
L
}
\{x_E,x_G,x_L\}
{xE,xG,xL};如果
f
(
x
E
)
>
f
(
x
R
)
f(x_E)>f(x_R)
f(xE)>f(xR),说明扩张不利,舍弃
x
E
x_E
xE,仍以
x
R
x_R
xR代替最坏点
x
H
x_H
xH得到新的单纯形点
{
x
R
,
x
G
,
x
L
}
\{x_R,x_G,x_L\}
{xR,xG,xL}.
若
f
(
x
H
)
>
f
(
x
R
)
>
f
(
x
G
)
f(x_H)>f(x_R)>f(x_G)
f(xH)>f(xR)>f(xG),说明反射点
x
R
x_R
xR比次坏点还差,表明反射点取得太远,应沿点
x
R
x_R
xR向点
x
C
x_C
xC方向压缩。
(4)压缩
压缩点
x
S
x_S
xS,其表达式为
x
S
=
x
C
+
β
(
x
R
−
x
C
)
x_S=x_C+\beta(x_R-x_C)
xS=xC+β(xR−xC)
式中
β
\beta
β为压缩系数,通常取
β
=
0.5
\beta=0.5
β=0.5或0.25~0.75.
若
f
(
x
S
)
<
f
(
x
H
)
f(x_S)<f(x_H)
f(xS)<f(xH),则用压缩点
x
S
x_S
xS替换最坏点
x
H
x_H
xH得到新的单纯形顶点
{
x
S
,
x
G
,
x
L
}
\{x_S,x_G,x_L\}
{xS,xG,xL};
若反射点
x
R
x_R
xR的函数值
f
(
x
R
)
>
f
(
x
H
)
f(x_R)>f(x_H)
f(xR)>f(xH),说明点
x
R
x_R
xR比坏点
x
H
x_H
xH还差,则应压缩得更多,即将新点压缩至
x
H
x_H
xH与
x
C
x_C
xC之间,此时所得到的压缩点应为
x
S
′
=
x
C
+
β
(
x
H
−
x
C
)
x_{S^{'}}=x_C+\beta(x_H-x_C)
xS′=xC+β(xH−xC)
若
f
(
x
S
′
)
<
f
(
x
H
)
f(x_{S^{'}})<f(x_H)
f(xS′)<f(xH),则用新的压缩点
x
S
′
x_{S^{'}}
xS′替换最坏点
x
H
x_H
xH,得到新的单纯形顶点
{
x
S
′
,
x
G
,
x
L
}
\{x_{S^{'}},x_G,x_L\}
{xS′,xG,xL};否则:
当
x
H
x
C
x_Hx_C
xHxC连线上所有点的函数值
f
(
x
)
f(x)
f(x)都大于
f
(
x
H
)
f(x_H)
f(xH)时,说明沿反射方向探索失败,此时应将单纯形想最好点
x
L
x_L
xL收缩。即以
x
L
x_L
xL为基点,将单纯形边长减半,得到新的单纯形,其新的顶点表达式为
x
G
′
=
x
L
+
0.5
(
x
G
−
x
L
)
x
H
′
=
x
L
+
0.5
(
x
H
−
x
L
)
\begin{aligned} x_{G^{'}} &=x_L+0.5(x_G-x_L) \\ x_{H^{'}} &=x_L+0.5(x_H-x_L) \end{aligned}
xG′xH′=xL+0.5(xG−xL)=xL+0.5(xH−xL)
三、单纯形法的计算步骤
单纯形法算法步骤:
步1 构造初始单纯形:设目标函数
f
(
x
)
f(x)
f(x)为n维函数,取n+1个顶点
x
1
,
x
2
,
⋯
,
x
n
+
1
x_1,x_2,\cdots,x_{n+1}
x1,x2,⋯,xn+1,要保证
x
j
−
x
i
,
(
j
=
2
,
3
,
⋯
,
n
+
1
)
x_j-x_i,(j=2,3,\cdots,n+1)
xj−xi,(j=2,3,⋯,n+1)为n个向量线性无关,否则将导致搜索过程在一个较低维的空间内进行,可能漏掉真正的极小值点。具体方法是:选取初始顶点
x
1
(
0
)
x_1^{(0)}
x1(0),从
x
1
(
0
)
x_1^{(0)}
x1(0)出发沿各坐标轴方向以步长h找其余n个顶点
x
j
(
0
)
=
x
1
(
0
)
+
h
e
i
,
e
i
=
[
0
,
⋯
,
0
,
1
,
0
,
⋯
,
0
]
T
,
(
j
=
i
+
1
,
,
i
=
1
,
2
,
⋯
,
n
)
x_j^{(0)}=x_1^{(0)}+he_i,e_i=[0,\cdots,0,1,0,\cdots,0]^T, (j=i+1,,i=1,2,\cdots,n)
xj(0)=x1(0)+hei,ei=[0,⋯,0,1,0,⋯,0]T,(j=i+1,,i=1,2,⋯,n)
式中,h为步长,一般取
h
=
0.5
15
h=0.5~15
h=0.5 15,初始取
h
=
1.6
1.7
h=1.6~1.7
h=1.6 1.7;
步2 计算各顶点的目标函数值,按大小排队,找出函数最大的最坏点
x
H
(
k
)
x_H^{(k)}
xH(k),函数值第二大的次坏点
x
G
(
k
)
x_G^{(k)}
xG(k),函数值最小的最好点
x
L
(
k
)
x_L^{(k)}
xL(k),k为迭代的轮数;
步3 求反射点
x
n
+
3
(
k
)
x_{n+3}^{(k)}
xn+3(k),求除最坏点
x
L
(
k
)
x_L^{(k)}
xL(k)外其余各点几何图形的形心
x
n
+
2
(
k
)
x_{n+2}^{(k)}
xn+2(k).
形心点:
x
n
+
2
(
k
)
=
1
n
(
∑
j
=
1
n
+
1
x
j
(
k
)
−
x
H
(
k
)
)
x_{n+2}^{(k)}=\frac{1}{n}(\sum_{j=1}^{n+1}x_j^{(k)}-x_H^{(k)})
xn+2(k)=n1(∑j=1n+1xj(k)−xH(k))
反射点:
x
n
+
3
(
k
)
=
2
x
n
+
2
(
k
)
−
x
H
(
k
)
x_{n+3}^{(k)}=2x_{n+2}^{(k)}-x_H^{(k)}
xn+3(k)=2xn+2(k)−xH(k)
步4 延伸(扩张):若
f
(
x
n
+
3
(
k
)
)
<
f
(
x
L
(
k
)
)
f(x_{n+3}^{(k)})<f(x_L^{(k)})
f(xn+3(k))<f(xL(k)),则进行延伸,延伸点为
x
n
+
4
(
k
)
=
x
n
+
2
(
k
)
+
γ
(
x
n
+
3
(
k
)
−
x
n
+
2
(
k
)
)
x_{n+4}^{(k)}=x_{n+2}^{(k)}+\gamma(x_{n+3}^{(k)}-x_{n+2}^{(k)})
xn+4(k)=xn+2(k)+γ(xn+3(k)−xn+2(k)),若
f
(
x
n
+
4
(
k
)
)
<
f
(
x
n
+
3
(
k
)
)
f(x_{n+4}^{(k)})<f(x_{n+3}^{(k)})
f(xn+4(k))<f(xn+3(k)),则用延伸点
x
n
+
4
(
k
)
x_{n+4}^{(k)}
xn+4(k)代替最坏点
x
H
(
k
)
x_{H}^{(k)}
xH(k),转步7;否则,用反射点
x
n
+
3
(
k
)
x_{n+3}^{(k)}
xn+3(k)代替最坏点
x
H
(
k
)
x_{H}^{(k)}
xH(k),转步7;
步5 若
f
(
x
n
+
3
(
k
)
)
>
f
(
x
L
(
k
)
)
f(x_{n+3}^{(k)})>f(x_L^{(k)})
f(xn+3(k))>f(xL(k)),即反射点比最好点差;
若
f
(
x
n
+
3
(
k
)
)
<
f
(
x
G
(
k
)
)
f(x_{n+3}^{(k)})<f(x_G^{(k)})
f(xn+3(k))<f(xG(k)),说明反射点比次坏点好,则用反射点
x
k
+
3
(
k
)
x_{k+3}^{(k)}
xk+3(k)代替最坏点
x
H
(
k
)
x_{H}^{(k)}
xH(k),转步7;
若
f
(
x
G
(
k
)
)
≤
f
(
x
n
+
3
(
k
)
)
<
f
(
x
H
(
k
)
)
f(x_G^{(k)})\le f(x_{n+3}^{(k)})\lt f(x_H^{(k)})
f(xG(k))≤f(xn+3(k))<f(xH(k)),用反射点
x
n
+
3
(
k
)
x_{n+3}^{(k)}
xn+3(k)
代替最坏点
x
H
(
k
)
x_{H}^{(k)}
xH(k)后进行压缩,求压缩点
x
n
+
5
(
k
)
x_{n+5}^{(k)}
xn+5(k);否则,直接求压缩点
x
n
+
5
(
k
)
x_{n+5}^{(k)}
xn+5(k),
x
n
+
5
(
k
)
=
x
n
+
2
(
k
)
+
β
(
x
H
(
k
)
−
x
n
+
2
(
k
)
)
x_{n+5}^{(k)}=x_{n+2}^{(k)}+\beta(x_{H}^{(k)}-x_{n+2}^{(k)})
xn+5(k)=xn+2(k)+β(xH(k)−xn+2(k))
步6 若
f
(
x
n
+
5
(
k
)
)
<
f
(
x
H
(
k
)
)
f(x_{n+5}^{(k)})<f(x_H^{(k)})
f(xn+5(k))<f(xH(k)),则用压缩点
x
n
+
5
(
k
)
x_{n+5}^{(k)}
xn+5(k)代替最坏点
x
H
(
k
)
x_{H}^{(k)}
xH(k),转步7;否则,将单纯形向最好点
x
L
(
k
)
x_L^{(k)}
xL(k)收缩
x
j
(
k
)
=
x
L
(
k
)
+
0.5
(
x
j
(
k
)
−
x
L
(
k
)
)
,
j
=
1
,
2
,
⋯
,
n
+
1
x_j^{(k)}=x_L^{(k)}+0.5(x_j^{(k)}-x_L^{(k)}), j=1,2,\cdots,n+1
xj(k)=xL(k)+0.5(xj(k)−xL(k)),j=1,2,⋯,n+1
步7 进行收敛性检验:
若
{
1
n
+
1
∑
j
=
1
n
+
1
[
f
(
x
j
(
k
)
)
−
f
(
x
n
+
2
(
k
)
)
]
2
}
1
2
≤
ϵ
\{\frac{1}{n+1}\sum_{j=1}^{n+1}[f(x_j^{(k)})-f(x_{n+2}^{(k)})]^2\}^{\frac{1}{2}}\le\epsilon
{n+11j=1∑n+1[f(xj(k))−f(xn+2(k))]2}21≤ϵ
则停止迭代,输出:
x
∗
=
x
L
(
k
)
,
f
(
x
∗
)
x^*=x_L^{(k)},f(x^*)
x∗=xL(k),f(x∗);否则
k
=
k
+
1
k=k+1
k=k+1,转步2.
四、单纯形法代码实现及举例
function [fmin, xmin] = nelderMead(func, X0, n, alpha, beta, gamma, epsilon)
X = zeros(n, n+1);
X(:, 1) = X0;
X(:, 2:end) = diag(ones(1, n)) * 1.6 + repmat(X(:, 1), 1, n);
fX = zeros(1, n+1);
for i = 1:n+1
fX(i) = feval(func, X(:, i));
end
[fX, pos] = sort(fX);
true = 1;
k = 1;
while true
XHOld = X(:, pos(n+1));
XLOld = X(:, pos(1));
XGOld = X(:, pos(n));
fXHOld = fX(n+1);
fXLOld = fX(1);
fXGOld = fX(n);
XCenter = (sum(X, 2) - XHOld) / n;
fXCenter = feval(func, XCenter);
XNewFirst = XCenter + alpha * (XCenter - XHOld);
fXNewFirst = feval(func, XNewFirst);
if fXNewFirst < fXLOld
XNewSecond = XCenter + gamma * (XNewFirst - XCenter);
fXNewSecond = feval(func, XNewSecond);
if fXNewSecond < fXNewFirst %fXLOld
XHOld = XNewSecond;
X(:, pos(n+1)) = XHOld;
fX(n+1) = fXNewSecond;
else
XHOld = XNewFirst;
X(:, pos(n+1)) = XHOld;
fX(n+1) = fXNewFirst;
end
else
if fXNewFirst > fXGOld
if fXNewFirst < fXHOld
XHOld = XNewFirst;
X(:, pos(n+1)) = XHOld;
fX(n+1) = fXNewFirst;
end
XNewThird = XCenter + beta * (XHOld - XCenter);
fXNewThird = feval(func, XNewThird);
if fXNewThird > fXHOld
% update all Xk
X = 0.5 * X + 0.5 * repmat(XLOld, 1, n+1);
% updata all fXk
for i = 1:n+1
fX(i) = feval(func, X(:, i));
end
else
XHOld = XNewThird;
X(:, pos(n+1)) = XHOld;
fX(n+1) = fXNewThird;
end
else
XHOld = XNewFirst;
X(:, pos(n+1)) = XHOld;
fX(n+1) = fXNewFirst;
end
end
fprintf('k = %d, fval = %f\n', k, feval(func, XLOld));
fXDiff = sqrt(mean((fX - fXCenter).^2));
if fXDiff < epsilon
break;
end
Xtemp = X;
for i = 1:n+1
X(:, i) = Xtemp(:, pos(i));
end
[fX, pos] = sort(fX);
k = k + 1;
end
xmin = XLOld;
fmin = feval(func, xmin);
end
测试样例1
min
f
(
x
1
,
x
2
)
=
x
1
2
+
2
x
2
2
−
4
x
1
−
2
x
1
x
2
,
x
0
=
[
1
,
1
]
T
\min f(x1, x_2)=x_1^2+2x_2^2-4x_1-2x_1x_2, x_0 = [1, 1]^T
minf(x1,x2)=x12+2x22−4x1−2x1x2,x0=[1,1]T
function f = nelderMeadTestFunc1(x)
f = x(1)^2 + 2 * x(2)^2 - 4 * x(1) - 2 * x(1)*x(2);
end
测试样例2
min
f
(
x
1
,
x
2
)
=
3
x
1
2
+
x
2
2
−
12
x
1
−
6
x
2
+
21
,
x
0
=
[
3
,
4
]
T
\min f(x_1, x_2) = 3x_1^2+x_2^2-12x_1-6x_2+21, x_0=[3, 4]^T
minf(x1,x2)=3x12+x22−12x1−6x2+21,x0=[3,4]T
function f = nelderMeadTestFunc2(x)
f = 3*x(1)^2 + x(2)^2 - 12 * x(1) - 6*x(2) + 21;
end
测试样例1与测试样例2一起进行计算脚本
% exercise (1)
epsilon = 0.00001;
X0 = [1, 1]';
n = 2;
alpha = 1.0;
beta = 0.5;
gamma = 2;
[fmin, xmin] = nelderMead('nelderMeadTestFunc1', X0, n, alpha, beta, gamma, epsilon);
[x, fval] = fminsearch('nelderMeadTestFunc1', X0);
fprintf('fmin = %f, xmin = (%f, %f), MATLAB build-in search: fmin = %f, xmin = (%f, %f)\n', ...
fmin, xmin(1), xmin(2), fval, x(1), x(2));
% exercise (2)
epsilon = 0.00001;
X0 = [3, 4]';
n = 2;
alpha = 1.0;
beta = 0.5;
gamma = 2;
[fmin, xmin] = nelderMead('nelderMeadTestFunc2', X0, n, alpha, beta, gamma, epsilon);
[x, fval] = fminsearch('nelderMeadTestFunc2', X0);
fprintf('fmin = %f, xmin = (%f, %f), MATLAB build-in search: fmin = %f, xmin = (%f, %f)\n', ...
fmin, xmin(1), xmin(2), fval, x(1), x(2));
计算结果为
fmin = -7.999991, xmin = (3.996602, 1.999585), MATLAB build-in search: fmin = -8.000000, xmin = (3.999976, 1.999973)
fmin = 0.000002, xmin = (2.000547, 2.998811), MATLAB build-in search: fmin = 0.000000, xmin = (1.999996, 2.999965)
从计算结果和matlab内置无约束搜索结果对比来看单纯形法计算结果正确,并符合精度要求