1. Constrained optimization problems
1.1 General form of a constrained optimization problem
The general form of a constrained optimization problem is
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \rm{s.t.} & h_i(\boldsymbol{x}) = 0 & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0 & j = l+1, l+2, \cdots, m \end{matrix}$$
Here $f(\boldsymbol{x})$ is called the objective function, $h_i(\boldsymbol{x}) = 0$ the equality constraints, and $h_j(\boldsymbol{x}) \le 0$ the inequality constraints.
The set
$$\varOmega = \left\lbrace \boldsymbol{x} \mid h_i(\boldsymbol{x}) = 0, \; h_j(\boldsymbol{x}) \le 0, \; i = 1, 2, \cdots, l, \; j = l+1, l+2, \cdots, m \right\rbrace$$
is called the feasible region.
1.2 Feasible directions and feasible descent directions
Let $\boldsymbol{d}$ be a nonzero vector and $\boldsymbol{x} \in \varOmega$. If there exists $k > 0$ such that $\boldsymbol{x} + \alpha \boldsymbol{d} \in \varOmega$ for all $\alpha \in (0, k)$, then $\boldsymbol{d}$ is called a feasible direction at $\boldsymbol{x}$. If, in addition, $f(\boldsymbol{x} + \alpha \boldsymbol{d}) < f(\boldsymbol{x})$ for all such $\alpha$, then $\boldsymbol{d}$ is called a feasible descent direction (or improving feasible direction) at $\boldsymbol{x}$.
1.3 Active index set
For a point $\boldsymbol{x} \in \varOmega$, the set $A(\boldsymbol{x}) = \left\lbrace i \mid h_i(\boldsymbol{x}) = 0 \right\rbrace$ is called the active index set at $\boldsymbol{x}$. Intuitively, it consists of the indices of all equality constraints together with the indices of those inequality constraints that hold with equality at $\boldsymbol{x}$.
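The definitions of the feasible region and the active index set can be turned into a small check. The sketch below is only an illustration: the helper names `is_feasible` and `active_set` and the tolerance `tol` are assumptions, and the sample data are the three inequality constraints of Example 2, rewritten in the form $h_j(\boldsymbol{x}) \le 0$.

```python
def is_feasible(x, eqs, ineqs, tol=1e-9):
    """True if every h_i(x) = 0 and every h_j(x) <= 0 holds up to tol."""
    return (all(abs(h(x)) <= tol for h in eqs)
            and all(h(x) <= tol for h in ineqs))


def active_set(x, eqs, ineqs, tol=1e-9):
    """1-based indices: all equality constraints plus the tight inequalities."""
    l = len(eqs)
    active = set(range(1, l + 1))
    active |= {l + j for j, h in enumerate(ineqs, start=1) if abs(h(x)) <= tol}
    return active


# Constraints of Example 2 (no equalities), each written as h_j(x) <= 0.
ineqs = [
    lambda x: x[0] + x[1] - 4,              # x1 + x2 <= 4
    lambda x: -x[1] - 7,                    # x2 + 7 >= 0
    lambda x: (x[0] - 3) ** 2 - x[1] - 1,   # (x1 - 3)^2 <= 1 + x2
]
x0 = (1.0, 3.0)
feasible = is_feasible(x0, [], ineqs)
active = active_set(x0, [], ineqs)
print(feasible, sorted(active))
```

A tolerance is used because, in floating-point arithmetic, a constraint that is tight in exact arithmetic rarely evaluates to exactly zero.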
2. KKT conditions
Suppose that, at a point $\boldsymbol{x}$, the general constrained optimization problem satisfies the following condition: the vectors $\nabla h_i(\boldsymbol{x})$, $i \in A(\boldsymbol{x})$, are linearly independent. The Lagrangian of the problem is
$$L(\boldsymbol{x}, \boldsymbol{\lambda}) = f(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i h_i(\boldsymbol{x})$$
and the KKT conditions at $\boldsymbol{x}$ are
$$\begin{cases} \nabla_{\boldsymbol{x}} L(\boldsymbol{x}, \boldsymbol{\lambda}) = \boldsymbol{0} \\ h_i(\boldsymbol{x}) = 0, \quad i = 1, 2, \cdots, l \\ h_j(\boldsymbol{x}) \le 0, \quad j = l + 1, l + 2, \cdots, m \\ \lambda_j h_j(\boldsymbol{x}) = 0, \quad j = l + 1, l + 2, \cdots, m \\ \lambda_j \ge 0, \quad j = l+1, l+2, \cdots, m \end{cases}$$
If $\boldsymbol{x}$ satisfies the KKT conditions for some $\boldsymbol{\lambda}$, then $\boldsymbol{x}$ is called a KKT point and the corresponding $(\boldsymbol{x}, \boldsymbol{\lambda})$ a KKT pair.
[Example 1] Find all KKT points of the problem
$$\begin{matrix} \min & x_1x_2 \\ \rm{s.t.} & x_1^2 + x_2^2 = 1 \end{matrix}$$
[Solution] Form the Lagrangian
$$L(x_1, x_2, \lambda) = x_1x_2 + \lambda x_1^2 + \lambda x_2^2 - \lambda$$
The KKT conditions are
$$\begin{cases} x_2 + 2\lambda x_1 = 0 \\ x_1 + 2\lambda x_2 = 0 \\ x_1^2 + x_2^2 = 1 \end{cases}$$
Solving gives $x_1 = x_2 = \pm \dfrac{\sqrt2}{2}, \lambda = -\dfrac{1}{2}$ or $x_1 = -x_2 = \pm \dfrac{\sqrt2}{2}, \lambda = \dfrac{1}{2}$. Hence the KKT points are $\left( \dfrac{\sqrt2}{2}, \dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( -\dfrac{\sqrt2}{2}, \dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( \dfrac{\sqrt2}{2}, -\dfrac{\sqrt2}{2} \right)^{\rm T}$, and $\left( -\dfrac{\sqrt2}{2}, -\dfrac{\sqrt2}{2} \right)^{\rm T}$.
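The four candidate pairs of Example 1 can be checked numerically. This is a minimal sketch, assuming the stationarity equations $\partial L / \partial x_1 = x_2 + 2\lambda x_1$ and $\partial L / \partial x_2 = x_1 + 2\lambda x_2$; the function name `grad_lagrangian` and the tolerance are illustrative choices.

```python
import math


def grad_lagrangian(x1, x2, lam):
    """Gradient of L = x1*x2 + lam*(x1^2 + x2^2 - 1) with respect to (x1, x2)."""
    return (x2 + 2 * lam * x1, x1 + 2 * lam * x2)


s = math.sqrt(2) / 2
candidates = [
    (s, s, -0.5), (-s, -s, -0.5),   # x1 = x2,  lambda = -1/2
    (s, -s, 0.5), (-s, s, 0.5),     # x1 = -x2, lambda = +1/2
]

residuals = []
for x1, x2, lam in candidates:
    g1, g2 = grad_lagrangian(x1, x2, lam)
    # Largest violation of stationarity and of the constraint x1^2 + x2^2 = 1.
    residuals.append(max(abs(g1), abs(g2), abs(x1 * x1 + x2 * x2 - 1)))
    print((x1, x2), "objective =", x1 * x2)

kkt_ok = all(r < 1e-12 for r in residuals)
```

Printing the objective at each point also shows which KKT points are minimizers: the two points with $x_1 = -x_2$ give the smaller value of $x_1 x_2$.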
[Example 2] Determine whether the point $\boldsymbol{x}_0 = (1, 3)^{\rm T}$ is a KKT point of the problem
$$\begin{matrix} \min & 4x_1 - 3x_2 \\ \rm{s.t.} & x_1 + x_2 \le 4 \\ & x_2 + 7 \ge 0 \\ & (x_1 - 3)^2 \le 1 + x_2 \end{matrix}$$
[Solution] At $\boldsymbol{x}_0$ the first and third constraints hold with equality while the second is strict, so the active index set is $A(\boldsymbol{x}_0) = \lbrace 1, 3 \rbrace$ and complementary slackness gives $\lambda_2 = 0$. Form the Lagrangian
$$L(x_1, x_2, \lambda_1, \lambda_3) = 4x_1 - 3x_2 + \lambda_1(x_1 + x_2 - 4) + \lambda_3\left[(x_1-3)^2 - x_2 - 1\right]$$
The KKT conditions reduce to
$$\begin{cases} 4 + \lambda_1 + 2\lambda_3x_1 - 6\lambda_3 = 0 \\ -3 + \lambda_1 - \lambda_3 = 0 \\ \lambda_1 \ge 0, \lambda_3 \ge 0 \end{cases}$$
Substituting $x_1 = 1$ and $x_2 = 3$ gives
$$\begin{cases} \lambda_1 - 4\lambda_3 = -4 \\ \lambda_1 - \lambda_3 = 3 \\ \lambda_1 \ge 0, \lambda_3 \ge 0 \end{cases}$$
This system has the solution $\lambda_1 = \dfrac{16}{3} \ge 0$, $\lambda_3 = \dfrac{7}{3} \ge 0$, so $\boldsymbol{x}_0$ is a KKT point.
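The multiplier computation in Example 2 is a 2×2 linear solve followed by a sign check. The sketch below redoes it in exact rational arithmetic; the variable names and the use of Cramer's rule are illustrative assumptions, not part of the original solution.

```python
from fractions import Fraction as F

x1, x2 = F(1), F(3)

# Gradients at x0 of the objective and of the two active constraints.
grad_f = (F(4), F(-3))                  # f  = 4*x1 - 3*x2
grad_h1 = (F(1), F(1))                  # h1 = x1 + x2 - 4
grad_h3 = (2 * (x1 - 3), F(-1))         # h3 = (x1 - 3)^2 - x2 - 1

# Stationarity: lam1*grad_h1 + lam3*grad_h3 = -grad_f, a 2x2 linear system.
a11, a12 = grad_h1[0], grad_h3[0]
a21, a22 = grad_h1[1], grad_h3[1]
r1, r2 = -grad_f[0], -grad_f[1]

det = a11 * a22 - a12 * a21             # nonzero: the active gradients are independent
lam1 = (r1 * a22 - a12 * r2) / det      # Cramer's rule
lam3 = (a11 * r2 - a21 * r1) / det

is_kkt = lam1 >= 0 and lam3 >= 0        # dual feasibility for the active inequalities
print(lam1, lam3, is_kkt)
```

If either multiplier had come out negative, $\boldsymbol{x}_0$ would fail the KKT conditions even though the stationarity equations are solvable.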
3. Quadratic programming
3.1 General form of a quadratic program
A constrained optimization problem whose objective is a quadratic function and whose constraints are all linear is called a quadratic program. Its general form is
$$\begin{matrix} \min & \dfrac{1}{2}\boldsymbol{x}^{\rm T}\boldsymbol{G}\boldsymbol{x} + \boldsymbol{c}^{\rm T} \boldsymbol{x}\\ \rm{s.t.} & \boldsymbol{a}_i^{\rm T} \boldsymbol{x} = b_i & i = 1, 2, \cdots, l \\ & \boldsymbol{a}_j^{\rm T} \boldsymbol{x} \le b_j & j = l+1, l+2, \cdots, m \end{matrix}$$
3.2 Equality-constrained quadratic programs
If the quadratic program has no inequality constraints, it reduces to
$$\begin{matrix} \min & \dfrac{1}{2}\boldsymbol{x}^{\rm T}\boldsymbol{G}\boldsymbol{x} + \boldsymbol{c}^{\rm T} \boldsymbol{x}\\ \rm{s.t.} & \boldsymbol{A} \boldsymbol{x} = \boldsymbol{b} \end{matrix}$$
If $\boldsymbol{G}$ is positive semidefinite and the rows of $\boldsymbol{A}$ are linearly independent, then the KKT points of the problem coincide with its optimal solutions. It therefore suffices to solve the linear system
$$\begin{bmatrix} \boldsymbol{G} & \boldsymbol{A}^{\rm T} \\ \boldsymbol{A} & \boldsymbol{O} \end{bmatrix} \begin{bmatrix} \boldsymbol{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} -\boldsymbol{c} \\ \boldsymbol{b} \end{bmatrix}$$
to obtain an optimal solution. (The system has a unique solution when, in addition, $\boldsymbol{G}$ is positive definite on the null space of $\boldsymbol{A}$; this holds in particular when $\boldsymbol{G}$ is positive definite.)
[Example 3] Solve the quadratic program
$$\begin{matrix} \min & x_1^2 + x_2^2 + x_3^2 - x_1x_2 - x_2x_3 + 2x_1 - x_2\\ \rm{s.t.} & 3x_1 - x_2 - x_3 = 0 \\ & 2x_1 - x_2 - x_3 = 0 \end{matrix}$$
[Solution] Write the problem in matrix form:
$$\begin{matrix} \min & \dfrac{1}{2} [x_1, x_2, x_3] \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + [2, -1, 0] \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \\ \\ \rm{s.t.} & \begin{bmatrix} 3 & -1 & -1 \\ 2 & -1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{matrix}$$
The matrix $\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$ is positive definite. Solving the linear system
$$\begin{bmatrix} 2 & -1 & 0 & 3 & 2 \\ -1 & 2 & -1 & -1 & -1 \\ 0 & -1 & 2 & -1 & -1 \\ 3 & -1 & -1 & 0 & 0 \\ 2 & -1 & -1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
gives $[x_1, x_2, x_3, \lambda_1, \lambda_2] = \left[ 0, \dfrac{1}{6}, -\dfrac{1}{6}, -\dfrac{5}{6}, \dfrac{1}{3} \right]$. (Subtracting the two constraints gives $x_1 = 0$, and then $x_3 = -x_2$.) Hence the optimal solution is $[x_1, x_2, x_3] = \left[ 0, \dfrac{1}{6}, -\dfrac{1}{6} \right]$ and the optimal value is $-\dfrac{1}{12}$.
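The KKT system of Example 3 is small enough to solve exactly. The following is a sketch under stated assumptions: `solve` is a hypothetical helper implementing plain Gauss-Jordan elimination over `fractions.Fraction`, chosen so that the result is free of rounding error; any linear solver would do.

```python
from fractions import Fraction as F


def solve(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination with row swaps."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


G = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
c = [2, -1, 0]
A = [[3, -1, -1], [2, -1, -1]]
b = [0, 0]

n, m = len(G), len(A)
# Assemble the block system [[G, A^T], [A, 0]] [x; lam] = [-c; b].
K = [G[i] + [A[k][i] for k in range(m)] for i in range(n)]
K += [A[k] + [0] * m for k in range(m)]
z = solve(K, [-v for v in c] + b)
x, lam = z[:n], z[n:]

# Objective value 1/2 x^T G x + c^T x at the solution.
value = (sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n)) / 2
         + sum(ci * xi for ci, xi in zip(c, x)))
print(x, lam, value)
```

The solver assumes the block matrix is nonsingular; here that follows from $\boldsymbol{G}$ being positive definite and the two rows of $\boldsymbol{A}$ being linearly independent.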
3.3 Active-set method
For a quadratic program in general form with $\boldsymbol{G}$ positive definite, the following algorithm yields the optimal solution:
- Choose an initial feasible point $\boldsymbol{x}$
- Initialize the index set $I \gets A(\boldsymbol{x})$
- $\mathbf{while} \; \mathrm{True} \; \mathbf{do}$
- $\qquad$ Solve the following equality-constrained quadratic program for $\boldsymbol{d}$ and $\boldsymbol{\lambda}$: $\begin{matrix} \underset{\boldsymbol{d}}{\min} & \dfrac{1}{2} \boldsymbol{d}^{\rm T} \boldsymbol{G} \boldsymbol{d} + (\boldsymbol{G} \boldsymbol{x} + \boldsymbol{c})^{\rm T} \boldsymbol{d} \\ \rm{s.t.} & \boldsymbol{a}_i^{\rm T} \boldsymbol{d} = 0, \; i \in I \end{matrix}$
- $\qquad \mathbf{if} \; \boldsymbol{d} = \boldsymbol{0} \; \mathbf{do}$
- $\qquad \qquad \mathbf{if} \; \boldsymbol{\lambda} \ge \boldsymbol{0} \; \mathbf{do}$
- $\qquad \qquad \qquad \mathbf{return} \; \boldsymbol{x}$
- $\qquad \qquad \mathbf{else}$
- $\qquad \qquad \qquad I \gets I \setminus \left\lbrace \underset{i \in I}{\argmin} \, \lambda_i \right\rbrace$
- $\qquad \qquad \mathbf{end}$
- $\qquad \mathbf{else}$
- $\qquad \qquad \alpha \gets \underset{i \notin I}{\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \mid \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace$
- $\qquad \qquad \mathbf{if} \; \alpha < 1 \; \mathbf{do}$
- $\qquad \qquad \qquad i \gets \underset{i \notin I}{\argmin} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \mid \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace$
- $\qquad \qquad \qquad I \gets I \cup \lbrace i \rbrace$
- $\qquad \qquad \mathbf{else}$
- $\qquad \qquad \qquad \alpha \gets 1$
- $\qquad \qquad \mathbf{end}$
- $\qquad \qquad \boldsymbol{x} \gets \boldsymbol{x} + \alpha \boldsymbol{d}$
- $\qquad \mathbf{end}$
- $\mathbf{end}$
This algorithm is called the active-set method. The test $\boldsymbol{\lambda} \ge \boldsymbol{0}$ and the $\argmin$ range only over the inequality constraints in $I$: multipliers of equality constraints are unrestricted in sign, and equality constraints are never removed from $I$. A minimum over an empty set is taken to be $+\infty$, so that $\alpha \gets 1$.
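The steps of the active-set method can be sketched in code as follows. This is an illustrative implementation, not a robust solver: the names `solve`, `dot`, and `active_set_qp`, the 0-based indexing, the `n_eq` and `max_iter` parameters, and the exact rational arithmetic are all assumptions made here, and the sketch further assumes that the constraints in the working set stay linearly independent. The sample data are those of Example 4.

```python
from fractions import Fraction as F


def solve(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination with row swaps."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def active_set_qp(G, c, a, b, x, n_eq=0, max_iter=50):
    """Minimize 1/2 x^T G x + c^T x; rows i < n_eq are equalities, the rest a[i].x <= b[i].

    Indices are 0-based; x must be feasible and G positive definite.
    """
    n = len(x)
    x = [F(v) for v in x]
    work = [i for i in range(len(a)) if dot(a[i], x) == b[i]]   # I <- A(x)
    for _ in range(max_iter):
        g = [dot(G[i], x) + c[i] for i in range(n)]             # G x + c
        k = len(work)
        # KKT system of the equality-constrained subproblem in d.
        K = [list(G[i]) + [a[j][i] for j in work] for i in range(n)]
        K += [list(a[j]) + [0] * k for j in work]
        z = solve(K, [-v for v in g] + [0] * k)
        d, lam = z[:n], dict(zip(work, z[n:]))
        if all(v == 0 for v in d):
            ineq = [j for j in work if j >= n_eq]
            if all(lam[j] >= 0 for j in ineq):
                return x, lam
            work.remove(min(ineq, key=lambda j: lam[j]))        # drop most negative
        else:
            ratios = [((b[i] - dot(a[i], x)) / dot(a[i], d), i)
                      for i in range(len(a))
                      if i not in work and dot(a[i], d) > 0]
            alpha, blocking = min(ratios, default=(F(1), None))
            if alpha < 1:
                work.append(blocking)                           # blocking constraint
            else:
                alpha = F(1)
            x = [xi + alpha * di for xi, di in zip(x, d)]
    raise RuntimeError("no convergence within max_iter")


# Data of Example 4: min (x1-1)^2 + (x2-2)^2, x1 + x2 <= 1, x1 >= 0, x2 >= 0.
G = [[2, 0], [0, 2]]
c = [-2, -4]
a = [[1, 1], [-1, 0], [0, -1]]
b = [1, 0, 0]
x_opt, lam_opt = active_set_qp(G, c, a, b, x=[0, 0])
print(x_opt, lam_opt)
```

Exact fractions make the tests `d == 0` and `a[i].x == b[i]` reliable; a floating-point version would need tolerances in both places.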
[Example 4] Solve the quadratic program
$$\begin{matrix} \min & (x_1 - 1)^2 + (x_2 - 2)^2 \\ \rm{s.t.} & x_1 + x_2 \le 1 \\ & x_1, x_2 \ge 0 \end{matrix}$$
[Solution] Expanding the objective as $x_1^2 + x_2^2 - 2x_1 - 4x_2 + 5$ (the constant $5$ does not affect the minimizer), the coefficients of the problem are
$$\boldsymbol{G} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \boldsymbol{c} = \begin{bmatrix} -2 \\ -4 \end{bmatrix}, \boldsymbol{a}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \boldsymbol{a}_2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \boldsymbol{a}_3 = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \boldsymbol{b} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$
Clearly $\boldsymbol{x}_0 = [0, 0]^{\rm T}$ is feasible. Initialize $I_0 = A(0, 0) = \lbrace 2, 3 \rbrace$.
Iteration 1:
Solve the quadratic program
$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 4d_2 \\ \rm{s.t.} & -d_1 = 0 \\ & -d_2 = 0 \end{matrix}$$
This gives $\boldsymbol{d} = [0, 0]^{\rm T}$ and $\boldsymbol{\lambda} = [0, -2, -4]^{\rm T}$, which has negative components. Since $\argmin \lambda_i = 3$, update $I_1 = \lbrace 2 \rbrace$; the iterate is unchanged, $\boldsymbol{x}_1 = \boldsymbol{x}_0$.
Iteration 2:
Solve the quadratic program
$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 4d_2 \\ \rm{s.t.} & -d_1 = 0 \end{matrix}$$
This gives $\boldsymbol{d} = [0, 2]^{\rm T} \ne \boldsymbol{0}$, so compute the step length at the current point $\boldsymbol{x}_1 = \boldsymbol{x}_0 = [0, 0]^{\rm T}$ (iteration 1 did not move the point):
$$\alpha = \underset{i \in \lbrace 1, 3 \rbrace}{\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}_1}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \mid \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace = \dfrac{1}{2} < 1$$
Only $i = 1$ qualifies, since $\boldsymbol{a}_1^{\rm T} \boldsymbol{d} = 2 > 0$ while $\boldsymbol{a}_3^{\rm T} \boldsymbol{d} = -2 < 0$. Thus $i = 1$; update $I_2 = \lbrace 1, 2 \rbrace$ and $\boldsymbol{x}_2 = \boldsymbol{x}_1 + \alpha \boldsymbol{d} = [0, 1]^{\rm T}$.
Iteration 3:
At $\boldsymbol{x}_2 = [0, 1]^{\rm T}$ the linear term is $\boldsymbol{G}\boldsymbol{x}_2 + \boldsymbol{c} = [-2, -2]^{\rm T}$, so solve the quadratic program
$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 2d_2 \\ \rm{s.t.} & d_1 + d_2 = 0 \\ & -d_1 = 0 \end{matrix}$$
This gives $\boldsymbol{d} = [0, 0]^{\rm T}$ and $\boldsymbol{\lambda} = [2, 0, 0]^{\rm T} \ge \boldsymbol{0}$, so the iteration terminates.
Hence the optimal solution is $[0, 1]^{\rm T}$ and the optimal value is $(0 - 1)^2 + (1 - 2)^2 = 2$.
4. Penalty function and barrier function methods
4.1 Penalty function method
For a constrained optimization problem in general form, the penalty function method adds a penalty term to the objective, turning the problem into an unconstrained one that can be solved with unconstrained optimization methods. For the general problem
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \rm{s.t.} & h_i(\boldsymbol{x}) = 0 & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0 & j = l+1, l+2, \cdots, m \end{matrix}$$
the method proceeds as follows:
- Choose $\rho_1 > 0$, a tolerance $\varepsilon > 0$, and an initial point $\boldsymbol{x}_0$, and set the iteration counter $k = 1$
- At iteration $k$, solve the unconstrained problem $$\min P(\boldsymbol{x}, \rho_k) = f(\boldsymbol{x}) + \rho_k \left[ \sum_{i=1}^{l}h_i^2(\boldsymbol{x}) + \sum_{j=l+1}^{m} \left( \max \left\lbrace 0, h_j(\boldsymbol{x}) \right\rbrace \right)^2 \right]$$ and denote its minimizer by $\boldsymbol{x}_k$
- If the penalty term satisfies $$\rho_k \left[ \sum_{i=1}^{l}h_i^2(\boldsymbol{x}_k) + \sum_{j=l+1}^{m} \left( \max \left\lbrace 0, h_j(\boldsymbol{x}_k) \right\rbrace \right)^2 \right] \le \varepsilon$$ then stop and take $\boldsymbol{x}_k$ as the solution; otherwise choose $\rho_{k+1} > \rho_k$ and continue
In hand calculations it usually suffices to solve the unconstrained problem once for a general $\rho$ and then let $\rho \to +\infty$ to obtain the optimal solution.
[Example 5] Use the penalty function method to solve
$$\begin{matrix} \min & x_1^2 + x_2^2 \\ \rm{s.t.} & x_1 - 1 \ge 0 \\ & x_1 + x_2 = 3 \end{matrix}$$
[Solution] Let
$$\begin{aligned} P(x_1, x_2, \rho) & = x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2 + \rho \left( \max \lbrace 0, 1 - x_1 \rbrace \right)^2 \\ & = \begin{cases} x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2, & x_1 > 1 \\ x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2 + \rho(1 - x_1)^2, & x_1 \le 1 \end{cases} \end{aligned}$$
Then
$$\dfrac{\partial P}{\partial x_1} = \begin{cases} 2x_1 + 2\rho ( x_1 + x_2 - 3), & x_1 > 1 \\ 2x_1 + 2\rho ( x_1 + x_2 - 3) - 2\rho(1 - x_1), & x_1 \le 1 \end{cases} \qquad \dfrac{\partial P}{\partial x_2} = 2x_2 + 2\rho ( x_1 + x_2 - 3)$$
Setting $\dfrac{\partial P}{\partial x_1} = \dfrac{\partial P}{\partial x_2} = 0$ gives
$$\boldsymbol{x} = \begin{cases} \left( \dfrac{3\rho}{2\rho + 1}, \dfrac{3\rho}{2\rho + 1} \right)^{\rm T}, & \rho > 1 \\ \dfrac{\rho}{\rho^2 + 3\rho + 1} \left( \rho + 4, 2\rho + 3 \right)^{\rm T}, & 0 < \rho \le 1 \end{cases}$$
(The first branch satisfies $x_1 > 1$ exactly when $\rho > 1$.) Hence the optimal solution of the original problem is
$$\underset{\rho \to +\infty}{\lim} \boldsymbol{x} = \underset{\rho \to +\infty}{\lim} \left( \dfrac{3\rho}{2\rho + 1}, \dfrac{3\rho}{2\rho + 1} \right)^{\rm T} = \left( \dfrac{3}{2}, \dfrac{3}{2} \right)^{\rm T}$$
and the optimal value is $\dfrac{9}{2}$.
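The outer loop of the penalty method can be traced on Example 5. The sketch below is an illustration under explicit assumptions: instead of calling a numerical unconstrained minimizer, the inner step uses the closed-form stationary point derived in the example, and the starting value $\rho_1 = 1/2$, the growth factor 10, and $\varepsilon = 10^{-3}$ are arbitrary choices.

```python
from fractions import Fraction as F


def minimizer(rho):
    """Closed-form stationary point of P(x, rho) for Example 5."""
    rho = F(rho)
    if rho > 1:
        t = 3 * rho / (2 * rho + 1)
        return (t, t)
    s = rho / (rho ** 2 + 3 * rho + 1)
    return (s * (rho + 4), s * (2 * rho + 3))


def grad_P(x, rho):
    """Gradient of P = x1^2 + x2^2 + rho*(x1+x2-3)^2 + rho*max(0, 1-x1)^2."""
    x1, x2 = x
    common = 2 * rho * (x1 + x2 - 3)
    return (2 * x1 + common - 2 * rho * max(0, 1 - x1), 2 * x2 + common)


def penalty_term(x, rho):
    x1, x2 = x
    return rho * ((x1 + x2 - 3) ** 2 + max(0, 1 - x1) ** 2)


history = []
stationary = True
rho, eps = F(1, 2), F(1, 1000)
while True:
    x = minimizer(rho)
    stationary = stationary and grad_P(x, rho) == (0, 0)   # exact check
    term = penalty_term(x, rho)
    history.append((rho, x, term))
    if term <= eps:          # stopping test on the penalty term
        break
    rho *= 10                # rho_{k+1} > rho_k

print(len(history), float(x[0]), float(x[1]))
```

The iterates approach $(3/2, 3/2)^{\rm T}$ from outside the feasible region: for every finite $\rho$ the equality constraint $x_1 + x_2 = 3$ is slightly violated, which is typical of exterior penalty methods.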
4.2 Barrier function method
For a constrained optimization problem with inequality constraints only,
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \rm{s.t.} & h_i(\boldsymbol{x}) \le 0 & i = 1, 2, \cdots, m \end{matrix}$$
the barrier function method converts it into an unconstrained problem by constructing a barrier function $b(\boldsymbol{x})$. Two common choices are
$$b_1(\boldsymbol{x}) = -\sum_{i=1}^{m}\dfrac{1}{h_i(\boldsymbol{x})}, \qquad b_2(\boldsymbol{x}) = -\sum_{i=1}^{m}\ln \left[-h_i(\boldsymbol{x}) \right]$$
where $b_1(\boldsymbol{x})$ is called the inverse (reciprocal) barrier function and $b_2(\boldsymbol{x})$ the logarithmic barrier function. The method proceeds as follows:
- Choose $r_1 > 0$, a tolerance $\varepsilon > 0$, and an initial point $\boldsymbol{x}_0$ in the interior of the feasible region (so that $h_i(\boldsymbol{x}_0) < 0$ for all $i$), and set the iteration counter $k = 1$
- At iteration $k$, solve the unconstrained problem $$\min B(\boldsymbol{x}, r_k) = f(\boldsymbol{x}) + r_kb(\boldsymbol{x})$$ and denote its minimizer by $\boldsymbol{x}_k$
- If the barrier term satisfies $$r_kb(\boldsymbol{x}_k) \le \varepsilon$$ then stop and take $\boldsymbol{x}_k$ as the solution; otherwise choose $r_{k+1} \in (0, r_k)$ and continue
In hand calculations it usually suffices to solve the unconstrained problem once for a general $r$ and then let $r \to 0^+$ to obtain the optimal solution.
[Example 6] Use the barrier function method to solve
$$\begin{matrix} \min & x_1^2 + x_2^2\\ \rm{s.t.} & x_1 - x_2 + 1 \le 0 \end{matrix}$$
[Solution] Let
$$B(x_1, x_2, r) = x_1^2 + x_2^2 - r\ln(x_2 - x_1 - 1)$$
Then
$$\dfrac{\partial B}{\partial x_1} = 2x_1 + \dfrac{r}{x_2 - x_1 - 1}, \qquad \dfrac{\partial B}{\partial x_2} = 2x_2 - \dfrac{r}{x_2 - x_1 - 1}$$
Setting $\dfrac{\partial B}{\partial x_1} = \dfrac{\partial B}{\partial x_2} = 0$ and adding the two equations gives $x_2 = -x_1$, hence $4x_1^2 + 2x_1 - r = 0$. Taking the root with $x_2 - x_1 - 1 > 0$ gives
$$\boldsymbol{x} = \left( -\dfrac{1+\sqrt{1 + 4r}}{4}, \dfrac{1+\sqrt{1 + 4r}}{4} \right)^{\rm T}$$
Hence the optimal solution of the original problem is
$$\underset{r \to 0^+}{\lim} \boldsymbol{x} = \underset{r \to 0^+}{\lim} \left( -\dfrac{1+\sqrt{1 + 4r}}{4}, \dfrac{1+\sqrt{1 + 4r}}{4} \right)^{\rm T} = \left( -\dfrac{1}{2}, \dfrac{1}{2} \right)^{\rm T}$$
and the optimal value is $\dfrac{1}{2}$.
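The barrier path of Example 6 can be checked numerically. This is a minimal sketch, assuming the stationary point $t = \left(1 + \sqrt{1 + 4r}\right)/4$ with $\boldsymbol{x} = (-t, t)^{\rm T}$, which follows from setting the gradient of $B$ to zero with $x_2 = -x_1$; the schedule $r = 10^{-k}$ is an arbitrary choice.

```python
import math


def minimizer(r):
    """Stationary point of B(x, r) = x1^2 + x2^2 - r*ln(x2 - x1 - 1)."""
    t = (1 + math.sqrt(1 + 4 * r)) / 4
    return (-t, t)


def grad_B(x, r):
    x1, x2 = x
    slack = x2 - x1 - 1              # stays > 0 strictly inside the feasible region
    return (2 * x1 + r / slack, 2 * x2 - r / slack)


path = []
for k in range(7):                   # r = 1, 0.1, ..., 1e-6, decreasing toward 0+
    r = 10.0 ** (-k)
    x = minimizer(r)
    g = grad_B(x, r)
    path.append((r, x, max(abs(g[0]), abs(g[1]))))

x_final = path[-1][1]
max_residual = max(res for _, _, res in path)
print(x_final, max_residual)
```

Every point on the path is strictly feasible (the slack is positive), in contrast to the penalty method, whose iterates approach the solution from outside the feasible region.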
4.3 Mixed penalty function method
The mixed penalty function method combines a penalty function and a barrier function. Its objective is
$$F(\boldsymbol{x}, r) = f(\boldsymbol{x}) + rb(\boldsymbol{x}) + \dfrac{p(\boldsymbol{x})}{r}$$
where $b(\boldsymbol{x})$ is the barrier function and $p(\boldsymbol{x})$ is the penalty function.
5. Augmented Lagrangian method
For the general constrained optimization problem
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \rm{s.t.} & h_i(\boldsymbol{x}) = 0 & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0 & j = l+1, l+2, \cdots, m \end{matrix}$$
define the augmented Lagrangian
$$L_\sigma(\boldsymbol{x}, \boldsymbol{\lambda}) = f(\boldsymbol{x}) + \sum_{i=1}^{l}\lambda_i h_i(\boldsymbol{x}) + \dfrac{\sigma}{2}\sum_{i=1}^{l}h_i^2(\boldsymbol{x}) + \dfrac{1}{2\sigma} \sum_{j=l+1}^{m} \left\lbrace \left[ \max \left\lbrace 0, \lambda_j + \sigma h_j(\boldsymbol{x}) \right\rbrace \right]^2 - \lambda_j^2 \right\rbrace$$
At iteration $k$, set $\nabla_{\boldsymbol{x}}L_\sigma(\boldsymbol{x}, \boldsymbol{\lambda}_k) = \boldsymbol{0}$ and solve for $\boldsymbol{x}_k$, then update the Lagrange multipliers by
$$(\boldsymbol{\lambda}_{k+1})_i = \begin{cases} (\boldsymbol{\lambda}_k)_i + \sigma h_i(\boldsymbol{x}_k), & 1 \le i \le l \\ \max \lbrace 0, (\boldsymbol{\lambda}_k)_i + \sigma h_i(\boldsymbol{x}_k) \rbrace, & l < i \le m \end{cases}$$
In hand calculations it usually suffices to carry out one iteration symbolically, compute the limit of $\boldsymbol{\lambda}_k$, and substitute that limit into the expression for $\boldsymbol{x}_k$ to obtain the optimal solution.
[Example 7] Use the augmented Lagrangian method to solve
$$\begin{matrix} \min & 3x_1^2 + x_2^2 \\ \rm{s.t.} & x_1 + x_2 = 1 \end{matrix}$$
[Solution] Let
$$L_\sigma(x_1, x_2, \lambda) = 3x_1^2 + x_2^2 + \lambda(x_1 + x_2 - 1) + \dfrac{\sigma}{2}(x_1 + x_2 - 1)^2$$
Then
$$\nabla_{\boldsymbol{x}}L_\sigma(x_1, x_2, \lambda) = [6x_1 + \lambda + \sigma(x_1 + x_2 - 1), 2x_2 + \lambda + \sigma(x_1 + x_2 - 1)]^{\rm T}$$
Setting $\nabla_{\boldsymbol{x}}L_\sigma(x_1, x_2, \lambda_k) = \boldsymbol{0}$ gives
$$\boldsymbol{x}_k = \left[ \dfrac{\sigma - \lambda_k}{4\sigma + 6}, \dfrac{3\sigma - 3\lambda_k}{4\sigma + 6} \right]^{\rm T}$$
and therefore
$$\lambda_{k+1} = \lambda_k + \sigma \left( \dfrac{\sigma - \lambda_k}{4\sigma + 6} + \dfrac{3\sigma - 3\lambda_k}{4\sigma + 6} - 1 \right) = \dfrac{3(\lambda_k - \sigma)}{2\sigma + 3}$$
When $\lambda_1 > -\dfrac{3}{2}$, induction shows that $\lambda_k > -\dfrac{3}{2}$ for all $k$, so
$$\lambda_{k+1} - \lambda_k = \dfrac{3(\lambda_k - \sigma)}{2\sigma + 3} - \lambda_k = -\dfrac{\sigma}{2\sigma + 3}(3 + 2\lambda_k) < 0$$
That is, the sequence $\lbrace \lambda_k \rbrace$ is monotonically decreasing and bounded below, so it has a limit, say $\gamma$. Taking limits on both sides of the recurrence for $\lambda_k$ gives
$$\gamma = \dfrac{3(\gamma - \sigma)}{2\sigma + 3}$$
whose solution is
$$\gamma = -\dfrac{3\sigma}{2\sigma} = -\dfrac{3}{2}$$
Hence the optimal solution of the original problem is
$$\underset{k \to +\infty}{\lim} \boldsymbol{x}_k = \left[ \dfrac{\sigma - \gamma}{4\sigma + 6}, \dfrac{3\sigma - 3\gamma}{4\sigma + 6} \right]^{\rm T} = \left[ \dfrac{1}{4}, \dfrac{3}{4} \right]^{\rm T}$$
and the optimal value is $\dfrac{3}{4}$.
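The multiplier iteration of Example 7 can be run directly. The sketch below is illustrative: it uses the closed-form minimizer $\boldsymbol{x}_k$ from the example in place of a numerical inner solve, and the values $\sigma = 2$, $\lambda_1 = 0$, and the iteration count are assumptions chosen for the demonstration.

```python
from fractions import Fraction as F


def argmin_x(lam, sigma):
    """Closed-form minimizer of L_sigma(x, lam) for Example 7."""
    x1 = (sigma - lam) / (4 * sigma + 6)
    return (x1, 3 * x1)


sigma = F(2)          # fixed penalty parameter (assumed value)
lam = F(0)            # initial multiplier, lambda_1 > -3/2 (assumed value)
lams = [lam]
for k in range(60):
    x = argmin_x(lam, sigma)
    h = x[0] + x[1] - 1              # constraint residual h(x_k)
    lam = lam + sigma * h            # multiplier update for an equality constraint
    lams.append(lam)

x = argmin_x(lam, sigma)
print(float(lam), [float(v) for v in x])
```

Unlike the penalty method, $\sigma$ stays fixed here: the error $\lambda_k + 3/2$ shrinks by the factor $3/(2\sigma + 3)$ at every step, so convergence is obtained without driving the penalty parameter to infinity.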