Chapter 5, Duality, covers duality for optimization problems and is the core of the theoretical part of the book.
5.1 The Lagrange dual function
For a general optimization problem (the primal problem)
$$
\begin{aligned} \text{minimize}\quad &f_0(x)\\ \text{subject to}\quad &f_i(x)\leq 0,\quad i=1,\ldots,m\\ &h_i(x)=0,\quad i=1,\ldots,p \end{aligned}
$$
with domain $\mathcal{D}=\bigcap_{i=0}^m\mathbf{dom}\,f_i\cap\bigcap_{i=1}^p\mathbf{dom}\,h_i$ and optimal value $p^*$, the corresponding Lagrangian is
$$
L(x,\lambda,\nu)=f_0(x)+\sum_{i=1}^m\lambda_if_i(x)+\sum_{i=1}^p\nu_ih_i(x)
$$
with $\mathbf{dom}\,L=\mathcal{D}\times\mathbb{R}^m\times\mathbb{R}^p$. The $\lambda_i$ and $\nu_i$ are called Lagrange multipliers, and the Lagrange dual function is
$$
g(\lambda,\nu)=\inf_{x\in\mathcal{D}}L(x,\lambda,\nu)=\inf_{x\in\mathcal{D}}\left(f_0(x)+\sum_{i=1}^m\lambda_if_i(x)+\sum_{i=1}^p\nu_ih_i(x)\right)
$$
- $g(\lambda,\nu)$ is concave, even when the primal problem is not convex, because it is a pointwise infimum of affine functions of $(\lambda,\nu)$
- for any $\lambda\succeq 0$ and any $\nu$, $g(\lambda,\nu)\leq p^*$
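As a minimal sketch of these two properties, consider the toy problem of minimizing $x^2$ subject to $1-x\leq 0$ (an illustrative problem chosen here, not one from the notes). Its optimal value is $p^*=1$, attained at $x=1$. The Lagrangian $x^2+\lambda(1-x)$ is minimized over $x$ at $x=\lambda/2$, which gives $g(\lambda)=\lambda-\lambda^2/4$. The function names below are hypothetical.

```python
def lagrangian(x, lam):
    """L(x, lam) = f0(x) + lam * f1(x) with f0(x) = x**2 and f1(x) = 1 - x."""
    return x**2 + lam * (1.0 - x)


def dual_function(lam):
    """g(lam) = inf_x L(x, lam); the convex quadratic in x is minimized at x = lam / 2."""
    return lagrangian(lam / 2.0, lam)


P_STAR = 1.0  # optimal value of the toy primal problem, attained at x = 1

# Lower-bound property: g(lam) <= p* for every lam >= 0 on a sample grid.
lams = [0.1 * k for k in range(0, 61)]
assert all(dual_function(lam) <= P_STAR + 1e-12 for lam in lams)

# Concavity along a segment: g at the midpoint is at least the endpoint average.
a, b = 0.5, 4.0
assert dual_function((a + b) / 2) >= (dual_function(a) + dual_function(b)) / 2
```

The grid check is only a spot check on sampled multipliers; the inequality itself follows from the closed form of $g$.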
5.2 The Lagrange dual problem
$$
\begin{aligned} \text{maximize}\quad &g(\lambda,\nu)\\ \text{subject to}\quad &\lambda\succeq 0 \end{aligned}
$$
A dual optimal point is denoted $(\lambda^*,\nu^*)$, and the optimal value of the dual problem is $d^*$.
- $d^*\leq p^*$: weak duality
- if $d^*=p^*$, then strong duality holds
- Slater's condition for a convex problem: if there exists an $x\in\mathbf{relint}\,\mathcal{D}$ such that $f_i(x)<0,\ i=1,\ldots,m,\ Ax=b$, then strong duality holds
- if $f_1,\ldots,f_k$ are affine, then Slater's condition is: there exists an $x\in\mathbf{relint}\,\mathcal{D}$ such that $f_i(x)\leq 0,\ i=1,\ldots,k,\ f_i(x)<0,\ i=k+1,\ldots,m,\ Ax=b$
simple equivalent reformulations of a problem can lead to very different dual problems.
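To make weak and strong duality concrete, here is a small sketch on an illustrative convex problem (an assumption of this example, not taken from the notes): minimize $x_1^2+x_2^2$ subject to $1-x_1-x_2\leq 0$. The point $x=(1,1)$ is strictly feasible, so Slater's condition holds. The dual function has the closed form $g(\lambda)=\lambda-\lambda^2/2$, and the sketch maximizes it over $\lambda\geq 0$ with a ternary search. The helper names and the search interval $[0,10]$ are arbitrary choices.

```python
def dual_function(lam):
    """g(lam) = inf_x x1^2 + x2^2 + lam * (1 - x1 - x2), attained at x1 = x2 = lam / 2."""
    x = lam / 2.0
    return 2.0 * x**2 + lam * (1.0 - 2.0 * x)


def maximize_concave(g, lo, hi, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return (lo + hi) / 2.0


# Dual feasible set is lam >= 0; the upper end 10.0 is an assumed search bound.
lam_star = maximize_concave(dual_function, 0.0, 10.0)
d_star = dual_function(lam_star)
p_star = 0.5  # primal optimal value, attained at x = (0.5, 0.5)

# Weak duality always gives d* <= p*; here the gap is expected to be close to 0.
print(lam_star, d_star, p_star - d_star)
```

A ternary search is enough here because $g$ is concave in a single variable; a problem with several multipliers would need a proper solver.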
Sections 5.3 and 5.4 give the geometric and saddle-point interpretations of Lagrange duality.
5.5 Optimality conditions
This section is mainly about the KKT conditions.
- for any optimization problem with differentiable objective and constraint functions for which strong duality obtains, any pair of primal and dual optimal points must satisfy

$$
\begin{aligned} f_i(x^*)&\leq 0,\ i=1,\ldots,m\\ h_i(x^*)&=0,\ i=1,\ldots,p\\ \lambda_i^*&\geq 0,\ i=1,\ldots,m\\ \lambda_i^*f_i(x^*)&=0,\ i=1,\ldots,m\\ \nabla f_0(x^*)+\sum_{i=1}^m\lambda_i^*\nabla f_i(x^*)+\sum_{i=1}^p\nu_i^*\nabla h_i(x^*)&=0 \end{aligned}
$$

- when the primal problem is convex, the KKT conditions are also sufficient for the points to be primal and dual optimal.
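The conditions can be checked numerically on a small example. The sketch below uses an illustrative convex problem (an assumption of this example): minimize $x_1^2+x_2^2$ subject to $1-x_1-x_2\leq 0$, with no equality constraints. It computes one residual per KKT condition. The function names and the tolerance are hypothetical choices.

```python
def kkt_residuals(x, lam):
    """KKT residuals for: minimize x1^2 + x2^2 subject to 1 - x1 - x2 <= 0."""
    x1, x2 = x
    f1 = 1.0 - x1 - x2                        # inequality constraint value f_1(x)
    # grad f0(x) + lam * grad f1(x), with grad f0 = (2*x1, 2*x2), grad f1 = (-1, -1)
    grad = (2.0 * x1 - lam, 2.0 * x2 - lam)
    return {
        "primal_feasibility": max(f1, 0.0),       # f_1(x) <= 0
        "dual_feasibility": max(-lam, 0.0),       # lam >= 0
        "complementary_slackness": abs(lam * f1), # lam * f_1(x) = 0
        "stationarity": max(abs(g) for g in grad),
    }


def satisfies_kkt(x, lam, tol=1e-9):
    """True when every KKT residual is within the tolerance."""
    return all(r <= tol for r in kkt_residuals(x, lam).values())


print(satisfies_kkt((0.5, 0.5), 1.0))  # candidate primal-dual optimal pair
print(satisfies_kkt((1.0, 1.0), 0.0))  # feasible point, but stationarity fails
```

Because this toy problem is convex, a pair that passes all four checks is primal and dual optimal, in line with the sufficiency statement above.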
5.6 Perturbation and sensitivity analysis
the perturbed problem
$$
\begin{aligned} \text{minimize}\quad &f_0(x)\\ \text{subject to}\quad &f_i(x)\leq u_i,\quad i=1,\ldots,m\\ &h_i(x)=v_i,\quad i=1,\ldots,p \end{aligned}
$$
- $p^*(u,v)=\inf\{f_0(x)\mid \exists x\in\mathcal{D},\ f_i(x)\leq u_i,\ i=1,\ldots,m,\ h_i(x)=v_i,\ i=1,\ldots,p\}$
- if strong duality holds and $x$ is any feasible point of the perturbed problem, then $p^*(0,0)=g(\lambda^*,\nu^*)\leq f_0(x)+\lambda^{*\mathrm{T}}u+\nu^{*\mathrm{T}}v$, which implies $p^*(u,v)\geq p^*(0,0)-\lambda^{*\mathrm{T}}u-\nu^{*\mathrm{T}}v$
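The global sensitivity bound can be checked on a one-dimensional example. The sketch below assumes the illustrative problem of minimizing $x^2$ subject to $1-x\leq u$ (not a problem from the notes). The unperturbed problem has $p^*(0)=1$ and dual optimal multiplier $\lambda^*=2$, from stationarity $2x-\lambda=0$ at $x=1$. The perturbed optimal value has a closed form, which is compared against the bound $p^*(0)-\lambda^*u$. The function names are hypothetical.

```python
def perturbed_optimal_value(u):
    """p*(u) for: minimize x^2 subject to 1 - x <= u, i.e. x >= 1 - u."""
    # The unconstrained minimizer is 0; project it onto the feasible set x >= 1 - u.
    x_opt = max(1.0 - u, 0.0)
    return x_opt**2


LAM_STAR = 2.0                            # dual optimal multiplier at u = 0
P_STAR_0 = perturbed_optimal_value(0.0)   # unperturbed optimal value


def sensitivity_lower_bound(u):
    """Lower bound p*(0) - lam* * u from the inequality above."""
    return P_STAR_0 - LAM_STAR * u


# The bound holds both for tightening (u < 0) and for loosening (u > 0).
for u in [-1.0, -0.1, 0.0, 0.1, 0.5, 2.0]:
    assert perturbed_optimal_value(u) >= sensitivity_lower_bound(u) - 1e-12
```

In this example the bound is tight at $u=0$ and loosens as $|u|$ grows, since $p^*(u)=(1-u)^2$ for $u\leq 1$ exceeds $1-2u$ by $u^2$.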
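Section 5.7 gives examples, Section 5.8 deals with the feasibility of constraint systems (theorems of alternatives), and Section 5.9 studies duality for problems described by generalized inequalities.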