不等式约束问题
(P)
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
\begin{array}{lll} \text { (P) } & \min & f(\mathbf{x}) \\ & \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, & i=1,2, \ldots, m \end{array}
(P) min s.t. f(x)gi(x)≤0,i=1,2,…,m
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是在
R
n
\mathbb{R}^n
Rn连续可微的函数
可行下降方向
min
b
(
x
)
s.t.
x
∈
C
\begin{array}{ll} \min & b(\mathbf{x}) \\ \text { s.t. } & \mathbf{x} \in C \end{array}
min s.t. b(x)x∈C
其中
h
h
h是在闭凸集
C
⊆
R
n
\mathbf{C}\subseteq \mathbb{R}^n
C⊆Rn上连续可微的函数。
设向量
d
≠
0
\mathbf{d}\neq 0
d=0 ,
x
∈
C
\mathbf{x}\in\mathbf{C}
x∈C
如果
∇
h
(
x
)
T
d
<
0
\nabla h(\mathbf{x})^T\mathbf{d}<0
∇h(x)Td<0,并且存在
ϵ
>
0
\epsilon>0
ϵ>0使得
∀
t
∈
[
0
,
ϵ
]
,
x
+
t
d
∈
C
\forall t\in\left[0,\epsilon\right],\mathbf{x}+t\mathbf{d}\in \mathbf{C}
∀t∈[0,ϵ],x+td∈C
则
d
\mathbf{d}
d是在
x
\mathbf{x}
x点的可行下降方向
引理1
考虑问题
(G)
min
h
(
x
)
s.t.
x
∈
C
.
\begin{array}{lll} \text { (G) } & \min & h(\mathbf{x}) \\ & \text { s.t. } & \mathbf{x} \in C . \end{array}
(G) min s.t. h(x)x∈C.
如果
x
∗
\mathbf{x}^*
x∗是一个局部最优解,则在
x
∗
\mathbf{x}^*
x∗没有可行下降方向
证明:
假设有下降方向,则存在
d
≠
0
,
ϵ
1
>
0
\mathbf{d}\neq 0,\epsilon_1>0
d=0,ϵ1>0
∀
t
∈
[
0
,
ϵ
1
]
,
x
∗
+
t
d
∈
C
,
∇
f
(
x
)
T
d
<
0
\forall t \in \left[0,\epsilon_1\right],\mathbf{x}^*+t\mathbf{d}\in\mathbf{C},\nabla f(\mathbf{x})^T\mathbf{d}<0
∀t∈[0,ϵ1],x∗+td∈C,∇f(x)Td<0
根据下降方向的性质
∃
ϵ
2
<
ϵ
1
,
∀
t
∈
[
0
,
ϵ
2
]
,
f
(
x
∗
+
t
d
)
<
f
(
x
∗
)
\exists \epsilon_2<\epsilon_1,\forall t\in\left[0,\epsilon_2\right],f(\mathbf{x}^*+t\mathbf{d})<f(\mathbf{x}^*)
∃ϵ2<ϵ1,∀t∈[0,ϵ2],f(x∗+td)<f(x∗)
和局部最优矛盾了
引理2
设
x
∗
\mathbf{x}^*
x∗是
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,m
的局部最优解
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是
R
n
\mathbb{R}^n
Rn上连续可微的函数
设
I
(
x
∗
)
=
{
i
:
g
i
(
x
∗
)
=
0
}
I(\mathbf{x}^*)=\left\{i: g_i(\mathbf{x}^*)=0\right\}
I(x∗)={i:gi(x∗)=0}
则不存在
d
∈
R
n
\mathbf{d}\in\mathbb{R}^n
d∈Rn,使得
∇
f
(
x
∗
)
T
d
<
0
,
∇
g
i
(
x
∗
)
T
d
<
0
,
i
∈
I
(
x
∗
)
\begin{aligned} &\nabla f\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \\ &\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \quad i \in I\left(\mathbf{x}^{*}\right) \end{aligned}
∇f(x∗)Td<0,∇gi(x∗)Td<0,i∈I(x∗)
证明:
假设存在
d
\mathbf{d}
d使得
∇
f
(
x
∗
)
T
d
<
0
,
∇
g
i
(
x
∗
)
T
d
<
0
,
i
∈
I
(
x
∗
)
\begin{aligned} &\nabla f\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \\ &\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \quad i \in I\left(\mathbf{x}^{*}\right) \end{aligned}
∇f(x∗)Td<0,∇gi(x∗)Td<0,i∈I(x∗)
那么存在
ϵ
1
>
0
\epsilon_1>0
ϵ1>0,
∀
t
∈
(
0
,
ϵ
1
)
,
i
∈
I
(
x
∗
)
\forall t \in (0,\epsilon_1),i\in I(\mathbf{x}^*)
∀t∈(0,ϵ1),i∈I(x∗)
使得
f
(
x
∗
+
t
d
)
<
f
(
x
∗
)
f(\mathbf{x}^*+t\mathbf{d})<f(\mathbf{x}^*)
f(x∗+td)<f(x∗)以及
g
i
(
x
∗
+
t
d
)
<
g
i
(
x
∗
)
=
0
g_i(\mathbf{x}^*+t\mathbf{d})<g_i(\mathbf{x}^*)=0
gi(x∗+td)<gi(x∗)=0
对于
i
∉
I
(
x
∗
)
i\notin I(\mathbf{x}^*)
i∈/I(x∗),有
g
i
(
x
∗
)
<
0
g_i(\mathbf{x}^*)<0
gi(x∗)<0
因此
∃
ϵ
2
>
0
,
∀
t
∈
(
0
,
ϵ
2
)
,
i
∉
I
(
x
∗
)
\exists \epsilon_2>0,\forall t\in (0,\epsilon_2),i\notin I(\mathbf{x}^*)
∃ϵ2>0,∀t∈(0,ϵ2),i∈/I(x∗),有
g
i
(
x
∗
+
t
d
)
<
0
g_i(\mathbf{x}^*+t\mathbf{d})<0
gi(x∗+td)<0
所以
∀
t
∈
(
0
,
min
{
ϵ
1
,
ϵ
2
}
)
\forall t\in \left(0,\min \left\{\epsilon_1,\epsilon_2\right\}\right)
∀t∈(0,min{ϵ1,ϵ2}),有
f
(
x
∗
+
t
d
)
<
f
(
x
∗
)
g
i
(
x
∗
+
t
d
)
<
0
,
i
=
1
,
2
,
…
,
m
,
\begin{aligned} &f\left(\mathbf{x}^{*}+t \mathbf{d}\right)<f\left(\mathbf{x}^{*}\right) \\ &g_{i}\left(\mathbf{x}^{*}+t \mathbf{d}\right)<0, \quad i=1,2, \ldots, m, \end{aligned}
f(x∗+td)<f(x∗)gi(x∗+td)<0,i=1,2,…,m,
与
x
∗
\mathbf{x}^*
x∗局部最优解矛盾
不等式约束问题的Fritz-John条件
设
x
∗
\mathbf{x}^*
x∗是
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,m
的局部最优解
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是
R
n
\mathbb{R}^n
Rn上连续可微的函数
则存在不全为0的
λ
0
,
λ
1
,
⋯
,
λ
m
≥
0
\lambda_0,\lambda_1,\cdots,\lambda_m\ge 0
λ0,λ1,⋯,λm≥0,使得
λ
0
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \lambda_{0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
λ0∇f(x∗)+i=1∑mλi∇gi(x∗)λigi(x∗)=0,=0,i=1,2,…,m
证明:
由引理2
不存在
d
\mathbf{d}
d使得
∇
f
(
x
∗
)
T
d
<
0
,
∇
g
i
(
x
∗
)
T
d
<
0
,
i
∈
I
(
x
∗
)
\begin{aligned} &\nabla f\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \\ &\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T} \mathbf{d}<0, \quad i \in I\left(\mathbf{x}^{*}\right) \end{aligned}
∇f(x∗)Td<0,∇gi(x∗)Td<0,i∈I(x∗)
其中
I
(
x
∗
)
=
{
i
:
g
i
(
x
∗
)
}
=
{
i
1
,
i
2
,
⋯
,
i
k
}
I(\mathbf{x}^*)=\left\{i:g_i(\mathbf{x}^*)\right\}=\left\{i_1,i_2,\cdots,i_k\right\}
I(x∗)={i:gi(x∗)}={i1,i2,⋯,ik}
这个等价于
A
d
<
0
\mathbf{Ad}<0
Ad<0
无解
其中
A
=
(
∇
f
(
x
∗
)
T
∇
g
i
1
(
x
∗
)
T
⋮
∇
g
i
k
(
x
∗
)
T
)
\mathbf{A}=\left(\begin{array}{c} \nabla f\left(\mathbf{x}^{*}\right)^{T} \\ \nabla g_{i_{1}}\left(\mathbf{x}^{*}\right)^{T} \\ \vdots \\ \nabla g_{i_{k}}\left(\mathbf{x}^{*}\right)^{T} \end{array}\right)
A=⎝⎜⎜⎜⎛∇f(x∗)T∇gi1(x∗)T⋮∇gik(x∗)T⎠⎟⎟⎟⎞
根据Gordan定理,
无解等价于存在
η
=
(
λ
0
,
⋯
,
λ
i
k
)
T
≠
0
\eta=\left(\lambda_0,\cdots,\lambda_{i_k}\right)^T\neq 0
η=(λ0,⋯,λik)T=0,使得
A
T
η
=
0
,
η
≥
0
\mathrm{A}^{T} \eta=0, \quad \eta \geq 0
ATη=0,η≥0
于是
λ
0
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \lambda_{0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
λ0∇f(x∗)+i=1∑mλi∇gi(x∗)λigi(x∗)=0,=0,i=1,2,…,m
不等式约束问题的KKT条件
设
x
∗
\mathbf{x}^*
x∗是
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,m
的局部最优解
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是
R
n
\mathbb{R}^n
Rn上连续可微的函数
设
I
(
x
∗
)
=
{
i
:
g
i
(
x
∗
)
=
0
}
I(\mathbf{x}^*)=\left\{i:g_i(\mathbf{x}^*)=0\right\}
I(x∗)={i:gi(x∗)=0}
设
{
∇
g
i
(
x
∗
)
}
i
∈
I
(
x
∗
)
\left\{\nabla g_i(\mathbf{x}^*)\right\}_{i\in I(\mathbf{x}^*)}
{∇gi(x∗)}i∈I(x∗)线性无关
则存在
λ
1
,
⋯
,
λ
m
≥
0
\lambda_1,\cdots,\lambda_m\ge 0
λ1,⋯,λm≥0,使得
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
∇f(x∗)+i=1∑mλi∇gi(x∗)λigi(x∗)=0,=0,i=1,2,…,m
证明:
根据Fritz-John条件,存在不全为0的
λ
0
~
,
⋯
,
λ
m
~
≥
0
\tilde{\lambda_0},\cdots,\tilde{\lambda_m}\ge 0
λ0~,⋯,λm~≥0
使得
λ
0
~
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
~
∇
g
i
(
x
∗
)
=
0
,
λ
i
~
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \tilde{\lambda_0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \tilde{\lambda_i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \tilde{\lambda_i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
λ0~∇f(x∗)+i=1∑mλi~∇gi(x∗)λi~gi(x∗)=0,=0,i=1,2,…,m
如果
λ
0
~
=
0
\tilde{\lambda_0}=0
λ0~=0,则
λ
i
~
=
0
\tilde{\lambda_i}=0
λi~=0,与
{
∇
g
i
(
x
∗
)
}
i
∈
I
(
x
∗
)
\left\{\nabla g_i(\mathbf{x}^*)\right\}_{i\in I(\mathbf{x}^*)}
{∇gi(x∗)}i∈I(x∗)线性无关
所以令
λ
i
=
λ
i
~
λ
0
~
\lambda_i=\frac{\tilde{\lambda_i}}{\tilde{\lambda_0}}
λi=λ0~λi~
得证
不等式与等式约束问题
不等式与等式约束问题的KKT条件
设
x
∗
\mathbf{x}^*
x∗是问题
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
h
j
(
x
)
=
0
,
j
=
1
,
2
,
…
,
p
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \\ & h_{j}(\mathbf{x})=0, \quad j=1,2, \ldots, p \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,mhj(x)=0,j=1,2,…,p
的局部最优解,
其中
f
,
g
1
,
⋯
,
g
m
,
h
1
,
⋯
,
h
p
f,g_1,\cdots,g_m,h_1,\cdots,h_p
f,g1,⋯,gm,h1,⋯,hp是在
R
n
\mathbb{R}^n
Rn上连续可微的函数
假设
{
∇
g
i
(
x
∗
)
:
i
∈
I
(
x
∗
)
}
∪
{
∇
h
j
(
x
∗
)
:
j
=
1
,
2
,
…
,
p
}
\left\{\nabla g_{i}\left(\mathbf{x}^{*}\right): i \in I\left(\mathbf{x}^{*}\right)\right\} \cup\left\{\nabla h_{j}\left(\mathbf{x}^{*}\right): j=1,2, \ldots, p\right\}
{∇gi(x∗):i∈I(x∗)}∪{∇hj(x∗):j=1,2,…,p}
线性无关
其中
I
(
x
∗
)
=
{
i
:
g
i
(
x
∗
)
=
0
}
I(\mathbf{x}^*)=\left\{i:g_i(\mathbf{x}^*)=0\right\}
I(x∗)={i:gi(x∗)=0}
则存在
λ
1
,
⋯
,
λ
m
≥
0
\lambda_1,\cdots,\lambda_m\ge 0
λ1,⋯,λm≥0和
μ
1
,
⋯
,
μ
p
∈
R
\mu_1,\cdots,\mu_p\in \mathbb{R}
μ1,⋯,μp∈R,使得
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
+
∑
j
=
1
p
μ
j
∇
h
j
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
∇f(x∗)+i=1∑mλi∇gi(x∗)+j=1∑pμj∇hj(x∗)λigi(x∗)=0,=0,i=1,2,…,m
KKT点
一个可行解
x
∗
\mathbb{x}^*
x∗
如果存在
λ
1
,
⋯
,
λ
m
≥
0
\lambda_1,\cdots,\lambda_m\ge 0
λ1,⋯,λm≥0和
μ
1
,
⋯
,
μ
p
∈
R
\mu_1,\cdots,\mu_p\in \mathbb{R}
μ1,⋯,μp∈R,使得
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
+
∑
j
=
1
p
μ
j
∇
h
j
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
∇f(x∗)+i=1∑mλi∇gi(x∗)+j=1∑pμj∇hj(x∗)λigi(x∗)=0,=0,i=1,2,…,m
则
x
∗
\mathbf{x}^*
x∗称为KKT点
规范(regularity)
一个可行解
x
∗
\mathbf{x}^*
x∗
如果
{
∇
g
i
(
x
∗
)
:
i
∈
I
(
x
∗
)
}
∪
{
∇
h
j
(
x
∗
)
:
j
=
1
,
2
,
…
,
p
}
\left\{\nabla g_{i}\left(\mathbf{x}^{*}\right): i \in I\left(\mathbf{x}^{*}\right)\right\} \cup\left\{\nabla h_{j}\left(\mathbf{x}^{*}\right): j=1,2, \ldots, p\right\}
{∇gi(x∗):i∈I(x∗)}∪{∇hj(x∗):j=1,2,…,p}
线性无关,则称为规范(regularity)
凸的条件下
凸优化问题中KKT条件的充分性
设
x
∗
\mathbf{x}^*
x∗是问题
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
h
j
(
x
)
=
0
,
j
=
1
,
2
,
…
,
p
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \\ & h_{j}(\mathbf{x})=0, \quad j=1,2, \ldots, p \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,mhj(x)=0,j=1,2,…,p
的一个可行解
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是在
R
n
\mathbb{R}^n
Rn上连续可微的凸函数
h
1
,
⋯
,
h
p
h_1,\cdots,h_p
h1,⋯,hp是仿射函数
如果存在
λ
1
,
⋯
,
λ
m
≥
0
\lambda_1,\cdots,\lambda_m\ge 0
λ1,⋯,λm≥0和
μ
1
,
⋯
,
μ
p
∈
R
\mu_1,\cdots,\mu_p\in \mathbb{R}
μ1,⋯,μp∈R,使得
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
+
∑
j
=
1
p
μ
j
∇
h
j
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
∇f(x∗)+i=1∑mλi∇gi(x∗)+j=1∑pμj∇hj(x∗)λigi(x∗)=0,=0,i=1,2,…,m
则
x
∗
\mathbf{x}^*
x∗是最优解
证明:
s
(
x
)
=
f
(
x
)
+
∑
i
=
1
m
λ
i
g
i
(
x
)
+
∑
j
=
1
p
μ
j
h
j
(
x
)
s(\mathbf{x})=f(\mathbf{x})+\sum_{i=1}^{m} \lambda_{i} g_{i}(\mathbf{x})+\sum_{j=1}^{p} \mu_{j} h_{j}(\mathbf{x})
s(x)=f(x)+i=1∑mλigi(x)+j=1∑pμjhj(x)
∇
s
(
x
∗
)
=
0
\nabla s(\mathbf{x}^*)=0
∇s(x∗)=0,所以
s
(
x
∗
)
≤
s
(
x
)
s(\mathbf{x}^*)\le s(\mathbf{x})
s(x∗)≤s(x)
所以
f
(
x
∗
)
=
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
g
i
(
x
∗
)
+
∑
j
=
1
p
μ
j
h
j
(
x
∗
)
=
s
(
x
∗
)
≤
s
(
x
)
=
f
(
x
)
+
∑
i
=
1
m
λ
i
g
i
(
x
)
+
∑
j
=
1
p
μ
j
h
j
(
x
)
≤
f
(
x
)
\begin{aligned} f\left(\mathbf{x}^{*}\right) &=f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \mu_{j} h_{j}\left(\mathbf{x}^{*}\right) \\ &=s\left(\mathbf{x}^{*}\right) \\ & \leq s(\mathbf{x}) \\ &=f(\mathbf{x})+\sum_{i=1}^{m} \lambda_{i} g_{i}(\mathbf{x})+\sum_{j=1}^{p} \mu_{j} h_{j}(\mathbf{x}) \\ & \leq f(\mathbf{x}) \end{aligned}
f(x∗)=f(x∗)+i=1∑mλigi(x∗)+j=1∑pμjhj(x∗)=s(x∗)≤s(x)=f(x)+i=1∑mλigi(x)+j=1∑pμjhj(x)≤f(x)
不等式约束问题中KKT条件下Slater条件的必要性
设
x
∗
\mathbf{x}^*
x∗是
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,m
的局部最优解
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是
R
n
\mathbb{R}^n
Rn上连续可微的函数,
并且
g
1
,
⋯
,
g
m
g_1,\cdots,g_m
g1,⋯,gm是凸函数
假设存在
x
^
∈
R
n
\hat{\mathbf{x}}\in \mathbb{R}^n
x^∈Rn,使得
g
i
(
x
^
)
<
0
,
i
=
1
,
2
,
…
,
m
g_{i}(\hat{\mathbf{x}})<0, \quad i=1,2, \ldots, m
gi(x^)<0,i=1,2,…,m
则存在
λ
1
,
⋯
,
λ
m
≥
0
\lambda_1,\cdots,\lambda_m\ge 0
λ1,⋯,λm≥0,使得
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
∇f(x∗)+i=1∑mλi∇gi(x∗)λigi(x∗)=0,=0,i=1,2,…,m
证明:
因为
x
∗
\mathbf{x}^*
x∗是局部最优解,所以一定满足Fritz-John条件
即存在不全为0的
λ
0
,
λ
1
,
⋯
,
λ
m
≥
0
\lambda_0,\lambda_1,\cdots,\lambda_m\ge 0
λ0,λ1,⋯,λm≥0,使得
λ
0
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
\begin{aligned} \lambda_{0} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m \end{aligned}
λ0∇f(x∗)+i=1∑mλi∇gi(x∗)λigi(x∗)=0,=0,i=1,2,…,m
假设
λ
0
=
0
\lambda_0=0
λ0=0,则
∑
i
=
1
m
λ
~
i
∇
g
i
(
x
∗
)
=
0
\sum_{i=1}^{m} \tilde{\lambda}_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)=0
i=1∑mλ~i∇gi(x∗)=0
由凸函数的一阶条件
0
>
g
i
(
x
^
)
≥
g
i
(
x
∗
)
+
∇
g
i
(
x
∗
)
T
(
x
^
−
x
∗
)
0>g_{i}(\hat{\mathbf{x}}) \geq g_{i}\left(\mathbf{x}^{*}\right)+\nabla g_{i}\left(\mathbf{x}^{*}\right)^{T}\left(\hat{\mathbf{x}}-\mathbf{x}^{*}\right)
0>gi(x^)≥gi(x∗)+∇gi(x∗)T(x^−x∗)
于是
0
>
∑
i
=
1
m
λ
~
i
g
i
(
x
∗
)
+
[
∑
i
=
1
m
λ
~
i
∇
g
i
(
x
∗
)
]
T
(
x
^
−
x
∗
)
=
0
0>\sum_{i=1}^{m} \tilde{\lambda}_{i} g_{i}\left(\mathbf{x}^{*}\right)+\left[\sum_{i=1}^{m} \tilde{\lambda}_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)\right]^{T}\left(\hat{\mathbf{x}}-\mathbf{x}^{*}\right)=0
0>i=1∑mλ~igi(x∗)+[i=1∑mλ~i∇gi(x∗)]T(x^−x∗)=0
矛盾
所以
λ
0
>
0
\lambda_0>0
λ0>0,然后取
λ
i
=
λ
i
~
λ
0
\lambda_i=\frac{\tilde{\lambda_i}}{\lambda_0}
λi=λ0λi~
得证
广义Slater条件
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
,
h
j
(
x
)
≤
0
,
j
=
1
,
2
,
…
,
p
,
s
k
(
x
)
=
0
,
k
=
1
,
2
,
…
,
q
,
\begin{array}{ll} g_{i}(\mathbf{x}) \leq 0, & i=1,2, \ldots, m, \\ h_{j}(\mathbf{x}) \leq 0, & j=1,2, \ldots, p, \\ s_{k}(\mathbf{x})=0, & k=1,2, \ldots, q, \end{array}
gi(x)≤0,hj(x)≤0,sk(x)=0,i=1,2,…,m,j=1,2,…,p,k=1,2,…,q,
其中
g
i
g_i
gi是凸函数,
h
j
,
s
k
h_j,s_k
hj,sk是仿射函数
如果存在
x
^
∈
R
n
\hat{\mathbf{x}}\in\mathbb{R}^n
x^∈Rn,满足
g
i
(
x
^
)
<
0
,
i
=
1
,
2
,
…
,
m
,
h
j
(
x
^
)
≤
0
,
j
=
1
,
2
,
…
,
p
,
s
k
(
x
^
)
=
0
,
k
=
1
,
2
,
…
,
q
,
\begin{array}{ll} g_{i}(\hat{\mathbf{x}}) < 0, & i=1,2, \ldots, m, \\ h_{j}(\hat{\mathbf{x}}) \leq 0, & j=1,2, \ldots, p, \\ s_{k}(\hat{\mathbf{x}})=0, & k=1,2, \ldots, q, \end{array}
gi(x^)<0,hj(x^)≤0,sk(x^)=0,i=1,2,…,m,j=1,2,…,p,k=1,2,…,q,
则满足广义Slater条件
不等式与等式约束问题的KKT条件下的广义Slater条件的必要性
设
x
∗
\mathbf{x}^*
x∗是问题
min
f
(
x
)
s.t.
g
i
(
x
)
≤
0
,
i
=
1
,
2
,
…
,
m
,
h
j
(
x
)
≤
0
,
j
=
1
,
2
,
…
,
p
,
s
k
(
x
)
=
0
,
k
=
1
,
2
,
…
,
q
,
\begin{array}{ll} \min & f(\mathbf{x}) \\ \text { s.t. } & g_{i}(\mathbf{x}) \leq 0, \quad i=1,2, \ldots, m, \\ & h_{j}(\mathbf{x}) \leq 0, \quad j=1,2, \ldots, p, \\ & s_{k}(\mathbf{x})=0, \quad k=1,2, \ldots, q, \end{array}
min s.t. f(x)gi(x)≤0,i=1,2,…,m,hj(x)≤0,j=1,2,…,p,sk(x)=0,k=1,2,…,q,
的最优解
其中
f
,
g
1
,
⋯
,
g
m
f,g_1,\cdots,g_m
f,g1,⋯,gm是在
R
n
\mathbb{R}^n
Rn上的连续可微凸函数,
h
j
,
s
k
h_j,s_k
hj,sk是仿射函数
如果存在
x
^
∈
R
n
\hat{\mathbf{x}}\in\mathbb{R}^n
x^∈Rn,满足
g
i
(
x
^
)
<
0
,
i
=
1
,
2
,
…
,
m
,
h
j
(
x
^
)
≤
0
,
j
=
1
,
2
,
…
,
p
,
s
k
(
x
^
)
=
0
,
k
=
1
,
2
,
…
,
q
,
\begin{array}{ll} g_{i}(\hat{\mathbf{x}}) < 0, & i=1,2, \ldots, m, \\ h_{j}(\hat{\mathbf{x}}) \leq 0, & j=1,2, \ldots, p, \\ s_{k}(\hat{\mathbf{x}})=0, & k=1,2, \ldots, q, \end{array}
gi(x^)<0,hj(x^)≤0,sk(x^)=0,i=1,2,…,m,j=1,2,…,p,k=1,2,…,q,
则存在
λ
1
,
⋯
,
λ
m
,
η
1
,
⋯
,
η
p
≥
0
,
μ
1
,
⋯
,
μ
q
∈
R
\lambda_1,\cdots,\lambda_m,\eta_1,\cdots,\eta_p\ge 0,\mu_1,\cdots,\mu_q\in\mathbb{R}
λ1,⋯,λm,η1,⋯,ηp≥0,μ1,⋯,μq∈R,使得
∇
f
(
x
∗
)
+
∑
i
=
1
m
λ
i
∇
g
i
(
x
∗
)
+
∑
j
=
1
p
η
j
∇
h
j
(
x
∗
)
+
∑
k
=
1
q
μ
k
∇
s
k
(
x
∗
)
=
0
,
λ
i
g
i
(
x
∗
)
=
0
,
i
=
1
,
2
,
…
,
m
,
η
j
h
j
(
x
∗
)
=
0
,
j
=
1
,
2
,
…
,
p
.
\begin{aligned} \nabla f\left(\mathbf{x}^{*}\right)+\sum_{i=1}^{m} \lambda_{i} \nabla g_{i}\left(\mathbf{x}^{*}\right)+\sum_{j=1}^{p} \eta_{j} \nabla h_{j}\left(\mathbf{x}^{*}\right)+\sum_{k=1}^{q} \mu_{k} \nabla s_{k}\left(\mathbf{x}^{*}\right) &=0, \\ \lambda_{i} g_{i}\left(\mathbf{x}^{*}\right) &=0, \quad i=1,2, \ldots, m, \\ \eta_{j} h_{j}\left(\mathbf{x}^{*}\right) &=0, \quad j=1,2, \ldots, p . \end{aligned}
∇f(x∗)+i=1∑mλi∇gi(x∗)+j=1∑pηj∇hj(x∗)+k=1∑qμk∇sk(x∗)λigi(x∗)ηjhj(x∗)=0,=0,i=1,2,…,m,=0,j=1,2,…,p.
二阶优化条件
再说吧