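Fundamentals of Optimization Methods: Summary of Key Points
This summary accompanies the course "Fundamentals of Optimization Methods". It combines the textbook Convex Optimization (Stephen Boyd and Lieven Vandenberghe) with the lecturer's slides, and is meant for final-exam review and quick lookup of key points. The notes should be largely correct, but what the lecturer says in class takes precedence. These notes exist to save time on needless reorganizing; they are not an excuse to stop paying attention in class, and the author takes no responsibility for problems caused by doing so. They took effort to prepare: please do not repost them, and do not print or sell them for profit. All rights reserved.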
Chapter 1: Convex Sets and Convex Functions
- Analytical solution of the least-squares problem: $x^* = (A^{\top}A)^{-1}A^{\top}b$
- Time complexity of solving least squares: $O(n^2 m)$, where $A \in \mathbb{R}^{m\times n}$
- Dual norm: $\|z\|_* = \sup\{z^{\top}x : \|x\| \le 1\} = \sup\{z^{\top}x : \|x\| = 1\}$
- Convex set: for all $x, y \in C$ and $0 \le \theta \le 1$, $z = \theta x + (1-\theta)y \in C$
- Convex function: for all $x, y \in \operatorname{dom} f$ and $0 \le \lambda \le 1$, $f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y)$
- Definition of a norm: a function $f$ is a norm if
① $f$ is nonnegative: $f(x) \ge 0$ for all $x \in \mathbb{R}^n$
② $f$ is definite: if $f(x) = 0$, then $x = 0$
③ $f$ is homogeneous: $f(tx) = |t| f(x)$ for all $x \in \mathbb{R}^n$ and $t \in \mathbb{R}$
④ $f$ satisfies the triangle inequality: $f(x+y) \le f(x) + f(y)$ for all $x, y \in \mathbb{R}^n$
- Quadratic norm: $\|x\|_P = (x^{\top}Px)^{\frac{1}{2}} = \|P^{\frac{1}{2}}x\|_2$, where $P$ is symmetric positive definite
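The least-squares formula above can be checked numerically. The following is a minimal sketch, assuming NumPy is available and that $A$ has full column rank; the random data and the helper name `least_squares_normal_equations` are made up for illustration.

```python
import numpy as np

def least_squares_normal_equations(A, b):
    """Solve min ||A x - b||_2^2 through the normal equations A^T A x = A^T b."""
    # Forming A^T A costs O(n^2 m) for A of shape (m, n); the n x n solve is O(n^3).
    return np.linalg.solve(A.T @ A, A.T @ b)

# Made-up data for illustration: m = 50 observations, n = 3 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
b = rng.standard_normal(50)

x_normal = least_squares_normal_equations(A, b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # library least-squares solver

# At the minimizer the gradient 2 A^T (A x - b) vanishes.
gradient = 2 * A.T @ (A @ x_normal - b)
```

Both solutions should agree up to rounding. When $A$ is ill-conditioned, a library solver such as `np.linalg.lstsq` is usually the safer choice than forming $A^{\top}A$ explicitly.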
Chapter 2: Duality and Optimality Conditions
- Lagrangian: $L(x,\lambda,v) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p v_i h_i(x)$
- Lagrange dual function: $L_D(\lambda,v) = \inf_{x\in X} L(x,\lambda,v) = \inf_{x\in X}\left[ f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p v_i h_i(x) \right]$
- Lagrange dual problem: $\max_{\lambda,v}\; L_D(\lambda,v) \quad \text{s.t. } \lambda \ge 0$
- Weak duality: $d^* \le p^*$, i.e. $\min_{x\in X}\max_{\lambda\ge 0,\,v} L(x,\lambda,v) \ge \max_{\lambda\ge 0,\,v}\min_{x\in X} L(x,\lambda,v)$
- Strong duality: $d^* = p^*$, i.e. $\min_{x\in X}\max_{\lambda\ge 0,\,v} L(x,\lambda,v) = \max_{\lambda\ge 0,\,v}\min_{x\in X} L(x,\lambda,v)$
- Optimal duality gap: $p^* - d^*$
- Duality gap: $f_0(x) - L_D(\lambda,v)$, for a primal feasible $x$ and a dual feasible $(\lambda,v)$
- Slater's constraint qualification: if there exists an interior point $x_0$ of the set $X$ such that $f_i(x_0) < 0\ (i=1,2,\cdots,m)$ and $Ax_0 = b$, then strong duality holds for the convex optimization problem.
- Solving the primal problem through the dual: when the optimal duality gap is 0 and the dual problem is easier to solve than the primal, first solve the dual problem to obtain $(\lambda^*,v^*)$, then solve $x^* = \arg\min_{x\in X} L(x,\lambda^*,v^*) = \arg\min_{x\in X}\left[ f_0(x) + \sum_{i=1}^m \lambda_i^* f_i(x) + \sum_{i=1}^p v_i^* h_i(x) \right]$
- KKT conditions
$$
\begin{cases}
\nabla f_0(x^*) + \sum_{i=1}^m \lambda_i^* \nabla f_i(x^*) + \sum_{i=1}^p v_i^* \nabla h_i(x^*) = 0 \\
f_i(x^*) \le 0, \quad i=1,2,\cdots,m \\
\lambda_i^* f_i(x^*) = 0, \quad i=1,2,\cdots,m \quad \text{(complementary slackness)} \\
\lambda_i^* \ge 0, \quad i=1,2,\cdots,m \\
h_i(x^*) = 0, \quad i=1,2,\cdots,p
\end{cases}
$$
- Complementary slackness: $\lambda_i^* f_i(x^*) = 0,\ i=1,2,\cdots,m$; that is, for each $i$ either $\lambda_i^* = 0$ or $f_i(x^*) = 0$
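As a small worked illustration of the dual function, strong duality and the KKT conditions, consider the toy problem $\min x^2$ s.t. $1 - x \le 0$ (chosen here for illustration, not taken from the notes). Its Lagrangian $x^2 + \lambda(1-x)$ is minimized at $x = \lambda/2$, so the dual function has a closed form that the sketch below maximizes on a grid.

```python
import numpy as np

def f0(x):
    return x ** 2          # objective

def f1(x):
    return 1.0 - x         # inequality constraint f1(x) <= 0

def dual_function(lam):
    # L(x, lam) = x^2 + lam * (1 - x) is minimized over x at x = lam / 2.
    x_min = lam / 2.0
    return f0(x_min) + lam * f1(x_min)

# Maximize the dual function over a grid of lam >= 0.
lams = np.linspace(0.0, 4.0, 401)
values = dual_function(lams)
lam_star = lams[np.argmax(values)]
d_star = values.max()

x_star = lam_star / 2.0    # minimizer of L(x, lam_star)
p_star = f0(1.0)           # the primal optimum is attained at x = 1

stationarity = 2 * x_star - lam_star         # gradient of the Lagrangian in x
complementary_slackness = lam_star * f1(x_star)
```

Here the grid search should return $\lambda^* = 2$ and $d^* = p^* = 1$, and both `stationarity` and `complementary_slackness` evaluate to zero, in line with the KKT conditions.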
Chapter 3: Unconstrained Optimization Problems
- Optimality condition: the KKT conditions, which for an unconstrained problem reduce to $\nabla f(x^*) = 0$
- Suboptimality condition (a conceptual stopping criterion): $f(x) - p^* \le \frac{1}{2m}\|\nabla f(x)\|_2^2$
- Strong convexity assumption: there exists $m > 0$ such that $\nabla^2 f(x) \succeq mI$ for all $x \in S$
- Upper bound on the Hessian: there exists $M > 0$ such that $\nabla^2 f(x) \preceq MI$ for all $x \in S$
- General descent algorithm
① Given a starting point $x^0 \in \operatorname{dom} f$, set $k = 0$
② Stopping condition: stop if $\|\nabla f(x^k)\| \le \varepsilon$
③ Determine a descent direction $d^k$ at $x^k$, i.e. a direction for which there exists $\hat{t} > 0$ such that $f(x^k + td^k) < f(x^k)$ for all $t \in (0,\hat{t})$
④ Line search: choose a step size $t^k > 0$ such that $f(x^k + t^kd^k) < f(x^k)$
⑤ Update $x^{k+1} = x^k + t^kd^k$, $k = k+1$, and return to ②
- Exact line search: $t = \arg\min_{s\ge 0} f(x + sd)$
- Backtracking line search
① Given $d^k$ and parameters $0 < \alpha < 0.5$, $0 < \beta < 1$
② Set $t = 1$
③ While $f(x + td^k) > f(x) + \alpha t \nabla f(x)^{\top}d^k$, set $t := \beta t$; end
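The backtracking rule above can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function name, the default values of `alpha` and `beta`, and the quadratic test function are choices made here, not part of the notes.

```python
import numpy as np

def backtracking_line_search(f, grad_fx, x, d, alpha=0.3, beta=0.8):
    """Shrink t from 1 until f(x + t d) <= f(x) + alpha * t * grad_fx^T d."""
    t = 1.0
    fx = f(x)
    slope = grad_fx @ d   # directional derivative; negative for a descent direction
    while f(x + t * d) > fx + alpha * t * slope:
        t *= beta
    return t

# Illustrative quadratic f(x) = 0.5 x^T Q x with the negative gradient as direction.
Q = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

x = np.array([1.0, 1.0])
d = -grad(x)
t = backtracking_line_search(f, grad(x), x, d)
```

The loop terminates for any descent direction, because the sufficient-decrease condition always holds once $t$ is small enough.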
- Gradient descent method
① Given a starting point $x^0 \in \operatorname{dom} f$, set $k = 0$
② Check for termination: stop if $\|\nabla f(x^k)\| \le \varepsilon$
③ $d^k = -\nabla f(x^k)$
④ Line search: choose $t^k > 0$ such that $f(x^k + t^kd^k) < f(x^k)$
⑤ Update $x^{k+1} = x^k + t^kd^k$, $k = k+1$, and return to ②
- Descent direction: $d^k = -\nabla f(x^k)$
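The gradient descent steps above, combined with backtracking line search, might look as follows. This is a sketch under stated assumptions: NumPy is available, and the function name, default parameters and the quadratic test problem are illustrative only.

```python
import numpy as np

def gradient_descent(f, grad, x0, eps=1e-6, alpha=0.3, beta=0.8, max_iter=10_000):
    """Gradient descent with backtracking line search; returns (x, iterations)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:      # stopping condition on the gradient norm
            return x, k
        d = -g                            # descent direction
        t = 1.0
        while f(x + t * d) > f(x) + alpha * t * (g @ d):
            t *= beta                     # backtrack until sufficient decrease
        x = x + t * d
    return x, max_iter

# Illustrative strongly convex quadratic: f(x) = 0.5 x^T Q x - c^T x.
Q = np.diag([1.0, 10.0])
c = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ Q @ x - c @ x
grad = lambda x: Q @ x - c

x_min, iterations = gradient_descent(f, grad, x0=[5.0, 5.0])
x_exact = np.linalg.solve(Q, c)           # closed-form minimizer for comparison
```

For this quadratic, $m = 1$ and $M = 10$ are the extreme eigenvalues of `Q`, so the ratio $M/m$ governs how many iterations are needed.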
- Convergence
$$
\begin{cases}
f(x^k) - f(x^{k+1}) \ge \frac{1}{2M}\|\nabla f(x^k)\|_2^2 \\
f(x) - p^* \le \frac{1}{2m}\|\nabla f(x)\|_2^2
\end{cases}
\;\Rightarrow\; f(x^K) - p^* \le \left(1 - \frac{m}{M}\right)^K \left(f(x^0) - p^*\right)
$$
Hence $f(x^K) - p^* \le \varepsilon$ is guaranteed after at most $K = \frac{\log\left((f(x^0) - p^*)/\varepsilon\right)}{\log(1/c)}$ iterations.
With backtracking line search, $c = 1 - \min\left\{2m\alpha, \frac{2\beta\alpha m}{M}\right\}$
With exact line search, $c = 1 - \frac{m}{M}$
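The step from the two inequalities to the linear rate can be filled in as follows. This is a short derivation for the exact line search case, using only the descent bound and the suboptimality bound stated above:

$$
\begin{aligned}
f(x^{k+1}) - p^* &\le f(x^k) - p^* - \frac{1}{2M}\|\nabla f(x^k)\|_2^2 \\
&\le f(x^k) - p^* - \frac{m}{M}\left(f(x^k) - p^*\right) = \left(1 - \frac{m}{M}\right)\left(f(x^k) - p^*\right)
\end{aligned}
$$

The second line uses $\|\nabla f(x^k)\|_2^2 \ge 2m\left(f(x^k) - p^*\right)$. Applying the bound $K$ times gives $f(x^K) - p^* \le c^K\left(f(x^0) - p^*\right)$ with $c = 1 - m/M$, and requiring $c^K\left(f(x^0) - p^*\right) \le \varepsilon$ yields the iteration count $\log\left((f(x^0) - p^*)/\varepsilon\right)/\log(1/c)$.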
- Normalized steepest descent method
① Given a starting point $x^0 \in \operatorname{dom} f$, set $k = 0$
② Check: stop if $\|\nabla f(x^k)\| \le \varepsilon$
③ $d_{nsd}^k = \arg\min\left\{\nabla f(x^k)^{\top}d : \|d\| = 1\right\}$, and set $d^k = d_{nsd}^k$
④ Line search: choose $t^k > 0$ such that $f(x^k + t^kd^k) < f(x^k)$
⑤ Update $x^{k+1} = x^k + t^kd^k$, $k = k+1$, and return to ②
- Descent direction: $d_{nsd}^k = \arg\min\left\{\nabla f(x^k)^{\top}d : \|d\| = 1\right\}$
- Unnormalized steepest descent method, descent direction: $d_{sd}^k = \|\nabla f(x^k)\|_*\, d_{nsd}^k$, where $d_{nsd}^k = \arg\min\left\{\nabla f(x^k)^{\top}d : \|d\| = 1\right\}$
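The steepest descent direction depends on the norm used in its definition. The sketch below, assuming NumPy, evaluates the normalized and unnormalized directions in closed form for three common norms; the function names and the sample gradient are made up for illustration.

```python
import numpy as np

def steepest_descent_l2(g):
    """Euclidean norm: the dual norm is also the l2 norm, so d_sd = -g."""
    return -g / np.linalg.norm(g), -g

def steepest_descent_quadratic(g, P):
    """Quadratic norm ||.||_P: the dual norm is (g^T P^{-1} g)^{1/2}."""
    Pinv_g = np.linalg.solve(P, g)
    dual_norm = np.sqrt(g @ Pinv_g)
    return -Pinv_g / dual_norm, -Pinv_g

def steepest_descent_l1(g):
    """l1 norm: move along the coordinate with the largest |g_i|; dual norm is l-infinity."""
    i = np.argmax(np.abs(g))
    d_nsd = np.zeros_like(g, dtype=float)
    d_nsd[i] = -np.sign(g[i])
    return d_nsd, np.abs(g).max() * d_nsd

g = np.array([3.0, -4.0])        # sample gradient
P = np.diag([2.0, 8.0])          # sample positive definite matrix

nsd_l2, sd_l2 = steepest_descent_l2(g)
nsd_P, sd_P = steepest_descent_quadratic(g, P)
nsd_l1, sd_l1 = steepest_descent_l1(g)
```

In each case the unnormalized direction equals the dual norm of the gradient times the normalized direction. With $P = \nabla^2 f(x)$, the quadratic-norm direction coincides with the Newton step.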
- Convergence of the steepest descent method
$$
\begin{cases}
f(x^{k+1}) \le f(x^k) - \alpha\tilde{\gamma}^2 \min\left\{1, \frac{\beta\gamma^2}{M}\right\}\|\nabla f(x^k)\|_2^2 \\
f(x) - p^* \le \frac{1}{2m}\|\nabla f(x)\|_2^2
\end{cases}
$$
$$
K \le \frac{\log\left((f(x^0) - p^*)/\varepsilon\right)}{\log(1/c)}, \quad c = 1 - 2m\alpha\tilde{\gamma}^2\min\left\{1, \frac{\beta\gamma^2}{M}\right\} < 1
$$
where $\gamma, \tilde{\gamma} \in (0,1]$ are constants such that $\|x\| \ge \gamma\|x\|_2$ and $\|x\|_* \ge \tilde{\gamma}\|x\|_2$.
- Three interpretations of the Newton step
① Minimizer of the second-order approximation: $d_{nt}^k = \arg\min_v \hat{f}(x^k + v)$, where $\hat{f}(x^k + v) = f(x^k) + \nabla f(x^k)^{\top}v + \frac{1}{2}v^{\top}\nabla^2 f(x^k)v$
② Linearized optimality condition: $\nabla f(x^k + v) \approx \nabla f(x^k) + \nabla^2 f(x^k)v = 0 \;\Rightarrow\; v = d_{nt}^k$
③ Steepest descent direction in the Hessian norm: $d_{nt}^k = d_{sd}^k = \|\nabla f(x^k)\|_* \, d_{nsd}^k = \|\nabla f(x^k)\|_* \, \arg\min_d\left\{\nabla f(x^k)^{\top}d : \|d\|_{\nabla^2 f(x^k)} = 1\right\}$
- Newton's method with backtracking line search
① Given a starting point $x^0 \in \operatorname{dom} f$ and a tolerance $\varepsilon > 0$, set $k = 0$
② Compute the Newton step and decrement: $d_{nt}^k = -\nabla^2 f(x^k)^{-1}\nabla f(x^k)$; $\lambda(x^k)^2 = (d_{nt}^k)^{\top}\nabla^2 f(x^k)\, d_{nt}^k$
③ Stopping criterion: exit if $\frac{1}{2}\lambda(x^k)^2 \le \varepsilon$
④ Backtracking line search: with $0 < \alpha < 0.5$, $0 < \beta < 1$, start from $t^k = 1$; while $f(x^k + t^kd_{nt}^k) > f(x^k) - \alpha t^k\lambda(x^k)^2$, set $t^k := \beta t^k$
⑤ $x^{k+1} = x^k + t^kd_{nt}^k$, $k = k+1$, and return to ②
- Newton step: $d_{nt}^k = -\nabla^2 f(x^k)^{-1}\nabla f(x^k)$
- Newton decrement: $\lambda(x^k)^2 = (d_{nt}^k)^{\top}\nabla^2 f(x^k)\, d_{nt}^k$, equivalently $\lambda(x^k) = \left(\nabla f(x^k)^{\top}\nabla^2 f(x^k)^{-1}\nabla f(x^k)\right)^{\frac{1}{2}}$
- Stopping criterion: $\frac{1}{2}\lambda(x^k)^2 \le \varepsilon$
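Newton's method with backtracking, using the Newton step, decrement and stopping criterion listed above, can be sketched as follows. NumPy is assumed; the function name, parameter defaults and the test function (a sum of exponentials of affine functions) are illustrative choices.

```python
import numpy as np

def newton_method(f, grad, hess, x0, eps=1e-10, alpha=0.3, beta=0.8, max_iter=100):
    """Damped Newton method; returns (x, iterations)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g, H = grad(x), hess(x)
        d = -np.linalg.solve(H, g)        # Newton step
        lam_sq = d @ H @ d                # squared Newton decrement
        if lam_sq / 2.0 <= eps:           # stopping criterion
            return x, k
        t = 1.0
        while f(x + t * d) > f(x) - alpha * t * lam_sq:
            t *= beta                     # backtracking line search
        x = x + t * d
    return x, max_iter

# Illustrative smooth convex function: f(x) = sum_i exp(a_i^T x - 0.1).
A = np.array([[1.0, 3.0], [1.0, -3.0], [-1.0, 0.0]])
f = lambda x: np.exp(A @ x - 0.1).sum()
grad = lambda x: A.T @ np.exp(A @ x - 0.1)
hess = lambda x: A.T @ (np.exp(A @ x - 0.1)[:, None] * A)

x_min, iterations = newton_method(f, grad, hess, x0=[-1.0, 1.0])
x_exact = np.array([-0.5 * np.log(2.0), 0.0])   # from setting the gradient to zero
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse Hessian explicitly.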
- Convergence
① Damped Newton phase: at most $\frac{f(x^0) - p^*}{\gamma}$ iterations
② Quadratically convergent phase: at most $\log_2\left(\log_2\left(\frac{\epsilon_0}{\epsilon}\right)\right) \approx 6$ iterations
- Maximum number of iterations: $K \le 6 + \frac{M^2L^2/m^5}{\alpha\beta\min\{1, 9(1-2\alpha)^2\}}\left(f(x^0) - p^*\right)$, where $L$ is the Lipschitz constant of the Hessian
Chapter 4: Equality-Constrained Optimization Problems
- Dual method (construct the dual function)
$\min_x\; f(x) + (v^*)^{\top}(Ax - b)$, with optimality condition $\nabla f(x) + A^{\top}v^* = 0$
$\max_v\; -b^{\top}v - f^*(-A^{\top}v)$
- Elimination method (affine parameterization of the feasible set)
$\min_{z\in\mathbb{R}^{n-p}} \tilde{f}(z) = f(Fz + \hat{x})$
Its Newton step and decrement: $d_x = Fd_z$, $\tilde{\lambda}(z) = \lambda(x)$ with $x = Fz + \hat{x}$
- Newton's method with equality constraints: feasible starting point or infeasible starting point
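The elimination method above can be sketched numerically. The example assumes NumPy, builds $F$ from the null space of $A$ via the SVD and $\hat{x}$ from a least-squares solve, and uses a made-up quadratic objective $f(x) = \frac{1}{2}x^{\top}Px + q^{\top}x$ so that the reduced problem has a closed-form solution; the helper name is hypothetical.

```python
import numpy as np

def affine_parameterization(A, b):
    """Return (F, x_hat) such that {x : A x = b} = {F z + x_hat}."""
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)   # one particular solution
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12))
    F = Vt[rank:].T                                 # columns span the null space of A
    return F, x_hat

# Made-up problem data.
P = np.diag([1.0, 2.0, 3.0])
q = np.array([-1.0, 0.0, 1.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])

# Reduced unconstrained problem in z: minimize f(F z + x_hat).
F, x_hat = affine_parameterization(A, b)
z = np.linalg.solve(F.T @ P @ F, -F.T @ (P @ x_hat + q))
x_elim = F @ z + x_hat

# Reference solution from the KKT system [[P, A^T], [A, 0]] [x; v] = [-q; b].
n, p = P.shape[0], A.shape[0]
KKT = np.block([[P, A.T], [A, np.zeros((p, p))]])
x_kkt = np.linalg.solve(KKT, np.concatenate([-q, b]))[:n]
```

The two solutions should agree, and `x_elim` satisfies $Ax = b$ by construction because every column of `F` lies in the null space of `A`.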
- Newton's method with a feasible starting point
① Newton step: $\begin{bmatrix} \nabla^2 f(x^k) & A^{\top} \\ A & 0 \end{bmatrix}\begin{bmatrix} d_x^k \\ w \end{bmatrix} = \begin{bmatrix} -\nabla f(x^k) \\ 0 \end{bmatrix}$
② Newton decrement: $\lambda(x^k)^2 = (d_x^k)^{\top}\nabla^2 f(x^k)\, d_x^k$
- Newton's method with backtracking line search (feasible starting point, equality constraints)
① Feasible starting point $x^0 \in \operatorname{dom} f$ with $Ax^0 = b$, tolerance $\varepsilon > 0$, $k = 0$
② Compute the Newton direction and the Newton decrement:
$\begin{bmatrix} \nabla^2 f(x^k) & A^{\top} \\ A & 0 \end{bmatrix}\begin{bmatrix} d_x^k \\ w \end{bmatrix} = \begin{bmatrix} -\nabla f(x^k) \\ 0 \end{bmatrix}$; $\lambda(x^k)^2 = (d_x^k)^{\top}\nabla^2 f(x^k)\, d_x^k$
③ Stopping criterion: exit if $\frac{1}{2}\lambda(x^k)^2 \le \varepsilon$
④ Backtracking line search: with $0 < \alpha < 0.5$, $0 < \beta < 1$, start from $t^k = 1$; while $f(x^k + t^kd_x^k) > f(x^k) - \alpha t^k\lambda(x^k)^2$, set $t^k := \beta t^k$; end
⑤ $x^{k+1} = x^k + t^kd_x^k$, $k = k+1$, and return to ②
- Newton's method with an infeasible starting point
① Primal-dual residual: $\|r(x^k,v^k)\|_2 = \left\| \begin{bmatrix} \nabla f(x^k) + A^{\top}v^k \\ Ax^k - b \end{bmatrix} \right\|_2$
② Stopping criterion: $Ax^k = b$ and $\|r(x^k,v^k)\|_2 \le \epsilon$
- Newton's method with backtracking line search (infeasible starting point)
① Starting point $x^0 \in \operatorname{dom} f$ (not necessarily feasible), initial $v^0$, tolerance $\epsilon > 0$, $k = 0$
② Stopping criterion: exit when $Ax^k = b$ and $\|r(x^k,v^k)\|_2 = \left\| \begin{bmatrix} \nabla f(x^k) + A^{\top}v^k \\ Ax^k - b \end{bmatrix} \right\|_2 \le \epsilon$; otherwise
③ Compute the primal-dual Newton direction: $\begin{bmatrix} \nabla^2 f(x^k) & A^{\top} \\ A & 0 \end{bmatrix}\begin{bmatrix} d_x^k \\ d_v^k \end{bmatrix} = -\begin{bmatrix} \nabla f(x^k) + A^{\top}v^k \\ Ax^k - b \end{bmatrix}$
④ Backtrack on the primal-dual residual norm $\|r\|_2$ to choose the step size $t^k$: with $0 < \alpha < 0.5$, $0 < \beta < 1$, start from $t^k = 1$; while $\|r(x^k + t^kd_x^k, v^k + t^kd_v^k)\|_2 > (1 - \alpha t^k)\|r(x^k,v^k)\|_2$, set $t^k := \beta t^k$; end
⑤ $x^{k+1} = x^k + t^kd_x^k$, $v^{k+1} = v^k + t^kd_v^k$, $k = k+1$, and return to ②
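The infeasible start Newton method above can be sketched as follows. NumPy is assumed; the function name, the defaults, and the test problem ($\min \sum_i e^{x_i}$ s.t. $x_1 + x_2 + x_3 = 3$, whose solution is $x = (1,1,1)$, $v = -e$) are illustrative. The sketch folds the feasibility check into the residual norm, since exact equality $Ax = b$ is rarely attainable in floating point, and it assumes $\operatorname{dom} f$ is all of $\mathbb{R}^n$ so the line search needs no domain check.

```python
import numpy as np

def infeasible_start_newton(grad, hess, A, b, x0, v0,
                            eps=1e-8, alpha=0.3, beta=0.8, max_iter=50):
    """Newton's method on the primal-dual residual; returns (x, v, iterations)."""
    x = np.asarray(x0, dtype=float)
    v = np.asarray(v0, dtype=float)
    p, n = A.shape

    def residual(x, v):
        # Stacked dual residual and primal residual.
        return np.concatenate([grad(x) + A.T @ v, A @ x - b])

    for k in range(max_iter):
        r = residual(x, v)
        if np.linalg.norm(r) <= eps:
            return x, v, k
        KKT = np.block([[hess(x), A.T], [A, np.zeros((p, p))]])
        step = np.linalg.solve(KKT, -r)
        dx, dv = step[:n], step[n:]
        t = 1.0
        while (np.linalg.norm(residual(x + t * dx, v + t * dv))
               > (1 - alpha * t) * np.linalg.norm(r)):
            t *= beta                     # backtrack on the residual norm
        x, v = x + t * dx, v + t * dv
    return x, v, max_iter

# Illustrative problem: minimize sum(exp(x)) subject to x1 + x2 + x3 = 3.
grad = lambda x: np.exp(x)
hess = lambda x: np.diag(np.exp(x))
A = np.ones((1, 3))
b = np.array([3.0])

x_opt, v_opt, iterations = infeasible_start_newton(
    grad, hess, A, b, x0=np.zeros(3), v0=np.zeros(1))
```

The starting point `x0 = 0` violates the equality constraint; once a full step $t = 1$ is taken, the iterates become feasible and stay feasible.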
Chapter 5: Optimization Problems with Equality and Inequality Constraints
- Approximate indicator function: $\hat{I}_-(u) = -\frac{1}{t}\log(-u)$; the larger $t$ is, the more accurate the approximation, and the closer the function gets to the right-angle shape of the true indicator
- Logarithmic barrier function: $\phi(x) = -\sum_{i=1}^m \log(-f_i(x))$
- Duality gap at the central point $x^*(t)$: $\frac{m}{t}$
- From the weak/strong duality relations: $f_0(x^*(t)) - p^* \le \frac{m}{t}$
- Barrier method for convex problems with equality and inequality constraints:
Given a strictly feasible starting point $x$, $t = t^0 > 0$, $\mu > 1$, $\epsilon > 0$, set $k = 0$
① Centering step: starting from $x$, solve $x^*(t) = \arg\min\; tf_0(x) + \phi(x) \quad \text{s.t. } Ax = b$
② Update: $x := x^*(t)$
③ Stopping criterion: exit if $\frac{m}{t} \le \epsilon$; otherwise
④ Increase $t$: $t = \mu t$, and return to step ①
- Choice of parameters
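① When $\mu$ is small, each outer iteration reduces the duality gap only slightly, so more outer iterations are needed; but each one produces a good starting point for Newton's method, so fewer inner iterations are needed. When $\mu$ is large, each outer iteration reduces the duality gap substantially and fewer outer iterations are needed; but such an "overly aggressive" update may produce a poor starting point for Newton's method, so the number of inner iterations increases.
② If $t^0$ is too large, the first outer iteration (the first centering step) requires many inner iterations; if $t^0$ is too small, the algorithm performs extra outer iterations, and the first centering step may still require many iterations.
- Central path: interpretation via the KKT conditions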
A central point $x^*(t)$ on the central path satisfies $Ax^*(t) = b$, $f_i(x^*(t)) < 0$, and there exists $v \in \mathbb{R}^p$ such that $t\nabla f_0(x^*(t)) + \sum_{i=1}^m \frac{1}{-f_i(x^*(t))}\nabla f_i(x^*(t)) + A^{\top}v = 0$
- Modified KKT equations
$Ax = b$, $f_i(x) \le 0\ (i=1,2,\cdots,m)$
$\lambda \ge 0$, $-\lambda_i f_i(x) = \frac{1}{t}\ (i=1,2,\cdots,m)$
$\nabla f_0(x) + \sum_{i=1}^m \lambda_i \nabla f_i(x) + A^{\top}v = 0$
- Change in complementary slackness: $-\lambda_i f_i(x) = \frac{1}{t}$
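The $m/t$ gap and the modified complementary slackness condition both follow from reading dual variables off the central path. A short derivation, assuming a convex problem and starting from the centrality condition with multiplier $v$; the scaled quantities $\lambda_i^*(t)$ and $v^*(t)$ are names introduced here for illustration. Dividing the centrality condition by $t$ gives

$$
\nabla f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t)\nabla f_i(x^*(t)) + A^{\top}v^*(t) = 0, \quad \lambda_i^*(t) = \frac{1}{-t f_i(x^*(t))} > 0, \quad v^*(t) = \frac{v}{t}
$$

so $x^*(t)$ minimizes the Lagrangian $L(x, \lambda^*(t), v^*(t))$, and the dual function evaluates to

$$
L_D(\lambda^*(t), v^*(t)) = f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t) f_i(x^*(t)) + v^*(t)^{\top}\left(Ax^*(t) - b\right) = f_0(x^*(t)) - \frac{m}{t}
$$

Since $L_D(\lambda^*(t), v^*(t)) \le p^*$, this gives $f_0(x^*(t)) - p^* \le \frac{m}{t}$, and by construction $-\lambda_i^*(t) f_i(x^*(t)) = \frac{1}{t}$.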
- Relation between the Newton step for the modified KKT equations and the Newton step for the centering problem
$$
\begin{bmatrix} t\nabla^2 f_0(x) + \nabla^2\phi(x) & A^{\top} \\ A & 0 \end{bmatrix}\begin{bmatrix} d_x \\ d_v \end{bmatrix} = -\begin{bmatrix} t\nabla f_0(x) + \nabla\phi(x) \\ 0 \end{bmatrix} \iff \begin{cases} tHd_x + A^{\top}d_v = -tg \\ Ad_x = 0 \end{cases}
$$
where $H = \nabla^2 f_0(x) + \frac{1}{t}\nabla^2\phi(x)$ and $g = \nabla f_0(x) + \frac{1}{t}\nabla\phi(x)$. With the substitution
$$
\begin{cases} u = d_x \\ v = \frac{1}{t}d_v \end{cases}
$$
this becomes $Hu + A^{\top}v = -g$, $Au = 0$.
- Solving with the barrier method
① Initialization: phase I method
Find $x$ satisfying $f_i(x) < 0\ (i=1,2,\cdots,m)$ and $Ax = b$
Set $t = t^0 > 0$, $\mu > 1$, $\epsilon > 0$
② Centering step: minimize the log-barrier objective
Starting from $x$, solve the approximating equality-constrained problem for the current $t$:
$x^*(t) = \arg\min\; tf_0(x) + \phi(x) \quad \text{s.t. } Ax = b$
③ Stop or iterate:
Exit if $\frac{m}{t} \le \epsilon$; otherwise set $x = x^*(t)$, $t = \mu t$, and continue iterating
- Phase I method (three cases)
① $x$ is in the common domain and satisfies the equality constraints
② $x$ is in the common domain but does not satisfy the equality constraints
The original problem is usually written as:
$\min_{s,\,x}\; f_0(x)$
$\text{s.t. } f_i(x) \le s\ (i=1,2,\cdots,m)$
$Ax = b$, $s = 0$
which can be converted to:
$\min_{s,\,x}\; t^0f_0(x) - \sum_{i=1}^m \log(s - f_i(x))$
$\text{s.t. } Ax = b$, $s = 0$
③ $x$ is not in the common domain
The original problem is usually written as:
$\min_{s,\,x,\,z_0,\cdots,z_m}\; f_0(x + z_0)$
$\text{s.t. } f_i(x + z_i) \le s\ (i=1,2,\cdots,m)$
$Ax = b$, $s = 0$, $z_0 = 0, \cdots, z_m = 0$
which can be converted to:
$\min_{s,\,x,\,z_0,\cdots,z_m}\; t^0f_0(x + z_0) - \sum_{i=1}^m \log(s - f_i(x + z_i))$
$\text{s.t. } Ax = b$, $s = 0$,
$z_0 = 0, \cdots, z_m = 0$
- Primal-dual interior-point method
① Surrogate duality gap: $\eta(x,\lambda) = -f(x)^{\top}\lambda$
② Primal-dual search direction:
$$
\begin{bmatrix} \nabla^2 f_0(x) + \sum_{i=1}^m \lambda_i\nabla^2 f_i(x) & Df(x)^{\top} & A^{\top} \\ -\operatorname{diag}(\lambda)Df(x) & -\operatorname{diag}(f(x)) & 0 \\ A & 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta x_{pd} \\ \Delta\lambda_{pd} \\ \Delta v_{pd} \end{bmatrix} = -\begin{bmatrix} r_{dual} \\ r_{cent} \\ r_{pri} \end{bmatrix} = -\begin{bmatrix} \nabla f_0(x) + Df(x)^{\top}\lambda + A^{\top}v \\ -\operatorname{diag}(\lambda)f(x) - \frac{1}{t}\mathbf{1} \\ Ax - b \end{bmatrix}
$$
③ Stopping criterion: $\|r_{pri}\|_2 \le \epsilon_{feas}$, $\|r_{dual}\|_2 \le \epsilon_{feas}$, $\eta(x,\lambda) = -f(x)^{\top}\lambda \le \epsilon$
- Primal-dual interior-point algorithm
① Initialization: choose $x$ satisfying $f_i(x) < 0\ (i=1,2,\cdots,m)$, and $\lambda > 0$, $\mu > 1$, $\epsilon_{feas} > 0$, $\epsilon > 0$
② Repeat the basic steps
I. Determine $t$: set $t = \frac{\mu m}{\eta(x,\lambda)} = \frac{\mu m}{-f(x)^{\top}\lambda}$, i.e. $\mu$ times the value of $t$ that corresponds to the current surrogate duality gap $\eta(x,\lambda) = -f(x)^{\top}\lambda$
II. Compute the primal-dual search direction $\Delta y_{pd} = (\Delta x_{pd}, \Delta\lambda_{pd}, \Delta v_{pd})$
III. Perform a line search aimed at reducing $\|r_t(y + s\Delta y_{pd})\|_2$ to determine a step size $s > 0$, and set $y = y + s\Delta y_{pd}$
③ Stopping criterion: $\|r_{pri}\|_2 \le \epsilon_{feas}$, $\|r_{dual}\|_2 \le \epsilon_{feas}$, $\eta(x,\lambda) = -f(x)^{\top}\lambda \le \epsilon$
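The primal-dual interior-point steps above can be sketched for a simplified setting. The sketch assumes NumPy and restricts itself to a linear program $\min c^{\top}x$ s.t. $Gx \le h$ with no equality constraints, so that $f(x) = Gx - h$, $Df(x) = G$, the Hessian block is zero, and $r_{pri}$ drops out. The function name, the defaults, the line-search details and the box-constrained test problem are illustrative assumptions, not part of the notes.

```python
import numpy as np

def primal_dual_lp(c, G, h, x0, mu=10.0, eps=1e-8, eps_feas=1e-8,
                   alpha=0.01, beta=0.5, max_iter=100):
    """Primal-dual interior-point sketch for: minimize c^T x  s.t.  G x <= h."""
    x = np.asarray(x0, dtype=float)   # must satisfy G x < h strictly
    m, n = G.shape
    lam = -1.0 / (G @ x - h)          # one possible choice of lam > 0

    def residual(x, lam, t):
        f = G @ x - h
        return np.concatenate([c + G.T @ lam, -lam * f - 1.0 / t])

    for k in range(max_iter):
        f = G @ x - h
        eta = -f @ lam                # surrogate duality gap
        if np.linalg.norm(c + G.T @ lam) <= eps_feas and eta <= eps:
            return x, lam, k
        t = mu * m / eta
        r = residual(x, lam, t)
        # Jacobian of the residual (zero Hessian block for an LP).
        J = np.block([[np.zeros((n, n)), G.T],
                      [-lam[:, None] * G, -np.diag(f)]])
        step = np.linalg.solve(J, -r)
        dx, dlam = step[:n], step[n:]
        # Largest step keeping lam > 0, scaled by 0.99.
        neg = dlam < 0
        s = 0.99 * min(1.0, np.min(-lam[neg] / dlam[neg])) if neg.any() else 0.99
        while np.any(G @ (x + s * dx) - h >= 0):
            s *= beta                 # keep x strictly feasible
        while (np.linalg.norm(residual(x + s * dx, lam + s * dlam, t))
               > (1 - alpha * s) * np.linalg.norm(r)):
            s *= beta                 # require a decrease of the residual norm
        x, lam = x + s * dx, lam + s * dlam
    return x, lam, max_iter

# Illustrative LP: minimize -x1 - 2 x2 over the unit box 0 <= x <= 1.
c = np.array([-1.0, -2.0])
G = np.vstack([np.eye(2), -np.eye(2)])
h = np.array([1.0, 1.0, 0.0, 0.0])

x_opt, lam_opt, iterations = primal_dual_lp(c, G, h, x0=[0.5, 0.5])
```

For this test problem the optimum is the corner $x = (1, 1)$. Unlike the barrier method, there is no separate inner centering loop: $t$ is updated from the surrogate gap at every iteration.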