ADMM算法

ADMM

对偶下降

m i n i m i z e    f ( x ) s u b j e c t    t o    A x = b minimize\ \ f(x) \\ subject\ \ to \ \ Ax=b minimize  f(x)subject  to  Ax=b

其中, x ∈ R n x\in R^n xRn A ∈ R m × n A\in R^{m×n} ARm×n f : R n → R f:R^n \to R f:RnR

该问题的拉格朗日式如下
L ( x , y ) = f ( x ) + y T ( A x − b ) L(x,y)=f(x)+y^T(Ax-b) L(x,y)=f(x)+yT(Axb)
对偶方程如下:
g ( y ) = * ⁡ i n f x L ( x , y ) = − * ⁡ s u p x ∈ d o m f ( − y T A x − f ( x ) ) − b T y = − f ∗ ( − A T y ) − b T y g(y)=\operatorname*{inf}_{x}L(x,y)=-\operatorname*{sup}_{x\in domf}(-y^TAx-f(x))-b^Ty=-f^*(-A^Ty)-b^Ty g(y)=*infxL(x,y)=*supxdomf(yTAxf(x))bTy=f(ATy)bTy
为什么这样做呢?
min ⁡ x L ( x , y ) = d ∗ ≤ p ∗ = min ⁡ x f ( x ) + I ( A x − b ) w h e r e    I ( u ) = { 0 u=0 ∞ u ≠ 0 \min_{x}L(x,y)=d^*\le p^*=\min_{x} f(x)+I(Ax-b) \\ where \ \ I(u)=\begin{cases} 0 & \text {u=0}\\[2ex] \infty & \text {u} \neq {0} \end{cases} xminL(x,y)=dp=xminf(x)+I(Axb)where  I(u)=0u=0u̸=0
对偶问题最优化
m a x i m i z e    g ( y ) maximize \ \ g(y) maximize  g(y)
y ∗ = * ⁡ a r g m a x y    g ( y ) y*=\operatorname*{argmax}_{y}\ \ g(y) y=*argmaxy  g(y) x ∗ = a r g m i n x    L ( x , y ∗ ) x*=argmin_x\ \ L(x,y*) x=argminx  L(x,y)

如果强对偶成立,则 d ∗ = p ∗ d*=p* d=p x ∗ x* x是原问题的最优解

对偶下降算法步骤如下:

x k + 1 : = a r g m i n x L ( x , y k ) y k + 1 : = y k + α k ( A x k + 1 − b ) x^{k+1}:=argmin_xL(x,y^k) \\ y^{k+1}:=y^k+\alpha^k(Ax^{k+1}-b) xk+1:=argminxL(x,yk)yk+1:=yk+αk(Axk+1b)

先最小化 L ( x , y k ) L(x,y^k) L(x,yk),后最大化 L ( x k + 1 , y ) L(x^{k+1},y) L(xk+1,y)

特点:
  • α k \alpha^k αk选择合适,且满足一定条件下, x k x^k xk y k y^k yk收敛到最优点
  • 然而,对许多例子不成立。比如 f ( x ) = B x f(x)=Bx f(x)=Bx,在更新 x x x时不存在最小值

对偶分解

在这里插入图片描述

每次迭代需要广播和集合两个步骤: x x x更新可以独立进行, y y y更新要集中进行

增广拉格朗日乘子法

L ρ ( x , y ) = f ( x ) + y T ( A x − b ) + ( ρ / 2 ) ∥ A x − b ∥ 2 2 L_\rho(x,y)=f(x)+y^T(Ax-b)+(\rho /2)\left\lVert Ax-b \right\rVert_2^2 Lρ(x,y)=f(x)+yT(Axb)+(ρ/2)Axb22

这等价于如下问题原拉格朗日:
m i n i m i z e    f ( x ) + ( ρ / 2 ) ∥ A x − b ∥ 2 2 s u b j e c t   t o    A x = b minimize \ \ f(x)+(\rho /2)\left\lVert Ax-b \right\rVert_2^2 \\ subject\ to \ \ Ax=b minimize  f(x)+(ρ/2)Axb22subject to  Ax=b

算法步骤:

x k + 1 : = a r g m i n x    L ρ ( x , y k ) y k + 1 : = y k + ρ ( A x k + 1 − b ) x^{k+1}:=argmin_x\ \ L_\rho(x,y^k) \\ y^{k+1}:=y^k+\rho(Ax^{k+1}-b) xk+1:=argminx  Lρ(x,yk)yk+1:=yk+ρ(Axk+1b)

为什么选择 ρ \rho ρ作为迭代步长?

最优条件必须满足
A x ∗ − b = 0 ∇ f ( x ∗ ) + A T y ∗ = 0 Ax^*-b=0 \\ \nabla f(x^*)+A^Ty^*=0 Axb=0f(x)+ATy=0
根据定义, x k + 1 x^{k+1} xk+1最小化 L ρ ( x , y k ) L_\rho(x,y^k) Lρ(x,yk),因此
0 = ∇ x L ρ ( x k + 1 , y k ) = ∇ x f ( x k + 1 ) + A T ( y k + ρ ( A x k + 1 − b ) ) = ∇ x f ( x k + 1 ) + A T y k + 1 0 =\nabla_xL_\rho(x^{k+1},y^k)\\ =\nabla_xf(x^{k+1})+A^T(y^k+\rho(Ax^{k+1}-b))\\ = \nabla_xf(x^{k+1})+A^Ty^{k+1} 0=xLρ(xk+1,yk)=xf(xk+1)+AT(yk+ρ(Axk+1b))=xf(xk+1)+ATyk+1
随着迭代的进行,只需要原条件的残差收敛到0,即可

评价:

  • 大大增加了收敛性
  • 不能进行分解

ADMM算法

m i n i m i z e    f ( x ) + g ( z ) s u b j e c t   t o    A x + B z = c minimize \ \ f(x)+g(z) \\ subject\ to \ \ Ax+Bz=c minimize  f(x)+g(z)subject to  Ax+Bz=c

最优解表示如下:
p ∗ = i n f { f ( x ) + g ( z )   ∣   A x + B z = c } p^*=inf\{ f(x)+g(z)\ |\ Ax+Bz=c \} p=inf{f(x)+g(z)  Ax+Bz=c}
增广拉格朗日:
L ρ ( x , y ) = f ( x ) + g ( z ) + y T ( A x + B z − c ) + ( ρ / 2 ) ∥ A x + B z − c ∥ 2 2 L_\rho(x,y)=f(x)+g(z)+y^T(Ax+Bz-c)+(\rho /2)\left\lVert Ax+Bz-c \right\rVert_2^2 Lρ(x,y)=f(x)+g(z)+yT(Ax+Bzc)+(ρ/2)Ax+Bzc22
算法步骤:
x k + 1 : = * ⁡ a r g m i n x L ρ ( x , z k , y k ) z k + 1 : = * ⁡ a r g m i n z L ρ ( x k + 1 , z , y k ) y k + 1 : = y k + ρ ( A x k + 1 + B z k + 1 − c ) x^{k+1}:=\operatorname*{argmin}_{x}L_\rho(x,z^k,y^k)\\ z^{k+1}:=\operatorname*{argmin}_{z}L_\rho(x^{k+1},z,y^k)\\ y^{k+1}:=y^k+\rho (Ax^{k+1}+Bz^{k+1}-c) xk+1:=*argminxLρ(x,zk,yk)zk+1:=*argminzLρ(xk+1,z,yk)yk+1:=yk+ρ(Axk+1+Bzk+1c)

归一化形式:

r = A x + B z − c r=Ax+Bz-c r=Ax+Bzc,则
y T r + ( ρ / 2 ) ∥ r ∥ 2 2 = ( ρ / 2 ) ∥ r + u ∥ 2 2 − ( ρ / 2 ) ∥ u ∥ 2 2 y^Tr+(\rho/2)\left\lVert r \right\rVert_2^2 = (\rho/2)\left\lVert r+u \right\rVert_2^2-(\rho/2)\left\lVert u \right\rVert_2^2 yTr+(ρ/2)r22=(ρ/2)r+u22(ρ/2)u22
其中, u = ( 1 / ρ ) y u=(1/\rho)y u=(1/ρ)y是归一化对偶变量

归一化算法步骤如下:
x k + 1 = * ⁡ a r g m i n x ( f ( x ) + ( ρ / 2 ) ∥ A x + B z k − c + u k ∥ 2 2 ) z k + 1 = * ⁡ a r g m i n z ( g ( z ) + ( ρ / 2 ) ∥ A x k + 1 + B z − c + u k ∥ 2 2 ) u k + 1 = u k + A x k + 1 + B z k + 1 − c x^{k+1}=\operatorname*{argmin}_{x}(f(x)+(\rho/2)\left\lVert Ax+Bz^k-c+u^k \right\rVert_2^2)\\ z^{k+1}=\operatorname*{argmin}_{z}(g(z)+(\rho/2)\left\lVert Ax^{k+1}+Bz-c+u^k \right\rVert_2^2)\\ u^{k+1}=u^k+Ax^{k+1}+Bz^{k+1}-c xk+1=*argminx(f(x)+(ρ/2)Ax+Bzkc+uk22)zk+1=*argminz(g(z)+(ρ/2)Axk+1+Bzc+uk22)uk+1=uk+Axk+1+Bzk+1c
可见,
u k = u 0 + ∑ j = i k r j r k = A x k + B z k − c u^k=u^0+\sum_{j=i}^{k}r^j \\ r^k = Ax^k+Bz^k-c uk=u0+j=ikrjrk=Axk+Bzkc

收敛性
  • 假设1: e p i   f = { ( x , t ) ∈ R n × R   ∣   f ( x ) ≤ t } epi \ f=\{(x,t)\in R^n×R \ |\ f(x)\le t \} epi f={(x,t)Rn×R  f(x)t}是闭的且非空的凸集
  • 假设2:存在 ( x ∗ , z ∗ , y ∗ ) (x^*,z^*,y^*) (x,z,y)使得下式成立

L 0 ( x ∗ , z ∗ , y ) ≤ L 0 ( x ∗ , z ∗ , y ∗ ) ≤ L 0 ( x , z , y ∗ ) L_0(x^*,z^*,y)\le L_0(x^*,z^*,y^*) \le L_0(x,z,y^*) L0(x,z,y)L0(x,z,y)L0(x,z,y)

满足上述条件下,有

  • 残差收敛: r k → 0   a s   k → ∞ r^k \to 0 \ as \ k \to ∞ rk0 as k
  • 目标函数收敛: f ( x k ) + g ( z k ) → p ∗   a s   k → ∞ f(x^k)+g(z^k) \to p^* \ as \ k \to ∞ f(xk)+g(zk)p as k。其中, p ∗ p^* p是最优值
  • 对偶变量收敛: y k → y ∗   a s   k → ∞ y^k \to y^* \ as \ k \to ∞ yky as k。其中, y ∗ y^* y是对偶最优点

注意, x k , y k x^k,y^k xk,yk不一定收敛到最优值,仅在上述条件满足下。

最优条件
  • 原问题的可行性

A x ∗ + B z ∗ − c = 0 Ax^*+Bz^*-c=0 Ax+Bzc=0

  • 对偶可行性

0 = ∇ f ( x ∗ ) + A T y ∗ 0=\nabla f(x^*)+A^Ty^* \\ 0=f(x)+ATy

0 = ∇ g ( z ∗ ) + B T y ∗ 0=\nabla g(z^*)+B^Ty^* 0=g(z)+BTy

因为 z k + 1 z^{k+1} zk+1最小化 L ρ ( x k + 1 , z , y k ) L_\rho (x^{k+1},z,y^k) Lρ(xk+1,z,yk),所以有
0 = ∇ g ( z k + 1 ) + B T y k + ρ B T ( A x k + 1 + B z k + 1 − c ) = ∇ g ( z k + 1 + + B T y k + ρ B T r k + 1 ) = ∇ g ( z k + 1 ) + B T y k + 1 0 =\nabla g(z^{k+1})+B^Ty^k+\rho B^T(Ax^{k+1}+Bz^{k+1}-c) \\ =\nabla g(z^{k+1}++B^Ty^k+\rho B^Tr^{k+1}) \\ = \nabla g(z^{k+1})+B^Ty^{k+1} 0=g(zk+1)+BTyk+ρBT(Axk+1+Bzk+1c)=g(zk+1++BTyk+ρBTrk+1)=g(zk+1)+BTyk+1
所以,式(25)总是成立的。

因为 x k + 1 x^{k+1} xk+1最小化 L ρ ( x , z k , y k ) L_\rho (x,z^k,y^k) Lρ(x,zk,yk),所以有
0 = ∇ f ( x k + 1 ) + A T y k + ρ A T ( A x k + 1 + B z k − c ) = ∇ f ( x k + 1 + + A T ( y k + ρ r k + 1 + ρ B ( z k − z k + 1 ) ) ) = ∇ f ( x k + 1 ) + A T y k + 1 + ρ A T B ( z k − z k + 1 ) 0 =\nabla f(x^{k+1})+A^Ty^k+\rho A^T(Ax^{k+1}+Bz^{k}-c) \\ =\nabla f(x^{k+1}++A^T(y^k+\rho r^{k+1}+\rho B(z^k-z^{k+1}))) \\ = \nabla f(x^{k+1})+A^T y^{k+1}+\rho A^TB(z^k-z^{k+1}) 0=f(xk+1)+ATyk+ρAT(Axk+1+Bzkc)=f(xk+1++AT(yk+ρrk+1+ρB(zkzk+1)))=f(xk+1)+ATyk+1+ρATB(zkzk+1)
因此, s k + 1 = ρ A T B ( z k + 1 − z k ) s^{k+1}=\rho A^TB(z^{k+1}-z^k) sk+1=ρATB(zk+1zk)可以看做式(24)的残差

停止条件

收敛条件得出:
f ( x k ) + g ( z k ) − p ∗ ≤ − ( y k ) T r k + ( x k − x ∗ ) T s k f(x^k)+g(z^k)-p^*\le -(y^k)^Tr^k+(x^k-x^*)^Ts^k f(xk)+g(zk)p(yk)Trk+(xkx)Tsk
假设 ∣ ∣ x k − x ∗ ∣ ∣ ≤ d ||x^k-x^*||\le d xkxd,则
f ( x k ) + g ( z k ) − p ∗ ≤ − ( y k ) T r k + d ∣ ∣ s k ∣ ∣ 2 ≤ ∣ ∣ y k ∣ ∣ 2 ∣ ∣ r k ∣ ∣ 2 + d ∣ ∣ s k ∣ ∣ 2 f(x^k)+g(z^k)-p^*\le -(y^k)^Tr^k+d||s^k||_2 \le ||y^k||_2||r^k||_2+d||s^k||_2 f(xk)+g(zk)p(yk)Trk+dsk2yk2rk2+dsk2
因此,终止条件设为:
∣ ∣ r k ∣ ∣ 2 ≤ ϵ p r i    a n d    ∣ ∣ s k ∣ ∣ 2 ≤ ϵ d u a l ||r^k||_2\le \epsilon^{pri} \ \ and \ \ ||s^k||_2\le \epsilon^{dual} rk2ϵpri  and  sk2ϵdual
其中,
ϵ p r i = p ϵ a b s + ϵ r e l m a x { ∣ ∣ A x k ∣ ∣ 2 , ∣ ∣ B z k ∣ ∣ 2 , ∣ ∣ c ∣ ∣ 2 } ϵ d u a l = n ϵ a b s + ϵ r e l ∣ ∣ A T y k ∣ ∣ 2 \epsilon^{pri} = \sqrt{p}\epsilon^{abs}+\epsilon^{rel}max\{ ||Ax^k||_2,||Bz^k||_2 ,||c||_2\} \\ \epsilon^{dual} = \sqrt{n}\epsilon^{abs}+\epsilon^{rel}||A^Ty^k||_2 ϵpri=p ϵabs+ϵrelmax{Axk2,Bzk2,c2}ϵdual=n ϵabs+ϵrelATyk2
其中, p , n p,n p,n是各自向量的维度

ADMM变种

变化的惩罚参数

ρ k + 1 : = { τ i n c r ρ k , if ∣ ∣ r k ∣ ∣ 2 > μ ∣ ∣ s k ∣ ∣ 2 ρ k / τ d e c r , if ∣ ∣ s k ∣ ∣ 2 > μ ∣ ∣ r k ∣ ∣ 2 ρ k , o t h e r w i s e \rho^{k+1}:=\begin{cases} \tau^{incr}\rho^k, & \text{if} ||r^k||_2>\mu ||s^k||_2 \\ \rho^k/\tau^{decr}, & \text{if} ||s^k||_2>\mu ||r^k||_2 \\ \rho^k, &otherwise \end{cases} ρk+1:=τincrρk,ρk/τdecr,ρk,ifrk2>μsk2ifsk2>μrk2otherwise

其中,一般 μ = 10 > 1 , τ i n c r = 2 > 1 , τ d e c r = 2 > 1 \mu=10>1,\tau^{incr}=2>1,\tau^{decr}=2>1 μ=10>1,τincr=2>1,τdecr=2>1

直观地, ρ \rho ρ变大,会使得 r k r^k rk变小

广义增广

( ρ / 2 ) ∣ ∣ r ∣ ∣ 2 2 → ( 1 / 2 ) r T P r (\rho/2)||r||_2^2 \to (1/2)r^TPr (ρ/2)r22(1/2)rTPr

其中,P是正定的。这可以看做一个新的标准ADMM。不过, r = 0 → F r = 0 r=0 \to Fr=0 r=0Fr=0。其中, F T F = P F^TF=P FTF=P

超松弛

在更新z和y时,
A x k + 1 → α k A x k + 1 − ( 1 − α k ) ( B z k − c ) Ax^{k+1} \to \alpha^kAx^{k+1}-(1-\alpha^k)(Bz^k-c) Axk+1αkAxk+1(1αk)(Bzkc)
一般,选择超松弛参数 α k ∈ [ 1.5 , 1.8 ] \alpha^k \in [1.5,1.8] αk[1.5,1.8]

常见形式

下面只考虑 x x x更新步骤,其他类似
x + = a r g m i n x ( f ( x ) + ρ / 2 ∣ ∣ A x − v ∣ ∣ 2 2 ) w h e r e    v = − B z + c − u x^+=argmin_x(f(x)+\rho/2||Ax-v||_2^2)\\ where \ \ v=-Bz+c-u x+=argminx(f(x)+ρ/2Axv22)where  v=Bz+cu

二次目标函数

f ( x ) = ( 1 / 2 ) x T P x + q T x + r f(x)=(1/2)x^TPx+q^Tx+r f(x)=(1/2)xTPx+qTx+r


x + = ( P + ρ A T A ) − 1 ( ρ A T v − q ) x^+=(P+\rho A^TA)^{-1}(\rho A^Tv-q) x+=(P+ρATA)1(ρATvq)
式(43)可以利用稀疏性、保存重复的分解等来加快速度

元素软阈值

f ( x ) = λ ∣ ∣ x ∣ ∣ 1 , A = I f(x)=\lambda ||x||_1,A=I f(x)=λx1A=I时。对于每个标量 x i x_i xi的更新为
x i + : = a r g m i n x i ( λ ∣ x i ∣ + ( ρ / 2 ) ( x i − v i ) 2 ) x_i^+:=argmin_{x_i}(\lambda|x_i|+(\rho/2)(x_i-v_i)^2) xi+:=argminxi(λxi+(ρ/2)(xivi)2)
尽管, f ( x ) f(x) f(x)不可导。但是存在简单的解析解:
x i + : = S λ / ρ ( v i ) x_i^+:=S_{\lambda/\rho}(v_i) xi+:=Sλ/ρ(vi)
软阈值操作符 S S S
S k ( a ) = { a − k a &gt; k 0 ∣ a ∣ ≤ k a + k a &lt; − k S_k(a)=\begin{cases} a-k &amp; a&gt;k\\ 0 &amp; |a|\le k \\ a+k &amp; a&lt;-k \end{cases} Sk(a)=ak0a+ka>kaka<k
或者表示为
S k ( a ) = ( a − k ) + − ( − a − k ) + S_k(a)=(a-k)_+-(-a-k)_+ Sk(a)=(ak)+(ak)+
这也是shrinkage operator,如下
S k ( a ) = ( 1 − k / ∣ a ∣ ) + a ,    f o r    a ≠ 0 S_k(a)=(1-k/|a|)_+a,\ \ for \ \ a \ne 0 Sk(a)=(1k/a)+a,  for  a̸=0

一阶范数问题

最小化 ∣ ∣ A x − b ∣ ∣ 1 ||Ax-b||_1 Axb1,转化为ADMM形式
m i n i m i z e    ∣ ∣ z ∣ ∣ 1 s u b j e c t    t o    A x − z = b minimize \ \ ||z||_1 \\ subject\ \ to \ \ Ax-z=b minimize  z1subject  to  Axz=b
上面式子中, f = 0 , g = ∣ ∣ ∗ ∣ ∣ 1 f=0,g=||*||_1 f=0,g=1。假设 A T A A^TA ATA可逆,则ADMM算法步骤如下:
x k + 1 = ( A T A ) − 1 A T ( b + z k − u k ) z k + 1 = S 1 / ρ ( A x k + 1 − b + u k ) u k + 1 = u k + A x k + 1 − z k + 1 − b x^{k+1} = (A^TA)^{-1}A^T(b+z^k-u^k) \\ z^{k+1} = S_{1/\rho}(Ax^{k+1}-b+u^k) \\ u^{k+1} = u^k+Ax^{k+1}-z^{k+1}-b xk+1=(ATA)1AT(b+zkuk)zk+1=S1/ρ(Axk+1b+uk)uk+1=uk+Axk+1zk+1b

openMVG代码

eigen线性代数
template <typename Linear_SolverT>
inline void Compute
(
  const Eigen::SparseMatrix<double>& spd_mat,
  Linear_SolverT * linear_solver
)
{
  linear_solver->compute(spd_mat);
}

template <typename Linear_SolverT>
inline void Compute
(
  const Eigen::MatrixXd& spd_mat,
  Linear_SolverT * linear_solver
)
{
  linear_solver->compute(spd_mat.sparseView());
}
迭代参数
struct Options {
    int max_num_iterations = 1000;
    // Rho is the augmented Lagrangian parameter.
    double rho = 1.0;
    // Alpha is the over-relaxation parameter (typically between 1.0 and 1.8).
    double alpha = 1.0;

    double absolute_tolerance = 1e-4;
    double relative_tolerance = 1e-2;
};
私有成员
  Options options_;

  // Matrix A where || Ax - b ||_1 is the problem we are solving.
  MatrixType a_;

  // Cholesky linear solver.
#ifdef EIGEN_MPL2_ONLY
  using Linear_Solver_T = Eigen::SparseLU<Eigen::SparseMatrix<double>>;
#else
  // Since our linear system will be a SPD matrix we can
  // utilize the Cholesky factorization.
  using Linear_Solver_T = Eigen::SimplicialLLT<Eigen::SparseMatrix<double>>;
#endif
  Linear_Solver_T linear_solver_;

  Eigen::VectorXd Shrinkage
  (
    const Eigen::VectorXd& vec, const double kappa
  ) const
  {
    Eigen::ArrayXd zero_vec(vec.size());
    zero_vec.setZero();
    return zero_vec.max( vec.array() - kappa) -
           zero_vec.max(-vec.array() - kappa);
  }
初始化
  L1Solver
  (
    const Options& options,
    const MatrixType& mat
  )
  : options_(options), a_(mat)
  {
    // Analyze the sparsity pattern once. Only the values of the entries will be
    // changed with each iteration.
    const MatrixType spd_mat = a_.transpose() * a_;
    l1_solver_internal::Compute(spd_mat, &linear_solver_);
  }
核心
  bool Solve
  (
    const Eigen::VectorXd& rhs,  // b = rhs
    Eigen::VectorXd* solution
  )
  {
    // Since constructor was called before we check Compute status
    if (linear_solver_.info() != Eigen::Success)
    {
      std::cerr << "Cannot compute the matrix factorization" << std::endl;
      return false;
    }

    Eigen::VectorXd& x = *solution;
    Eigen::VectorXd z(a_.rows()), u(a_.rows());
    z.setZero();
    u.setZero();

    Eigen::VectorXd a_times_x(a_.rows()), z_old(z.size()), ax_hat(a_.rows());
    // Precompute some convergence terms.
    const double rhs_norm = rhs.norm();
    const double primal_abs_tolerance_eps =
      std::sqrt(a_.rows()) * options_.absolute_tolerance;
    const double dual_abs_tolerance_eps =
      std::sqrt(a_.cols()) * options_.absolute_tolerance;

    for (int i = 0; i < options_.max_num_iterations; ++i)
    {
      // Update x.
      x.noalias() = linear_solver_.solve(a_.transpose() * (rhs + z - u));
      a_times_x.noalias() = a_ * x;
      ax_hat.noalias() = options_.alpha * a_times_x;
      ax_hat.noalias() += (1.0 - options_.alpha) * (z + rhs);

      // Update z and set z_old.
      std::swap(z, z_old);
      z.noalias() = Shrinkage(ax_hat - rhs + u, 1.0 / options_.rho);

      // Update u.
      u.noalias() += ax_hat - z - rhs;

      // Compute the convergence terms.
      const double r_norm = (a_times_x - z - rhs).norm();
      const double s_norm =
        (-options_.rho * a_.transpose() * (z - z_old)).norm();
      const double max_norm =
        std::max({a_times_x.norm(), z.norm(), rhs_norm});
      const double primal_eps =
        primal_abs_tolerance_eps + options_.relative_tolerance * max_norm;
      const double dual_eps =
        dual_abs_tolerance_eps +
        options_.relative_tolerance *
          (options_.rho * a_.transpose() * u).norm();

      // Log the result to the screen.
      // std::ostringstream os;
      // os << "Iteration: " << i << "\n"
      //   << "R norm: " << r_norm << "\n"
      //   << "S norm: " << s_norm << "\n"
      //   << "Primal eps: " << primal_eps << "\n"
      //   << "Dual eps: " << dual_eps << std::endl;
      // std::cout << os.str() << std::endl;

      // Determine if the minimizer has converged.
      if (r_norm < primal_eps && s_norm < dual_eps)
      {
        return true;
      }
    }
    return false;
    }
  • 0
    点赞
  • 20
    收藏
    觉得还不错? 一键收藏
  • 2
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值