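8.6 Alternating Direction Method of Multipliers (Continued, Part 2)
8.6.5 Convergence Analysis
This section analyzes the convergence of the alternating direction method of multipliers (ADMM), that is, iteration (8.6.5) to (8.6.7), applied to problem (8.6.1). We first introduce the necessary assumptions.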
Assumption 8.3 (1) $f_1(x)$ and $f_2(x)$ are both closed convex functions, and every ADMM subproblem has a unique solution;
(2) the solution set of the original problem (8.6.1) is nonempty, and Slater's condition holds.
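The conditions in Assumption 8.3 are quite basic. Convexity of $f_1$ and $f_2$ ensures that the problem to be solved is convex, and uniqueness of the subproblem solutions ensures that the iteration is well defined. When Slater's condition holds, the KKT pairs of the original problem correspond to its optimal solutions, so convergence can be studied conveniently through the KKT conditions.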
Since the solution set of the original problem is nonempty, let $(x_1^*,x_2^*,y^*)$ be a KKT pair, that is, a point satisfying condition (8.6.8):
$$-A_1^\mathrm{T}y^*\in\partial f_1(x_1^*),\quad-A_2^\mathrm{T}y^*\in\partial f_2(x_2^*),\quad A_1x_1^*+A_2x_2^*=b.$$
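As a concrete illustration of the KKT conditions above, the following minimal sketch uses a hypothetical toy instance that is not taken from the text: $f_i(x_i)=\frac12\|x_i-c_i\|^2$, so that $\partial f_i(x_i)=\{x_i-c_i\}$. The matrices `A1`, `A2` and the vectors `b`, `c1`, `c2` are arbitrary illustrative choices. Under these assumptions a KKT pair can be computed in closed form and the three conditions checked numerically.

```python
import numpy as np

# Hypothetical toy data (assumption, not from the text): f_i(x_i) = 0.5 * ||x_i - c_i||^2.
A1 = np.array([[2.0, 0.0], [1.0, 1.0]])
A2 = np.array([[1.0, -1.0], [0.0, 3.0]])
b = np.array([1.0, 2.0])
c1 = np.array([1.0, -1.0])
c2 = np.array([0.5, 2.0])

def kkt_pair(A1, A2, b, c1, c2):
    """Solve the KKT system of the toy problem in closed form."""
    # Stationarity gives x_i = c_i - A_i^T y; substituting into
    # A1 x1 + A2 x2 = b leaves a linear system in y.
    y = np.linalg.solve(A1 @ A1.T + A2 @ A2.T, A1 @ c1 + A2 @ c2 - b)
    return c1 - A1.T @ y, c2 - A2.T @ y, y

def kkt_residuals(A1, A2, b, c1, c2, x1, x2, y):
    """Norms of the three KKT residuals; here grad f_i(x_i) = x_i - c_i."""
    return (np.linalg.norm(x1 - c1 + A1.T @ y),      # -A1^T y in df1(x1)
            np.linalg.norm(x2 - c2 + A2.T @ y),      # -A2^T y in df2(x2)
            np.linalg.norm(A1 @ x1 + A2 @ x2 - b))   # primal feasibility

x1s, x2s, ys = kkt_pair(A1, A2, b, c1, c2)
res = kkt_residuals(A1, A2, b, c1, c2, x1s, x2s, ys)
```

For nonsmooth $f_i$ the first two conditions become set memberships rather than equations, so a check of this kind would need a problem-specific description of the subdifferential.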
Our ultimate goal is to show that the ADMM sequence $\{(x_1^k,x_2^k,y^k)\}$ converges to a KKT pair of the original problem. We therefore introduce the following notation for the error between the current iterate and the KKT pair:
$$(e_1^k,e_2^k,e_y^k)\stackrel{\text{def}}{=}(x_1^k,x_2^k,y^k)-(x_1^*,x_2^*,y^*).$$
To simplify the proofs that follow, we further introduce the auxiliary quantities
$$\begin{aligned}&u^{k}=-A_{1}^{\mathrm{T}}[y^{k}+(1-\tau)\rho(A_{1}e_{1}^{k}+A_{2}e_{2}^{k})+\rho A_{2}(x_{2}^{k-1}-x_{2}^{k})],\\ &v^{k}=-A_{2}^{\mathrm{T}}[y^{k}+(1-\tau)\rho(A_{1}e_{1}^{k}+A_{2}e_{2}^{k})],\\ &\Psi_{k}=\frac{1}{\tau\rho}\|e_{y}^{k}\|^{2}+\rho\|A_{2}e_{2}^{k}\|^{2},\\&\Phi_{k}=\Psi_{k}+\max(1-\tau,1-\tau^{-1})\rho\|A_{1}e_{1}^{k}+A_{2}e_{2}^{k}\|^{2}.\end{aligned}\qquad(8.6.39)$$
Based on this notation, we have the following result.
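The two merit quantities $\Psi_k$ and $\Phi_k$ in (8.6.39) are straightforward to evaluate once the error vectors are known. The sketch below is one possible way to compute them; the function name `psi_phi` and its argument order are illustrative choices, not notation from the text.

```python
import numpy as np

def psi_phi(A1, A2, e1, e2, ey, rho, tau):
    """Return (Psi_k, Phi_k) from (8.6.39) for given error vectors e1, e2, ey."""
    r = A1 @ e1 + A2 @ e2      # equals A1 x1^k + A2 x2^k - b, since A1 x1* + A2 x2* = b
    a2 = A2 @ e2
    psi = (ey @ ey) / (tau * rho) + rho * (a2 @ a2)
    # The extra term vanishes for tau = 1 and has a nonnegative coefficient otherwise.
    phi = psi + max(1.0 - tau, 1.0 - 1.0 / tau) * rho * (r @ r)
    return psi, phi
```

Note that $\max(1-\tau,1-\tau^{-1})\geqslant 0$ for every $\tau>0$, so $\Phi_k\geqslant\Psi_k\geqslant 0$.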
Lemma 8.7 Let $\{(x_1^k,x_2^k,y^k)\}$ be a sequence generated by the alternating direction method of multipliers. Then for every $k\geqslant 1$,
$$u^k\in\partial f_1(x_1^k),\quad v^k\in\partial f_2(x_2^k),\qquad(8.6.40)$$
$$\Phi_{k}-\Phi_{k+1}\geqslant\min(\tau,1+\tau-\tau^{2})\rho\|A_{2}(x_{2}^{k}-x_{2}^{k+1})\|^{2}+\min(1,1+\tau^{-1}-\tau)\rho\|A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1}\|^{2}.\qquad(8.6.41)$$
Proof
We first prove the two inclusions in (8.6.40). By the ADMM iteration, the optimality condition of the $x_1$-subproblem at $x_1^{k+1}$ reads
$$0\in\partial f_1(x_1^{k+1})+A_1^\mathrm{T}y^k+\rho A_1^\mathrm{T}(A_1x_1^{k+1}+A_2x_2^k-b).$$
Substituting $y^k=y^{k+1}-\tau\rho(A_1x_1^{k+1}+A_2x_2^{k+1}-b)$ into this relation to eliminate $y^k$ gives
$$-A_{1}^{\mathrm{T}}\Big(y^{k+1}+(1-\tau)\rho(A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b)+\rho A_{2}(x_{2}^{k}-x_{2}^{k+1})\Big)\in\partial f_{1}(x_{1}^{k+1}).$$
Using $b=A_1x_1^*+A_2x_2^*$, the left-hand side is exactly $u^{k+1}$, so $u^{k+1}\in\partial f_1(x_1^{k+1})$ for all $k\geqslant 0$, that is, $u^k\in\partial f_1(x_1^k)$ for all $k\geqslant 1$.
Similarly, the optimality condition of the $x_2$-subproblem at $x_2^{k+1}$ reads
$$0\in\partial f_2(x_2^{k+1})+A_2^\mathrm{T}y^k+\rho A_2^\mathrm{T}(A_1x_1^{k+1}+A_2x_2^{k+1}-b).$$
Eliminating $y^k$ in the same way yields
$$-A_2^\mathrm{T}\Big(y^{k+1}+(1-\tau)\rho(A_1x_1^{k+1}+A_2x_2^{k+1}-b)\Big)\in\partial f_2(x_2^{k+1}),$$
and by the definition of $v^k$ this is $v^k\in\partial f_2(x_2^k)$ for all $k\geqslant 1$.
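The inclusions in (8.6.40) can be checked numerically on a smooth instance, where the subdifferential reduces to a single gradient. The sketch below assumes the hypothetical toy objective $f_i(x_i)=\frac12\|x_i-c_i\|^2$ (so $\partial f_i(x_i)=\{x_i-c_i\}$) and an augmented Lagrangian of the form $f_1+f_2+y^\mathrm{T}(A_1x_1+A_2x_2-b)+\frac{\rho}{2}\|A_1x_1+A_2x_2-b\|^2$, which is consistent with the optimality conditions used in the proof. The data and the names `run_admm` and `subgradient_gap` are illustrative assumptions.

```python
import numpy as np

def run_admm(A1, A2, b, c1, c2, rho, tau, iters):
    """ADMM iterates for min 0.5||x1-c1||^2 + 0.5||x2-c2||^2  s.t.  A1 x1 + A2 x2 = b."""
    x1, x2, y = np.zeros(A1.shape[1]), np.zeros(A2.shape[1]), np.zeros(A1.shape[0])
    H1 = np.eye(A1.shape[1]) + rho * A1.T @ A1   # Hessian of the x1-subproblem
    H2 = np.eye(A2.shape[1]) + rho * A2.T @ A2   # Hessian of the x2-subproblem
    hist = [(x1, x2, y)]
    for _ in range(iters):
        x1 = np.linalg.solve(H1, c1 - A1.T @ (y + rho * (A2 @ x2 - b)))
        x2 = np.linalg.solve(H2, c2 - A2.T @ (y + rho * (A1 @ x1 - b)))
        y = y + tau * rho * (A1 @ x1 + A2 @ x2 - b)
        hist.append((x1, x2, y))
    return hist

def subgradient_gap(A1, A2, b, c1, c2, rho, tau, iters=30):
    """Largest distance between (u^k, v^k) and the gradients of f1, f2 over k >= 1."""
    hist = run_admm(A1, A2, b, c1, c2, rho, tau, iters)
    gap = 0.0
    for k in range(1, len(hist)):
        x1, x2, y = hist[k]
        x2_prev = hist[k - 1][1]
        r = A1 @ x1 + A2 @ x2 - b                # equals A1 e1^k + A2 e2^k
        u = -A1.T @ (y + (1 - tau) * rho * r + rho * A2 @ (x2_prev - x2))
        v = -A2.T @ (y + (1 - tau) * rho * r)
        gap = max(gap, np.linalg.norm(u - (x1 - c1)), np.linalg.norm(v - (x2 - c2)))
    return gap

# Hypothetical data, chosen only for illustration.
A1 = np.array([[2.0, 0.0], [1.0, 1.0]])
A2 = np.array([[1.0, -1.0], [0.0, 3.0]])
b = np.array([1.0, 2.0])
c1 = np.array([1.0, -1.0])
c2 = np.array([0.5, 2.0])
gap = subgradient_gap(A1, A2, b, c1, c2, rho=1.0, tau=1.3)
```

Because $u^k$ and $v^k$ depend on the errors only through $A_1e_1^k+A_2e_2^k=A_1x_1^k+A_2x_2^k-b$, this check does not require knowing a KKT pair.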
Next we prove inequality (8.6.41). By the optimality conditions of $(x_1^*,x_2^*,y^*)$ and relation (8.6.40),
$$u^{k+1}\in\partial f_{1}(x_{1}^{k+1}),\quad-A_{1}^{\mathrm{T}}y^{*}\in\partial f_{1}(x_{1}^{*}),\qquad v^{k+1}\in\partial f_{2}(x_{2}^{k+1}),\quad-A_{2}^{\mathrm{T}}y^{*}\in\partial f_{2}(x_{2}^{*}).$$
By the monotonicity of the subdifferential of a convex function,
$$\left\langle u^{k+1}+A_{1}^{\mathrm{T}}y^{*},x_{1}^{k+1}-x_{1}^{*}\right\rangle\geqslant 0,\qquad\left\langle v^{k+1}+A_{2}^{\mathrm{T}}y^{*},x_{2}^{k+1}-x_{2}^{*}\right\rangle\geqslant 0.$$
We add these two inequalities, insert the definitions of $u^{k+1}$ and $v^{k+1}$, and use the identity
$$A_1x_1^{k+1}+A_2x_2^{k+1}-b=(\tau\rho)^{-1}(y^{k+1}-y^k)=(\tau\rho)^{-1}(e_y^{k+1}-e_y^k).\qquad(8.6.42)$$
This gives
$$\begin{aligned}&\left\langle u^{k+1}+A_{1}^{\mathrm{T}}y^{*},x_{1}^{k+1}-x_{1}^{*}\right\rangle+\left\langle v^{k+1}+A_{2}^{\mathrm{T}}y^{*},x_{2}^{k+1}-x_{2}^{*}\right\rangle\\={}&\left\langle -A_{1}^{\mathrm{T}}[y^{k+1}+(1-\tau)\rho(A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1})+\rho A_{2}(x_{2}^{k}-x_{2}^{k+1})]+A_{1}^{\mathrm{T}}y^{*},x_{1}^{k+1}-x_{1}^{*}\right\rangle\\&+\left\langle -A_{2}^{\mathrm{T}}[y^{k+1}+(1-\tau)\rho(A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1})]+A_{2}^{\mathrm{T}}y^{*},x_{2}^{k+1}-x_{2}^{*}\right\rangle\\={}&\left\langle-A_1^{\mathrm{T}}e_y^{k+1},x_{1}^{k+1}-x_{1}^{*}\right\rangle+\left\langle -A_{1}^{\mathrm{T}}[(1-\tau)\rho(A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1})],x_{1}^{k+1}-x_{1}^{*}\right\rangle\\&+\left\langle-A_2^{\mathrm{T}}e_y^{k+1},x_{2}^{k+1}-x_{2}^{*}\right\rangle+\left\langle -A_{2}^{\mathrm{T}}[(1-\tau)\rho(A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1})],x_{2}^{k+1}-x_{2}^{*}\right\rangle\\&+\left\langle -A_{1}^{\mathrm{T}}[\rho A_{2}(x_{2}^{k}-x_{2}^{k+1})],x_{1}^{k+1}-x_{1}^{*}\right\rangle\\={}&\left\langle-A_1^{\mathrm{T}}e_y^{k+1},x_{1}^{k+1}-x_{1}^{*}\right\rangle+\left\langle-A_2^{\mathrm{T}}e_y^{k+1},x_{2}^{k+1}-x_{2}^{*}\right\rangle\\&+\left\langle -A_{1}^{\mathrm{T}}[(1-\tau)\rho(A_1x_1^{k+1}+A_2x_2^{k+1}-b)],x_{1}^{k+1}-x_{1}^{*}\right\rangle+\left\langle -A_{2}^{\mathrm{T}}[(1-\tau)\rho(A_1x_1^{k+1}+A_2x_2^{k+1}-b)],x_{2}^{k+1}-x_{2}^{*}\right\rangle\\&+\left\langle -A_{1}^{\mathrm{T}}[\rho A_{2}(x_{2}^{k}-x_{2}^{k+1})],x_{1}^{k+1}-x_{1}^{*}\right\rangle\\={}&\frac{1}{\tau\rho}\left\langle e_{y}^{k+1},e_{y}^{k}-e_{y}^{k+1}\right\rangle-(1-\tau)\rho\|A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\|^{2}+\left\langle -A_{1}^{\mathrm{T}}[\rho A_{2}(x_{2}^{k}-x_{2}^{k+1})],x_{1}^{k+1}-x_{1}^{*}\right\rangle.\end{aligned}$$
Writing $A_1(x_1^{k+1}-x_1^*)=(A_1x_1^{k+1}+A_2x_2^{k+1}-b)-A_2e_2^{k+1}$ in the last inner product, we finally obtain
$$\begin{aligned}&\frac{1}{\tau\rho}\left\langle e_{y}^{k+1},e_{y}^{k}-e_{y}^{k+1}\right\rangle-(1-\tau)\rho\|A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\|^{2} \\&+\rho\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\right\rangle \\&-\rho\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),A_{2}e_{2}^{k+1}\right\rangle\geqslant 0.\end{aligned}\qquad(8.6.43)$$
Inequality (8.6.43) still differs in form from inequality (8.6.41). The main difference lies in the term
$$\rho\left\langle A_2(x_2^{k+1}-x_2^k),A_1x_1^{k+1}+A_2x_2^{k+1}-b\right\rangle,$$
which we now bound from above. For convenience, introduce the notation
$$\begin{aligned}\nu^{k+1}&=y^{k+1}+(1-\tau)\rho(A_1x_1^{k+1}+A_2x_2^{k+1}-b), \\M^{k+1}&=(1-\tau)\rho\left\langle A_2(x_2^{k+1}-x_2^k),A_1x_1^k+A_2x_2^k-b\right\rangle.\end{aligned}$$
Then $-A_2^\mathrm{T}\nu^{k+1}\in\partial f_2(x_2^{k+1})$ and $-A_2^\mathrm{T}\nu^k\in\partial f_2(x_2^k)$. Using monotonicity once more,
$$\left\langle-A_2^\mathrm{T}(\nu^{k+1}-\nu^k),x_2^{k+1}-x_2^k\right\rangle\geqslant 0.\qquad(8.6.44)$$
Combining these relations with $\tau\rho(A_1x_1^{k+1}+A_2x_2^{k+1}-b)=y^{k+1}-y^k$, we obtain
$$\begin{aligned}&\rho\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\right\rangle \\={}&(1-\tau)\rho\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\right\rangle+\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),y^{k+1}-y^{k}\right\rangle \\={}&M^{k+1}+\left\langle\nu^{k+1}-\nu^{k},A_{2}(x_{2}^{k+1}-x_{2}^{k})\right\rangle\\\leqslant{}&M^{k+1}.\end{aligned}$$
With this estimate, inequality (8.6.43) can be relaxed to
$$\frac{1}{\tau\rho}\left\langle e_{y}^{k+1},e_{y}^{k}-e_{y}^{k+1}\right\rangle-(1-\tau)\rho\|A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\|^{2}+M^{k+1}-\rho\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),A_{2}e_{2}^{k+1}\right\rangle\geqslant 0.$$
This inequality contains inner-product terms. Multiplying it by 2 and applying the identity
$$\langle a,b\rangle=\frac{1}{2}(\|a\|^2+\|b\|^2-\|a-b\|^2)=\frac{1}{2}(\|a+b\|^2-\|a\|^2-\|b\|^2)$$
we further obtain
$$\begin{aligned}&\frac{1}{\tau\rho}(\|e_{y}^{k}\|^{2}-\|e_{y}^{k+1}\|^{2})-(2-\tau)\rho\|A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\|^{2}\\&+2M^{k+1}-\rho\|A_{2}(x_{2}^{k+1}-x_{2}^{k})\|^{2}-\rho\|A_{2}e_{2}^{k+1}\|^{2}+\rho\|A_{2}e_{2}^{k}\|^{2}\geqslant 0.\end{aligned}\qquad(8.6.45)$$
Apart from the term $M^{k+1}$, every term in (8.6.45) already appears in inequality (8.6.41). How $M^{k+1}$ is bounded depends on the sign of its coefficient $1-\tau$, so we treat the two ranges of $\tau$ separately.
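For readers who want the intermediate algebra behind (8.6.45), the two inner products can be expanded as follows. The shorthand $r^{k+1}=A_1x_1^{k+1}+A_2x_2^{k+1}-b$ is introduced here only for brevity and is not notation from the text; the expansion uses (8.6.42) in the form $e_y^{k+1}-e_y^k=\tau\rho\,r^{k+1}$ and the relation $A_2(x_2^{k+1}-x_2^k)=A_2e_2^{k+1}-A_2e_2^k$.

```latex
\begin{aligned}
\frac{2}{\tau\rho}\left\langle e_y^{k+1},\,e_y^{k}-e_y^{k+1}\right\rangle
&=\frac{1}{\tau\rho}\left(\|e_y^{k}\|^2-\|e_y^{k+1}\|^2-\|e_y^{k}-e_y^{k+1}\|^2\right)\\
&=\frac{1}{\tau\rho}\left(\|e_y^{k}\|^2-\|e_y^{k+1}\|^2\right)-\tau\rho\,\|r^{k+1}\|^2,\\
-2\rho\left\langle A_2(x_2^{k+1}-x_2^{k}),\,A_2e_2^{k+1}\right\rangle
&=-\rho\left(\|A_2(x_2^{k+1}-x_2^{k})\|^2+\|A_2e_2^{k+1}\|^2-\|A_2e_2^{k}\|^2\right).
\end{aligned}
```

Adding $-\tau\rho\|r^{k+1}\|^2$ to the doubled term $-2(1-\tau)\rho\|r^{k+1}\|^2$ produces the coefficient $-(2-\tau)\rho$ that appears in (8.6.45).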
Case 1: $\tau\in(0,1]$. Here the coefficient $1-\tau$ in $M^{k+1}$ is nonnegative. By the basic inequality,
$$2\left\langle A_2(x_2^{k+1}-x_2^k),A_1x_1^k+A_2x_2^k-b\right\rangle\leqslant\|A_{2}(x_{2}^{k+1}-x_{2}^{k})\|^{2}+\|A_{1}x_{1}^{k}+A_{2}x_{2}^{k}-b\|^{2}.$$
Substituting this into inequality (8.6.45) gives
$$\begin{aligned}&\frac{1}{\tau\rho}\|e_{y}^{k}\|^{2}+\rho\|A_{2}e_{2}^{k}\|^{2}+(1-\tau)\rho\|A_{1}e_{1}^{k}+A_{2}e_{2}^{k}\|^{2}-\left[\frac{1}{\tau\rho}\|e_{y}^{k+1}\|^{2}+\rho\|A_{2}e_{2}^{k+1}\|^{2}+(1-\tau)\rho\|A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1}\|^{2}\right]\\&\geqslant\rho\|A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\|^{2}+\tau\rho\|A_{2}(x_{2}^{k+1}-x_{2}^{k})\|^{2}.\end{aligned}\qquad(8.6.46)$$
Case 2: $\tau>1$. Here the coefficient $1-\tau$ in $M^{k+1}$ is negative. By the basic inequality,
$$-2\left\langle A_{2}(x_{2}^{k+1}-x_{2}^{k}),A_{1}x_{1}^{k}+A_{2}x_{2}^{k}-b\right\rangle\leqslant\tau\|A_{2}(x_{2}^{k+1}-x_{2}^{k})\|^{2}+\frac{1}{\tau}\|A_{1}x_{1}^{k}+A_{2}x_{2}^{k}-b\|^{2}.$$
Substituting this into inequality (8.6.45) in the same way gives
$$\begin{aligned}&\frac{1}{\tau\rho}\|e_{y}^{k}\|^{2}+\rho\|A_{2}e_{2}^{k}\|^{2}+\left(1-\frac{1}{\tau}\right)\rho\|A_{1}e_{1}^{k}+A_{2}e_{2}^{k}\|^{2}-\left[\frac{1}{\tau\rho}\|e_{y}^{k+1}\|^{2}+\rho\|A_{2}e_{2}^{k+1}\|^{2}+\left(1-\frac{1}{\tau}\right)\rho\|A_{1}e_{1}^{k+1}+A_{2}e_{2}^{k+1}\|^{2}\right]\\&\geqslant\left(1+\frac{1}{\tau}-\tau\right)\rho\|A_{1}x_{1}^{k+1}+A_{2}x_{2}^{k+1}-b\|^{2}+(1+\tau-\tau^{2})\rho\|A_{2}(x_{2}^{k+1}-x_{2}^{k})\|^{2}.\end{aligned}\qquad(8.6.47)$$
Combining (8.6.46) and (8.6.47) yields inequality (8.6.41). Note that the terms on the right-hand side of (8.6.41) are nonnegative only when $\tau\in\left(0,\dfrac{1+\sqrt{5}}{2}\right)$.
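The role of the threshold $\frac{1+\sqrt5}{2}$ can be seen by evaluating the two coefficients on the right-hand side of (8.6.41) as functions of $\tau$. The following small sketch does this; the names `GOLDEN` and `descent_coefficients` are illustrative and not from the text.

```python
import math

GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0   # positive root of 1 + tau - tau^2 = 0

def descent_coefficients(tau):
    """Coefficients (without the factor rho) on the right-hand side of (8.6.41)."""
    c_step = min(tau, 1.0 + tau - tau ** 2)        # multiplies ||A2 (x2^k - x2^{k+1})||^2
    c_feas = min(1.0, 1.0 + 1.0 / tau - tau)       # multiplies ||A1 e1^{k+1} + A2 e2^{k+1}||^2
    return c_step, c_feas

# Both coefficients are positive just below the threshold and negative just above it.
below = descent_coefficients(1.6)
above = descent_coefficients(1.7)
```

Since $1+\tau^{-1}-\tau=(1+\tau-\tau^2)/\tau$, the two coefficients change sign at the same value of $\tau$.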
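In Lemma 8.7, relation (8.6.40) follows directly from the optimality conditions of the subproblems together with the KKT conditions. The intuition behind inequality (8.6.41) is that $\Phi_k$, a certain measure of the error of the iterates, is monotone and bounded.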
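Inequality (8.6.41) can also be observed numerically. The sketch below is only an illustration under stated assumptions: it uses the hypothetical smooth objective $f_i(x_i)=\frac12\|x_i-c_i\|^2$ with arbitrary data, for which both the ADMM subproblems and the KKT pair have closed forms. It returns the smallest value of the left-hand side minus the right-hand side of (8.6.41) over $k\geqslant 1$, which should be nonnegative up to rounding.

```python
import numpy as np

def run_admm(A1, A2, b, c1, c2, rho, tau, iters):
    """ADMM iterates for min 0.5||x1-c1||^2 + 0.5||x2-c2||^2  s.t.  A1 x1 + A2 x2 = b."""
    x1, x2, y = np.zeros(A1.shape[1]), np.zeros(A2.shape[1]), np.zeros(A1.shape[0])
    H1 = np.eye(A1.shape[1]) + rho * A1.T @ A1
    H2 = np.eye(A2.shape[1]) + rho * A2.T @ A2
    hist = [(x1, x2, y)]
    for _ in range(iters):
        x1 = np.linalg.solve(H1, c1 - A1.T @ (y + rho * (A2 @ x2 - b)))
        x2 = np.linalg.solve(H2, c2 - A2.T @ (y + rho * (A1 @ x1 - b)))
        y = y + tau * rho * (A1 @ x1 + A2 @ x2 - b)
        hist.append((x1, x2, y))
    return hist

def lemma_slack(A1, A2, b, c1, c2, rho, tau, iters=40):
    """Minimum over k >= 1 of (Phi_k - Phi_{k+1}) minus the right-hand side of (8.6.41)."""
    # KKT pair of the toy problem in closed form (only x2* and y* enter Phi).
    ys = np.linalg.solve(A1 @ A1.T + A2 @ A2.T, A1 @ c1 + A2 @ c2 - b)
    x2s = c2 - A2.T @ ys

    def phi(x1, x2, y):
        r = A1 @ x1 + A2 @ x2 - b            # equals A1 e1 + A2 e2
        a2, ey = A2 @ (x2 - x2s), y - ys
        return (ey @ ey) / (tau * rho) + rho * (a2 @ a2) \
            + max(1 - tau, 1 - 1 / tau) * rho * (r @ r)

    hist = run_admm(A1, A2, b, c1, c2, rho, tau, iters)
    c_step = min(tau, 1 + tau - tau ** 2)
    c_feas = min(1.0, 1 + 1 / tau - tau)
    slacks = []
    for k in range(1, iters):
        x1n, x2n, yn = hist[k + 1]
        d = A2 @ (hist[k][1] - x2n)
        rn = A1 @ x1n + A2 @ x2n - b
        rhs = rho * (c_step * (d @ d) + c_feas * (rn @ rn))
        slacks.append(phi(*hist[k]) - phi(x1n, x2n, yn) - rhs)
    return min(slacks)

# Hypothetical data, chosen only for illustration.
A1 = np.array([[2.0, 0.0], [1.0, 1.0]])
A2 = np.array([[1.0, -1.0], [0.0, 3.0]])
b = np.array([1.0, 2.0])
c1 = np.array([1.0, -1.0])
c2 = np.array([0.5, 2.0])
slack = lemma_slack(A1, A2, b, c1, c2, rho=1.0, tau=1.5)
```

The loop starts at $k=1$ because the lemma is stated for $k\geqslant 1$: the initial point $(x_1^0,x_2^0,y^0)$ is arbitrary and need not satisfy (8.6.40).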
Theorem 8.16 Suppose Assumption 8.3 holds and, in addition, $A_1$ and $A_2$ have full column rank. If $\tau\in\left(0,\dfrac{1+\sqrt{5}}{2}\right)$, then the sequence $\left\{(x_{1}^{k},x_{2}^{k},y^{k})\right\}$ converges to a KKT pair of the original problem.
Proof
Lemma 8.7 shows that $\{\Phi_k\}$ is nonincreasing, and since $\Phi_k\geqslant 0$ it is a bounded sequence. By the definition of $\Phi_k$ in (8.6.39),
$$\Phi_k=\Psi_k+\max(1-\tau,1-\tau^{-1})\rho\|A_1e_1^k+A_2e_2^k\|^2.$$
Because $\Phi_k$ is bounded and the second term is nonnegative, $\Psi_k$ is bounded as well. From the definition of $\Psi_k$,
$$\Psi_k=\frac{1}{\tau\rho}\|e_y^k\|^2+\rho\|A_2e_2^k\|^2,$$
it follows that $\|e_y^k\|$ and $\|A_2e_2^k\|$ are bounded. Boundedness of $\|A_1e_1^k+A_2e_2^k\|$ follows from the right-hand side of (8.6.41), whose second term is at most $\Phi_k-\Phi_{k+1}\leqslant\Phi_1$. Hence
$$\|e_y^k\|,\quad\|A_2e_2^k\|,\quad\|A_1e_1^k+A_2e_2^k\|$$
are all bounded. By the inequality
$$\|A_1e_1^k\|\leqslant\|A_1e_1^k+A_2e_2^k\|+\|A_2e_2^k\|,$$
$\{\|A_1e_1^k\|\}$ is also a bounded sequence. Since $A_1^\mathrm{T}A_1\succ 0$ and $A_2^\mathrm{T}A_2\succ 0$, this boundedness is equivalent to $\{(x_1^k,x_2^k,y^k)\}$ being a bounded sequence.
Another direct consequence of (8.6.41) is that the infinite series
$$\sum\limits_{k=0}^{\infty}\|A_{1}e_{1}^{k}+A_{2}e_{2}^{k}\|^{2},\quad \sum\limits_{k=0}^{\infty}\|A_{2}(x_{2}^{k+1}-x_{2}^{k})\|^{2}$$
both converge, which implies
$$\begin{aligned}&\|A_1e_1^k+A_2e_2^k\|=\|A_1x_1^k+A_2x_2^k-b\|\to 0,\\&\|A_2(x_2^{k+1}-x_2^k)\|\to 0.\end{aligned}\qquad(8.6.48)$$
We now derive convergence.
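The summability claim follows from a telescoping argument, sketched here for completeness. The abbreviations $c_1$ and $c_2$ for the two coefficients in (8.6.41) are introduced only for this display; both are positive for $\tau\in\left(0,\frac{1+\sqrt5}{2}\right)$, and $\Phi_{K+1}\geqslant 0$.

```latex
\begin{aligned}
&\sum_{k=1}^{K}\Big[c_1\rho\,\|A_2(x_2^{k}-x_2^{k+1})\|^2+c_2\rho\,\|A_1e_1^{k+1}+A_2e_2^{k+1}\|^2\Big]
\leqslant\sum_{k=1}^{K}\left(\Phi_k-\Phi_{k+1}\right)=\Phi_1-\Phi_{K+1}\leqslant\Phi_1,\\
&c_1=\min(\tau,\,1+\tau-\tau^2),\qquad c_2=\min(1,\,1+\tau^{-1}-\tau).
\end{aligned}
```

Letting $K\to\infty$ shows that both series have bounded partial sums of nonnegative terms, so they converge and their terms tend to zero.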
We first prove convergence along a subsequence. Since $\{(x_1^k,x_2^k,y^k)\}$ is bounded, it has a convergent subsequence; let
$$(x_1^{k_j},x_2^{k_j},y^{k_j})\to(x_1^\infty,x_2^\infty,y^\infty).$$
Recall the definitions of $u^k$ and $v^k$ in (8.6.39):
$$\begin{aligned}&u^{k+1}=-A_1^\mathrm{T}\left[y^{k+1}+(1-\tau)\rho(A_1e_1^{k+1}+A_2e_2^{k+1})+\rho A_2(x_2^k-x_2^{k+1})\right],\\&v^{k+1}=-A_2^\mathrm{T}\left[y^{k+1}+(1-\tau)\rho(A_1e_1^{k+1}+A_2e_2^{k+1})\right].\end{aligned}$$
As $k\to\infty$, (8.6.48) gives $\|A_2(x_2^{k+1}-x_2^k)\|\to 0$ and $\|A_1e_1^k+A_2e_2^k\|\to 0$, so the corresponding subsequences of $\{u^k\}$ and $\{v^k\}$ also converge:
$$u^{\infty}\stackrel{\mathrm{def}}{=}\lim_{j\to\infty}u^{k_{j}}=-A_{1}^{\mathrm{T}}y^{\infty},\quad v^{\infty}\stackrel{\mathrm{def}}{=}\lim_{j\to\infty}v^{k_{j}}=-A_{2}^{\mathrm{T}}y^{\infty}.\qquad(8.6.49)$$
By (8.6.40), $u^k\in\partial f_1(x_1^k)$ and $v^k\in\partial f_2(x_2^k)$ for every $k\geqslant 1$. Since the graph of the subgradient mapping is closed (Theorem 2.19), we conclude
$$-A_1^\mathrm{T}y^\infty\in\partial f_1(x_1^\infty),\quad-A_2^\mathrm{T}y^\infty\in\partial f_2(x_2^\infty).$$
From the first relation in (8.6.48),
$$\lim\limits_{j\to\infty}\|A_1x_1^{k_j}+A_2x_2^{k_j}-b\|=\|A_1x_1^{\infty}+A_2x_2^{\infty}-b\|=0.$$
This shows that $(x_1^\infty,x_2^\infty,y^\infty)$ is a KKT pair of the original problem. Consequently, $(x_1^*,x_2^*,y^*)$ can be replaced by $(x_1^\infty,x_2^\infty,y^\infty)$ throughout the analysis above.
To establish convergence of the whole sequence $\{(x_1^k,x_2^k,y^k)\}$, measure the errors relative to $(x_1^\infty,x_2^\infty,y^\infty)$. Then $\Phi_k$ is monotonically nonincreasing, and along the subsequence $\left\{\Phi_{k_j}\right\}$ we have
$$\begin{aligned}&\lim_{j\to\infty}\Phi_{k_{j}}\\={}&\lim\limits_{j\to\infty}\left(\frac{1}{\tau\rho}\|e_{y}^{k_{j}}\|^{2}+\rho\|A_{2}e_{2}^{k_{j}}\|^{2}+\max\left\{1-\tau,1-\frac{1}{\tau}\right\}\rho\|A_{1}e_{1}^{k_{j}}+A_{2}e_{2}^{k_{j}}\|^{2}\right)\\={}&0.\end{aligned}$$
A monotone sequence with a convergent subsequence converges to the same limit, so $\lim\limits_{k\to\infty}\Phi_k=0$. It follows immediately that
$$\begin{aligned}&0\leqslant\limsup_{k\to\infty}\frac{1}{\tau\rho}\|e_{y}^{k}\|^{2}\leqslant\limsup_{k\to\infty}\Phi_{k}=0,\\&0\leqslant\limsup_{k\to\infty}\rho\|A_{2}e_{2}^{k}\|^{2}\leqslant\limsup_{k\to\infty}\Phi_{k}=0,\\&0\leqslant\limsup_{k\to\infty}\left\{\max\left\{1-\tau,1-\frac{1}{\tau}\right\}\rho\|A_{1}e_{1}^{k}+A_{2}e_{2}^{k}\|^{2}\right\}\leqslant\limsup_{k\to\infty}\Phi_{k}=0.\end{aligned}$$
Together with (8.6.48), which already gives the third limit even when $\tau=1$, this shows
$$\|e_y^k\|\to 0,\quad\|A_2e_2^k\|\to 0,\quad\|A_1e_1^k+A_2e_2^k\|\to 0,$$
and therefore
$$0\leqslant\limsup\limits_{k\to\infty}\|A_1e_1^k\|\leqslant\lim\limits_{k\to\infty}\left(\|A_2e_2^k\|+\|A_1e_1^k+A_2e_2^k\|\right)=0.$$
Since $A_1^\mathrm{T}A_1\succ 0$ and $A_2^\mathrm{T}A_2\succ 0$, we finally obtain convergence of the whole sequence:
$$(x_1^k,x_2^k,y^k)\to(x_1^\infty,x_2^\infty,y^\infty).$$
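The conclusion of Theorem 8.16 can be illustrated on a small example. The sketch below again assumes the hypothetical toy objective $f_i(x_i)=\frac12\|x_i-c_i\|^2$ with square invertible $A_1,A_2$ (so both have full column rank) and arbitrary data. It runs ADMM from the origin and records the Euclidean distance of each iterate to the closed-form KKT pair. The name `admm_errors` and the default parameters are illustrative choices.

```python
import numpy as np

def admm_errors(A1, A2, b, c1, c2, rho=1.0, tau=1.0, iters=500):
    """Run ADMM on the toy problem; return the distance to the KKT pair at each iteration."""
    # Closed-form KKT pair of min 0.5||x1-c1||^2 + 0.5||x2-c2||^2 s.t. A1 x1 + A2 x2 = b.
    ys = np.linalg.solve(A1 @ A1.T + A2 @ A2.T, A1 @ c1 + A2 @ c2 - b)
    x1s, x2s = c1 - A1.T @ ys, c2 - A2.T @ ys

    x1, x2, y = np.zeros(A1.shape[1]), np.zeros(A2.shape[1]), np.zeros(A1.shape[0])
    H1 = np.eye(A1.shape[1]) + rho * A1.T @ A1
    H2 = np.eye(A2.shape[1]) + rho * A2.T @ A2
    errs = []
    for _ in range(iters):
        x1 = np.linalg.solve(H1, c1 - A1.T @ (y + rho * (A2 @ x2 - b)))   # x1-update
        x2 = np.linalg.solve(H2, c2 - A2.T @ (y + rho * (A1 @ x1 - b)))   # x2-update
        y = y + tau * rho * (A1 @ x1 + A2 @ x2 - b)                       # multiplier update
        errs.append(np.sqrt(np.sum((x1 - x1s) ** 2) + np.sum((x2 - x2s) ** 2)
                            + np.sum((y - ys) ** 2)))
    return np.array(errs)

# Hypothetical data, chosen only for illustration.
A1 = np.array([[2.0, 0.0], [1.0, 1.0]])
A2 = np.array([[1.0, -1.0], [0.0, 3.0]])
b = np.array([1.0, 2.0])
c1 = np.array([1.0, -1.0])
c2 = np.array([0.5, 2.0])
errs = admm_errors(A1, A2, b, c1, c2)
```

The theorem guarantees that the distance tends to zero, but not that the Euclidean distance itself decreases at every step; the monotone quantity in the analysis is $\Phi_k$.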