Consider the discrete-time linear time-varying system $\Delta_{k+1}=(I-\alpha_kG_k)\Delta_{k}$, where $\alpha_k \in \mathbb{R}$, $G_k$ is unknown but satisfies $\|G_k\| \leq c$, $\Delta_k \in \mathbb{R}^d$, and $I\in \mathbb{R}^{d \times d}$ is the identity matrix.
Question: what conditions must $\alpha_k$ satisfy for $\Delta_k$ to converge asymptotically to 0?
If you can solve this, send me a private message; you will be well rewarded!
2023.09.08
One lemma obtained so far:
(Lemma 1) If there exists $\rho < 1$ such that $\|I-\alpha_k G_k\|>\rho$ holds for only finitely many $k$, then $\Delta_k\rightarrow 0$.
Proof. From $\Delta_{k+1}=(I-\alpha_kG_k)\Delta_{k}$ we get $\Delta_k=\prod_{i=0}^{k-1}(I-\alpha_iG_i)\,\Delta_0$, and since the norm is submultiplicative:

$$\|\Delta_k\|=\Big\|\prod_{i=0}^{k-1}(I-\alpha_iG_i)\Delta_0\Big\| \leq \prod_{i=0}^{k-1}\|I-\alpha_iG_i\|\,\|\Delta_0\|$$

By assumption there are only finitely many indices, say $i_0<i_1<\dots<i_L \in \mathbb N$, for which $\|I-\alpha_iG_i\| > \rho$; for every other index $i \in \mathbb N$ we have $\|I-\alpha_iG_i\| \leq \rho$ (so in particular $0 \leq \rho < 1$). For $k > i_L$ the product can be split into the exceptional factors and the remaining ones:

$$\prod_{i=0}^{k-1}\|I-\alpha_iG_i\|\,\|\Delta_0\| = \prod_{j=0}^{L}\|I-\alpha_{i_j}G_{i_j}\| \times \prod_{\substack{0\leq i\leq k-1\\ i\notin\{i_0,\dots,i_L\}}}\|I-\alpha_iG_i\| \times \|\Delta_0\|$$

The first factor is a finite constant $M$ that does not depend on $k$. The second factor contains $k-L-1$ terms, each at most $\rho$, so:

$$\prod_{\substack{0\leq i\leq k-1\\ i\notin\{i_0,\dots,i_L\}}}\|I-\alpha_iG_i\| \leq \rho^{\,k-L-1}$$

Since $\rho<1$, $M\rho^{\,k-L-1}\|\Delta_0\| \rightarrow 0$ as $k\rightarrow\infty$, hence $\lim_{k\rightarrow \infty}\prod_{i=0}^{k-1}\|I-\alpha_iG_i\|\,\|\Delta_0\|=0$, i.e. $\|\Delta_k\|\rightarrow 0$, i.e. $\Delta_k\rightarrow 0$.
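To make the lemma concrete, here is a small numerical sketch in plain Python. It rests on assumptions that are not part of the original problem: each $G_k$ is diagonal with entries drawn from $[1, 2]$, the step-size schedule is made up, and the helper name `simulate` is hypothetical. It only checks that $\|\Delta_k\|$ stays below the product of the finitely many step norms exceeding $\rho$, times $\rho$ raised to the number of remaining steps, times $\|\Delta_0\|$.

```python
import random

def simulate(alphas, diag_per_step, delta0):
    """Iterate Delta_{k+1} = (I - alpha_k G_k) Delta_k for diagonal G_k.

    diag_per_step[k] holds the diagonal entries of G_k. Diagonal G_k is an
    assumption of this sketch; it makes the spectral norm of I - alpha_k G_k
    equal to max_j |1 - alpha_k * g_j|.
    Returns (norms of Delta_k for k = 0..K, norms of each step matrix).
    """
    delta = list(delta0)
    delta_norms = [sum(x * x for x in delta) ** 0.5]
    step_norms = []
    for alpha, diag in zip(alphas, diag_per_step):
        factors = [1.0 - alpha * g for g in diag]
        step_norms.append(max(abs(f) for f in factors))
        delta = [f * x for f, x in zip(factors, delta)]
        delta_norms.append(sum(x * x for x in delta) ** 0.5)
    return delta_norms, step_norms

random.seed(0)
d, steps, rho = 3, 60, 0.8
diag_per_step = [[random.uniform(1.0, 2.0) for _ in range(d)] for _ in range(steps)]

# Hypothetical schedule: the first 5 steps are far too large (expanding);
# afterwards alpha_k = 0.6 gives |1 - 0.6 * g| <= 0.4 <= rho for g in [1, 2].
alphas = [3.0] * 5 + [0.6] * (steps - 5)
delta_norms, step_norms = simulate(alphas, diag_per_step, [1.0, -2.0, 0.5])

# Indices of the finitely many steps whose norm exceeds rho.
bad = [k for k, s in enumerate(step_norms) if s > rho]
M = 1.0
for k in bad:
    M *= step_norms[k]

# Bound used in the proof: M * rho^(number of good steps) * ||Delta_0||.
bound = M * rho ** (steps - len(bad)) * delta_norms[0]
print("expanding steps:", bad)
print("bound holds:", delta_norms[-1] <= bound)
```

The expanding steps inflate the constant in front of the bound, but they cannot stop the geometric decay that follows.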
If $G_k$ is a real symmetric matrix, it can be orthogonally diagonalized as $G_k=Q_k^T\Lambda_kQ_k$, where $\Lambda_k$ is the diagonal matrix of its eigenvalues and $Q_k$ is a real orthogonal matrix: $Q_k^TQ_k=I$. Therefore, in the spectral norm:

$$\begin{aligned}\|I-\alpha_i G_i\| &=\|Q_i^TQ_i-\alpha_iQ_i^T\Lambda_iQ_i\|\\ &=\|Q_i^T(I-\alpha_i\Lambda_i)Q_i\|\\ &\leq \|Q_i^T\|\,\|I-\alpha_i\Lambda_i\|\,\|Q_i\|\\ &=\|I-\alpha_i\Lambda_i\|\\ &=\sqrt{\max \lambda\big((I-\alpha_i\Lambda_i)^T(I-\alpha_i\Lambda_i)\big)}\\ &=\max_{1\leq j \leq d}|1-\alpha_i\lambda_j(G_i)|\end{aligned}$$

By Lemma 1, having $\|I-\alpha_i G_i\|\leq \rho$ with $\rho < 1$ for all but finitely many $i$ is enough to guarantee $\Delta_k \rightarrow 0$.
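As a quick sanity check of the norm formula, here is a worked example with a hypothetical $2\times 2$ matrix. The numbers are chosen purely for illustration and are not from the original problem.

$$G=\begin{pmatrix}1&0\\0&3\end{pmatrix},\quad \alpha=\tfrac12:\qquad I-\alpha G=\begin{pmatrix}\tfrac12&0\\0&-\tfrac12\end{pmatrix},\qquad \|I-\alpha G\|=\max\big\{|1-\tfrac12\cdot 1|,\ |1-\tfrac12\cdot 3|\big\}=\tfrac12$$

With $\alpha=1$ instead, $I-\alpha G=\mathrm{diag}(0,-2)$ and the norm is $2$, so that step expands rather than contracts.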
In practice, assuming every eigenvalue $\lambda_j(G_i)$ is positive, this means the learning rate $\alpha_i$ only has to satisfy the following: $\exists\rho <1$ such that $\forall j=1,2,\dots,d$ and for all but finitely many $i$:

$$\frac{1-\rho}{\lambda_j(G_i)}\leq \alpha_i \leq \frac{1+\rho}{\lambda_j(G_i)}$$

Writing $\lambda_{\min}(G_i)=\min_{1\leq j \leq d}\lambda_j(G_i)$ and $\lambda_{\max}(G_i)=\max_{1\leq j \leq d}\lambda_j(G_i)$, this holds as soon as all but finitely many $i$ satisfy

$$\alpha_i \in \left[\frac{1-\rho}{\lambda_{\min}(G_i)},\ \frac{1+\rho}{\lambda_{\max}(G_i)}\right]$$

which guarantees $\Delta_k \rightarrow 0$.
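The interval condition can be sketched as a small helper. This is only an illustration: the function names `step_interval` and `contraction` and the example spectrum are hypothetical, and the code assumes the eigenvalues of $G_i$ are known and strictly positive, which the original problem does not grant (there $G_k$ is unknown).

```python
def step_interval(eigs, rho):
    """Return (lo, hi) with max_j |1 - alpha * eig_j| <= rho exactly when
    lo <= alpha <= hi, or None when no such alpha exists.

    Assumes every eigenvalue is strictly positive and 0 <= rho < 1.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    if min(eigs) <= 0.0:
        raise ValueError("all eigenvalues must be positive")
    lo = (1.0 - rho) / min(eigs)   # lower end is set by the smallest eigenvalue
    hi = (1.0 + rho) / max(eigs)   # upper end is set by the largest eigenvalue
    return (lo, hi) if lo <= hi else None

def contraction(eigs, alpha):
    """Spectral norm of I - alpha*G for a symmetric G with eigenvalues eigs."""
    return max(abs(1.0 - alpha * g) for g in eigs)

eigs = [1.0, 2.0, 4.0]                   # hypothetical spectrum of some G_i
interval = step_interval(eigs, rho=0.8)  # non-empty: roughly (0.2, 0.45)
print(interval)
print(step_interval(eigs, rho=0.5))      # prints None: rho is too small here
```

The second call returns `None` because the interval is empty whenever $\rho$ is smaller than $(\kappa-1)/(\kappa+1)$, where $\kappa=\lambda_{\max}/\lambda_{\min}$ (here $\kappa=4$, so the threshold is $0.6$).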
--------------------Update 2023.12.23----------------------
Building on the previous conclusion: take $\rho \in [\frac{c-1}{c+1},1)$, where $c$ now denotes $c=\sup_i{\frac{\lambda_{\max}(G_i)}{\lambda_{\min}(G_i)}}$ (assumed finite; this lower bound on $\rho$ is exactly what keeps the interval below non-empty). If the step sizes eventually stay in the interval, that is, $\alpha_i \in \left[\frac{1-\rho}{\lambda_{\min}(G_i)},\frac{1+\rho}{\lambda_{\max}(G_i)}\right]$ for all but finitely many $i$ (for instance because a sequence $\alpha_i\rightarrow \alpha$ settles inside it), then necessarily $\Delta_i \rightarrow 0$. Moreover $\rho$ directly determines the convergence rate $O(\rho^n)$, which becomes $O((\frac{c-1}{c+1})^n)$ for the smallest admissible $\rho$.
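A minimal sketch of the best case, under the simplifying assumption that $G$ is fixed and diagonal with known extreme eigenvalues (the values `1.0` and `3.0` and the helper name `best_rate_and_step` are hypothetical). At the smallest admissible $\rho$ the interval collapses to the single step size $2/(\lambda_{\min}+\lambda_{\max})$, and the error then shrinks by exactly that $\rho$ per step along the two extreme eigen-directions.

```python
def best_rate_and_step(lam_min, lam_max):
    """Smallest rho for which the step interval is non-empty, together with
    the single step size that interval then contains.

    Assumes 0 < lam_min <= lam_max.
    """
    kappa = lam_max / lam_min
    rho = (kappa - 1.0) / (kappa + 1.0)
    alpha = 2.0 / (lam_min + lam_max)
    return rho, alpha

lam_min, lam_max = 1.0, 3.0   # hypothetical fixed spectrum, ratio c = 3
rho, alpha = best_rate_and_step(lam_min, lam_max)

# Track the components of Delta along the two extreme eigen-directions.
delta = [1.0, 1.0]
n_steps = 10
for _ in range(n_steps):
    delta = [(1.0 - alpha * lam) * x for lam, x in zip((lam_min, lam_max), delta)]

largest = max(abs(x) for x in delta)   # largest component after n_steps
print(rho, alpha)
print(largest, rho ** n_steps)
```

Here both components are multiplied by $\pm\frac12$ at every step, so the largest component after ten steps coincides with $\rho^{10}$.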