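Reference video for these notes: a lecture by a Bilibili uploader.
1. [Optimization Algorithms] Gradient Descent: Convergence Analysis for Strongly Convex Functions (Part 1)
1.1 Overview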
- $f$ is bounded below, $m$-strongly convex, and differentiable
- $\nabla f$ is $L$-Lipschitz continuous
- $\alpha \in (0,\frac{2}{L+m})$
Then $\{x_k\}$ converges Q-linearly to $x^*$: $\{x_k \}\xrightarrow[]{\text{Q-linear}}x^*$
1.2 Proof
To show that $\{x_k \}$ converges linearly, we need to show that there is a constant $c$ such that, for every $k$:
\begin{equation} \frac{||x_{k+1}-x^*||}{||x_{k}-x^*||}\le c,\quad c \in(0,1) \end{equation}
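As a small illustration of this definition, the sketch below computes the successive error ratios for two made-up error sequences. The function name `successive_ratios` and both sequences are assumptions chosen for illustration; they do not come from the lecture. A geometric sequence keeps its ratio at a fixed value below 1, while the sequence $1/k$ has ratios that creep up to 1, so no single $c<1$ bounds all of them.

```python
def successive_ratios(errors):
    """Return the ratios e_{k+1} / e_k for positive errors e_k = ||x_k - x*||."""
    return [nxt / cur for cur, nxt in zip(errors, errors[1:])]

# Hypothetical error sequences (illustrative only).
geometric = [0.5 ** k for k in range(20)]      # e_k = 0.5^k
harmonic = [1.0 / k for k in range(1, 2001)]   # e_k = 1/k

geo_ratios = successive_ratios(geometric)
har_ratios = successive_ratios(harmonic)

# Every ratio of the geometric sequence equals 0.5, so c = 0.5 works.
print(max(geo_ratios))
# The ratios k/(k+1) of the harmonic sequence approach 1 from below,
# so the sequence converges but not Q-linearly.
print(har_ratios[-1])
```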
We write gradient descent as follows, where $\alpha_k$ denotes the step size and $p_k$ the search direction. Here $p_k=-\nabla f(x_k)$ and the step size is constant, $\alpha_k=\alpha$:
\begin{equation} x_{k+1}=x_k-\alpha\nabla f(x_k) \end{equation}
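The update rule can be sketched in a few lines of plain Python. This is a minimal sketch, not the lecture's code: the helper `gradient_descent` and the test function $f(x)=\frac{1}{2}(m x_1^2+L x_2^2)$ with $m=1$, $L=10$ are assumptions chosen so that $f$ is $m$-strongly convex, $\nabla f$ is $L$-Lipschitz, and the minimizer is $x^*=(0,0)$.

```python
import math

def gradient_descent(grad, x0, alpha, num_iters):
    """Iterate x_{k+1} = x_k - alpha * grad(x_k) and return all iterates."""
    xs = [list(x0)]
    for _ in range(num_iters):
        x = xs[-1]
        g = grad(x)
        xs.append([xi - alpha * gi for xi, gi in zip(x, g)])
    return xs

# Hypothetical test problem: f(x) = 0.5 * (m * x_1^2 + L * x_2^2), x* = (0, 0).
m, L = 1.0, 10.0

def grad_f(x):
    return [m * x[0], L * x[1]]

alpha = 1.0 / L  # lies inside (0, 2 / (L + m)) because m < L
iterates = gradient_descent(grad_f, x0=[1.0, 1.0], alpha=alpha, num_iters=200)

# Distance of the last iterate from the minimizer x* = (0, 0).
final_error = math.hypot(*iterates[-1])
print(final_error)
```

Any step size in $(0,\frac{2}{L+m})$ could be used here; $\alpha=\frac{1}{L}$ is just one convenient choice.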
- Then:
\begin{equation} ||x_{k+1}-x^*||^2=||x_k-\alpha_k\nabla f(x_k)-x^*||^2=||x_k-x^*||^2-2\alpha_k\nabla f^T(x_k)(x_k-x^*)+\alpha_k^2||\nabla f(x_k)||^2 \end{equation}
- The gradient vanishes at $x^*$, i.e. $\nabla f^T(x^*)=0$, so the equation above can be rewritten as:
\begin{equation} ||x_{k+1}-x^*||^2=||x_k-x^*||^2-2\alpha_k[\nabla f^T(x_k)-\nabla f^T(x^*)](x_k-x^*)+\alpha_k^2||\nabla f(x_k)||^2 \end{equation}
- Define the function $g(x)$ as follows:
\begin{equation} g(x)\triangleq f(x)-\frac{1}{2}mx^Tx;\quad\nabla g(x)=\nabla f(x)-mx \end{equation}
- Because $f$ is $m$-strongly convex, $g(x)$ is convex; because $f$ is differentiable, $g$ is differentiable as well.
- Define $h(x)$ as follows; it is convex because $\nabla f$ is $L$-Lipschitz continuous:
\begin{equation} h(x)\triangleq \frac{1}{2}Lx^Tx-f(x) \end{equation}
- Substituting $f(x)=g(x)+\frac{1}{2}mx^Tx$ gives:
\begin{equation} h(x)=\frac{1}{2}Lx^Tx-\frac{1}{2}mx^Tx-g(x)=\frac{1}{2}(L-m)x^Tx-g(x) \end{equation}
- Since $g(x)$ and $\frac{1}{2}(L-m)x^Tx-g(x)$ are both convex, the Baillon-Haddad theorem (condition 2 $\Rightarrow$ condition 3) shows that, for $L\ne m$, $\nabla g(x)$ is co-coercive:
\begin{equation} (\nabla g(x)-\nabla g(y))^T(x-y)\ge \frac{1}{L-m}||\nabla g(x)-\nabla g(y)||^2 \end{equation}
- Substituting $\nabla g(x)=\nabla f(x)-mx$ gives:
\begin{equation} [\nabla f(x)-\nabla f(y)-m(x-y)]^T(x-y)\ge \frac{1}{L-m}||\nabla f(x)-\nabla f(y)-m(x-y)||^2 \end{equation}
- Splitting the left-hand side gives:
\begin{equation} [\nabla f(x)-\nabla f(y)]^T(x-y)-m(x-y)^T(x-y)\ge \frac{1}{L-m}||\nabla f(x)-\nabla f(y)-m(x-y)||^2 \end{equation}
\begin{equation} [\nabla f(x)-\nabla f(y)]^T(x-y)-m||x-y||^2\ge \frac{1}{L-m}||\nabla f(x)-\nabla f(y)-m(x-y)||^2 \end{equation}
- Expanding the right-hand side, denoted $Q(x)$, gives:
\begin{equation} Q(x)= \frac{1}{L-m}||\nabla f(x)-\nabla f(y)||^2+\frac{m^2}{L-m}||x-y||^2- \frac{2m}{L-m}[\nabla f(x)-\nabla f(y)]^T(x-y) \end{equation}
- So:
\begin{equation} [\nabla f(x)-\nabla f(y)]^T(x-y)-m||x-y||^2\ge Q(x) \end{equation}
- Moving the cross term of $Q(x)$ to the left-hand side gives:
\begin{equation} (1+\frac{2m}{L-m})[\nabla f(x)-\nabla f(y)]^T(x-y)-m||x-y||^2\ge \frac{||\nabla f(x)-\nabla f(y)||^2+m^2||x-y||^2}{L-m} \end{equation}
- Moving $m||x-y||^2$ to the right-hand side gives:
\begin{equation} (1+\frac{2m}{L-m})[\nabla f(x)-\nabla f(y)]^T(x-y)\ge \frac{||\nabla f(x)-\nabla f(y)||^2}{L-m}+(m+\frac{m^2}{L-m})||x-y||^2 \end{equation}
- Since $1+\frac{2m}{L-m}=\frac{L+m}{L-m}$ and $m+\frac{m^2}{L-m}=\frac{Lm}{L-m}$, multiplying both sides by $\frac{L-m}{L+m}$ gives:
\begin{equation} [\nabla f(x)-\nabla f(y)]^T(x-y)\ge \frac{||\nabla f(x)-\nabla f(y)||^2}{L+m}+\frac{Lm}{L+m}||x-y||^2 \end{equation}
- Setting $x=x_k,y=x^*$ in this inequality gives:
\begin{equation} [\nabla f(x_k)-\nabla f(x^*)]^T(x_k-x^*)\ge \frac{||\nabla f(x_k)-\nabla f(x^*)||^2}{L+m}+\frac{Lm}{L+m}||x_k-x^*||^2 \end{equation}
- Recall the identity derived earlier:
\begin{equation} ||x_{k+1}-x^*||^2=||x_k-x^*||^2-2\alpha_k[\nabla f(x_k)-\nabla f(x^*)]^T(x_k-x^*)+\alpha_k^2||\nabla f(x_k)||^2 \end{equation}
- Rearranging it gives:
\begin{equation} [\nabla f(x_k)-\nabla f(x^*)]^T(x_k-x^*)=\frac{1}{2\alpha_k}[||x_k-x^*||^2+\alpha_k^2||\nabla f(x_k)||^2-||x_{k+1}-x^*||^2] \end{equation}
- Substituting this into the inequality gives:
\begin{equation} \frac{1}{2\alpha_k}[||x_k-x^*||^2+\alpha_k^2||\nabla f(x_k)||^2-||x_{k+1}-x^*||^2]\ge \frac{||\nabla f(x_k)-\nabla f(x^*)||^2}{L+m}+\frac{Lm}{L+m}||x_k-x^*||^2 \end{equation}
- Because $\alpha_k=\alpha \in (0,\frac{2}{L+m})$ and $\nabla f(x^*)=0$:
\begin{equation} \frac{\alpha_k}{2}||\nabla f(x_k)||^2- \frac{||\nabla f(x_k)||^2}{L+m}\le 0 \end{equation}
- Multiplying the substituted inequality by $2\alpha_k$, solving for $||x_{k+1}-x^*||^2$, and dropping the non-positive term above gives:
\begin{equation} ||x_{k+1}-x^*||^2\le(1-\frac{2\alpha Lm}{L+m})||x_k-x^*||^2 \end{equation}
- Clearly $(1-\frac{2\alpha Lm}{L+m})<1$.
- Because $\alpha \in (0,\frac{2}{L+m})$:
\begin{equation} 1-\frac{2\alpha Lm}{L+m}>1-\frac{4Lm}{(L+m)^2}=\frac{(L-m)^2}{(L+m)^2}\ge 0 \end{equation}
- Hence, for $L\ne m$:
\begin{equation} ||x_{k+1}-x^*||^2\le c||x_k-x^*||^2;\quad c=(1-\frac{2\alpha Lm}{L+m}),\ 0<c<1 \end{equation}
\begin{equation} ||x_{k+1}-x^*||\le \sqrt{c}||x_k-x^*||;\quad c=(1-\frac{2\alpha Lm}{L+m}),\ 0<\sqrt{c}<1 \end{equation}
\begin{equation} \frac{||x_{k+1}-x^*||}{||x_k-x^*||}\le (1-\frac{2\alpha Lm}{L+m})^{\frac{1}{2}};\quad c=(1-\frac{2\alpha Lm}{L+m}),\ 0<\sqrt{c}<1 \end{equation}
Then $\{x_k\}$ converges Q-linearly to $x^*$: $\{x_k \}\xrightarrow[]{\text{Q-linear}}x^*$
That completes the proof!
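The final bound can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not part of the lecture: it uses a hypothetical diagonal quadratic $f(x)=\frac{1}{2}(m x_1^2+L x_2^2)$ with minimizer $x^*=(0,0)$, and the helper names `norm` and `check_contraction` are made up here. For several step sizes in $(0,\frac{2}{L+m})$ it compares the largest observed ratio $||x_{k+1}-x^*||/||x_k-x^*||$ with $\sqrt{c}$, where $c=1-\frac{2\alpha Lm}{L+m}$.

```python
import math

def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))

def check_contraction(m, L, alpha, x0, num_iters=50):
    """Run gradient descent on f(x) = 0.5 * (m * x_1^2 + L * x_2^2), x* = (0, 0).

    Returns (largest observed ratio ||x_{k+1}|| / ||x_k||, bound sqrt(c)),
    where c = 1 - 2 * alpha * L * m / (L + m).
    """
    bound = math.sqrt(1.0 - 2.0 * alpha * L * m / (L + m))
    x = list(x0)
    worst = 0.0
    for _ in range(num_iters):
        if norm(x) == 0.0:  # already at the minimizer, nothing left to compare
            break
        # Gradient of the quadratic is (m * x_1, L * x_2).
        x_next = [x[0] - alpha * m * x[0], x[1] - alpha * L * x[1]]
        worst = max(worst, norm(x_next) / norm(x))
        x = x_next
    return worst, bound

m, L = 1.0, 10.0  # hypothetical constants, so 2 / (L + m) is about 0.1818
for alpha in (0.05, 0.10, 0.18):
    worst, bound = check_contraction(m, L, alpha, x0=[1.0, 1.0])
    print(alpha, worst, bound, worst <= bound + 1e-12)
```

A check on one quadratic does not prove the theorem; it only shows the observed ratios staying below the bound for these particular choices.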
2. [Optimization Algorithms] Gradient Descent: Convergence Analysis for Strongly Convex Functions (Part 2)
2.1 Overview
- $f$ is bounded below, $m$-strongly convex, and twice differentiable
- $\nabla f$ is $L$-Lipschitz continuous
- $\alpha \in(0,\frac{2}{L+m})$
Then $\{x_k\}$ converges Q-linearly to $x^*$, $\{x_k \}\xrightarrow[]{\text{Q-linear}}x^*$, with:
\begin{equation} \frac{||x_{k+1}-x^*||}{||x_k-x^*||}\le (1-\frac{2\alpha Lm}{L+m})^{\frac{1}{2}} \end{equation}
2.2 Proof
- Since $f$ is $m$-strongly convex and twice differentiable:
\begin{equation} \nabla^2 f\succeq mI,\ \text{i.e. } \nabla^2 f- mI \text{ is positive semidefinite} \end{equation}
- Since $\nabla f$ is $L$-Lipschitz continuous and $f$ is twice differentiable:
\begin{equation} \nabla^2 f\preceq LI,\ \text{i.e. } LI-\nabla^2 f \text{ is positive semidefinite} \end{equation}
- Combining the two:
\begin{equation} mI\preceq \nabla^2 f\preceq LI \end{equation}
- Because $\nabla^2f$ is symmetric positive definite, it admits an orthogonal eigendecomposition:
\begin{equation} \nabla^2 f=Q\Lambda Q^T,\quad\Lambda=\begin{bmatrix}\lambda_1\\&\lambda_2\\&&\ddots\\&&&\lambda_n\end{bmatrix},\quad\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n \end{equation}
- Since $Q$ is orthogonal, $Q^{-1}=Q^T$, and therefore:
\begin{equation} \nabla^2 f- mI =Q\Lambda Q^{-1}-QmIQ^{-1}=Q \begin{bmatrix}\lambda_1-m\\&\lambda_2-m\\&&\ddots\\&&&\lambda_n-m\end{bmatrix}Q^{-1}\succeq0 \end{equation}
- It follows that:
\begin{equation} \lambda_i-m\ge0,\forall i=1,2,\cdots,n\rightarrow \lambda_{min}=\lambda_n\ge m \end{equation}
- Applying the same argument to the upper bound in
\begin{equation} mI\preceq \nabla^2 f\preceq LI \end{equation}
- gives:
\begin{equation} \lambda_{max}=\lambda_1\le L \end{equation}
- In summary:
\begin{equation} 0<m\le\lambda_{min}\le \lambda_{max}\le L \end{equation}
- Take $L=\lambda_{max},m=\lambda_{min},\alpha=\frac{1}{L}$. Then $1-\frac{2\alpha Lm}{L+m}=\frac{L-m}{L+m}$, and the bound becomes:
\begin{equation} \frac{||x_{k+1}-x^*||}{||x_k-x^*||}\le (\frac{\lambda_{max}-\lambda_{min}}{\lambda_{max}+\lambda_{min}})^{\frac{1}{2}}=(\frac{\lambda_{max}/\lambda_{min}-1}{\lambda_{max}/\lambda_{min}+1})^{\frac{1}{2}} \end{equation}
- Define the condition number of $\nabla^2f$ as:
\begin{equation} \mathbb{K}(\nabla^2f)=\frac{\lambda_{max}}{\lambda_{min}} \end{equation}
- Then:
\begin{equation} \frac{||x_{k+1}-x^*||}{||x_k-x^*||}\le (\frac{\mathbb{K}(\nabla^2f)-1}{\mathbb{K}(\nabla^2f)+1})^{\frac{1}{2}} \end{equation}
- As $\mathbb{K}(\nabla^2f)\to \infty$ the problem is called ill-conditioned, and the bound on the contraction factor tends to 1:
\begin{equation} \lim_{\mathbb{K}(\nabla^2f)\to \infty}(\frac{\mathbb{K}(\nabla^2f)-1}{\mathbb{K}(\nabla^2f)+1})^{\frac{1}{2}}=1 \end{equation}
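To get a feel for how quickly the bound degrades, the sketch below evaluates $(\frac{\mathbb{K}-1}{\mathbb{K}+1})^{\frac{1}{2}}$ for a few condition numbers and estimates how many steps the bound needs before it guarantees an error reduction by a factor of $10^{-6}$. The helper names `rate_bound` and `steps_to_tolerance`, the tolerance, and the sample condition numbers are all assumptions made for illustration. The step counts describe the worst case allowed by the bound with $\alpha=\frac{1}{L}$, not the behavior of any particular problem.

```python
import math

def rate_bound(kappa):
    """Bound sqrt((kappa - 1) / (kappa + 1)) on the per-step error ratio."""
    return math.sqrt((kappa - 1.0) / (kappa + 1.0))

def steps_to_tolerance(kappa, tol=1e-6):
    """Roughly the smallest k with rate_bound(kappa)**k <= tol, for kappa > 1."""
    return math.ceil(math.log(tol) / math.log(rate_bound(kappa)))

# Hypothetical condition numbers, from well-conditioned to ill-conditioned.
kappas = [2.0, 10.0, 100.0, 1000.0, 10000.0]
for kappa in kappas:
    print(kappa, rate_bound(kappa), steps_to_tolerance(kappa))
```

The printed bound moves toward 1 and the step estimate grows as the condition number increases, which is the numerical counterpart of the limit above.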