Courant–Fischer min-max theorem
Let $A$ be an $n\times n$ Hermitian matrix with eigenvalues $\lambda_1\le\lambda_2\le\dots\le\lambda_n$. Then
$$\lambda_i=\min_{\dim V=i,\ V\subseteq\mathbb{C}^n}\ \max_{x\in V,\ \|x\|_2=1} x^HAx$$
$$\lambda_i=\max_{\dim V=n-i+1,\ V\subseteq\mathbb{C}^n}\ \min_{x\in V,\ \|x\|_2=1} x^HAx$$
Proof:
First, let $u_1,\dots,u_n$ be orthonormal eigenvectors of $A$ corresponding to $\lambda_1,\dots,\lambda_n$.
For any $i$-dimensional subspace $V$,
$$\dim\big(V\cap\operatorname{span}\{u_i,\dots,u_n\}\big)=\dim V+\dim\operatorname{span}\{u_i,\dots,u_n\}-\dim\big(V+\operatorname{span}\{u_i,\dots,u_n\}\big)\ge i+(n-i+1)-n=1.$$
So we can pick $x\in V\cap\operatorname{span}\{u_i,\dots,u_n\}$ with $x=\sum_{k=i}^n a_ku_k$ and $\|x\|_2=1$, that is, $\sum_{k=i}^n|a_k|^2=1$. Then
$$x^HAx=x^H\sum_{k=i}^n a_k\lambda_ku_k=\sum_{k=i}^n|a_k|^2\lambda_k\ge\lambda_i$$
Hence, for every $i$-dimensional subspace $V$,
$$\max_{x\in V,\ \|x\|_2=1} x^HAx\ge\lambda_i$$
and therefore
$$\min_{\dim V=i}\ \max_{x\in V,\ \|x\|_2=1} x^HAx\ge\lambda_i$$
Equality is attained for $V=\operatorname{span}\{u_1,\dots,u_i\}$, because every unit vector $x=\sum_{k=1}^i a_ku_k$ in that subspace satisfies $x^HAx=\sum_{k=1}^i|a_k|^2\lambda_k\le\lambda_i$. Hence
$$\lambda_i=\min_{\dim V=i,\ V\subseteq\mathbb{C}^n}\ \max_{x\in V,\ \|x\|_2=1} x^HAx$$
The max-min form follows by applying the same argument to $-A$.
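The min-max characterization can be checked numerically. The sketch below assumes NumPy is available; the helper name `max_rayleigh_on_subspace`, the matrix size, and the random seed are illustrative choices, not part of the original notes. It uses the fact that the maximum of $x^HAx$ over unit vectors in a subspace with orthonormal basis $Q$ is the largest eigenvalue of $Q^HAQ$.

```python
import numpy as np

def max_rayleigh_on_subspace(A, basis):
    """Maximum of x^H A x over unit vectors x in span(basis).

    With Q an orthonormal basis of the subspace, x = Q c and ||x|| = ||c||,
    so the maximum is the largest eigenvalue of Q^H A Q.
    """
    Q, _ = np.linalg.qr(basis)                 # reduced QR: orthonormal columns
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q)[-1]

rng = np.random.default_rng(0)
n, i = 6, 3                                    # illustrative sizes

M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                       # random Hermitian test matrix
lam, U = np.linalg.eigh(A)                     # ascending eigenvalues, orthonormal eigenvectors

# The minimizing subspace from the proof: span{u_1, ..., u_i}.
attained = max_rayleigh_on_subspace(A, U[:, :i])

# Random i-dimensional subspaces should never give a smaller maximum.
random_maxima = [
    max_rayleigh_on_subspace(
        A, rng.standard_normal((n, i)) + 1j * rng.standard_normal((n, i))
    )
    for _ in range(200)
]

print("lambda_i               :", lam[i - 1])
print("max on span{u_1..u_i}  :", attained)
print("smallest random maximum:", min(random_maxima))
```

Random sampling only probes the inequality; it does not replace the proof, since the minimum is attained on a single special subspace.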
Weyl's inequality
Form 1
For Hermitian $A,B\in\mathbb{C}^{n\times n}$ and $i=1,\dots,n$,
$$\lambda_i(A)+\lambda_1(B)\le\lambda_i(A+B)\le\lambda_i(A)+\lambda_n(B)$$
or equivalently
$$\lambda_1(B)\le\lambda_i(A+B)-\lambda_i(A)\le\lambda_n(B)$$
Proof:
By the Courant–Fischer min-max theorem,
$$\begin{aligned}
\lambda_i(A+B)&=\min_{\dim V=i,\ V\subseteq\mathbb{C}^n}\ \max_{x\in V,\ x\ne0}\frac{x^H(A+B)x}{x^Hx}\\
&\le\min_{\dim V=i,\ V\subseteq\mathbb{C}^n}\ \bigg(\max_{x\in V,\ x\ne0}\frac{x^HAx}{x^Hx}+\max_{x\in V,\ x\ne0}\frac{x^HBx}{x^Hx}\bigg)\le\lambda_i(A)+\lambda_n(B)
\end{aligned}$$
The last step uses $\max_{x\in V,\ x\ne0}\frac{x^HBx}{x^Hx}\le\lambda_n(B)$ for every subspace $V$. The lower bound follows in the same way from the max-min form.
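As a quick sanity check of Form 1, the following sketch compares the eigenvalue shifts $\lambda_i(A+B)-\lambda_i(A)$ with the extreme eigenvalues of $B$ on random Hermitian matrices. It assumes NumPy; the helper `random_hermitian`, the size, and the seed are hypothetical choices made for illustration.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix (illustrative helper)."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

rng = np.random.default_rng(1)
n = 5
A = random_hermitian(n, rng)
B = random_hermitian(n, rng)

lam_A = np.linalg.eigvalsh(A)        # ascending order
lam_B = np.linalg.eigvalsh(B)
lam_AB = np.linalg.eigvalsh(A + B)

shift = lam_AB - lam_A               # lambda_i(A+B) - lambda_i(A), i = 1..n
tol = 1e-12                          # slack for floating-point rounding
weyl_holds = bool(
    np.all(lam_B[0] - tol <= shift) and np.all(shift <= lam_B[-1] + tol)
)

print("shifts        :", shift)
print("lambda_1(B)   :", lam_B[0])
print("lambda_n(B)   :", lam_B[-1])
print("Form 1 holds? :", weyl_holds)
```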
Form 2
Note that for a Hermitian matrix $\|A\|_2=\max_{i=1,\dots,n}|\lambda_i(A)|$. Applying this to $B$ in Form 1 gives
$$\max_{i=1,\dots,n}|\lambda_i(A+B)-\lambda_i(A)|\le\|B\|_2$$
Form 3
If $B$ is positive semidefinite, then $\lambda_1(B)\ge0$, so
$$\lambda_i(A)\le\lambda_i(A+B)$$
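A small worked example may help make Forms 2 and 3 concrete. The matrices below are chosen only for illustration and do not come from the original notes.

$$A=\begin{pmatrix}1&0\\0&3\end{pmatrix},\qquad B=\begin{pmatrix}1&1\\1&1\end{pmatrix},\qquad A+B=\begin{pmatrix}2&1\\1&4\end{pmatrix}$$

Here $\lambda(A)=\{1,3\}$ and $\lambda(B)=\{0,2\}$, so $B$ is positive semidefinite and $\|B\|_2=2$. The characteristic polynomial of $A+B$ is $t^2-6t+7$, which gives

$$\lambda_1(A+B)=3-\sqrt2\approx1.586,\qquad\lambda_2(A+B)=3+\sqrt2\approx4.414$$

The shifts are $\lambda_1(A+B)-\lambda_1(A)\approx0.586$ and $\lambda_2(A+B)-\lambda_2(A)\approx1.414$. Both lie in $[\lambda_1(B),\lambda_2(B)]=[0,2]$, so they are nonnegative (Form 3) and bounded in absolute value by $\|B\|_2=2$ (Form 2).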
Hoffman–Wielandt inequality
For Hermitian $A,B\in\mathbb{C}^{n\times n}$ with eigenvalues sorted in ascending order,
$$\sum_{i=1}^n\big(\lambda_i(A)-\lambda_i(B)\big)^2\le\|A-B\|_F^2$$
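The inequality is easy to probe numerically. This sketch assumes NumPy and uses random Hermitian matrices; the helper `random_hermitian`, the size, and the seed are illustrative assumptions.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix (illustrative helper)."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

rng = np.random.default_rng(2)
n = 6
A = random_hermitian(n, rng)
B = random_hermitian(n, rng)

# eigvalsh returns eigenvalues in ascending order, matching the pairing
# lambda_i(A) <-> lambda_i(B) used in the inequality.
lhs = float(np.sum((np.linalg.eigvalsh(A) - np.linalg.eigvalsh(B)) ** 2))
rhs = float(np.linalg.norm(A - B, "fro") ** 2)

print("sum of squared eigenvalue gaps:", lhs)
print("||A - B||_F^2                 :", rhs)
```

The pairing of sorted eigenvalues matters: the check compares the $i$-th smallest eigenvalue of $A$ with the $i$-th smallest eigenvalue of $B$.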
Cauchy interlacing theorem
Form 1
If $C$ is a principal submatrix of $A$ of order $n-1$, then
$$\lambda_i(A)\le\lambda_i(C)\le\lambda_{i+1}(A),\quad i=1,\dots,n-1$$
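Corollary
If $\lambda$ is an eigenvalue of $A$ with multiplicity $m$, then $\lambda$ is an eigenvalue of $C$ with multiplicity at least $m-1$. Indeed, if $\lambda_j(A)=\dots=\lambda_{j+m-1}(A)=\lambda$, then Form 1 gives $\lambda=\lambda_i(A)\le\lambda_i(C)\le\lambda_{i+1}(A)=\lambda$ for $i=j,\dots,j+m-2$.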
Form 2
If $C$ is a principal submatrix of $A$ of order $n-k$, then
$$\lambda_i(A)\le\lambda_i(C)\le\lambda_{i+k}(A),\quad i=1,\dots,n-k$$
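Forms 1 and 2 can be illustrated by deleting the same $k$ rows and columns from a random Hermitian matrix and comparing spectra. The sketch assumes NumPy; the size, the value of `k`, and the randomly chosen deleted indices are illustrative. Setting `k = 1` gives Form 1.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 7, 2                                   # illustrative sizes

M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                      # random Hermitian test matrix

# A principal submatrix keeps the SAME index set for rows and columns.
drop = rng.choice(n, size=k, replace=False)   # indices to delete
keep = np.setdiff1d(np.arange(n), drop)       # remaining indices, sorted
C = A[np.ix_(keep, keep)]                     # (n-k) x (n-k) principal submatrix

lam_A = np.linalg.eigvalsh(A)                 # ascending
lam_C = np.linalg.eigvalsh(C)

tol = 1e-12                                   # slack for rounding
# 0-based version of lambda_i(A) <= lambda_i(C) <= lambda_{i+k}(A), i = 1..n-k
interlaces = bool(
    np.all(lam_A[: n - k] <= lam_C + tol) and np.all(lam_C <= lam_A[k:] + tol)
)

print("lambda(A):", lam_A)
print("lambda(C):", lam_C)
print("interlacing holds?", interlaces)
```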
Form 3
Let $X\in\mathbb{C}^{n\times m}$ with $X^HX=I_m$ and $n\ge m$. Then
$$\lambda_i(A)\le\lambda_i(X^HAX)\le\lambda_{i+n-m}(A),\quad i=1,\dots,m$$
Proof:
By the Courant–Fischer min-max theorem,
$$\begin{aligned}
\lambda_i(X^HAX)&=\min_{\dim V=i,\ V\subseteq\mathbb{C}^m}\ \max_{x\in V,\ x\ne0}\frac{x^HX^HAXx}{x^Hx}=\min_{\dim V=i,\ V\subseteq\mathbb{C}^m}\ \max_{x\in V,\ x\ne0}\frac{x^HX^HAXx}{(Xx)^HXx}\\
&=\min_{\dim V=i,\ V\subseteq\mathbb{C}^m,\ W=XV}\ \max_{y\in W,\ y\ne0}\frac{y^HAy}{y^Hy}\ge\min_{\dim W=i,\ W\subseteq\mathbb{C}^n}\ \max_{y\in W,\ y\ne0}\frac{y^HAy}{y^Hy}=\lambda_i(A)
\end{aligned}$$
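The inequality holds because the map $V\mapsto W=XV$ preserves the dimension of the subspace (the columns of $X$ are orthonormal) but does not reach every $i$-dimensional subspace of $\mathbb{C}^n$: $W$ always lies in the column space of $X$. For example, with $X=\begin{pmatrix}I_m\\0_{n-m,m}\end{pmatrix}$, every vector in $W=XV$ has the form $\begin{pmatrix}v\\0_{n-m}\end{pmatrix}$ with $v\in V$, so these subspaces cannot cover all of $\mathbb{C}^n$. The minimum on the left is taken over a smaller family of subspaces, so it is at least as large as the minimum on the right.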
Similarly, using the max-min form,
$$\begin{aligned}
\lambda_i(X^HAX)&=\max_{\dim V=m-i+1,\ V\subseteq\mathbb{C}^m}\ \min_{x\in V,\ x\ne0}\frac{x^HX^HAXx}{x^Hx}=\max_{\dim V=m-i+1,\ V\subseteq\mathbb{C}^m}\ \min_{x\in V,\ x\ne0}\frac{x^HX^HAXx}{(Xx)^HXx}\\
&=\max_{\dim V=m-i+1,\ V\subseteq\mathbb{C}^m,\ W=XV}\ \min_{y\in W,\ y\ne0}\frac{y^HAy}{y^Hy}\le\max_{\dim W=m-i+1,\ W\subseteq\mathbb{C}^n}\ \min_{y\in W,\ y\ne0}\frac{y^HAy}{y^Hy}=\lambda_{n-m+i}(A)
\end{aligned}$$
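Form 3 can also be checked with a random matrix $X$ that has orthonormal columns. The sketch below assumes NumPy and builds such an $X$ from the reduced QR factorization of a random matrix; the sizes and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 8, 3                                   # illustrative sizes, n >= m

M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                      # random Hermitian test matrix

# Reduced QR of a random n x m matrix yields X with X^H X = I_m.
Z = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
X, _ = np.linalg.qr(Z)

P = X.conj().T @ A @ X                        # compressed m x m matrix
P = (P + P.conj().T) / 2                      # remove rounding asymmetry

lam_A = np.linalg.eigvalsh(A)                 # ascending
lam_P = np.linalg.eigvalsh(P)

tol = 1e-12
# 0-based version of lambda_i(A) <= lambda_i(X^H A X) <= lambda_{i+n-m}(A)
lower_ok = bool(np.all(lam_A[:m] <= lam_P + tol))
upper_ok = bool(np.all(lam_P <= lam_A[n - m:] + tol))

print("lambda(A)      :", lam_A)
print("lambda(X^H A X):", lam_P)
print("lower bound holds?", lower_ok, "| upper bound holds?", upper_ok)
```

Choosing $X$ as a selection of columns of the identity recovers the principal-submatrix case of Form 2.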
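Ky Fan trace minimization theorem
$$\sum_{i=1}^m\lambda_i(A)=\min_{X\in\mathbb{C}^{n\times m},\ X^HX=I_m}\operatorname{tr}(X^HAX)$$
This follows from the left inequality in Form 3 of the Cauchy interlacing theorem: summing over $i=1,\dots,m$ gives $\sum_{i=1}^m\lambda_i(A)\le\sum_{i=1}^m\lambda_i(X^HAX)=\operatorname{tr}(X^HAX)$, with equality for $X=(u_1,\dots,u_m)$.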
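A short numerical sketch of the trace minimization statement, assuming NumPy: the trace attained at the eigenvectors of the $m$ smallest eigenvalues is compared with the traces at random matrices with orthonormal columns. The helper name `trace_form`, the sizes, and the seed are illustrative assumptions.

```python
import numpy as np

def trace_form(A, X):
    """Return tr(X^H A X); real for Hermitian A (illustrative helper)."""
    return float(np.trace(X.conj().T @ A @ X).real)

rng = np.random.default_rng(5)
n, m = 7, 3                                   # illustrative sizes

M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                      # random Hermitian test matrix
lam, U = np.linalg.eigh(A)                    # ascending eigenvalues

target = float(lam[:m].sum())                 # sum of the m smallest eigenvalues
optimal = trace_form(A, U[:, :m])             # X = (u_1, ..., u_m)

# Random X with orthonormal columns should never beat the target.
random_traces = []
for _ in range(200):
    Z = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
    X, _ = np.linalg.qr(Z)                    # X^H X = I_m
    random_traces.append(trace_form(A, X))

print("sum of m smallest eigenvalues:", target)
print("trace at X = (u_1..u_m)      :", optimal)
print("smallest random trace        :", min(random_traces))
```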