Lemma
Let $\boldsymbol{A}$ and $\boldsymbol{B}$ be $m \times n$ matrices with $m>n$, and let $\boldsymbol{B}$ have rank $n$. Then

$$
\boldsymbol{A}^{\mathrm{T}} \boldsymbol{A} \geqslant \boldsymbol{A}^{\mathrm{T}} \boldsymbol{B}\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A},
$$

where $\geqslant$ means that the left-hand side minus the right-hand side is nonnegative definite. This is called the matrix Schwarz inequality.
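To make the statement concrete, here is a minimal sketch that checks the inequality exactly on one small example. It reads $\geqslant$ as "the difference of the two sides is nonnegative definite" and forms the right-hand side as $\boldsymbol{A}^{\mathrm{T}} \boldsymbol{B}\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A}$. The matrices `A` and `B`, the helper names (`transpose`, `matmul`, `inv2`, `schwarz_gap`, `is_psd_2x2`) and the restriction to $n=2$ are assumptions made for illustration, not part of the lemma.

```python
from fractions import Fraction

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]

def inv2(M):
    """Invert a 2x2 matrix with the closed-form adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; B must have full column rank")
    return [[d / det, -b / det], [-c / det, a / det]]

def schwarz_gap(A, B):
    """Return A^T A - A^T B (B^T B)^{-1} B^T A for m x 2 matrices A and B."""
    At, Bt = transpose(A), transpose(B)
    AtA = matmul(At, A)
    AtB = matmul(At, B)
    BtA = matmul(Bt, A)
    BtB_inv = inv2(matmul(Bt, B))
    correction = matmul(matmul(AtB, BtB_inv), BtA)
    return [[AtA[i][j] - correction[i][j] for j in range(2)] for i in range(2)]

def is_psd_2x2(S):
    """A symmetric 2x2 matrix is PSD iff its diagonal entries and determinant are >= 0."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return S[0][1] == S[1][0] and S[0][0] >= 0 and S[1][1] >= 0 and det >= 0

# Hypothetical 3 x 2 example data (m = 3 > n = 2); B has rank 2.
A = [[Fraction(v) for v in row] for row in [[1, 2], [0, 1], [3, -1]]]
B = [[Fraction(v) for v in row] for row in [[1, 0], [1, 1], [0, 2]]]

S = schwarz_gap(A, B)
print(is_psd_2x2(S))  # prints True
```

Exact rational arithmetic is used here instead of floating point so that the test needs no tolerance. In this example the gap matrix has determinant exactly zero, so a floating-point check could fail on rounding alone.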
Proof
Let $\boldsymbol{\lambda}$ and $\boldsymbol{\alpha}$ be two $n$-dimensional vectors,

$$
\boldsymbol{\lambda}=\left[\begin{array}{llll}\lambda_{1} & \lambda_{2} & \cdots & \lambda_{n}\end{array}\right]^{\mathrm{T}}, \quad \boldsymbol{\alpha}=\left[\begin{array}{llll}\alpha_{1} & \alpha_{2} & \cdots & \alpha_{n}\end{array}\right]^{\mathrm{T}}
$$

and consider the following nonnegative scalar product:
$$
(\boldsymbol{B} \boldsymbol{\lambda}+\boldsymbol{A} \boldsymbol{\alpha})^{\mathrm{T}}(\boldsymbol{B} \boldsymbol{\lambda}+\boldsymbol{A} \boldsymbol{\alpha}) \geqslant 0
$$

Equality holds only when $\boldsymbol{B} \boldsymbol{\lambda}+\boldsymbol{A} \boldsymbol{\alpha}=\mathbf{0}$.
Expanding the left-hand side gives

$$
\boldsymbol{\lambda}^{\mathrm{T}} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{B} \boldsymbol{\lambda}+\boldsymbol{\alpha}^{\mathrm{T}} \boldsymbol{A}^{\mathrm{T}} \boldsymbol{B} \boldsymbol{\lambda}+\boldsymbol{\lambda}^{\mathrm{T}} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A} \boldsymbol{\alpha}+\boldsymbol{\alpha}^{\mathrm{T}} \boldsymbol{A}^{\mathrm{T}} \boldsymbol{A} \boldsymbol{\alpha} \geqslant 0
$$

Since $\boldsymbol{B}$ is assumed to have full column rank, $\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}$ is positive definite and $\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1}$ exists. Completing the square in $\boldsymbol{\lambda}$, the inequality can be written as
$$
\begin{aligned}
&\left[\boldsymbol{\lambda}+\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A} \boldsymbol{\alpha}\right]^{\mathrm{T}} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\left[\boldsymbol{\lambda}+\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A} \boldsymbol{\alpha}\right] \\
&\quad+\boldsymbol{\alpha}^{\mathrm{T}}\left[\boldsymbol{A}^{\mathrm{T}} \boldsymbol{A}-\boldsymbol{A}^{\mathrm{T}} \boldsymbol{B}\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A}\right] \boldsymbol{\alpha} \geqslant 0
\end{aligned}
$$
This inequality holds for arbitrary $\boldsymbol{\lambda}$ and $\boldsymbol{\alpha}$. Choose $\boldsymbol{\lambda}$ as
$$
\boldsymbol{\lambda}=-\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A} \boldsymbol{\alpha}
$$
Then the first term vanishes and the inequality becomes
$$
\boldsymbol{\alpha}^{\mathrm{T}}\left[\boldsymbol{A}^{\mathrm{T}} \boldsymbol{A}-\boldsymbol{A}^{\mathrm{T}} \boldsymbol{B}\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A}\right] \boldsymbol{\alpha} \geqslant 0
$$

Because $\boldsymbol{\alpha}$ is arbitrary, this quadratic form is nonnegative for every $\boldsymbol{\alpha}$, which is exactly the statement that the symmetric matrix $\boldsymbol{A}^{\mathrm{T}} \boldsymbol{A}-\boldsymbol{A}^{\mathrm{T}} \boldsymbol{B}\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A}$ is nonnegative definite. Hence
$$
\boldsymbol{A}^{\mathrm{T}} \boldsymbol{A} \geqslant \boldsymbol{A}^{\mathrm{T}} \boldsymbol{B}\left(\boldsymbol{B}^{\mathrm{T}} \boldsymbol{B}\right)^{-1} \boldsymbol{B}^{\mathrm{T}} \boldsymbol{A}
$$

This completes the proof.
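As a sanity check on the argument, consider the special case $n=1$, where $\boldsymbol{A}$ and $\boldsymbol{B}$ are single columns $\boldsymbol{a}$ and $\boldsymbol{b}$ and the inequality reduces to the classical Cauchy-Schwarz form $\boldsymbol{a}^{\mathrm{T}} \boldsymbol{a} \geqslant\left(\boldsymbol{a}^{\mathrm{T}} \boldsymbol{b}\right)^{2} /\left(\boldsymbol{b}^{\mathrm{T}} \boldsymbol{b}\right)$. The sketch below replays the key step of the proof in that setting: with the proof's choice of $\lambda$, the scalar product equals $\alpha^{2}$ times the gap, and any other $\lambda$ gives a larger value. The vectors `a` and `b`, the value of `alpha`, and the helper names `dot` and `residual_sq` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def residual_sq(a, b, lam, alpha):
    """Squared norm of b*lam + a*alpha, the scalar product the proof starts from."""
    r = [bi * lam + ai * alpha for ai, bi in zip(a, b)]
    return dot(r, r)

# Hypothetical example vectors (the single columns of A and B when n = 1).
a = [Fraction(v) for v in (1, 0, 3)]
b = [Fraction(v) for v in (1, 1, 0)]
alpha = Fraction(2)

# Gap a^T a - (a^T b)^2 / (b^T b); the lemma says it is nonnegative.
gap = dot(a, a) - dot(a, b) ** 2 / dot(b, b)

# The proof's choice of lambda, specialised to n = 1.
lam_star = -dot(b, a) / dot(b, b) * alpha

# With lam_star the squared term vanishes, leaving alpha^2 * gap.
print(residual_sq(a, b, lam_star, alpha) == alpha ** 2 * gap)  # prints True

# Any other lambda can only increase the scalar product.
print(residual_sq(a, b, lam_star + 1, alpha) >= residual_sq(a, b, lam_star, alpha))  # prints True
```

Here the gap is $19/2$ and the minimising value is $\lambda=-1$. Shifting $\lambda$ by 1 raises the scalar product from 38 to 40, an increase of $\boldsymbol{b}^{\mathrm{T}} \boldsymbol{b}=2$, as the completed square predicts.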