1. Spectral Theorem
Suppose $\mathbf{A} \in \mathbb{R}^{n \times n}$ is a real symmetric matrix, $\lambda_{i} \in \mathbb{R},\ i=1,2,\cdots,n$ are the eigenvalues of $\mathbf{A}$, and $\mathbf{u}_{i} \in \mathbb{R}^{n},\ i=1,2,\cdots,n$ are the eigenvectors of $\mathbf{A}$, so that $\mathbf{A}\mathbf{u}_{i} = \lambda_{i}\mathbf{u}_{i}$. Moreover, there exists an orthogonal matrix $U = \left[\mathbf{u}_{1}, \cdots, \mathbf{u}_{n}\right]$ (i.e. $UU^{T} = U^{T}U = I_{n}$) such that:
$$
\begin{aligned}
\mathbf{A} = U \Lambda U^{T}
&= \left[\begin{array}{ccc} \mathbf{u}_{1} & \cdots & \mathbf{u}_{n} \end{array}\right]
\left[\begin{array}{ccc} \lambda_{1} & & 0 \\ & \ddots & \\ 0 & & \lambda_{n} \end{array}\right]
\left[\begin{array}{c} \mathbf{u}_{1}^{T} \\ \vdots \\ \mathbf{u}_{n}^{T} \end{array}\right] \\
&= \left[\begin{array}{ccc} \lambda_{1}\mathbf{u}_{1} & \cdots & \lambda_{n}\mathbf{u}_{n} \end{array}\right]
\left[\begin{array}{c} \mathbf{u}_{1}^{T} \\ \vdots \\ \mathbf{u}_{n}^{T} \end{array}\right] \\
&= \sum_{i=1}^{n} \lambda_{i}\mathbf{u}_{i}\mathbf{u}_{i}^{T}
\end{aligned} \tag{1}
$$
In other words, $\mathbf{A} = \lambda_{1}\mathbf{u}_{1}\mathbf{u}_{1}^{T} + \lambda_{2}\mathbf{u}_{2}\mathbf{u}_{2}^{T} + \cdots + \lambda_{n}\mathbf{u}_{n}\mathbf{u}_{n}^{T}$.
This representation of $\mathbf{A}$ is called the spectral decomposition of $\mathbf{A}$, because it breaks $\mathbf{A}$ into pieces determined by the spectrum (the eigenvalues) of $\mathbf{A}$. Each matrix $\mathbf{u}_{j}\mathbf{u}_{j}^{T}$ is a projection matrix: for every vector $\mathbf{x} \in \mathbb{R}^{n}$, $\mathbf{u}_{j}\mathbf{u}_{j}^{T}\mathbf{x} = \left(\mathbf{u}_{j}^{T}\mathbf{x}\right)\mathbf{u}_{j}$.
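As a quick sanity check (not part of the original note), the decomposition in equation (1) can be verified numerically with NumPy; `np.linalg.eigh` is the symmetric-matrix eigensolver, and the matrix `A` below is just a randomly generated symmetric example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random real symmetric matrix A = (B + B^T) / 2.
n = 4
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

# eigh is specialized for symmetric matrices: the eigenvalues are real
# and the columns of U are orthonormal eigenvectors.
lam, U = np.linalg.eigh(A)

# U is orthogonal: U U^T = U^T U = I_n.
assert np.allclose(U @ U.T, np.eye(n))

# Spectral decomposition: A = sum_i lambda_i u_i u_i^T.
A_rebuilt = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(n))
assert np.allclose(A, A_rebuilt)

# Each u_j u_j^T acts as a projection: (u_j u_j^T) x = (u_j^T x) u_j.
x = rng.standard_normal(n)
P0 = np.outer(U[:, 0], U[:, 0])
assert np.allclose(P0 @ x, (U[:, 0] @ x) * U[:, 0])
```

The same check works for any real symmetric matrix, since `eigh` always returns an orthogonal eigenvector matrix for symmetric input.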
2. Rayleigh Quotient
Suppose $\mathbf{A} \in \mathbb{R}^{n \times n}$ is a real symmetric matrix, $\lambda_{i} \in \mathbb{R},\ i=1,2,\cdots,n$ are the eigenvalues of $\mathbf{A}$, and $\mathbf{u}_{i} \in \mathbb{R}^{n},\ i=1,2,\cdots,n$ are the eigenvectors of $\mathbf{A}$. Then:
$$
\begin{aligned}
&\lambda_{\min}(A) \leq \frac{x^{T} A x}{x^{T} x} \leq \lambda_{\max}(A), \quad \forall x \neq 0 \\
&\lambda_{\max}(A) = \max_{x:\|x\|_{2}=1} x^{T} A x \\
&\lambda_{\min}(A) = \min_{x:\|x\|_{2}=1} x^{T} A x
\end{aligned}
$$
The maximum and minimum are attained at the eigenvectors corresponding to the largest and smallest eigenvalues, i.e. $x = u_{1}$ and $x = u_{n}$ respectively (with the eigenvalues ordered so that $\lambda_{1} \geq \cdots \geq \lambda_{n}$).
Proof:
- By the spectral theorem, $U$ is orthogonal and $\Lambda$ is diagonal. Let $\bar{x} = U^{T}x$; then
$$x^{T} A x = x^{T} U \Lambda U^{T} x = \bar{x}^{T} \Lambda \bar{x} = \sum_{i=1}^{n} \lambda_{i} \bar{x}_{i}^{2}$$
- Clearly,
$$\lambda_{\min} \sum_{i=1}^{n} \bar{x}_{i}^{2} \leq \sum_{i=1}^{n} \lambda_{i} \bar{x}_{i}^{2} \leq \lambda_{\max} \sum_{i=1}^{n} \bar{x}_{i}^{2}$$
- Moreover, the orthogonal matrix $U$ cannot change the norm of any vector:
$$\sum_{i=1}^{n} x_{i}^{2} = x^{T} x = x^{T} U U^{T} x = \left(U^{T} x\right)^{T}\left(U^{T} x\right) = \bar{x}^{T} \bar{x} = \sum_{i=1}^{n} \bar{x}_{i}^{2}$$
- Combining the three expressions above gives:
$$\lambda_{\min}\, x^{T} x \leq x^{T} A x \leq \lambda_{\max}\, x^{T} x$$