I. Definitions of the two norms
1.1 Frobenius norm (F-norm)
For $A = (a_{ij}) \in R^{n\times n}$:
$$||A||_F = \sqrt{\sum_{1\le i,j\le n} a_{ij}^2}$$
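The definition above can be sketched in a few lines of plain Python. The function name `frobenius_norm` and the list-of-rows matrix representation are assumptions made for this illustration only.

```python
import math

def frobenius_norm(a):
    """Frobenius norm of a matrix stored as a list of rows (illustrative helper)."""
    # Square every entry, add them all up, then take the square root.
    return math.sqrt(sum(x * x for row in a for x in row))

A = [[1.0, 2.0], [3.0, 4.0]]
print(frobenius_norm(A))  # equals sqrt(1 + 4 + 9 + 16) = sqrt(30)
```

The F-norm treats the matrix as one long vector of entries and takes its Euclidean length.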
1.2 2-norm
1.2.1 Formula
In short, the 2-norm of a matrix $A$ can be computed with the following formula:
$$||A||_2 = \sqrt{\lambda_m}$$
where $\lambda_m$ is the largest eigenvalue of $A^TA$.
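As a rough numerical illustration of this formula, the sketch below estimates the largest eigenvalue of $A^TA$ by power iteration. Power iteration is a technique swapped in here for the example and is not part of the original text. The helper names (`transpose`, `matmul`, `matvec`, `spectral_norm`), the start vector, and the iteration count are all assumptions.

```python
import math

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, x):
    """Multiply a matrix by a vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]

def spectral_norm(a, iterations=500):
    """Estimate ||A||_2 = sqrt(lambda_max(A^T A)) by power iteration (sketch).

    Assumes iterations >= 1 and that the start vector is not orthogonal
    to the dominant eigenvector of A^T A.
    """
    ata = matmul(transpose(a), a)
    n = len(ata)
    x = [1.0 + 0.1 * i for i in range(n)]  # arbitrary start vector (assumption)
    for _ in range(iterations):
        y = matvec(ata, x)
        length = math.sqrt(sum(v * v for v in y))
        if length == 0.0:
            # A^T A x = 0: A is zero, or the start vector was unlucky.
            return 0.0
        x = [v / length for v in y]  # keep x a unit vector
    # For a unit vector x, ||Ax||_2^2 = x^T A^T A x, which approximates lambda_max.
    ax = matvec(a, x)
    return math.sqrt(sum(v * v for v in ax))

print(spectral_norm([[3.0, 0.0], [0.0, 4.0]]))  # close to 4 for this diagonal matrix
```

Convergence slows down when the two largest eigenvalues of $A^TA$ are close together, so for real work a library eigenvalue or SVD routine is the safer choice.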
1.2.2 Full definition
Definition of the vector $p$-norm, for $p \ge 1$: $||a||_p = (\sum_i |a_i|^p)^{1/p}$
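A minimal sketch of the vector $p$-norm follows, with the absolute value applied to each component so that negative entries and odd $p$ behave correctly. The function name `vector_p_norm` is hypothetical.

```python
def vector_p_norm(a, p):
    """p-norm of a vector given as a list of numbers, for p >= 1 (illustrative helper)."""
    # Sum |a_i|^p over all components, then take the p-th root.
    return sum(abs(x) ** p for x in a) ** (1.0 / p)

v = [3.0, -4.0]
print(vector_p_norm(v, 1))  # |3| + |-4| = 7
print(vector_p_norm(v, 2))  # Euclidean length, sqrt(9 + 16) = 5
```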
Constructing a matrix norm from a vector norm:
Given a vector norm, for any matrix $A\in R^{n\times n}$, let
$$||A|| = \max_{||x||=1}||Ax||$$
Then $||A||$ defined in this way is a matrix norm, and it is compatible with the vector norm $||x||$, that is, $||Ax|| \le ||A||\,||x||$ for every $x$.
Define
$$||A||_p = \max_{||x||_p = 1} ||Ax||_p$$
Then the vector norm $||\cdot||_p$ is compatible with the matrix $p$-norm.
1.2.3 Deriving the formula
$A^TA$ is a symmetric positive semidefinite matrix, so it is orthogonally diagonalizable and all of its eigenvalues are nonnegative.
Let $x_1, x_2, ..., x_n$ be orthonormal eigenvectors of $A^TA$ with eigenvalues $\lambda_1, ..., \lambda_n$, so that $||x_i||_2 = 1$ and $(x_i, x_j) = 0$ for $i \ne j$.
Write any unit vector as $x = c_1x_1 + ... + c_n x_n$. Then
$A^TAx = A^TA(c_1x_1 + ... + c_n x_n) = c_1 \lambda_1 x_1 + ... + c_n \lambda_n x_n$
$1 = ||x||_2^2 = \sum_i c_i^2$
$||Ax||_2^2 = (Ax)^T(Ax) = x^T A^TAx = \sum_i c_i \lambda_i x^Tx_i = \sum_i c_i^2 \lambda_i \le \lambda_m \sum_i c_i^2 = \lambda_m$
Equality holds when $x = x_m$, the eigenvector belonging to the largest eigenvalue $\lambda_m$; therefore
$$||A||_2 = \max_{||x||_2 = 1} ||Ax||_2 = \sqrt{\lambda_m}$$
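The derivation can be sanity-checked numerically in the $2\times 2$ case. The sketch below samples unit vectors $x = (\cos t, \sin t)$, records the largest $||Ax||_2$ seen, and compares it with $\sqrt{\lambda_m}$ from the closed-form eigenvalue of a symmetric $2\times 2$ matrix. The function names, the sample count, and the test matrix are assumptions chosen for illustration.

```python
import math

def max_stretch_2x2(a, samples=20000):
    """Largest ||Ax||_2 over sampled unit vectors x = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        # Half a circle is enough because ||A(-x)||_2 = ||Ax||_2.
        t = math.pi * k / samples
        x0, x1 = math.cos(t), math.sin(t)
        y0 = a[0][0] * x0 + a[0][1] * x1
        y1 = a[1][0] * x0 + a[1][1] * x1
        best = max(best, math.hypot(y0, y1))
    return best

def sqrt_lambda_max_2x2(a):
    """sqrt of the largest eigenvalue of A^T A for a 2x2 matrix A."""
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p = a[0][0] ** 2 + a[1][0] ** 2
    q = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    r = a[0][1] ** 2 + a[1][1] ** 2
    # Larger root of the characteristic polynomial of [[p, q], [q, r]].
    lam = (p + r) / 2.0 + math.sqrt(((p - r) / 2.0) ** 2 + q ** 2)
    return math.sqrt(lam)

A = [[1.0, 2.0], [3.0, 4.0]]
print(max_stretch_2x2(A), sqrt_lambda_max_2x2(A))
```

The sampled maximum approaches the bound without exceeding it, which matches the inequality $||Ax||_2^2 \le \lambda_m$ and the fact that equality is attained.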
II. Proof that an orthogonal transformation preserves both norms
Let $B = P^TAP$, where $P$ is an orthogonal matrix.
It suffices to show that $||A|| = ||B||$ for each of the two norms.
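2.1 2-norm
Since $B^TB = P^TA^TPP^TAP = P^T(A^TA)P$, the matrix $B^TB$ is orthogonally similar to $A^TA$. An orthogonal similarity transformation does not change the eigenvalues, so $A^TA$ and $B^TB$ have the same largest eigenvalue $\lambda_m$, and therefore
$$||A||_2 = ||B||_2$$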
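For a $2\times 2$ example, the eigenvalues of a symmetric matrix are determined by its trace and determinant, so comparing those two quantities for $A^TA$ and $B^TB$ is a quick check of the claim. In the sketch below, the rotation angle `theta`, the test matrix, and the helper names are assumptions made for illustration.

```python
import math

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gram_trace_det(m):
    """Trace and determinant of M^T M for a 2x2 matrix M."""
    g = matmul(transpose(m), m)
    return g[0][0] + g[1][1], g[0][0] * g[1][1] - g[0][1] * g[1][0]

theta = 0.7  # arbitrary rotation angle (assumption)
P = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]  # a rotation matrix is orthogonal
A = [[1.0, 2.0], [3.0, 4.0]]
B = matmul(matmul(transpose(P), A), P)  # B = P^T A P

print(gram_trace_det(A))  # trace and determinant of A^T A
print(gram_trace_det(B))  # should agree up to rounding error
```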
2.2 F-norm
We can show that $||AP||_F = ||A||_F$ and $||PB||_F = ||B||_F$. Since $PB = PP^TAP = AP$, it follows that $||A||_F = ||AP||_F = ||PB||_F = ||B||_F$.
We first prove $||PB||_F = ||B||_F$.
Partition $B$ by columns, $B = (b_1, ..., b_n)$:
$$\begin{aligned}
||PB||_F^2 &= ||P(b_1,...,b_n)||_F^2\\
&= \sum_i ||Pb_i||_2^2\\
&= \sum_i (Pb_i)^T(Pb_i)\\
&= \sum_i b_i^TP^TPb_i\\
&= \sum_i b_i^Tb_i\\
&= \sum_i ||b_i||_2^2\\
&= ||B||_F^2
\end{aligned}$$
Taking square roots gives $||PB||_F = ||B||_F$. The argument uses only $P^TP = I$, so it holds for any orthogonal matrix and any matrix in place of $B$.
For $||AP||_F$, note that transposition does not change the F-norm and that $P^T$ is also orthogonal, so the result just proved applies to $P^TA^T$:
$$||AP||_F = ||(AP)^T||_F = ||P^TA^T||_F = ||A^T||_F = ||A||_F$$
This completes the proof. Therefore
$$||A||_F = ||B||_F$$
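The result can be checked numerically with a small example. The sketch below builds an orthogonal matrix as a Householder reflection $P = I - 2vv^T/(v^Tv)$, which is one convenient way to get an orthogonal matrix and is an assumption of this example, then compares the F-norms of $A$, $PA$, $AP$ and $B = P^TAP$. All helper names and the test data are hypothetical.

```python
import math

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def frobenius_norm(a):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in a for x in row))

def householder(v):
    """Householder reflection I - 2 v v^T / (v^T v); orthogonal for any nonzero v."""
    n = len(v)
    s = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / s
             for j in range(n)] for i in range(n)]

P = householder([1.0, 2.0, 2.0])
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
B = matmul(matmul(transpose(P), A), P)  # B = P^T A P

print(frobenius_norm(A))
print(frobenius_norm(matmul(P, A)))  # ||PA||_F
print(frobenius_norm(matmul(A, P)))  # ||AP||_F
print(frobenius_norm(B))             # ||B||_F
```

All four printed values should agree up to floating-point rounding.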