在线性最小二乘里,最后得到的闭式解为 ( A T A ) x = A T b ({A^T}A)x = {A^T}b (ATA)x=ATb,对于该等式的求解,其中一种方法是正交变换法求解,而正交变换求解线性最小二乘的实质就是对矩阵 A A A进行QR分解。
QR分解的概念:若实非奇异对称矩阵A能化为正交矩阵Q和实非奇异上三角矩阵R的乘积,即 A = Q R A=QR A=QR,则称该式为A的QR分解。
事实上,任何的实非奇异n阶矩阵A都可以分解成正交矩阵Q和上三角矩阵R的乘积,这可以通过施密特正交化来证明。
证明过程如下:
对于实非奇异n阶矩阵A的各个列向量依次为
α
1
,
α
2
,
⋯
α
n
{\alpha _1},{\alpha _2}, \cdots {\alpha _n}
α1,α2,⋯αn,那么,通过施密特正交化得到的n个标准正交向量如下,
{
β
1
=
b
11
α
1
β
2
=
b
12
α
1
+
b
22
α
2
⋮
β
n
=
b
1
n
α
1
+
b
2
n
α
2
+
⋯
+
b
n
n
α
n
\left\{ {\begin{array}{l} {{\beta _1} = {b_{11}}{\alpha _1}}\\ {{\beta _2} = {b_{12}}{\alpha _1} + {b_{22}}{\alpha _2}}\\ \vdots \\ {{\beta _n} = {b_{1n}}{\alpha _1} + {b_{2n}}{\alpha _2} + \cdots + {b_{nn}}{\alpha _n}} \end{array}} \right.
⎩⎪⎪⎪⎨⎪⎪⎪⎧β1=b11α1β2=b12α1+b22α2⋮βn=b1nα1+b2nα2+⋯+bnnαn
则,写成矩阵形式为,
(
β
1
,
β
2
,
⋯
β
n
)
=
(
α
1
,
α
2
,
⋯
α
n
)
B
({\beta _1},{\beta _2}, \cdots {\beta _n}) = ({\alpha _1},{\alpha _2}, \cdots {\alpha _n})B
(β1,β2,⋯βn)=(α1,α2,⋯αn)B
即,
Q
=
A
B
Q=AB
Q=AB
其中,
B
=
[
b
11
b
12
⋯
b
1
n
b
22
⋯
b
2
n
⋱
⋮
b
n
n
]
B = \left[ {\begin{array}{l} {{b_{11}}}&{{b_{12}}}& \cdots &{{b_{1n}}}\\ {}&{{b_{22}}}& \cdots &{{b_{2n}}}\\ {}&{}& \ddots & \vdots \\ {}&{}&{}&{{b_{nn}}} \end{array}} \right]
B=⎣⎢⎢⎢⎡b11b12b22⋯⋯⋱b1nb2n⋮bnn⎦⎥⎥⎥⎤
很显然,矩阵B就是一个上三角矩阵,而且B可逆,QR分解中的R就是
R
=
B
−
1
R = {B^{ - 1}}
R=B−1,
Q
=
[
β
1
,
β
2
,
⋯
β
n
]
Q=[{\beta _1},{\beta _2}, \cdots {\beta _n}]
Q=[β1,β2,⋯βn]是正交矩阵(施密特正交化的结果),则有,
A
=
Q
B
−
1
=
Q
R
A=QB^{-1}=QR
A=QB−1=QR
这就说明了QR分解的存在性。
从上述的QR分解存在性的证明过程可以看出,QR分解步骤的关键就是找到一个上三角矩阵R和一个正交矩阵Q。那么,怎么才能得到一个Q和R,正交变换就能做到这一点。所以,QR分解的方法就有三种,第一种就是上述证明过程所示的,用施密特正交化求解,其他两种是Givens变换(初等旋转变换)和Housholder(镜像变换)。
施密特正交化
设
α
1
,
α
2
,
⋯
α
n
{\alpha _1},{\alpha _2}, \cdots {\alpha _n}
α1,α2,⋯αn为一组线性无关的向量,施密特正交化后的向量为
β
1
,
β
2
,
⋯
β
n
{\beta _1},{\beta _2}, \cdots {\beta _n}
β1,β2,⋯βn,则施密特正交化的一般式如下,
{
β
m
=
α
m
+
k
m
1
β
m
−
1
+
k
m
2
β
m
−
2
+
⋯
+
k
m
,
m
−
1
β
1
k
m
i
=
−
(
x
m
,
β
m
−
i
)
(
β
m
−
i
,
β
m
−
i
)
i
=
1
,
2
,
⋯
m
−
1
\left\{ {\begin{array}{l} {{\beta _m} = {\alpha _m} + {k_{m1}}{\beta _{m - 1}} + {k_{m2}}{\beta _{m - 2}} + \cdots + {k_{m,m - 1}}{\beta _1}}\\\\ {{k_{mi}} = - \frac{{({x_m},{\beta _{m - i}})}}{{({\beta _{m - i}},{\beta _{m - i}})}}{\kern 1pt} {\kern 1pt} {\kern 1pt} \;\;\;\;\;\;\;i = 1,2, \cdots m - 1} \end{array}} \right.
⎩⎨⎧βm=αm+km1βm−1+km2βm−2+⋯+km,m−1β1kmi=−(βm−i,βm−i)(xm,βm−i)i=1,2,⋯m−1
举个例子,对矩阵 A = [ 1 1 0 1 − 1 1 0 0 2 ] A = \left[ {\begin{array}{l} 1&1&0\\ 1&{ - 1}&1\\ 0&0&2 \end{array}} \right] A=⎣⎡1101−10012⎦⎤进行QR分解。
令 A = [ α 1 , α 2 , α 3 ] A=[\alpha_1,\alpha_2,\alpha_3] A=[α1,α2,α3],则 α 1 = ( 1 , 1 , 0 ) T , α 2 = ( 1 , − 1 , 0 ) T , α 1 = ( 0 , 1 , 2 ) T \alpha_1=(1,1,0)^T,\alpha_2=(1,-1,0)^T,\alpha_1=(0,1,2)^T α1=(1,1,0)T,α2=(1,−1,0)T,α1=(0,1,2)T,则根据上述的施密特方法得,
{
β
1
′
=
α
1
=
(
1
,
1
,
0
)
T
β
2
′
=
α
2
−
(
β
1
′
,
α
2
)
(
β
1
′
,
β
1
′
)
β
1
′
=
(
1
,
−
1
,
0
)
T
β
3
′
=
α
3
−
(
β
1
′
,
α
3
)
(
β
1
′
,
β
1
′
)
β
1
′
−
(
β
2
′
,
α
3
)
(
β
2
′
,
β
2
′
)
β
2
′
=
(
0
,
0
,
2
)
T
\left\{ {\begin{array}{l} {\beta _1^{'} = {\alpha _1} = {{(1,1,0)}^T}}\\\\ {\beta _2^{'} = {\alpha _2} - \frac{{(\beta _1^{'},{\alpha _2})}}{{(\beta _1^{'},\beta _1^{'})}}\beta _1^{'} = {{(1, - 1,0)}^T}}\\\\ {\beta _3^{'} = {\alpha _3} - \frac{{(\beta _1^{'},{\alpha _3})}}{{(\beta _1^{'},\beta _1^{'})}}\beta _1^{'} - \frac{{(\beta _2^{'},{\alpha _3})}}{{(\beta _2^{'},\beta _2^{'})}}\beta _2^{'} = {{(0,0,2)}^T}} \end{array}} \right.
⎩⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎧β1′=α1=(1,1,0)Tβ2′=α2−(β1′,β1′)(β1′,α2)β1′=(1,−1,0)Tβ3′=α3−(β1′,β1′)(β1′,α3)β1′−(β2′,β2′)(β2′,α3)β2′=(0,0,2)T
单位化后得,
{
β
1
=
(
1
2
,
1
2
,
0
)
T
=
1
2
α
1
β
2
=
(
1
2
,
−
1
2
,
0
)
T
=
1
2
α
2
β
3
=
(
0
,
0
,
1
)
T
=
−
1
4
α
1
+
1
4
α
2
+
1
2
α
3
\left\{ {\begin{array}{l} {{\beta _1} = {{(\frac{1}{{\sqrt 2 }},\frac{1}{{\sqrt 2 }},0)}^T} = \frac{1}{{\sqrt 2 }}{\alpha _1}}\\\\ {{\beta _2} = {{(\frac{1}{{\sqrt 2 }}, - \frac{1}{{\sqrt 2 }},0)}^T} = \frac{1}{{\sqrt 2 }}{\alpha _2}}\\\\ {{\beta _3} = {{(0,0,1)}^T} = - \frac{1}{4}{\alpha _1} + \frac{1}{4}{\alpha _2} + \frac{1}{2}{\alpha _3}} \end{array}} \right.
⎩⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎧β1=(21,21,0)T=21α1β2=(21,−21,0)T=21α2β3=(0,0,1)T=−41α1+41α2+21α3
上式写成矩阵形式,
(
β
1
,
β
2
,
β
3
)
⏟
Q
=
(
α
1
,
α
2
,
α
3
)
⏟
A
[
1
2
0
−
1
4
0
1
2
1
4
0
0
1
2
]
\underbrace {({\beta _1},{\beta _2},{\beta _3})}_Q = \underbrace {({\alpha _1},{\alpha _2},{\alpha _3})}_A\left[ {\begin{array}{l} {\frac{1}{{\sqrt 2 }}}&0&{ - \frac{1}{4}}\\\\ 0&{\frac{1}{{\sqrt 2 }}}&{\frac{1}{4}}\\\\ 0&0&{\frac{1}{2}} \end{array}} \right]
Q
(β1,β2,β3)=A
(α1,α2,α3)⎣⎢⎢⎢⎢⎡21000210−414121⎦⎥⎥⎥⎥⎤
A = ( β 1 , β 2 , β 3 ) [ 1 2 0 − 1 4 0 1 2 1 4 0 0 1 2 ] − 1 = [ 1 2 1 2 0 1 2 − 1 2 0 0 0 1 ] [ 2 0 1 2 0 2 − 1 2 0 0 2 ] = Q R \begin{array}{l} A = ({\beta _1},{\beta _2},{\beta _3}){\left[ {\begin{array}{l} {\frac{1}{{\sqrt 2 }}}&0&{ - \frac{1}{4}}\\\\ 0&{\frac{1}{{\sqrt 2 }}}&{\frac{1}{4}}\\\\ 0&0&{\frac{1}{2}} \end{array}} \right]^{ - 1}}\\\\ \;{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = \left[ {\begin{array}{l} {\frac{1}{{\sqrt 2 }}}&{\frac{1}{{\sqrt 2 }}}&0\\\\ {\frac{1}{{\sqrt 2 }}}&{ - \frac{1}{{\sqrt 2 }}}&0\\\\ 0&0&1 \end{array}} \right]\left[ {\begin{array}{l} {\sqrt 2 }&0&{\frac{1}{{\sqrt 2 }}}\\\\ 0&{\sqrt 2 }&{ - \frac{1}{{\sqrt 2 }}}\\\\ 0&0&2 \end{array}} \right]\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = QR \end{array} A=(β1,β2,β3)⎣⎢⎢⎢⎢⎡21000210−414121⎦⎥⎥⎥⎥⎤−1=⎣⎢⎢⎢⎢⎡2121021−210001⎦⎥⎥⎥⎥⎤⎣⎢⎢⎢⎢⎡20002021−212⎦⎥⎥⎥⎥⎤=QR
从上例可以看出,矩阵的QR实际上就是施密特正交化的特性,施密特正交化可以得到一个上三角矩阵和一个正交矩阵,所以矩阵可以QR分解。
Givens变换法
Givens变换,即初等旋转变换,正交变换的一种,对应的初等旋转矩阵形式如下,
R
i
j
=
[
1
⋱
1
cos
θ
0
⋯
0
sin
θ
0
1
⋮
⋮
⋮
⋮
0
0
⋯
1
0
−
sin
θ
0
⋯
0
cos
θ
1
⋱
1
]
←
i
行
←
j
行
{R_{ij}} = \left[ {\begin{array}{l} 1&{}&{}&{}&{}&{}&{}&{}&{}&{}&{}\\ {}& \ddots &{}&{}&{}&{}&{}&{}&{}&{}&{}\\ {}&{}&1&{}&{}&{}&{}&{}&{}&{}&{}\\ {}&{}&{}&{\cos \theta }&0& \cdots &0&{\sin \theta }&{}&{}&{}\\ {}&{}&{}&0&1&{}&{}&{}&{}&{}&{}\\ {}&{}&{}& \vdots & \vdots &{}& \vdots & \vdots &{}&{}&{}\\ {}&{}&{}&0&0& \cdots &1&0&{}&{}&{}\\ {}&{}&{}&{ - \sin \theta }&0& \cdots &0&{\cos \theta }&{}&{}&{}\\ {}&{}&{}&{}&{}&{}&{}&{}&1&{}&{}\\ {}&{}&{}&{}&{}&{}&{}&{}&{}& \ddots &{}\\ {}&{}&{}&{}&{}&{}&{}&{}&{}&{}&1 \end{array}} \right]\begin{array}{l} {}\\ {}\\ {}\\ { \leftarrow i行}\\ {}\\ {}\\ {}\\ { \leftarrow j行}\\ {}\\ {}\\ {} \end{array}
Rij=⎣⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎡1⋱1cosθ0⋮0−sinθ01⋮00⋯⋯⋯0⋮10sinθ⋮0cosθ1⋱1⎦⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎤←i行←j行
初等变换矩阵在矩阵化简方面有很好的应用,即,若以
R
i
j
R_{ij}
Rij左乘一个矩阵A,并且其中的s和c按如下方式取值,
s
=
x
j
x
i
2
+
x
j
2
,
c
=
x
i
x
i
2
+
x
j
2
s = \frac{{{x_j}}}{{\sqrt {x_i^2 + x_j^{\rm{2}}} }},c = \frac{{{x_i}}}{{\sqrt {x_i^2 + x_j^{\rm{2}}} }}
s=xi2+xj2xj,c=xi2+xj2xi则就可以把矩阵A的第j个分量化为0。
举个例子,把向量
x
=
(
1
,
2
,
3
)
T
x=(1,2,3)^T
x=(1,2,3)T通过Givens变换化成与
(
1
,
0
,
0
)
T
(1,0,0)^T
(1,0,0)T同方向。
第一步,可以选择把向量x中的第二项化为0,则,
c
=
x
1
x
1
2
+
x
2
2
=
1
5
s
=
x
2
x
1
2
+
x
2
2
=
2
5
\begin{array}{l} c = \frac{{{x_1}}}{{\sqrt {x_1^2 + x_2^{\rm{2}}} }} = \frac{1}{{\sqrt 5 }}\\\\ s = \frac{{{x_2}}}{{\sqrt {x_1^2 + x_2^{\rm{2}}} }} = \frac{2}{{\sqrt 5 }} \end{array}
c=x12+x22x1=51s=x12+x22x2=52
得到初等旋转矩阵为,
R
12
=
[
1
5
2
5
0
−
2
5
1
5
0
0
0
1
]
{R_{12}} = \left[ {\begin{array}{l} {\frac{1}{{\sqrt 5 }}}&{\frac{2}{{\sqrt 5 }}}&0\\\\ { - \frac{2}{{\sqrt 5 }}}&{\frac{1}{{\sqrt 5 }}}&0\\\\ 0&0&1 \end{array}} \right]
R12=⎣⎢⎢⎢⎢⎡51−52052510001⎦⎥⎥⎥⎥⎤
从而把x的第二项化为0,即,
y
=
R
12
x
=
(
5
,
0
,
3
)
T
y = {R_{12}}x = {(\sqrt 5 ,0,3)^T}
y=R12x=(5,0,3)T
第二步,把y中的第3项化为0,同样,取c和s为,
c
=
x
1
x
1
2
+
x
3
2
=
5
14
s
=
x
3
x
1
2
+
x
3
2
=
3
14
\begin{array}{l} {c = \frac{{{x_1}}}{{\sqrt {x_1^2 + x_3^{\rm{2}}} }} = \sqrt {\frac{5}{{14}}} }\\ {}\\ {s = \frac{{{x_3}}}{{\sqrt {x_1^2 + x_3^{\rm{2}}} }} = \frac{3}{{\sqrt {14} }}} \end{array}
c=x12+x32x1=145s=x12+x32x3=143
得到初等旋转矩阵为,
R
13
=
[
5
14
0
3
14
0
1
0
−
3
14
0
5
14
]
{R_{13}} = \left[ {\begin{array}{l} {\sqrt {\frac{5}{{14}}} }&0&{\frac{3}{{\sqrt {14} }}}\\ {}&{}&{}\\ 0&1&0\\ {}&{}&{}\\ { - \frac{3}{{\sqrt {14} }}}&0&{\sqrt {\frac{5}{{14}}} } \end{array}} \right]
R13=⎣⎢⎢⎢⎢⎢⎢⎡1450−1430101430145⎦⎥⎥⎥⎥⎥⎥⎤
这里就把第三项化为0,即,
z
=
R
13
y
=
R
13
R
12
x
=
(
14
,
0
,
0
)
T
z = {R_{13}}y = {R_{13}}{R_{12}}x = {\left( {\sqrt {14} ,0,0} \right)^T}
z=R13y=R13R12x=(14,0,0)T
对一个向量可以这样化简,那么对于矩阵而言,矩阵可以看作是由n个列向量组成的,所以把矩阵A左乘一系列的初等旋转矩阵,就可以把A化简为结构较为简单的上三角矩阵R。而前面已经证明了一个实非奇异的矩阵A可以进行QR分解,那么这里的Givens变换就能把一个矩阵A化成上三角矩阵,所以,Givens变换可以用于QR分解。
R n − 1 , n ⋯ R 2 n ⋯ R 23 R 1 n ⋯ R 12 A = R Q = ( R n − 1 , n ⋯ R 2 n ⋯ R 23 R 1 n ⋯ R 12 ) − 1 = ( R n − 1 , n ⋯ R 2 n ⋯ R 23 R 1 n ⋯ R 12 ) H \begin{array}{l} {R_{n - 1,n}} \cdots {R_{2n}} \cdots {R_{23}}{R_{1n}} \cdots {R_{12}}A = R\\\\ Q = {({R_{n - 1,n}} \cdots {R_{2n}} \cdots {R_{23}}{R_{1n}} \cdots {R_{12}})^{ - 1}}\\\\ \;{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{ = }}{({R_{n - 1,n}} \cdots {R_{2n}} \cdots {R_{23}}{R_{1n}} \cdots {R_{12}})^H} \end{array} Rn−1,n⋯R2n⋯R23R1n⋯R12A=RQ=(Rn−1,n⋯R2n⋯R23R1n⋯R12)−1=(Rn−1,n⋯R2n⋯R23R1n⋯R12)H
举个例子,用Givens方法对矩阵 A = [ 12 − 20 41 9 − 15 − 63 20 50 35 ] A = \left[ {\begin{array}{l} {12}&{ - 20}&{41}\\\\ 9&{ - 15}&{ - 63}\\\\ {20}&{50}&{35} \end{array}} \right] A=⎣⎢⎢⎢⎢⎡12920−20−155041−6335⎦⎥⎥⎥⎥⎤进行QR分解。
第一步,用 R 12 R_{12} R12左乘A,消去第1列第2行处的元素,
c
=
12
12
2
+
9
2
=
12
15
=
4
5
,
s
=
9
12
2
+
9
2
=
3
5
c = \frac{{12}}{{\sqrt {{{12}^2} + {9^2}} }} = \frac{{12}}{{15}} = \frac{4}{5},s = \frac{9}{{\sqrt {{{12}^2} + {9^2}} }} = \frac{3}{5}
c=122+9212=1512=54,s=122+929=53故,
R
12
A
=
[
4
5
3
5
0
−
3
5
4
5
0
0
0
1
]
A
=
[
15
−
25
−
5
0
0
−
75
20
50
35
]
{R_{12}}A = \left[ {\begin{array}{l} {\frac{4}{5}}&{\frac{3}{5}}&0\\\\ { - \frac{3}{5}}&{\frac{4}{5}}&0\\\\ 0&0&1 \end{array}} \right]A = \left[ {\begin{array}{l} {15}&{ - 25}&{ - 5}\\\\ 0&0&{ - 75}\\\\ {20}&{50}&{35} \end{array}} \right]
R12A=⎣⎢⎢⎢⎢⎡54−53053540001⎦⎥⎥⎥⎥⎤A=⎣⎢⎢⎢⎢⎡15020−25050−5−7535⎦⎥⎥⎥⎥⎤
同理,接着消去A中第1列第3个元素,
c
=
15
15
2
+
20
2
=
15
25
=
3
5
,
s
=
20
15
2
+
20
2
=
4
5
c = \frac{{15}}{{\sqrt {{{15}^2} + {{20}^2}} }} = \frac{{15}}{{25}} = \frac{3}{5},s = \frac{{20}}{{\sqrt {{{15}^2} + {{20}^2}} }} = \frac{4}{5}
c=152+20215=2515=53,s=152+20220=54故
A
(
1
)
=
R
13
R
12
A
=
[
3
5
0
4
5
0
1
0
−
4
5
0
3
5
]
R
12
A
=
[
25
25
25
0
0
−
75
0
50
25
]
{A^{(1)}} = {R_{13}}{R_{12}}A = \left[ {\begin{array}{l} {\frac{3}{5}}&0&{\frac{4}{5}}\\\\ 0&1&0\\\\ { - \frac{4}{5}}&0&{\frac{3}{5}} \end{array}} \right]{R_{12}}A = \left[ {\begin{array}{l} {25}&{25}&{25}\\\\ 0&0&{ - 75}\\\\ 0&{50}&{25} \end{array}} \right]
A(1)=R13R12A=⎣⎢⎢⎢⎢⎡530−5401054053⎦⎥⎥⎥⎥⎤R12A=⎣⎢⎢⎢⎢⎡25002505025−7525⎦⎥⎥⎥⎥⎤
这里就把矩阵A中的第一列的第二个和第三个元素化为0,即第一列(
A
(
1
)
A^{(1)}
A(1))化简完成,下面开始第二列的化简。
第二步,进行第二列的化简,即把第二列的第三个元素化为0,
c
=
0
0
2
+
50
2
=
0
,
s
=
50
0
2
+
50
2
=
1
c = \frac{0}{{\sqrt {{0^2} + {{50}^2}} }} = 0,s = \frac{{50}}{{\sqrt {{0^2} + {{50}^2}} }} = 1
c=02+5020=0,s=02+50250=1故,
R
23
=
[
1
0
0
0
0
1
0
−
1
0
]
{R_{23}} = \left[ {\begin{array}{l} 1&0&0\\ 0&0&1\\ 0&{ - 1}&0 \end{array}} \right]
R23=⎣⎡10000−1010⎦⎤
A
(
2
)
=
R
23
A
(
1
)
=
[
25
25
25
0
50
25
0
0
75
]
{A^{(2)}} = {R_{23}}{A^{(1)}} = \left[ {\begin{array}{l} {25}&{25}&{25}\\\\ 0&{50}&{25}\\\\ 0&0&{75} \end{array}} \right]
A(2)=R23A(1)=⎣⎢⎢⎢⎢⎡250025500252575⎦⎥⎥⎥⎥⎤
这里的
A
(
2
)
A^{(2)}
A(2)就是上三角矩阵,即QR分解里的R,而Q的求解如下,
Q
=
(
R
23
R
13
R
12
)
−
1
=
[
12
25
9
25
20
25
−
16
25
−
12
25
15
25
15
25
−
20
25
0
]
−
1
=
[
12
25
9
25
20
25
−
16
25
−
12
25
15
25
15
25
−
20
25
0
]
H
=
[
12
25
−
16
25
15
25
9
25
−
12
25
−
20
25
20
25
15
25
0
]
\begin{array}{l} Q = {({R_{23}}{R_{13}}{R_{12}})^{ - 1}}\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = {\left[ {\begin{array}{l} {\frac{{12}}{{25}}}&{\frac{9}{{25}}}&{\frac{{20}}{{25}}}\\\\ { - \frac{{16}}{{25}}}&{ - \frac{{12}}{{25}}}&{\frac{{15}}{{25}}}\\\\ {\frac{{15}}{{25}}}&{ - \frac{{20}}{{25}}}&0 \end{array}} \right]^{ - 1}}\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = {\left[ {\begin{array}{l} {\frac{{12}}{{25}}}&{\frac{9}{{25}}}&{\frac{{20}}{{25}}}\\\\ { - \frac{{16}}{{25}}}&{ - \frac{{12}}{{25}}}&{\frac{{15}}{{25}}}\\\\ {\frac{{15}}{{25}}}&{ - \frac{{20}}{{25}}}&0 \end{array}} \right]^H}\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = \left[ {\begin{array}{l} {\frac{{12}}{{25}}}&{ - \frac{{16}}{{25}}}&{\frac{{15}}{{25}}}\\\\ {\frac{9}{{25}}}&{ - \frac{{12}}{{25}}}&{ - \frac{{20}}{{25}}}\\\\ {\frac{{20}}{{25}}}&{\frac{{15}}{{25}}}&0 \end{array}} \right] \end{array}
Q=(R23R13R12)−1=⎣⎢⎢⎢⎢⎡2512−25162515259−2512−2520252025150⎦⎥⎥⎥⎥⎤−1=⎣⎢⎢⎢⎢⎡2512−25162515259−2512−2520252025150⎦⎥⎥⎥⎥⎤H=⎣⎢⎢⎢⎢⎡25122592520−2516−251225152515−25200⎦⎥⎥⎥⎥⎤
从上例可以看出,Givens方法需要作
n
(
n
−
1
)
2
\frac{{n(n - 1)}}{2}
2n(n−1)个初等旋转矩阵的乘积,当
n
n
n比较大时,计算量较大。
Housholder变换法
Housholder变换,即镜像变换,正交变换的一种,对应的初等反射矩阵形式如下, H = I − 2 ω ω T H = I - 2\omega {\omega ^T} H=I−2ωωT其中 ω \omega ω表示某单位向量。
其镜面关系为:若 η = H ξ \eta = H\xi η=Hξ,则向量 η \eta η和向量 ξ \xi ξ关于某单位向量 ω \omega ω的正交轴 l l l镜像对称。
用初等反射矩阵也可以完成对矩阵的简化,其简化原理用的是如下定理:
镜像变换可以使任何非零向量
ξ
\xi
ξ变成与给定单位向量
a
a
a同方向的向量
η
\eta
η。
用这一定理可以构造出H里单位向量
ω
\omega
ω如下,
ω
=
ξ
−
∣
ξ
∣
a
∣
ξ
−
∣
ξ
∣
a
∣
\omega = \frac{{\xi - \left| \xi \right|a}}{{\left| {\xi - \left| \xi \right|a} \right|}}
ω=∣ξ−∣ξ∣a∣ξ−∣ξ∣a那么,可以通过构造
ω
\omega
ω的方法来简化矩阵A,使其变为一个上三角阵。
举个例子,用镜像变化把向量
ξ
=
(
0
,
3
,
0
,
4
)
T
\xi=(0,3,0,4)^T
ξ=(0,3,0,4)T变为与向量
e
1
=
(
1
,
0
,
0
,
0
)
T
e_1=(1,0,0,0)^T
e1=(1,0,0,0)T同方向的向量。
首先就是构造
ω
\omega
ω,
ω
=
ξ
−
∣
ξ
∣
e
1
∣
ξ
−
∣
ξ
∣
e
1
∣
=
(
−
5
,
3
,
0
,
4
)
T
∣
(
−
5
,
3
,
0
,
4
)
T
∣
=
(
−
1
2
,
3
5
2
,
0
,
4
5
2
)
T
\omega = \frac{{\xi - \left| \xi \right|{e_1}}}{{\left| {\xi - \left| \xi \right|{e_1}} \right|}} = \frac{{{{( - 5,3,0,4)}^T}}}{{\left| {{{( - 5,3,0,4)}^T}} \right|}} = {( - \frac{1}{{\sqrt 2 }},\frac{3}{{5\sqrt 2 }},0,\frac{4}{{5\sqrt 2 }})^T}
ω=∣ξ−∣ξ∣e1∣ξ−∣ξ∣e1=∣∣∣(−5,3,0,4)T∣∣∣(−5,3,0,4)T=(−21,523,0,524)T则初等反射矩阵H为,
H
=
I
−
2
ω
ω
T
=
[
0
3
5
0
4
5
3
5
16
25
0
−
12
25
0
0
1
0
4
5
−
12
25
0
9
25
]
H = I - 2\omega {\omega ^T} = \left[ {\begin{array}{l} 0&{\frac{3}{5}}&0&{\frac{4}{5}}\\\\ {\frac{3}{5}}&{\frac{{16}}{{25}}}&0&{ - \frac{{12}}{{25}}}\\\\ 0&0&1&0\\\\ {\frac{4}{5}}&{ - \frac{{12}}{{25}}}&0&{\frac{9}{{25}}} \end{array}} \right]
H=I−2ωωT=⎣⎢⎢⎢⎢⎢⎢⎢⎢⎡0530545325160−2512001054−25120259⎦⎥⎥⎥⎥⎥⎥⎥⎥⎤
从而有,
η
=
H
ξ
=
(
5
,
0
,
0
,
0
)
T
\eta = H\xi = {(5,0,0,0)^T}
η=Hξ=(5,0,0,0)T
上例,是使用镜像变换完成了对向量的化简。那么,对于矩阵而言,同样把矩阵看成是一组列向量的组合,那么每次求解只完成一列列向量的简化。
而与Given变换不同的是,对于每一列列向量的简化,其对应的反射矩阵H只要求解一次,所以,经过
n
−
1
n-1
n−1次左乘初等反射矩阵H就可以化成一个上三角阵,计算量大约是Givens变换的一半,即,
H
(
n
−
1
)
H
(
n
−
2
)
⋯
H
(
1
)
A
=
A
(
n
)
=
R
Q
=
(
H
(
n
−
1
)
H
(
n
−
2
)
⋯
H
(
1
)
)
−
1
\begin{array}{l} {H^{(n - 1)}}{H^{(n - 2)}} \cdots {H^{(1)}}A = {A^{(n)}} = R\\\\ Q = {({H^{(n - 1)}}{H^{(n - 2)}} \cdots {H^{(1)}})^{ - 1}} \end{array}
H(n−1)H(n−2)⋯H(1)A=A(n)=RQ=(H(n−1)H(n−2)⋯H(1))−1
举个例子,用Housholder变换法对矩阵 A = [ 2 2 1 1 2 2 2 1 2 ] A = \left[ {\begin{array}{l} 2&2&1\\ 1&2&2\\ 2&1&2 \end{array}} \right] A=⎣⎡212221122⎦⎤进行QR分解。
第一步,完成对第一列的简化。
因为
α
1
(
1
)
=
(
2
,
1
,
2
)
T
≠
0
\alpha _1^{(1)} = {(2,1,2)^T} \ne 0
α1(1)=(2,1,2)T=0,作单位向量,
ω
(
1
)
=
α
1
(
1
)
−
∣
α
1
(
1
)
∣
e
1
∣
α
1
(
1
)
−
∣
α
1
(
1
)
∣
e
1
∣
=
(
2
,
1
,
2
)
T
−
3
e
1
∣
(
2
,
1
,
2
)
T
−
3
e
1
∣
=
(
−
1
6
,
1
6
,
2
6
)
T
{\omega ^{(1)}} = \frac{{\alpha _1^{(1)} - \left| {\alpha _1^{(1)}} \right|{e_1}}}{{\left| {\alpha _1^{(1)} - \left| {\alpha _1^{(1)}} \right|{e_1}} \right|}} = \frac{{{{(2,1,2)}^T} - 3{e_1}}}{{\left| {{{(2,1,2)}^T} - 3{e_1}} \right|}} = {( - \frac{1}{{\sqrt 6 }},\frac{1}{{\sqrt 6 }},\frac{2}{{\sqrt 6 }})^T}
ω(1)=∣∣∣α1(1)−∣∣∣α1(1)∣∣∣e1∣∣∣α1(1)−∣∣∣α1(1)∣∣∣e1=∣∣∣(2,1,2)T−3e1∣∣∣(2,1,2)T−3e1=(−61,61,62)T则反射矩阵H为,
H
(
1
)
=
I
−
2
ω
(
1
)
ω
(
1
)
T
=
[
2
3
1
3
2
3
1
3
2
3
−
2
3
2
3
−
2
3
−
1
3
]
{H^{(1)}} = I - 2{\omega ^{(1)}}{\omega ^{(1)T}} = \left[ {\begin{array}{l} {\frac{2}{3}}&{\frac{1}{3}}&{\frac{2}{3}}\\\\ {\frac{1}{3}}&{\frac{2}{3}}&{ - \frac{2}{3}}\\\\ {\frac{2}{3}}&{ - \frac{2}{3}}&{ - \frac{1}{3}} \end{array}} \right]
H(1)=I−2ω(1)ω(1)T=⎣⎢⎢⎢⎢⎡3231323132−3232−32−31⎦⎥⎥⎥⎥⎤从而得,
H
(
1
)
A
=
[
3
8
3
8
3
0
4
3
1
3
0
−
1
3
−
4
3
]
=
[
3
8
3
8
3
0
0
A
2
]
{H^{(1)}}A = \left[ {\begin{array}{l} 3&{\frac{8}{3}}&{\frac{8}{3}}\\\\ 0&{\frac{4}{3}}&{\frac{1}{3}}\\\\ 0&{ - \frac{1}{3}}&{ - \frac{4}{3}} \end{array}} \right] = \left[ {\begin{array}{l} 3&{\begin{array}{l} {\frac{8}{3}}&{\frac{8}{3}} \end{array}}\\\\ {\begin{array}{l} 0\\\\ 0 \end{array}}&{{A_2}} \end{array}} \right]
H(1)A=⎣⎢⎢⎢⎢⎡3003834−313831−34⎦⎥⎥⎥⎥⎤=⎣⎢⎢⎢⎢⎡3003838A2⎦⎥⎥⎥⎥⎤
第二步,简化第二列。此时关注的不是矩阵
H
(
1
)
A
H^{(1)}A
H(1)A,而是
A
2
A_2
A2。
则,
ω
^
(
2
)
=
(
4
3
,
−
1
3
)
T
−
∣
(
4
3
,
−
1
3
)
T
∣
e
1
∣
(
4
3
,
−
1
3
)
T
−
∣
(
4
3
,
−
1
3
)
T
∣
e
1
∣
=
(
4
−
17
34
−
8
17
,
−
1
34
−
8
17
)
T
{\hat \omega ^{(2)}} = \frac{{{{(\frac{4}{3}, - \frac{1}{3})}^T} - \left| {{{(\frac{4}{3}, - \frac{1}{3})}^T}} \right|{e_1}}}{{\left| {{{(\frac{4}{3}, - \frac{1}{3})}^T} - \left| {{{(\frac{4}{3}, - \frac{1}{3})}^T}} \right|{e_1}} \right|}} = {\left( {\frac{{4 - \sqrt {17} }}{{\sqrt {34 - 8\sqrt {17} } }}, - \frac{1}{{\sqrt {34 - 8\sqrt {17} } }}} \right)^T}
ω^(2)=∣∣∣(34,−31)T−∣∣∣(34,−31)T∣∣∣e1∣∣∣(34,−31)T−∣∣∣(34,−31)T∣∣∣e1=(34−8174−17,−34−8171)T
那么有,
H
^
(
2
)
=
I
−
2
ω
^
(
2
)
ω
^
(
2
)
T
=
[
1
−
2
(
4
−
17
)
2
34
−
8
17
2
(
4
−
17
)
34
−
8
17
2
(
4
−
17
)
34
−
8
17
1
−
2
34
−
8
17
]
{\hat H^{(2)}} = I - 2{\hat \omega ^{(2)}}{\hat \omega ^{(2)T}} = \left[ {\begin{array}{l} {1 - \frac{{2{{(4 - \sqrt {17} )}^2}}}{{34 - 8\sqrt {17} }}}&{\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}\\\\ {\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}&{1 - \frac{2}{{34 - 8\sqrt {17} }}} \end{array}} \right]
H^(2)=I−2ω^(2)ω^(2)T=⎣⎢⎡1−34−8172(4−17)234−8172(4−17)34−8172(4−17)1−34−8172⎦⎥⎤从而对应的反射矩阵为,
H
(
2
)
=
[
1
0
0
H
^
(
2
)
]
=
[
1
0
0
0
1
−
2
(
4
−
17
)
2
34
−
8
17
2
(
4
−
17
)
34
−
8
17
0
2
(
4
−
17
)
34
−
8
17
1
−
2
34
−
8
17
]
{H^{(2)}} = \left[ {\begin{array}{l} 1&0\\\\ 0&{{{\hat H}^{(2)}}} \end{array}} \right] = \left[ {\begin{array}{l} 1&0&0\\\\ 0&{1 - \frac{{2{{(4 - \sqrt {17} )}^2}}}{{34 - 8\sqrt {17} }}}&{\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}\\\\ 0&{\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}&{1 - \frac{2}{{34 - 8\sqrt {17} }}} \end{array}} \right]
H(2)=⎣⎡100H^(2)⎦⎤=⎣⎢⎢⎢⎢⎢⎡10001−34−8172(4−17)234−8172(4−17)034−8172(4−17)1−34−8172⎦⎥⎥⎥⎥⎥⎤则有,
H
(
2
)
H
(
1
)
A
=
[
3
8
3
8
3
0
17
3
8
17
51
0
0
5
17
17
]
=
R
{H^{(2)}}{H^{(1)}}A = \left[ {\begin{array}{l} 3&{\frac{8}{3}}&{\frac{8}{3}}\\\\ 0&{\frac{{\sqrt {17} }}{3}}&{\frac{{8\sqrt {17} }}{{51}}}\\\\ 0&0&{\frac{{5\sqrt {17} }}{{17}}} \end{array}} \right] = R
H(2)H(1)A=⎣⎢⎢⎢⎢⎡300383170385181717517⎦⎥⎥⎥⎥⎤=R
Q的求解如下,
Q
=
(
H
(
2
)
H
(
1
)
)
−
1
=
[
2
3
2
17
51
−
3
17
17
1
3
10
17
51
2
17
17
2
3
−
7
17
17
2
17
17
]
Q = {({H^{(2)}}{H^{(1)}})^{ - 1}} = \left[ {\begin{array}{l} {\frac{2}{3}}&{\frac{{2\sqrt {17} }}{{51}}}&{ - \frac{{3\sqrt {17} }}{{17}}}\\\\ {\frac{1}{3}}&{\frac{{10\sqrt {17} }}{{51}}}&{\frac{{2\sqrt {17} }}{{17}}}\\\\ {\frac{2}{3}}&{ - \frac{{7\sqrt {17} }}{{17}}}&{\frac{{2\sqrt {17} }}{{17}}} \end{array}} \right]
Q=(H(2)H(1))−1=⎣⎢⎢⎢⎢⎢⎡32313251217511017−17717−173171721717217⎦⎥⎥⎥⎥⎥⎤
附:
从实用性角度考虑,一般不用施密特正交化进行QR分解,用Givens变换法和Housholder变换法较多,但Givens变换法需要
n
(
n
−
1
)
2
\frac{{n(n - 1)}}{2}
2n(n−1)次左乘初等旋转矩阵,计算量大,而Housholder变换法只需
n
−
1
n-1
n−1次左乘初等反射矩阵,计算量小是Housholder变换法的优势。但在稀疏矩阵的QR分解上,用Givens方法求解仍有方便之处。