Householder变换
已知
x
\mathbf{x}
x,做Householder变换,使得
∥
x
∥
=
∥
y
∥
\|\mathbf{x}\|=\|\mathbf{y}\|
∥x∥=∥y∥,
H
x
=
y
\mathbf{Hx}=\mathbf{y}
Hx=y,
要求
H
H
H
=
I
,
H
H
=
H
\mathbf{H}\mathbf{H}^H=\mathbf{I},\mathbf{H}^H=\mathbf{H}
HHH=I,HH=H
于是
H
=
I
−
u
u
H
\mathbf{H}=\mathbf{I}-\mathbf{u}\mathbf{u}^H
H=I−uuH
其中
u
=
x
−
y
∥
x
−
y
∥
\mathbf{u}=\frac{\mathbf{x}-\mathbf{y}}{\|\mathbf{x}-\mathbf{y}\|}
u=∥x−y∥x−y
推导:
y
=
x
−
(
x
−
y
)
=
x
−
2
x
−
y
2
=
x
−
2
P
x
=
x
−
2
u
u
H
x
=
(
I
−
2
u
u
H
)
x
\begin{aligned} \mathbf{y}&=\mathbf{x}-\left(\mathbf{x}-\mathbf{y}\right)\\ &=\mathbf{x}-2\frac{\mathbf{x}-\mathbf{y}}{2}\\ &=\mathbf{x}-2\mathbf{Px}\\ &=\mathbf{x}-2\mathbf{u}\mathbf{u}^H\mathbf{x}\\ &=\left(\mathbf{I}-2\mathbf{u}\mathbf{u}^H\right)\mathbf{x} \end{aligned}
y=x−(x−y)=x−22x−y=x−2Px=x−2uuHx=(I−2uuH)x
其中
P
\mathbf{P}
P是
span
{
x
−
y
}
\operatorname{span}\lbrace \mathbf{x}-\mathbf{y} \rbrace
span{x−y}的正交投影
span
{
x
−
y
}
\operatorname{span}\lbrace \mathbf{x}-\mathbf{y} \rbrace
span{x−y}的标准正交基
u
=
x
−
y
∥
x
−
y
∥
\mathbf{u}=\frac{\mathbf{x}-\mathbf{y}}{\|\mathbf{x}-\mathbf{y}\|}
u=∥x−y∥x−y
P
=
u
u
H
\mathbf{P}=\mathbf{u}\mathbf{u}^H
P=uuH
变为e1
即
H
x
=
α
e
1
\mathbf{H}\mathbf{x}=\alpha \mathbf{e}_1
Hx=αe1
容易求出来
α
=
±
∥
x
∥
\alpha = \pm \|\mathbf{x}\|
α=±∥x∥
可以取
α
=
sign
(
x
1
)
\alpha = \operatorname{sign}\left(x_1\right)
α=sign(x1),于是
u
=
x
+
sign
(
x
1
)
∥
x
∥
e
1
∥
x
+
sign
(
x
1
)
∥
x
∥
e
1
∥
\mathbf{u}=\frac{\mathbf{x}+\operatorname{sign}\left(x_1\right)\|\mathbf{x}\|\mathbf{e}_1}{\|\mathbf{x}+\operatorname{sign}\left(x_1\right)\|\mathbf{x}\|\mathbf{e}_1\|}
u=∥x+sign(x1)∥x∥e1∥x+sign(x1)∥x∥e1
减小误差
可能搜代码的时候会有这种
H
=
I
−
τ
u
u
T
,
τ
=
2
u
T
u
\mathbf{H}=\mathbf{I}-\tau\mathbf{u}\mathbf{u}^T,\quad \tau=\frac{2}{\mathbf{u}^T\mathbf{u}}
H=I−τuuT,τ=uTu2
其中
u
=
x
−
α
e
1
\mathbf{u}=\mathbf{x}-\alpha \mathbf{e}_1
u=x−αe1,但是
当
x
1
>
0
x_1>0
x1>0时
u
1
=
x
1
−
∥
x
∥
=
x
1
2
−
∥
x
∥
2
x
1
+
∥
x
∥
=
−
x
2
2
+
⋯
+
x
n
2
x
1
+
∥
x
∥
u_1=x_1-\|\mathbf{x}\|=\frac{x_1^2-\|\mathbf{x}\|^2}{x_1+\|\mathbf{x}\|}=-\frac{x_2^2+\cdots+x_n^2}{x_1+\|\mathbf{x}\|}
u1=x1−∥x∥=x1+∥x∥x12−∥x∥2=−x1+∥x∥x22+⋯+xn2
当
x
1
≤
0
x_1\le 0
x1≤0时,依然用原来的方法算
u
1
=
x
1
−
∥
x
∥
u_1=x_1-\|\mathbf{x}\|
u1=x1−∥x∥
然后
τ
=
2
u
1
2
∥
u
∥
2
u
=
u
u
1
\tau = \frac{2 u_1^2}{\|\mathbf{u}\|^2}\\ \mathbf{u}=\frac{\mathbf{u}}{u_1}
τ=∥u∥22u12u=u1u
代码来自https://stackoverflow.com/questions/53489237/how-can-you-implement-householder-based-qr-decomposition-in-python
def householder(x: np.ndarray) -> Union[np.ndarray, int]:
alpha = x[0]
s = np.power(np.linalg.norm(x[1:]), 2)
v = x.copy()
if s == 0:
tau = 0
else:
t = np.sqrt(alpha**2 + s)
v[0] = alpha - t if alpha <= 0 else -s / (alpha + t)
tau = 2 * v[0]**2 / (s + v[0]**2)
v /= v[0]
return v, tau
第二种
H
=
I
−
τ
u
u
T
,
τ
=
2
u
T
u
\mathbf{H}=\mathbf{I}-\tau\mathbf{u}\mathbf{u}^T,\quad \tau=\frac{2}{\mathbf{u}^T\mathbf{u}}
H=I−τuuT,τ=uTu2
其中
u
=
x
+
sign
(
x
1
)
∥
x
∥
e
1
\mathbf{u}=\mathbf{x}+\operatorname{sign}\left(x_1\right)\|\mathbf{x}\|\mathbf{e}_1
u=x+sign(x1)∥x∥e1
然后
u
=
u
u
1
τ
=
2
∥
u
∥
2
\mathbf{u}=\frac{\mathbf{u}}{u_1}\\ \tau = \frac{2}{\|\mathbf{u}\|^2}\\
u=u1uτ=∥u∥22
https://stackoverflow.com/questions/53489237/how-can-you-implement-householder-based-qr-decomposition-in-python
def householder_vectorized(a):
"""Use this version of householder to reproduce the output of np.linalg.qr
exactly (specifically, to match the sign convention it uses)
based on https://rosettacode.org/wiki/QR_decomposition#Python
"""
v = a / (a[0] + np.copysign(np.linalg.norm(a), a[0]))
v[0] = 1
tau = 2 / (v.T @ v)
return v,tau
QR分解
QR分解可以用施密特正交化做,也可以用householder变换
以4维为例
A
=
(
a
11
a
12
a
13
a
14
a
21
a
22
a
23
a
24
a
31
a
32
a
33
a
34
a
41
a
42
a
43
a
44
)
\mathbf{A}=\begin{pmatrix} a_{11}&a_{12}&a_{13}&a_{14}\\ a_{21}&a_{22}&a_{23}&a_{24}\\ a_{31}&a_{32}&a_{33}&a_{34}\\ a_{41}&a_{42}&a_{43}&a_{44}\\ \end{pmatrix}
A=⎝
⎛a11a21a31a41a12a22a32a42a13a23a33a43a14a24a34a44⎠
⎞
目标是变为上三角矩阵
先利用Householder变换,变换第一列
设
x
=
(
a
11
a
21
a
31
a
41
)
\mathbf{x}=\begin{pmatrix} a_{11}\\ a_{21}\\ a_{31}\\ a_{41}\\ \end{pmatrix}
x=⎝
⎛a11a21a31a41⎠
⎞,
y
=
(
a
11
(
1
)
0
0
0
)
\mathbf{y}=\begin{pmatrix} a_{11}^{(1)}\\ 0\\ 0\\ 0\\ \end{pmatrix}
y=⎝
⎛a11(1)000⎠
⎞
根据Householder变换,
a
11
(
1
)
=
∥
x
∥
a_{11}^{(1)}=\|\mathbf{x}\|
a11(1)=∥x∥
u
=
x
−
y
∥
x
−
y
∥
\mathbf{u}=\frac{\mathbf{x}-\mathbf{y}}{\|\mathbf{x}-\mathbf{y}\|}
u=∥x−y∥x−y
H
(
1
)
=
I
−
u
u
H
\mathbf{H}^{(1)}=\mathbf{I}-\mathbf{u}\mathbf{u}^H
H(1)=I−uuH
A
(
1
)
=
H
(
1
)
A
=
(
a
11
(
1
)
a
12
(
1
)
a
13
(
1
)
a
14
(
1
)
0
a
22
(
1
)
a
23
(
1
)
a
24
(
1
)
0
a
32
(
1
)
a
33
(
1
)
a
34
(
1
)
0
a
42
(
1
)
a
43
(
1
)
a
44
(
1
)
)
\mathbf{A}^{(1)}=\mathbf{H}^{(1)}\mathbf{A}=\begin{pmatrix} a_{11}^{(1)}&a_{12}^{(1)}&a_{13}^{(1)}&a_{14}^{(1)}\\ 0&a_{22}^{(1)}&a_{23}^{(1)}&a_{24}^{(1)}\\ 0&a_{32}^{(1)}&a_{33}^{(1)}&a_{34}^{(1)}\\ 0&a_{42}^{(1)}&a_{43}^{(1)}&a_{44}^{(1)}\\ \end{pmatrix}
A(1)=H(1)A=⎝
⎛a11(1)000a12(1)a22(1)a32(1)a42(1)a13(1)a23(1)a33(1)a43(1)a14(1)a24(1)a34(1)a44(1)⎠
⎞
接着变换第二列,此时要保证不改变第一行
设
x
=
(
a
22
(
1
)
a
32
(
1
)
a
42
(
1
)
)
\mathbf{x}=\begin{pmatrix} a_{22}^{(1)}\\ a_{32}^{(1)}\\ a_{42}^{(1)}\\ \end{pmatrix}
x=⎝
⎛a22(1)a32(1)a42(1)⎠
⎞,
y
=
(
a
22
(
2
)
0
0
)
\mathbf{y}=\begin{pmatrix} a_{22}^{(2)}\\ 0\\ 0\\ \end{pmatrix}
y=⎝
⎛a22(2)00⎠
⎞
u
=
x
−
y
∥
x
−
y
∥
\mathbf{u}=\frac{\mathbf{x}-\mathbf{y}}{\|\mathbf{x}-\mathbf{y}\|}
u=∥x−y∥x−y
H
~
(
2
)
=
I
−
u
u
H
\tilde{\mathbf{H}}^{(2)}=\mathbf{I}-\mathbf{u}\mathbf{u}^H
H~(2)=I−uuH
H
(
2
)
=
(
1
0
H
0
H
~
(
2
)
)
\mathbf{H}^{(2)}=\begin{pmatrix} 1&\mathbf{0}^H\\ \mathbf{0}&\tilde{\mathbf{H}}^{(2)}\\ \end{pmatrix}
H(2)=(100HH~(2))
A
(
2
)
=
H
(
2
)
H
(
1
)
A
=
(
a
11
(
1
)
a
12
(
1
)
a
13
(
1
)
a
14
(
1
)
0
a
22
(
2
)
a
23
(
2
)
a
24
(
2
)
0
0
a
33
(
2
)
a
34
(
2
)
0
0
a
43
(
2
)
a
44
(
2
)
)
\mathbf{A}^{(2)}=\mathbf{H}^{(2)}\mathbf{H}^{(1)}\mathbf{A}=\begin{pmatrix} a_{11}^{(1)}&a_{12}^{(1)}&a_{13}^{(1)}&a_{14}^{(1)}\\ 0&a_{22}^{(2)}&a_{23}^{(2)}&a_{24}^{(2)}\\ 0&0&a_{33}^{(2)}&a_{34}^{(2)}\\ 0&0&a_{43}^{(2)}&a_{44}^{(2)}\\ \end{pmatrix}
A(2)=H(2)H(1)A=⎝
⎛a11(1)000a12(1)a22(2)00a13(1)a23(2)a33(2)a43(2)a14(1)a24(2)a34(2)a44(2)⎠
⎞
接着变换第三列,同时保证第一行,第二行不改变
设
x
=
(
a
33
(
2
)
a
43
(
2
)
)
\mathbf{x}=\begin{pmatrix} a_{33}^{(2)}\\ a_{43}^{(2)}\\ \end{pmatrix}
x=(a33(2)a43(2)),
y
=
(
a
33
(
3
)
0
)
\mathbf{y}=\begin{pmatrix} a_{33}^{(3)}\\ 0\\ \end{pmatrix}
y=(a33(3)0)
u
=
x
−
y
∥
x
−
y
∥
\mathbf{u}=\frac{\mathbf{x}-\mathbf{y}}{\|\mathbf{x}-\mathbf{y}\|}
u=∥x−y∥x−y
H
~
(
3
)
=
I
−
u
u
H
\tilde{\mathbf{H}}^{(3)}=\mathbf{I}-\mathbf{u}\mathbf{u}^H
H~(3)=I−uuH
H
(
3
)
=
(
I
2
×
2
0
H
0
H
~
(
3
)
)
\mathbf{H}^{(3)}=\begin{pmatrix} \mathbf{I}_{2\times 2}&\mathbf{0}^H\\ \mathbf{0}&\tilde{\mathbf{H}}^{(3)}\\ \end{pmatrix}
H(3)=(I2×200HH~(3))
R
=
A
(
3
)
=
H
(
3
)
H
(
2
)
H
(
1
)
A
=
(
a
11
(
1
)
a
12
(
1
)
a
13
(
1
)
a
14
(
1
)
0
a
22
(
2
)
a
23
(
2
)
a
24
(
2
)
0
0
a
33
(
3
)
a
34
(
3
)
0
0
0
a
44
(
3
)
)
\mathbf{R}=\mathbf{A}^{(3)}=\mathbf{H}^{(3)}\mathbf{H}^{(2)}\mathbf{H}^{(1)}\mathbf{A}=\begin{pmatrix} a_{11}^{(1)}&a_{12}^{(1)}&a_{13}^{(1)}&a_{14}^{(1)}\\ 0&a_{22}^{(2)}&a_{23}^{(2)}&a_{24}^{(2)}\\ 0&0&a_{33}^{(3)}&a_{34}^{(3)}\\ 0&0&0&a_{44}^{(3)}\\ \end{pmatrix}
R=A(3)=H(3)H(2)H(1)A=⎝
⎛a11(1)000a12(1)a22(2)00a13(1)a23(2)a33(3)0a14(1)a24(2)a34(3)a44(3)⎠
⎞
于是
Q
=
H
(
1
)
H
(
2
)
H
(
3
)
\mathbf{Q}=\mathbf{H}^{(1)}\mathbf{H}^{(2)}\mathbf{H}^{(3)}
Q=H(1)H(2)H(3)