施密特正交化
施密特正交化(Gram-Schmidt Orthogonality)是常用的求欧式空间正交基的方法。给定一个线性无关向量组 a 1 , a 2 , . . . , a m a_1,a_2,...,a_m a1,a2,...,am,经过施密特正交化可以得到一组正交向量组 b 1 , b 2 , . . . , b m b_1,b_2,...,b_m b1,b2,...,bm,正交向量组中的每个向量正交化之后,就得到一个标准正交向量组。
下面我们来看施密特正交化的具体步骤。因为向量组
a
1
,
a
2
,
.
.
.
,
a
m
a_1,a_2,...,a_m
a1,a2,...,am是线性无关的,所以可以用这些向量来构造
b
1
,
b
2
,
.
.
.
,
b
m
b_1,b_2,...,b_m
b1,b2,...,bm,具体如下:
b
1
=
a
1
b_1 = a_1
b1=a1
b
2
=
a
2
+
h
a
1
b_2 = a_2 + ha_1
b2=a2+ha1
b
3
=
a
3
+
h
1
a
1
+
h
2
a
2
b_3 = a_3 + h_1a_1 + h_2a_2
b3=a3+h1a1+h2a2
.
.
.
...
...
反过来,我们也可以从
b
1
,
b
2
,
.
.
.
,
b
m
b_1,b_2,...,b_m
b1,b2,...,bm退回到
a
1
,
a
2
,
.
.
.
,
a
m
a_1,a_2,...,a_m
a1,a2,...,am
a
1
=
b
1
a_1 = b_1
a1=b1
a
2
=
b
2
+
k
b
1
a_2 = b_2 + k b_1
a2=b2+kb1
a
3
=
b
3
+
k
1
b
1
+
k
2
b
2
a_3 = b_3 + k_1b_1 + k_2b_2
a3=b3+k1b1+k2b2
.
.
.
...
...
=
=
>
==>
==>
b
1
=
a
1
b_1 = a_1
b1=a1
b
2
=
a
2
−
k
b
1
b_2 = a_2 - kb_1
b2=a2−kb1
b
3
=
a
3
−
k
1
b
1
−
k
2
b
2
b_3 = a_3 - k_1b_1 - k_2b_2
b3=a3−k1b1−k2b2
.
.
.
...
...
只需要求出
k
,
k
1
,
k
2
k,k_1,k_2
k,k1,k2这些系数,就可以得到
a
1
,
a
2
,
.
.
.
,
a
m
a_1,a_2,...,a_m
a1,a2,...,am到
b
1
,
b
2
,
.
.
.
,
b
m
b_1,b_2,...,b_m
b1,b2,...,bm的转化方式。由于
b
1
,
b
2
,
.
.
.
,
b
m
b_1,b_2,...,b_m
b1,b2,...,bm是正交的,所以我们有:
b
1
T
b
2
=
0
⇒
b
1
T
(
a
2
−
k
b
1
)
=
0
⇒
k
=
b
1
T
a
2
b
1
T
b
1
b_1^Tb_2=0 ⇒ b_1^T(a_2 - kb_1)=0 ⇒ k=\frac{b_1^Ta_2}{b_1^Tb_1}
b1Tb2=0⇒b1T(a2−kb1)=0⇒k=b1Tb1b1Ta2
b
1
T
b
3
=
0
⇒
b
1
T
(
a
3
−
k
1
b
1
−
k
2
b
2
)
=
0
⇒
k
1
=
b
1
T
a
3
b
1
T
b
1
b_1^Tb_3=0 ⇒ b_1^T(a_3 - k_1b_1 - k_2b_2)=0 ⇒ k_1=\frac{b_1^Ta_3}{b_1^Tb_1}
b1Tb3=0⇒b1T(a3−k1b1−k2b2)=0⇒k1=b1Tb1b1Ta3
b
2
T
b
3
=
0
⇒
b
2
T
(
a
3
−
k
1
b
1
−
k
2
b
2
)
=
0
⇒
k
2
=
b
2
T
a
3
b
2
T
b
2
b_2^Tb_3=0 ⇒ b_2^T(a_3 - k_1b_1 - k_2b_2)=0 ⇒ k_2=\frac{b_2^Ta_3}{b_2^Tb_2}
b2Tb3=0⇒b2T(a3−k1b1−k2b2)=0⇒k2=b2Tb2b2Ta3
.
.
.
...
...
根据这个公式很容易写出施密特正交化代码:
def schmidt_orthogonality(matrix_org, debug=False):
"""
b1 = a1, b2 = a2 - kb1, b3 = a3 - k1b1 - k2b2
:param matrix_org: m x n matrix, m >= n 且满秩
:return:
"""
m, n = matrix_org.shape
matrix_ortho = matrix_org.copy()
matrix_ortho = np.asarray(matrix_ortho, dtype=np.float)
coefficient = np.zeros(shape=(m, n)) # 系数矩阵k、k1、k2
coefficient[0, 0] = 1 # b1 = a1
for i in range(1, n): # 开始处理下一列
coefficient[i, i] = 1
for j in range(i):
b_j = matrix_ortho[:, j]
k_j = np.dot(b_j, matrix_org[:, i]) / np.dot(b_j, b_j)
coefficient[j, i] = k_j
matrix_ortho[:, i] -= k_j * b_j
# 正交向量b1,b2...做正交化处理,系数也做相应的改变
for i in range(n):
devider = np.dot(matrix_ortho[:, i], matrix_ortho[:, i])
if abs(devider) < 1e-16: # 避免除以0
matrix_ortho[:, i] *= 0
else:
devider = np.sqrt(devider)
matrix_ortho[:, i] /= devider
coefficient[i, :] *= devider
print('result:', matrix_ortho)
return matrix_ortho, coefficient
例如输入矩阵
[[1, 2, 2],
[1, 0, 2],
[0, 1, 1]]
得到:
matrix_ortho=
[[ 0.70710678 0.57735027 -0.40824829]
[ 0.70710678 -0.57735027 0.40824829]
[ 0. 0.57735027 0.81649658]]
coefficient=
[[1.41421356 1.41421356 2.82842712]
[0. 1.73205081 0.57735027]
[0. 0. 0.81649658]]
matrix_ortho * coefficient =
[[ 1.00000000e+00 2.00000000e+00 2.00000000e+00]
[ 1.00000000e+00 -7.64210971e-17 2.00000000e+00]
[ 0.00000000e+00 1.00000000e+00 1.00000000e+00]]
由于数值计算有微小的误差,所以可以看到结果里面有-7.64210971e-17
而不是正好等于0。
QR分解
施密特正交化得到的向量组正交,系数矩阵是上三角,满足A=QR。所以施密特正交化的结果就是矩阵的QR分解结果。
其他
对于非满秩矩阵可以先用单位向量扩展到满秩矩阵,然后再做施密特正交化。