矩阵论课程学习的一些代码实践
前言
万恶的矩阵论终于考试结束,本人学艺不精,但是喜欢在学习新东西的过程中用python来实践理论,所以肯定也得和矩阵论“贴贴”。也算一时兴起,有些代码也参考了其他大神(抱大腿)。因为不是数学专业,也是根据自己的粗浅理解,总体还不是很完善,仅供参考,必定有错,毋庸置疑。也算是对自己一年的总结吧~ 谢谢各位
正文开始~
LU & LDV分解
若方阵A可以分解为A=LU,其中,L和U分别是上下三角矩阵,我们称这种分解为LU分解,L可以表示lower,U可以表示upper。当A可以分解为A=LDV形式,且L,V分别是对角线元素为1的下三角和上三角阵,D为对角矩阵时,称此种分解为LDV分解。这里我们不过多探究LU或者LDV分解的存在性和唯一性判定,具体可以参考相关的Linear Algebra书籍。
给个大概的既视感:
L
U
:
A
=
[
a
11
a
12
a
13
a
21
a
22
a
23
a
31
a
32
a
33
]
=
[
∗
∗
0
0
∗
∗
∗
∗
0
∗
∗
∗
∗
∗
∗
]
[
∗
∗
∗
∗
∗
∗
0
∗
∗
∗
∗
0
0
∗
∗
]
LU:A= \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}= \begin{bmatrix} ** & 0 & 0 \\ ** & ** &0 \\ ** & ** & ** \end{bmatrix} \begin{bmatrix} ** & ** & ** \\ 0 & ** &** \\ 0 & 0 & ** \end{bmatrix}
LU:A=⎣⎡a11a21a31a12a22a32a13a23a33⎦⎤=⎣⎡∗∗∗∗∗∗0∗∗∗∗00∗∗⎦⎤⎣⎡∗∗00∗∗∗∗0∗∗∗∗∗∗⎦⎤
L D V : A = [ a 11 a 12 a 13 a 21 a 22 a 23 a 31 a 32 a 33 ] = [ 1 0 0 ∗ ∗ 1 0 ∗ ∗ ∗ ∗ 1 ] [ ∗ ∗ 0 0 0 ∗ ∗ 0 0 0 ∗ ∗ ] [ 1 ∗ ∗ ∗ ∗ 0 1 ∗ ∗ 0 0 1 ] LDV:A= \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}= \begin{bmatrix} 1 & 0 & 0 \\ ** & 1 &0 \\ ** & ** & 1 \end{bmatrix} \begin{bmatrix} ** & 0 & 0\\ 0 & ** &0 \\ 0 & 0 & ** \end{bmatrix} \begin{bmatrix} 1 & ** & ** \\ 0 & 1&** \\ 0 & 0 & 1 \end{bmatrix} LDV:A=⎣⎡a11a21a31a12a22a32a13a23a33⎦⎤=⎣⎡1∗∗∗∗01∗∗001⎦⎤⎣⎡∗∗000∗∗000∗∗⎦⎤⎣⎡100∗∗10∗∗∗∗1⎦⎤
具体操作方法呢,我们可以先对方阵A进行初等行变换化为上三角阵,假设每一次的行变换操作等价于左乘一个变换矩阵P,我们有
P
n
P
n
−
1
.
.
.
P
1
A
=
U
P_nP_{n-1}...P_1A=U
PnPn−1...P1A=U
记
P
=
P
n
P
n
−
1
.
.
.
P
1
P=P_nP_{n-1}...P_1
P=PnPn−1...P1,其中
P
P
P是一个可逆的下三角矩阵。而我们在以前的学习中也知道行变换矩阵的逆有对应的形式,所以得到
A
=
P
−
1
U
=
L
U
A=P^{-1}U=LU
A=P−1U=LU,即LU分解。
而我们可以由LU分解继续得到LDV分解(注:有LU分解不一定就有LDV分解),此时U是一个上三角矩阵,我们把对角线的元素抽出组成对角矩阵D,然后将U去除以各行对应的对角元素即可得到V,下面是个简单例子:
U = [ a 11 a 12 a 13 0 a 22 a 23 0 0 a 33 ] = [ a 11 0 0 0 a 22 0 0 0 a 33 ] [ 1 a 12 / a 11 a 13 / a 11 0 1 a 23 / a 22 0 0 1 ] U= \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{bmatrix}= \begin{bmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix} \begin{bmatrix} 1 & a_{12}/a_{11} & a_{13}/a_{11} \\ 0 & 1 & a_{23}/a_{22} \\ 0 & 0 & 1 \end{bmatrix} U=⎣⎡a1100a12a220a13a23a33⎦⎤=⎣⎡a11000a22000a33⎦⎤⎣⎡100a12/a1110a13/a11a23/a221⎦⎤
初等矩阵求逆
对于初等矩阵,即由初等变换(还不知道可以度娘)得到的矩阵,它们都是可逆矩阵(det不为0),且有着固定的逆矩阵形式,如:
- E i ( k ) E_i(k) Ei(k)表示第i行乘上一个常数k,其逆矩阵为 E i ( 1 / k ) E_i(1/k) Ei(1/k);
- E i j − 1 E_{ij}^{-1} Eij−1表示第i行和第j行互换,其逆矩阵为其本身,即 E i j − 1 E_{ij}^{-1} Eij−1,很好理解,再换一次就复原了;
-
E
i
j
(
k
)
E_{ij}(k)
Eij(k)表示第i行的k倍加到第j行,其逆矩阵为
E
i
j
(
−
k
)
E_{ij}(-k)
Eij(−k).
利用LU分解解线性方程组
对于线性方程组(暂不考虑相不相容的情况)
A
X
=
b
AX=b
AX=b,我们对A进行LU分解,得到
A
=
L
U
A=LU
A=LU,所以有
L
U
X
=
b
LUX=b
LUX=b,我们记
Y
=
U
X
Y=UX
Y=UX,即把UX看成一个整体。这里是要利用L的下三角特性,可以比较方便地解出Y的值,然后在
U
X
=
Y
UX=Y
UX=Y中又可以利用U的上三角特性,方便地解出X。
r
a
w
:
A
X
=
b
d
o
:
A
=
L
U
g
e
t
:
L
U
X
=
b
i
n
t
r
o
d
u
c
e
Y
:
L
Y
=
b
s
o
l
v
e
:
Y
b
a
c
k
:
U
X
=
Y
s
o
l
v
e
:
X
\begin{aligned} raw:&AX=b \\ do:&A=LU \\ get:&LUX=b\\ introduce Y:&LY=b\\ solve:& Y\\ back:& UX=Y\\ solve: &X \end{aligned}
raw:do:get:introduceY:solve:back:solve:AX=bA=LULUX=bLY=bYUX=YX
上代码
具体在编程的时候还要注意大数吃小数的问题,所以可以在行变换时对各行进行排序,从小到大,让小的数去乘上对应的系数约去大数。
@classmethod
def LDV_2D(self, A, mode='LDV', eps=1e-6, Test=False, solve_equation=False):
"""custom function for 2D-matrix LDV-factorization by Junno
Args:
A ([np.darray]): [2D-matrix to be processed with LDV-factorization. ]
mode (str, optional): [choose mode]. Defaults to LDV.
eps ([float]): numerical threshold.
Test (bool, optional): [show checking information]. Defaults to False.
solve_equation(bool,optional): [True when use this function to solve equation like Ax=b]. Defaults to False.
Raises:
ValueError: [matrix don't meet the requirement for LDV or LU factorization]
Returns:
[np.darray]: [resort-matrix, L, D, V] or [resort-matrix, L, U]
Last edited: 21-12-31
Author: Junno
"""
assert len(A.shape) == 2 and A.shape[0] == A.shape[1], "please be sure A is 2D and square"
# primary check for LDV: nonsingular matrices has LDV factorization uniquely.
if mode == 'LDV':
# full-rank check, we will introduce later!
if not self.Check_full_rank(A):
raise ValueError(
"This matrix can't be operated with LDV factorization, maybe you can try LU factorization instead ^_^")
M, N = A.shape
# cm is to be operated
cm = np.copy(A)
# A with row exchange
resort_m = np.copy(A)
# record primary row transform operators
operator_list = []
L = np.eye(M)
if Test:
print('original matrix:\n', A)
for i in range(M):
for j in range(i, N):
if Test:
print('During process in row:{}, col:{}'.format(i, j))
# there are non-zero value in col i
if sum(abs(cm[i:, j])) != 0:
zero_index = []
non_zero_index = []
for k in range(i, M):
if abs(cm[k, j])>eps:
non_zero_index.append(k)
else:
zero_index.append(k)
non_zero_index = sorted(
non_zero_index, key=lambda x: abs(cm[x, j]))
# from small to large ans non-zero first
sort_idnex = non_zero_index+zero_index
if Test:
print('Sorting index for {} cols:\n'.format(i), sort_idnex)
# do row-exchange and record
# t means row-exchange transform and pre means primary row transform coefficients
if sort_idnex[0] != i:
operator_list.append(['t', [sort_idnex[0], i]])
cm[[sort_idnex[0], i], :] = cm[[i, sort_idnex[0]], :]
resort_m[[sort_idnex[0], i],
:] = resort_m[[i, sort_idnex[0]], :]
if Test:
print('After resorting\n', cm)
# operate elementary row transformation on matrix based on elimination coefficient
if np.sum(abs(cm[i+1:, j]))>eps:
prefix = -cm[i+1:, j]/cm[i, j]
operator_list.append(['pre', i, prefix.tolist()])
temp = (np.array(prefix).reshape(-1, 1)
)@cm[i, :].reshape((1, -1))
cm[i+1:, :] += temp
if Test:
print('After primary row transformation:')
print(cm)
break
else:
continue
# Prints a new matrix arranged according to the main elements (just swap rows without changing the matrix information)
if Test:
print('resort matrix:\n', resort_m)
print('After primary row transformation: \n', cm)
print('Operation list:\n', operator_list)
# Operate on the identity matrix according to the saved sorting and elimination coefficients to get the matrix L
for i in range(len(operator_list)-1, -1, -1):
op = operator_list[i]
if op[0] == 't':
m, n = op[1]
L[[m, n], :] = L[[n, m], :]
elif op[0] == 'pre':
ind = op[1]
temp = -(np.array(op[2]).reshape(-1, 1)
)@L[ind, :].reshape((1, -1))
L[ind+1:, :] += temp
# Operate the same row transpose on the L to let it be lower triangle
for i in range(len(operator_list)):
op = operator_list[i]
if op[0] == 't':
m, n = op[1]
L[[m, n], :] = L[[n, m], :]
# LU mode
if mode == 'LU':
if solve_equation:
return operator_list, resort_m, L, cm
else:
return resort_m, L, cm
# LDV mode
elif mode == 'LDV':
# extract the principal diagonal elements of the transformed matrix
diag = np.diag(cm)
# Error
if 0 in diag:
raise ValueError(
'this matrix can\'t be operated with LDV factorization')
return resort_m, L, np.diag(diag), np.triu(cm/diag.reshape((-1, 1)))
我这捉鸡的代码功底大家见笑,大致思路已经注释在里面,核心不过初等行变换以及记录变换的一些信息,随后合成L和U矩阵,继而可以得到LDV分解的结果。
examples
先上一个教材的例子(《矩阵论》杨明,刘先忠)
A
=
[
2
2
3
4
7
7
−
2
4
5
]
A= \begin{bmatrix} 2 & 2 & 3 \\ 4 & 7 & 7 \\ -2 & 4 & 5 \end{bmatrix}
A=⎣⎡24−2274375⎦⎤
>>> import numpy as np
>>> from Matrix_Solutions import Matrix_Solutions
>>> import math
>>> from copy import deepcopy
>>> np.set_printoptions(precision=3, suppress=True, linewidth=100)
>>> a=np.array([[2,2,3],[4,7,7],[-2,4,5]]).reshape((3,3)).astype(np.float32)
>>> a
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]], dtype=float32)
# test LU factorization
>>> rm,L,U=Matrix_Solutions.LDV_2D(a, mode="LU",test=False)
>>> rm
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]], dtype=float32)
>>> L
array([[ 1., 0., 0.],
[ 2., 1., 0.],
[-1., 2., 1.]])
>>> U
array([[2., 2., 3.],
[0., 3., 1.],
[0., 0., 6.]], dtype=float32)
>>> L@U
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]])
# test LDV factorization
>>> re,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False)
>>> L
array([[ 1., 0., 0.],
[ 2., 1., 0.],
[-1., 2., 1.]])
>>> D
array([[2., 0., 0.],
[0., 3., 0.],
[0., 0., 6.]], dtype=float32)
>>> V
array([[1. , 1. , 1.5 ],
[0. , 1. , 0.333],
[0. , 0. , 1. ]], dtype=float32)
>>> L@D@V
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]])
# test a random matrix
>>> a=np.random.randn(4,4)
>>> a
array([[-1.407, -0.787, 1.194, 0.275],
[-0.539, 0.482, -0.562, 3.241],
[-0.822, 0.482, -1.329, 1.048],
[-0.386, -0.432, 0.55 , 1.209]])
>>> rm,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False)
# we can see that rm is the result of resorting a, and LDV is linked to rm
>>> rm
array([[-0.386, -0.432, 0.55 , 1.209],
[-1.407, -0.787, 1.194, 0.275],
[-0.539, 0.482, -0.562, 3.241],
[-0.822, 0.482, -1.329, 1.048]])
>>> L@D@V
array([[-0.386, -0.432, 0.55 , 1.209],
[-1.407, -0.787, 1.194, 0.275],
[-0.539, 0.482, -0.562, 3.241],
[-0.822, 0.482, -1.329, 1.048]])
未完待续
放假的心已经肆无忌惮,祝大家元旦佳节快乐,疫情期间不要乱跑喔!
坐等申鹤姐姐 @^@