矩阵论代码实践之LU&LDV分解

小刀来啦

于 2021-12-31 17:47:22 发布

阅读量1.2k

点赞数 2

分类专栏：矩阵论代码实践文章标签：矩阵线性代数 python

本文链接：https://blog.csdn.net/weixin_40519529/article/details/122258464

版权

矩阵论代码实践专栏收录该内容

7 篇文章 2 订阅

订阅专栏

矩阵论课程学习的一些代码实践

前言
未完待续

前言

万恶的矩阵论终于考试结束，本人学艺不精，但是喜欢在学习新东西的过程中用python来实践理论，所以肯定也得和矩阵论“贴贴”。也算一时兴起，有些代码也参考了其他大神（抱大腿）。因为不是数学专业，也是根据自己的粗浅理解，总体还不是很完善，仅供参考，必定有错，毋庸置疑。也算是对自己一年的总结吧~ 谢谢各位

正文开始~ CSDN (゜-゜)つロ干杯

LU & LDV分解

若方阵A可以分解为A=LU，其中，L和U分别是上下三角矩阵，我们称这种分解为LU分解，L可以表示lower，U可以表示upper。当A可以分解为A=LDV形式，且L,V分别是对角线元素为1的下三角和上三角阵，D为对角矩阵时，称此种分解为LDV分解。这里我们不过多探究LU或者LDV分解的存在性和唯一性判定，具体可以参考相关的Linear Algebra书籍。

给个大概的既视感：
$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}= \begin{bmatrix} ** & 0 & 0 \\ ** & ** &0 \\ ** & ** & ** \end{bmatrix} \begin{bmatrix} ** & ** & ** \\ 0 & ** &** \\ 0 & 0 & ** \end{bmatrix}$

$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}= \begin{bmatrix} 1 & 0 & 0 \\ ** & 1 &0 \\ ** & ** & 1 \end{bmatrix} \begin{bmatrix} ** & 0 & 0\\ 0 & ** &0 \\ 0 & 0 & ** \end{bmatrix} \begin{bmatrix} 1 & ** & ** \\ 0 & 1&** \\ 0 & 0 & 1 \end{bmatrix}$

具体操作方法呢，我们可以先对方阵A进行初等行变换化为上三角阵，假设每一次的行变换操作等价于左乘一个变换矩阵P，我们有
$P_nP_{n-1}...P_1A=U$
记 $P=P_nP_{n-1}...P_1$ ，其中 $P$ 是一个可逆的下三角矩阵。而我们在以前的学习中也知道行变换矩阵的逆有对应的形式，所以得到 $A=P^{-1}U=LU$ ，即LU分解。

而我们可以由LU分解继续得到LDV分解(注：有LU分解不一定就有LDV分解），此时U是一个上三角矩阵，我们把对角线的元素抽出组成对角矩阵D，然后将U去除以各行对应的对角元素即可得到V，下面是个简单例子：

$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{bmatrix}= \begin{bmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix} \begin{bmatrix} 1 & a_{12}/a_{11} & a_{13}/a_{11} \\ 0 & 1 & a_{23}/a_{22} \\ 0 & 0 & 1 \end{bmatrix}$

初等矩阵求逆

对于初等矩阵，即由初等变换（还不知道可以度娘）得到的矩阵，它们都是可逆矩阵（det不为0），且有着固定的逆矩阵形式，如：

$E_i(k)$ 表示第i行乘上一个常数k，其逆矩阵为 $E_i(1/k)$ ；
$E_{ij}^{-1}$ 表示第i行和第j行互换，其逆矩阵为其本身，即 $E_{ij}^{-1}$ ，很好理解，再换一次就复原了；
$E_{ij}(k)$ 表示第i行的k倍加到第j行，其逆矩阵为 $E_{ij}(-k)$ .

利用LU分解解线性方程组

对于线性方程组（暂不考虑相不相容的情况） $A X = b$ ，我们对A进行LU分解，得到 $A = L U$ ，所以有 $L U X = b$ ，我们记 $Y = U X$ ，即把UX看成一个整体。这里是要利用L的下三角特性，可以比较方便地解出Y的值，然后在 $U X = Y$ 中又可以利用U的上三角特性，方便地解出X。
$\begin{aligned} raw:&AX=b \\ do:&A=LU \\ get:&LUX=b\\ introduce Y:&LY=b\\ solve:& Y\\ back:& UX=Y\\ solve: &X \end{aligned}$

上代码

具体在编程的时候还要注意大数吃小数的问题，所以可以在行变换时对各行进行排序，从小到大，让小的数去乘上对应的系数约去大数。

在这里插入图片描述

@classmethod
def LDV_2D(self, A, mode='LDV', eps=1e-6, Test=False, solve_equation=False):
	"""custom function for 2D-matrix LDV-factorization by Junno
	
	Args:
	    A ([np.darray]): [2D-matrix to be processed with LDV-factorization. ]
	    mode (str, optional): [choose mode]. Defaults to LDV.
	    eps ([float]): numerical threshold.
	    Test (bool, optional): [show checking information]. Defaults to False.
	    solve_equation(bool,optional): [True when use this function to solve equation like Ax=b]. Defaults to False.
	
	Raises:
	    ValueError: [matrix don't meet the requirement for LDV or LU factorization]
	
	Returns:
	    [np.darray]: [resort-matrix, L, D, V] or [resort-matrix, L, U]
	
	Last edited: 21-12-31

    Author: Junno
	"""
	
	assert len(A.shape) == 2 and A.shape[0] == A.shape[1], "please be sure A is 2D and square"
    # primary check for LDV:  nonsingular matrices has LDV factorization uniquely.
    if mode == 'LDV':
    	# full-rank check, we will introduce later!
        if not self.Check_full_rank(A):
            raise ValueError(
                "This matrix can't be operated with LDV factorization, maybe you can try LU factorization instead ^_^")
    M, N = A.shape
    # cm is to be operated
    cm = np.copy(A)
    # A with row exchange
    resort_m = np.copy(A)
    # record primary row transform operators
    operator_list = []
    L = np.eye(M)
    if Test:
    	print('original matrix:\n', A)

    for i in range(M):
        for j in range(i, N):
            if Test:
                print('During process in row:{}, col:{}'.format(i, j))
            # there are non-zero value in col i
            if sum(abs(cm[i:, j])) != 0:
                zero_index = []
                non_zero_index = []
                for k in range(i, M):
                    if abs(cm[k, j])>eps:
                        non_zero_index.append(k)
                    else:
                        zero_index.append(k)

                non_zero_index = sorted(
                    non_zero_index, key=lambda x: abs(cm[x, j]))
				# from small to large ans non-zero first
                sort_idnex = non_zero_index+zero_index

                if Test:
                    print('Sorting index for {} cols:\n'.format(i), sort_idnex)
				
				# do row-exchange and record
				# t means row-exchange transform and pre means primary row transform coefficients 
                if sort_idnex[0] != i:
                    operator_list.append(['t', [sort_idnex[0], i]])
                    cm[[sort_idnex[0], i], :] = cm[[i, sort_idnex[0]], :]
                    resort_m[[sort_idnex[0], i],
                             :] = resort_m[[i, sort_idnex[0]], :]
                if Test:
                    print('After resorting\n', cm)
				
				# operate elementary row transformation on matrix based on elimination coefficient
                if np.sum(abs(cm[i+1:, j]))>eps:
                    prefix = -cm[i+1:, j]/cm[i, j]
                    operator_list.append(['pre', i, prefix.tolist()])
                    temp = (np.array(prefix).reshape(-1, 1)
                            )@cm[i, :].reshape((1, -1))
                    cm[i+1:, :] += temp

                if Test:
                    print('After primary row transformation:')
                    print(cm)

                break

            else:
                continue

    # Prints a new matrix arranged according to the main elements (just swap rows without changing the matrix information)
    if Test:
        print('resort matrix:\n', resort_m)
        print('After primary row transformation: \n', cm)
        print('Operation list:\n', operator_list)

    # Operate on the identity matrix according to the saved sorting and elimination coefficients to get the matrix L
    for i in range(len(operator_list)-1, -1, -1):
        op = operator_list[i]
        if op[0] == 't':
            m, n = op[1]
            L[[m, n], :] = L[[n, m], :]
        elif op[0] == 'pre':
            ind = op[1]
            temp = -(np.array(op[2]).reshape(-1, 1)
                     )@L[ind, :].reshape((1, -1))
            L[ind+1:, :] += temp

    # Operate the same row transpose on the L to let it be lower triangle
    for i in range(len(operator_list)):
        op = operator_list[i]
        if op[0] == 't':
            m, n = op[1]
            L[[m, n], :] = L[[n, m], :]

    # LU mode
    if mode == 'LU':
        if solve_equation:
            return operator_list, resort_m, L, cm
        else:
            return resort_m, L, cm
    # LDV mode
    elif mode == 'LDV':
        # extract the principal diagonal elements of the transformed matrix
        diag = np.diag(cm)

        # Error
        if 0 in diag:
            raise ValueError(
                'this matrix can\'t be operated with LDV factorization')
        return resort_m, L, np.diag(diag), np.triu(cm/diag.reshape((-1, 1)))

我这捉鸡的代码功底大家见笑，大致思路已经注释在里面，核心不过初等行变换以及记录变换的一些信息，随后合成L和U矩阵，继而可以得到LDV分解的结果。

examples

先上一个教材的例子（《矩阵论》杨明，刘先忠）
$\begin{bmatrix} 2 & 2 & 3 \\ 4 & 7 & 7 \\ -2 & 4 & 5 \end{bmatrix}$

>>> import numpy as np
>>> from Matrix_Solutions import Matrix_Solutions
>>> import math
>>> from copy import deepcopy
>>> np.set_printoptions(precision=3, suppress=True, linewidth=100)

>>> a=np.array([[2,2,3],[4,7,7],[-2,4,5]]).reshape((3,3)).astype(np.float32)
>>> a
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]], dtype=float32)
# test LU factorization
>>> rm,L,U=Matrix_Solutions.LDV_2D(a, mode="LU",test=False)
>>> rm
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]], dtype=float32)
>>> L
array([[ 1.,  0.,  0.],
       [ 2.,  1.,  0.],
       [-1.,  2.,  1.]])
>>> U
array([[2., 2., 3.],
       [0., 3., 1.],
       [0., 0., 6.]], dtype=float32)
>>> L@U
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]])

# test LDV factorization
>>> re,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False) 
>>> L
array([[ 1.,  0.,  0.],
       [ 2.,  1.,  0.],
       [-1.,  2.,  1.]])
>>> D
array([[2., 0., 0.],
       [0., 3., 0.],
       [0., 0., 6.]], dtype=float32)
>>> V
array([[1.   , 1.   , 1.5  ],
       [0.   , 1.   , 0.333],
       [0.   , 0.   , 1.   ]], dtype=float32)
>>> L@D@V
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]])

# test a random matrix
>>> a=np.random.randn(4,4)
>>> a
array([[-1.407, -0.787,  1.194,  0.275],
       [-0.539,  0.482, -0.562,  3.241],
       [-0.822,  0.482, -1.329,  1.048],
       [-0.386, -0.432,  0.55 ,  1.209]])
>>> rm,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False) 
# we can see that rm is the result of resorting a, and LDV is linked to rm
>>> rm
array([[-0.386, -0.432,  0.55 ,  1.209],
       [-1.407, -0.787,  1.194,  0.275],
       [-0.539,  0.482, -0.562,  3.241],
       [-0.822,  0.482, -1.329,  1.048]])
>>> L@D@V
array([[-0.386, -0.432,  0.55 ,  1.209],
       [-1.407, -0.787,  1.194,  0.275],
       [-0.539,  0.482, -0.562,  3.241],
       [-0.822,  0.482, -1.329,  1.048]])