6.5 Gram-Schmidt Orthogonal法求特征值

醒过来摸鱼

已于 2023-05-19 16:41:11 修改

阅读量2.4k

点赞数 1

分类专栏：线性代数【更新中】文章标签：算法线性代数数据结构

于 2022-03-31 15:37:48 首次发布

本文链接：https://blog.csdn.net/m0_66201040/article/details/123844681

版权

线性代数【更新中】专栏收录该内容

71 篇文章 46 订阅

订阅专栏

文章目录

1 高次方程无根式解问题
2 舒尔分解定理
3 分解原理
4 QR分解求特征值
5 Python实现

1 高次方程无根式解问题

这样计算会产生一个问题。对于 $5\times5$ 以上的矩阵，会产生五次及以上的高次方程。而这个方程是没有根式解的。比如以下矩阵：
$A=\begin{pmatrix} 2& 1& -1& 7& 8\\ 1&2& -1&10&11\\ -1& -1& 2& 9& 6\\ 11&3& 4& 5& 12\\ 16&17& -4& -5& -6 \end{pmatrix}\\ det(A-\lambda I) =x^5 - 5x^4 - 401x^3 - 196x^2 + 12207x - 12272=0$
早在200年前，天才数学家伽罗瓦就证明了五次及以上高次方程没有根式解。所以不能用这个教科书的方法来求特征值。那怎么办？

2 舒尔分解定理

前面说LU分解的时候，说过舒尔补。现在舒尔又来了。舒尔分解定理的内容是：
对于复矩阵A，如果存在一个酉矩阵U和一个上三角矩阵T，使得下式成立：
$U^*AU=T$
那么A的特征值就是T的所有对角线元素。哇塞，这么刺激！那就求这个矩阵T呗。但是我们又接触了一个新概念，酉矩阵。啥是酉矩阵啊？酉矩阵就是逆矩阵等于共轭矩阵的复矩阵。
舒尔分解定理的证明过程灰常复杂，我就不去证明了。
但是复矩阵的应用场景不多，更多的是实矩阵。于是舒尔又想出来了实矩阵的分解，被称为舒尔实分解定理：
对于实矩阵A，如果存在一个正交矩阵Q和一个拟上三角quasi-triangular矩阵U，使得下式成立：
$Q^TAQ=U$
那么A的特征值就位于U的对角线上。如果对角线上是 $1\times1$ 矩阵，那么这就是个实特征值，如果对角线上是 $2\times2$ 矩阵，那么就这就是两个共轭复特征值。
同样，我不去证明哈，过程太复杂了。
但是我们不用舒尔分解去做。无论是舒尔复分解还是舒尔实分解，都是数学上可行，但是计算机很难处理的方法。

3 分解原理

分解原理在于对于两个同阶的方阵 $A B$ 与 $B A$ 拥有相同的特征值。所以就可以使用这种算法来求特征值：
$A=BC\\ CB=A_1\\ A_1=B_1C_1\\ C_1B_1=A_2\\ \cdots$
这样得到的一系列矩阵 $A_1,\cdots,A_n$ 有相同的特征值。关键是矩阵分解方法有很多种，要这样一直分解再交换顺序乘起来必须满足两个条件：

这个矩阵序列最终收敛；
这个矩阵序列收敛的结果可以得到容易求特征值的矩阵，比如拟上三角阵、对角阵、约当标准型。

4 QR分解求特征值

前人已经证明利用施密特正交化进行分解，按上述过程下去组成的矩阵序列，会收敛到上三角矩阵或拟上三角矩阵。换句话说，就是收敛到舒尔实分解。也就是前面的算法可以改成这样：
$A=QR\\ A_1=RQ = Q_1R_1\\ A_2=R_1R_1= Q_2R_2\\ \vdots\\ A_n=R_{n-1}R_{n-1}= Q_nR_n\\$
等于是分解出QR之后，把RQ进行相乘，得到的新的矩阵 $A_1$ ，然后新矩阵继续QR分解，当 $A_n$ 变成上三角矩阵或拟上三角矩阵的时候，连环QR就停止了。虽然数学理论证明了， $A_0$ 到 $A_n$ 这个矩阵序列是存在极限的，是一定会变成拟上三角矩阵或上三角矩阵的，但是实际上运算因为浮点数误差，就是接近0就行了，别用都是0进行判断，否则可能无限循环。
还以那个矩阵为例子：
$A_0=\begin{pmatrix}2 & 1 & -1\\ 1 & 2 & -1\\ -1 & -1 & 2\\ \end{pmatrix}\\ A_1=\begin{pmatrix}3.664 & 0.805 & -0.491\\ 0.803 & 1.243 & -0.148\\ -0.492 & -0.148 & 1.091\\ \end{pmatrix}\\ A_2=\begin{pmatrix}3.975 & 0.231 & -0.131\\ 0.227 & 1.018 & -0.01\\ -0.132 & -0.009 & 1.006\\ \end{pmatrix}\\ A_3=\begin{pmatrix}3.998 & 0.061 & -0.033\\ 0.057 & 1.001 & -0.001\\ -0.033 & 0.001 & 1.0\\ \end{pmatrix}\\ A_4=\begin{pmatrix}4.0 & 0.019 & -0.009\\ 0.014 & 1.0 & -0.001\\ -0.008 & 0.002 & 1.0\\ \end{pmatrix}\\ A_5=\begin{pmatrix}4.0 & 0.01 & -0.003\\ 0.003 & 1.0 & -0.001\\ -0.002 & 0.002 & 1.0\\ \end{pmatrix}\\$
到了我们预定的误差范围(<0.01)，循环结束，特征值为4、1、1，完全正确。
当遇到拟上三角upper quasi-triangular矩阵时，需要求一个 $2\times2$ 矩阵的特征值，这个时候用求根公式就行了。公式如下：
$\begin{pmatrix}a & b\\ c & d \end{pmatrix}\\ \lambda_1 =\frac{a+d+\sqrt{a^2+d^2-2ad+4bc}}2\\ \lambda_2 =\frac{a+d-\sqrt{a^2+d^2-2ad+4bc}}2$
再举个拟上三角矩阵复特征值的例子：
$A_0= \begin{pmatrix}0.0 & 0.0 & 3.0 & 1.0\\ 1.0 & 1.0 & 0.0 & 1.0\\ 0.0 & 1.0 & 1.0 & 0.0\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix}\\ A_1= \begin{pmatrix}1.0 & 0.0 & 1.0 & 1.0\\ 1.0 & 1.0 & 0.0 & 0.0\\ 0.0 & 3.0 & 0.0 & 1.0\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_2= \begin{pmatrix}1.5 & 0.574 & 0.649 & 0.707\\ 2.179 & 0.342 & -2.159 & 0.811\\ 0.0 & 0.67 & 0.158 & 0.918\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_3= \begin{pmatrix}2.0 & -0.598 & 2.236 & 1.069\\ 0.598 & 0.6 & -0.695 & 0.894\\ 0.0 & 1.443 & -0.6 & -0.239\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_4= \begin{pmatrix}1.885 & 1.276 & 1.764 & 1.28\\ 0.465 & -0.291 & -1.903 & 0.041\\ 0.0 & 0.786 & 0.406 & -0.599\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_5= \begin{pmatrix}2.165 & 0.605 & -1.29 & 1.253\\ 0.235 & 0.778 & -1.774 & -0.319\\ 0.0 & 1.259 & -0.943 & 0.573\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_6= \begin{pmatrix}2.239 & -1.066 & -1.111 & 1.211\\ 0.156 & -0.706 & -2.045 & 0.278\\ 0.0 & 0.831 & 0.467 & 0.676\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_7= \begin{pmatrix}2.162 & -0.231 & 1.764 & 1.227\\ 0.073 & 0.613 & -1.771 & 0.422\\ 0.0 & 1.021 & -0.775 & -0.562\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_8= \begin{pmatrix}2.155 & 1.309 & 1.126 & 1.241\\ 0.04 & -0.758 & -1.857 & -0.283\\ 0.0 & 0.993 & 0.603 & -0.617\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_9= \begin{pmatrix}2.179 & 0.081 & -1.661 & 1.235\\ 0.023 & 0.503 & -2.005 & -0.295\\ 0.0 & 0.865 & -0.682 & 0.622\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_{10}= \begin{pmatrix}2.18 & -1.424 & -0.899 & 1.232\\ 0.011 & -0.871 & -1.661 & 0.383\\ 0.0 & 1.191 & 0.691 & 0.579\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\ A_{11}= \begin{pmatrix}2.173 & 0.111 & 1.698 & 1.234\\ 0.007 & 0.377 & -2.09 & 0.247\\ 0.0 & 0.758 & -0.549 & -0.645\\ 0.0 & 0.0 & 0.0 & 2.0\\ \end{pmatrix} \\$
特征值为：

[2.173, (-0.086+1.17j), (-0.086-1.17j), 2.0]

5 Python实现

# _*_ coding:utf-8 _*_

import math
import cmath

MIN_VALUE = 1e-2


def round_num(x):
    if isinstance(x, complex):
        return complex(round_num(x.real), round_num(x.imag))
    return round(x * 1000) / 1000.0


class Matrix:
    # 矩阵，为了方便计算，二维数组的每个一维数组表示一个向量，也就是一列
    # 所以看起来会比较奇怪
    def __init__(self, lines):
        self.__lines = lines

    def gram_schmidt(self):
        # q矩阵先复制原矩阵
        q = [[i for i in vector] for vector in self.__lines]
        # 从第二个向量开始做减法
        # q的长度数组
        q_square = [0 for _ in q]
        q_square[0] = Matrix.square(q[0])
        q_length = [0 for _ in q]
        q_length[0] = math.sqrt(q_square[0])
        columns = len(self.__lines)
        # r 初始化为全部为0
        r = [[0 for _ in vector] for vector in self.__lines]
        # r对角线为1
        for i, vector in enumerate(r):
            vector[i] = 1
        # i 是列
        for i in range(1, columns):
            q[i] = self.__lines[i]
            # sum其实是一个向量
            for j in range(0, i):
                frac = Matrix.dot(q[j], self.__lines[i]) / q_square[j]
                # 把系数存储在r中
                r[i][j] = frac
                q[i] = Matrix.sub(q[i], Matrix.mul_num(q[j], frac))
            q_square[i] = Matrix.square(q[i])
            q_length[i] = math.sqrt(q_square[i])
        # Q单位化
        # R乘回来
        for i, vector in enumerate(q):
            for j, e in enumerate(vector):
                vector[j] = e / q_length[i]
                # vector[j] = round(vector[j] * 10000) / 10000.0
        # i 代表向量，也就是列
        for i, vector in enumerate(r):
            # j才代表行
            for j, e in enumerate(vector):
                # 每一行乘以长度
                vector[j] = e * q_length[j]
                # vector[j] = round(vector[j] * 10000) / 10000.0
        return Matrix(q), Matrix(r)

    def eigen_values(self):
        q, r = self.gram_schmidt()
        a = self
        i = 0
        while not a.is_upper_triangle():
            a = r * q
            q, r = a.gram_schmidt()
            i += 1
            print(f"A_{i}=", a.to_latex(),"\\\\")
        return a.__eigen_values()

    def __eigen_values(self):
        eigen_values = [0 for _ in self.__lines]
        for i, vector in enumerate(self.__lines):
            # 如果左边不是0，那么求过了
            if i > 0 and abs(self.__lines[i - 1][i]) > MIN_VALUE:
                continue
            # 如果是下一个元素不是0
            if i < len(self.__lines) - 1 and abs(vector[i + 1]) > MIN_VALUE:
                # 按求根公式计算
                a = vector[i]
                b = self.__lines[i + 1][i]
                c = vector[i + 1]
                d = self.__lines[i + 1][i + 1]
                root = a * a + d * d - 2 * a * d + 4 * b * c
                if root > 0:
                    sqrt = math.sqrt(root)
                else:
                    sqrt = cmath.sqrt(root)
                a_d = a + d
                eigen_values[i] = round_num((a_d + sqrt) / 2)
                eigen_values[i + 1] = round_num((a_d - sqrt) / 2)
            else:
                eigen_values[i] = round(vector[i] * 1000) / 1000.0

        return eigen_values

    def is_upper_triangle(self):
        for i in range(len(self.__lines)):
            vector = self.__lines[i]
            for j in range(i + 1, len(vector)):
                e = vector[j]
                if abs(e) > MIN_VALUE:
                    if j > i + 1:
                        return False
                    # 这时候再判断是不是拟三角矩阵
                    # 如果左上角是不是0
                    if i > 0 and abs(self.__lines[i - 1][j - 1]) > MIN_VALUE:
                        return False
                    # 如果右下角是不是0
                    if i < len(self.__lines) - 1 and abs(self.__lines[i + 1][j + 1]) > MIN_VALUE:
                        return False
        return True

    def __mul__(self, other):
        if len(self.__lines) != len(other.__lines[0]):
            raise Exception("矩阵A列数%d != 矩阵B的行数%d" % (len(self.__lines[0]), len(other.__lines)))
        # 弄一个m行n列的新矩阵
        m = len(self.__lines[0])
        n = len(other.__lines)
        p = len(other.__lines[0])

        result = [[0] * n for _ in range(0, m)]
        # i 代表self的行
        for i in range(0, m):
            # j 代表 B 矩阵的列
            for j in range(0, n):
                # 第一个矩阵的行 与第二个矩阵列的乘积和
                # k 代表 A矩阵的列和B矩阵的行
                for k in range(0, p):
                    mul = self.__lines[k][i] * other.__lines[j][k]
                    result[j][i] += mul
                # result[j][i] = round(result[j][i] * 10000) / 10000.0
        return Matrix(result)

    def __str__(self):
        s = ''
        rows = len(self.__lines)
        columns = len(self.__lines[0])
        for i in range(0, columns):
            s += "|"

            for j in range(0, rows):
                if j > 0:
                    s += '\t'
                s += str(round(self.__lines[j][i] * 1000) / 1000)

            s += "\t|\n"
        return s

    def __getitem__(self, item):
        return self.__lines[item]

    def to_latex(self):
        s = '\\begin{pmatrix}'
        rows = len(self.__lines)
        columns = len(self.__lines[0])
        for i in range(0, columns):
            for j in range(0, rows):
                if j > 0:
                    s += ' & '
                s += str(round(self.__lines[j][i] * 1000) / 1000)

            s += "\\\\\n"
        return s + "\end{pmatrix}\n"

    @staticmethod
    def square(vector):
        return Matrix.dot(vector, vector)

    @staticmethod
    def dot(vector1, vector2):
        l = 0
        for i in range(0, len(vector1)):
            l += vector1[i] * vector2[i]
        return l

    @staticmethod
    def sub(vector1, vector2):
        return [e - vector2[i] for i, e in enumerate(vector1)]

    @staticmethod
    def mul_num(vector, num):
        return [num * i for i in vector]