Sherman-Morrison公式及其应用

Sherman-Morrison公式

  Sherman-Morrison公式以 Jack Sherman 和 Winifred J. Morrison命名,在线性代数中,是求解逆矩阵的一种方法。本篇博客将介绍该公式及其应用,首先我们来看一下该公式的内容及其证明。
  (Sherman-Morrison公式)假设 ARn×n A ∈ R n × n 为可逆矩阵, u,vRn u , v ∈ R n 为列向量,则 A+uvT A + u v T 可逆当且仅当 1+vTA1u0 1 + v T A − 1 u ≠ 0 , 且当 A+uvT A + u v T 可逆时,该逆矩阵由以下公式给出:

(A+uvT)1=A1A1uvTA11+vTA1u. ( A + u v T ) − 1 = A − 1 − A − 1 u v T A − 1 1 + v T A − 1 u .

证明:
() ( ⇐ ) 1+vTA1u0 1 + v T A − 1 u ≠ 0 时,令 X=A+uvT,Y=A1A1uvTA11+vTA1u X = A + u v T , Y = A − 1 − A − 1 u v T A − 1 1 + v T A − 1 u ,则只需证明 XY=YX=I X Y = Y X = I 即可,其中 I I 为n阶单位矩阵。
XY=(A+uvT)(A1A1uvTA11+vTA1u)=AA1+uvTA1AA1uvTA1+uvTA1uvTA11+vTA1u=I+uvTA1uvTA1+uvTA1uvTA11+vTA1u=I+uvTA1u(1+vTA1u)vTA11+vTA1u=I+uvTA1uvTA1=I

同理,有 YX=I Y X = I .因此,当 1+vTA1u0 1 + v T A − 1 u ≠ 0 时, (A+uvT)1=A1A1uvTA11+vTA1u. ( A + u v T ) − 1 = A − 1 − A − 1 u v T A − 1 1 + v T A − 1 u .
() ( ⇒ ) u=0 u = 0 时,显然有 1+vTA1u=10. 1 + v T A − 1 u = 1 ≠ 0. u0 u ≠ 0 时,用反正法证明该命题成立。假设 A+uvT A + u v T 可逆,但 1+vTA1u=0 1 + v T A − 1 u = 0 ,则有
(A+uvT)A1u=u+u(vTA1u)=(1+vTA1u)u=0. ( A + u v T ) A − 1 u = u + u ( v T A − 1 u ) = ( 1 + v T A − 1 u ) u = 0.

因为 A+uvT A + u v T 可逆,故 A1 A − 1 u=0,又因为 A1 A − 1 可逆,故 u=0 u = 0 ,此与假设 u0 u ≠ 0 矛盾。因此,当 A+uvT A + u v T 可逆时,有 1+vTA1u0. 1 + v T A − 1 u ≠ 0.

Sherman-Morrison公式的应用

应用1: A=I A = I 时的Sherman-Morrison公式

  在Sherman-Morrison公式中,令 A=I A = I ,则有: I+uvT I + u v T 可逆当且仅当 1+vTu0 1 + v T u ≠ 0 , 且当 I+uvT I + u v T 可逆时,该逆矩阵由以下公式给出:

(I+uvT)1=IuvT1+vTu. ( I + u v T ) − 1 = I − u v T 1 + v T u .

  再令 v=u v = u ,则 1+uTu>0 1 + u T u > 0 , 因此, I+uuT I + u u T 可逆,且
(I+uuT)1=IuuT1+uTu. ( I + u u T ) − 1 = I − u u T 1 + u T u .

应用2:BFGS算法

  Sherman-Morrison公式在BFGS算法中的应用,可用来求解BFGS算法中近似Hessian矩阵的逆。本篇博客并不打算给出Sherman-Morrison公式在BFGS算法中的应用,将会再写篇博客介绍BFGS算法,到时再给出该公式的应用,并会在之后补上该博客的链接(因为笔者还没写)。

应用3:循环三对角线性方程组的求解

  本篇博客将详细讲述Sherman-Morrison公式在循环三对角线性方程组的求解中的应用。
  首先给给出理论知识介绍部分。
  对于 ARn×n A ∈ R n × n 为可逆矩阵, u,vRn u , v ∈ R n 为列向量, 1+vTA1u0 1 + v T A − 1 u ≠ 0 ,需要求解方程 (A+uvT)x=b. ( A + u v T ) x = b . 对此,我们可以先求解以下两个方程:

Ay=b,Az=u A y = b , A z = u
.
然后令 x=yvTy1+vTzz x = y − v T y 1 + v T z z ,该解即为原方程的解,验证如下:
(A+uvT)x=(A+uvT)(yvTy1+vTzz)=Ay+uvTyvTy1+vTzAzvTy1+vTzuvTz=b+uvTyvTyu+vTyuvTz1+vTz=b+uvTy(1+vTz)vTyu1+vTz=b+uvTyuvTy=b ( A + u v T ) x = ( A + u v T ) ( y − v T y 1 + v T z z ) = A y + u v T y − v T y 1 + v T z A z − v T y 1 + v T z u v T z = b + u v T y − v T y u + v T y u v T z 1 + v T z = b + u v T y − ( 1 + v T z ) v T y u 1 + v T z = b + u v T y − u v T y = b

  这样将原方程 (A+uvT)x=b ( A + u v T ) x = b 分成两个方程,可以在一定程度上简化原方程。接下来,我们将介绍循环三对角线性方程组的求解。
  所谓循环三对角线性方程组,指的是系数矩阵为如下形式:
A=b1a200cnc1b200c2an20bn2an1000cn2bn1ana100cn1bn A = [ b 1 c 1 0 ⋯ 0 a 1 a 2 b 2 c 2 0 ⋮ 0 0 ⋱ ⋱ ⋱ 0 ⋮ ⋮ ⋮ a n − 2 b n − 2 c n − 2 0 0 ⋯ ⋯ a n − 1 b n − 1 c n − 1 c n 0 ⋯ 0 a n b n ]

循环三对角线性方程组可写成 Ax=d A x = d ,其中 d=(d1,d2,...,dn)T. d = ( d 1 , d 2 , . . . , d n ) T .
  对于此方程的求解,我们令 u=(γ,0,0,...,cn)T,v=(1,0,0,...,a1γ)T u = ( γ , 0 , 0 , . . . , c n ) T , v = ( 1 , 0 , 0 , . . . , a 1 γ ) T , 且 A=A+uvT A = A ′ + u v T ,其中 A A ′ 如下:
A=b1γa2000c1b200c2an20bn2an1000cn2bn1an000cn1bna1cnγ A ′ = [ b 1 − γ c 1 0 ⋯ 0 0 a 2 b 2 c 2 0 ⋮ 0 0 ⋱ ⋱ ⋱ 0 ⋮ ⋮ ⋮ a n − 2 b n − 2 c n − 2 0 0 ⋯ ⋯ a n − 1 b n − 1 c n − 1 0 0 ⋯ 0 a n b n − a 1 c n γ ]

A A ′ 为三对角矩阵。根据以上的理论知识,我们只需要求解以下两个方程
Ay=d,Az=u A ′ y = d , A ′ z = u ,

然后,就能根据 y,z y , z 求出 x x .而以上两个方程为三对角线性方程组,可以用追赶法(或Thomas法)求解,具体算法可以参考博客:三对角线性方程组(tridiagonal systems of equations)的求解 。
  综上,我们利用Sherman-Morrison公式的思想,可以将循环三对角线性方程组转化为三对角线性方程组求解。我们将会在下面给出该算法的Python语言实现。

Python实现

  我们要解的循环三对角线性方程组如下:

[4100214100014100014130014][x1x2x3x4x5]=[76668]

  用Python实现解该方程的Python完整代码如下:

# use Sherman-Morrison Formula and Thomas Method to solve cyclic tridiagonal linear equation

import numpy as np

# Thomas Method for soling tridiagonal linear equation Ax=d
# parameter: a,b,c,d are list-like of same length
# b: main diagonal of matrix A
# a: main diagonal below of matrix A
# c: main diagonal upper of matrix A
# d: Ax=d
# return: x(type=list), the solution of Ax=d
def TDMA(a,b,c,d):

    try:
        n = len(d)    # order of tridiagonal square matrix

        # use a,b,c to create matrix A, which is not necessary in the algorithm
        A = np.array([[0]*n]*n, dtype='float64')

        for i in range(n):
            A[i,i] = b[i]
            if i > 0:
                A[i, i-1] = a[i]
            if i < n-1:
                A[i, i+1] = c[i]

        # new list of modified coefficients
        c_1 = [0]*n
        d_1 = [0]*n

        for i in range(n):
            if not i:
                c_1[i] = c[i]/b[i]
                d_1[i] = d[i] / b[i]
            else:
                c_1[i] = c[i]/(b[i]-c_1[i-1]*a[i])
                d_1[i] = (d[i]-d_1[i-1]*a[i])/(b[i]-c_1[i-1] * a[i])

        # x: solution of Ax=d
        x = [0]*n

        for i in range(n-1, -1, -1):
            if i == n-1:
                x[i] = d_1[i]
            else:
                x[i] = d_1[i]-c_1[i]*x[i+1]

        x = [round(_, 4) for _ in x]

        return x

    except Exception as e:
        return e

# Sherman-Morrison Fomula for soling cyclic tridiagonal linear equation Ax=d
# parameter: a,b,c,d are list-like of same length
# b: main diagonal of matrix A
# a: main diagonal below of matrix A
# c: main diagonal upper of matrix A
# d: Ax=d
# return: x(type=list), the solution of Ax=d
def Cyclic_Tridiagnoal_Linear_Equation(a,b,c,d):
    try:
        # use a,b,c to create cyclic tridiagonal matrix A
        n = len(d)
        A = np.array([[0] * n] * n, dtype='float64')

        for i in range(n):
            A[i, i] = b[i]
            if i > 0:
                A[i, i - 1] = a[i]
            if i < n - 1:
                A[i, i + 1] = c[i]
        A[0, n - 1] = a[0]
        A[n - 1, 0] = c[n - 1]

        gamma = 1 # gamma can be set freely
        u = [gamma] + [0] * (n - 2) + [c[n - 1]]
        v = [1] + [0] * (n - 2) + [a[0] / gamma]

        # modify the coefficient to form A'
        b[0] -= gamma
        b[n - 1] -= a[0] * c[n - 1] / gamma
        a[0] = 0
        c[n - 1] = 0

        # solve A'y=d, A'z=u by using Thomas Method
        y = np.array(TDMA(a, b, c, d))
        z = np.array(TDMA(a, b, c, u))

        # use y and z to calculate x
        # x = y-(v·y)/(1+v·z) *z
        # x is the solution of Ax=d
        x = y - (np.dot(np.array(v), y)) / (1 + np.dot(np.array(v), z)) * z

        x = [round(_, 3) for _ in x]

        return x

    except Exception as e:
        return e

def main():
    '''
       equation:
       A = [[4,1,0,0,2],
            [1,4,1,0,0],
            [0,1,4,1,0],
            [0,0,1,4,1],
            [3,0,0,1,4]]
       d = [7,6,6,6,8]
       solution x should be [1,1,1,1,1]
    '''

    a = [2, 1, 1, 1, 1]
    b = [4, 4, 4, 4, 4]
    c = [1, 1, 1, 1, 3]
    d = [7, 6, 6, 6, 8]

    x = Cyclic_Tridiagnoal_Linear_Equation(a,b,c,d)
    print('The solution is %s'%x)

main()

输出结果如下:

The solution is [1.0, 1.0, 1.0, 1.0, 1.0]

参考文献

  1. https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula
  2. http://wwwmayr.in.tum.de/konferenzen/Jass09/courses/2/Soldatenko_paper.pdf
  3. https://scicomp.stackexchange.com/questions/10137/solving-system-of-linear-equations-with-cyclic-tridiagonal-matrix
  4. https://blog.csdn.net/jclian91/article/details/80251244
评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值