Cholesky分解针对的是对称非正定矩阵
主要目的包括将原矩阵强制转换为对称正定矩阵以及保证分解过程的稳定性
修正cholesky分解的做法
给定对称矩阵
A
∈
R
n
×
n
A \in R^{n\times n}
A∈Rn×n:计算出
δ
=
m
a
x
i
=
1
,
…
,
n
∣
A
i
i
∣
\delta=max_{i=1,\ldots,n}|A_{ii}|
δ=maxi=1,…,n∣Aii∣
ν
=
m
a
x
i
≠
j
∣
A
i
,
j
∣
\nu=max_{i\neq j}|A_{i,j}|
ν=maxi=j∣Ai,j∣
β
2
=
m
a
x
(
δ
,
ν
(
n
2
−
1
)
,
μ
)
\beta^2=max(\delta,\frac{\nu}{\sqrt(n^2-1)},\mu)
β2=max(δ,(n2−1)ν,μ)
1:
d
j
′
=
m
a
x
{
δ
,
∣
A
j
,
j
−
∑
r
=
1
j
−
1
C
j
,
r
2
d
r
−
1
∣
}
d_{j}^{'}=max\{\delta,|A_{j,j}-\sum_{r=1}^{j-1}C_{j,r}^{2}d_{r}^{-1}|\}
dj′=max{δ,∣Aj,j−∑r=1j−1Cj,r2dr−1∣}
2:
C
i
,
j
=
A
i
,
j
−
∑
r
=
1
j
−
1
L
j
,
r
C
i
,
r
,
i
≥
j
+
1
,
…
,
n
C_{i,j}=A_{i,j}-\sum_{r=1}^{j-1}L_{j,r}C_{i,r},i\geq j+1,\ldots,n
Ci,j=Ai,j−∑r=1j−1Lj,rCi,r,i≥j+1,…,n
3:
θ
j
=
m
a
x
{
∣
C
i
,
j
,
i
>
j
}
,
(
j
=
n
,
θ
j
=
0
)
\theta_{j}=max\{|C_{i,j},i>j\},(j=n,\theta_j=0)
θj=max{∣Ci,j,i>j},(j=n,θj=0)
4:
d
j
=
m
a
x
{
d
j
′
,
θ
j
2
β
2
}
d_j=max\{d_{j}^{'},\frac{\theta_{j}^{2}}{\beta^{2}}\}
dj=max{dj′,β2θj2}
5:
L
i
,
j
=
C
i
,
j
d
j
,
i
≥
j
+
1
,
…
,
n
L_{i,j}=\frac{C_{i,j}}{d_{j}},i\geq j+1,\ldots,n
Li,j=djCi,j,i≥j+1,…,n
上面步骤中,目的是为了输出矩阵L和矩阵D,其中L是单位下三角矩阵,D是对角矩阵,
D
=
d
i
a
g
{
d
1
,
d
2
,
…
,
d
n
}
D=diag\{d_1,d_2,\ldots,d_n\}
D=diag{d1,d2,…,dn}
修正以后的矩阵
A
x
i
u
=
L
D
L
T
A_{xiu}=LDL^{T}
Axiu=LDLT,可以证明
A
x
i
u
−
A
A_{xiu}-A
Axiu−A是一个对角矩阵。下面我们放代码:
#修正cholesky分解
def XC(A):
n = A.shape[0]
dia = np.diagonal(A.numpy())#以列表存储矩阵对角元素
delta = torch.tensor(max(abs(dia)))
#delta = (max(abs(dia))).clone().detach()
U = np.triu(A.numpy()) - np.diag(dia)#取出非对角上三角函数
#print(abs(U).max())
be = max(delta,abs(U).max()/(n**2 - 1))
beta = (1/be).clone().detach()#根据A给出beta^2分之一的值
C = torch.zeros_like(A);L = torch.eye(n,n);D = torch.zeros_like(A)
d = torch.zeros(n);theta = torch.zeros(n)
#----------------------
d[0] = max(delta,abs(A[0,0]))
for i in range(1,n):
C[i,0] = A[i,0]
theta[0] = max(abs(C[:,0]))#theta_0
D[0,0] = max(beta*theta[0]**2,d[0])
for i in range(1,n):
L[i,0] = C[i,0]/D[0,0]
#------------------------------
for j in range(1,n):
for i in range(j+1,n):
for r in range(j):
d[j] = A[j,j] - C[j,r]**2/d[r]
C[i,j] = A[i,j] - L[j,r]*C[i,r]
d[j] = max(delta,abs(d[j]))
theta[j] = max(abs(C[:,j]))
D[j,j] = max(d[j],beta*theta[j]**2)
for m in range(j + 1,n):
L[m,j] = C[m,j]/D[j,j]
#print(beta)
return L,D
A = torch.eye(3,3)
A[0,1] = 1.0;A[0,2] = 2.0
A[1,0] = 1;A[1,1] += 1e-7;A[1,2] = 3
A[2,0] = 2;A[2,1] = 3
print(A)
L,D = XC(A)
print(L@D@L.t() - A)