According to the MATLAB documentation, sparse matrices can reduce memory use and improve computational efficiency. But they must be used correctly, or the effect is the opposite. Below is an article on sparse-matrix techniques found on the MathWorks site. The article is long, but its core idea is simple: assemble your data first and build the sparse matrix in a single call to the sparse function, rather than creating a large sparse matrix A up front and then filling it in with A(i,j) = ... assignments.
I tried this, and the effect is dramatic. MATLAB's built-in bvp4c makes heavy use of A(i,j)-style access to sparse matrices. After modifying it as the article suggests, for a system of 156 equations:
Before the change:

Function Name       Calls   Total Time   Self Time
bvp4c               1       211.973 s    6.498 s
bvp4c>colloc_Jac    9       203.117 s    195.789 s

After the change:

Function Name       Calls   Total Time   Self Time
bvp4c2              1       24.587 s     6.903 s
bvp4c2>colloc_Jac   9       15.355 s     4.133 s
Creating Sparse Finite-Element Matrices in MATLAB
I'm pleased to introduce Tim Davis as this week's guest blogger.
Tim is a professor at the University of Florida, and is the author
or co-author of many of our sparse matrix functions (lu, chol, much
of sparse backslash, ordering methods such as amd and colamd, and
other functions such as etree and symbfact). He is also the author
of a recent book, Direct Methods for Sparse
Linear Systems, published by SIAM, where more details of MATLAB
sparse matrices are discussed ( http://www.cise.ufl.edu/~davis
).
MATLAB is Slow? Only If You Use It Incorrectly
From time to time, I hear comments such as "MATLAB is slow for
large finite-element problems." When I look at the details, the
problem is typically due to an overuse of A(i,j)= ... when
creating the sparse matrix (assembling the finite elements). This
can be seen in typical user code, in MATLAB code in books on the
topic, and even in MATLAB itself. The problem is widespread. MATLAB
can be very fast for finite-element problems, but not if it's used
incorrectly.
How Not to Create a Finite-Element Matrix
A good example of what not to do can be found in the
wathen.m function, in MATLAB.
A=gallery('wathen',200,200) takes a huge amount of time; a
very minor modification cuts the run time drastically. I'm not
intending to single out this one function for critique; this is a
very common issue that I see over and over again. This function is
built into MATLAB, which makes it an accessible example. It was
first written when MATLAB did not support sparse matrices, and was
modified only slightly to exploit sparsity. It was never meant for
generating large sparse finite-element matrices. You can see the
entire function with the command:
type private/wathen.m
Below is an excerpt of the relevant parts of wathen.m.
The function wathen1.m creates a finite-element matrix of
an nx-by-ny mesh. Each i loop creates a single 8-by-8
finite-element matrix, and adds it into A.
type wathen1
tic
A = wathen1 (200,200) ;
toc
function A = wathen1 (nx,ny)
rand('state',0)
e1 = [6 -6 2 -8;-6 32 -6 20;2 -6 6 -6;-8 20 -6 32];
e2 = [3 -8 2 -6;-8 16 -8 20;2 -8 3 -8;-6 20 -8 16];
e = [e1 e2; e2' e1]/45;
n = 3*nx*ny+2*nx+2*ny+1;
A = sparse(n,n);
RHO = 100*rand(nx,ny);
nn = zeros(8,1);
for j=1:ny
    for i=1:nx
        nn(1) = 3*j*nx+2*i+2*j+1;
        nn(2) = nn(1)-1;
        nn(3) = nn(2)-1;
        nn(4) = (3*j-1)*nx+2*j+i-1;
        nn(5) = 3*(j-1)*nx+2*i+2*j-3;
        nn(6) = nn(5)+1;
        nn(7) = nn(6)+1;
        nn(8) = nn(4)+1;
        em = e*RHO(i,j);
        for krow=1:8
            for kcol=1:8
                A(nn(krow),nn(kcol)) = A(nn(krow),nn(kcol))+em(krow,kcol);
            end
        end
    end
end
Elapsed time is 305.832709 seconds.
What's Wrong with It?
The above code is fine for generating a modest sized matrix, but
the A(i,j) = ... statement is quite slow when A
is large and sparse, particularly when i and j
are also scalars. The inner two for-loops can be vectorized so that
i and j are vectors of length 8. Each A(i,j)
= ... statement is assembling an entire finite-element matrix
into A. However, this leads to very little improvement in
run time.
type wathen1b
tic
A1b = wathen1b (200,200) ;
toc
function A = wathen1b (nx,ny)
rand('state',0)
e1 = [6 -6 2 -8;-6 32 -6 20;2 -6 6 -6;-8 20 -6 32];
e2 = [3 -8 2 -6;-8 16 -8 20;2 -8 3 -8;-6 20 -8 16];
e = [e1 e2; e2' e1]/45;
n = 3*nx*ny+2*nx+2*ny+1;
A = sparse(n,n);
RHO = 100*rand(nx,ny);
nn = zeros(8,1);
for j=1:ny
    for i=1:nx
        nn(1) = 3*j*nx+2*i+2*j+1;
        nn(2) = nn(1)-1;
        nn(3) = nn(2)-1;
        nn(4) = (3*j-1)*nx+2*j+i-1;
        nn(5) = 3*(j-1)*nx+2*i+2*j-3;
        nn(6) = nn(5)+1;
        nn(7) = nn(6)+1;
        nn(8) = nn(4)+1;
        em = e*RHO(i,j);
        A (nn,nn) = A (nn,nn) + em ;
    end
end
Elapsed time is 282.945593 seconds.
disp (norm (A-A1b,1))
0
How MATLAB Stores Sparse Matrices
To understand why the above examples are so slow, you need to
understand how MATLAB stores its sparse matrices. An n-by-n MATLAB
sparse matrix is stored as three arrays; I'll call them p,
i, and x. These three arrays are not directly
accessible from M, but they can be accessed by a mexFunction. The
nonzero entries in column j are stored in
i(p(j):p(j+1)-1) and x(p(j):p(j+1)-1), where
x holds the numerical values and i holds the
corresponding row indices. Below is a very small example. First, I
create a full matrix and convert it into a sparse one. This
is only so that you can easily see the matrix C and how
it's stored in sparse form. You should never create a sparse matrix
this way, except for tiny examples.
C = [
4.5 0.0 3.2 0.0
3.1 2.9 0.0 0.9
0.0 1.7 3.0 0.0
3.5 0.4 0.0 1.0 ] ;
C = sparse (C)
C =
(1,1) 4.5000
(2,1) 3.1000
(4,1) 3.5000
(2,2) 2.9000
(3,2) 1.7000
(4,2) 0.4000
(1,3) 3.2000
(3,3) 3.0000
(2,4) 0.9000
(4,4) 1.0000
Notice that the nonzero entries in C are stored in
column order, with sorted row indices. The internal p,
i, and x arrays can be reconstructed as follows.
The find(C) statement returns a list of "triplets," where
the kth triplet is i(k), j(k), and x(k).
This specifies that C(i(k),j(k)) is equal to
x(k). Next, find(diff(...)) constructs the column
pointer array p (this only works if there are no all-zero
columns in the matrix).
[i j x] = find (C) ;
n = size(C,2) ;
p = find (diff ([0 ; j ; n+1])) ;
for col = 1:n
    fprintf ('column %d:\n k row index value\n', col) ;
    disp ([(p(col):p(col+1)-1)' i(p(col):p(col+1)-1) x(p(col):p(col+1)-1)])
end
column 1:
k row index value
1.0000 1.0000 4.5000
2.0000 2.0000 3.1000
3.0000 4.0000 3.5000
column 2:
k row index value
4.0000 2.0000 2.9000
5.0000 3.0000 1.7000
6.0000 4.0000 0.4000
column 3:
k row index value
7.0000 1.0000 3.2000
8.0000 3.0000 3.0000
column 4:
k row index value
9.0000 2.0000 0.9000
10.0000 4.0000 1.0000
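As a cross-check in another environment (an illustration, not part of the original article): SciPy's csc_matrix uses the same compressed-column layout the article describes, exposing the three internal arrays as indptr (p), indices (i), and data (x), with 0-based indexing instead of MATLAB's 1-based display above.

```python
import numpy as np
from scipy import sparse

# The same 4-by-4 matrix C as above, stored in compressed sparse
# column (CSC) form; SciPy exposes the three internal arrays directly.
C = np.array([
    [4.5, 0.0, 3.2, 0.0],
    [3.1, 2.9, 0.0, 0.9],
    [0.0, 1.7, 3.0, 0.0],
    [3.5, 0.4, 0.0, 1.0],
])
Cs = sparse.csc_matrix(C)
print(Cs.indptr)   # column pointers p (0-based)
print(Cs.indices)  # row indices i, grouped by column
print(Cs.data)     # values x, in column order
```

The entries of column j live in indices[indptr[j]:indptr[j+1]] and data[indptr[j]:indptr[j+1]], exactly as in the MATLAB description.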
Now consider what happens when one entry is added to
C:
C(3,1) = 42 ;
[i j x] = find (C) ;
n = size(C,2) ;
p = find (diff ([0 ; j ; n+1])) ;
for col = 1:n
    fprintf ('column %d:\n k row index value\n', col) ;
    disp ([(p(col):p(col+1)-1)' i(p(col):p(col+1)-1) x(p(col):p(col+1)-1)])
end
column 1:
k row index value
1.0000 1.0000 4.5000
2.0000 2.0000 3.1000
3.0000 3.0000 42.0000
4.0000 4.0000 3.5000
column 2:
k row index value
5.0000 2.0000 2.9000
6.0000 3.0000 1.7000
7.0000 4.0000 0.4000
column 3:
k row index value
8.0000 1.0000 3.2000
9.0000 3.0000 3.0000
column 4:
k row index value
10.0000 2.0000 0.9000
11.0000 4.0000 1.0000
and you can see that nearly every entry in C has been
moved down by one in the i and x arrays. In
general, the single statement C(3,1)=42 takes time
proportional to the number of entries in the matrix. Thus, looping
nnz(A) times over the statement A(i,j)=A(i,j)+...
takes time proportional to nnz(A)^2.
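The same structural shift is easy to reproduce outside MATLAB; in this Python sketch (an illustration using SciPy's csc_matrix, which shares the compressed-column layout described above), inserting one new entry forces the row-index and value arrays to be rebuilt, a pattern SciPy explicitly flags as expensive with a SparseEfficiencyWarning.

```python
import warnings
import numpy as np
from scipy import sparse

C = sparse.csc_matrix(np.array([
    [4.5, 0.0, 3.2, 0.0],
    [3.1, 2.9, 0.0, 0.9],
    [0.0, 1.7, 3.0, 0.0],
    [3.5, 0.4, 0.0, 1.0],
]))
assert C.nnz == 10

# Adding one entry in column 0 shifts every later entry in the
# indices/data arrays; SciPy warns that this access pattern is costly.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", sparse.SparseEfficiencyWarning)
    C[2, 0] = 42.0

assert C.nnz == 11  # all ten original entries were moved to make room
```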
A Better Way to Create a Finite-Element Matrix
The version below is only slightly different. It could be
improved, but I left it nearly the same to illustrate how simple it
is to write fast MATLAB code to solve this problem, via a minor
tweak. The idea is to create a list of triplets, and let MATLAB
convert them into a sparse matrix all at once. If there are
duplicates (which a finite-element matrix always has) the
duplicates are summed, which is exactly what you want when
assembling a finite-element matrix. In MATLAB 7.3 (R2006b),
sparse uses quicksort, which takes
nnz(A)*log(nnz(A)) time. This is slower than it could be
(a linear-time bucket sort can be used, taking essentially
nnz(A) time). However, it's still much faster than
nnz(A)^2. For this matrix, nnz(A) is about 1.9
million.
type wathen2.m
tic
A2 = wathen2 (200,200) ;
toc
function A = wathen2 (nx,ny)
rand('state',0)
e1 = [6 -6 2 -8;-6 32 -6 20;2 -6 6 -6;-8 20 -6 32];
e2 = [3 -8 2 -6;-8 16 -8 20;2 -8 3 -8;-6 20 -8 16];
e = [e1 e2; e2' e1]/45;
n = 3*nx*ny+2*nx+2*ny+1;
ntriplets = nx*ny*64 ;
I = zeros (ntriplets, 1) ;
J = zeros (ntriplets, 1) ;
X = zeros (ntriplets, 1) ;
ntriplets = 0 ;
RHO = 100*rand(nx,ny);
nn = zeros(8,1);
for j=1:ny
    for i=1:nx
        nn(1) = 3*j*nx+2*i+2*j+1;
        nn(2) = nn(1)-1;
        nn(3) = nn(2)-1;
        nn(4) = (3*j-1)*nx+2*j+i-1;
        nn(5) = 3*(j-1)*nx+2*i+2*j-3;
        nn(6) = nn(5)+1;
        nn(7) = nn(6)+1;
        nn(8) = nn(4)+1;
        em = e*RHO(i,j);
        for krow=1:8
            for kcol=1:8
                ntriplets = ntriplets + 1 ;
                I (ntriplets) = nn (krow) ;
                J (ntriplets) = nn (kcol) ;
                X (ntriplets) = em (krow,kcol) ;
            end
        end
    end
end
A = sparse (I,J,X,n,n) ;
Elapsed time is 1.594073 seconds.
disp (norm (A-A2,1))
1.4211e-014
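The triplet-then-convert pattern exists in other environments too. In SciPy (shown here as an illustration with made-up values, not the Wathen matrix), coo_matrix plays the role of the triplet list and, like MATLAB's sparse(I,J,X,n,n), sums duplicate entries on conversion to compressed form:

```python
import numpy as np
from scipy import sparse

# Triplet lists (I, J, X) with a duplicate entry at position (1, 1);
# converting COO -> CSC sums the duplicates, which is exactly the
# behavior finite-element assembly relies on.
I = np.array([0, 1, 1, 2])
J = np.array([0, 1, 1, 2])
X = np.array([1.0, 2.0, 3.0, 4.0])
A = sparse.coo_matrix((X, (I, J)), shape=(3, 3)).tocsc()
assert A[1, 1] == 5.0   # 2.0 + 3.0, summed during conversion
```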
If you do not know how many entries your matrix will have, you
may not be able to preallocate the I, J, and
X arrays, as done in wathen2.m. In that case,
start them at a reasonable size (anything larger than zero will do)
and add this code to the innermost loop, just after
ntriplets is incremented:
len = length (X) ;
if (ntriplets > len)
    I (2*len) = 0 ;
    J (2*len) = 0 ;
    X (2*len) = 0 ;
end
and when done, use
sparse(I(1:ntriplets),J(1:ntriplets),X(1:ntriplets),n,n).
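The doubling strategy above can be sketched in Python as well (append_triplet is a hypothetical helper written for this illustration, not a library function): each time the buffers fill, their length doubles, so appending nnz entries costs amortized constant time each.

```python
import numpy as np

def append_triplet(I, J, X, n, i, j, x):
    """Store triplet (i, j, x) at position n, doubling the buffers
    when full (mirrors the MATLAB I(2*len) = 0 growth trick)."""
    if n == len(X):
        I = np.concatenate([I, np.zeros_like(I)])
        J = np.concatenate([J, np.zeros_like(J)])
        X = np.concatenate([X, np.zeros_like(X)])
    I[n], J[n], X[n] = i, j, x
    return I, J, X, n + 1

# Start tiny; the buffers grow 1 -> 2 -> 4 -> 8 as entries arrive.
I = np.zeros(1); J = np.zeros(1); X = np.zeros(1)
n = 0
for k in range(5):
    I, J, X, n = append_triplet(I, J, X, n, k, k, float(k))
# Only the first n positions are meaningful, just as only
# I(1:ntriplets), J(1:ntriplets), X(1:ntriplets) are used above.
```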
Moral: Do Not Abuse A(i,j)=... for Sparse A; Use sparse Instead
Avoid statements such as
C(4,2) = C(4,2) + 42 ;
in a loop. Create a list of triplets (i,j,x) and use sparse instead. This advice holds for any sparse
matrix, not just finite-element ones.
Try a Faster Sparse Function
CHOLMOD includes a sparse2 mexFunction which is a
replacement for sparse. It uses a linear-time bucket sort.
The MATLAB 7.3 (R2006b) sparse accounts for about three-quarters
of the total run time of wathen2.m. For this matrix
sparse2 in CHOLMOD is about 10 times faster than the
MATLAB sparse. CHOLMOD can be found in the SuiteSparse package, in MATLAB Central.
If you would like to see a short and concise C mexFunction
implementation of the method used by sparse2, take a look
at CSparse, which was written for a concise, textbook-style
presentation. The cs_sparse mexFunction uses cs_compress.c
to convert the triplets to a compressed-column form of A',
cs_dupl.c to sum up duplicate entries, and then
cs_transpose.c to transpose the result (which also
sorts the columns).
When Fast is Not Fast Enough...
Even with this dramatic improvement in constructing the matrix
A, MATLAB could still use additional features for faster
construction of sparse finite-element matrices. Constructing the
matrix should be much faster than x=A\b, since chol does about
700 times more work than sparse for this matrix (1.3 billion
flops, versus 1.9 million nonzeros in A). The run times are not that different,
however:
tic
A = wathen2 (200,200) ;
toc
b = rand (size(A,1),1) ;
tic
x=A\b ;
toc
Elapsed time is 1.720397 seconds.
Elapsed time is 3.125791 seconds.
By Loren Shure