Original post:
I'm trying to speed up some MATLAB code that accesses elements a(i,j) of a large matrix a inside a for loop. In some parts, a single element is needed in five or more different calculations. In those cases, the code first assigns a(i,j) to another variable k.
I thought this assignment was unnecessary (given only five calculations), but to my surprise the opposite is true: the assignment actually makes the code run faster. Accessing an element of a large matrix five times is slower than copying it to a scalar variable and accessing that scalar five times.
These findings can be reproduced with a simple test:
r = 5e6;
i = 50;
j = 50000;
a = zeros(i,j);
% Loop 1: index the matrix five times per iteration
tic
for ii = 1:r
b = a(i,j)+a(i,j)+a(i,j)+a(i,j)+a(i,j);
end
toc
% Loop 2: copy the element to a scalar once, then reuse the scalar
tic
for ii = 1:r
k = a(i,j);
b = k+k+k+k+k;
end
toc
The first loop takes 3.5x as long as the second one.
Should MATLAB be that slow to access data from a matrix of ~20 MB?
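The ~20 MB figure can be checked directly. The snippet below is a minimal sketch that assumes the default double type (8 bytes per element) and uses the same dimensions as the test above; the variable name info is only illustrative.

```matlab
% Same dimensions as in the test: 50 rows, 50000 columns of doubles
a = zeros(50, 50000);

% whos returns a struct whose 'bytes' field holds the storage used by a
info = whos('a');

% 50 * 50000 elements * 8 bytes each
fprintf('%d elements, %d bytes (%.0f MB)\n', numel(a), info.bytes, info.bytes/1e6);
```

With 2,500,000 elements at 8 bytes each, this reports 20,000,000 bytes, which matches the estimate in the question.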
EDIT 1:
Following Cris Luengo's answer, there is evidently a problem with the MATLAB installation I'm working with (R2019a). The previous result was obtained by running an M-file.
The following code produces the output plot. It seems that there is no compilation at all.
r = 5e6;
i = 50;
j = 50000;
a = zeros(i,j);
aux_rgb = lines(2);
figure('Color','White','Name','Code with drawnow'); hold on;
legend('location','bestoutside'); ylim([0,1.05]);
xlabel('number of terms in summation');
ylabel('relative time spent');
h1 = animatedline(NaN,NaN,'LineWidth',2.5,'Color',aux_rgb(1,:),'DisplayName','a(i,j)');
h2 = animatedline(NaN,NaN,'LineWidth',2.5,'Color',aux_rgb(2,:),'DisplayName','k');
%%
n = 1;
t1 = tic;
for ii = 1:r
b = a(i,j);
end
t1 = toc(t1);
addpoints(h1,n,t1/t1);
t2 = tic;
for ii = 1:r
k = a(i,j);
b = k;
end
t2 = toc(t2);
addpoints(h2,n,t2/t1);
drawnow
%%
n = 2;
t1 = tic;
for ii = 1:r
b = a(i,j)+a(i,j);
end
t1 = toc(t1);
addpoints(h1,n,t1/t1);
t2 = tic;
for ii = 1:r
k = a(i,j);
b = k+k;
end
t2 = toc(t2);
addpoints(h2,n,t2/t1);
drawnow
%%
n = 3;
t1 = tic;
for ii = 1:r
b = a(i,j)+a(i,j)+a(i,j);
end
t1 = toc(t1);
addpoints(h1,n,t1/t1);
t2 = tic;
for ii = 1:r
k = a(i,j);
b = k+k+k;
end
t2 = toc(t2);
addpoints(h2,n,t2/t1);
drawnow
%%
n = 4;
t1 = tic;
for ii = 1:r
b = a(i,j)+a(i,j)+a(i,j)+a(i,j);
end
t1 = toc(t1);
addpoints(h1,n,t1/t1);
t2 = tic;
for ii = 1:r
k = a(i,j);
b = k+k+k+k;
end
t2 = toc(t2);
addpoints(h2,n,t2/t1);
drawnow
%%
n = 5;
t1 = tic;
for ii = 1:r
b = a(i,j)+a(i,j)+a(i,j)+a(i,j)+a(i,j);
end
t1 = toc(t1);
addpoints(h1,n,t1/t1);
t2 = tic;
for ii = 1:r
k = a(i,j);
b = k+k+k+k+k;
end
t2 = toc(t2);
addpoints(h2,n,t2/t1);
drawnow
EDIT 2:
Following Cris Luengo's answer again, this is the output obtained by running the code as a function M-file (not a script M-file). Now the compilation does its job.
# Answer 1
There is a difference between running code by copy-pasting it into the command line and running it inside a function. I see:
Elapsed time is 0.062195 seconds.
Elapsed time is 0.034381 seconds.
when copy-pasting into the command line, and
Elapsed time is 0.024922 seconds.
Elapsed time is 0.025392 seconds.
if I create a function M-file with that same code in it and run the function (a function M-file has a line starting with the function keyword as the first non-comment line). In this case, the two loops are equally fast (the first one might be slightly faster, but the difference is small).
You'll also notice that both loops run faster than when copy-pasted into the command line. When you run a function, it is compiled and then executed (this is referred to as "just-in-time compilation", or JIT). The compiler is able to optimize out the repeated indexing operations, effectively indexing only once.
On the command line, no compilation occurs, so the index location is computed five times and the value is retrieved five times.
Script M-files (M-files that do not start with the function keyword) are supposed to get JIT-compiled as well, but this doesn't always seem to happen, at least not with the same degree of optimization.
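To make the comparison as a function M-file, something along these lines could be saved as compare_indexing.m. This is only a sketch: the file name, the helper names loop_indexed and loop_scalar, and the variable names row and col are hypothetical, and timeit is swapped in for tic/toc because it repeats the measurement and returns a typical time. The absolute numbers will depend on the MATLAB release and machine.

```matlab
function [t1, t2] = compare_indexing
% COMPARE_INDEXING  Time repeated matrix indexing against a scalar copy.
% Hypothetical file name: save this as compare_indexing.m and call it
% from the command line so that the loops run inside functions.
r   = 5e6;      % number of loop iterations, as in the question
row = 50;       % same indices as i and j in the question
col = 50000;
a   = zeros(row, col);

% timeit calls each zero-argument handle several times
t1 = timeit(@() loop_indexed(a, row, col, r));
t2 = timeit(@() loop_scalar(a, row, col, r));

fprintf('indexed: %.4f s, scalar: %.4f s, ratio: %.2f\n', t1, t2, t1/t2);
end

function b = loop_indexed(a, row, col, r)
% Index the matrix five times per iteration
b = 0;
for ii = 1:r
    b = a(row,col) + a(row,col) + a(row,col) + a(row,col) + a(row,col);
end
end

function b = loop_scalar(a, row, col, r)
% Copy the element to a scalar once, then reuse the scalar
b = 0;
for ii = 1:r
    k = a(row,col);
    b = k + k + k + k + k;
end
end
```

Calling [t1, t2] = compare_indexing from the command line runs both loops inside functions. If the JIT behaves as described above, the printed ratio should be close to 1.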