For an introduction to using GPU acceleration in MATLAB, see 邪恶的亡灵's blog post 《Matlab+GPU加速学习笔记》.
The problem I ran into: when the matrix dimensions are small, the GPU is actually slower than the CPU; only for large matrices does the GPU show a speedup. Test code:
clear all
clc
M = rand(5, 5);
% M = rand(50, 50);
% M = rand(500, 500);
% M = rand(5000, 5000);
% --- CPU: total time for 1000 elementwise multiplies ---
tt1 = 0;
for i = 1:1000
    tic
    N = M .* M;
    t1 = toc;
    tt1 = tt1 + t1;
end
tt1
% --- GPU: same loop on a gpuArray ---
% Caveats: the GPU copy is single precision while the CPU copy is double,
% and GPU calls return asynchronously, so tic/toc here mostly measures
% kernel-launch overhead rather than actual compute time.
M = gpuArray(single(M));
tt2 = 0;
for i = 1:1000
    tic
    N1 = M .* M;
    t2 = toc;
    tt2 = tt2 + t2;
end
tt2
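Because tic/toc does not synchronize with the GPU, a fairer benchmark uses timeit and gputimeit, which handle warm-up and synchronization automatically. A minimal sketch of the same comparison (the dimension list mirrors the sizes commented out above):

```matlab
% Sketch: compare CPU vs GPU elementwise multiply with proper timing.
% timeit/gputimeit run the function repeatedly and synchronize the GPU,
% so the measured time includes the actual kernel execution.
dims = [5 50 500 5000];
for d = dims
    M  = rand(d, 'single');        % single on both sides for a fair test
    Mg = gpuArray(M);
    tCpu = timeit(@() M .* M);
    tGpu = gputimeit(@() Mg .* Mg);
    fprintf('%5dx%-5d  cpu %.3g s   gpu %.3g s\n', d, d, tCpu, tGpu);
end
```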
Output (total run time over 1000 iterations, in seconds):

dim         cpu_time    gpu_time
5×5         0.00078     0.012756
10×10       0.000997    0.012787
50×50       0.001936    0.012238
100×100     0.009022    0.01211
300×300     0.025318    0.015521
500×500     0.064052    0.016059
800×800     0.824175    0.011845
1000×1000   2.190284    0.015683
This is awkward: my everyday data matrices are only around 128×128, so I never get to enjoy the GPU speedup.
What can be done?
Some related Q&A threads I found on MATLAB Central:
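One common workaround, sketched below under the assumption that there are many independent 128×128 matrices to process: batch them into a single N-D gpuArray so that one transfer and one kernel launch cover the whole batch, amortizing the per-call overhead that dominates at small sizes. Elementwise operations like .* work on N-D arrays directly (the batch size K here is made up for illustration):

```matlab
% Workaround sketch: batch K small matrices into one gpuArray call.
K = 256;
A = rand(128, 128, K, 'single');   % K independent 128x128 matrices
G = gpuArray(A);                   % one host-to-device transfer, not K
tic
H = G .* G;                        % one kernel over all K pages at once
wait(gpuDevice)                    % synchronize before reading the timer
toc
B = gather(H);                     % one device-to-host copy for all results
```

For per-page matrix operations that are not elementwise (e.g. a matrix product per page), pagefun applies a function across the third dimension of a gpuArray in the same batched fashion.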
https://www.mathworks.com/matlabcentral/answers/122232-cpu-vs-gpu-is-it-reasonable?s_tid=srchtitle
https://www.mathworks.com/matlabcentral/answers/360211-gpu-time-slower-than-cpu-time-what-went-wrong-with-my-gpu-implementation?s_tid=srchtitle
GPUs are SIMD (Single Instruction, Multiple Data) devices that can do a lot of work quickly in total when they are given large amounts of data to work on. However, their clock cycle between instructions is slower than a CPU's, so if you do not work on enough data and try to use the GPU to replace a CPU through iteration, then the outcome is certain to be slower.
https://www.mathworks.com/matlabcentral/answers/269646-my-code-in-matlab-takes-longer-time-on-gpu-compare-with-cpu-my-gpu-device-is-geforce-980-ti-coul?s_tid=srchtitle
https://www.mathworks.com/matlabcentral/answers/264457-why-for-gpu-is-slower-than-cpu-for-this-code-is-it-because-of-sparsity-or-because-of-for-loop?s_tid=srchtitle