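This problem mainly tests how indices are divided up inside a __global__ kernel and how to choose the block and thread sizes. I have only just started learning this myself, so I will keep working through it step by step.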
How Are Pixels Represented?
How to Convert Color to Black and White?
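As a rough illustration of the two questions above, here is a minimal host-side C++ sketch. It assumes each pixel is four 8-bit channels, mirroring CUDA's uchar4 (x = red, y = green, z = blue, w = alpha), and it uses the same weights as the kernel code in this post. The names Rgba and toGrey are made up for this example and are not part of the course code.

```cpp
// Hypothetical stand-in for CUDA's uchar4: four 8-bit channels per pixel.
struct Rgba {
    unsigned char x;  // red
    unsigned char y;  // green
    unsigned char z;  // blue
    unsigned char w;  // alpha (ignored for greyscale)
};

// Weighted sum of the colour channels; the float result is truncated
// when it is converted to unsigned char.
unsigned char toGrey(const Rgba& p) {
    float channelSum = .299f * p.x + .587f * p.y + .114f * p.z;
    return static_cast<unsigned char>(channelSum);
}
```

For example, a pure red pixel (255, 0, 0) gives 0.299 * 255 = 76.245, which is stored as 76.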
Modify the makefile to match your own setup:
CUDA_INCLUDEPATH=/usr/local/cuda-10.1/include
NVCC_OPTS=-O3 -arch=sm_52 -Xcompiler -Wall -Xcompiler -Wextra -m64
Fill in the missing code, using the provided CPU version as a reference:
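// One thread per pixel: flat 1D index into the image.
int offset = threadIdx.x + blockIdx.x * blockDim.x;
// Skip surplus threads (numRows and numCols are the kernel's parameters in the course template).
if (offset >= numRows * numCols) return;
uchar4 rgbaPixel = rgbaImage[offset];
float channelSum = .299f * rgbaPixel.x + .587f * rgbaPixel.y + .114f * rgbaPixel.z;
greyImage[offset] = channelSum; // the float is truncated to unsigned char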
const dim3 blockSize(128, 1, 1); // 128 threads per block
const dim3 gridSize( 512, 1, 1); // 512 blocks: 128 * 512 = 65536 threads, so at most 65536 pixels get processed
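The block and grid sizes above are fixed numbers. A common way to derive the grid size from the image instead is ceiling division, sketched below as plain host-side C++ so the arithmetic can be checked without a GPU. The helper names blocksNeeded and globalIndex are hypothetical, and the 640 x 480 image mentioned afterwards is only an example size, not the course image.

```cpp
#include <cassert>
#include <cstddef>

// Number of blocks needed so that blocks * threadsPerBlock >= numPixels
// (integer ceiling division).
std::size_t blocksNeeded(std::size_t numPixels, std::size_t threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (numPixels + threadsPerBlock - 1) / threadsPerBlock;
}

// Host-side mirror of the kernel's index expression:
// threadIdx.x + blockIdx.x * blockDim.x
std::size_t globalIndex(std::size_t blockIdx, std::size_t blockDim,
                        std::size_t threadIdx) {
    return threadIdx + blockIdx * blockDim;
}
```

With 128 threads per block, a 640 x 480 image (307200 pixels) would need 2400 blocks, while 512 blocks cover exactly 65536 pixels. When the pixel count is not a multiple of the block size, the last block has surplus threads, so the kernel should check its index against the pixel count before touching memory.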
The usual build complaint, this time an unused-parameter warning:
wlsh@wlsh-ThinkStation:~/Desktop/cs344-master/Problem Sets/Problem Set 1$ make
nvcc -c student_func.cu -O3 -arch=sm_52 -Xcompiler -Wall -Xcompiler -Wextra -m64
student_func.cu:59:49: warning: unused parameter ‘h_rgbaImage’ [-Wunused-parameter]
void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage, uchar4 * const d_
^
nvcc -o HW1 main.o student_func.o compare.o reference_calc.o -L /usr/lib -lopencv_core -lopencv_imgproc -lopencv_highgui -O3 -arch=sm_52 -Xcompiler -Wall -Xcompiler -Wextra -m64
Then I removed h_rgbaImage from this function everywhere it appears, three places in total (main.cpp and so on):
void your_rgba_to_greyscale(uchar4 * const d_rgbaImage,
unsigned char* const d_greyImage, size_t numRows, size_t numCols)
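Removing the parameter works, but it means touching every caller. A possible alternative, not the route taken in this post, is to keep the parameter in the signature and simply leave it unnamed, since GCC's -Wunused-parameter has nothing to report for a parameter without a name. The sketch below shows the idea on a hypothetical function, countPixels, which is not part of the course code.

```cpp
#include <cstddef>

// Hypothetical function, not from the course code: the first parameter is
// kept in the signature for callers but left unnamed, so the compiler has
// no unused name to warn about.
std::size_t countPixels(const unsigned char* const /* h_image */,
                        std::size_t numRows, std::size_t numCols) {
    return numRows * numCols;
}
```

Callers still pass the pointer as before; the function just ignores it.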
make
wlsh@wlsh-ThinkStation:~/Desktop/cs344-master/Problem Sets/Problem Set 1$ make
g++ -c main.cpp -O3 -Wall -Wextra -m64 -I /usr/local/cuda-10.1/include -I /usr/include
nvcc -c student_func.cu -O3 -arch=sm_52 -Xcompiler -Wall -Xcompiler -Wextra -m64
g++ -c compare.cpp -I /usr/include -O3 -Wall -Wextra -m64 -I /usr/local/cuda-10.1/include
g++ -c reference_calc.cpp -I /usr/include -O3 -Wall -Wextra -m64 -I /usr/local/cuda-10.1/include
nvcc -o HW1 main.o student_func.o compare.o reference_calc.o -L /usr/lib -lopencv_core -lopencv_imgproc -lopencv_highgui -O3 -arch=sm_52 -Xcompiler -Wall -Xcompiler -Wextra -m64
Success, time to eat!!
wlsh@wlsh-ThinkStation:~/Desktop/cs344-master/Problem Sets/Problem Set 1$ ./HW1 cinque_terre_small.jpg
Your code ran in: 0.024704 msecs.
Difference at pos 905
Reference: 252
GPU : 253