CUDA C Parallel Programming
JenKinJia
Grow bigger and stronger, and achieve new glory
Sum Reduction of Floating-Point Numbers in C
#include <stdio.h> #include <stdlib.h> #include <sys/time.h> double seconds() { struct timeval tp; struct timezone tzp; int i = gettimeofday(&tp, &tzp); return ((double)tp.tv_sec + (double)tp.tv_usec * 1.e-6); } ...
Original · 2022-05-19 20:12:42 · 304 views · 0 comments
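The excerpt above stops right after the timing helper, so the reduction itself is not visible. As a minimal sketch of what a host-side sum reduction over floats could look like (the names `sum_sequential` and `sum_pairwise` are hypothetical and not taken from the post), the block below pairs a timer in the spirit of `seconds()` with a plain loop and an in-place pairwise fold:

```c
#include <stddef.h>
#include <sys/time.h>

/* Wall-clock time in seconds, in the spirit of the seconds() helper
 * from the excerpt. The timezone argument is not needed, so NULL is passed. */
double seconds(void)
{
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return (double)tp.tv_sec + (double)tp.tv_usec * 1.e-6;
}

/* Reference implementation: accumulate left to right. */
float sum_sequential(const float *data, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* In-place pairwise reduction (hypothetical helper, not from the post).
 * Each pass folds the upper half of the active range onto the lower half,
 * so the active range halves until one element remains.
 * Note: this overwrites the contents of data. */
float sum_pairwise(float *data, size_t n)
{
    if (n == 0) {
        return 0.0f;
    }
    while (n > 1) {
        size_t half = n / 2;
        for (size_t i = 0; i < half; i++) {
            data[i] += data[i + half];
        }
        /* With an odd count, the last element has no partner; fold it into data[0]. */
        if (n % 2 != 0) {
            data[0] += data[n - 1];
        }
        n = half;
    }
    return data[0];
}
```

To time a run, one option is to read `seconds()` before and after the call and subtract the two values. Because floating-point addition is not associative, the two functions can differ in the last bits on large inputs, so comparing them with a tolerance is safer than testing for exact equality.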
CUDA C Programming: Nine Parallel Reduction Approaches That Avoid Branch Divergence
#include "../common/common.h" #include <cuda_runtime.h> #include <stdio.h> /* * This code implements the interleaved and neighbor-paired approaches to * parallel reduction in CUDA. For this example, the sum operation is used. A * variety o...
Original · 2022-05-16 14:33:20 · 451 views · 0 comments
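The kernels themselves are cut off in the excerpt, so the following is only a host-side sketch of the two indexing patterns the comment names, written in plain C so it runs without a GPU. The inner `tid` loop stands in for the threads of one block; the function names and the `int` element type are assumptions, not the post's code.

```c
#include <assert.h>
#include <stddef.h>

/* Neighbor-paired reduction with compacted thread indices.
 * At each stride, "thread" tid owns element 2 * stride * tid and adds the
 * neighbor that sits stride elements away. Mapping consecutive tids to the
 * active elements keeps the working threads contiguous, which is the idea
 * behind reducing branch divergence within a warp.
 * Requires n >= 1; overwrites data. */
int reduce_neighbored(int *data, size_t n)
{
    assert(n >= 1);
    for (size_t stride = 1; stride < n; stride *= 2) {
        for (size_t tid = 0; tid < n; tid++) {   /* one iteration per simulated thread */
            size_t idx = 2 * stride * tid;
            if (idx + stride < n) {
                data[idx] += data[idx + stride];
            }
        }
    }
    return data[0];
}

/* Interleaved-pair reduction.
 * The stride starts at half the array and halves each pass, so "thread" tid
 * always adds data[tid + stride] into data[tid].
 * This sketch requires n to be a power of two; overwrites data. */
int reduce_interleaved(int *data, size_t n)
{
    assert(n >= 1 && (n & (n - 1)) == 0);
    for (size_t stride = n / 2; stride > 0; stride /= 2) {
        for (size_t tid = 0; tid < stride; tid++) {
            data[tid] += data[tid + stride];
        }
    }
    return data[0];
}
```

In a real kernel each pass would be separated by `__syncthreads()` and the loop over `tid` would disappear, since every thread executes the body once. The serial version is only meant to make the access pattern easy to trace by hand.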
CUDA C Thread ID Calculation
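1D: threadIdx.x
2D: threadIdx.y * blockDim.x + threadIdx.x
3D: threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x
Here x is the innermost dimension, y is the second dimension, and z is the outermost dimension.
Original · 2022-05-13 10:43:10 · 258 views · 0 comments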
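The flattening rule can be checked on the host with ordinary integers. The sketch below uses a hypothetical `dim3_t` struct as a stand-in for CUDA's built-in `threadIdx` and `blockDim` values; it illustrates the arithmetic only and is not device code. In the 3D case each coordinate is weighted by the number of threads in the dimensions inside it, so the z term is scaled by `blockDim.y * blockDim.x` and the y term by `blockDim.x`.

```c
/* Hypothetical stand-in for CUDA's built-in three-component index types. */
typedef struct {
    unsigned x, y, z;
} dim3_t;

/* 1D block: the linear ID is just the x coordinate. */
unsigned thread_id_1d(dim3_t t)
{
    return t.x;
}

/* 2D block: each row (one y step) spans b.x threads. */
unsigned thread_id_2d(dim3_t t, dim3_t b)
{
    return t.y * b.x + t.x;
}

/* 3D block: each plane (one z step) spans b.y * b.x threads,
 * each row (one y step) spans b.x threads. */
unsigned thread_id_3d(dim3_t t, dim3_t b)
{
    return t.z * b.y * b.x + t.y * b.x + t.x;
}
```

For a block of shape (4, 3, 2), the thread at (1, 2, 1) gets ID 1 * 3 * 4 + 2 * 4 + 1 = 21, and the last thread at (3, 2, 1) gets ID 23, one less than the 24 threads in the block.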
CUDA C Parallel Programming: nvidia-smi Information
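1. nvidia-smi -L queries the device information. Output: 2. nvidia-smi -q -i 0 queries detailed information for the device. (The screenshot is incomplete.)
Original · 2022-05-12 15:43:03 · 121 views · 0 comments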
CUDA C Parallel Programming: Querying Device Information
#include <cuda_runtime.h> #include <cuda_runtime_api.h> #include <stdio.h> int main(int argc, char** argv){ printf("%s starting ...\n", argv[0]); int deviceCount = 0; cudaError error_id = cudaGetDeviceCount(&deviceCoun...
Original · 2022-05-12 15:30:57 · 339 views · 0 comments