How to Set grid_size and block_size in a CUDA Kernel?


Author | 柳俊丞

Typically, code launches a CUDA kernel like this:

 
 
cuda_kernel<<<grid_size, block_size, 0, stream>>>(...)

Here cuda_kernel is the identifier of a __global__ function, and (...) holds the arguments passed to cuda_kernel; both follow ordinary C++ syntax. The <<<grid_size, block_size, 0, stream>>> part is a CUDA extension to C++ called the Execution Configuration (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#execution-configuration). The CUDA C++ Programming Guide (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#abstract, hereafter the Guide) describes it as follows:

The execution configuration is specified by inserting an expression of the form <<< Dg, Db, Ns, S >>> between the function name and the parenthesized argument list, where:

  • Dg is of type dim3 (see dim3) and specifies the dimension and size of the grid, such that Dg.x * Dg.y * Dg.z equals the number of blocks being launched;

  • Db is of type dim3 (see dim3) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;

  • Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in shared
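To make the roles of Dg, Db, Ns, and S concrete, here is a minimal sketch of a launch that sets all four. The kernel ScaleKernel, the helper CeilDiv, the element count of 1000, and the block size of 256 are hypothetical choices made for illustration; they do not come from the Guide. Error handling is kept to a single check for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of blocks needed to cover n elements.
constexpr int CeilDiv(int n, int block_size) {
  return (n + block_size - 1) / block_size;
}

// Hypothetical kernel: stages each element in dynamically allocated
// shared memory, then writes back twice its value.
__global__ void ScaleKernel(const float* in, float* out, int n) {
  extern __shared__ float tile[];  // its size comes from Ns at launch time
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    tile[threadIdx.x] = in[i];
    out[i] = 2.0f * tile[threadIdx.x];
  }
}

int main() {
  const int n = 1000;
  const dim3 block_size(256);                      // Db: threads per block
  const dim3 grid_size(CeilDiv(n, block_size.x));  // Dg: blocks in the grid
  const size_t shared_bytes = block_size.x * sizeof(float);  // Ns

  // Managed memory keeps the sketch short; cudaMalloc plus cudaMemcpy
  // would work as well.
  float* in = nullptr;
  float* out = nullptr;
  cudaMallocManaged(&in, n * sizeof(float));
  cudaMallocManaged(&out, n * sizeof(float));
  for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i);

  cudaStream_t stream;  // S: the stream the kernel is enqueued on
  cudaStreamCreate(&stream);

  ScaleKernel<<<grid_size, block_size, shared_bytes, stream>>>(in, out, n);
  const cudaError_t err = cudaGetLastError();  // reports an invalid launch configuration
  cudaStreamSynchronize(stream);

  // Verify every element on the host.
  bool ok = (err == cudaSuccess);
  for (int i = 0; ok && i < n; ++i) ok = (out[i] == 2.0f * i);
  std::printf("%s\n", ok ? "ok" : "mismatch");

  cudaStreamDestroy(stream);
  cudaFree(in);
  cudaFree(out);
  return ok ? 0 : 1;
}
```

Because the grid size is rounded up, the last block contains threads whose index falls past n, which is why the kernel guards its body with i < n.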
