关于CUDA中half-warp的简述

接触bank conflict时会引入 half-warp 这个概念,不过新的硬件架构中已经弱化这个概念了,并行单元都是 full-warp。具体的概述这里引用一个网站,个人认为适合新手来理解:

What's the mechanism of the warps and the banks in CUDA? - Stack Overflow

 网站中的结论这里直接附上,由于翻译可能表达不到位,这里写上原文:

 

Therefore a memory transaction is immediately visible across a full warp. As a result, memory requests are issued at the per-warp level, rather than per-half-warp. However, a full memory request can only retrieve 128 bytes at a time. Therefore, for data sizes larger than 32 bits per thread per transaction, the memory controller may still break the request down into a half-warp size.

My view is that, especially for a beginner, it's not necessary to have a detailed understanding of half-warp. It's generally sufficient to understand that it refers to a group of 16 threads executing together and it has implications for memory requests.

  • 5
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值