CUDA Occupancy and Achieved Occupancy

From the official documentation:

Occupancy is defined as the ratio of active warps on an SM to the maximum number of active warps supported by the SM.

That is, Occupancy = active warps / maximum number of active warps supported by the SM.

Low occupancy results in poor instruction issue efficiency, because there are not enough eligible warps to hide latency between dependent instructions.

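In other words, "active warps" here means the warps resident on the SM that the scheduler can switch to at any time, not just the warps that are executing on the processor at a given instant.

With more active warps, the scheduler can cover for warps that are stalled (on memory access, for example) by issuing instructions from other warps, so the processor stays busy.

Each SM model has hardware limits. The following factors constrain Theoretical Occupancy: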

1. Warps per SM;

2. Blocks per SM; (several blocks can be resident on the same SM at the same time)

3. Registers per SM;

4. Shared memory per SM;

5. Registers & Shared memory used per Block;
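As a rough illustration of how these limits combine, the sketch below computes a theoretical occupancy in plain host-side C++. The struct names, the function names, and the limit values used in the note afterwards are hypothetical and not taken from any specific GPU. The model also ignores register and shared memory allocation granularity, so real tools may report slightly different numbers.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-SM hardware limits; real values depend on the GPU architecture.
struct SmLimits {
    int max_warps_per_sm;
    int max_blocks_per_sm;
    int registers_per_sm;
    int shared_mem_per_sm;  // bytes
    int warp_size;
};

// Resources requested by one kernel launch configuration.
struct KernelConfig {
    int threads_per_block;
    int registers_per_thread;
    int shared_mem_per_block;  // bytes
};

// Warps needed by one block, rounding a partial warp up to a full one.
inline int warps_per_block(const SmLimits& sm, const KernelConfig& k) {
    assert(k.threads_per_block > 0 && sm.warp_size > 0);
    return (k.threads_per_block + sm.warp_size - 1) / sm.warp_size;
}

// Blocks that can be resident on one SM at once: the tightest of the four limits.
inline int resident_blocks_per_sm(const SmLimits& sm, const KernelConfig& k) {
    const int wpb = warps_per_block(sm, k);
    const int by_warps = sm.max_warps_per_sm / wpb;
    const int by_blocks = sm.max_blocks_per_sm;
    const int regs_per_block = k.registers_per_thread * wpb * sm.warp_size;
    // A resource the kernel does not use imposes no extra limit.
    const int by_regs = regs_per_block > 0 ? sm.registers_per_sm / regs_per_block : by_blocks;
    const int by_smem = k.shared_mem_per_block > 0
                            ? sm.shared_mem_per_sm / k.shared_mem_per_block
                            : by_blocks;
    return std::min({by_warps, by_blocks, by_regs, by_smem});
}

// Theoretical occupancy = resident warps / maximum warps supported by the SM.
inline double theoretical_occupancy(const SmLimits& sm, const KernelConfig& k) {
    const int active_warps = resident_blocks_per_sm(sm, k) * warps_per_block(sm, k);
    return static_cast<double>(active_warps) / sm.max_warps_per_sm;
}
```

With assumed limits of 64 warps, 32 blocks, 65536 registers, and 48 KB of shared memory per SM, a kernel using 256 threads per block and 64 registers per thread is limited by registers to 4 resident blocks, which gives 32 active warps and a theoretical occupancy of 0.5. In a real CUDA program, the runtime function cudaOccupancyMaxActiveBlocksPerMultiprocessor performs this kind of calculation against the actual device.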

 

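Achieved Occupancy: the occupancy actually observed while the kernel runs, averaged over time and across all SMs.

Reasons it falls below Theoretical Occupancy:

1. Unbalanced execution time within a block: some warps finish early and others finish late; (tail effect)

2. Unbalanced execution time across blocks: some blocks finish early and others finish late;

3. Too few blocks are launched;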
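The effect of these three causes can be sketched with a toy model in plain C++. It assumes a single SM on which every warp starts at time 0 and no new warps arrive once one finishes. The function name and the simplifications are illustrative assumptions, not how a profiler measures achieved occupancy.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model: warp i is active from t = 0 until t = durations[i] on one SM.
// Achieved occupancy = time-averaged number of active warps / max warps per SM.
inline double achieved_occupancy(const std::vector<double>& durations, int max_warps_per_sm) {
    assert(!durations.empty() && max_warps_per_sm > 0);
    // The kernel ends when the slowest warp finishes.
    const double kernel_time = *std::max_element(durations.begin(), durations.end());
    assert(kernel_time > 0.0);
    // Total warp-time actually spent active.
    double warp_time = 0.0;
    for (double d : durations) {
        warp_time += d;
    }
    return warp_time / (kernel_time * max_warps_per_sm);
}
```

On an SM that supports 4 warps, four warps that each run for 10 time units give 1.0. If two of them finish after 5 units, the result drops to 0.75, which corresponds to the imbalance cases. Launching only two warps of 10 units each gives 0.5, which corresponds to launching too few blocks.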

 
