GPU Architecture Basics: Concurrent Kernel Execution in the Fermi Architecture and Later

__DARK__

Categories: GPU Architecture, CUDA learning. Tags: gpu, concurrent

Article link: https://blog.csdn.net/dark5669/article/details/60764024


Fermi supports concurrent kernel execution, where different kernels of the same application
context can execute on the GPU at the same time. Concurrent kernel execution allows
programs that execute a number of small kernels to utilize the whole GPU. For example, a
PhysX program may invoke a fluids solver and a rigid body solver which, if executed
sequentially, would use only half of the available thread processors. On the Fermi architecture,
different kernels of the same CUDA context can execute concurrently, allowing maximum
utilization of GPU resources. Kernels from different application contexts can still run
sequentially with great efficiency thanks to the improved context switching performance.

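The Fermi architecture (compute capability 2.x) and later support concurrent execution of kernels on the same GPU. In other words, as long as the GPU has enough free resources (SMs, memory) and the kernels are launched in different streams, which I take to mean that there are no dependencies between them, the kernels can execute concurrently.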
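To make the "different streams" condition concrete, here is a minimal sketch in CUDA. The kernel names (`scaleKernel`, `offsetKernel`), the helper `runTwoStreams`, and the buffer sizes are hypothetical and chosen only for illustration. The sketch assumes device 0 is a CUDA-capable GPU. Launching into separate non-default streams only permits the two kernels to overlap; whether they actually do depends on the device and on the resources each kernel occupies.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical small kernels standing in for two independent pieces of work.
__global__ void scaleKernel(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

__global__ void offsetKernel(float* data, int n, float offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += offset;
}

// Launches the two kernels in separate streams and verifies their results.
bool runTwoStreams(int n) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return false;
    // Nonzero means the device can run multiple kernels at the same time.
    std::printf("concurrentKernels: %d\n", prop.concurrentKernels);

    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> hostA(n, 1.0f), hostB(n, 1.0f);
    float *devA = nullptr, *devB = nullptr;
    if (cudaMalloc(&devA, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&devB, bytes) != cudaSuccess) { cudaFree(devA); return false; }
    cudaMemcpy(devA, hostA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(devB, hostB.data(), bytes, cudaMemcpyHostToDevice);

    // Two independent streams: work in one has no ordering dependency
    // on work in the other, so the hardware is free to overlap them.
    cudaStream_t streamA, streamB;
    cudaStreamCreate(&streamA);
    cudaStreamCreate(&streamB);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads, 0, streamA>>>(devA, n, 2.0f);
    offsetKernel<<<blocks, threads, 0, streamB>>>(devB, n, 3.0f);

    // Check for launch errors, then wait for both streams to finish.
    bool ok = (cudaGetLastError() == cudaSuccess);
    ok = (cudaDeviceSynchronize() == cudaSuccess) && ok;

    cudaMemcpy(hostA.data(), devA, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(hostB.data(), devB, bytes, cudaMemcpyDeviceToHost);
    for (int i = 0; i < n && ok; ++i) {
        // 1.0 * 2.0 and 1.0 + 3.0 are exactly representable in float.
        ok = (hostA[i] == 2.0f) && (hostB[i] == 4.0f);
    }

    cudaStreamDestroy(streamA);
    cudaStreamDestroy(streamB);
    cudaFree(devA);
    cudaFree(devB);
    return ok;
}

int main() {
    // A small problem size, so neither kernel fills the whole GPU alone.
    bool ok = runTwoStreams(1 << 10);
    std::printf("results %s\n", ok ? "verified" : "FAILED");
    return ok ? 0 : 1;
}
```

The two kernels operate on separate buffers, so there is no data dependency between them. If both were launched in the default stream instead, they would be serialized in launch order. A profiler timeline is the usual way to confirm whether overlap actually occurred on a given device.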

The figure below, taken from the Fermi architecture whitepaper, is a good example.

[Figure: image not available in this copy]
