《Learning CUDA Programming》读书笔记(一)

 

GPU擅长执行可并行的代码(data-parallel),CPU用来执行其他对延迟敏感的串行代码;

GPU:throughput-bound(以提升吞吐量为目标);  CPU:latency-bound(以减少延迟为目标)

GPU属于SIMT,不是SIMD

CPU的时钟周期高,为了跟上处理器的步伐,必须加快访存速度,需要:1. 数据cache;2.指令预取;

GPU的线程上下文在寄存器中,所以线程切换快;GPU寄存器多(可达几十MB);CPU寄存器少,线程上下文都在内存/cache中,所以线程切换慢;GPU线程切换快-->可以用一部分线程占用处理器来隐藏另一部分线程的访存延迟;

GPU的3大编程方式:1. 调库;2.OpenACC;3.手写kernel

cudaDeviceSynchronize: 一般用来等待kernel执行结束,CPU线程再往下走;

注意内存对齐,否则可能会产生冗余访存指令和次数。自定义struct可以用__align__(bytes)关键词来让编译器给对齐;

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
CUDA programming: a developer's guide to parallel computing with GPUs. by Shane Cook. Over the past five years there has been a revolution in computing brought about by a company that for successive years has emerged as one of the premier gaming hardware manufacturersdNVIDIA. With the introduction of the CUDA (Compute Unified Device Architecture) programming language, for the first time these hugely powerful graphics coprocessors could be used by everyday C programmers to offload computationally expensive work. From the embedded device industry, to home users, to supercomputers, everything has changed as a result of this. One of the major changes in the computer software industry has been the move from serial programming to parallel programming. Here, CUDA has produced great advances. The graphics processor unit (GPU) by its very nature is designed for high-speed graphics, which are inherently parallel. CUDA takes a simple model of data parallelism and incorporates it into a programming model without the need for graphics primitives. In fact, CUDA, unlike its predecessors, does not require any understanding or knowledge of graphics or graphics primitives. You do not have to be a games programmer either. The CUDA language makes the GPU look just like another programmable device. Throughout this book I will assume readers have no prior knowledge of CUDA, or of parallel programming. I assume they have only an existing knowledge of the C/C++ programming language. As we progress and you become more competent with CUDA, we’ll cover more advanced topics, taking you from a parallel unaware programmer to one who can exploit the full potential of CUDA. For programmers already familiar with parallel programming concepts and CUDA, we’ll be discussing in detail the architecture of the GPUs and how to get the most from each, including the latest Fermi and Kepler hardware. Literally anyone who can program in C or C++ can program with CUDA in a few hours given a little training. Getting from novice CUDA programmer, with a several times speedup to 10 times–plus speedup is what you should be capable of by the end of this book. The book is very much aimed at learning CUDA, but with a focus on performance, having first achieved correctness. Your level of skill and understanding of writing high-performance code, especially for GPUs, will hugely benefit from this text. This book is a practical guide to using CUDA in real applications, by real practitioners. At the same time, however, we cover the necessary theory and background so everyone, no matter what their background, can follow along and learn how to program in CUDA, making this book ideal for both professionals and those studying GPUs or parallel programming.

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值