June 2011_roctang2006


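[Original] Reading notes on the reprint edition of Programming Massively Parallel Processors, Chapter 6: Performance Considerations

6.1 More on Thread Execution. The concept of a warp. How warps are organized: threads arranged in multiple dimensions are linearized in order of increasing x, then y, then z (x varies fastest); then, going from front to back, every 32 consecutive threads form one warp. The hardware executes an instruction for all threads in the same warp before moving to th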

2011-06-23 01:19:00 810
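
The warp organization described in the Chapter 6 excerpt above can be sketched on the host side. This is a minimal illustration, not CUDA device code: `Dim3`, `linearThreadId`, `warpId`, and `laneId` are hypothetical names introduced here, and the warp size of 32 is the value quoted in the notes.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for CUDA's dim3/threadIdx, so the sketch
// compiles with any C++11 compiler and needs no GPU.
struct Dim3 {
    unsigned x, y, z;
};

// Warp size quoted in the notes (32 threads per warp).
const unsigned kWarpSize = 32;

// Linearize a 3D thread index inside a block: x varies fastest, then y, then z.
unsigned linearThreadId(Dim3 threadIdx, Dim3 blockDim) {
    assert(threadIdx.x < blockDim.x);
    assert(threadIdx.y < blockDim.y);
    assert(threadIdx.z < blockDim.z);
    return threadIdx.x + blockDim.x * (threadIdx.y + blockDim.y * threadIdx.z);
}

// Index of the warp a thread falls into: every 32 consecutive linear IDs form one warp.
unsigned warpId(Dim3 threadIdx, Dim3 blockDim) {
    return linearThreadId(threadIdx, blockDim) / kWarpSize;
}

// Position of the thread inside its warp (0..31).
unsigned laneId(Dim3 threadIdx, Dim3 blockDim) {
    return linearThreadId(threadIdx, blockDim) % kWarpSize;
}
```

For a 16 x 16 block, for example, rows y = 0 and y = 1 together make up the first warp, and the thread at (x = 0, y = 2) starts the second one.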

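[Original] Reading notes on the reprint edition of Programming Massively Parallel Processors, Chapter 5: CUDA Memories

The main point: global memory is too slow (hundreds of clock cycles per access) and its bandwidth is too limited. When programming, we should try to use global memory as little as possible and rely more on fast memories such as shared memory and constant memory. 5.1 Importance of Memory Access Efficiency. CGMA (the compute to global memory access ratio) characterizes how many floating-point operations are performed per global memory access; this value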

2011-06-22 17:36:00 901
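
The CGMA idea from the Chapter 5 excerpt above can be made concrete with a little arithmetic. The sketch below is only an illustration: the helper names `cgmaRatio` and `bandwidthBoundGflops` are hypothetical, and any bandwidth figure passed in is an assumption supplied by the caller rather than a measured value.

```cpp
#include <cassert>

// CGMA: floating-point operations performed per global memory access.
double cgmaRatio(double flops, double globalAccesses) {
    assert(globalAccesses > 0.0);
    return flops / globalAccesses;
}

// Rough upper bound on GFLOPS when global memory bandwidth is the bottleneck:
// (operands fetched per second) * (floating-point operations per operand).
double bandwidthBoundGflops(double bandwidthGBps, double bytesPerOperand, double cgma) {
    assert(bytesPerOperand > 0.0);
    return bandwidthGBps / bytesPerOperand * cgma;
}
```

As a usage example, the inner loop of a naive matrix multiplication does one multiply and one add for every two global loads, so `cgmaRatio(2, 2)` is 1.0. Assuming an illustrative 86.4 GB/s of bandwidth and 4-byte floats, `bandwidthBoundGflops(86.4, 4, 1.0)` comes out at roughly 21.6 GFLOPS, which is why raising CGMA through shared memory matters.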

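[Original] Reading notes on the reprint edition of Programming Massively Parallel Processors, Chapter 4: CUDA Threads

4.1 CUDA Thread Organization. A concrete example: a grid contains N blocks organized in one dimension, and each block contains M threads, also organized in one dimension. The ID of any thread in any block can then be computed with the formula threadID = blockIdx.x * blockDim.x + threadIdx.x. Two variables: gridDim and blockDim, g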

2011-06-20 23:16:00 914
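
The formula threadID = blockIdx.x * blockDim.x + threadIdx.x from the Chapter 4 excerpt above can be checked with a small host-side simulation. This is a sketch under stated assumptions: the function names `globalThreadId` and `enumerateThreadIds` are hypothetical, and plain loops stand in for the blocks and threads that the GPU would run in parallel.

```cpp
#include <cassert>
#include <vector>

// Global ID of a thread in a 1D grid of 1D blocks, as given in the notes.
unsigned globalThreadId(unsigned blockIdxX, unsigned blockDimX, unsigned threadIdxX) {
    assert(threadIdxX < blockDimX);
    return blockIdxX * blockDimX + threadIdxX;
}

// Walk over N blocks of M threads each and collect every global ID in order.
std::vector<unsigned> enumerateThreadIds(unsigned numBlocks, unsigned threadsPerBlock) {
    std::vector<unsigned> ids;
    ids.reserve(numBlocks * threadsPerBlock);
    for (unsigned b = 0; b < numBlocks; ++b) {
        for (unsigned t = 0; t < threadsPerBlock; ++t) {
            ids.push_back(globalThreadId(b, threadsPerBlock, t));
        }
    }
    return ids;
}
```

With N = 3 blocks of M = 4 threads, the IDs run from 0 to 11 with no gaps or repeats, so each thread can use its ID to pick a distinct data element.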

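[Original] Reading notes on the reprint edition of Programming Massively Parallel Processors, Chapter 3: Introduction to CUDA

3.1 Data parallelism. Data that can be processed in parallel is at the core of applying GPU computing. Matrix multiplication is a simple example of data parallelism; many applications exhibit more complex forms of it. 3.2 CUDA program structure. grid -- all the threads generated by a single kernel launch are collectively called a grid; a grid can be regarded as an organizational unit of threads. 3.3 a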

2011-06-20 17:39:00 714
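
The data parallelism that the Chapter 3 excerpt above attributes to matrix multiplication can be seen in a plain host-side version. This is a minimal sketch, not the book's kernel: `computeElement` and `matMul` are hypothetical names, matrices are assumed square and stored row-major, and the two outer loops stand in for the grid of threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Work of one "thread": the dot product that yields a single output element.
// It reads the inputs but writes nothing shared, so all elements are independent.
float computeElement(const std::vector<float>& m, const std::vector<float>& n,
                     std::size_t width, std::size_t row, std::size_t col) {
    float sum = 0.0f;
    for (std::size_t k = 0; k < width; ++k) {
        sum += m[row * width + k] * n[k * width + col];
    }
    return sum;
}

// Sequential driver: each (row, col) iteration could run as its own thread.
std::vector<float> matMul(const std::vector<float>& m, const std::vector<float>& n,
                          std::size_t width) {
    assert(m.size() == width * width);
    assert(n.size() == width * width);
    std::vector<float> p(width * width);
    for (std::size_t row = 0; row < width; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            p[row * width + col] = computeElement(m, n, width, row, col);
        }
    }
    return p;
}
```

Because no iteration depends on another's result, the loop nest maps naturally onto one thread per output element.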

[Original] Several examples of overflow attacks (teaching examples)

First example:
#include "stdafx.h"
#include "test1.h"
#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
stati

2011-06-13 16:53:00 1081
