Struct packing (PACK) in CUDA

Latest recommended article published 2024-08-28 22:47:27

shenlan282

Category: cuda/GPGPU

https://devtalk.nvidia.com/default/topic/387841/structure-pack-issue/

When I copy an instance of the following structure from Visual Studio C++ code to CUDA code, I get erroneous results.

struct DimmVars
{
    uint2 dimData;
    uint2 dimSegment;
    unsigned * pSrc;
};

Some further investigation tells me that the size of the structure in C++ is different than in CUDA.

printf("Sizeof DimmVars: %d \r\n", (int)sizeof(_dimmVars));

In Visual C++ this prints: Sizeof DimmVars: 20

In Cuda this prints: Sizeof DimmVars: 24

I thought this was a pack issue, but #pragma pack(show) tells me that the pack size for C++ is the default (8) (CUDA does not support this feature). So I'm a little bit lost, and am not sure how to solve the problem. Can anybody help?

I found the solution.

Apparently the built-in types are defined to be aligned on an 8-byte boundary. The problem is that this is done only for the CUDA compiler.

An example type:

/*DEVICE_BUILTIN*/
struct __builtin_align__(8) uint2
{
    unsigned int x, y;
};

The __builtin_align__ macro is defined for the CUDA compiler, but not for the GCC/MSVC compilers, so the data is most likely aligned differently. When you copy a structure containing this type from Visual C++ to CUDA, the data can therefore be misaligned (depending on the layout of the structure).
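To make the 20 versus 24 difference concrete, here is a minimal host-only sketch. It does not use the CUDA headers: PlainUint2 and AlignedUint2 are hypothetical stand-ins for uint2 as seen without and with the 8-byte alignment, standard C++11 alignas replaces __builtin_align__, and a 4-byte tail field stands in for the 32-bit pointer of the original post. It assumes unsigned int is 4 bytes with 4-byte alignment.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in: uint2 as a host compiler sees it when the
// alignment macro expands to nothing (natural 4-byte alignment).
struct PlainUint2 { unsigned int x, y; };

// Hypothetical stand-in: uint2 as the CUDA compiler sees it (8-byte aligned).
struct alignas(8) AlignedUint2 { unsigned int x, y; };

// Same member layout as DimmVars; `tail` is a 4-byte placeholder for the
// 32-bit pointer in the original post (an assumption of this sketch).
template <typename U2>
struct DimmVarsT {
    U2 dimData;
    U2 dimSegment;
    unsigned int tail;
};

// Prints the size and alignment of one variant of the structure.
template <typename T>
void print_layout(const char* label) {
    std::printf("%s: size=%zu align=%zu\n", label, sizeof(T), alignof(T));
}

// The two variants differ only in the alignment of the embedded uint2,
// yet the stricter alignment adds 4 bytes of tail padding.
static_assert(sizeof(DimmVarsT<PlainUint2>) == 20, "8 + 8 + 4, no padding");
static_assert(sizeof(DimmVarsT<AlignedUint2>) == 24, "rounded up to a multiple of 8");
```

Calling print_layout<DimmVarsT<PlainUint2>>("plain") and print_layout<DimmVarsT<AlignedUint2>>("aligned") from a main function reports the two layouts side by side.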

For me, the solution was to modify the structure definition to:

struct DimmVars
{
    uint2 __align__(8) dimData;
    uint2 __align__(8) dimSegment;
    unsigned * pSrc;
};
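One way to catch this kind of mismatch at compile time is to pin the layout down with static_assert on the host side. This sketch is not from the thread: HostUint2 and HostDimmVars are hypothetical names, and standard alignas is used in place of the CUDA-specific __align__ so that it builds with any C++11 host compiler. It assumes a 4-byte unsigned int and a pointer of at most 8 bytes.

```cpp
#include <cstddef>

// Hypothetical host-side stand-in for uint2 without any alignment attribute.
struct HostUint2 { unsigned int x, y; };

// Host-side mirror of the fixed DimmVars: each uint2 member is forced
// onto an 8-byte boundary, matching what the CUDA compiler does.
struct HostDimmVars {
    alignas(8) HostUint2 dimData;
    alignas(8) HostUint2 dimSegment;
    unsigned* pSrc;
};

// Compile-time checks of the layout the device code is expected to see.
// If a host compiler lays the structure out differently, the build fails
// instead of the kernel silently reading misaligned data.
static_assert(offsetof(HostDimmVars, dimData) == 0, "dimData offset");
static_assert(offsetof(HostDimmVars, dimSegment) == 8, "dimSegment offset");
static_assert(offsetof(HostDimmVars, pSrc) == 16, "pSrc offset");
static_assert(alignof(HostDimmVars) == 8, "struct alignment");
static_assert(sizeof(HostDimmVars) == 24, "struct size");
```

With the 8-byte alignment in place, the size is 24 bytes whether the pointer is 4 or 8 bytes wide, because the structure is padded to a multiple of its alignment.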

My suggestion is to at least mention this in the CUDA SDK sample for the CPP-integration project, which shows how simple it is to use built-in types in C++ code.