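The purpose of CUDA is to offload large amounts of computation to the GPU so that it runs faster. Before allocating memory and executing code on the device (the graphics card), it helps to know what hardware is present, because a single graphics card may contain more than one GPU. Some NVIDIA products, such as the GeForce GTX TITAN Z, place two GPUs on one card, so a computer fitted with that card has two CUDA-capable processors.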
As of CUDA 3.0, the cudaDeviceProp structure contains the following fields:
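struct cudaDeviceProp {
    char name[256];               /* ASCII string identifying the device */
    size_t totalGlobalMem;        /* global memory, in bytes */
    size_t sharedMemPerBlock;     /* shared memory available per block, in bytes */
    int regsPerBlock;             /* 32-bit registers available per block */
    int warpSize;                 /* number of threads in a warp */
    size_t memPitch;              /* maximum pitch for memory copies, in bytes */
    int maxThreadsPerBlock;       /* maximum number of threads in a block */
    int maxThreadsDim[3];         /* maximum threads along each block dimension */
    int maxGridSize[3];           /* maximum blocks along each grid dimension */
    size_t totalConstMem;         /* constant memory, in bytes */
    int major;                    /* compute capability, major part */
    int minor;                    /* compute capability, minor part */
    int clockRate;                /* clock frequency, in kHz */
    size_t textureAlignment;      /* alignment requirement for textures */
    int deviceOverlap;            /* can copy memory and run a kernel concurrently */
    int multiProcessorCount;      /* number of multiprocessors on the device */
    int kernelExecTimeoutEnabled; /* kernels are subject to a run-time limit */
    int integrated;               /* integrated GPU rather than a discrete one */
    int canMapHostMemory;         /* can map host memory into the device address space */
    int computeMode;              /* default, exclusive, or prohibited */
    int maxTexture1D;             /* maximum size of 1D textures */
    int maxTexture2D[2];          /* maximum dimensions of 2D textures */
    int maxTexture3D[3];          /* maximum dimensions of 3D textures */
    int maxTexture2DArray[3];     /* maximum dimensions of 2D texture arrays */
    int concurrentKernels;        /* can run several kernels in one context at once */
};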
Some of these fields are self-explanatory from their names, but for completeness each one is described in the table below.
The code that queries and prints these properties is as follows:
#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    cudaDeviceProp prop;
    int count = 0;

    /* Ask the runtime how many CUDA-capable devices are present. */
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed\n");
        return 1;
    }

    for (int i = 0; i < count; i++) {
        /* Fill prop with the properties of device i. */
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed for device %d\n", i);
            continue;
        }

        printf(" --- General Information for device %d ---\n", i);
        printf("Name: %s\n", prop.name);
        printf("Compute capability: %d.%d\n", prop.major, prop.minor);
        printf("Clock rate: %d kHz\n", prop.clockRate);
        printf("Device copy overlap: %s\n",
               prop.deviceOverlap ? "Enabled" : "Disabled");
        printf("Kernel execution timeout: %s\n",
               prop.kernelExecTimeoutEnabled ? "Enabled" : "Disabled");

        /* size_t fields are printed with %zu; sizes are in bytes. */
        printf(" --- Memory Information for device %d ---\n", i);
        printf("Total global mem: %zu\n", prop.totalGlobalMem);
        printf("Total constant mem: %zu\n", prop.totalConstMem);
        printf("Max mem pitch: %zu\n", prop.memPitch);
        printf("Texture alignment: %zu\n", prop.textureAlignment);

        printf(" --- MP Information for device %d ---\n", i);
        printf("Multiprocessor count: %d\n", prop.multiProcessorCount);
        printf("Shared mem per block: %zu\n", prop.sharedMemPerBlock);
        printf("Registers per block: %d\n", prop.regsPerBlock);
        printf("Threads in warp: %d\n", prop.warpSize);
        printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
        printf("Max thread dimensions: (%d, %d, %d)\n",
               prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
        printf("Max grid dimensions: (%d, %d, %d)\n",
               prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
        printf("\n");
    }
    return 0;
}
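Once the properties of every device are known, a natural next step on a machine with several GPUs is to pick one of them. The following is only a minimal sketch of one possible policy (highest compute capability, ties broken by multiprocessor count), written in plain C so it can be tried without a GPU. The DeviceSummary struct and the pick_best_device function are hypothetical names introduced here for illustration; they are not part of the CUDA API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of the fields being compared. In a real program
   these values would be copied from the cudaDeviceProp of each device. */
typedef struct {
    int major;               /* compute capability, major part */
    int minor;               /* compute capability, minor part */
    int multiProcessorCount; /* number of multiprocessors */
} DeviceSummary;

/* Returns the index of the preferred device, or -1 if the list is empty.
   Preference: higher major, then higher minor, then more multiprocessors.
   On a full tie the earlier device is kept. */
int pick_best_device(const DeviceSummary *devs, int count)
{
    assert(count <= 0 || devs != NULL);
    if (count <= 0)
        return -1;

    int best = 0;
    for (int i = 1; i < count; i++) {
        const DeviceSummary *a = &devs[i];
        const DeviceSummary *b = &devs[best];
        if (a->major != b->major) {
            if (a->major > b->major)
                best = i;
        } else if (a->minor != b->minor) {
            if (a->minor > b->minor)
                best = i;
        } else if (a->multiProcessorCount > b->multiProcessorCount) {
            best = i;
        }
    }
    return best;
}
```

In an actual CUDA program the returned index would typically be passed to cudaSetDevice so that subsequent allocations and kernel launches target that GPU. Other policies, such as preferring the device with the most global memory, may suit a given workload better.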
With this basic picture of the GPU in hand, later GPU work, such as launching kernels and managing device memory, becomes much easier to get right.
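As one illustration of how the queried limits feed into later kernel launches, the sketch below computes how many blocks are needed to cover n elements without exceeding the grid limit reported by the device. It is plain C and assumes a one-dimensional launch; blocks_for is a hypothetical helper name, and its max_grid_x argument is meant to receive prop.maxGridSize[0] from the listing above.

```c
#include <assert.h>

/* Hypothetical helper: number of blocks needed to cover n elements with
   threads_per_block threads each, capped at max_grid_x
   (prop.maxGridSize[0] for a one-dimensional grid). */
int blocks_for(long long n, int threads_per_block, int max_grid_x)
{
    assert(threads_per_block > 0 && max_grid_x > 0);
    if (n <= 0)
        return 0;

    /* Ceiling division: round up so the last partial block is included. */
    long long blocks = (n + threads_per_block - 1) / threads_per_block;

    /* Never ask for more blocks than the device allows in this dimension. */
    return blocks > max_grid_x ? max_grid_x : (int)blocks;
}
```

For example, blocks_for(1000, 256, 65535) yields 4 blocks. The threads_per_block value itself should not exceed prop.maxThreadsPerBlock, and when the result is capped at max_grid_x the kernel has to process more than one element per thread, for instance with a loop that strides over the data.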