Using the New P2P and UVA Features in CUDA 4.0

Article link: https://blog.csdn.net/dreampursue/article/details/6256426

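CUDA 4.0 introduces the P2P (Peer-to-Peer) and UVA (Unified Virtual Address Space) features. Using the simpleP2P sample code from the SDK, this article shows how to transfer data directly between GPUs and how to access memory through a unified address space. The sample covers device checks, memory allocation, enabling P2P access, kernel launches, and verification, and it highlights how UVA simplifies memory operations.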
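
As a rough orientation, the P2P and UVA steps listed above can be sketched in a few runtime API calls. This is a minimal sketch, not the SDK sample itself: the `CHECK` macro is a hypothetical helper introduced here, the choice of devices 0 and 1 is an assumption, and the buffer size is arbitrary. It also assumes a 64-bit build on GPUs that support unified addressing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the SDK sample): abort on any CUDA error.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

int main()
{
    // Step 1: device check.
    int gpu_n = 0;
    CHECK(cudaGetDeviceCount(&gpu_n));
    if (gpu_n < 2) {
        printf("Fewer than two GPUs found, nothing to do.\n");
        return 0;
    }

    // Assumption: devices 0 and 1 are the pair of interest.
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));
    if (!can01 || !can10) {
        printf("Devices 0 and 1 cannot access each other directly.\n");
        return 0;
    }

    // Step 2: enable P2P access. It is enabled per direction,
    // always from the currently selected device toward the peer.
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceEnablePeerAccess(0, 0));

    // Step 3: allocate one buffer on each device.
    const size_t bytes = 1024 * sizeof(float);
    float *buf0 = NULL, *buf1 = NULL;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc((void **)&buf0, bytes));
    CHECK(cudaMemset(buf0, 0, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc((void **)&buf1, bytes));

    // Step 4: with UVA, cudaMemcpyDefault lets the runtime infer
    // from the pointer values which device each buffer lives on.
    CHECK(cudaMemcpy(buf1, buf0, bytes, cudaMemcpyDefault));

    // Step 5: clean up in the context of each owning device.
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceDisablePeerAccess(1));
    CHECK(cudaFree(buf0));
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceDisablePeerAccess(0));
    CHECK(cudaFree(buf1));

    printf("Peer copy completed.\n");
    return 0;
}
```

Once peer access is enabled, a kernel running on one device can also dereference a pointer allocated on the other, which is what the `SimpleKernel` in the SDK sample is used to verify.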


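CUDA 4.0 was recently opened to registered developers and adds quite a few features. Of these, the introduction of P2P (Peer-to-Peer) and UVA (Unified Virtual Address Space) has drawn the most attention. Here I would like to share the simpleP2P example from the SDK, which shows how to use both features. The code is as follows: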

/*
 * Copyright 1993-2011 NVIDIA Corporation.  All rights reserved.
 *
 * Please refer to the NVIDIA end user license agreement (EULA) associated
 * with this source code for terms and conditions that govern your use of
 * this software. Any use, reproduction, disclosure, or distribution of
 * this software and related documentation outside the terms of the EULA
 * is strictly prohibited.
 *
 */
/*
 * This sample demonstrates a combination of Peer-to-Peer (P2P) and Unified
 * Virtual Address Space (UVA) features new to SDK 4.0
 */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <cutil_inline.h>
#include <cuda_runtime_api.h>
const char *sSDKsample = "simpleP2P";
__global__ void SimpleKernel(float *src, float *dst)
{
    // Just a dummy kernel, doing enough for us to verify that everything
    // worked
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    dst[idx] = src[idx] * 2.0f;
}
int main(int argc, char **argv)
{
    printf("[%s] starting...\n", sSDKsample);
    // Number of GPUs
    printf("Checking for multiple GPUs...\n");
    int gpu_n;
    cutilSafeCall(cudaGetDeviceCount(&gpu_n));
    printf("CUDA-capable device count: %i\n", gpu_n);
    if (gpu_n < 2)
    {
        printf("Two or more Tesla(s) with (SM 2.0) class GPUs are required for %s.\n", sSDKsample);
        printf("Waiving test.\n");
        printf("PASSED\n");
        e