NVRTC Tutorial (No Filler)

Latest recommended article published on 2024-10-09 03:56:13

fengfeng9853

Article tags: c++

Article link: https://blog.csdn.net/fengfeng9853/article/details/138114086

Preface

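CUDA version used in this article: 12.4
NVRTC (NVIDIA Runtime Compilation) is a library that compiles CUDA source code (the contents of a .cu file) while your program is running. Using it does not require setting up a CUDA build environment around nvcc, although the CUDA Toolkit must still be installed. For anyone who simply cannot get a CUDA environment configured (like me), it offers another way to run kernels, though it is less convenient than using nvcc directly.
For details on using NVRTC, see the official documentation if you are up to it.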

Preparation

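First install the CUDA Toolkit, then add its include directory (the default should be C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include) and its lib directory (the default should be C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\lib\x64) to your project, and link the libraries. The lazy option is to link every .lib in the lib directory; strictly speaking, nvrtc.lib and cuda.lib cover the NVRTC and driver API calls used in this section, and cudart.lib is needed only if you call runtime API functions.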
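If linking every library feels too broad, the two libraries can also be requested from the source file itself. This is only a sketch and rests on two assumptions: the compiler is MSVC (the pragma is specific to it, hence the guard), and the CUDA lib\x64 directory is already on the linker's library search path.

```c
/* Assumption: MSVC toolchain, CUDA lib\x64 directory already on the
   linker's library search path. Other compilers skip this block. */
#ifdef _MSC_VER
/* NVRTC functions such as nvrtcCreateProgram */
#pragma comment(lib, "nvrtc.lib")
/* Driver API functions such as cuInit and cuCtxCreate */
#pragma comment(lib, "cuda.lib")
#endif
```

With other toolchains the same two libraries would be passed to the linker on the command line instead.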

Required headers:

#include<stdio.h>
#include<stdlib.h>
#include<nvrtc.h>
#include<cuda_runtime.h>
#include<cuda.h>

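Functions in nvrtc.h return a value of type nvrtcResult that tells you whether the call succeeded; on success the value is NVRTC_SUCCESS. The nvrtcGetErrorString function converts the result into a readable string. A simple macro wraps this check: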

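/* Run an NVRTC call once; on failure print the call and its error string,
   then exit. system("pause") keeps a Windows console window open. */
#define NVRTC_SAFE_CALL(x) \
    do { \
        nvrtcResult result = (x); \
        if (result != NVRTC_SUCCESS) { \
            printf("\nerror: " #x " failed with error %s\n", nvrtcGetErrorString(result)); \
            system("pause"); \
            exit(1); \
        } \
    } while (0)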

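Driver API functions in cuda.h similarly return a CUresult, which is CUDA_SUCCESS on success, so write a matching macro. (Runtime API functions in cuda_runtime.h return cudaError_t and cudaSuccess instead, which this macro does not handle.)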

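/* Run a driver API call once; on failure print the call and the error
   name reported by cuGetErrorName, then exit. */
#define CUDA_SAFE_CALL(x) \
    do { \
        CUresult result = (x); \
        if (result != CUDA_SUCCESS) { \
            const char *msg; \
            cuGetErrorName(result, &msg); \
            printf("\nerror: " #x " failed with error %s\n", msg); \
            system("pause"); \
            exit(1); \
        } \
    } while (0)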

From here on, every NVRTC or CUDA call is wrapped in one of these macros, for example:

NVRTC_SAFE_CALL(nvrtcCreateProgram(&prog, source,  0,0,0,0));
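One detail about macros of this kind: they are invoked with a trailing semicolon, so it matters that the expansion behaves as a single statement. A macro that expands to `if (1) { ... }` does not compile when it is used in an unbraced if/else, because the semicolon after the closing brace ends the outer `if` before the `else` is reached. The sketch below shows the common `do { ... } while (0)` idiom instead. All names in it (`demo_result`, `DEMO_CHECK`, `demo_call`, `demo_run`) are hypothetical stand-ins so that it builds without the CUDA Toolkit, and it counts failures rather than exiting.

```c
#include <stdio.h>

/* Hypothetical stand-in for nvrtcResult / CUresult. */
typedef enum { DEMO_SUCCESS = 0, DEMO_ERROR_INVALID = 1 } demo_result;

/* Number of failed checks since the last demo_run() started. */
int demo_failures = 0;

/* Hypothetical stand-in for nvrtcGetErrorString / cuGetErrorName. */
const char *demo_error_string(demo_result r)
{
    return r == DEMO_SUCCESS ? "DEMO_SUCCESS" : "DEMO_ERROR_INVALID";
}

/* Single-statement check macro: evaluates x once, reports a failure to
   stderr and counts it. Unlike the macros in the article, it does not exit. */
#define DEMO_CHECK(x) \
    do { \
        demo_result result_ = (x); \
        if (result_ != DEMO_SUCCESS) { \
            fprintf(stderr, "error: " #x " failed with %s\n", \
                    demo_error_string(result_)); \
            demo_failures++; \
        } \
    } while (0)

/* Hypothetical API call: succeeds when ok is nonzero. */
demo_result demo_call(int ok)
{
    return ok ? DEMO_SUCCESS : DEMO_ERROR_INVALID;
}

/* Uses the macro in an unbraced if/else, which parses as intended only
   because the macro plus its semicolon form exactly one statement. */
int demo_run(int flag)
{
    demo_failures = 0;
    if (flag)
        DEMO_CHECK(demo_call(1));
    else
        DEMO_CHECK(demo_call(0));
    return demo_failures;
}
```

Calling `demo_run(1)` takes the succeeding branch and `demo_run(0)` takes the failing one; the return value is the number of failed checks in that run.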

In addition, CUDA should be initialized from the main function:

void CuInit()
{
    // Select device 0 and create a context on it
    CUdevice cuDevice;
    CUcontext context;
    CUDA_SAFE_CALL(cuInit(0));
    CUDA_SAFE_CALL(cuDeviceGet(&cuDevice, 0));
    CUDA_SAFE_CALL(cuCtxCreate(&context, 0, cuDevice));
}

Converting a .cu file into a string

NVRTC takes the CUDA source code as a string, like the following:

const char *saxpy = "                                           \n\
extern \"C\" __global__                                         \n\
void saxpy(float a, float *x, float *y, float *</