Multi-file compilation error: ptxas fatal : Unresolved extern function 'xxxx'

weixin_47744790

Last modified: 2023-11-07 20:19:45

First published: 2023-11-07 20:17:39

Article link: https://blog.csdn.net/weixin_47744790/article/details/134275227

Problem: if a CUDA __device__ or __global__ function calls a device function defined in another source file, compiling directly with nvcc fails with this error.

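Solutions:

Method 1: add the -rdc=true option so that nvcc compiles and links in a single command.

nvcc *.cu -rdc=true -o <my_program_name>

Method 2: compile every file to a .o object with the -dc option (shorthand for -rdc=true -c), then link the objects with nvcc.

nvcc *.cu -dc  # compile first
nvcc *.o -o <my_program_name>  # then link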

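Cause: by default nvcc compiles the device code of each .cu file as a complete, standalone program and embeds it in the host object, so every device function a kernel calls must be defined in that same file. When the callee lives in another file, its address is not known at compile time. Each file therefore has to be compiled to relocatable device code first. The link step then resolves the real address of every called function and produces the executable.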

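This flow is essentially the same as with gcc: compile each file, then link. The extra option is needed because nvcc did not support separate compilation of device code before CUDA 5.0. Support was added in 5.0, but to preserve the earlier behaviour it is disabled by default and has to be requested explicitly.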
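To make the link stage visible, the sketch below splits the build into three explicit steps: compile to relocatable device code, device-link, then host-link with an ordinary C++ compiler. It is only an illustration under several assumptions: the file names main.cu and myclass.cu come from the example in this post, g++ is the host compiler, the variable CUDA_LIB and its default path /usr/local/cuda/lib64 are guesses about the local CUDA installation, and dlink.o is an arbitrary name for the intermediate object.

```shell
#!/bin/sh
# Assumed location of libcudart; adjust for your installation.
CUDA_LIB="${CUDA_LIB:-/usr/local/cuda/lib64}"
OUT="my_cuda_program"

if command -v nvcc >/dev/null 2>&1 && [ -f main.cu ] && [ -f myclass.cu ]; then
    # 1. Compile each .cu file to an object holding relocatable device code.
    nvcc -dc main.cu myclass.cu
    # 2. Device link: resolve device calls that cross file boundaries.
    nvcc -dlink main.o myclass.o -o dlink.o
    # 3. Host link: combine the objects, the device-link object and the CUDA runtime.
    g++ main.o myclass.o dlink.o -L"$CUDA_LIB" -lcudart -o "$OUT"
else
    echo "nvcc or the example sources are missing; nothing was built"
fi
```

When nvcc itself performs the final link, as in Method 2, steps 2 and 3 happen inside a single nvcc invocation, so the explicit form is mainly useful when the host link has to be done by another tool.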

Example

Suppose there are three files: myclass.cuh, myclass.cu and main.cu. The test kernel in main.cu calls a function defined in myclass.cu.

// myclass.cuh
#ifndef MYCLASS_CUH_
#define MYCLASS_CUH_
#include <cstdio>
class A{
	int a;
public:
	// Declared here, defined in myclass.cu.
	__device__ void say_hello();
};

// Declared but neither defined nor called in this example.
__device__ void warp(A *ptr);
#endif

// myclass.cu
#include "myclass.cuh"

__device__ void A::say_hello()
{
	printf("hello\n");
}

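// main.cu
#include "myclass.cuh"

// Kernel that calls A::say_hello, which is defined in myclass.cu.
__global__ void test(A* ptr){
	ptr->say_hello();
}

int main(){
	A* d_ptr;
	cudaMalloc(&d_ptr, sizeof(A));

	dim3 grid(1),block(1);
	test<<<grid, block>>>(d_ptr);
	// Wait for the kernel so its printf output is flushed before exit.
	cudaDeviceSynchronize();
	cudaFree(d_ptr);
	return 0;
}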

Compiling

nvcc main.cu myclass.cu -rdc=true -o my_cuda_program

# Alternatively, compile and link in two steps:
nvcc -dc main.cu myclass.cu
nvcc main.o myclass.o -o my_cuda_program
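To see the difference for yourself, a small script along the following lines builds the example twice, once in the default mode and once with -rdc=true, and reports which build succeeded. It is a sketch, not a definitive test: the variable names and the log file names plain.log and rdc.log are made up for illustration, and it assumes main.cu and myclass.cu sit in the current directory.

```shell
#!/bin/sh
SRCS="main.cu myclass.cu"
OUT="my_cuda_program"
status_plain="skipped"
status_rdc="skipped"

if command -v nvcc >/dev/null 2>&1 && [ -f main.cu ] && [ -f myclass.cu ]; then
    # Default mode: the call into myclass.cu cannot be resolved,
    # so this build is expected to fail.
    if nvcc $SRCS -o "$OUT" 2>plain.log; then
        status_plain="ok"
    else
        status_plain="failed"
    fi
    # Relocatable device code: the same sources should now build.
    if nvcc $SRCS -rdc=true -o "$OUT" 2>rdc.log; then
        status_rdc="ok"
    else
        status_rdc="failed"
    fi
fi

echo "without -rdc: $status_plain, with -rdc=true: $status_rdc"
```

If the first build fails as expected, plain.log should contain the ptxas message from the title, with the mangled name of the unresolved function in place of 'xxxx'.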
