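I ran into this problem. The cause turned out to be that a __global__ function calls a __device__ function, but the two functions are not defined in the same source file.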
http://stackoverflow.com/questions/31006581/cuda-device-unresolved-extern-function
The issue is that you defined a __device__ function in a separate compilation unit from the __global__ function that calls it. You need to either explicitly enable relocatable device code mode by adding the -dc flag or move your definition to the same unit.
From the nvcc documentation:
--device-c|-dc
Compile each .c/.cc/.cpp/.cxx/.cu input file into an object file that contains relocatable device code. It is equivalent to --relocatable-device-code=true --compile.
See Separate Compilation and Linking of CUDA C++ Device Code for more information.
http://stackoverflow.com/questions/17188527/cuda-external-class-linkage-and-unresolved-extern-function-in-ptxas-file
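So there are two ways to fix this.
The first is to put both functions in the same .cu file.
The second is to open the .cu file's property pages and, under CUDA C/C++ -> Common -> Generate Relocatable Device Code, select Yes (-rdc=true), which allows relocatable device code to be compiled. Alternatively, set -rdc=true in the CUDA C/C++ settings of the whole project.
Either one resolves the problem.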
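To make the situation concrete, here is a minimal sketch of the two-file layout that triggers the error. The file names (device_funcs.cu, main.cu) and the function names (square, squareKernel) are made up for illustration and are not from the original project. With the two parts split into separate files and compiled without relocatable device code, ptxas should report the unresolved extern function; the build commands in the trailing comment show the -rdc=true / -dc variants discussed above.

```cuda
// ---- device_funcs.cu (hypothetical file name) ----
// Definition of the __device__ function, in its own translation unit.
__device__ int square(int x) { return x * x; }

// ---- main.cu (hypothetical file name) ----
#include <cstdio>
#include <cuda_runtime.h>

// Declaration only; the definition lives in device_funcs.cu.
extern __device__ int square(int x);

// Kernel that calls a __device__ function defined in another file.
__global__ void squareKernel(const int* in, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = square(in[i]);
}

int main() {
    const int n = 4;
    int hostIn[n] = {1, 2, 3, 4};
    int hostOut[n] = {0, 0, 0, 0};
    int* devIn = nullptr;
    int* devOut = nullptr;

    cudaMalloc(&devIn, n * sizeof(int));
    cudaMalloc(&devOut, n * sizeof(int));
    cudaMemcpy(devIn, hostIn, n * sizeof(int), cudaMemcpyHostToDevice);

    squareKernel<<<1, n>>>(devIn, devOut, n);
    cudaMemcpy(hostOut, devOut, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) printf("%d ", hostOut[i]);
    printf("\n");

    cudaFree(devIn);
    cudaFree(devOut);
    return 0;
}

// Build with relocatable device code (one step):
//   nvcc -rdc=true device_funcs.cu main.cu -o app
// Or compile and link separately (object files are .obj on Windows):
//   nvcc -dc device_funcs.cu main.cu
//   nvcc device_funcs.o main.o -o app
```

Pasting both parts into a single .cu file corresponds to the first fix: the definition and the call then sit in the same compilation unit, so no relocatable device code is needed.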
Other references
https://devtalk.nvidia.com/default/topic/524436/how-to-deal-with-ptxas-fatal-error-unresolved-extern-function-39-cudagetparameterbuffer-39-/
1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib
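The unresolved symbol in that thread, cudaGetParameterBuffer, belongs to the CUDA device runtime, which is used when a kernel launches another kernel. As a rough command-line counterpart of steps 2 to 4, the sketch below shows a parent kernel launching a child kernel, with an assumed nvcc command in the trailing comment. The file name dp.cu and the kernel names are hypothetical, and the sm_35 target is simply copied from step 3; substitute the architecture of your own GPU, since newer toolkits may no longer accept sm_35.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel, launched from device code.
__global__ void childKernel(int* data) {
    data[threadIdx.x] = threadIdx.x * 10;
}

// Parent kernel: a single thread launches the child grid.
__global__ void parentKernel(int* data, int n) {
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        childKernel<<<1, n>>>(data);
    }
}

int main() {
    const int n = 4;
    int hostOut[n] = {0, 0, 0, 0};
    int* devOut = nullptr;

    cudaMalloc(&devOut, n * sizeof(int));
    parentKernel<<<1, 1>>>(devOut, n);
    // The parent grid is not complete until its child grid has finished.
    cudaDeviceSynchronize();
    cudaMemcpy(hostOut, devOut, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) printf("%d ", hostOut[i]);
    printf("\n");

    cudaFree(devOut);
    return 0;
}

// Assumed build command, mirroring steps 2-4 above:
//   nvcc -rdc=true -arch=sm_35 dp.cu -o dp -lcudadevrt
```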
——————————————————————
See also: http://www.cnblogs.com/djiankuo/p/5910502.html