Fixing compile errors in ROIAlign_cuda.cu and ROIPool_cuda.cu (two problems I ran into)
Error 1:
D:/python/frankmocap-master/frankmocap-master/detectors/hand_object_detector/lib/model/csrc/cuda/ROIAlign_cuda.cu(275): error: no instance of function template "THCCeilDiv" matches the argument list
argument types are: (long long, long)
D:/python/frankmocap-master/frankmocap-master/detectors/hand_object_detector/lib/model/csrc/cuda/ROIAlign_cuda.cu(275): error: no instance of overloaded function "std::min" matches the argument list
argument types are: (<error-type>, long)
D:/python/frankmocap-master/frankmocap-master/detectors/hand_object_detector/lib/model/csrc/cuda/ROIAlign_cuda.cu(320): error: no instance of function template "THCCeilDiv" matches the argument list
argument types are: (int64_t, long)
D:/python/frankmocap-master/frankmocap-master/detectors/hand_object_detector/lib/model/csrc/cuda/ROIAlign_cuda.cu(320): error: no instance of overloaded function "std::min" matches the argument list
argument types are: (<error-type>, long)
4 errors detected in the compilation of "C:/Users/18295/AppData/Local/Temp/tmpxft_00003514_00000000-10_ROIAlign_cuda.cpp1.ii"
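Solution:
As the log shows, the call passes (long long, long), so the compiler cannot match THCCeilDiv, and the std::min wrapped around it then fails as well. On Windows, long is 32 bits while int64_t is long long, so the 512L and 4096L literals do not match the 64-bit element count. Replacing the call with plain int arithmetic avoids the mismatch.
Edit line 275 of maskscoring_rcnn/maskrcnn_benchmark/csrc/cuda/ROIAlign_cuda.cu:
// Original: dim3 grid(std::min(THCCeilDiv(output_size, 512L), 4096L));
dim3 grid(std::min(((int)output_size + 512 - 1) / 512, 4096));
Edit line 320 of maskscoring_rcnn/maskrcnn_benchmark/csrc/cuda/ROIAlign_cuda.cu:
// Original: dim3 grid(std::min(THCCeilDiv(grad.numel(), 512L), 4096L));
dim3 grid(std::min(((int)(grad.numel()) + 512 - 1) / 512, 4096));
Edit line 129 of maskscoring_rcnn/maskrcnn_benchmark/csrc/cuda/ROIPool_cuda.cu:
// Original: dim3 grid(std::min(THCCeilDiv(output_size, 512L), 4096L));
dim3 grid(std::min(((int)output_size + 512 - 1) / 512, 4096));
Edit line 176 of maskscoring_rcnn/maskrcnn_benchmark/csrc/cuda/ROIPool_cuda.cu:
// Original: dim3 grid(std::min(THCCeilDiv(grad.numel(), 512L), 4096L));
dim3 grid(std::min(((int)(grad.numel()) + 512 - 1) / 512, 4096));
After making these changes, run python setup.py build develop again and the build should go through.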
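To see what the patched lines compute, here is a minimal host-only C++ sketch of the grid-size calculation. The names ceil_div, grid_size_int, and grid_size_i64 are hypothetical and do not come from the repository; ceil_div is only a stand-in that assumes THCCeilDiv takes two arguments of one shared type. The second function is an alternative I have not tried in the actual build: it keeps 64-bit arithmetic by making every operand int64_t instead of casting down to int.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for THCCeilDiv: both arguments share a single type T,
// so a call with (long long, long) would fail to deduce T.
template <typename T>
T ceil_div(T a, T b) {
  return (a + b - 1) / b;
}

// Same arithmetic as the patched lines: cast to int, ceil-divide by 512,
// then cap the result at 4096 blocks.
// Assumption: the element count fits in a 32-bit int.
int grid_size_int(int64_t n) {
  return std::min((static_cast<int>(n) + 512 - 1) / 512, 4096);
}

// Alternative sketch: keep everything in int64_t so no narrowing cast is needed.
int64_t grid_size_i64(int64_t n) {
  return std::min(ceil_div<int64_t>(n, 512), static_cast<int64_t>(4096));
}
```

Both versions give the same block count for sizes that fit in an int. The int cast in the first one is the simpler patch, but it would misbehave for tensors with more than about 2 billion elements.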
Error 2:
Errors about "__floorf" and "__ceilf" in the ROIAlign_cuda.cu and ROIPool_cuda.cu files:
error: calling a __host__ function("__floorf") from a __device__ function("get_coordinate_weight<float> ") is not allowed
error: identifier "__floorf" is undefined in device code
error: calling a __host__ function("__ceilf") from a __global__ function("detectron2::RoIAlignRotatedForward<float> ") is not allowed
error: identifier "__ceilf" is undefined in device code
Solution:
In ROIAlign_cuda.cu and ROIPool_cuda.cu, append f to every ceil and floor call, i.e. change ceil to ceilf and floor to floorf.
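The rename can be illustrated with a small sketch. The helper cell_bounds, the Bounds struct, and the HOST_DEVICE macro are hypothetical and not taken from the repository; the point is only that floorf and ceilf are the single-precision functions and return the same values as floor and ceil for float inputs. The macro lets the same snippet compile as plain C++ and, presumably, under nvcc as device code.

```cpp
#include <math.h>

// Expands to the CUDA qualifiers under nvcc and to nothing in plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: integer cell indices on either side of a coordinate.
struct Bounds {
  int lo;
  int hi;
};

HOST_DEVICE inline Bounds cell_bounds(float x) {
  Bounds b;
  b.lo = static_cast<int>(floorf(x));  // was: floor(x)
  b.hi = static_cast<int>(ceilf(x));   // was: ceil(x)
  return b;
}
```

For example, cell_bounds(2.3f) gives lo = 2 and hi = 3, exactly what the floor and ceil versions would produce.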
A successful run looks like this: