CUDA function pointers

I was trying to do something like this in CUDA (I actually need to write some integration functions).

I tried the code below, but it did not work. It only produced this error:

Error: Function pointers and function template parameters are not supported in sm_1x.

float f1(float x) {
    return x;
}

__global__ void tabulate(float lower, float upper, float p_function(float), float* result) {
    for (lower; lower < upper; lower++) {
        *result = *result + p_function(lower);
    }
}

int main() {
    float res;
    float* dev_res;

    cudaMalloc((void**)&dev_res, sizeof(float));
    tabulate<<<1,1>>>(0.0, 5.0, f1, dev_res);
    cudaMemcpy(&res, dev_res, sizeof(float), cudaMemcpyDeviceToHost);
    printf("%f\n", res);

    /************************************************************************/
    scanf("%s");
    return 0;
}

Solution

To get rid of your compile error, you'll have to use -gencode arch=compute_20,code=sm_20 as a compiler argument when compiling your code. But then you'll likely have some runtime problems:

Function pointers to __global__ functions are supported in host code, but not in device code.

Function pointers to __device__ functions are only supported in device code compiled for devices of compute capability 2.x and higher.

It is not allowed to take the address of a __device__ function in host code.

So you can do something like this (adapted from the "FunctionPointers" sample):

// Your function pointer type: returns unsigned char, takes an unsigned char and a float
typedef unsigned char (*pointFunction_t)(unsigned char, float);

// Some device function to be pointed to
__device__ unsigned char Threshold(unsigned char in, float thresh)
{
    ...
}

// pComputeThreshold is a device-side function pointer to your __device__ function
__device__ pointFunction_t pComputeThreshold = Threshold;

// The host-side copy of the pointer to your __device__ function
pointFunction_t h_pointFunction;

// In host code: copy the device-side function pointer into its host-side copy
cudaMemcpyFromSymbol(&h_pointFunction, pComputeThreshold, sizeof(pointFunction_t));

You can then pass the h_pointFunction as a parameter to your kernel, which can use it to call your __device__ function.

// Your kernel, taking your __device__ function pointer as a parameter
__global__ void kernel(pointFunction_t pPointOperation)
{
    unsigned char tmp;
    ...
    tmp = (*pPointOperation)(tmp, 150.0f);
    ...
}

// Invoke the kernel in host code, passing in your host-side __device__ function pointer
kernel<<<...>>>(h_pointFunction);

Hopefully that makes some sense. In short, you would have to change your f1 function to be a __device__ function and follow a similar procedure (the typedefs aren't necessary, but they do make the code nicer) to obtain a valid function pointer on the host side that you can pass to your kernel. I'd also advise looking over the FunctionPointers CUDA sample.
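As a rough illustration, here is how that procedure might look when applied to the tabulate code from the question. This is a minimal sketch, not a tested drop-in fix: the names integrand_t, d_f1_ptr, and run_tabulate are made up for this example, and it assumes a device and toolkit that support device function pointers (compute capability 2.x or higher). It also accumulates into a local variable and writes the result once, because the original kernel read *result before anything had initialized it.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Function pointer type for the integrand (hypothetical name).
typedef float (*integrand_t)(float);

// The function from the question, now marked __device__.
__device__ float f1(float x) {
    return x;
}

// Device-side variable holding the address of f1. The address is
// taken in device code, which is where that is allowed.
__device__ integrand_t d_f1_ptr = f1;

// Single-thread kernel as in the question: sums p_function at unit steps
// over [lower, upper) and stores the total in *result.
__global__ void tabulate(float lower, float upper,
                         integrand_t p_function, float *result) {
    float sum = 0.0f;
    for (float x = lower; x < upper; x += 1.0f) {
        sum += p_function(x);
    }
    *result = sum;
}

// Hypothetical host helper: fetches the device function pointer, launches
// the kernel, and returns the sum (or NAN if any CUDA call fails).
float run_tabulate(float lower, float upper) {
    integrand_t h_f1_ptr = nullptr;
    float *dev_res = nullptr;
    float res = 0.0f;

    // Copy the value of the device-side pointer variable to the host.
    cudaError_t err = cudaMemcpyFromSymbol(&h_f1_ptr, d_f1_ptr,
                                           sizeof(integrand_t));
    if (err == cudaSuccess) {
        err = cudaMalloc((void **)&dev_res, sizeof(float));
    }
    if (err == cudaSuccess) {
        tabulate<<<1, 1>>>(lower, upper, h_f1_ptr, dev_res);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(&res, dev_res, sizeof(float),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(dev_res);  // freeing a null pointer is a no-op

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return NAN;
    }
    return res;
}

int main() {
    printf("%f\n", run_tabulate(0.0f, 5.0f));
    return 0;
}
```

With f1(x) = x and unit steps over [0, 5), the kernel adds 0 + 1 + 2 + 3 + 4, so a successful run should report a sum of 10. The pointer value copied back to the host is only meaningful on the device; calling it from host code is not valid.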
