To speed up execution in a large project, you may want to offload some of the work to CUDA. CUDA device code is written in C and a subset of C++, so rewriting the whole project in CUDA is usually not practical. Instead, you can write a host wrapper function in a .cu file that launches the kernel (the device function), and call that wrapper from the rest of your code. Here is the procedure.
1) Include this line in the file where you need to call the CUDA function:
extern void wrapperfunction(void);
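Instead of void, you can pass any arguments and return any value you wish, because to the caller it is an ordinary function. You can call it from any part of your C++ code. If the caller is plain C, declare and define the wrapper as extern "C", because nvcc compiles .cu files as C++ and would otherwise mangle the function name.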
2) Create another file named wrapper.cu containing these lines:
#include <stdio.h>

__global__ void kernel(int *a, int *b, int *c) {
    int tid = threadIdx.x;  // thread index within its block
    // your code for the kernel
}

void wrapperfunction(void) {
    int *a = NULL, *b = NULL, *c = NULL;  // device pointers
    // your code for initialization and for copying data to device memory
    kernel<<<32, 32>>>(a, b, c);  // kernel call: 32 blocks of 32 threads
    // your code for copying the result back to host memory, then return
}
3) Compile this file with the following command:
nvcc -c wrapper.cu
4) Link the resulting object file when building your C/C++ project. List the libraries after the object files so the linker can resolve the CUDA symbols that wrapper.o uses:
g++ -o program main.cpp wrapper.o -L/usr/local/cuda/lib64 -lcudart
NOTE: You don’t need a PC with a GPU to compile this program; an installed CUDA environment is enough. To execute it, you need a PC with a GPU. Also, if you are on a 64-bit machine, don’t forget to link against the 64-bit library (lib64)!
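Step 1 mentions that the wrapper can take arguments and return a value. As a rough sketch of what such a declaration could look like, the block below uses a hypothetical addVectors wrapper; the name, the signature and the CPU stand-in body are assumptions for illustration only, not part of the procedure above. The extern "C" guard gives the symbol C linkage, so both C and C++ callers can link against it. In a real project the definition would live in wrapper.cu and would do the device copies and the kernel launch.

```cpp
#include <stddef.h>

// Declaration as it might appear in a shared header (hypothetical name and signature).
// The guard gives the symbol C linkage so that C and C++ callers can both link to it.
#ifdef __cplusplus
extern "C" {
#endif
int addVectors(const int *a, const int *b, int *c, size_t n);
#ifdef __cplusplus
}
#endif

// CPU stand-in for the definition that would normally sit in wrapper.cu.
// It only mimics the interface: returns 0 on success, -1 on bad arguments.
extern "C" int addVectors(const int *a, const int *b, int *c, size_t n) {
    if (a == nullptr || b == nullptr || c == nullptr) return -1;
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];  // a kernel would compute one element per thread
    }
    return 0;
}
```

A caller would pass ordinary host arrays, for example addVectors(a, b, c, 3), and check the return code; it does not need to know whether the work ran on the GPU.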
ALTERNATIVES:
If you want your executable to run on PCs both with and without a GPU, follow this procedure. Create another host function, configureCudaDevice(), in the wrapper.cu file (with its prototype in the header) that queries and configures the GPU devices present and returns true if it succeeds, false if not. If you use many wrapper functions, you can create a header file that contains the extern declarations of all of them.
Put this in “myCudaImplementations.h”:
extern void runCudaImplementation(void); // wrapper function
extern bool configureCudaDevice(); // own function that checks whether a CUDA-enabled device is present
Include this code in the main C/C++ implementation:
#include "myCudaImplementations.h"

// at app initialization
// store this variable somewhere you can access it later
bool deviceConfigured = configureCudaDevice();
...
// then later, at run time
if (deviceConfigured)
    runCudaImplementation();  // run wrapper function for CUDA code
else
    runCpuImplementation();   // run the original code
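To try the fallback logic on a machine without the CUDA toolkit, the dispatch can be exercised with stand-ins. This is only a minimal sketch: fakeGpuPresent, lastPath and runBestImplementation are hypothetical names introduced here, and the three stub bodies replace the real functions that would come from wrapper.cu and the original CPU code.

```cpp
#include <string>

// Stand-ins so the sketch builds without the CUDA toolkit (all hypothetical).
// In the real project configureCudaDevice() and runCudaImplementation()
// are defined in wrapper.cu.
bool fakeGpuPresent = false;  // set to true to simulate a machine with a GPU
std::string lastPath;         // records which implementation ran last

bool configureCudaDevice() { return fakeGpuPresent; }
void runCudaImplementation() { lastPath = "cuda"; }
void runCpuImplementation() { lastPath = "cpu"; }

// Dispatch on the result of the one-time device probe.
void runBestImplementation(bool deviceConfigured) {
    if (deviceConfigured)
        runCudaImplementation();  // wrapper around the CUDA code
    else
        runCpuImplementation();   // original CPU code
}
```

Passing the probe result in as a parameter keeps the dispatch free of global state, so both branches can be checked by calling it once with false and once with true.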
Here is my makefile:
all: program

program: main.cpp wrapper.o
	g++ -o program main.cpp wrapper.o -L/usr/local/cuda/lib64 -lcudart

wrapper.o: wrapper.cu
	nvcc -c wrapper.cu

clean:
	rm -f *.o program
http://techbird.wordpress.com/2012/07/30/calling-cuda-program-from-cc-project/