Multiple .cu files
nvcc -cubin -m64 -lcudadevrt -lcublas_device -gencode arch=compute_35,code=sm_35 -o test.cubin -c test.cu -dlink
You can also do that in two steps:
nvcc -m64 test.cu -gencode arch=compute_35,code=sm_35 -o test.o -dc
nvcc -dlink test.o -arch sm_35 -lcublas_device -lcudadevrt -cubin -o test.cubin
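With more than one .cu file, the compile-then-device-link form might generalize roughly as sketched below. This is only a dry run: the helper `print_build_cmds` and the file names `a.cu` and `b.cu` are hypothetical, and it prints the nvcc commands instead of running them. It assumes every source uses the same sm_35 flags as above, and that the device-link step accepts several objects in the same way it accepts the single test.o.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvcc): prints the two-step build
# for several .cu files without executing anything.
# Usage: print_build_cmds OUTPUT.cubin file1.cu file2.cu ...
print_build_cmds() {
    out=$1
    shift
    objs=""
    for src in "$@"; do
        obj="${src%.cu}.o"
        # Step 1: compile each source to relocatable device code (-dc)
        echo "nvcc -m64 $src -gencode arch=compute_35,code=sm_35 -o $obj -dc"
        objs="$objs $obj"
    done
    # Step 2: device-link all objects into a single cubin (assumed to
    # work with multiple objects as it does with one)
    echo "nvcc -dlink$objs -arch sm_35 -lcublas_device -lcudadevrt -cubin -o $out"
}

# Example with two hypothetical sources
print_build_cmds test.cubin a.cu b.cu
```

Piping the printed lines to `sh` would run them, which requires nvcc and the listed libraries to be installed.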