Fix: a CUDA program compiled with nvcc on the command line does not run its device (GPU) code

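Problem description

I ran into a problem when compiling CUDA code, originally written in Visual Studio 2019, with nvcc on the command line: a single-file build did not run the CUDA (GPU) part of the code. The problem can be reproduced as follows.

When compiling a .cu file on the command line, suppose the code is: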

[Image unavailable: screenshot of the source code in cuda.cu]
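Because the screenshot of the original code is not available, here is a minimal stand-in, assuming a program that prints hello on the host and then launches a kernel that prints from the device. The kernel name hello_from_gpu and the launch configuration are hypothetical, not taken from the original post.

```cuda
#include <cstdio>

// Hypothetical kernel: every GPU thread prints one line.
__global__ void hello_from_gpu() {
    printf("hello from device, thread %d\n", threadIdx.x);
}

int main() {
    printf("hello\n");            // host-side output
    hello_from_gpu<<<1, 4>>>();   // launch 1 block of 4 threads
    cudaDeviceSynchronize();      // wait for the kernel so device printf is flushed
    return 0;
}
```

If the device code does not run, only the host-side line is printed.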

Compile it with the following command:

nvcc .\cuda.cu -o cuda
	input file to compile   output file name

The build produces the following files:

[Image unavailable: files generated by the build]

Run cuda.exe:

[Image unavailable: output of cuda.exe]

Only hello is printed; the code on the device is not executed. In my tests, every CUDA-related call misbehaved. Checking the nvcc help text revealed the cause.
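A kernel launch returns no status by itself, so a failed launch passes silently unless the program asks the runtime for the error. The sketch below shows one way to surface such a failure. The kernel name noop_kernel is hypothetical, and the exact error reported on a given machine is not known from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel, used only to exercise a launch.
__global__ void noop_kernel() {}

int main() {
    noop_kernel<<<1, 1>>>();

    // Errors detected at launch time (for example, no usable device code).
    cudaError_t launch_err = cudaGetLastError();
    if (launch_err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(launch_err));
        return 1;
    }

    // Errors that only appear while the kernel is executing.
    cudaError_t sync_err = cudaDeviceSynchronize();
    if (sync_err != cudaSuccess) {
        fprintf(stderr, "execution failed: %s\n", cudaGetErrorString(sync_err));
        return 1;
    }

    printf("kernel ran successfully\n");
    return 0;
}
```

Adding checks like these to a program that prints only its host output usually gives a concrete error message to work from.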

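Solution

When compiling the .cu file, you need to specify the name of the class of NVIDIA "virtual" GPU architecture for which the CUDA input files must be compiled.

The relevant part of the nvcc --help output: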

--gpu-architecture <arch>                  (-arch)                              
        Specify the name of the class of NVIDIA 'virtual' GPU architecture for which
        the CUDA input files must be compiled.
        With the exception as described for the shorthand below, the architecture
        specified with this option must be a 'virtual' architecture (such as compute_50).
        Normally, this option alone does not trigger assembly of the generated PTX
        for a 'real' architecture (that is the role of nvcc option '--gpu-code',
        see below); rather, its purpose is to control preprocessing and compilation
        of the input to PTX.
        For convenience, in case of simple nvcc compilations, the following shorthand
        is supported.  If no value for option '--gpu-code' is specified, then the
        value of this option defaults to the value of '--gpu-architecture'.  In this
        situation, as only exception to the description above, the value specified
        for '--gpu-architecture' may be a 'real' architecture (such as a sm_50),
        in which case nvcc uses the specified 'real' architecture and its closest
        'virtual' architecture as effective architecture values.  For example, 'nvcc
        --gpu-architecture=sm_50' is equivalent to 'nvcc --gpu-architecture=compute_50
        --gpu-code=sm_50,compute_50'.
        -arch=all         build for all supported architectures (sm_*), and add PTX
        for the highest major architecture to the generated code.
        -arch=all-major   build for just supported major versions (sm_*0), plus the
        earliest supported, and add PTX for the highest major architecture to the
        generated code.
        Note: -arch=all, -arch=all-major cannot be used with the -code option, but
        can be used with -gencode options
        Note: the values compute_30, compute_32, compute_35, compute_37, compute_50,
        sm_30, sm_32, sm_35, sm_37 and sm_50 are deprecated and may be removed in
        a future release.
        Allowed values for this option:  'all','all-major','compute_35','compute_37',
        'compute_50','compute_52','compute_53','compute_60','compute_61','compute_62',
        'compute_70','compute_72','compute_75','compute_80','compute_86','compute_87',
        'lto_35','lto_37','lto_50','lto_52','lto_53','lto_60','lto_61','lto_62',
        'lto_70','lto_72','lto_75','lto_80','lto_86','lto_87','sm_35','sm_37','sm_50',
        'sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72','sm_75','sm_80',
        'sm_86','sm_87'.

This corresponds to the following setting in the Visual Studio 2019 project configuration:

[Image unavailable: the corresponding setting in the Visual Studio 2019 project properties]

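This value is your GPU's compute capability, which you can look up in the list of GPU compute capabilities. For example, my GPU is a GeForce GTX 960M, whose compute capability is 5.0, so the value is written as sm_50 or compute_50.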
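As an alternative to looking the value up in a table, the compute capability can be queried from the CUDA runtime. The following is a minimal sketch using cudaGetDeviceCount and cudaGetDeviceProperties. It launches no kernels, so it should run even when built without an -arch option.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < device_count; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) {
            continue;  // skip devices whose properties cannot be read
        }
        // major.minor is the compute capability, e.g. 5.0 maps to sm_50 / compute_50.
        printf("device %d: %s -> sm_%d%d / compute_%d%d\n",
               i, prop.name, prop.major, prop.minor, prop.major, prop.minor);
    }
    return 0;
}
```

The printed sm_XY or compute_XY value is what goes after -arch, provided the installed nvcc lists it among its allowed values.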

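From the folder that contains the file, compile with the following command:

nvcc .\cuda.cu -o cuda -arch compute_50 -Wno-deprecated-gpu-targets
	input file    output file    GPU architecture    suppresses the deprecation warning for compute_50 (support for compute_50 may be removed in a future release; sm_50 can also be used here)

Run cuda.exe from the command line. The output is as follows: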

[Image unavailable: output of cuda.exe after recompiling]

Success! Other, more complex programs also run without problems.
