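Fixing "CUDA error: no kernel image is available for execution on the device"

Problem description:

I moved the GLIP demo from a server with T4 GPUs to one with RTX 4090 GPUs. With the same Docker image and the same startup script, it unexpectedly failed with the error in the title.

Cause analysis:

The CUDA version in the environment is 11.0, and from what I found online, the lowest CUDA version that supports the 4090 is 11.8.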
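The version reasoning above can be written down as a small lookup. This is only a sketch: the table maps a GPU's compute capability to the first CUDA toolkit release that can compile natively for it, filled in from memory of NVIDIA's release notes, and the names `MIN_TOOLKIT` and `toolkit_supports` are made up for illustration. The T4 has compute capability 7.5 and the 4090 has 8.9.

```python
# Hypothetical helper, not part of any library.
# First CUDA toolkit release with native support for each compute capability
# (assumed from NVIDIA release notes; verify before relying on it).
MIN_TOOLKIT = {
    (7, 5): (10, 0),  # Turing, e.g. T4
    (8, 0): (11, 0),  # Ampere, e.g. A100
    (8, 6): (11, 1),  # Ampere, e.g. RTX 3090
    (8, 9): (11, 8),  # Ada, e.g. RTX 4090
}


def toolkit_supports(toolkit, capability):
    """Return True if `toolkit` (major, minor) can target `capability` natively."""
    return toolkit >= MIN_TOOLKIT[capability]


print(toolkit_supports((11, 0), (7, 5)))  # T4 with CUDA 11.0
print(toolkit_supports((11, 0), (8, 9)))  # 4090 with CUDA 11.0
print(toolkit_supports((11, 8), (8, 9)))  # 4090 with CUDA 11.8
```

Note that this only describes what the toolkit can compile for. Whether an already-built binary runs on a given GPU is a separate question.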

First attempt:

Upgrade CUDA by installing CUDA 11.8.

Open the NVIDIA website and find the install commands:

Official page: https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run

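In the installer, deselect the 'Driver' component (press Space to toggle it). Once the installation finishes, run nvcc -V to check that it succeeded.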
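The nvcc check can also be scripted, which is handy when comparing several containers. The following is a minimal sketch: the function names are hypothetical, and the sample line only mimics the format nvcc prints (its build number is illustrative).

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from the text printed by `nvcc -V`, or None."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_nvcc_release():
    """Run `nvcc -V` if nvcc is on PATH; return None when it is not found."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


# Sample line in the format nvcc prints; the build number is illustrative.
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(parse_nvcc_release(sample))
print(installed_nvcc_release())
```

If the second call still reports the old release, PATH probably still points at the previous toolkit directory.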

I started the service and, sadly, it still reported the same error.

Further analysis:

Check the torch version to see whether an incompatibility between torch and CUDA is the cause.

torch version in the environment: 1.9.1+cu102
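The suffix after the "+" in a torch version string names the CUDA version the wheel was built against, which can differ from the toolkit that nvcc reports. A small sketch for reading it follows; `bundled_cuda` is a hypothetical name, and the rule that the last digit is the minor version is an assumption that holds for tags such as cu102 and cu111.

```python
import re


def bundled_cuda(torch_version):
    """Return (major, minor) of the CUDA tag in a torch version string, or None."""
    match = re.search(r"\+cu(\d+)(\d)$", torch_version)
    if match is None:
        return None  # e.g. a CPU-only build or a version without a local tag
    return int(match.group(1)), int(match.group(2))


print(bundled_cuda("1.9.1+cu102"))
print(bundled_cuda("1.9.1+cu111"))
print(bundled_cuda("1.9.1+cpu"))
```

In a live environment the input would come from `torch.__version__`.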

Comparing against other Docker environments on this server that run training or inference without problems, I found that torch 1.9.1+cu111 works fine.
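One plausible explanation for the difference between the two builds is the set of GPU architectures each wheel was compiled for. The sketch below applies the general CUDA compatibility rules: binary code built for sm_XY runs on a GPU with the same major version and an equal or higher minor version, while PTX built for compute_XY can be JIT-compiled for any newer GPU. The function names are hypothetical, and the two architecture lists are assumptions for illustration, not values read from the author's environment. On a real machine, `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` supply the actual values.

```python
def parse_arch(tag):
    """Split a tag such as 'sm_86' or 'compute_80' into (kind, major, minor)."""
    kind, digits = tag.split("_")
    return kind, int(digits[:-1]), int(digits[-1])


def has_kernel_image(capability, arch_list):
    """Return True if some entry in arch_list can run on a GPU with `capability`."""
    major, minor = capability
    for tag in arch_list:
        kind, a_major, a_minor = parse_arch(tag)
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True  # binary code for the same major, equal or lower minor
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True  # PTX can be JIT-compiled for newer GPUs
    return False


# Assumed architecture lists for the two wheels (illustrative only).
CU102_ARCHS = ["sm_37", "sm_50", "sm_60", "sm_70"]
CU111_ARCHS = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86"]

T4 = (7, 5)
RTX_4090 = (8, 9)

print(has_kernel_image(T4, CU102_ARCHS))        # old GPU, old wheel
print(has_kernel_image(RTX_4090, CU102_ARCHS))  # new GPU, old wheel
print(has_kernel_image(RTX_4090, CU111_ARCHS))  # new GPU, cu111 wheel
```

Under these assumptions the cu102 build has nothing that can run on the 4090, which matches the "no kernel image" error, while the cu111 build does.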

Second attempt:

Switch torch to a cu111 build. The working environment used 1.9.1+cu111; the command I ran installs 1.9.0+cu111:

pip install torch==1.9.0+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html

Running pip install torch==1.9.0+cu111 on its own fails, because the +cu111 builds are not published on PyPI. The -f URL tells pip where to find them.

I started the demo again: no error, problem solved.

Final note

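It is possible that upgrading CUDA was unnecessary and that updating torch alone would have fixed the problem. That would be consistent with pip wheels of torch shipping their own CUDA runtime libraries, in which case the system toolkit matters mainly when compiling CUDA extensions. For lack of time, I did not verify this.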
