While reproducing the code from a paper, I hit this error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
To track down the problem, I checked CUDA from the Python interpreter:
~$ python
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.12.0+cu102'
>>> torch.cuda.is_available()
True
>>> a = torch.tensor([1, 2])
>>> a = a.cuda()
lib/python3.7/site-packages/torch/cuda/__init__.py:146: UserWarning:
NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
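The warning means the GPU and the installed PyTorch build do not match: the RTX 3090 is an sm_86 device, while this cu102 build was only compiled for sm_37, sm_50, sm_60 and sm_70. PyTorch has to be reinstalled with a build that supports the card.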
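The same comparison can be made without waiting for the warning. Below is a minimal sketch, assuming the usual CUDA rule that a binary built for sm_XY runs on a device of the same major version with an equal or higher minor version. The helper names `capability_supported` and `check_current_device` are made up for this post; `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are the PyTorch calls that supply the two inputs.

```python
import re


def capability_supported(capability, arch_list):
    """Return True if a GPU with `capability` can run one of the compiled archs.

    capability: (major, minor) tuple, e.g. (8, 6) for sm_86.
    arch_list:  strings such as "sm_70"; anything that is not plain "sm_<digits>"
                (for example PTX entries like "compute_70") is ignored here.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)", arch)
        if match is None:
            continue
        sm = int(match.group(1))
        # Same major version, and built for an equal or lower minor version.
        if sm // 10 == major and sm % 10 <= minor:
            return True
    return False


def check_current_device():
    """Print the comparison for GPU 0 (needs PyTorch and a visible CUDA device)."""
    import torch

    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("compiled archs:", arch_list)
    print("supported:", capability_supported(capability, arch_list))
```

With the values from the warning above, `capability_supported((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"])` returns False, which matches what PyTorch reported.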
Following the hint, I looked up the matching version on the official site: https://pytorch.org/get-started/locally/
Then I installed the PyTorch build that matches the GPU:
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge
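After reinstalling, the error no longer appears.
The root cause was that I had set up the environment directly from requirements.txt without thinking about PyTorch/CUDA compatibility: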
pip install -r requirements.txt
So before setting up an environment, check requirements.txt first, in particular which PyTorch version it pins.
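A quick way to do that check is to list the PyTorch-related lines before running pip. This is only a rough sketch: the helper name `find_torch_pins` and the set of package names it looks for are my own choices, and it reads plain requirement lines without following `-r` includes.

```python
import re

# Packages whose versions have to match the CUDA setup (an assumed, minimal list).
TORCH_PACKAGES = ("torch", "torchvision", "torchaudio")

# Leading package name of a requirement line, up to the first version operator.
_NAME = re.compile(r"^([A-Za-z0-9_.\-]+)")


def find_torch_pins(lines):
    """Return the requirement lines that refer to torch, torchvision or torchaudio."""
    pins = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = _NAME.match(stripped)
        if match and match.group(1).lower() in TORCH_PACKAGES:
            pins.append(stripped)
    return pins
```

Usage: `with open("requirements.txt") as f: print(find_torch_pins(f))`. If torch shows up in the output, install the right CUDA build yourself first and remove or adjust those lines before running pip.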
Reference: https://blog.csdn.net/qq_43391414/article/details/110562749
2. Another small error came up (unrelated to the one above):
RuntimeError: The size of tensor a (1024) must match the size of tensor b (196) at non-singleton dimension 1
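The size of the input images did not match the image size specified in the configuration. After making the two consistent, the error went away.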
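The two numbers in the message hint at where the mismatch comes from. The arithmetic below is only an illustration under an assumption: the model splits the image into 16x16 patches, as ViT-style models commonly do. The notes above do not record the model or patch size, so treat the 16 and the helper `num_patches` as hypothetical.

```python
def num_patches(image_size, patch_size):
    """Number of patches for a square image cut into square, non-overlapping patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Under the assumed patch size of 16:
print(num_patches(224, 16))  # 14 * 14 patches
print(num_patches(512, 16))  # 32 * 32 patches
```

Under that assumption, 196 corresponds to a 224x224 image and 1024 to a 512x512 image, so a tensor built for one size cannot be added to a tensor built for the other.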
3. XIO: fatal IO error 25 (Inappropriate ioctl for device) on X server "localhost:10.0"
after 293032 requests (293032 known processed) with 12 events remaining.
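This was probably caused by using matplotlib in a job started with nohup: matplotlib was still trying to talk to the X server after it was no longer reachable.
Switching matplotlib to the non-interactive Agg backend, so that it never tries to display a plot, fixes it: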
import matplotlib
# Select the backend before pyplot is imported.
matplotlib.use('agg')
import matplotlib.pyplot as plt
Link: https://www.cnblogs.com/happystudyeveryday/p/13862510.html