1. After a month of not touching my cloud server, I hit a big error (I was gutted: it had been working fine, and now it was dead):
torch._C._cuda_init()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:50
2. Looking for a fix:
First reference post:
https://blog.csdn.net/hangzuxi8764/article/details/86572093
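The advice there is practical. As the author says, you first need to find your NVIDIA driver version. I did not know the proper way to check it, so I just looked at the suffix of the nvidia folder under /usr/src; mine is 418.74.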
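For what it's worth, the folder lookup above can be scripted. This is only a minimal sketch: it assumes the driver sources sit in a folder named nvidia-<version> directly under /usr/src, as on my machine, and the helper name find_nvidia_versions is made up for illustration.

```python
import os
import re

# Assumption: driver sources live in a folder named "nvidia-<version>",
# e.g. /usr/src/nvidia-418.74. Other layouts will not match.
_NVIDIA_DIR = re.compile(r"^nvidia-(\d+(?:\.\d+)+)$")


def find_nvidia_versions(src_dir="/usr/src"):
    """Return the version suffixes of nvidia-* folders found in src_dir."""
    if not os.path.isdir(src_dir):
        return []
    versions = []
    for name in sorted(os.listdir(src_dir)):
        match = _NVIDIA_DIR.match(name)
        if match and os.path.isdir(os.path.join(src_dir, name)):
            versions.append(match.group(1))
    return versions


if __name__ == "__main__":
    print(find_nvidia_versions())
```

If the folder is /usr/src/nvidia-418.74, the function returns ['418.74'], which is the value to pass to -v in step2.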
step1: sudo apt-get install dkms
step2: sudo dkms install -m nvidia -v 418.74
But step1 itself failed with another error (I was close to losing it):
E: Failed to fetch store:/var/lib/apt/lists/partial/cn.archive.ubuntu.com_ubuntu_dists_xenial_universe_dep11_Components-amd64.yml.gz Hash Sum mismatch
E: Some index files failed to download. They have been ignored, or old ones used instead.
E: Encountered a section with no Package: header
E: Problem with MergeList /var/lib/apt/lists/cn.archive.ubuntu.com_ubuntu_dists_xenial_main_binary-amd64_Packages
E: The package lists or status file could not be parsed or opened.
So I kept looking:
Second reference post:
https://blog.csdn.net/yiluoak_47/article/details/31734505
sudo rm /var/lib/apt/lists/* -vf
sudo apt-get update
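This author is even more direct: just two steps, and they really do the job. The first step is the essential one, so follow the steps in order (I quietly tried skipping ahead and running them out of order, and that does not work). Once apt-get update succeeds, go back and run step1 and step2 from the first post.
Following both posts solved the problem. (It took a very long time. Before that I had read suggestions such as rebooting with sudo reboot, which did not help.)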
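Putting the two posts together, the order that worked for me can be sketched as a small script. This is an illustration rather than a tested tool: the helper run_in_order and its run parameter are my own invention, the version 418.74 is specific to my machine, and the commands need sudo rights. It stops at the first failing command, since skipping ahead is exactly what did not work.

```python
import subprocess

# Commands from the two posts, in the order that worked for me.
# 418.74 is my driver version; replace it with your own.
STEPS = [
    "sudo rm /var/lib/apt/lists/* -vf",
    "sudo apt-get update",
    "sudo apt-get install dkms",
    "sudo dkms install -m nvidia -v 418.74",
]


def run_in_order(steps, run=subprocess.run):
    """Run each command in turn; return the first one that fails, or None."""
    for cmd in steps:
        print("running:", cmd)
        # shell=True so that the * in the rm command is expanded by the shell.
        result = run(cmd, shell=True)
        if result.returncode != 0:
            return cmd
    return None
```

Calling run_in_order(STEPS) returns None when every command succeeds; otherwise it returns the command that failed, so you know where to pick up again.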
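3. Thanks to the authors of both posts. Here's hoping for fewer pitfalls in the future.
Finally, here is the Python code I used to test GPU access: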
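import torch
print(torch.cuda.is_available())              # True if PyTorch can use a GPU
if torch.cuda.is_available():                 # the calls below fail without a GPU
    print(torch.cuda.device_count())          # number of visible GPUs
    idx = torch.cuda.current_device()         # index of the current GPU
    print(torch.cuda.get_device_name(idx))    # older PyTorch requires the index
    print(idx)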