Using GPUs + RuntimeError: CUDA error: out of memory + in _lazy_init torch._C._cuda_init()

FakeOccupational

Last modified 2022-07-03 14:05:33

Category: Deep Learning. Tags: GPU

First published 2022-01-25 17:14:40

Article link: https://blog.csdn.net/ResumeProject/article/details/121569365

Checking GPU status

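nvidia-smi	Shows the GPU status; the lower part of the output lists the PIDs of the processes using each GPU
watch -n 1 nvidia-smi	Refreshes the nvidia-smi output every second
fuser -v /dev/nvidia*	Lists all processes currently running on the GPUs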

About the watch option: -n 1 sets the refresh interval to 1 second.
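The same check can be scripted. Below is a minimal sketch that asks nvidia-smi for machine-readable output and picks the GPU with the least memory in use. The helper names (`parse_gpu_rows`, `query_gpus`, `least_used_gpu`) are made up for this illustration, and the parser assumes every field is reported as a plain number.

```python
import subprocess

# nvidia-smi can print selected fields as CSV instead of the usual table.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse CSV rows of the form 'index, name, used_MiB, total_MiB'."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        parts = [field.strip() for field in line.split(",")]
        # The name is everything between the first field and the last two.
        name = ", ".join(parts[1:-2])
        gpus.append({
            "index": int(parts[0]),
            "name": name,
            "used_mib": int(parts[-2]),
            "total_mib": int(parts[-1]),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)


def least_used_gpu(gpus):
    """Return the index of the GPU with the least memory in use."""
    return min(gpus, key=lambda gpu: gpu["used_mib"])["index"]
```

On a machine with a driver installed, `least_used_gpu(query_gpus())` gives an index that can then be used for CUDA_VISIBLE_DEVICES.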

Usage

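# Device setup
import torch
use_cuda = torch.cuda.is_available()
device = torch.device('cuda' if use_cuda else 'cpu')

# In a bash script
export CUDA_VISIBLE_DEVICES=4

# In Python code (set these before the first CUDA call)
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # make the GPU indices seen by the program match the hardware order
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,3"  # comma-separated, no spaces
device = torch.device("cuda:0" if torch.cuda.is_available() and not args.no_cuda else "cpu")  # select the GPU device; args is the script's parsed command-line arguments
model = torch.nn.DataParallel(model, device_ids=[0, 1, 2])  # GPUs used for multi-GPU training; the visible GPUs 0, 1, 3 are renumbered 0, 1, 2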
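One detail that is easy to miss: once CUDA_VISIBLE_DEVICES is set, the program sees the listed GPUs renumbered from 0, so the indices passed to `torch.device` or `DataParallel` are logical indices, not physical ones. The sketch below only illustrates that renumbering. `visible_device_map` is a hypothetical helper, and it handles plain integer IDs only (the variable also accepts GPU UUIDs, which are ignored here).

```python
def visible_device_map(cuda_visible_devices):
    """Map logical index (seen by the program) -> physical GPU index.

    Simplified: assumes a comma-separated list of integer IDs.
    """
    physical = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    return dict(enumerate(physical))


mapping = visible_device_map("0,1,3")
device_ids = list(mapping)  # logical indices to hand to DataParallel
print(mapping)     # {0: 0, 1: 1, 2: 3}
print(device_ids)  # [0, 1, 2]
```

So with physical GPUs 0, 1 and 3 visible, "cuda:2" inside the program refers to physical GPU 3.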

RuntimeError: CUDA error: out of memory

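Situation	Solution
Not enough memory	Reduce the batch size
GPU memory not released	Call torch.cuda.empty_cache() several times, or
torch and CUDA versions do not match	Check with print(torch.cuda.is_available())
GCC: the GCC version on the GPU machine is older than the minimum version PyTorch supports	See the version compatibility table for CUDA, NVIDIA Driver, Linux, and GCC
The GPU specified by the program is already in use	Change the GPU as described in the Usage section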
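The first two rows of the table can be combined into a retry loop: when a step fails with an out-of-memory error, release cached memory and try again with a smaller batch. This is only a sketch under assumptions. `run_with_smaller_batches`, `step` and `on_oom` are hypothetical names, and the error is recognised by looking for "out of memory" in the message, as in the RuntimeError above.

```python
def run_with_smaller_batches(step, batch_size, min_batch_size=1, on_oom=None):
    """Call step(batch_size), halving batch_size after each out-of-memory error.

    step:   hypothetical callable that runs one training/inference step.
    on_oom: optional cleanup hook, e.g. torch.cuda.empty_cache.
    Returns (batch_size_that_worked, result_of_step).
    """
    while True:
        try:
            return batch_size, step(batch_size)
        except RuntimeError as err:
            # Re-raise anything that is not an OOM, or if we cannot shrink further.
            if "out of memory" not in str(err) or batch_size <= min_batch_size:
                raise
            if on_oom is not None:
                on_oom()
            batch_size = max(min_batch_size, batch_size // 2)
```

With PyTorch this could be called as `run_with_smaller_batches(train_step, 64, on_oom=torch.cuda.empty_cache)`, where `train_step` is your own function.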

File "envs\torch\lib\site-packages\torch\cuda\__init__.py", line 214, in _lazy_init torch._C._cuda_init()

Cause: the GPU setting was wrong. In my case the machine had no GPU with the index I had specified, so I made the following change:

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
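To avoid hard-coding an index that does not exist, the requested index can be checked against the number of GPUs first. A minimal sketch follows. `choose_gpu_index` is a hypothetical helper, and falling back to GPU 0 is my assumption about the desired behaviour.

```python
def choose_gpu_index(requested, device_count):
    """Return a valid GPU index, or None when no GPU is available.

    requested:    index the user asked for.
    device_count: number of GPUs the program can see.
    """
    if device_count <= 0:
        return None  # no GPU: the caller should fall back to the CPU
    if 0 <= requested < device_count:
        return requested
    return 0  # requested index does not exist: fall back to the first GPU


print(choose_gpu_index(4, 1))  # 0
print(choose_gpu_index(1, 4))  # 1
```

In a PyTorch script, `device_count` could come from `torch.cuda.device_count()`.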
