CUDA:Out Of Memory问题

z290048663

已于 2024-03-14 15:31:48 修改

阅读量3k

点赞数 1

文章标签： python 开发语言

于 2023-02-09 11:50:10 首次发布

版权声明：本文为博主原创文章，遵循 CC 4.0 BY-SA 版权协议，转载请附上原文出处链接和本声明。

本文链接：https://blog.csdn.net/z290048663/article/details/128950092

版权

文章讲述了在运行Real-ESRGAN模型时遇到CUDA内存不足（OOM）的问题，以及如何通过设置环境变量PYTORCH_CUDA_ALLOC_CONF和调整模型命令中的--tile参数来解决。具体解决方案包括在.bashrc中永久设置PYTORCH_CUDA_ALLOC_CONF为512，或者临时通过export命令设置，以及根据模型调整--tile的大小以减少内存碎片化。

摘要由CSDN通过智能技术生成

序

有不懂，请评论

我在其他社区的精简发帖：

I've edited my comment here，and you guys can also see a more clear version of reply there.https://discuss.pytorch.org/t/pytorch-cuda-alloc-conf/165376/7I understand the meaning of this command (PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:516), but where do you actually write it? In jupyter notebook? In command prompt?https://discuss.pytorch.org/t/pytorch-cuda-alloc-conf/165376/7

学习建议：用英文检索，bing\google
方法：将报错内容理解，然后键入英文搜索
适用范围：Real-ESRGAN模型（百分百可用）

搜索思路：

问题描述：If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF If you encounter CUDA out of memory, try to set --tile with a smaller number.
环境：

ubuntu(按量付费云服务器

3090RTX 24GB显存

问题来源：

运行python inference_realesrgan.py -n RealESRGAN_x4plus -i 2-8 -o output_2_8

超分图像，出现OOM问题，

我成功的操作：

step 1 设置PYTORCH_CUDA_ALLOC_CONF为512

至于该PYTORCH_CUDA_ALLOC_CONF值具体设置范围，请参见我列出的Reference

方法一：设置ubuntu环境变量（操作繁琐，但是能永久使用）：即，设置 ~./bashrc，详细操作看如下链接。

How to Set Environment Variables in Linux {Step-by-Step Guide}

方法二：也可以export（操作简单，但是只能一次性使用）：

直接在terminal，敲

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

Reference：

PYTORCH_CUDA_ALLOC_CONF

Cuda Reserve Memory - Memory Format - PyTorch Forums

https://pytorch.org/docs/stable/notes/cuda.html#memory-management

step 2 键入命令，设置--tile

由于我是使用RealESRGAN模型，我需要在运行模型命令这里，末尾加上参数设置，即

--tile 50

该命令的含义是：表示设置的每次tile大小

干净且完整的命令是：

python inference_realesrgan.py -n RealESRGAN_x4plus  --tile 50

原始链接提到设置的tile可用大小：

“

decrease the --tile

such as --tile 800 or smaller than 800

”

Reference：

CUDA out of memory · Issue #101 · xinntao/Real-ESRGAN · GitHub

结果

就能运行了，原理的话，自行看链接，慢慢看，一定能搞懂的。

rendering - Cycles / CUDA Error: Out of Memory - Blender Stack Exchange

题外话

关于step 1 设置PYTORCH_CUDA_ALLOC_CONF的大小原理

其他可能对你们有用的中文链接

首先，我先看了一个国内链接，试了试，结果针对我的电脑，是失败的；通过设置PYTORCH_CUDA_ALLOC_CONF中的max_split_size_mb解决Pytorch的显存碎片化导致的CUDA:Out Of Memory问题_梦音Yune的博客-CSDN博客

关注

1
点赞
踩
8

收藏

觉得还不错? 一键收藏
4
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

评论 4

被折叠的条评论为什么被折叠?

到【灌水乐园】发言

查看更多评论

添加红包

成就一亿技术人!

hope_wisdom

发出的红包

实付元

使用余额支付

点击重新获取

扫码支付

钱包余额 0

抵扣说明：

1.余额是钱包充值的虚拟货币，按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载，可以购买VIP、付费专栏及课程。