[Beginner] Fixing torch.cuda.OutOfMemoryError: CUDA out of memory when reproducing the NeRCo code

Latest recommended article published on 2024-04-12 10:51:51

好烦啊想摆了

Tags: python, image processing, object detection, computer vision

Article link: https://blog.csdn.net/qq_43612410/article/details/132857362

Original code:

NeRCo

Problem description

While reproducing the NeRCo code (ICCV 2023), I ran into the following error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.16 GiB (GPU 0; 14.58 GiB total capacity; 9.41 GiB already allocated; 1.32 GiB free; 12.25 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My server has four Tesla T4 GPUs with 16 GB each, whereas the authors appear to have used a V100 with 32 GB.
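The figures in the error message can be compared directly. The following is a minimal sketch that extracts them from the message text quoted above and checks whether the request could ever fit on the card. The helper name parse_oom_message and its regular expressions are illustrative only; they are not part of PyTorch or NeRCo, and the "best case" figure is a rough upper bound rather than an exact accounting.

```python
import re

# Shortened copy of the error message quoted above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 26.16 GiB "
    "(GPU 0; 14.58 GiB total capacity; 9.41 GiB already allocated; "
    "1.32 GiB free; 12.25 GiB reserved in total by PyTorch)"
)


def parse_oom_message(message):
    """Return the GiB figures found in a PyTorch OOM message as a dict."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "capacity": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {
        key: float(re.search(pattern, message).group(1))
        for key, pattern in patterns.items()
    }


stats = parse_oom_message(MESSAGE)

# Memory PyTorch holds in its cache but is not currently using.
cached_unused = stats["reserved"] - stats["allocated"]
# Rough best case if all of that cache were handed back to the driver.
best_case_free = stats["free"] + cached_unused
# Could the request fit even on a completely empty card?
fits_on_card = stats["requested"] <= stats["capacity"]

print(f"cached but unused: {cached_unused:.2f} GiB")
print(f"best-case free:    {best_case_free:.2f} GiB")
print(f"request fits on an empty card: {fits_on_card}")
```

Read this way, the single 26.16 GiB request is larger than the whole 14.58 GiB card, so the input itself has to get smaller before the allocation can succeed.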

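Cause analysis:

I looked up many of the fixes suggested online.

For example:
1. Check whether the model or the batch size is too large; if so, slim down the model or reduce the batch size.
2. If the cached memory is too high, call torch.cuda.empty_cache() just before the failing code to release the cache.
3. Newer PyTorch versions may manage GPU memory more efficiently, so upgrading PyTorch can help. In particular, if the same code runs fine elsewhere on the same GPU model, suspect a Python package version mismatch; "pip list" or "conda list" shows the installed versions.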

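For 1, I tried to reduce the batch size, but the default batch size in the original code is already 1, so it cannot go any lower.
For 2, I added torch.cuda.empty_cache() in NeRCo_model.py right before the failing line, self.pre_A = self.netPre(self.real_A), but the error persisted.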

    def forward(self):
        """Run forward pass; called by both functions <optimize_parameters> and <test>."""
        torch.cuda.empty_cache()
        self.pre_A = self.netPre(self.real_A)
        self.H, self.mask = self.netH(self.real_A)

        temp = torch.cat((self.real_A, self.pre_A), 1)
        self.fake_B = self.netG_A(temp * self.mask)  # G_A(A)
        self.rec_A = self.netG_B(self.fake_B)   # G_B(G_A(A))

        self.fake_A = self.netG_B(self.real_B * self.mask)  # G_B(B)
        self.pre_A1 = self.netPre(self.fake_A)
        temp = torch.cat((self.fake_A, self.pre_A1), 1)
        self.rec_B = self.netG_A(temp)   # G_A(G_B(B))

For 3, upgrading PyTorch had no effect either.
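To check whether cached memory is really the issue, the allocator counters can be printed directly. This is a minimal sketch, assuming a recent PyTorch build. The helper names to_gib and cuda_memory_report are hypothetical; torch.cuda.get_device_properties, torch.cuda.memory_allocated and torch.cuda.memory_reserved are existing PyTorch calls.

```python
def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024 ** 3


def cuda_memory_report(device=0):
    """Return total/allocated/reserved GiB for one GPU, or None if unavailable."""
    try:
        import torch  # imported lazily so to_gib works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    total = torch.cuda.get_device_properties(device).total_memory
    allocated = torch.cuda.memory_allocated(device)  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved(device)    # bytes held by the caching allocator
    return {
        "total_gib": to_gib(total),
        "allocated_gib": to_gib(allocated),
        "reserved_gib": to_gib(reserved),
        # This gap is the most that emptying the cache could give back.
        "cached_unused_gib": to_gib(reserved - allocated),
    }


report = cuda_memory_report()
print(report if report is not None else "No CUDA device (or PyTorch) available")
```

Calling cuda_memory_report() just before the failing line would show how much memory is only cached; if that gap is small compared with the requested allocation, emptying the cache cannot be expected to help.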

Solution:

After working through all of the above, I happened to notice this option in options/base_options.py:

parser.add_argument('--preprocess', type=str, default='none', help='scaling and cropping of images at load time [resize_and_crop | crop | scale_width | scale_width_and_crop | none]')

It controls how images are scaled and cropped at load time. Running python test.py --preprocess=scale_width then let the test run through.
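To see why shrinking the input helps, here is a simplified sketch of width-based scaling. It rests on an assumption about what scale_width roughly does (resize to a target width while keeping the aspect ratio). The function names scale_to_width and pixel_ratio, the image size and the target width are all hypothetical, and the actual NeRCo preprocessing may differ in details such as rounding or a minimum height.

```python
def scale_to_width(width, height, target_width):
    """Return (new_width, new_height) after scaling to target_width, keeping aspect ratio."""
    new_height = max(1, round(height * target_width / width))
    return target_width, new_height


def pixel_ratio(original, scaled):
    """Fraction of the original pixel count that remains after scaling."""
    return (scaled[0] * scaled[1]) / (original[0] * original[1])


original = (6000, 4000)                                # hypothetical full-resolution input
scaled = scale_to_width(*original, target_width=600)   # hypothetical target width
ratio = pixel_ratio(original, scaled)

print(f"scaled size: {scaled}")
print(f"remaining fraction of pixels: {ratio}")
```

In this made-up case the image keeps one percent of its pixels. Since the feature maps of a convolutional network shrink roughly in step with the input resolution, a smaller input needs correspondingly less GPU memory, at the cost of output resolution.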

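This is only a stopgap, though. I still have not figured out how to make the code use several GPUs together: prefixing the command with CUDA_VISIBLE_DEVICES=0,1,2,3 still produces the out-of-memory error.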
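One possible explanation, stated as an assumption because it depends on how NeRCo sets up its devices: CUDA_VISIBLE_DEVICES only controls which GPUs the process can see, and data-parallel wrappers such as torch.nn.DataParallel split the input along the batch dimension. The pure-Python sketch below shows how many samples each GPU would receive under that kind of split. The function chunk_sizes is illustrative and written to mirror the chunking rule of torch.chunk; it is not taken from the NeRCo code.

```python
import math


def chunk_sizes(batch_size, num_gpus):
    """Sizes of the per-GPU chunks when a batch is split along dimension 0."""
    if batch_size <= 0:
        return []
    per_chunk = math.ceil(batch_size / num_gpus)  # largest chunk any GPU gets
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(per_chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


print(chunk_sizes(8, 4))  # a batch of 8 spread over 4 GPUs
print(chunk_sizes(1, 4))  # a batch of 1: only one chunk exists
```

With a batch size of 1 there is only one chunk, so a single GPU would still have to hold the whole forward pass for that image while the other cards stay idle. Under this assumption, making more GPUs visible cannot reduce the memory needed per image.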

I hope someone more experienced can clear this up for me.
