CUDA out of memory

Latest recommended article published 2024-06-30 23:24:39

东方老司机

Tags: python, pytorch

Article link: https://blog.csdn.net/u011732358/article/details/131484008


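A quick note for the record.
While testing a model I ran into CUDA out of memory, which was puzzling at first.
It turned out that some data on the GPU was never being released. It accumulated every time the test code was called, and over time that led to CUDA out of memory.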
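One way to confirm this kind of per-call growth is to read the allocator's counter after each call. The following is only a sketch: `allocated_mib` and `watch` are hypothetical helper names, not part of the original code, and on a machine without CUDA the readings are simply 0.0.

```python
import torch


def allocated_mib():
    """GPU memory currently held by tensors, in MiB (0.0 if CUDA is unavailable)."""
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.memory_allocated() / 2**20


def watch(fn, calls=5):
    """Call fn repeatedly and record the allocated memory after each call."""
    readings = []
    for _ in range(calls):
        fn()
        readings.append(allocated_mib())
    return readings


# Replace the lambda with your own test function (assumed to take no arguments).
readings = watch(lambda: None, calls=3)
print(readings)
```

If the readings climb steadily across calls to the same test function, something is still holding references to GPU tensors between calls.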

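PyTorch's hook mechanism can cause GPU memory to blow up. Once a hook function has pulled out a layer's inputs, outputs, or weights, storing or modifying them is risky: the stored tensors stay referenced, and as long as the hook itself is never removed it keeps running and keeps them alive, so PyTorch cannot reclaim them. The load keeps growing until GPU memory runs out.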

The code contained:

outputs = []
def hook(module, input, output):
    outputs.append(output)
model.layer1[-1].register_forward_hook(hook)
model.layer2[-1].register_forward_hook(hook)
model.layer3[-1].register_forward_hook(hook)
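To see how this pattern accumulates, here is a minimal CPU-only sketch. It assumes the test code registers the hook on every call, which is my reading of the symptom rather than something the original code shows. The toy `nn.Sequential` model, `run_test_leaky`, and `all_lists` are hypothetical names used only for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model; any nn.Module behaves the same way.
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))

# Only for inspection. In real code the hook closure alone keeps each list alive.
all_lists = []


def run_test_leaky(model, x):
    """Mimics test code that registers a hook on each call and never removes it."""
    outputs = []
    all_lists.append(outputs)

    def hook(module, input, output):
        outputs.append(output)

    # The returned handle is discarded, so the hook stays registered on the module.
    model[0].register_forward_hook(hook)
    with torch.no_grad():
        model(x)
    return outputs


x = torch.randn(3, 4)
for _ in range(3):
    run_test_leaky(model, x)

# Hooks from earlier calls fire again on every later forward pass.
sizes = [len(lst) for lst in all_lists]
print(sizes)  # [3, 2, 1]
```

The list from the first call has grown to three entries even though its caller returned long ago. With real activations on the GPU, each of those entries is memory that cannot be freed.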

So the hook mechanism was the culprit.
Once you are done with the hooks, just call remove() on their handles:

outputs = []
def hook(module, input, output):
    outputs.append(output)
h1 = model.layer1[-1].register_forward_hook(hook)
h2 = model.layer2[-1].register_forward_hook(hook)
h3 = model.layer3[-1].register_forward_hook(hook)
...
h1.remove()
h2.remove()
h3.remove()
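If an exception is raised between registering and removing, the remove() calls above are skipped. One possible variation, sketched under my own assumptions rather than taken from the original code, wraps registration in a context manager so that removal always happens. `capture_outputs` is a hypothetical helper name, the toy model stands in for the real one, and `detach()` assumes each hooked layer returns a single tensor.

```python
from contextlib import contextmanager

import torch
import torch.nn as nn


@contextmanager
def capture_outputs(layers):
    """Collect the forward outputs of `layers`; always remove the hooks on exit."""
    outputs = []

    def hook(module, input, output):
        # detach() keeps the stored tensor from holding on to the autograd graph.
        outputs.append(output.detach())

    handles = [layer.register_forward_hook(hook) for layer in layers]
    try:
        yield outputs
    finally:
        for h in handles:
            h.remove()


# Hypothetical stand-in for the real model.
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
x = torch.randn(3, 4)

for _ in range(3):
    with capture_outputs([model[0], model[2]]) as feats:
        model(x)
    n_captured = len(feats)
    shapes = [tuple(f.shape) for f in feats]

print(n_captured, shapes)  # 2 [(3, 4), (3, 2)]
```

Each pass through the loop captures exactly two tensors, because the hooks from the previous pass are already gone by the time the next one starts.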

Problem solved.
