def debug_memory():
    """Print peak RSS and a tally of all live tensors, for leak hunting.

    Walks every object tracked by the garbage collector, counts tensors
    grouped by ``(device, dtype, shape)``, and prints one line per group.
    Calling this periodically inside a training loop lets you spot groups
    whose counts keep growing — a sign of unintended references keeping
    tensors (and their GPU/CPU memory) alive.

    Returns:
        None. All output goes to stdout.

    Note:
        The ``resource`` module is Unix-only; on Windows the import raises.
        ``ru_maxrss`` units differ by platform (KiB on Linux, bytes on macOS).
    """
    # Local imports: this is a debug-only helper, so the dependencies are
    # paid for only when it is actually called.
    import collections
    import gc
    import resource
    import torch

    print('maxrss = {}'.format(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))

    tensors = collections.Counter(
        (str(o.device), o.dtype, tuple(o.shape))
        for o in gc.get_objects()
        if torch.is_tensor(o))

    # Sort on the string form of each item: torch.dtype objects are not
    # orderable, so comparing the raw key tuples directly can raise
    # TypeError when two different dtypes live on the same device.
    for line in sorted(tensors.items(), key=str):
        print('{}\t{}'.format(*line))
使用上面的函数,在 training loop 里调用,可以追踪各类张量占用显存的数量变化,从而发现数量一直在增长、非正常占用显存的问题变量。
下面是一个例子(这里面没有用gpu,都是cpu)
>>> z = [torch.randn(i).long() for i in range(10)]
>>> debug_memory()
('cpu', torch.float32, (3, 3)) 2
('cpu', torch.int64, (0,)) 1
('cpu', torch.int64, (1,)) 1
('cpu', torch.int64, (2,)) 1
('cpu', torch.int64, (3,)) 1
('cpu', torch.int64, (4,)) 1
('cpu', torch.int64, (5,)) 1
('cpu', torch.int64, (6,)) 1
('cpu', torch.int64, (7,)) 1
('cpu', torch.int64, (8,)) 1
('cpu', torch.int64, (9,)) 1
来源:A clever trick to debug tensor memory - Misc. - Pyro Discussion Forum