theano-GPU运行时pygpu.gpuarray.GpuArrayException: b‘cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory

项目场景:

l2t代码需要使用theano库和gpu加速计算

非root用户 ubuntu-18.04+cuda8.0(需要把gcc降级为相匹配的左右,我配为5.3)+cuDnn v5.1+theano 1.0.4+pygpu 0.7.6


问题描述

pygpu.gpuarray.GpuArrayException: b'cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory

Training... 2448.24 sec.
Traceback (most recent call last):
  File ".../python3.8/site-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
  File "pygpu/gpuarray.pyx", line 689, in pygpu.gpuarray.pygpu_zeros
  File "pygpu/gpuarray.pyx", line 700, in pygpu.gpuarray.pygpu_empty
  File "pygpu/gpuarray.pyx", line 301, in pygpu.gpuarray.array_empty
pygpu.gpuarray.GpuArrayException: b'cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "learn.py", line 535, in <module>
    main()
  File "learn.py", line 531, in main
    trainer.train()
  File "..l2t/lib/python3.8/site-packages/smartlearner-0.1.0-py3.8.egg/smartlearner/trainer.py", line 31, in train
  File ../l2t/lib/python3.8/site-packages/smartlearner-0.1.0-py3.8.egg/smartlearner/trainer.py", line 90, in _learning
  File "..l2t/lib/python3.8/site-packages/theano/compile/function_module.py", line 914, in __call__
    gof.link.raise_with_op(
  File "../l2t/lib/python3.8/site-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "../l2t/lib/python3.8/site-packages/six.py", line 718, in reraise
    raise value.with_traceback(tb)
  File "../l2t/lib/python3.8/site-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
  File "pygpu/gpuarray.pyx", line 689, in pygpu.gpuarray.pygpu_zeros
  File "pygpu/gpuarray.pyx", line 700, in pygpu.gpuarray.pygpu_empty
  File "pygpu/gpuarray.pyx", line 301, in pygpu.gpuarray.array_empty
pygpu.gpuarray.GpuArrayException: b'cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory'
Apply node that caused the error: GpuAlloc<None>{memset_0=True}(GpuArrayConstant{[[0.]]}, Elemwise{Composite{((i0 * i1 * i2 * i3) // maximum(i3, i4))}}[(0, 0)].0, Shape_i{3}.0)
Toposort index: 98
Inputs types: [GpuArrayType<None>(float64, (True, True)), TensorType(int64, scalar), TensorType(int64, scalar)]
Inputs shapes: [(1, 1), (), ()]
Inputs strides: [(8, 8), (), ()]
Inputs values: [gpuarray.array([[0.]]), array(2383800), array(100)]
Outputs clients: [[forall_inplace,gpu,grad_of_scan_fn}(Shape_i{1}.0, InplaceGpuDimShuffle{0,2,1}.0, InplaceGpuDimShuffle{0,2,1}.0, InplaceGpuDimShuffle{0,2,1}.0, GpuAlloc<None>{memset_0=True}.0, GpuSubtensor{int64:int64:int64}.0, GpuSubtensor{int64:int64:int64}.0, GpuSubtensor{int64:int64:int64}.0, GpuSubtensor{int64:int64:int64}.0, GpuSubtensor{int64:int64:int64}.0, GpuAlloc<None>{memset_0=True}.0, GpuAlloc<None>{memset_0=True}.0, GpuAlloc<None>{memset_0=True}.0, GpuAlloc<None>{memset_0=True}.0, GpuAlloc<None>{memset_0=True}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, Shape_i{1}.0, GRU0_W, GRU0_U, GRU0_Uh, GRU1_W, GRU1_U, GRU1_Uh, InplaceGpuDimShuffle{x,0}.0, GpuFromHost<None>.0, GpuElemwise{add,no_inplace}.0, GpuElemwise{add,no_inplace}.0, GpuElemwise{Add}[(0, 1)]<gpuarray>.0, GpuReshape{2}.0, GpuFromHost<None>.0, GpuElemwise{add,no_inplace}.0, GpuElemwise{add,no_inplace}.0, GpuElemwise{Add}[(0, 1)]<gpuarray>.0, GpuReshape{2}.0, GpuFromHost<None>.0, GpuElemwise{add,no_inplace}.0, GpuElemwise{add,no_inplace}.0, GpuElemwise{Add}[(0, 1)]<gpuarray>.0, GpuReshape{2}.0, InplaceGpuDimShuffle{1,0}.0, InplaceGpuDimShuffle{1,0}.0, InplaceGpuDimShuffle{x,0}.0, InplaceGpuDimShuffle{1,0}.0, InplaceGpuDimShuffle{1,0}.0, InplaceGpuDimShuffle{1,0}.0, InplaceGpuDimShuffle{1,0}.0, GpuFromHost<None>.0, InplaceGpuDimShuffle{1,0}.0, GpuAlloc<None>{memset_0=True}.0, GpuFromHost<None>.0, GpuAlloc<None>{memset_0=True}.0, GpuFromHost<None>.0, GpuAlloc<None>{memset_0=True}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.


原因分析:

我自己没有找到原因,在外网搜索的提示:链接,把gpu__preallocate减少,我减为0.8,同时按照报错时的hint用少量的图优化(此处参考)

代码中输入:

成功跑上了~


我可能找到了theano的代替文档网站??(之前的deeplearning.net消失了。。真的废)

theano


 还有另两个参考链接:都提到preallocate

这个链接报错并不完全一样,但有人解答时提到gpu_prealloc设置

https://githubmemory.com/repo/JianshuZhang/WAP/issues/1这个链接报错也不完全一样

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值