"dimension specified as 0 but tensor has no dimensions" when running a model on multiple GPUs

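When training with PyTorch on multiple GPUs, the run fails with the error below. The traceback shows it is raised in the forward pass, when DataParallel gathers the per-GPU outputs (which include the losses):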

Traceback (most recent call last):
  File "trainval_net.py", line 343, in <module>
    rois_label = fasterRCNN(im1_data,im2_data, im1_info, im2_info,gt_boxes, num_boxes)
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 115, in forward
    return self.gather(outputs, self.output_device)
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 127, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    return gather_map(outputs)
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/_functions.py", line 54, in forward
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/parallel/_functions.py", line 54, in <lambda>
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
RuntimeError: dimension specified as 0 but tensor has no dimensions
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f0b38b5abe0>>
Traceback (most recent call last):
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
  File "/home/zhangxin/anaconda2/envs/py35/lib/python3.5/multiprocessing/queues.py", line 345, in get
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 954, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 887, in _find_spec
TypeError: 'NoneType' object is not iterable

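The cause is that in the newer torch (0.4.0) a scalar loss is a 0-dimensional tensor, and DataParallel's gather calls size(0) on every output, which fails on a tensor that has no dimensions. Every loss returned by the model therefore has to be converted to a 1-D vector, which is simple. For example:
Original loss definition:

import torch

def compute_loss():
    loss = torch.tensor(0.0)  # placeholder for the real loss: a 0-dim (scalar) tensor
    return loss

Change it to:

def compute_loss():
    loss = torch.tensor(0.0)  # placeholder for the real loss: a 0-dim (scalar) tensor
    return loss.view(-1)      # reshape to a 1-D tensor of shape [1] so it can be gathered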

Remember to convert every loss to a vector, because some models return more than one loss (for example Faster-RCNN).
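
For a model that returns several losses, the conversion can be applied to all of them in one place. The following is only a minimal sketch: the helper names `to_1d` and `to_1d_all` are hypothetical and not part of PyTorch or of the original code. It assumes each loss is a tensor-like object exposing `dim()` and `view()`, as `torch.Tensor` does.

```python
def to_1d(loss):
    """Return `loss` with at least one dimension.

    Hypothetical helper: works with any tensor-like object that
    exposes dim() and view(), such as torch.Tensor.
    A 0-dim (scalar) tensor becomes a tensor of shape [1];
    a tensor that already has dimensions is returned unchanged.
    """
    return loss.view(-1) if loss.dim() == 0 else loss


def to_1d_all(*losses):
    """Apply to_1d to every loss that a forward pass returns."""
    return tuple(to_1d(loss) for loss in losses)
```

Inside the model's forward, the losses would be passed through `to_1d_all` just before they are returned. After DataParallel gathers them, each loss has one entry per GPU, so it is typically reduced with `.mean()` before calling `backward()`.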
