【bug1】
UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
fix: This warning appears on some servers when using PyTorch 1.3 (not every machine is affected); downgrading to PyTorch 1.0-1.1 resolves it.
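The downgrade above can be done with pip; the exact index/CUDA build to use depends on the environment, so this is only a sketch:

```shell
# pin torch to the 1.0-1.1 range to avoid the scalar-gather warning
pip install "torch>=1.0,<1.2"
```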
【bug2】
    return torch._C._broadcast_coalesced(tensors, devices, buffer_size)
RuntimeError: all tensors must be on devices[0]
fix: This error is caused by an mmcv version mismatch. The distributed-training framework in mmcv 0.4 does not support early mmaction code well; switching to mmcv 0.2.15 resolves it.
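The version switch above is a straight pip pin (a sketch; reinstall in the same environment that runs mmaction):

```shell
# replace mmcv 0.4 with the version known to work with early mmaction
pip install mmcv==0.2.15
```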
【bug3】
result = self.forward(*input, **kwargs)
File "/home/hadoop-mtcv/anaconda3/envs/myenv/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 392, in forward
self.reducer.prepare_for_backward([])
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module h
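This DDP error usually means some parameters received no gradient in the previous iteration (for example, a submodule that is never called in `forward`), so the reducer is still waiting on them. A minimal sketch of the usual workaround, `find_unused_parameters=True`; the single-process gloo group, the port, and the `Net` module with a deliberately unused layer are all assumptions for illustration:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process process group so DDP can be constructed locally
# (assumes port 29500 is free); real training uses the launcher's
# rank/world_size instead.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

class Net(torch.nn.Module):
    """Hypothetical module with a parameter that never gets a gradient."""
    def __init__(self):
        super().__init__()
        self.used = torch.nn.Linear(4, 4)
        self.unused = torch.nn.Linear(4, 4)  # never called in forward

    def forward(self, x):
        return self.used(x)

# find_unused_parameters=True lets the reducer mark self.unused as ready,
# avoiding "Expected to have finished reduction in the prior iteration".
model = DDP(Net(), find_unused_parameters=True)
for _ in range(2):  # without the flag, the error surfaces on iteration 2
    model(torch.randn(2, 4)).sum().backward()

dist.destroy_process_group()
```

If every parameter really is used, the flag only adds overhead; in that case the error usually points at a `forward` path that conditionally skips a branch.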