Problem description
In PyTorch 1.2, running an RNN under DataParallel produces a warning. Part of the output:
loss: 250.3600 ||: : 3it [00:08, 3.74s/it]/opt/conda/conda-bld/pytorch_1570910687230/work/aten/src/ATen/native/cudnn/RNN.cpp:1238: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters().
(the same warning is printed again on every subsequent training iteration)
Solution
In version 1.1 the warning was only reported once, but in 1.2 it is raised on every forward pass. Add a call to flatten_parameters() inside the forward function:
def forward(self, ...):
    # Flatten only once; the attribute check avoids re-compacting on every call
    if not hasattr(self, '_flattened'):
        self.gru.flatten_parameters()
        setattr(self, '_flattened', True)
    ...
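As a minimal self-contained sketch of this workaround (module and attribute names here, such as Encoder and _flattened, are illustrative, not from any particular codebase), the idea is that each DataParallel replica flattens its own copy of the GRU weights into one contiguous cuDNN buffer on its first forward call, then skips the work on later calls:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Minimal GRU wrapper demonstrating the flatten_parameters() fix."""

    def __init__(self, input_size=8, hidden_size=16):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)

    def forward(self, x):
        # Compact the RNN weights into a single contiguous chunk once
        # per replica; the guard attribute prevents repeating the work
        # (and silences the per-call cuDNN warning).
        if not hasattr(self, '_flattened'):
            self.gru.flatten_parameters()
            setattr(self, '_flattened', True)
        output, hidden = self.gru(x)
        return output, hidden

model = Encoder()
x = torch.randn(4, 10, 8)  # (batch, seq_len, input_size)
out, h = model(x)
print(out.shape)  # torch.Size([4, 10, 16])
```

On a multi-GPU setup the module would be wrapped as `nn.DataParallel(model)`; because the flattening happens inside forward, it runs on each replica's own parameters rather than only on the original module.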