When the images in a dataset differ in size and the batch size is set greater than 1, training fails with the following error:
epoch 0, processed 0 samples, lr 0.0000001000
Traceback (most recent call last):
  File "train.py", line 347, in <module>
    main()
  File "train.py", line 138, in main
    train(train_list, model, criterion, optimizer, epoch,f)
  File "train.py", line 199, in train
    for i,(img, target)in enumerate(train_loader):
  File "/home/wanglin/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/wanglin/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/wanglin/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 401 and 440 in dimension 2 at /opt/conda/conda-bld/pytorch_1544080996887/work/aten/src/TH/generic/THTensorMoreMath.cpp:1333
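At first I suspected that batch.append(idx) in the BatchSampler class of anaconda2/lib/python2.7/site-packages/torch/utils/data/sampler.py was failing. It turned out that, while loading data, the program calls default_collate(batch) in anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py, and the error is raised when execution reaches torch.stack(batch, 0, out=out). The cause is that torch.stack requires every input tensor to have exactly the same size (torch.cat is similar, except that the sizes may differ along the dimension being concatenated). A quick check in the interpreter: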
>>> a=torch.rand((2,3))
>>> b=torch.rand((2,3))
>>> a
tensor([[0.1102, 0.0474, 0.6739],
        [0.1565, 0.1700, 0.4528]])
>>> b
tensor([[0.3654, 0.1404, 0.6968],
        [0.0738, 0.0108, 0.4125]])
>>> c=torch.cat((a,b))
>>> c
tensor([[0.1102, 0.0474, 0.6739],
        [0.1565, 0.1700, 0.4528],
        [0.3654, 0.1404, 0.6968],
        [0.0738, 0.0108, 0.4125]])
>>> c.size()
torch.Size([4, 3])
>>> d=torch.stack((a,b))
>>> d
tensor([[[0.1102, 0.0474, 0.6739],
         [0.1565, 0.1700, 0.4528]],

        [[0.3654, 0.1404, 0.6968],
         [0.0738, 0.0108, 0.4125]]])
>>> d.size()
torch.Size([2, 2, 3])
>>> e=torch.rand((2,3))
>>> f=torch.stack((d,e))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: invalid argument 0: Tensors must have same number of dimensions: got 4 and 3 at /opt/conda/conda-bld/pytorch_1544080996887/work/aten/src/TH/generic/THTensorMoreMath.cpp:1324
>>> f=torch.cat((d,e))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: invalid argument 0: Tensors must have same number of dimensions: got 3 and 2 at /opt/conda/conda-bld/pytorch_1544080996887/work/aten/src/TH/generic/THTensorMoreMath.cpp:1324
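The session above fails because the tensors differ in their number of dimensions. The training error is slightly different: there the tensors have the same rank but different heights. The sketch below reproduces that case and shows one possible workaround, a custom collate_fn that zero-pads each image to the largest height and width in its batch before stacking. The names VariableSizeImages and pad_collate are hypothetical, the image shapes are made up, and the targets are assumed to be scalars. If the targets in train.py are spatial maps, they would need the same padding. This is only a minimal illustration, not the fix used in the original train.py.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset


class VariableSizeImages(Dataset):
    """Hypothetical dataset: random C x H x W images whose H and W vary."""

    def __init__(self):
        # Made-up shapes, chosen only so that heights and widths differ.
        self.shapes = [(3, 401, 300), (3, 440, 320), (3, 380, 310), (3, 420, 290)]

    def __len__(self):
        return len(self.shapes)

    def __getitem__(self, idx):
        img = torch.rand(self.shapes[idx])
        target = torch.tensor(idx)  # assumption: one scalar target per image
        return img, target


def pad_collate(batch):
    """Zero-pad every image to the largest H and W in the batch, then stack."""
    imgs, targets = zip(*batch)
    max_h = max(img.shape[1] for img in imgs)
    max_w = max(img.shape[2] for img in imgs)
    padded = [
        # F.pad takes (left, right, top, bottom) for the last two dimensions.
        F.pad(img, (0, max_w - img.shape[2], 0, max_h - img.shape[1]))
        for img in imgs
    ]
    return torch.stack(padded, 0), torch.stack(targets, 0)


# Same rank, different size: torch.stack refuses, as in the training traceback.
try:
    torch.stack((torch.rand(3, 401, 300), torch.rand(3, 440, 320)), 0)
    stack_failed = False
except RuntimeError:
    stack_failed = True

# torch.cat tolerates a size difference only along the concatenation dimension.
cat_shape = tuple(torch.cat((torch.rand(2, 3), torch.rand(5, 3)), 0).shape)

# With the custom collate_fn, batch_size > 1 works on variable-size images.
loader = DataLoader(VariableSizeImages(), batch_size=2, collate_fn=pad_collate)
batch_shapes = [tuple(img.shape) for img, _ in loader]
```

Padding keeps every pixel of the original images but adds blank regions. Resizing or cropping all images to a common size in the dataset, or keeping the batch size at 1, are alternatives with different trade-offs.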