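When training with PyTorch, you sometimes need to draw batches from two different datasets in the same training step. Because the two datasets may differ in size, you can iterate over both of them as follows: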
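from torch.utils.data import DataLoader

# DummyDataset and do_cool_things are placeholders assumed to be defined elsewhere.
dataloaders1 = DataLoader(DummyDataset(0, 100), batch_size=10, shuffle=True)
dataloaders2 = DataLoader(DummyDataset(0, 200), batch_size=10, shuffle=True)
num_epochs = 10

for epoch in range(num_epochs):
    # dataloaders1 is the shorter loader, so restart it whenever it runs out.
    dataloader_iterator = iter(dataloaders1)
    # The longer loader drives the loop: one epoch is one full pass over it.
    for i, data2 in enumerate(dataloaders2):
        try:
            data1 = next(dataloader_iterator)
        except StopIteration:
            # The shorter loader is exhausted: start it over and keep going.
            dataloader_iterator = iter(dataloaders1)
            data1 = next(dataloader_iterator)
        do_cool_things()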
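The same restart logic can be wrapped in a small generator so the training loop stays flat. The sketch below is an illustration, not part of the referenced answer: `paired_batches`, `long_loader`, and `short_loader` are hypothetical names, and plain lists stand in for the two `DataLoader` objects so the snippet runs without PyTorch. It assumes the second iterable is non-empty and can be iterated more than once.

```python
def paired_batches(primary, secondary):
    """Yield (primary_batch, secondary_batch) pairs for one pass over primary.

    secondary is restarted with iter() each time it is exhausted.
    Assumption: secondary is non-empty and can be iterated repeatedly.
    """
    secondary_iter = iter(secondary)
    for primary_batch in primary:
        try:
            secondary_batch = next(secondary_iter)
        except StopIteration:
            # secondary ran out first: start it again from the beginning.
            secondary_iter = iter(secondary)
            secondary_batch = next(secondary_iter)
        yield primary_batch, secondary_batch


# Hypothetical stand-ins for a longer and a shorter DataLoader.
long_loader = ["a0", "a1", "a2", "a3", "a4"]
short_loader = ["b0", "b1"]

pairs = list(paired_batches(long_loader, short_loader))
for long_batch, short_batch in pairs:
    print(long_batch, short_batch)
```

Passing `dataloaders2` and `dataloaders1` in place of the lists should behave the same way. Restarting with `iter()` rather than `itertools.cycle` is deliberate: `cycle` replays saved copies of the first pass, so a loader created with `shuffle=True` would not be reshuffled.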
Reference:
https://stackoverflow.com/a/57890309/9492373