Wrapping a dataset with torch.utils.data.TensorDataset raises an error:
assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors)
AttributeError: 'Tensor' object has no attribute 'size'
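This happens because TensorDataset requires both the data and the labels to be PyTorch tensors, whereas my inputs were a numpy.ndarray and a list. Converting them to tensors solved the problem. The conversion can be done with torch.from_numpy (for NumPy arrays) and torch.Tensor (for lists). The code is as follows: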
import torch

# Load the training set (get_data and BATCH_SIZE are defined elsewhere)
train_data, train_labels = get_data(True)
train_data = train_data.reshape((50000, 3, 32, 32))
train_data = torch.from_numpy(train_data)  # numpy.ndarray -> tensor
#print(type(train_labels)) #<class 'numpy.ndarray'>
train_labels = torch.from_numpy(train_labels)
#print(type(train_labels)) #<class 'torch.Tensor'>
train_dataset = torch.utils.data.TensorDataset(train_data, train_labels)  # wrap data and labels in a dataset
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True)
print('Finished loading the training set')
# Load the test set
test_data, test_labels = get_data(False)
test_data = test_data.reshape((10000, 3, 32, 32))
test_data = torch.from_numpy(test_data)  # numpy.ndarray -> tensor
#print(type(test_labels)) #<class 'list'>
test_labels = torch.Tensor(test_labels)  # list -> float32 tensor; torch.tensor(test_labels) also works and keeps an integer dtype
#print(type(test_labels)) #<class 'torch.Tensor'>
test_dataset = torch.utils.data.TensorDataset(test_data, test_labels)  # wrap data and labels in a dataset
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=BATCH_SIZE, shuffle=True)
print('Finished loading the test set')
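For anyone who wants to reproduce the fix without my get_data function, here is a minimal sketch of the same conversion using small synthetic arrays. The names data_np and labels_list and the sample count of 8 are made up for illustration and are not part of my original code. It also shows the dtype difference between torch.Tensor and torch.tensor when converting a list of integer labels, which matters because loss functions such as torch.nn.CrossEntropyLoss expect integer (int64) class indices.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-ins for the real data (assumed shapes, not the output of get_data)
data_np = np.random.rand(8, 3, 32, 32).astype(np.float32)  # numpy.ndarray
labels_list = [0, 1, 2, 3, 4, 5, 6, 7]                      # plain Python list

# numpy.ndarray -> tensor (the result shares memory with data_np)
data = torch.from_numpy(data_np)

# list -> tensor: torch.Tensor yields float32, torch.tensor infers int64 here
labels_float = torch.Tensor(labels_list)
labels_long = torch.tensor(labels_list)

# Both inputs are now tensors with the same first dimension, so the assert passes
dataset = TensorDataset(data, labels_long)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

batch_data, batch_labels = next(iter(loader))
print(batch_data.shape, batch_labels.dtype)
```

If the labels will be fed to a classification loss, the integer version (labels_long) is usually the one to keep.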