The meaning of resize, size, crop_size, etc. in neural networks

When constructing a network, the first requirement is that the data flow works: every layer's output shape must be an integer, never a fraction. Whether the resulting network performs well is a separate question, set aside for now. As long as the data flows through, your input image can be any shape.

Suppose your image is a square with side length 256. Then a convolution layer's output side length is (256 - kernel_size) / stride + 1, and this value must be an integer; otherwise it has no physical meaning. For example, a feature map with a side length of 7.7 is physically meaningless. The same reasoning applies to pooling layers. An FC layer's output shape is always an integer; its only requirement is that its input be of fixed length throughout training.
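As a quick sketch, the integer constraint can be checked mechanically; the kernel and stride values below are hypothetical examples, not from any particular network:

```python
# Sketch: verify that each layer's output side length is an integer.
# The formula (no padding) is (W - kernel_size) / stride + 1.
def out_side(size, kernel, stride):
    out = (size - kernel) / stride + 1
    assert out == int(out), f"non-integer feature map side: {out}"
    return int(out)

side = 256
side = out_side(side, kernel=7, stride=3)   # conv: (256 - 7)/3 + 1 = 84
side = out_side(side, kernel=2, stride=2)   # pool: (84 - 2)/2 + 1 = 42
print(side)  # 42
```

A kernel/stride pair that yields a fractional side (say kernel 11, stride 4 on 256) trips the assertion, which is exactly the "arithmetic word problem" the layer configuration must solve.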

resize
If your image is not square, resize it to a uniform (non-square) size when building the leveldb / lmdb database, and use a non-square kernel_size so that the convolution layers' outputs are still integers.
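For example (with made-up sizes), in PyTorch a non-square kernel_size can be passed as a tuple, and both output sides stay integers:

```python
import torch
import torch.nn as nn

# Sketch with hypothetical sizes: a non-square input resized to 128x256,
# convolved with a non-square (4, 8) kernel at stride 4, no padding.
x = torch.randn(1, 3, 128, 256)                      # N, C, H, W
conv = nn.Conv2d(3, 16, kernel_size=(4, 8), stride=4)
y = conv(x)
# H: (128 - 4)/4 + 1 = 32 ; W: (256 - 8)/4 + 1 = 63 -- both integers
print(y.shape)  # torch.Size([1, 16, 32, 63])
```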

In short, setting aside whether the network is any good, this is essentially an arithmetic word problem: how to make every layer's output an integer.

--------------- To be continued

Below is example code that uses a PyTorch DataLoader to read the Tiny ImageNet dataset and train a network:

```python
import os

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torch.utils.data.sampler import SubsetRandomSampler

# Custom dataset class
class TinyImageNetDataset(Dataset):
    def __init__(self, data_dir, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        self.image_paths = []
        self.labels = []
        with open(os.path.join(data_dir, 'wnids.txt'), 'r') as f:
            self.classes = [line.strip() for line in f.readlines()]
        for i, cls in enumerate(self.classes):
            img_dir = os.path.join(data_dir, 'train', cls, 'images')
            for img_file in os.listdir(img_dir):
                self.image_paths.append(os.path.join(img_dir, img_file))
                self.labels.append(i)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert('RGB')
        label = self.labels[idx]
        if self.transform:
            image = self.transform(image)
        return image, label

# Data augmentation and preprocessing
transform_train = transforms.Compose([
    transforms.RandomCrop(64, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
transform_test = transforms.Compose([
    transforms.Resize(64),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Train / validation / test dataset instances (for simplicity this example
# reads everything from the training split; in practice, point the val/test
# datasets at their own directories)
train_dataset = TinyImageNetDataset(data_dir='/path/to/tiny-imagenet-200', transform=transform_train)
val_dataset = TinyImageNetDataset(data_dir='/path/to/tiny-imagenet-200', transform=transform_test)
test_dataset = TinyImageNetDataset(data_dir='/path/to/tiny-imagenet-200', transform=transform_test)

# Random subset samplers
train_sampler = SubsetRandomSampler(range(100000))
val_sampler = SubsetRandomSampler(range(10000))
test_sampler = SubsetRandomSampler(range(10000))

# Dataloaders
train_loader = DataLoader(train_dataset, batch_size=128, sampler=train_sampler, num_workers=4)
val_loader = DataLoader(val_dataset, batch_size=128, sampler=val_sampler, num_workers=4)
test_loader = DataLoader(test_dataset, batch_size=128, sampler=test_sampler, num_workers=4)

# Network model: the 64x64 input is halved by each of three 2x2 max-pools,
# leaving an 8x8 feature map before the FC layers
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(256 * 8 * 8, 1024)
        self.fc2 = nn.Linear(1024, 200)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        x = F.relu(self.conv3(x))
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        x = x.view(-1, 256 * 8 * 8)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Model, loss function, and optimizer
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Training loop
for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

# Validation loop
correct = 0
total = 0
with torch.no_grad():
    for data in val_loader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Validation accuracy: %.2f %%' % (100 * correct / total))

# Test loop
correct = 0
total = 0
with torch.no_grad():
    for data in test_loader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Test accuracy: %.2f %%' % (100 * correct / total))
```
Note: the code above is for reference only; adapt it to your own needs before use.
