- 🍨 本文为🔗365天深度学习训练营 中的学习记录博客
- 🍖 原作者:K同学啊
电脑环境:
语言环境:Python 3.8.0
深度学习环境:torch 2.5.1+cu121
一、前言
1、理论基础
Inception v3由谷歌研究员Christian Szegedy 等人在2015年的论文《Rethinking the Inception Architecture for Computer Vision》 中提出。Inception v3是Inception网络系列的第三个版本,它在lmageNet图像识别竞赛中取得了优异成绩,尤其是在大规模图像识别任务中表现出色。
Inception v3的主要特点如下:
1、 更深的网络结构:Inception v3比之前的inception网络结构更深,包含了48层卷积层。这使得网络可以提取更多层次的特征,从而在图像识别任务上取得更好的效果。
2、使用Factorized Convolutions : Inception v3采用了Factorized Convolutions (分解卷积),将较大的卷积核分解为多个较小的卷积核。这种方法可以降低网络的参数数量,减少计算复杂度,同时保持良好的性能。
3、使用Batch Normalization:Inception v3在每个卷积层之后都添加了Batch Normalization (BN),这有助于网络的收敛和泛化能力。BN可以减少Internal Covariate Shift (内部协变量偏移)现象,加快训练速度,同时提高模型的鲁棒性。
4、辅助分类器:Inception v3引1入了辅助分类器,可以在网络训练过程中提供额外的梯度信息,帮助网络更好地学习特征。辅助分类器位于网络的某个中间层,其输出会与主分类器的输出进行加权融合,从而得到最终的预测结果。
5、基于RMSProp的优化器:Inception v3使用了RMSProp优化器进行训练。相比于传统的随机梯度下降(SGD)方法,RMSProp可以自适应地调整学习率,使得训练过程更加稳定,收敛速度更快。
Inception v3在图像分类、物体检测和图像分割等计算机视觉任务中均取得了显著的效果。然而,由于其较大的网络结构和计算复杂度,Inception v3在实际应用中可能需要较高的硬件要求。
二、代码流程
1、导入包,设置GPU
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision
import torch.nn.functional as F
from torchvision import transforms, datasets
import os, PIL, pathlib
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
2、导入数据
data_dir = './data/'
data_dir = pathlib.Path(data_dir)
data_paths = list(data_dir.glob('*'))
classeNames = [str(path).split("/")[-1] for path in data_paths]
classeNames
‘cloudy’, ‘sunrise’, ‘shine’, ‘rain’]
3、数据处理
train_transforms = transforms.Compose([
transforms.Resize([299, 299]),
transforms.ToTensor(),
transforms.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
])
total_data = datasets.ImageFolder(data_dir,transform=train_transforms)
train_size = int(0.8 * len(total_data))
test_size = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
len(train_dataset), len(test_dataset)
(904, 226)
batch_size = 32
train_dl = torch.utils.data.DataLoader(train_dataset,
batch_size=batch_size,
shuffle=True)
test_dl = torch.utils.data.DataLoader(test_dataset,
batch_size=batch_size,
shuffle=True)
4、设置网络
Inception-A
'''---InceptionA---'''
class InceptionA(nn.Module):
def __init__(self, in_channels, pool_features):
super(InceptionA, self).__init__()
self.branch1x1 = BasicConv2d(in_channels, 64, kernel_size=1)
self.branch5x5_1 = BasicConv2d(in_channels, 48, kernel_size=1)
self.branch5x5_2 = BasicConv2d(48, 64, kernel_size=5, padding=2)
self.branch3x3dbl_1 = BasicConv2d(in_channels, 64, kernel_size=1)
self.branch3x3dbl_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
self.branch3x3dbl_3 = BasicConv2d(96, 96, kernel_size=3, padding=1)
self.branch_pool = BasicConv2d(in_channels, pool_features, kernel_size=1)
def forward(self, x):
branch1x1 = self.branch1x1(x)
branch5x5 = self.branch5x5_1(x)
branch5x5 = self.branch5x5_2(branch5x5)
branch3x3dbl = self.branch3x3dbl_1(x)
branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)
branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
branch_pool = self.branch_pool(branch_pool)
outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
return torch.cat(outputs, 1)
Inception-B
'''---InceptionB---'''
class InceptionB(nn.Module):
def __init__(self, in_channels, channels_7x7):
super(InceptionB, self).__init__()
self.branch1x1 = BasicConv2d(in_channels, 192, kernel_size=1)
c7 = channels_7x7
self.branch7x7_1 = BasicConv2d (in_channels, c7, kernel_size=1)
self.branch7x7_2 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
self.branch7x7_3 = BasicConv2d (c7, 192, kernel_size=(7, 1), padding=(3, 0))
self.branch7x7dbl_1 = BasicConv2d(in_channels, c7, kernel_size=