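This post is a learning log for the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp).
Original author: K同学啊
Task for this session:
Combine ideas from ResNet and DenseNet to explore and build a new model architecture.
The architecture diagrams of ResNet and DenseNet are shown below.
Figure 1: ResNet50V2
Figure 2: DenseNet121
I. The DPN Model
Reference paper: NIPS-2017-dual-path-networks-Paper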
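In the figures above, Figure 1 is ResNet and Figure 2 is DenseNet. The thickened part is the feature produced by each new layer, which is repeatedly concatenated with the earlier features; note that there are several 1x1 convolutions. 1x1 convolutions drawn in the same color denote 1x1 convolutions at the same scale; for example, the green 1x1 convolution is the one applied to the first layer's features. The underlined 1x1 convolutions tidy up the channel dimension of the concatenated features. Figure 3 is a modification of DenseNet: if all 1x1 convolutions of the same color share their weights, Figure 2 can be rearranged into the form of Figure 3. The left half of Figure 3 is DenseNet and the right half is ResNet. Recall from core argument one that concatenation can be rewritten as addition, that is, a 1x1 convolution applied to concatenated features equals the sum of 1x1 convolutions applied to each part. So in Figure 2, letting the green and orange 1x1 convolutions process the first-layer features and the new features, then concatenating them for the following operations, is equivalent to adding the two in the right-hand path of Figure 3.
Figure 4 is the proposed DPN. On the one hand, this structure follows directly from the defining equations of DPN; on the other hand, it can also be obtained by splitting, rearranging, and reshaping the 1x1 convolutions in Figure 3.
The rightmost DPN structure in the figure above is the result of merging the first 1x1 convolution of each path in every block; its main difference from Figure 4 is that the ResNet path and the DenseNet path share the first 1x1 convolution. The 3x3 convolution is implemented as a grouped convolution to improve performance. When choosing the hyperparameters, the ResNet path is given more channels than the DenseNet path, which keeps GPU memory use from growing too quickly as DenseNet's channel count increases with depth. As with other networks, model capacity can be increased by stacking more blocks.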
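The add-then-concatenate idea can be tried out without any deep learning library. The following is a minimal sketch, not the paper's code: `dual_path_merge` is a hypothetical helper that treats a feature map as a plain list of per-channel values, adds the first `d` channels (the ResNet path), and concatenates the rest (the DenseNet path). It mirrors the `torch.cat` line in `Block.forward` later in this post.

```python
def dual_path_merge(a, x, d):
    """Merge two channel lists the dual-path way.

    a: channels carried over from the previous block (shortcut side)
    x: channels produced by the current block
    d: number of leading channels that belong to the residual path
    """
    if len(a) < d or len(x) < d:
        raise ValueError("both inputs need at least d channels")
    residual = [ai + xi for ai, xi in zip(a[:d], x[:d])]  # ResNet path: element-wise add
    dense = a[d:] + x[d:]                                  # DenseNet path: concatenate
    return residual + dense


# Toy example: 2 residual channels; a carries 2 dense channels, x adds 1 new one.
merged = dual_path_merge([1, 2, 10, 20], [3, 4, 30], d=2)
print(merged)       # [4, 6, 10, 20, 30]
print(len(merged))  # 5: the residual width stays 2, the dense width grows from 2 to 3
```

The residual part keeps a fixed width, while the dense part gets wider with every block, which is why the dense path is kept narrow.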
II. Model Implementation
import os, PIL, pathlib, warnings
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms, datasets
warnings.filterwarnings("ignore")  # suppress warnings
# Select CPU or GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
Output:
device(type='cpu')
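# Load the data (the same dataset as in J3)
data_dir = './J3/bird_photos/'
data_dir = pathlib.Path(data_dir)
data_paths = list(data_dir.glob('*'))
# The folder name is the class name; path.name works on any OS,
# unlike splitting the path string on '\\'
classNames = [path.name for path in data_paths]
classNames
Output: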
['Bananaquit', 'Black Skimmer', 'Black Throated Bushtiti', 'Cockatoo']
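Taking the class name from the last path component does not depend on the path separator. Here is a small sketch using the pure path classes from the standard library, so it runs the same on Windows and Linux; `class_name` is a hypothetical helper and the example paths are built from the class names above.

```python
from pathlib import PurePosixPath, PureWindowsPath

def class_name(path):
    """Return the last path component, i.e. the class folder name."""
    return path.name

print(class_name(PureWindowsPath(r"J3\bird_photos\Cockatoo")))    # Cockatoo
print(class_name(PurePosixPath("J3/bird_photos/Black Skimmer")))  # Black Skimmer
```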
num_classes = len(classNames)
num_classes
Output:
4
train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])
test_transforms = transforms.Compose([
    transforms.Resize([224, 224]),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])
total_data = datasets.ImageFolder(data_dir, transform=train_transforms)
total_data
Output:
Dataset ImageFolder
Number of datapoints: 565
Root location: J3\bird_photos
StandardTransform
Transform: Compose(
Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=True)
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
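`Normalize` subtracts a per-channel mean and divides by a per-channel standard deviation, after `ToTensor` has scaled the pixel values to the range [0, 1]. Here is a plain-Python sketch of that arithmetic for a single RGB pixel; `normalize_pixel` is a hypothetical helper for illustration, not a torchvision function.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]
# A pure white pixel maps to roughly 2.2 to 2.6, depending on the channel.
print([round(v, 3) for v in normalize_pixel([1.0, 1.0, 1.0])])
```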
# Split the dataset into a training set (80%) and a test set (20%)
train_size = int(0.8 * len(total_data))
test_size = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
print(train_dataset)
print(test_dataset)
batch_size = 8
train_dl = torch.utils.data.DataLoader(train_dataset,
                                       batch_size=batch_size,
                                       shuffle=True,
                                       #num_workers=1
                                       )
test_dl = torch.utils.data.DataLoader(test_dataset,
                                      batch_size=batch_size,
                                      shuffle=True,
                                      #num_workers=1
                                      )
# Inspect the shape of one batch from each loader
for X, y in test_dl:
    print("Shape of X [N, C, H, W]: ", X.shape)
    print("Shape of y: ", y.shape, y.dtype)
    break
for X, y in train_dl:
    print("Shape of X [N, C, H, W]: ", X.shape)
    print("Shape of y: ", y.shape, y.dtype)
    break
Output:
<torch.utils.data.dataset.Subset object at 0x00000262381F8B80>
<torch.utils.data.dataset.Subset object at 0x00000262381F8B20>
Shape of X [N, C, H, W]: torch.Size([8, 3, 224, 224])
Shape of y: torch.Size([8]) torch.int64
Shape of X [N, C, H, W]: torch.Size([8, 3, 224, 224])
Shape of y: torch.Size([8]) torch.int64
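The split sizes and batch counts follow from simple arithmetic on the 565 images reported above. Here is a small sketch; `split_sizes` and `num_batches` are hypothetical helpers, and the batch count assumes the `DataLoader` keeps the last partial batch, which is its default behaviour.

```python
import math

def split_sizes(n_total, train_fraction=0.8):
    """Floor the training share and give the remainder to the test set."""
    train_size = int(train_fraction * n_total)
    return train_size, n_total - train_size

def num_batches(n_samples, batch_size):
    """Number of batches when the last partial batch is kept."""
    return math.ceil(n_samples / batch_size)

train_size, test_size = split_sizes(565)
print(train_size, test_size)                                  # 452 113
print(num_batches(train_size, 8), num_batches(test_size, 8))  # 57 15
```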
from collections import OrderedDict
import torch
import torch.nn as nn
import torch.nn.functional as F
class Block(nn.Module):
    def __init__(self, in_channel, mid_channel, out_channel, dense_channel, stride, groups, is_shortcut=False):
        # in_channel: number of input channels; mid_channel: number of channels inside the block;
        # out_channel: number of output channels of the residual path after one block.
        # dense_channel exists because the block runs a ResNet-style convolution path and a
        # DenseNet-style path side by side, then merges their feature maps into a new one.
        super().__init__()
        self.is_shortcut = is_shortcut
        self.out_channel = out_channel
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channel, mid_channel, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid_channel),
            nn.ReLU()
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(mid_channel, mid_channel, kernel_size=3, stride=stride, padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(mid_channel),
            nn.ReLU()
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(mid_channel, out_channel+dense_channel, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channel+dense_channel)
        )
        if self.is_shortcut:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channel, out_channel+dense_channel, kernel_size=3, padding=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channel+dense_channel)
            )
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x):
        a = x
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        if self.is_shortcut:
            a = self.shortcut(a)
        d = self.out_channel
        # First d channels: residual path (element-wise add).
        # Remaining channels: dense path (concatenate old and new features).
        x = torch.cat([a[:,:d,:,:] + x[:,:d,:,:], a[:,d:,:,:], x[:,d:,:,:]], dim=1)
        x = self.relu(x)
        return x
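The `groups` argument of the 3x3 convolution is what keeps the block cheap: each group only connects `in_channels / groups` input channels to `out_channels / groups` output channels. Here is a plain-Python sketch of the weight count for a bias-free convolution; `conv2d_weight_count` is a hypothetical helper, and the numbers use the first-stage DPN98 setting from this post (160 channels, 40 groups).

```python
def conv2d_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a 2D convolution without bias."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channels must be divisible by groups")
    return out_channels * (in_channels // groups) * kernel_size * kernel_size

grouped = conv2d_weight_count(160, 160, 3, groups=40)
dense = conv2d_weight_count(160, 160, 3, groups=1)
print(grouped)           # 5760
print(dense)             # 230400
print(dense // grouped)  # 40
```

The grouped count of 5,760 is the value reported for `Conv2d-8` in the model summary further down.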
class DPN(nn.Module):
    def __init__(self, cfg):
        super(DPN, self).__init__()
        self.group = cfg['group']
        self.in_channel = cfg['in_channel']
        mid_channels = cfg['mid_channels']
        out_channels = cfg['out_channels']
        dense_channels = cfg['dense_channels']
        num = cfg['num']
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, self.in_channel, 7, stride=2, padding=3, bias=False, padding_mode='zeros'),
            nn.BatchNorm2d(self.in_channel),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=0)
        )
        self.conv2 = self._make_layers(mid_channels[0], out_channels[0], dense_channels[0], num[0], stride=1)
        self.conv3 = self._make_layers(mid_channels[1], out_channels[1], dense_channels[1], num[1], stride=2)
        self.conv4 = self._make_layers(mid_channels[2], out_channels[2], dense_channels[2], num[2], stride=2)
        self.conv5 = self._make_layers(mid_channels[3], out_channels[3], dense_channels[3], num[3], stride=2)
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        # The input width of the fc layer has to be computed from the last stage
        self.fc = nn.Linear(cfg['out_channels'][3] + (num[3]+1) * cfg['dense_channels'][3], cfg['classes'])
    def _make_layers(self, mid_channel, out_channel, dense_channel, num, stride=2):
        layers = []
        layers.append(Block(self.in_channel, mid_channel, out_channel, dense_channel, stride=stride, groups=self.group, is_shortcut=True))
        # In the first block, is_shortcut=True is the ResNet shortcut: the shallow features pass through
        # one convolution and are added to the features that went through three convolutions.
        # The following blocks use is_shortcut=False; they are plain repeats of the same block, and since
        # the first block already projects the shallow features, the later ones no longer need to.
        self.in_channel = out_channel + dense_channel*2
        # The dense path keeps concatenating feature maps, so the width is out_channel + 2*dense_channel
        # after the first block and grows by one dense_channel per block, i.e. out_channel + (i+2)*dense_channel.
        for i in range(1, num):
            layers.append(Block(self.in_channel, mid_channel, out_channel, dense_channel, stride=1, groups=self.group))
            self.in_channel = self.in_channel + dense_channel
            #self.in_channel = out_channel + (i+2)*dense_channel
        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.pool(x)
        x = torch.flatten(x, start_dim=1)
        x = self.fc(x)
        return x
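The channel bookkeeping in `_make_layers` and the input width of the `fc` layer can be checked by hand. Below is a small sketch in plain Python; `stage_widths` is a hypothetical helper (not part of the model code), and the tuples are the DPN98 settings used in this post.

```python
def stage_widths(out_channel, dense_channel, num_blocks):
    """Channel count after each block of one stage.

    The first block outputs out_channel + 2 * dense_channel channels
    (projected shortcut plus new features); every later block adds
    dense_channel more.
    """
    widths = [out_channel + 2 * dense_channel]
    for _ in range(1, num_blocks):
        widths.append(widths[-1] + dense_channel)
    return widths

# DPN98 settings used in this post
out_channels = (256, 512, 1024, 2048)
dense_channels = (16, 32, 32, 128)
num = (3, 6, 20, 3)

for o, k, n in zip(out_channels, dense_channels, num):
    w = stage_widths(o, k, n)
    print(w[0], "->", w[-1])
# 288 -> 320
# 576 -> 736
# 1088 -> 1696
# 2304 -> 2560

# Input width of the final Linear layer
print(out_channels[3] + (num[3] + 1) * dense_channels[3])  # 2560
```

The final widths of the four stages (320, 736, 1696, 2560) agree with the `Block` output shapes in the model summary below.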
def DPN92(n_class=10):
    cfg = {
        'group': 32,
        'in_channel': 64,
        'mid_channels': (96, 192, 384, 768),
        'out_channels': (256, 512, 1024, 2048),
        'dense_channels': (16, 32, 24, 128),
        'num': (3, 4, 20, 3),
        'classes': n_class
    }
    return DPN(cfg)
def DPN98(n_class=10):
    cfg = {
        'group': 40,
        'in_channel': 96,
        'mid_channels': (160, 320, 640, 1280),
        'out_channels': (256, 512, 1024, 2048),
        'dense_channels': (16, 32, 32, 128),
        'num': (3, 6, 20, 3),
        'classes': n_class
    }
    return DPN(cfg)
# DPN98 is used here
model = DPN98(n_class=num_classes).to(device)
# Report the parameter count and other statistics of the model
import torchsummary as summary
summary.summary(model, (3, 224, 224))
Output:
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 96, 112, 112] 14,112
BatchNorm2d-2 [-1, 96, 112, 112] 192
ReLU-3 [-1, 96, 112, 112] 0
MaxPool2d-4 [-1, 96, 55, 55] 0
Conv2d-5 [-1, 160, 55, 55] 15,360
BatchNorm2d-6 [-1, 160, 55, 55] 320
ReLU-7 [-1, 160, 55, 55] 0
Conv2d-8 [-1, 160, 55, 55] 5,760
BatchNorm2d-9 [-1, 160, 55, 55] 320
ReLU-10 [-1, 160, 55, 55] 0
Conv2d-11 [-1, 272, 55, 55] 43,520
BatchNorm2d-12 [-1, 272, 55, 55] 544
Conv2d-13 [-1, 272, 55, 55] 235,008
BatchNorm2d-14 [-1, 272, 55, 55] 544
ReLU-15 [-1, 288, 55, 55] 0
Block-16 [-1, 288, 55, 55] 0
Conv2d-17 [-1, 160, 55, 55] 46,080
BatchNorm2d-18 [-1, 160, 55, 55] 320
ReLU-19 [-1, 160, 55, 55] 0
Conv2d-20 [-1, 160, 55, 55] 5,760
BatchNorm2d-21 [-1, 160, 55, 55] 320
ReLU-22 [-1, 160, 55, 55] 0
Conv2d-23 [-1, 272, 55, 55] 43,520
BatchNorm2d-24 [-1, 272, 55, 55] 544
ReLU-25 [-1, 304, 55, 55] 0
Block-26 [-1, 304, 55, 55] 0
Conv2d-27 [-1, 160, 55, 55] 48,640
BatchNorm2d-28 [-1, 160, 55, 55] 320
ReLU-29 [-1, 160, 55, 55] 0
Conv2d-30 [-1, 160, 55, 55] 5,760
BatchNorm2d-31 [-1, 160, 55, 55] 320
ReLU-32 [-1, 160, 55, 55] 0
Conv2d-33 [-1, 272, 55, 55] 43,520
BatchNorm2d-34 [-1, 272, 55, 55] 544
ReLU-35 [-1, 320, 55, 55] 0
Block-36 [-1, 320, 55, 55] 0
Conv2d-37 [-1, 320, 55, 55] 102,400
BatchNorm2d-38 [-1, 320, 55, 55] 640
ReLU-39 [-1, 320, 55, 55] 0
Conv2d-40 [-1, 320, 28, 28] 23,040
BatchNorm2d-41 [-1, 320, 28, 28] 640
ReLU-42 [-1, 320, 28, 28] 0
Conv2d-43 [-1, 544, 28, 28] 174,080
BatchNorm2d-44 [-1, 544, 28, 28] 1,088
Conv2d-45 [-1, 544, 28, 28] 1,566,720
BatchNorm2d-46 [-1, 544, 28, 28] 1,088
ReLU-47 [-1, 576, 28, 28] 0
Block-48 [-1, 576, 28, 28] 0
Conv2d-49 [-1, 320, 28, 28] 184,320
BatchNorm2d-50 [-1, 320, 28, 28] 640
ReLU-51 [-1, 320, 28, 28] 0
Conv2d-52 [-1, 320, 28, 28] 23,040
BatchNorm2d-53 [-1, 320, 28, 28] 640
ReLU-54 [-1, 320, 28, 28] 0
Conv2d-55 [-1, 544, 28, 28] 174,080
BatchNorm2d-56 [-1, 544, 28, 28] 1,088
ReLU-57 [-1, 608, 28, 28] 0
Block-58 [-1, 608, 28, 28] 0
Conv2d-59 [-1, 320, 28, 28] 194,560
BatchNorm2d-60 [-1, 320, 28, 28] 640
ReLU-61 [-1, 320, 28, 28] 0
Conv2d-62 [-1, 320, 28, 28] 23,040
BatchNorm2d-63 [-1, 320, 28, 28] 640
ReLU-64 [-1, 320, 28, 28] 0
Conv2d-65 [-1, 544, 28, 28] 174,080
BatchNorm2d-66 [-1, 544, 28, 28] 1,088
ReLU-67 [-1, 640, 28, 28] 0
Block-68 [-1, 640, 28, 28] 0
Conv2d-69 [-1, 320, 28, 28] 204,800
BatchNorm2d-70 [-1, 320, 28, 28] 640
ReLU-71 [-1, 320, 28, 28] 0
Conv2d-72 [-1, 320, 28, 28] 23,040
BatchNorm2d-73 [-1, 320, 28, 28] 640
ReLU-74 [-1, 320, 28, 28] 0
Conv2d-75 [-1, 544, 28, 28] 174,080
BatchNorm2d-76 [-1, 544, 28, 28] 1,088
ReLU-77 [-1, 672, 28, 28] 0
Block-78 [-1, 672, 28, 28] 0
Conv2d-79 [-1, 320, 28, 28] 215,040
BatchNorm2d-80 [-1, 320, 28, 28] 640
ReLU-81 [-1, 320, 28, 28] 0
Conv2d-82 [-1, 320, 28, 28] 23,040
BatchNorm2d-83 [-1, 320, 28, 28] 640
ReLU-84 [-1, 320, 28, 28] 0
Conv2d-85 [-1, 544, 28, 28] 174,080
BatchNorm2d-86 [-1, 544, 28, 28] 1,088
ReLU-87 [-1, 704, 28, 28] 0
Block-88 [-1, 704, 28, 28] 0
Conv2d-89 [-1, 320, 28, 28] 225,280
BatchNorm2d-90 [-1, 320, 28, 28] 640
ReLU-91 [-1, 320, 28, 28] 0
Conv2d-92 [-1, 320, 28, 28] 23,040
BatchNorm2d-93 [-1, 320, 28, 28] 640
ReLU-94 [-1, 320, 28, 28] 0
Conv2d-95 [-1, 544, 28, 28] 174,080
BatchNorm2d-96 [-1, 544, 28, 28] 1,088
ReLU-97 [-1, 736, 28, 28] 0
Block-98 [-1, 736, 28, 28] 0
Conv2d-99 [-1, 640, 28, 28] 471,040
BatchNorm2d-100 [-1, 640, 28, 28] 1,280
ReLU-101 [-1, 640, 28, 28] 0
Conv2d-102 [-1, 640, 14, 14] 92,160
BatchNorm2d-103 [-1, 640, 14, 14] 1,280
ReLU-104 [-1, 640, 14, 14] 0
Conv2d-105 [-1, 1056, 14, 14] 675,840
BatchNorm2d-106 [-1, 1056, 14, 14] 2,112
Conv2d-107 [-1, 1056, 14, 14] 6,994,944
BatchNorm2d-108 [-1, 1056, 14, 14] 2,112
ReLU-109 [-1, 1088, 14, 14] 0
Block-110 [-1, 1088, 14, 14] 0
Conv2d-111 [-1, 640, 14, 14] 696,320
BatchNorm2d-112 [-1, 640, 14, 14] 1,280
ReLU-113 [-1, 640, 14, 14] 0
Conv2d-114 [-1, 640, 14, 14] 92,160
BatchNorm2d-115 [-1, 640, 14, 14] 1,280
ReLU-116 [-1, 640, 14, 14] 0
Conv2d-117 [-1, 1056, 14, 14] 675,840
BatchNorm2d-118 [-1, 1056, 14, 14] 2,112
ReLU-119 [-1, 1120, 14, 14] 0
Block-120 [-1, 1120, 14, 14] 0
Conv2d-121 [-1, 640, 14, 14] 716,800
BatchNorm2d-122 [-1, 640, 14, 14] 1,280
ReLU-123 [-1, 640, 14, 14] 0
Conv2d-124 [-1, 640, 14, 14] 92,160
BatchNorm2d-125 [-1, 640, 14, 14] 1,280
ReLU-126 [-1, 640, 14, 14] 0
Conv2d-127 [-1, 1056, 14, 14] 675,840
BatchNorm2d-128 [-1, 1056, 14, 14] 2,112
ReLU-129 [-1, 1152, 14, 14] 0
Block-130 [-1, 1152, 14, 14] 0
Conv2d-131 [-1, 640, 14, 14] 737,280
BatchNorm2d-132 [-1, 640, 14, 14] 1,280
ReLU-133 [-1, 640, 14, 14] 0
Conv2d-134 [-1, 640, 14, 14] 92,160
BatchNorm2d-135 [-1, 640, 14, 14] 1,280
ReLU-136 [-1, 640, 14, 14] 0
Conv2d-137 [-1, 1056, 14, 14] 675,840
BatchNorm2d-138 [-1, 1056, 14, 14] 2,112
ReLU-139 [-1, 1184, 14, 14] 0
Block-140 [-1, 1184, 14, 14] 0
Conv2d-141 [-1, 640, 14, 14] 757,760
BatchNorm2d-142 [-1, 640, 14, 14] 1,280
ReLU-143 [-1, 640, 14, 14] 0
Conv2d-144 [-1, 640, 14, 14] 92,160
BatchNorm2d-145 [-1, 640, 14, 14] 1,280
ReLU-146 [-1, 640, 14, 14] 0
Conv2d-147 [-1, 1056, 14, 14] 675,840
BatchNorm2d-148 [-1, 1056, 14, 14] 2,112
ReLU-149 [-1, 1216, 14, 14] 0
Block-150 [-1, 1216, 14, 14] 0
Conv2d-151 [-1, 640, 14, 14] 778,240
BatchNorm2d-152 [-1, 640, 14, 14] 1,280
ReLU-153 [-1, 640, 14, 14] 0
Conv2d-154 [-1, 640, 14, 14] 92,160
BatchNorm2d-155 [-1, 640, 14, 14] 1,280
ReLU-156 [-1, 640, 14, 14] 0
Conv2d-157 [-1, 1056, 14, 14] 675,840
BatchNorm2d-158 [-1, 1056, 14, 14] 2,112
ReLU-159 [-1, 1248, 14, 14] 0
Block-160 [-1, 1248, 14, 14] 0
Conv2d-161 [-1, 640, 14, 14] 798,720
BatchNorm2d-162 [-1, 640, 14, 14] 1,280
ReLU-163 [-1, 640, 14, 14] 0
Conv2d-164 [-1, 640, 14, 14] 92,160
BatchNorm2d-165 [-1, 640, 14, 14] 1,280
ReLU-166 [-1, 640, 14, 14] 0
Conv2d-167 [-1, 1056, 14, 14] 675,840
BatchNorm2d-168 [-1, 1056, 14, 14] 2,112
ReLU-169 [-1, 1280, 14, 14] 0
Block-170 [-1, 1280, 14, 14] 0
Conv2d-171 [-1, 640, 14, 14] 819,200
BatchNorm2d-172 [-1, 640, 14, 14] 1,280
ReLU-173 [-1, 640, 14, 14] 0
Conv2d-174 [-1, 640, 14, 14] 92,160
BatchNorm2d-175 [-1, 640, 14, 14] 1,280
ReLU-176 [-1, 640, 14, 14] 0
Conv2d-177 [-1, 1056, 14, 14] 675,840
BatchNorm2d-178 [-1, 1056, 14, 14] 2,112
ReLU-179 [-1, 1312, 14, 14] 0
Block-180 [-1, 1312, 14, 14] 0
Conv2d-181 [-1, 640, 14, 14] 839,680
BatchNorm2d-182 [-1, 640, 14, 14] 1,280
ReLU-183 [-1, 640, 14, 14] 0
Conv2d-184 [-1, 640, 14, 14] 92,160
BatchNorm2d-185 [-1, 640, 14, 14] 1,280
ReLU-186 [-1, 640, 14, 14] 0
Conv2d-187 [-1, 1056, 14, 14] 675,840
BatchNorm2d-188 [-1, 1056, 14, 14] 2,112
ReLU-189 [-1, 1344, 14, 14] 0
Block-190 [-1, 1344, 14, 14] 0
Conv2d-191 [-1, 640, 14, 14] 860,160
BatchNorm2d-192 [-1, 640, 14, 14] 1,280
ReLU-193 [-1, 640, 14, 14] 0
Conv2d-194 [-1, 640, 14, 14] 92,160
BatchNorm2d-195 [-1, 640, 14, 14] 1,280
ReLU-196 [-1, 640, 14, 14] 0
Conv2d-197 [-1, 1056, 14, 14] 675,840
BatchNorm2d-198 [-1, 1056, 14, 14] 2,112
ReLU-199 [-1, 1376, 14, 14] 0
Block-200 [-1, 1376, 14, 14] 0
Conv2d-201 [-1, 640, 14, 14] 880,640
BatchNorm2d-202 [-1, 640, 14, 14] 1,280
ReLU-203 [-1, 640, 14, 14] 0
Conv2d-204 [-1, 640, 14, 14] 92,160
BatchNorm2d-205 [-1, 640, 14, 14] 1,280
ReLU-206 [-1, 640, 14, 14] 0
Conv2d-207 [-1, 1056, 14, 14] 675,840
BatchNorm2d-208 [-1, 1056, 14, 14] 2,112
ReLU-209 [-1, 1408, 14, 14] 0
Block-210 [-1, 1408, 14, 14] 0
Conv2d-211 [-1, 640, 14, 14] 901,120
BatchNorm2d-212 [-1, 640, 14, 14] 1,280
ReLU-213 [-1, 640, 14, 14] 0
Conv2d-214 [-1, 640, 14, 14] 92,160
BatchNorm2d-215 [-1, 640, 14, 14] 1,280
ReLU-216 [-1, 640, 14, 14] 0
Conv2d-217 [-1, 1056, 14, 14] 675,840
BatchNorm2d-218 [-1, 1056, 14, 14] 2,112
ReLU-219 [-1, 1440, 14, 14] 0
Block-220 [-1, 1440, 14, 14] 0
Conv2d-221 [-1, 640, 14, 14] 921,600
BatchNorm2d-222 [-1, 640, 14, 14] 1,280
ReLU-223 [-1, 640, 14, 14] 0
Conv2d-224 [-1, 640, 14, 14] 92,160
BatchNorm2d-225 [-1, 640, 14, 14] 1,280
ReLU-226 [-1, 640, 14, 14] 0
Conv2d-227 [-1, 1056, 14, 14] 675,840
BatchNorm2d-228 [-1, 1056, 14, 14] 2,112
ReLU-229 [-1, 1472, 14, 14] 0
Block-230 [-1, 1472, 14, 14] 0
Conv2d-231 [-1, 640, 14, 14] 942,080
BatchNorm2d-232 [-1, 640, 14, 14] 1,280
ReLU-233 [-1, 640, 14, 14] 0
Conv2d-234 [-1, 640, 14, 14] 92,160
BatchNorm2d-235 [-1, 640, 14, 14] 1,280
ReLU-236 [-1, 640, 14, 14] 0
Conv2d-237 [-1, 1056, 14, 14] 675,840
BatchNorm2d-238 [-1, 1056, 14, 14] 2,112
ReLU-239 [-1, 1504, 14, 14] 0
Block-240 [-1, 1504, 14, 14] 0
Conv2d-241 [-1, 640, 14, 14] 962,560
BatchNorm2d-242 [-1, 640, 14, 14] 1,280
ReLU-243 [-1, 640, 14, 14] 0
Conv2d-244 [-1, 640, 14, 14] 92,160
BatchNorm2d-245 [-1, 640, 14, 14] 1,280
ReLU-246 [-1, 640, 14, 14] 0
Conv2d-247 [-1, 1056, 14, 14] 675,840
BatchNorm2d-248 [-1, 1056, 14, 14] 2,112
ReLU-249 [-1, 1536, 14, 14] 0
Block-250 [-1, 1536, 14, 14] 0
Conv2d-251 [-1, 640, 14, 14] 983,040
BatchNorm2d-252 [-1, 640, 14, 14] 1,280
ReLU-253 [-1, 640, 14, 14] 0
Conv2d-254 [-1, 640, 14, 14] 92,160
BatchNorm2d-255 [-1, 640, 14, 14] 1,280
ReLU-256 [-1, 640, 14, 14] 0
Conv2d-257 [-1, 1056, 14, 14] 675,840
BatchNorm2d-258 [-1, 1056, 14, 14] 2,112
ReLU-259 [-1, 1568, 14, 14] 0
Block-260 [-1, 1568, 14, 14] 0
Conv2d-261 [-1, 640, 14, 14] 1,003,520
BatchNorm2d-262 [-1, 640, 14, 14] 1,280
ReLU-263 [-1, 640, 14, 14] 0
Conv2d-264 [-1, 640, 14, 14] 92,160
BatchNorm2d-265 [-1, 640, 14, 14] 1,280
ReLU-266 [-1, 640, 14, 14] 0
Conv2d-267 [-1, 1056, 14, 14] 675,840
BatchNorm2d-268 [-1, 1056, 14, 14] 2,112
ReLU-269 [-1, 1600, 14, 14] 0
Block-270 [-1, 1600, 14, 14] 0
Conv2d-271 [-1, 640, 14, 14] 1,024,000
BatchNorm2d-272 [-1, 640, 14, 14] 1,280
ReLU-273 [-1, 640, 14, 14] 0
Conv2d-274 [-1, 640, 14, 14] 92,160
BatchNorm2d-275 [-1, 640, 14, 14] 1,280
ReLU-276 [-1, 640, 14, 14] 0
Conv2d-277 [-1, 1056, 14, 14] 675,840
BatchNorm2d-278 [-1, 1056, 14, 14] 2,112
ReLU-279 [-1, 1632, 14, 14] 0
Block-280 [-1, 1632, 14, 14] 0
Conv2d-281 [-1, 640, 14, 14] 1,044,480
BatchNorm2d-282 [-1, 640, 14, 14] 1,280
ReLU-283 [-1, 640, 14, 14] 0
Conv2d-284 [-1, 640, 14, 14] 92,160
BatchNorm2d-285 [-1, 640, 14, 14] 1,280
ReLU-286 [-1, 640, 14, 14] 0
Conv2d-287 [-1, 1056, 14, 14] 675,840
BatchNorm2d-288 [-1, 1056, 14, 14] 2,112
ReLU-289 [-1, 1664, 14, 14] 0
Block-290 [-1, 1664, 14, 14] 0
Conv2d-291 [-1, 640, 14, 14] 1,064,960
BatchNorm2d-292 [-1, 640, 14, 14] 1,280
ReLU-293 [-1, 640, 14, 14] 0
Conv2d-294 [-1, 640, 14, 14] 92,160
BatchNorm2d-295 [-1, 640, 14, 14] 1,280
ReLU-296 [-1, 640, 14, 14] 0
Conv2d-297 [-1, 1056, 14, 14] 675,840
BatchNorm2d-298 [-1, 1056, 14, 14] 2,112
ReLU-299 [-1, 1696, 14, 14] 0
Block-300 [-1, 1696, 14, 14] 0
Conv2d-301 [-1, 1280, 14, 14] 2,170,880
BatchNorm2d-302 [-1, 1280, 14, 14] 2,560
ReLU-303 [-1, 1280, 14, 14] 0
Conv2d-304 [-1, 1280, 7, 7] 368,640
BatchNorm2d-305 [-1, 1280, 7, 7] 2,560
ReLU-306 [-1, 1280, 7, 7] 0
Conv2d-307 [-1, 2176, 7, 7] 2,785,280
BatchNorm2d-308 [-1, 2176, 7, 7] 4,352
Conv2d-309 [-1, 2176, 7, 7] 33,214,464
BatchNorm2d-310 [-1, 2176, 7, 7] 4,352
ReLU-311 [-1, 2304, 7, 7] 0
Block-312 [-1, 2304, 7, 7] 0
Conv2d-313 [-1, 1280, 7, 7] 2,949,120
BatchNorm2d-314 [-1, 1280, 7, 7] 2,560
ReLU-315 [-1, 1280, 7, 7] 0
Conv2d-316 [-1, 1280, 7, 7] 368,640
BatchNorm2d-317 [-1, 1280, 7, 7] 2,560
ReLU-318 [-1, 1280, 7, 7] 0
Conv2d-319 [-1, 2176, 7, 7] 2,785,280
BatchNorm2d-320 [-1, 2176, 7, 7] 4,352
ReLU-321 [-1, 2432, 7, 7] 0
Block-322 [-1, 2432, 7, 7] 0
Conv2d-323 [-1, 1280, 7, 7] 3,112,960
BatchNorm2d-324 [-1, 1280, 7, 7] 2,560
ReLU-325 [-1, 1280, 7, 7] 0
Conv2d-326 [-1, 1280, 7, 7] 368,640
BatchNorm2d-327 [-1, 1280, 7, 7] 2,560
ReLU-328 [-1, 1280, 7, 7] 0
Conv2d-329 [-1, 2176, 7, 7] 2,785,280
BatchNorm2d-330 [-1, 2176, 7, 7] 4,352
ReLU-331 [-1, 2560, 7, 7] 0
Block-332 [-1, 2560, 7, 7] 0
AdaptiveAvgPool2d-333 [-1, 2560, 1, 1] 0
Linear-334 [-1, 4] 10,244
================================================================
Total params: 95,008,356
Trainable params: 95,008,356
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 664.46
Params size (MB): 362.43
Estimated Total Size (MB): 1027.47
----------------------------------------------------------------
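The spatial sizes in the summary (224, 112, 55, 28, 14, 7) follow from the usual output-size formula for convolution and pooling layers. Here is a plain-Python sketch; `conv_out_size` is a hypothetical helper, and the kernel, stride, and padding values are the ones used in the model code above.

```python
def conv_out_size(size, kernel, stride, padding):
    """Output height/width of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 224
size = conv_out_size(size, kernel=7, stride=2, padding=3)  # stem convolution
print(size)  # 112
size = conv_out_size(size, kernel=3, stride=2, padding=0)  # max pooling
print(size)  # 55
sizes = [size]
for stride in (1, 2, 2, 2):  # first 3x3 convolution of each of the four stages
    sizes.append(conv_out_size(sizes[-1], kernel=3, stride=stride, padding=1))
print(sizes)  # [55, 55, 28, 14, 7]
```

The 55 (rather than the more common 56) comes from `padding=0` in the max pooling layer.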
# Training function
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)  # number of training samples
    num_batches = len(dataloader)   # number of batches
    train_acc, train_loss = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        # Forward pass and loss
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backward pass and parameter update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Accumulate the loss and the number of correct predictions
        train_loss += loss.item()
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
    train_loss /= num_batches
    train_acc /= size
    return train_acc, train_loss
# Test function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)  # size of the test set
    num_batches = len(dataloader)   # number of batches (size / batch_size, rounded up)
    test_loss, test_acc = 0, 0
    # No training happens here, so disable gradient tracking to save memory
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)
            # Compute the loss
            target_pred = model(imgs)
            loss = loss_fn(target_pred, target)
            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()
    test_acc /= size
    test_loss /= num_batches
    return test_acc, test_loss
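Both functions count a prediction as correct when the class with the highest score equals the label, sum those counts over all batches, and divide by the number of samples. Here is a plain-Python sketch of that counting; `argmax` and `count_correct` are hypothetical helpers, and the scores are made up for illustration.

```python
def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def count_correct(logits_batch, labels):
    """Number of samples whose highest-scoring class equals the label."""
    return sum(argmax(logits) == label for logits, label in zip(logits_batch, labels))

# Two toy batches with 4 classes each: (scores per sample, labels)
batches = [
    ([[2.0, 0.1, 0.3, 0.0], [0.2, 0.1, 1.5, 0.3]], [0, 1]),
    ([[0.0, 0.0, 0.1, 3.0]], [3]),
]
correct = sum(count_correct(logits, labels) for logits, labels in batches)
total = sum(len(labels) for _, labels in batches)
print(correct, total)   # 2 3
print(correct / total)  # 0.6666666666666666
```

Accuracy is divided by the number of samples, while the loss is divided by the number of batches, so a smaller last batch weighs slightly more in the average loss.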
import copy
loss_fn = nn.CrossEntropyLoss()
learn_rate = 1e-4
# SGD or Adam optimizer; choose one
# opt = torch.optim.SGD(model.parameters(), lr=learn_rate)
opt = torch.optim.Adam(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.StepLR(opt, step_size=1, gamma=0.9)  # learning rate scheduler
epochs = 100   # upper bound on the number of training epochs; early stopping may end training sooner
patience = 10  # early-stopping patience: stop after 10 consecutive epochs without a test-accuracy improvement
train_loss = []
train_acc = []
test_loss = []
test_acc = []
best_acc = 0          # best test accuracy so far, used to select the best model
no_improve_epoch = 0  # counts consecutive epochs without an accuracy improvement
epoch = 0             # index of the last epoch reached; used later for plotting, since training may stop before epochs = 100
# Start training
for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)
        no_improve_epoch = 0  # reset the counter
        # Save a checkpoint of the best model
        PATH = './J4_best_model(J4_DNF98).pth'
        torch.save({
            'epoch': epoch,
            'model_state_dict': best_model.state_dict(),
            'optimizer_state_dict': opt.state_dict(),
            'loss': epoch_test_loss,
        }, PATH)
    else:
        no_improve_epoch += 1
    if no_improve_epoch >= patience:
        print(f"Early stoping triggered at epoch {epoch+1}")
        break  # early stop
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    scheduler.step()  # update the learning rate
    lr = opt.state_dict()['param_groups'][0]['lr']
    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.2E}')
    print(template.format(epoch+1, epoch_train_acc*100, epoch_train_loss, epoch_test_acc*100, epoch_test_loss, lr))
# Save the weights of `model` as it stands when training ends (the last epoch reached).
# The best-accuracy checkpoint was already written inside the loop; to store the best
# weights here instead, best_model.state_dict() would be the thing to save.
PATH = './j4_best_model_DNF98.pth'  # file name for the saved weights
torch.save(model.state_dict(), PATH)
print('Done')
print(epoch)
print('no_improve_epoch:', no_improve_epoch)
Output:
Epoch: 1, Train_acc:37.4%, Train_loss:1.406, Test_acc:50.4%, Test_loss:1.136, Lr:9.00E-05
Epoch: 2, Train_acc:59.3%, Train_loss:0.960, Test_acc:60.2%, Test_loss:1.260, Lr:8.10E-05
Epoch: 3, Train_acc:70.1%, Train_loss:0.801, Test_acc:69.9%, Test_loss:0.913, Lr:7.29E-05
Epoch: 4, Train_acc:75.0%, Train_loss:0.624, Test_acc:70.8%, Test_loss:0.926, Lr:6.56E-05
Epoch: 5, Train_acc:75.7%, Train_loss:0.629, Test_acc:51.3%, Test_loss:1.493, Lr:5.90E-05
Epoch: 6, Train_acc:81.2%, Train_loss:0.467, Test_acc:73.5%, Test_loss:0.799, Lr:5.31E-05
Epoch: 7, Train_acc:85.6%, Train_loss:0.373, Test_acc:75.2%, Test_loss:0.684, Lr:4.78E-05
Epoch: 8, Train_acc:88.7%, Train_loss:0.324, Test_acc:79.6%, Test_loss:0.649, Lr:4.30E-05
Epoch: 9, Train_acc:88.5%, Train_loss:0.330, Test_acc:78.8%, Test_loss:0.687, Lr:3.87E-05
Epoch:10, Train_acc:90.7%, Train_loss:0.268, Test_acc:81.4%, Test_loss:0.597, Lr:3.49E-05
Epoch:11, Train_acc:93.4%, Train_loss:0.187, Test_acc:83.2%, Test_loss:0.613, Lr:3.14E-05
Epoch:12, Train_acc:93.4%, Train_loss:0.221, Test_acc:85.8%, Test_loss:0.573, Lr:2.82E-05
Epoch:13, Train_acc:92.9%, Train_loss:0.183, Test_acc:83.2%, Test_loss:0.504, Lr:2.54E-05
Epoch:14, Train_acc:96.5%, Train_loss:0.135, Test_acc:80.5%, Test_loss:0.590, Lr:2.29E-05
Epoch:15, Train_acc:92.7%, Train_loss:0.205, Test_acc:84.1%, Test_loss:0.456, Lr:2.06E-05
Epoch:16, Train_acc:97.1%, Train_loss:0.102, Test_acc:81.4%, Test_loss:0.907, Lr:1.85E-05
Epoch:17, Train_acc:96.0%, Train_loss:0.118, Test_acc:86.7%, Test_loss:0.571, Lr:1.67E-05
Epoch:18, Train_acc:97.1%, Train_loss:0.085, Test_acc:84.1%, Test_loss:0.512, Lr:1.50E-05
Epoch:19, Train_acc:98.2%, Train_loss:0.070, Test_acc:82.3%, Test_loss:0.498, Lr:1.35E-05
Epoch:20, Train_acc:97.6%, Train_loss:0.064, Test_acc:85.0%, Test_loss:0.524, Lr:1.22E-05
Epoch:21, Train_acc:98.0%, Train_loss:0.059, Test_acc:86.7%, Test_loss:0.620, Lr:1.09E-05
Epoch:22, Train_acc:99.1%, Train_loss:0.048, Test_acc:86.7%, Test_loss:0.484, Lr:9.85E-06
Epoch:23, Train_acc:98.2%, Train_loss:0.056, Test_acc:87.6%, Test_loss:0.506, Lr:8.86E-06
Epoch:24, Train_acc:99.3%, Train_loss:0.034, Test_acc:85.0%, Test_loss:0.396, Lr:7.98E-06
Epoch:25, Train_acc:98.5%, Train_loss:0.063, Test_acc:85.0%, Test_loss:0.475, Lr:7.18E-06
Epoch:26, Train_acc:99.1%, Train_loss:0.039, Test_acc:84.1%, Test_loss:0.558, Lr:6.46E-06
Epoch:27, Train_acc:98.2%, Train_loss:0.048, Test_acc:86.7%, Test_loss:0.718, Lr:5.81E-06
Epoch:28, Train_acc:99.3%, Train_loss:0.027, Test_acc:86.7%, Test_loss:0.462, Lr:5.23E-06
Epoch:29, Train_acc:98.9%, Train_loss:0.033, Test_acc:86.7%, Test_loss:0.868, Lr:4.71E-06
Epoch:30, Train_acc:99.6%, Train_loss:0.026, Test_acc:87.6%, Test_loss:0.451, Lr:4.24E-06
Epoch:31, Train_acc:99.1%, Train_loss:0.038, Test_acc:85.8%, Test_loss:0.471, Lr:3.82E-06
Epoch:32, Train_acc:99.3%, Train_loss:0.029, Test_acc:86.7%, Test_loss:0.459, Lr:3.43E-06
Epoch:33, Train_acc:98.5%, Train_loss:0.052, Test_acc:89.4%, Test_loss:0.484, Lr:3.09E-06
Epoch:34, Train_acc:98.9%, Train_loss:0.048, Test_acc:88.5%, Test_loss:0.484, Lr:2.78E-06
Epoch:35, Train_acc:98.2%, Train_loss:0.104, Test_acc:86.7%, Test_loss:0.493, Lr:2.50E-06
Epoch:36, Train_acc:99.3%, Train_loss:0.029, Test_acc:87.6%, Test_loss:0.450, Lr:2.25E-06
Epoch:37, Train_acc:99.8%, Train_loss:0.027, Test_acc:87.6%, Test_loss:0.505, Lr:2.03E-06
Epoch:38, Train_acc:99.6%, Train_loss:0.020, Test_acc:87.6%, Test_loss:0.438, Lr:1.82E-06
Epoch:39, Train_acc:99.6%, Train_loss:0.032, Test_acc:87.6%, Test_loss:0.474, Lr:1.64E-06
Epoch:40, Train_acc:99.3%, Train_loss:0.024, Test_acc:85.8%, Test_loss:0.497, Lr:1.48E-06
Epoch:41, Train_acc:99.1%, Train_loss:0.033, Test_acc:85.0%, Test_loss:0.486, Lr:1.33E-06
Epoch:42, Train_acc:100.0%, Train_loss:0.013, Test_acc:86.7%, Test_loss:0.443, Lr:1.20E-06
Early stoping triggered at epoch 43
Done
42
no_improve_epoch: 10
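The `Lr` column of the log can be reproduced by hand: with `step_size=1` and `gamma=0.9`, the learning rate is multiplied by 0.9 after every epoch. Here is a closed-form sketch in plain Python; `step_lr` is a hypothetical helper, not the PyTorch scheduler itself.

```python
def step_lr(base_lr, gamma, step_size, epochs_done):
    """Learning rate after `epochs_done` scheduler steps under a step decay."""
    return base_lr * gamma ** (epochs_done // step_size)

for epochs_done in (1, 2, 42):
    print(epochs_done, f"{step_lr(1e-4, 0.9, 1, epochs_done):.2E}")
# 1 9.00E-05
# 2 8.10E-05
# 42 1.20E-06
```

By epoch 42 the learning rate has dropped to about 1% of its starting value, so the late epochs change the weights very little.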
# Visualize the results
# Loss and accuracy curves
import matplotlib.pyplot as plt
# Hide warnings
import warnings
warnings.filterwarnings("ignore")
plt.rcParams['font.sans-serif'] = ['SimHei']  # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False    # render the minus sign correctly
plt.rcParams['figure.dpi'] = 100              # resolution
epochs_range = range(len(train_acc))  # one point per recorded epoch
print(epochs_range)
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Test Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Test Loss')
plt.show()
Output:
range(0, 42)
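The plot has 42 points even though early stopping fired at epoch 43, because the loop breaks before the metrics of the stopping epoch are appended. In the log above the best test accuracy (89.4%) comes at epoch 33, and ten epochs without improvement follow. Here is a plain-Python sketch of the same patience logic; `run_with_early_stopping` is a hypothetical helper and the accuracies in the demo are made up.

```python
def run_with_early_stopping(test_accs, patience):
    """Replay a sequence of test accuracies through a patience-based early stop.

    Returns (recorded, stop_epoch): how many epochs were appended to the
    history, and the 1-based epoch at which the stop fired (None if never).
    """
    best_acc = 0.0
    no_improve = 0
    recorded = 0
    for epoch, acc in enumerate(test_accs):
        if acc > best_acc:
            best_acc = acc
            no_improve = 0
        else:
            no_improve += 1
            if no_improve >= patience:
                return recorded, epoch + 1  # stop before recording this epoch
        recorded += 1
    return recorded, None

# The best value comes at epoch 2; three epochs without improvement follow.
print(run_with_early_stopping([0.5, 0.7, 0.6, 0.7, 0.65], patience=3))  # (4, 5)
print(run_with_early_stopping([0.5, 0.6, 0.7], patience=3))             # (3, None)
```

Note that an epoch that only ties the best accuracy still counts as "no improvement", as in the training loop above.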
# Prediction
from PIL import Image
classes = list(total_data.class_to_idx)
def predict_one_image(image_path, model, transform, classes):
    test_img = Image.open(image_path).convert('RGB')
    plt.imshow(test_img)  # show the image being predicted
    test_img = transform(test_img)
    img = test_img.to(device).unsqueeze(0)  # add a batch dimension
    model.eval()
    with torch.no_grad():  # inference only, no gradients needed
        output = model(img)
    _, pred = torch.max(output, 1)
    pred_class = classes[pred.item()]
    print(f'预测结果是:{pred_class}')  # prints "Prediction result: <class name>"
import os
from pathlib import Path
import random
# Two versions of image_path(data_dir) are defined below (the first is commented out).
# Both pick one random image from the whole dataset, and either one can be used.
# def image_path(data_dir):
#     file_list = os.listdir(data_dir)                     # list the four class folders
#     data_file_dir = random.choice(file_list)             # pick one of the four classes at random
#     data_dir = Path(data_dir)                            # data_dir is a string; convert it to a Path
#     data_file_dir = Path(data_file_dir)                  # data_file_dir is a string; convert it to a Path
#     image_file_path = data_dir.joinpath(data_file_dir)   # join the paths
#     data_file_paths = image_file_path.iterdir()          # iterate over the folder contents
#     data_file_paths = list(data_file_paths)              # convert to a list
#     file = random.choice(data_file_paths)                # pick one image at random
#     return file
def image_path(data_dir):
    image = []  # one list of image paths per class
    file_list = os.listdir(data_dir)  # list the four class folders
    data_file_dir = file_list         # keep all class folders; the random choice happens below
    data_dir = Path(data_dir)
    for i in data_file_dir:
        i = Path(i)
        image_file_path = data_dir.joinpath(i)       # join the paths
        data_file_paths = image_file_path.iterdir()  # iterate over the folder contents
        data_file_paths = list(data_file_paths)      # convert to a list
        image.append(data_file_paths)
    file = random.choice(image)  # pick one class at random
    file = random.choice(file)   # pick one image from that class at random
    return file
data_dir = 'J3/bird_photos'
image_path = image_path(data_dir)  # rebinds the name image_path from the function to the chosen path
image_path
Output:
WindowsPath('J3/bird_photos/Black Skimmer/015.jpg')
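A shorter alternative is to collect every file under `data_dir/<class>/` with one glob pattern and choose from the combined list. This is only a sketch under the assumption that the dataset has exactly one folder level per class; `random_image` is a hypothetical helper, and the demo builds a throwaway directory with made-up file names so it can run anywhere.

```python
import random
import tempfile
from pathlib import Path

def random_image(data_dir, rng=random):
    """Pick one file uniformly at random from data_dir/<class>/<file>."""
    files = sorted(p for p in Path(data_dir).glob("*/*") if p.is_file())
    if not files:
        raise FileNotFoundError(f"no files found under {data_dir}")
    return rng.choice(files)

# Self-contained demo on a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for cls in ("Bananaquit", "Cockatoo"):
        (Path(tmp) / cls).mkdir()
        for i in range(3):
            (Path(tmp) / cls / f"{i:03d}.jpg").touch()
    picked = random_image(tmp)
    print(picked.parent.name, picked.name)
```

Unlike the two-step version above, which first picks a class and then an image, this picks uniformly over all images, so larger classes are chosen more often.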
# Predict one photo from the dataset
predict_one_image(image_path=image_path,
                  model=model,
                  transform=train_transforms,
                  classes=classes)
Output:
预测结果是:Black Skimmer
# Model evaluation
# Load the weights stored at PATH (the file written after training) into best_model
best_model.load_state_dict(torch.load(PATH, map_location=device))
epoch_test_acc, epoch_test_loss = test(test_dl, best_model, loss_fn)
epoch_test_acc, epoch_test_loss
Output:
(0.8672566371681416, 0.49821303264858824)
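III. Summary
This task was honestly a bit difficult for me. I tried alternating residual connections and dense connections between layers, and I also tried running ResNet50V2 first and then DenseNet (or DenseNet first and then ResNet50V2), but none of these attempts worked. In the end I had to study and run someone else's mature model.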