This article is based on the CSDN blog post "ResNet50 网络结构搭建(PyTorch)" by New WR; see the original post for full details.
一、Principle
The table in the original ResNet paper lists several basic network configurations; ResNet50 is the 50-layer column, reproduced below:

layer name   output size   50-layer
conv1        112x112       7x7, 64, stride 2
conv2_x      56x56         3x3 max pool, stride 2
                           [1x1, 64 | 3x3, 64 | 1x1, 256] x 3
conv3_x      28x28         [1x1, 128 | 3x3, 128 | 1x1, 512] x 4
conv4_x      14x14         [1x1, 256 | 3x3, 256 | 1x1, 1024] x 6
conv5_x      7x7           [1x1, 512 | 3x3, 512 | 1x1, 2048] x 3
             1x1           average pool, 1000-d fc, softmax
First comes the input stage, the layer0 stem: a 7x7 convolution with stride 2, followed by BN + ReLU, and then a 3x3 max pooling layer with stride 2.
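A minimal runnable sketch of this stem (my addition, assuming the standard 224x224 ImageNet input):

import torch
import torch.nn as nn

stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),  # 224 -> 112
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),                  # 112 -> 56
)
print(stem(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 64, 56, 56])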
The remaining stages are built by stacking residual modules, whose basic formula is x + f(x); layer1 is one such stage.
After x enters the module, it is processed along two paths. The first path sends x through the module's convolution stack (in the ResNet50 bottleneck, a 1x1 convolution, a 3x3 convolution, and another 1x1 convolution, each followed by BN, with ReLU after the first two), producing the output f(x). The second path transforms the shape of x itself when necessary, i.e., applies a downsampling projection so that x and f(x) end up with the same shape; the outputs of the two paths are then added. The benefit is that the module's final output keeps the detailed information of the previous layer while adding the strengthened information extracted by the convolutions, which mainly guards against vanishing gradients in deep networks.
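A toy sketch of this addition (my own illustration; the names f and shortcut are placeholders, not part of the original code):

import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)
f = nn.Sequential(            # stand-in for the module's convolution path, stride 2
    nn.Conv2d(64, 256, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(256),
)
shortcut = nn.Sequential(     # 1x1 projection so x matches f(x) in shape
    nn.Conv2d(64, 256, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(256),
)
out = f(x) + shortcut(x)      # x + f(x) only works once the shapes agree
print(out.shape)              # torch.Size([1, 256, 28, 28])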
二、Code
import torch
import torch.nn as nn

# 1. The Bottleneck block used by every stage
class Bottleneck(nn.Module):
    """
    __init__
        in_channel:  number of input channels of the residual block
        out_channel: number of channels of the block's first two convolutions
                     (the block's actual output is out_channel * expansion)
        stride:      stride of the 3x3 convolution
        downsample:  assigned in _make_layer; downsamples the shortcut path (H/2, W/2)
                     and/or widens its channels
    """
    expansion = 4

    def __init__(self, in_channel, out_channel, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        # 1x1 convolution: H, W unchanged; C: in_channel -> out_channel
        self.conv1 = nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        # 3x3 convolution: H/2, W/2 when stride=2 (first block of a stage), otherwise unchanged; C unchanged
        self.conv2 = nn.Conv2d(out_channel, out_channel, kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channel)
        # 1x1 convolution: H, W unchanged; C: out_channel -> 4*out_channel
        self.conv3 = nn.Conv2d(out_channel, out_channel * self.expansion, kernel_size=1, stride=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channel * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
    def forward(self, x):
        identity = x
        # Decide whether x needs downsampling so that x and f(x) have the same shape
        # and can be added. If so, after the shortcut: H/2, W/2, C: in_channel -> 4*out_channel
        if self.downsample is not None:
            identity = self.downsample(x)
        # 1x1 convolution
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        # 3x3 convolution
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        # 1x1 convolution
        out = self.conv3(out)
        out = self.bn3(out)
        # x + f(x)
        out += identity
        out = self.relu(out)
        return out
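# For example (an added illustration, not in the original post), the block can be
# checked on its own:
#   blk = Bottleneck(64, 64, stride=1,
#                    downsample=nn.Sequential(
#                        nn.Conv2d(64, 256, kernel_size=1, stride=1, bias=False),
#                        nn.BatchNorm2d(256)))
#   blk(torch.randn(1, 64, 56, 56)).shape  # -> torch.Size([1, 256, 56, 56])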
# 2. The ResNet network
class ResNet(nn.Module):
    """
    __init__
        block:       the basic module to stack
        block_num:   how many blocks to stack per stage, as a list; for resnet50 it is [3, 4, 6, 3]
        num_classes: output dimension of the final fully connected classifier
    _make_layer
        block:       the basic module to stack
        channel:     number of kernels of the first convolution in each stage's blocks; for resnet50: 64, 128, 256, 512
        block_num:   number of blocks stacked in the current stage
        stride:      convolution stride of the stage's first block
    """

    def __init__(self, block, block_num, num_classes=1000):
        super(ResNet, self).__init__()
        self.in_channel = 64  # output channels of conv1
        # the stage-0 (layer0) convolution module
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=self.in_channel, kernel_size=7,
                               stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block=block, channel=64, block_num=block_num[0],
                                       stride=1)  # H, W unchanged; the shortcut still needs a projection because C changes: 64x4=256
        self.layer2 = self._make_layer(block=block, channel=128, block_num=block_num[1],
                                       stride=2)  # H/2, W/2; shortcut handled by downsample, out_channel=128x4=512
        self.layer3 = self._make_layer(block=block, channel=256, block_num=block_num[2],
                                       stride=2)  # H/2, W/2; shortcut handled by downsample, out_channel=256x4=1024
        self.layer4 = self._make_layer(block=block, channel=512, block_num=block_num[3],
                                       stride=2)  # H/2, W/2; shortcut handled by downsample, out_channel=512x4=2048
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # pool each feature map to (1, 1), so the pooled output dimension equals the channel count
        self.fc = nn.Linear(in_features=512 * block.expansion, out_features=num_classes)
        for m in self.modules():  # weight initialization
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    def _make_layer(self, block, channel, block_num, stride=1):
        downsample = None  # controls the shortcut path
        if stride != 1 or self.in_channel != channel * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(channel * block.expansion)
            )
        layers = []  # the structure of each convi_x stage is collected in this list, i = {2, 3, 4, 5}
        layers.append(block(in_channel=self.in_channel, out_channel=channel, downsample=downsample,
                            stride=stride))  # the first residual block of convi_x; only it needs downsample and stride
        self.in_channel = channel * block.expansion  # by the next call to _make_layer, self.in_channel has already been multiplied by 4
        for _ in range(1, block_num):
            layers.append(block(self.in_channel, channel))
        return nn.Sequential(*layers)
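    # Illustration (added note): for layer1 (channel=64, block_num=3, stride=1)
    # the returned nn.Sequential contains, in order:
    #   Bottleneck(64 -> 256, with a 1x1 projection shortcut)
    #   Bottleneck(256 -> 256, identity shortcut)
    #   Bottleneck(256 -> 256, identity shortcut)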
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x
def resnet50(num_classes=1000):
    return ResNet(block=Bottleneck, block_num=[3, 4, 6, 3], num_classes=num_classes)
if __name__ == '__main__':
    x = torch.randn(1, 3, 224, 224)  # B C H W
    print(x.shape)
    model = resnet50(1000)
    output = model(x)  # call the module itself rather than .forward() so hooks run
    print(model)
    print(output.shape)
    print(output)
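As a quick sanity check (my addition, not part of the original post), the parameter count should match the standard ResNet50 from the paper, about 25.6 million:

model = resnet50(1000)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # expect 25,557,032, matching the torchvision reference implementation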