References:
【1】https://www.bilibili.com/video/BV1rX4y1N7tE/?spm_id_from=333.788&vd_source=9e9b4b6471a6e98c3e756ce7f41eb134
【2】https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/blob/master/pytorch_classification/Test5_resnet/model.py
1 ResNet basic structure and points to note
1.1 Model structure diagram
The basic building block is conv -> bn -> relu. Assuming the input image is (224, 224, 3), the flow is as follows:
1) Layer1: the convolution parameters are (kernel_size=7, stride=2, padding=3), so the output size is $\frac{224-7+6}{2}+1 = 112$ (rounded down). Then MaxPool2d halves the height and width again, giving (56, 56, 64). (See the shape-check sketch after this list.)
2) Layer2 (for resnet18/34): here the residual block parameters are (kernel_size=3, stride=1, padding=1), so height, width, and channel count are all unchanged and the output feature map stays (56, 56, 64).
3) Layer2 (for resnet50/101/152): here the output channel count becomes 256, so to form the residual connection the first block must include a 1x1 convolution to raise the dimension. The height and width stay the same, so the 1x1 conv on the residual path uses stride 1 and padding 0.
4) Layers 3/4/5: the first block of each must change both the channel count and the height/width. For resnet18/34, set stride=2 on the first 3x3 conv; for resnet50/101/152, set stride=2 on the 3x3 conv; and also set stride=2 on the 1x1 conv on the residual path, as shown in the figure.
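To double-check these shapes, here is a minimal sketch (my own snippet, not from the referenced repo) that pushes a (224, 224, 3) input through the stem from step 1:

import torch
import torch.nn as nn

# Shape check for step 1 above (illustrative only).
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),  # (3,224,224) -> (64,112,112)
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),                  # (64,112,112) -> (64,56,56)
)
x = torch.rand(1, 3, 224, 224)
print(stem(x).shape)  # torch.Size([1, 64, 56, 56])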
1.2 Parameter count
1.3 Why the residual structure helps
Here is just a simple mathematical argument. Suppose a simple residual block is fn+bn -> relu -> fn+bn -> residual, and denote the whole block as a function $f(x, w)$. Then the output is:
$y = x + f(x, w)$
Then
$\frac{\partial y}{\partial x} = I + \frac{\partial f(x,w)}{\partial x},$
and by the chain rule
$\frac{\partial l}{\partial x} = \frac{\partial y}{\partial x}\frac{\partial l}{\partial y} = \left(I + \frac{\partial f(x,w)}{\partial x}\right)\frac{\partial l}{\partial y}.$
So each time the gradient is propagated backward, the identity term keeps it from becoming vanishingly small, and since the SGD optimizer's updates are driven directly by the gradient, training stays effective even for deep networks.
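As a quick numerical illustration (a toy sketch of my own with scalar x and w, not the full network):

import torch

# Toy check of dy/dx = 1 + df/dx for a scalar "block" f(x, w) = w * x.
# Even when df/dx is tiny, the skip connection keeps the gradient near 1.
x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(0.01)
f = w * x                                                # stand-in for f(x, w)
(g_plain,) = torch.autograd.grad(f, x, retain_graph=True)
(g_res,) = torch.autograd.grad(x + f, x)
print(g_plain.item())  # 0.01 -> gradient nearly vanishes without the skip
print(g_res.item())    # 1.01 -> identity term keeps it alive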
2 Improvements of ResNeXt over ResNet
2.1 Group Convolution
Reference: https://blog.csdn.net/caip12999203000/article/details/126693895
As shown there, a group convolution has $\frac{1}{g}$ the parameters of an ordinary convolution, which acts somewhat like regularization;
the drawback is that the groups do not communicate with each other.
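The ratio is easy to verify in PyTorch (a sketch; the channel counts 64 and 128 are arbitrary choices of mine):

import torch.nn as nn

# A grouped conv has 1/g the parameters of an ordinary conv.
def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

plain = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False)
grouped = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False, groups=4)
print(n_params(plain))    # 64 * 128 * 3 * 3 = 73728
print(n_params(grouped))  # 73728 / 4       = 18432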
2.2 The Block
- The three structures above are equivalent, so the final form (c) can replace the original residual block
- C is the number of groups (cardinality) of the group convolution, and 4 is the number of channels in each group
- Overall, ResNeXt is much the same as ResNet; the differences are that the output channels of the first two convs in the residual block are doubled, and the middle 3x3 conv is replaced by a group convolution with 32 groups of 4 channels each
2.3 Note
Group convolution is only meaningful when the block depth is >= 3, so it is generally only ResNet50 and deeper that get converted to ResNeXt.
3 Hand-written ResNet and ResNeXt code
3.1 ResNet
3.1.1 BasicBlock
The rules for this block are as follows:
- Both the first and second convolutions have kernel_size 3 and padding 1, and there are only these two convolutions
- If there is no 1x1 conv on the residual path, the first conv has stride 1; if there is one, the first conv has stride 2. The second conv always has stride 1
- Both convs output out_channel channels, so set the variable expansion = 1 here
- A downsample argument decides whether the identity branch needs downsampling
- The convs take no bias, because BN layers follow them
Hand-written code:
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_channel, out_channel, stride=1, downsample=None, **kwargs):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channel,
                               out_channels=out_channel,
                               kernel_size=3,
                               stride=stride, padding=1,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.conv2 = nn.Conv2d(in_channels=out_channel,
                               out_channels=out_channel,
                               kernel_size=3,
                               stride=1, padding=1,
                               bias=False)  # stride is fixed to 1 here
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.relu = nn.ReLU()
        self.downsample = downsample

    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)
        x = self.relu(self.bn1(self.conv1(x)))
        x = self.bn2(self.conv2(x))
        x += identity  # add the (possibly downsampled) identity before the final ReLU
        out = self.relu(x)
        return out
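A quick sanity check (my own snippet, assuming the BasicBlock class above): without a downsample path the shape is preserved; with stride=2 plus a 1x1 downsample conv, height/width halve and channels grow:

# Sanity check for BasicBlock (illustrative only).
blk = BasicBlock(64, 64)                      # stride=1, no downsample
x = torch.rand(1, 64, 56, 56)
print(blk(x).shape)                           # torch.Size([1, 64, 56, 56])

down = nn.Sequential(nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
                     nn.BatchNorm2d(128))
blk2 = BasicBlock(64, 128, stride=2, downsample=down)
print(blk2(x).shape)                          # torch.Size([1, 128, 28, 28])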
3.1.2 BottleNeck
The rules for this block are as follows:
- There are three convolutions, with kernel_size 1, 3, 1 respectively
- The third conv's out_channel is 4x that of the first two convs, so set expansion = 4
- When there is a 1x1 conv on the residual path, the second conv has stride 2
Hand-written code:
class BottleNeck(nn.Module):
    expansion = 4

    def __init__(self, in_channel, out_channel, stride=1, downsample=None, **kwargs):
        super().__init__()
        self.downsample = downsample
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,
                               kernel_size=1, stride=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,
                               kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.conv3 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel * self.expansion,
                               kernel_size=1, stride=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channel * self.expansion)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)
        x = self.relu(self.bn1(self.conv1(x)))
        x = self.relu(self.bn2(self.conv2(x)))
        x = self.bn3(self.conv3(x))
        x += identity
        return self.relu(x)
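A quick sanity check (my own snippet, assuming the BottleNeck class above), reproducing the layer2 case where channels go 64 -> 256 with unchanged height/width:

# Sanity check for BottleNeck (illustrative only): output channels are 4x out_channel.
down = nn.Sequential(nn.Conv2d(64, 64 * BottleNeck.expansion, kernel_size=1, stride=1, bias=False),
                     nn.BatchNorm2d(64 * BottleNeck.expansion))
blk = BottleNeck(64, 64, stride=1, downsample=down)
x = torch.rand(1, 64, 56, 56)
print(blk(x).shape)  # torch.Size([1, 256, 56, 56])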
Mistakes I made when writing this from memory:
- The first and third convs have no padding; only the second conv has padding=1!!
- nn.Conv2d only accepts a groups argument; it does not accept width_per_group, which is only used to compute width
3.1.3 ResNet
A short summary of how the blocks are created:
- layer1 is the same for every variant: a 7x7 conv (k=7, s=2, p=3) followed by a maxpool layer
- For layer2, resnet18/34 and resnet50/101/152 differ; comparing the first conv's out_channel with the last conv's out_channel tells you which case you are in (e.g. for resnet18 both are 64, while for resnet50 they are 64 and 64*4=256), so this is one condition for whether a downsample is needed
- For layer2 the stride passed in is 1, since the stride is 1 whether or not there is a residual 1x1 conv
- For layers 3/4/5 the stride passed in is 2; the second condition for needing a downsample is whether the stride is 1, and if not, the first block needs a downsample
- For the 1x1 conv on the residual path, out_channel is the first conv's out_channel times expansion
Parameters of __init__:
- block: BasicBlock for resnet18/34, or BottleNeck for resnet50/101/152
- blocks_num: a list giving how many times the block is repeated in each layer, e.g. [3, 4, 6, 3] for resnet50
- num_classes: number of classes, used by the classification head
- include_top: whether to include the classification head, i.e. everything after layer5
Parameters of make_layer and notes:
- block: note that the block takes in_c, out_c, stride, and downsample
- channel: the out_channel of the first conv in the block
- block_num: as above
- stride: the stride
- Initialize downsample to None, then check two conditions: 1) whether the stride differs from 1; 2) whether the first conv's out_channel (i.e. channel) differs from the last conv's out_channel (i.e. channel * block.expansion). If either holds, a downsample is needed. The first condition covers layers 3/4/5; the second covers layer2.
- downsample is a 1x1 conv (k=1, s=stride, padding=0, out_channel = channel * block.expansion) followed by BN
- After the first block is done, loop over the remaining blocks; these need no downsample, so just append the block directly
Hand-written code:
class ResNet(nn.Module):
    def __init__(self,
                 block,
                 blocks_num,
                 num_classes=1000,
                 include_top=True):
        super().__init__()
        self.include_top = include_top
        self.in_channel = 64
        self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self.make_layer(block, 64, blocks_num[0])
        self.layer2 = self.make_layer(block, 128, blocks_num[1], stride=2)
        self.layer3 = self.make_layer(block, 256, blocks_num[2], stride=2)
        self.layer4 = self.make_layer(block, 512, blocks_num[3], stride=2)
        if self.include_top:
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
            self.fc = nn.Linear(512 * block.expansion, num_classes)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    def make_layer(self, block, channel, block_num, stride=1):
        downsample = None
        if stride != 1 or self.in_channel != channel * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(channel * block.expansion))
        layers = []
        # the block expands the channels itself, so pass channel (not channel * expansion)
        layers.append(block(self.in_channel, channel, stride=stride, downsample=downsample))
        self.in_channel = channel * block.expansion
        for _ in range(1, block_num):
            layers.append(block(self.in_channel, channel))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.maxpool(self.relu(self.bn1(self.conv1(x))))
        x = self.layer4(self.layer3(self.layer2(self.layer1(x))))
        if self.include_top:
            x = self.avgpool(x)
            x = torch.flatten(x, 1)
            x = self.fc(x)
        return x
Test code:
def resnet34(num_classes=1000, include_top=True):
    return ResNet(BasicBlock, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)

model = resnet34(3)  # num_classes=3
x = torch.rand((2, 3, 224, 224))
out = model(x)
print(out)  # out has shape (2, 3)
3.2 ResNeXt
Changes needed on top of ResNet50:
- The output channels of the first two convolutions are doubled, using the formula below
- The second convolution becomes a group conv, with groups=32 and width_per_group=4
- width = int(out_channel * (width_per_group / 64.)) * groups; if groups=1 and width_per_group=64 the factor is 1, while for groups=32 and width_per_group=4 the factor is 2, so width is twice out_channel (see the check after this list)
- Only BottleNeck and the make_layer function need changes!
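The width formula can be checked directly (a small sketch of my own):

# Width factor for the first two convs of the ResNeXt bottleneck.
def width(out_channel, groups, width_per_group):
    return int(out_channel * (width_per_group / 64.)) * groups

print(width(64, 1, 64))   # 64  -> plain ResNet50, factor 1
print(width(64, 32, 4))   # 128 -> ResNeXt50 32x4d, factor 2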
Modified code:
- The BottleNeck part
class BottleNeck(nn.Module):
    expansion = 4

    def __init__(self, in_channel, out_channel, stride=1, downsample=None,
                 groups=1, width_per_group=64, **kwargs):
        # ResNeXt: add width; widen the first two convs; give the second conv groups
        super().__init__()
        width = int(out_channel * (width_per_group / 64.)) * groups
        self.downsample = downsample
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=width,
                               kernel_size=1, stride=1, bias=False)
        self.bn1 = nn.BatchNorm2d(width)
        self.conv2 = nn.Conv2d(in_channels=width, out_channels=width,
                               kernel_size=3, padding=1, stride=stride,
                               bias=False, groups=groups)
        self.bn2 = nn.BatchNorm2d(width)
        self.conv3 = nn.Conv2d(in_channels=width, out_channels=out_channel * self.expansion,
                               kernel_size=1, stride=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channel * self.expansion)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)
        x = self.relu(self.bn1(self.conv1(x)))
        x = self.relu(self.bn2(self.conv2(x)))
        x = self.bn3(self.conv3(x))
        x += identity
        return self.relu(x)
- The ResNet part (the two new arguments must be stored on self, since make_layer reads them)

class ResNet(nn.Module):
    def __init__(self, block, blocks_num, num_classes=1000, include_top=True,
                 groups=1, width_per_group=64):
        # ... same as before, plus:
        self.groups = groups
        self.width_per_group = width_per_group
- The make_layer part
def make_layer(self, block, channel, block_num, stride=1):
    downsample = None
    if stride != 1 or self.in_channel != channel * block.expansion:
        downsample = nn.Sequential(
            nn.Conv2d(in_channels=self.in_channel, out_channels=block.expansion * channel,
                      kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(block.expansion * channel)
        )
    layers = []
    layers.append(block(self.in_channel,
                        channel,
                        downsample=downsample,
                        stride=stride,
                        groups=self.groups,
                        width_per_group=self.width_per_group))  # pass both group parameters
    self.in_channel = channel * block.expansion
    for _ in range(1, block_num):
        layers.append(block(self.in_channel,
                            channel,
                            groups=self.groups,
                            width_per_group=self.width_per_group))
    return nn.Sequential(*layers)
- Test
def resnext50_32x4d(num_classes=1000, include_top=True):
    # 32x4d: groups=32, width_per_group=4
    return ResNet(BottleNeck, [3, 4, 6, 3], num_classes, include_top, groups=32, width_per_group=4)

#model = resnet34(num_classes=3)
#model = resnet50(3)
model = resnext50_32x4d(3)
x = torch.rand((64, 3, 224, 224))
out = model(x)
print(out)  # out has shape (64, 3)