Pytorch实现ResNet网络模型

0. Pytorch的nn.Conv2d()详解

nn.Conv2d()的使用、形参与隐藏的权重参数

二维卷积应该是最常用的卷积方式了,在Pytorchnn模块中,封装了nn.Conv2d()类作为二维卷积的实现。使用方法和普通的类一样,先实例化再使用。

下面是一个只有一层二维卷积的神经网络,作为nn.Conv2d()方法的使用简介:

class Net(nn.Module):
    def __init__(self):
        nn.Module.__init__(self)
        self.conv2d = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        print(x.requires_grad)
        x = self.conv2d(x)
        return x
    
print(net.conv2d.weight)
print(net.conv2d.bias)

它的形参由Pytorch手册可以查得,前三个参数是必须手动提供的,后面的有默认值。接下来将一一介绍:

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-VONbufHu-1595851424206)(C:\Users\Administrator\Desktop\Resnet\0.png)]

也许有细心的同学已经发现了, 卷积层最重要的可学习参数——权重参数和偏置参数去哪了?在Tensorflow中都是先定义好weightbias,再去定义卷积层的呀!别担心,在Pytorchnn模块中,它是不需要你手动定义网络层的权重和偏置的,这也是体现Pytorch使用简便的地方。当然,如果有小伙伴适应不了这种不定义权重和偏置的方法,Pytorch还提供了nn.Functional函数式编程的方法,其中的F.conv2d()就和Tensorflow一样,要先定义好卷积核的权重和偏置,作为F.conv2d()的形参之一。
回到nn.Conv2d上来,我们可以通过实例名.weight实例名.bias来查看卷积层的权重和偏置,如上图所示。还有小伙伴要问了,那么它们是如何初始化的呢?
首先给结论,在nn模块中,Pytorch对于卷积层的权重和偏置(如果需要偏置)初始化都是采用He初始化的,因为它非常适合于ReLU函数。 这一点大家看Pytorchnn模块中卷积层的源码实现就能清楚地发现了,当然,我们也可以重新对权重等参数进行其他的初始化,可以查看其他教程,此处不再多言。

in_channels

这个很好理解,就是输入的四维张量[N, C, H, W]中的C了,即输入张量的channels数。这个形参是确定权重等可学习参数的shape所必需的。

out_channels

也很好理解,即期望的四维输出张量的channels数。

kernel_size

卷积核的大小,一般我们会使用5x53x3这种左右两个数相同的卷积核,因此这种情况只需要写kernel_size = 5这样的就行了。如果左右两个数不同,比如3x5的卷积核,那么写作kernel_size = (3, 5),注意需要写一个tuple,而不能写一个列表(list)

stride = 1

卷积核在图像窗口上每次平移的间隔,即所谓的步长。

padding = 0

PytorchTensorflow在卷积层实现上最大的差别就在于padding上。
Padding即所谓的图像填充,后面的int型常数代表填充的多少(行数、列数),默认为0。需要注意的是这里的填充包括图像的上下左右,以padding = 1为例,若原始图像大小为32x32,那么padding后的图像大小就变成了34x34,而不是33x33
Pytorch不同于Tensorflow的地方在于,Tensorflow提供的是padding的模式,比如samevalid,且不同模式对应了不同的输出图像尺寸计算公式。而Pytorch则需要手动输入padding的数量,当然,Pytorch这种实现好处就在于输出图像尺寸计算公式是唯一的,即

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-QVXQWLQ6-1595851424214)(C:\Users\Administrator\Desktop\Resnet\10.png)]

当然,上面的公式过于复杂难以记忆。大多数情况下的kernel_sizepadding左右两数均相同,且不采用空洞卷积(dilation默认为1),因此只需要记 O = ( I − K + 2 P ) / S + 1 O = (I - K + 2P)/ S +1 O=(IK+2P)/S+1这种在深度学习课程里学过的公式就好了。

dilation = 1

这个参数决定了是否采用空洞卷积,默认为1(不采用)。从中文上来讲,这个参数的意义从卷积核上的一个参数到另一个参数需要走过的距离,那当然默认是1了,毕竟不可能两个不同的参数占同一个地方吧(为0)。
更形象和直观的图示可以观察Github上的Dilated convolution animations,展示了dilation=2的情况。

groups = 1

决定了是否采用分组卷积,groups参数可以参考groups参数详解

bias = True

即是否要添加偏置参数作为可学习参数的一个,默认为True

padding_mode = ‘zeros’

padding的模式,默认采用零填充。

1. ResNet解决了什么问题

Resnet网络是为了解决深度网络中的退化问题,即网络层数越深时,在数据集上表现的性能却越差,如下图所示是论文中给出的深度网络退化现象。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-pZmwWGn6-1595851424216)(C:\Users\Administrator\Desktop\Resnet\2.png)]

从图中我们可以看到,作者在CIFAR-10数据集上测试了20层和56层的深度网络,结果就是56层的训练误差和测试误差反而比层数少的20层网络更大,这就是ResNet网络要解决的深度网络退化问题。

而采用ResNet网络之后,可以解决这种退化问题,如下图所示。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-9bS28Pxu-1595851424220)(C:\Users\Administrator\Desktop\Resnet\3.png)]

从图中作者在ImageNet数据集上的训练结果可以看出,在没有采用ResNet结构之前,如左图所示,34层网络plain-34的性能误差要大于18层网络plain-18的性能误差。而采用ResNet网络结构的34层网络结构ResNet-34性能误差小于18层网络ResNet-18。因此,采用ResNet网络结构的网络层数越深,则性能越佳。

2. ResNet原理及结构

接下来介绍ResNet网络原理及结构。
假设我们想要网络块学习到的映射为 H ( x ) H(x) H(x),而直接学习 H ( x ) H(x) H(x)是很难学习到的。若我们学习另一个残差函数 F ( x ) = H ( x ) − x F(x) = H(x) - x F(x)=H(x)x可以很容易学习,因为此时网络块的训练目标是将 F ( x ) F(x) F(x)逼近于0,而不是某一特定映射。因此,最后的映射 H ( x ) H(x) H(x)就是将 F ( x ) F(x) F(x) x x x相加, H ( x ) = F ( x ) + x H(x) = F(x) + x H(x)=F(x)+x,如图所示。
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-PqOoPy4D-1595851424223)(C:\Users\Administrator\Desktop\Resnet\4.png)]

因此,这个网络块的输出 y y y

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-nnwbbO7F-1595851424225)(C:\Users\Administrator\Desktop\Resnet\5.png)]

由于相加必须保证 x x x F ( ) F() F()是同维度的,因此可以写成通式如下式, W s Ws Ws用于匹配维度。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-u92V9tUx-1595851424226)(C:\Users\Administrator\Desktop\Resnet\6.png)]

文中提到两种维度匹配的方式(A)用zero-padding增加维度; (B)用1x1卷积增加维度。

下面给出论文中两种基础块结构,BasicBlock结构用于ResNet34及以下的网络,BotteNeck结构用于ResNet50及以上的网络。理解了这两个基础块,ResNet就是这些基础块的叠加了。

2.1 BasicBlock结构

BasicBlock结构图如图所示:

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-K7OrudiF-1595851424230)(C:\Users\Administrator\Desktop\Resnet\7.png)]

网络结构如图,两个3x3的卷积层,通道数都是64,然后就是注意那根跳线,也就是Shortcut Connections,将输入x加到输出。

2.2 BottleNeck结构

BottleNeck结构图如图所示,

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-JbMevQsR-1595851424232)(C:\Users\Administrator\Desktop\Resnet\8.png)]

网络结构如图,先是一个1x1的卷积层,然后一个3x3的卷积层,然后又是一个1x1的卷积层。注意的是这里的通道数是变化的,1x1卷积层的作用就是用于改变特征图的通数,使得可以和恒等映射x相叠加,另外这里的1x1卷积层改变维度的很重要的一点是可以降低网络参数量,这也是为什么更深层的网络采用BottleNeck而不是BasicBlock的原因

2.3 ResNet结构

了解了上述BasicBlock基础块和BotteNeck结构后,ResNet结构就直接叠加搭建了。5种不同层数的ResNet结构图如图所示,

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-yDZdjc2t-1595851424234)(C:\Users\Administrator\Desktop\Resnet\1.png)]

图中的每一层其实就是我们上面提到的BasicBlock或者BotteNeck结构。这里给出ResNet-34结构图如图所示,图中的虚线连接线是表示通道数不同,需要调整通道。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-fGS1OGUa-1595851424235)(C:\Users\Administrator\Desktop\Resnet\9.png)]

3. ResNet代码详解

这部分将给出Pytorch官方给出的ResNet源码,先分别给出BasicBlockBottleNeck的代码块

3.1 BasicBlock代码块

# 定义BasicBlock
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsaple=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups !=1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")

        # 下面定义BasicBlock中的各个层
        self.conv1 = con3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        # inplace为True表示进行原地操作,一般默认为False,表示新建一个变量存储操作
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = con3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.dowansample = downsaple
        self.stride = stride

    # 定义前向传播函数将前面定义的各层连接起来
    def forward(self, x):
        identity = x  # 这是由于残差块需要保留原始输入
        
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.dowansample is not None: # 这是为了保证原始输入与卷积后的输出层叠加时维度相同
            identity = self.dowansample(x)

        out += identity
        out = self.relu(out)

        return out

3.2 BottleNeck代码块

# 下面定义Bottleneck层(Resnet50以上用到的基础块)
class Bottleneck(nn.Module):
    expansion = 4  # Bottleneck层输出通道都是输入的4倍

    def __init__(self, inplanes, planes, stride=1, downnsaple=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # 定义Bottleneck中各层
        self.conv1 = con1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = con3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = con1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplanes=True)
        self.downsaple = downnsaple
        self.stride = stride

    # 定义Bottleneck的前向传播
    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)
        out = self.relu(out)

        if self.downsaple is not None:
            identity = self.downsaple(x)

        out += identity
        out = self.relu(out)

        return out

3.3 ResNet代码

这里给出的代码没有完全列出官方源码,需要完整源码的同学见前面代码链接。

import torch
import torch.nn as nn
from .utils import load_state_dict_from_url


__all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101',
           'resnet152', 'resnext50_32x4d', 'resnext101_32x8d',
           'wide_resnet50_2', 'wide_resnet101_2']


model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}


def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3卷积带有padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)


def conv1x1(in_planes, out_planes, stride=1):
    """1x1卷积"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)


class BasicBlock(nn.Module):
    # 基本的block类
    expansion = 1  # 扩张值,用于通道数倍增

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        """
        inplanes:   输入通道数
        planes:     输出通道数
        stride:     步长
        downsample: 下采样层
        groups:     组数,当为1时不进行分组卷积
        base_width: 基本宽度
        dilation:   膨胀率
        norm_layer: 归一化层
        """
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        # 组数不等于1或则基本宽度不等于64,则报错,表明只支持组数为1,且base_width为64
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        # 膨胀只能为1,
    	if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        # Both self.conv1 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv3x3(inplanes, planes, stride)  # 3x3卷积层
        self.bn1 = norm_layer(planes) # BN层
        self.relu = nn.ReLU(inplace=True)  # Relu层
        self.conv2 = conv3x3(planes, planes)   # 3x3卷积层
        self.bn2 = norm_layer(planes)  # BN层
        self.downsample = downsample  # 下采样层
        self.stride = stride  # 步长

    def forward(self, x):
        identity = x  # 恒等值

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:  # 如果下采样层不为None,则进行下采样处理。
            identity = self.downsample(x) 

        out += identity  # 残差连接,与原来x相加
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    # Bottleneck in torchvision places the stride for downsampling at 3x3 convolution(self.conv2)
    # while original implementation places the stride at the first 1x1 convolution(self.conv1)
    # according to "Deep residual learning for image recognition"https://arxiv.org/abs/1512.03385.
    # This variant is also known as ResNet V1.5 and improves accuracy according to
    # https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch.

    expansion = 4  # 扩张值,用于通道数倍增

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None):
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups   # 重新计算输出层
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)  # 1x1卷积层
        self.bn1 = norm_layer(width)  # BN层
        self.conv2 = conv3x3(width, width, stride, groups, dilation)  # 3x3卷积层
        self.bn2 = norm_layer(width)  # BN层
        self.conv3 = conv1x1(width, planes * self.expansion)  # 1x1卷积层
        self.bn3 = norm_layer(planes * self.expansion)  # BN层
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample  # 下采样层
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out


class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=1000, zero_init_residual=False,
                 groups=1, width_per_group=64, replace_stride_with_dilation=None,
                 norm_layer=None):
        # block基本模块类BasicBlock(18,34)或者Bottleneck(50,101,152),
        # layes=[2, 2, 2, 2]是每个layer的重复次数, num_classes类别数
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d  # norm_layer未指定,默认为nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.inplanes = 64  # 输入通道数
        self.dilation = 1   # 不采用空洞卷积,即膨胀率为1
        if replace_stride_with_dilation is None: # 替换步长用膨胀率,如果为None,设置默认值为[False, False, False]
            # each element in the tuple indicates if we should replace
            # tuple中的每个元素都表示是否应该替换
            # the 2x2 stride with a dilated convolution instead
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:  # 检查是否为None或者长度为3
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups  # 组数为1
        self.base_width = width_per_group   # 每个组为64
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
                               bias=False)   # 第一个卷积层,(3,64,7,2,3)
        self.bn1 = norm_layer(self.inplanes) # nn.BatchNorm2d层
        self.relu = nn.ReLU(inplace=True)    # relu层
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)  # 最大池化层
        self.layer1 = self._make_layer(block, 64, layers[0])  # 输出层数64,该层重复2次
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2,  # 输出层数128,该层重复2次,步长为2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2,  # 输出层数256,该层重复2次
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2,  # 输出层数512,该层重复2次
                                       dilate=replace_stride_with_dilation[2])
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))   # 自适应平均池化层,输出大小为(1,1)
        self.fc = nn.Linear(512 * block.expansion, num_classes)  # fc层(expansion为1或4)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        # Zero-initialize the last BN in each residual branch,
        # so that the residual branch starts with zeros, and each residual block behaves like an identity.
        # This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        # 函数BasicBlock类,输出层128,该层重复次数2,步长1,是否使用膨胀参数替代步长
        norm_layer = self._norm_layer  # nn.BatchNorm2d层
        downsample = None  # 下采样层初始化
        previous_dilation = self.dilation  # 先前的膨胀率
        if dilate:  # 用膨胀,更新膨胀率
            self.dilation *= stride # 膨胀率= 1*步长
            stride = 1  # 步长为1
        # 步长不为1,或self.inplances=64 不等于输出层数乘以基本类的扩张率1 ,则给下采样层赋值
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )  # 1x1的卷积层作为下采样层

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                            self.base_width, previous_dilation, norm_layer))  
        self.inplanes = planes * block.expansion  # 更新self.inplanes 值
        for _ in range(1, blocks):  # 重复次数2的for循环
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width, dilation=self.dilation,
                                norm_layer=norm_layer))

        return nn.Sequential(*layers)

    def _forward_impl(self, x):
        # See note [TorchScript super()]
        # torch.Size([1, 3, 224, 224])
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)  # torch.Size([1, 64, 112, 112])
        x = self.maxpool(x)  # torch.Size([1, 64, 56, 56])

        x = self.layer1(x)  # torch.Size([1, 64, 56, 56])
        x = self.layer2(x)  # torch.Size([1, 128, 28, 28])
        x = self.layer3(x)  # torch.Size([1, 128, 14, 14])
        x = self.layer4(x)  # torch.Size([1, 512, 7, 7])

        x = self.avgpool(x)  # torch.Size([1, 512, 1, 1])
        x = torch.flatten(x, 1)  # torch.Size([1, 512])
        x = self.fc(x)  # torch.Size([1, 1000])

        return x

    def forward(self, x):
        return self._forward_impl(x)


def _resnet(arch, block, layers, pretrained, progress, **kwargs):
    model = ResNet(block, layers, **kwargs)  # 模型定义
    if pretrained:  # # 加载预训练模型
        state_dict = load_state_dict_from_url(model_urls[arch],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


def resnet18(pretrained=False, progress=True, **kwargs):
    r"""ResNet-18 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    # 预训练模型下载链接,block模块函数,layers是每个layer的重复次数,pretrained是否使用预训练模型
    return _resnet('resnet18', BasicBlock, [2, 2, 2, 2], pretrained, progress,
                   **kwargs)


def resnet34(pretrained=False, progress=True, **kwargs):
    r"""ResNet-34 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet34', BasicBlock, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def resnet50(pretrained=False, progress=True, **kwargs):
    r"""ResNet-50 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet50', Bottleneck, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def resnet101(pretrained=False, progress=True, **kwargs):
    r"""ResNet-101 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet101', Bottleneck, [3, 4, 23, 3], pretrained, progress,
                   **kwargs)


def resnet152(pretrained=False, progress=True, **kwargs):
    r"""ResNet-152 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet152', Bottleneck, [3, 8, 36, 3], pretrained, progress,
                   **kwargs)


def resnext50_32x4d(pretrained=False, progress=True, **kwargs):
    r"""ResNeXt-50 32x4d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/pdf/1611.05431.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 4
    return _resnet('resnext50_32x4d', Bottleneck, [3, 4, 6, 3],
                   pretrained, progress, **kwargs)


def resnext101_32x8d(pretrained=False, progress=True, **kwargs):
    r"""ResNeXt-101 32x8d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/pdf/1611.05431.pdf>`_
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 8
    return _resnet('resnext101_32x8d', Bottleneck, [3, 4, 23, 3],
                   pretrained, progress, **kwargs)


def wide_resnet50_2(pretrained=False, progress=True, **kwargs):
    r"""Wide ResNet-50-2 model from
    `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_
    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _resnet('wide_resnet50_2', Bottleneck, [3, 4, 6, 3],
                   pretrained, progress, **kwargs)


def wide_resnet101_2(pretrained=False, progress=True, **kwargs):
    r"""Wide ResNet-101-2 model from
    `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_
    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _resnet('wide_resnet101_2', Bottleneck, [3, 4, 23, 3],
                   pretrained, progress, **kwargs)

参考

https://blog.csdn.net/qq_42079689/article/details/102642610?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param
https://blog.csdn.net/wuzhuoxi7116/article/details/106309943?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-2.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-2.channel_param

PyTorch实现ResNet50模型代码如下所示: ```python import torch import torchvision.models as models # 加载预训练的ResNet50模型 model = models.resnet50(pretrained=True) # 替换最后一层全连接层的输出类别数 num_classes = 1000 # 假设分类数为1000 model.fc = torch.nn.Linear(model.fc.in_features, num_classes) # 将模型设置为评估模式 model.eval() ``` 在这段代码中,我们首先导入了`torch`和`torchvision.models`模块。然后,我们使用`models.resnet50(pretrained=True)`加载了预训练的ResNet50模型。接下来,我们替换了模型的最后一层全连接层,将其输出类别数设置为我们需要的分类数。最后,我们将模型设置为评估模式。 请注意,这段代码中没有使用到引用\[1\]、\[2\]和\[3\]中的具体内容,因为这些内容与问题的回答无关。 #### 引用[.reference_title] - *1* [关于pytorch直接加载resnet50模型模型参数](https://blog.csdn.net/eye123456789/article/details/124948949)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] - *2* *3* [pytorch实现resnet50(训练+测试+模型转换)](https://blog.csdn.net/gm_Ergou/article/details/118419795)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

我是天才很好

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值