经典卷积神经网络--AlexNet分析与pytorch代码

亿是守候 & 亿是承诺

已于 2022-04-22 21:52:23 修改

阅读量3.1k

点赞数 1

分类专栏： python pytorch 深度学习文章标签： python 计算机视觉深度学习

于 2022-04-22 21:43:57 首次发布

本文链接：https://blog.csdn.net/weixin_48678602/article/details/124351246

版权

深度学习同时被 3 个专栏收录

9 篇文章 2 订阅

订阅专栏

python

8 篇文章 4 订阅

订阅专栏

pytorch

4 篇文章 0 订阅

订阅专栏

2012年AlexNet卷积神经网络结构被提出，并且以高出第二名10%的准确率获得2012届ImageNet图像识别大赛中获得冠军，使得CNN成为了图像分类核心算法模型。
AlexNet网络特点
1：AlexNet一共有八层，五个卷积层和三个全连接层。由于是对ImageNet数据集进行分类，所以最后一层的输出会接上softmax，一共1000个输出(ImageNet一共有1000个类别),softmax会产生1000类标签。
2：在AlexNet中，使用RuLu函数来增加模型的非线性能力。
3：使用了Dropout训练期间选择性的暂时忽略一些神经元，来减小模型的过拟合。（Dropout可以减小模型的过拟合）
4：局部响应归一化层(LRN)：提高精度,LRN现在好像用得不多了，一般使用BN。
5：双GPU并行运行，提高训练速度。

在pytorch框架中可以直接调用AlexNet模型，代码如下：

import torchvision.models as models
alexnet = models.alexnet()
print(alexnet)

输入如下

AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace=True)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5, inplace=False)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
    (2): ReLU(inplace=True)
    (3): Dropout(p=0.5, inplace=False)
    (4): Linear(in_features=4096, out_features=4096, bias=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)

进程已结束,退出代码0

AlexNet网络模型架构如下所示

在这里插入图片描述
仔细的可以还小呢，pytorch中的AlexNet网络参数与原文不一样，这里就以论文中分析
提示:一般认为卷积池化算一层

第一层：
卷积层
torch.nn.Conv2d(3，96，11，4，0)，第一个参数是in_channel输入通道数feature map，第二个参数是out_channel输出通道数feature map，第三个是卷积核大小kernel，第四个是步长stride，第五个是填充大小padding
其中：输入图像是2272273，一共是96个11113的卷积核，生成96个feature map,步长为4，经过公式（n-f+2p）/s可得,其中，n是输入图像的大小，f为卷积核大小，p为padding大的大小，s为步长且向下取整，得到（227-11+0）/4+1=55，所以Feature Map为55* 55* 48。
ReLU层
torch.nn.ReLU(inplace=True)然后又将输出的Feature Map经过ReLU层，增加网络的非线性因素.
池化层+局部响应归一化
torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False),经过33步长为2的池化层，使用33(kernel_size)的步长为2(stride)的池化单元，（这里是重叠池化，步长小于池化单元的大小），由（n-f+2p）/s+1(其中，n为输入图像的大小，f为卷积核大小，p为padding的大小，s为步长),每组得到的Feature Map为27×27×48。

第二层：
卷积层
torch.nn.Conv2d(96,256,5,1,2)，输入两组272748的Feature Map，这一层一共有256个553的卷积核，生成256个feature map，步长为1。padding为2。得到的Feature Map为2727128.
Relu层：
torch.nn.ReLU()然后又将输出的Feature Map经过ReLU层，增加网络的非线性因素
池化层+局部响应归一化
torch.nn.MaxPool2d(3,2),经过33步长为2的池化单元，输入两组1313*128的Feature Map。

第三层
卷积层
torch.nn.Conv2d(256，384，3，1，1）输入是两组1313128的Feature Map，使用333的卷积核生成生成384个Feature Map，步长为1，padding为1，所以得到每组的Feature Map为1313384
ReLU层
torch.nn.ReLU()，然后又将卷积层输出的Feature Map经过ReLU层，输出两组为1313192

第四层
卷积层
torch.nn.Conv2d(384,384,3,1,1)，输入是两组1313192的Feature Map，使用333的卷积核，生成384个Feature Map，步长为1，padding为1，所以得到最后的两组Feature Map为1313192
ReLU层
torch.nn.ReLU()然后又将卷积层输出的Feature Map经过RELU层。两组输出为1313192

第五层
卷积层
torch.nn.Conv2d(384,384,5,1,1),输入是两组1313192的Feature Map，使用553的卷积核，生成256个Feature Map，步长为1，padding为2，得到的输出Feature Map为1313128
ReLU层
torch.nn.ReLU()然后又将输出的Feature Map经过ReLU层
池化层
torch.nn.MaxPool2d(3,2)，使用33的步长为2的池化单元，得到两组输出为66*128的Feature Map

第六层
全连接层
卷积–>全连接层：torch.nn.Linear(9216,4096),输入为66256，使用4096个66256卷积核。每个卷积核的尺寸与待处理特征图的尺寸完全相同，即卷积核的每个系数只与特征图尺寸的一个像素相乘一一对应。
ReLU层
将这4096个运算结果通过激活函数生成4096个值
Dropout
torch.nn.Dropout(0.5)随机的断开50%的神经元，从而防止过拟合现象。

第七层
全连接层
全连接层—>全连接层：torch.nn.Linear(4096,4096)
ReLU
这4096个运算结果通过ReLU激活函数生成4096个值
Dropout
torch.nn.Dropout(0.5)随机断开50%的神经元的连接，从而防止过拟合现象

第八层
输出层
第七层输出的4096个数据与第八层的1000个神经元进行全连接，输出一个元素个数为1000的一维向量。

最后pytorch的实现代码为：

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net,self).__init__()
        self.conv1 = nn.Conv2d(3,96,11,4,0)
        self.pool = nn.MaxPool2d(3,2)
        self.conv2 = nn.Conv2d(96,256,5,1,2)
        self.conv3 = nn.Conv2d(256,384,3,1,1)
        self.conv4 = nn.Conv2d(384,384,3,1,1)
        self.conv5 = nn.Conv2d(384,256,5,1,1)
        self.drop = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216,4096)
        self.fc2 = nn.Linear(4096,4096)
        self.fc3 = nn.Linear(4096,1000)

    def forward(self,x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.pool(F.relu(self.conv5(x)))
        x = self.drop(F.relu(self.fc1(x)))
        x = self.drop(F.relu(self.fc2(x)))
        x = self.fc3(x)
        return x

    def num_flat_feature(self,x):
        size = x.size()[1:]
        num_feature = 1
        for s in size:
            num_feature *= 5
        return num_feature

net = Net()
print(net)

输出AlexNet网络模型

Net(
  (conv1): Conv2d(3, 96, kernel_size=(11, 11), stride=(4, 4))
  (pool): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(96, 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv3): Conv2d(256, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv4): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv5): Conv2d(384, 256, kernel_size=(5, 5), stride=(1, 1), padding=(1, 1))
  (drop): Dropout(p=0.5, inplace=False)
  (fc1): Linear(in_features=9216, out_features=4096, bias=True)
  (fc2): Linear(in_features=4096, out_features=4096, bias=True)
  (fc3): Linear(in_features=4096, out_features=1000, bias=True)
)

或者这个，其实差别就在用Sequential和nn.relu与F.relu的区别。
其中nn.ReLU作为一个层结构，必须添加到nn.Module容器中才能使用，而F.ReLU则作为一个函数调用，看上去作为一个函数调用更方便更简洁。具体使用哪种方式，取决于编程风格。

在PyTorch中,nn.X都有对应的函数版本F.X，但是并不是所有的F.X均可以用于forward或其它代码段中，因为当网络模型训练完毕时，在存储model时，在forward中的F.X函数中的参数是无法保存的。

也就是说，在forward中，使用的F.X函数一般均没有状态参数，比如F.ReLU，F.avg_pool2d等，均没有参数，它们可以用在任何代码片段中。

#model.py

import torch.nn as nn
import torch


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000, init_weights=False):   
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(  #打包
            nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # input[3, 224, 224]  output[48, 55, 55] 自动舍去小数点后
            nn.ReLU(inplace=True), #inplace 可以载入更大模型
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[48, 27, 27] kernel_num为原论文一半
            nn.Conv2d(48, 128, kernel_size=5, padding=2),           # output[128, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 13, 13]
            nn.Conv2d(128, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1),          # output[128, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 6, 6]
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            #全链接
            nn.Linear(128 * 6 * 6, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1) #展平   或者view()
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') #何教授方法
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)  #正态分布赋值
                nn.init.constant_(m.bias, 0)

亿是守候 & 亿是承诺

关注

1
点赞
踩
12

收藏

觉得还不错? 一键收藏
3
评论
经典卷积神经网络--AlexNet分析与pytorch代码

2012年AlexNet卷积神经网络结构被提出，并且以高出第二名10%的准确率获得2012届ImageNet图像识别大赛中获得冠军，使得CNN成为了图像分类核心算法模型。AlexNet网络特点1：AlexNet一共有八层，五个卷积层和三个全连接层。由于是对ImageNet数据集进行分类，所以最后一层的输出会接上softmax，一共1000个输出(ImageNet一共有1000个类别),softmax会产生1000类标签。2：在AlexNet中，使用RuLu函数来增加模型的非线性能力。3：使用了Dro
复制链接

扫一扫

专栏目录