Reading Notes: "ImageNet Classification with Deep Convolutional Neural Networks"

to be__

Tags: pytorch

Article link: https://blog.csdn.net/csweettea/article/details/120999603

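I. Abstract

1. Achieves top-1 and top-5 error rates of 37.5% and 17.0%, respectively.

2. The network has 60 million parameters and 650,000 neurons. It consists of five convolutional layers (some followed by max-pooling layers) and three fully connected layers, ending in a 1000-way softmax.

3. To train faster, it uses non-saturating neurons and a very efficient GPU implementation of the convolution operation.

4. To reduce overfitting in the fully connected layers, it uses the regularization method dropout.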

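II. Introduction

1. Labeled image datasets have been too small; more training data and better models are needed.

2. The capacity of CNNs can be controlled by varying their depth and breadth. Compared with standard feedforward networks with similarly sized layers, CNNs have far fewer connections and parameters, so they are easier to train, while their theoretically best performance is likely to be only slightly worse. Despite the attractive qualities of CNNs and the relative efficiency of their local architecture, they are still expensive to apply at large scale to high-resolution images.

3. Contributions of this paper

(1) Trained one of the largest convolutional neural networks to date on subsets of ImageNet, and achieved by far the best results reported on these datasets.

(2) The network contains a number of new and unusual features that improve its performance and reduce its training time.

(3) To prevent overfitting, two techniques are used (data augmentation and dropout).

(4) In the architecture of five convolutional layers and three fully connected layers, the depth matters: removing any single convolutional layer results in worse performance.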

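III. ReLU Nonlinearity

Traditional neural networks model a neuron's output with tanh or sigmoid. In terms of training time with gradient descent, these saturating nonlinearities are much slower than a non-saturating nonlinearity such as ReLU, f(x) = max(0, x). The output of ReLU is not confined to a bounded interval; the network later applies LRN (local response normalization) for normalization.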
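
To make the saturation argument concrete, here is a minimal sketch in plain Python (no PyTorch required) that compares the derivative of tanh with that of ReLU at a few inputs. The helper names `relu`, `relu_grad` and `tanh_grad` are illustrative and do not come from the paper.

```python
import math

def relu(x: float) -> float:
    """ReLU: f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2, which shrinks toward 0 as |x| grows."""
    return 1.0 - math.tanh(x) ** 2

# For large inputs the tanh gradient nearly vanishes, while the ReLU gradient stays at 1.
for x in (0.5, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  tanh'={tanh_grad(x):.2e}  relu'={relu_grad(x):.1f}")
```

A vanishing gradient means tiny weight updates, which is one way to read the claim that saturating units train more slowly.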

IV. LRN

...

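V. Overlapping Pooling

Traditional pooling is non-overlapping: the pooling window size z equals the stride s. For example, with z = s = 2 and a 10×10 input, the formula output_size = (input_size - kernel_size + 2*padding)/stride + 1 gives (10 - 2 + 0)/2 + 1 = 5.

Overlapping pooling uses z > s. The paper reports that overlapping pooling (s = 2, z = 3) lowers the top-1 and top-5 error rates by 0.4% and 0.3%, respectively, compared with the non-overlapping scheme s = 2, z = 2, and that models trained with overlapping pooling are slightly harder to overfit.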
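
The difference between the two schemes can be sketched in a few lines of plain Python. The function names `pool_output_size` and `pool_windows` are made up for this illustration; the formula is the one quoted above, with floor division.

```python
def pool_output_size(input_size: int, z: int, s: int, padding: int = 0) -> int:
    """Output width of a pooling layer with window z and stride s (floor division)."""
    return (input_size - z + 2 * padding) // s + 1

def pool_windows(input_size: int, z: int, s: int) -> list:
    """Index ranges covered by each pooling window along one dimension."""
    n = pool_output_size(input_size, z, s)
    return [range(i * s, i * s + z) for i in range(n)]

# Non-overlapping: z = s = 2 on a width-10 input gives 5 disjoint windows.
print(pool_output_size(10, z=2, s=2))  # 5
# Overlapping: z = 3, s = 2; neighbouring windows share one index.
print([list(w) for w in pool_windows(7, z=3, s=2)])  # [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
```

With z = 3 and s = 2, a 55-wide feature map pools down to 27, and a 13-wide one to 6, which matches the sizes annotated in the network code of these notes.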

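VI. Reducing Overfitting

Two techniques are described: (1) Data augmentation, in two forms. The first generates image translations and horizontal reflections; the second alters the intensities of the RGB channels in the training images.

(2) Dropout: the output of each hidden neuron is set to zero with probability 0.5. Dropped neurons contribute nothing to the forward pass and do not take part in backpropagation, yet all the sampled architectures share the same weights, so each training pass effectively uses a different network.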
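
Both ideas can be sketched in plain Python on toy data. This is an illustration under stated assumptions, not the authors' implementation: the image is an 8×8 single-channel list rather than a 256×256 RGB image, the function names are hypothetical, the RGB intensity perturbation is not shown, and `dropout` uses the "inverted" variant (rescaling survivors during training, as `torch.nn.Dropout` does), whereas the paper instead multiplies outputs by 0.5 at test time.

```python
import random

def random_crop_and_flip(image, crop, rng):
    """Take a random crop x crop patch from a 2-D list and mirror it half the time."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop)
    left = rng.randint(0, width - crop)
    patch = [row[left:left + crop] for row in image[top:top + crop]]
    if rng.random() < 0.5:
        patch = [row[::-1] for row in patch]  # horizontal reflection
    return patch

def dropout(values, p, training, rng):
    """Inverted dropout: zero each value with probability p and rescale survivors."""
    if not training or p == 0.0:
        return list(values)  # at evaluation time every unit is kept unchanged
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]

rng = random.Random(0)
image = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8x8 "image"
patch = random_crop_and_flip(image, crop=6, rng=rng)
print(len(patch), len(patch[0]))
print(dropout([1.0] * 10, p=0.5, training=True, rng=rng))
```

Each call draws a different patch and a different dropout mask, which is what makes every training pass see slightly different data and a slightly different sub-network.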

VII. Network Architecture

```python
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv1 = nn.Sequential(
            # Input size: 227*227*3; output_size=(input_size-kernel_size+2*padding)/stride+1=55 (rounded down)
            # 96 kernels in total, so the feature map is 55*55*96
            nn.Conv2d(in_channels=3,out_channels=96,kernel_size=11,stride=4,padding=0),
            nn.ReLU(True),
            # output_size = (input_size-kernel_size+2*padding)/stride+1=27
            # Output feature map: 27*27*96
            nn.MaxPool2d(kernel_size=3,stride=2,padding=0)
        )
        self.conv2 = nn.Sequential(
            # Input size: 27*27*96; output_size = (input_size-kernel_size+2*padding)/stride+1=27
            # 256 kernels in total, so the feature map is 27*27*256
            # in_channels is the channel count of the previous layer (96), not its spatial size
            nn.Conv2d(in_channels=96,out_channels=256,kernel_size=5,stride=1,padding=2),
            nn.ReLU(True),
            # output_size = (input_size-kernel_size+2*padding)/stride+1=13
            # Output feature map: 13*13*256
            nn.MaxPool2d(kernel_size=3,stride=2,padding=0)
        )
        self.conv3 = nn.Sequential(
            # Input size: 13*13*256; output_size = (input_size-kernel_size+2*padding)/stride+1=13
            # 384 kernels in total, so the feature map is 13*13*384
            nn.Conv2d(in_channels=256,out_channels=384,kernel_size=3,stride=1,padding=1),
            nn.ReLU(True)
        )
        self.conv4 = nn.Sequential(
            # Input size: 13*13*384; output_size = (input_size-kernel_size+2*padding)/stride+1=13
            # 384 kernels in total, so the feature map is 13*13*384
            nn.Conv2d(in_channels=384,out_channels=384,kernel_size=3,stride=1,padding=1),
            nn.ReLU(True)
        )
        self.conv5 = nn.Sequential(
            # Input size: 13*13*384; output_size = (input_size-kernel_size+2*padding)/stride+1=13
            # 256 kernels in total, so the feature map is 13*13*256
            nn.Conv2d(in_channels=384,out_channels=256,kernel_size=3,stride=1,padding=1),
            nn.ReLU(True),
            # output_size = (input_size-kernel_size+2*padding)/stride+1=6
            # Output feature map: 6*6*256
            nn.MaxPool2d(kernel_size=3,stride=2,padding=0)
        )
        self.dense = nn.Sequential(
            nn.Linear(6*6*256,4096),
            nn.ReLU(True),
            # nn.Dropout (element-wise) suits flattened features; Dropout2d drops whole channels
            nn.Dropout(0.5),
            nn.Linear(4096,4096),
            nn.ReLU(True),
            nn.Dropout(0.5),
            # Number of output classes
            nn.Linear(4096,1000)
        )
    def forward(self,x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        # Flatten to (batch, 6*6*256) before the fully connected layers
        res = x.view(x.size(0),-1)
        out = self.dense(res)
        return out
```
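
As a sanity check on the sizes annotated above, the following plain-Python sketch traces the spatial size through each layer and counts the learnable parameters. It assumes the single-tower layout used in these notes (no two-GPU split, no LRN); the names `conv_out`, `CONV_LAYERS`, `FC_SIZES` and `trace` are illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor division)."""
    return (size - kernel + 2 * padding) // stride + 1

# (name, out_channels, kernel, stride, padding, followed_by_3x3_stride2_pool)
CONV_LAYERS = [
    ("conv1", 96, 11, 4, 0, True),
    ("conv2", 256, 5, 1, 2, True),
    ("conv3", 384, 3, 1, 1, False),
    ("conv4", 384, 3, 1, 1, False),
    ("conv5", 256, 3, 1, 1, True),
]
FC_SIZES = [4096, 4096, 1000]

def trace(input_size=227, in_channels=3):
    """Return (final spatial size, flattened feature count, total parameter count)."""
    size, channels, params = input_size, in_channels, 0
    for name, out_ch, k, s, p, pooled in CONV_LAYERS:
        params += channels * out_ch * k * k + out_ch  # weights + biases
        size, channels = conv_out(size, k, s, p), out_ch
        if pooled:
            size = conv_out(size, 3, 2)  # 3x3 max pool with stride 2
        print(f"{name}: {size}x{size}x{channels}")
    features = size * size * channels
    fan_in = features
    for fan_out in FC_SIZES:
        params += fan_in * fan_out + fan_out  # weights + biases
        fan_in = fan_out
    return size, features, params

size, features, params = trace()
print(size, features, f"{params:,}")  # 6 9216 62,378,344
```

The total of about 62.4 million is somewhat above the 60 million quoted in the abstract; a plausible explanation is that the paper splits some layers across two GPUs, which removes part of the connections, while the code in these notes connects every layer fully.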
