Image Classification | ILSVRC Winning Networks Through the Years: From AlexNet to SENet

Article link: https://blog.csdn.net/woshicver/article/details/105140874

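This article reviews LeNet and the networks that won or placed in ILSVRC over the years: AlexNet, ZFNet, VGG, GoogLeNet, ResNet, ResNeXt and SENet. It covers their core ideas, architectures, and ImageNet results, tracing how deep convolutional networks developed.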


Preface

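Deep convolutional networks have greatly advanced every area of deep learning, and ILSVRC, the most influential competition in the field, deserves much of the credit for prompting many classic works. I went through the winning and runner-up networks of each year's ILSVRC classification task and briefly introduce their core ideas, network architectures, and implementations.
Most of the code comes from: https://github.com/weiaicunzai/pytorch-cifar100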
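ImageNet and ILSVRC
- ImageNet is a dataset of more than 15 million images in roughly 22,000 categories.
- ILSVRC stands for ImageNet Large-Scale Visual Recognition Challenge. It ran from 2010 until its final edition in 2017 and uses a subset of ImageNet with 1000 classes in total.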
Results by year
All figures are error rates, so lower is better.

Year	Network/Team	val top-1	val top-5	test top-5	Notes
2012	AlexNet	38.1%	16.4%	16.42%	5 CNNs
2012	AlexNet	36.7%	15.4%	15.32%	7 CNNs; used 2011 data
2013	OverFeat			14.18%	7 fast models
2013	OverFeat			13.6%	Post-competition; 7 big models
2013	ZFNet			13.51%	The ZFNet paper reports 14.8%
2013	Clarifai			11.74%
2013	Clarifai			11.20%	Used 2011 data
2014	VGG			7.32%	7 nets, dense eval
2014	VGG (runner-up)	23.7%	6.8%	6.8%	Post-competition; 2 nets
2014	GoogLeNet v1			6.67%	7 nets, 144 crops
	GoogLeNet v2	20.1%	4.9%	4.82%	Post-competition; 6 nets, 144 crops
	GoogLeNet v3	17.2%	3.58%		Post-competition; 4 nets, 144 crops
	GoogLeNet v4	16.5%	3.1%	3.08%	Post-competition; v4 + Inception-ResNet-v2
2015	ResNet			3.57%	6 models
2016	Trimps-Soushen			2.99%	Third Research Institute of the Ministry of Public Security
2016	ResNeXt (runner-up)			3.03%	UC San Diego
2017	SENet			2.25%	Momenta and the University of Oxford

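Evaluation metrics
Top-1 takes the class with the highest value in the probability vector as the prediction, and the sample counts as correct only if that class is the true label. Top-5 counts the sample as correct as long as the true label is among the five classes with the highest probabilities.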
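
As a minimal sketch of the two criteria above, the function below checks whether the true label is among the k highest-scoring classes and averages the misses into an error rate. The names `topk_error`, `scores` and `labels` are illustrative and do not come from the referenced repository, and the scores are made up.

```python
def topk_error(scores, labels, k):
    """Fraction of samples whose true label is NOT among the k highest scores.

    scores: one list of per-class scores per sample (made-up data here).
    labels: the true class index of each sample.
    """
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Three samples over six classes.
scores = [
    [0.50, 0.20, 0.12, 0.10, 0.05, 0.03],  # true class 0 is ranked first
    [0.30, 0.25, 0.20, 0.15, 0.06, 0.04],  # true class 2 is ranked third
    [0.40, 0.30, 0.15, 0.10, 0.04, 0.01],  # true class 5 is ranked last
]
labels = [0, 2, 5]

top1 = topk_error(scores, labels, k=1)
top5 = topk_error(scores, labels, k=5)
print(top1, top5)
```

Only the first sample is correct under top-1, while the second also becomes correct under top-5, so the top-5 error is never higher than the top-1 error.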

LeNet

Gradient-Based Learning Applied to Document Recognition

Network architecture

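import torch.nn as nn
import torch.nn.functional as func


class LeNet(nn.Module):
    """LeNet-style network; shape comments assume 1x28x28 inputs (MNIST-sized)."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)   # 1x28x28 -> 6x24x24
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)  # 6x12x12 -> 16x8x8
        # After the second pooling the feature map is 16x4x4 = 256 values.
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)                  # 10 output classes

    def forward(self, x):
        x = func.relu(self.conv1(x))
        x = func.max_pool2d(x, 2)   # 6x24x24 -> 6x12x12
        x = func.relu(self.conv2(x))
        x = func.max_pool2d(x, 2)   # 16x8x8 -> 16x4x4
        x = x.view(x.size(0), -1)   # flatten to (batch, 256)
        x = func.relu(self.fc1(x))
        x = func.relu(self.fc2(x))
        x = self.fc3(x)             # raw class scores (logits)
        return x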
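
The input size of `fc1` follows from how each layer changes the spatial size. The sketch below traces that arithmetic with a hypothetical helper `conv_out`, using the standard output-size formula floor((size + 2 * padding - kernel) / stride) + 1. It assumes a square 28x28 input, which the code above does not state explicitly.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet_flat_features(input_size):
    """Number of values entering fc1 for a square single-channel input."""
    s = conv_out(input_size, kernel=5)   # conv1: 5x5, no padding
    s = conv_out(s, kernel=2, stride=2)  # 2x2 max pooling
    s = conv_out(s, kernel=5)            # conv2: 5x5, no padding
    s = conv_out(s, kernel=2, stride=2)  # 2x2 max pooling
    return 16 * s * s                    # conv2 has 16 output channels


print(lenet_flat_features(28))  # 28 -> 24 -> 12 -> 8 -> 4, so 16 * 4 * 4
print(lenet_flat_features(32))  # 32 -> 28 -> 14 -> 10 -> 5, so 16 * 5 * 5
```

With 28x28 inputs the result is 256, matching the size passed to `fc1`. A 32x32 input, as used in the original LeNet-5 paper, would need 400 instead.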

AlexNet

ImageNet Classification with Deep Convolutional Neural Networks

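Core ideas

AlexNet improves on earlier work in the following ways:
1. ReLU activation function
2. Local response normalization (LRN)
3. Overlapping pooling
4. Dropout
5. Data augmentation
6. Multi-GPU parallel training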
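
Two of the items above can be made concrete in a few lines. The sketch below is a plain-Python illustration, not code from the referenced repository. `local_response_norm` applies the LRN formula from the AlexNet paper to the channel activations at a single spatial position, with the paper's constants (k = 2, n = 5, alpha = 1e-4, beta = 0.75) as defaults. `max_pool_1d` is a hypothetical 1-D helper that shows how a 3-wide window with stride 2 overlaps its neighbours while a 2-wide window with stride 2 does not. The input values are made up.

```python
def local_response_norm(acts, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """LRN across channels at one spatial position.

    acts[i] is the activation of channel i; each value is divided by
    (k + alpha * sum of squares over up to n neighbouring channels) ** beta.
    """
    num_channels = len(acts)
    out = []
    for i, a in enumerate(acts):
        lo = max(0, i - n // 2)
        hi = min(num_channels - 1, i + n // 2)
        sq_sum = sum(acts[j] ** 2 for j in range(lo, hi + 1))
        out.append(a / (k + alpha * sq_sum) ** beta)
    return out


def max_pool_1d(xs, kernel, stride):
    """Max pooling over a 1-D list, dropping any incomplete final window."""
    return [max(xs[i:i + kernel])
            for i in range(0, len(xs) - kernel + 1, stride)]


normalized = local_response_norm([1.0, 2.0, 3.0])
print(normalized)

xs = [1, 2, 9, 3, 4, 5, 0]
print(max_pool_1d(xs, kernel=3, stride=2))  # overlapping:     [9, 9, 5]
print(max_pool_1d(xs, kernel=2, stride=2))  # non-overlapping: [2, 9, 5]
```

In the overlapping case the value 9 falls inside two adjacent windows, which is exactly what a window larger than the stride produces.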

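Network architecture

Code implementation

The implementation below is a scaled-down variant: it expects single-channel 28x28 inputs (the classifier's 256 * 2 * 2 input size depends on this), uses non-overlapping 2x2 pooling, and omits LRN.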

import torch.nn as nn

NUM_CLASSES = 10  # number of output classes (10 is assumed here)


class AlexNet(nn.Module):
    """AlexNet-style network; shape comments assume 1x28x28 inputs."""

    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 96, kernel_size=11, padding=1),   # -> 96x20x20
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),                   # -> 96x10x10
            nn.Conv2d(96, 256, kernel_size=3, padding=1),  # -> 256x10x10
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),                   # -> 256x5x5
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),                   # -> 256x2x2
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 2 * 2, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),  # one score per class
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 256 * 2 * 2)  # flatten the 256x2x2 feature map
        x = self.classifier(x)
        return x
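
Because the classifier is built for a fixed `256 * 2 * 2` input, the network only accepts images whose final feature map is 2x2. The sketch below traces the spatial size through the feature extractor. The layer list is transcribed from `self.features` above, while the names `FEATURE_LAYERS` and `trace` are illustrative.

```python
# (name, kernel, stride, padding) for every layer that can change the size.
# MaxPool2d(kernel_size=2) defaults to stride 2 and no padding.
FEATURE_LAYERS = [
    ("conv1", 11, 1, 1),
    ("pool1", 2, 2, 0),
    ("conv2", 3, 1, 1),
    ("pool2", 2, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 2, 2, 0),
]


def trace(input_size):
    """Spatial size of a square input before and after each layer."""
    sizes = [input_size]
    for _name, kernel, stride, padding in FEATURE_LAYERS:
        sizes.append((sizes[-1] + 2 * padding - kernel) // stride + 1)
    return sizes


print(trace(28))   # [28, 20, 10, 10, 5, 5, 5, 5, 2]
print(trace(224))  # [224, 216, 108, 108, 54, 54, 54, 54, 27]
```

A 28x28 input ends at 2x2, matching `256 * 2 * 2`. A 224x224 input would end at 27x27, so the first linear layer and the `view` call would both have to change.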