1 Introduction
In 2012, AlexNet burst onto the scene.
AlexNet used an 8-layer convolutional neural network and won the ImageNet 2012 image recognition challenge by a wide margin. It demonstrated for the first time that learned features can surpass hand-designed features, breaking with the then-prevailing state of computer vision research in one stroke.
AlexNet's design philosophy is very similar to LeNet's, but there are also notable differences:
First, in contrast to the relatively small LeNet, AlexNet consists of 8 layers of transformations: 5 convolutional layers, 2 fully connected hidden layers, and 1 fully connected output layer.
Second, AlexNet replaced the sigmoid activation function with the simpler ReLU activation function. This is because when the output of the sigmoid function is very close to 0 or 1, the gradient in those regions is almost 0, so backpropagation can no longer update some of the model parameters; the gradient of ReLU in the positive interval, by contrast, is always 1. Consequently, if the model parameters are poorly initialized, the sigmoid function may yield near-zero gradients, leaving the model unable to train effectively (a gradient comparison is sketched just after this list).
Third, AlexNet controls the model complexity of the fully connected layers with dropout, whereas LeNet does not use dropout.
Fourth, AlexNet introduced extensive image augmentation, such as flipping, cropping, and color changes, enlarging the dataset to mitigate overfitting.
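The gradient difference behind the second point is easy to verify directly. Here is a minimal sketch (an illustrative addition, not from the original post), assuming only PyTorch: for a strongly positive input, sigmoid saturates and its gradient nearly vanishes, while ReLU passes the gradient through unchanged.

import torch

# Sigmoid saturates for large inputs: its gradient sigmoid(x)*(1-sigmoid(x))
# is nearly 0 at x = 8, so almost no signal flows back through this unit.
x = torch.tensor([8.0], requires_grad=True)
torch.sigmoid(x).backward()
print(x.grad)  # approximately tensor([0.0003])

# ReLU has gradient 1 everywhere on the positive interval.
x = torch.tensor([8.0], requires_grad=True)
torch.relu(x).backward()
print(x.grad)  # tensor([1.])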
2 AlexNet Network Structure
import time
import torch
from torch import nn, optim
import torchvision
import sys
sys.path.append("..")
import d2lzh_pytorch as d2l
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 96, 11, 4),  # in_channels, out_channels, kernel_size, stride
            nn.ReLU(),
            nn.MaxPool2d(3, padding=0, stride=2),
            nn.Conv2d(96, 256, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(3, padding=0, stride=2),
            # Three consecutive convolutional layers with an even smaller window.
            # Except for the last one, the number of output channels is increased further.
            # No pooling layers follow the first two, so the height and width are preserved.
            nn.Conv2d(256, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 256, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(3, padding=0, stride=2)
        )
        self.fc = nn.Sequential(
            nn.Linear(256*5*5, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),  # dropout curbs overfitting in the huge fully connected layers
            nn.Linear(4096, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            # Output layer: 10 classes, since we train on Fashion-MNIST rather than ImageNet
            nn.Linear(4096, 10),
        )

    def forward(self, img):
        feature = self.conv(img)
        output = self.fc(feature.view(img.shape[0], -1))  # flatten before the fc layers
        return output
net = AlexNet()
print(net)
AlexNet(
(conv): Sequential(
(0): Conv2d(1, 96, kernel_size=(11, 11), stride=(4, 4))
(1): ReLU()
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(96, 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU()
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(256, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU()
(8): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU()
(10): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU()
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(fc): Sequential(
(0): Linear(in_features=6400, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU()
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=10, bias=True)
)
)
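To see where the 256*5*5 = 6400 input size of the first Linear layer comes from, we can trace a dummy single-channel 224x224 image through the convolutional block. This sanity check is our own addition, not part of the original post:

X = torch.rand(1, 1, 224, 224)
for layer in net.conv:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape:', X.shape)
# The final MaxPool2d emits shape [1, 256, 5, 5], i.e. 256*5*5 = 6400 features.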
3 Preprocessing the Data
def load_data_fashion_mnist(batch_size, resize=None, root='~/Datasets/FashionMNIST'):
    trans = []
    if resize:
        trans.append(torchvision.transforms.Resize(size=resize))
    trans.append(torchvision.transforms.ToTensor())
    transform = torchvision.transforms.Compose(trans)
    mnist_train = torchvision.datasets.FashionMNIST(root=root, train=True, download=True, transform=transform)
    mnist_test = torchvision.datasets.FashionMNIST(root=root, train=False, download=True, transform=transform)
    train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=4)
    # No need to shuffle the test set; evaluation order does not matter
    test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=4)
    return train_iter, test_iter
batch_size = 128
train_iter, test_iter = load_data_fashion_mnist(batch_size, resize=224, root='~/Datasets/FashionMNIST')
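As a quick check (an optional addition, not in the original), one batch from the resized loader should match AlexNet's expected input shape:

X, y = next(iter(train_iter))
print(X.shape, y.shape)  # expected: torch.Size([128, 1, 224, 224]) torch.Size([128])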
4 Training the Model
lr, num_epochs = 0.001, 5
optimizer = torch.optim.Adam(net.parameters(), lr=lr)
d2l.train_ch5(net, train_iter, test_iter, batch_size, optimizer, device, num_epochs)
training on cuda
epoch 1, loss 0.5860, train acc 0.778, test acc 0.865, time 360.2 sec
epoch 2, loss 0.1674, train acc 0.877, test acc 0.887, time 359.3 sec
epoch 3, loss 0.0953, train acc 0.894, test acc 0.896, time 372.7 sec
epoch 4, loss 0.0637, train acc 0.907, test acc 0.904, time 381.4 sec
epoch 5, loss 0.0464, train acc 0.914, test acc 0.901, time 380.4 sec
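The helper d2l.train_ch5 comes from the book's d2lzh_pytorch package. For readers without it, here is a minimal sketch of a training loop in the same spirit (our own simplified version: accuracy bookkeeping and timing are omitted, and the function name train is hypothetical):

def train(net, train_iter, optimizer, device, num_epochs):
    # A stripped-down loop, not the actual d2l.train_ch5 implementation
    net = net.to(device)
    loss = nn.CrossEntropyLoss()
    for epoch in range(num_epochs):
        total_loss, n = 0.0, 0
        for X, y in train_iter:
            X, y = X.to(device), y.to(device)
            l = loss(net(X), y)
            optimizer.zero_grad()
            l.backward()
            optimizer.step()
            total_loss += l.item() * y.shape[0]
            n += y.shape[0]
        print('epoch %d, loss %.4f' % (epoch + 1, total_loss / n))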
Hints:
AlexNet has a structure similar to LeNet's, but it uses more convolutional layers and a much larger parameter space to fit the large-scale ImageNet dataset.
It marks the dividing line between shallow and deep neural networks; the conceptual shift AlexNet represents, together with genuinely excellent experimental results, took the research community many years to achieve.