Building an AlexNet Network with PyTorch, Explained in Detail

Article link: https://blog.csdn.net/prague6695/article/details/114267068

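This article walks through building an AlexNet network with PyTorch, covering the whole process from model construction through training to prediction. It discusses the advantages of the ReLU activation function, as well as the data preprocessing, model optimization, loss function, and validation-accuracy computation involved in training. Working through the code hands-on deepens the understanding of the model's structure.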


Contents

Preface
I. The AlexNet Network
II. Build Steps
Summary

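Preface

Since I first encountered artificial intelligence, Python, deep learning, and machine learning last year, I have mostly been watching Bilibili videos and reading books on my own, and rarely stopped to summarize. It was all input and no output, which left some concepts only shallowly understood and meant I kept forgetting earlier material as I learned new material. Since the winter break I have wanted to write a blog to record what I learn and my reflections on it. So on the first day of the second semester of my second year as a master's student, I am giving it a try, starting with what I studied most recently: building an AlexNet network with PyTorch.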

I. The AlexNet Network

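The sigmoid activation function has two drawbacks:
First, its derivative is relatively costly to compute, since it involves an exponential.
Second, when the network is deep, gradients tend to vanish, because the sigmoid derivative is at most 0.25 and these factors multiply across layers.
The ReLU activation function addresses both problems: its derivative is simply 0 or 1.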
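
The two points above can be checked with a small numeric sketch in plain Python. It assumes the simplest possible picture, in which the gradient reaching the first layer is just the product of one activation derivative per layer (weights are ignored), so it illustrates the trend rather than a real network. The function names and the depth of 10 are chosen for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); needs an exponential
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x: float) -> float:
    # d/dx relu(x) is 1 for positive inputs and 0 otherwise
    return 1.0 if x > 0 else 0.0


depth = 10  # assumed number of stacked activations

# Best case for sigmoid: every layer sits at x = 0, where the derivative peaks
sigmoid_chain = sigmoid_grad(0.0) ** depth
# ReLU on an active unit passes the gradient through unchanged
relu_chain = relu_grad(1.0) ** depth

print("max sigmoid derivative:", sigmoid_grad(0.0))
print("sigmoid chain over", depth, "layers:", sigmoid_chain)
print("ReLU chain over", depth, "layers:", relu_chain)
```

Even in the best case the sigmoid factor shrinks by a factor of four per layer, while an active ReLU unit leaves the gradient magnitude untouched.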

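II. Build Steps

The focus of this walkthrough is not on reviewing the code line by line, but on the completeness of the network and the structure of each part. It is divided into three parts: the model-building stage, the training stage, and the prediction stage.

1. Model-building stage

Building the model requires the torch.nn module, and then an AlexNet class that contains __init__(), forward(), and _initialize_weights().

The code is as follows (example):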

import torch
import torch.nn as nn


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000, init_weights=False):
        super(AlexNet, self).__init__()
        # Feature extractor: five convolutional layers and three max-pooling layers
        self.features = nn.Sequential(
            nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # input[3, 224, 224]  output[48, 55, 55]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[48, 27, 27]
            nn.Conv2d(48, 128, kernel_size=5, padding=2),           # output[128, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 13, 13]
            nn.Conv2d(128, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1),          # output[128, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 6, 6]
        )
        # Classifier: three fully connected layers, with dropout before the first two
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(128 * 6 * 6, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        # Flatten everything except the batch dimension: [N, 128, 6, 6] -> [N, 4608]
        x = torch.flatten(x, start_dim=1)
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # Kaiming (He) normal initialization for conv weights, zero bias
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                # Normal(mean=0, std=0.01) for fully connected weights, zero bias
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
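
The output sizes noted in the comments of the feature extractor can be reproduced by hand. The sketch below is plain Python and applies the usual output-size formula for convolution and max-pooling layers (with dilation 1 and floor rounding), floor((size + 2 * padding - kernel) / stride) + 1, to each layer in turn. The layer names in the list are labels made up for this illustration; the kernel, stride, and padding values are copied from the model above.

```python
def out_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a conv or max-pool layer (dilation 1, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# (label, kernel, stride, padding) for each spatial layer in self.features
layers = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]

size = 224  # input images are 224 x 224
sizes = []
for label, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f"{label}: {size} x {size}")

# The last conv layer has 128 channels, so the classifier input has this many features
flat_features = 128 * size * size
print("flattened features:", flat_features)
```

The final value matches the 128 * 6 * 6 input size of the first nn.Linear layer.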

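Model building follows a standard pattern.
In __init__(), this model takes two parameters, num_classes=1000 and init_weights=False. The former sets the number of classes in the dataset; the latter determines whether the custom initialization in _initialize_weights() is applied (otherwise the layers keep PyTorch's default initialization).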
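In forward(), features are extracted first and then classified. Because the classifier consists entirely of fully connected layers while the feature extractor outputs a $128 \times 6 \times 6$ tensor per sample, the output must be flattened first, which is done with torch.flatten().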
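
A minimal sketch of the flattening step on its own, using a random tensor in place of real feature maps. The batch size of 4 is an assumption made for the example.

```python
import torch

# Stand-in for the feature extractor output: [batch, channels, height, width]
x = torch.randn(4, 128, 6, 6)

# start_dim=1 keeps the batch dimension and merges the remaining three
flat = torch.flatten(x, start_dim=1)
print(flat.shape)

# Equivalent reshape, shown for comparison
same = torch.equal(flat, x.view(4, -1))
print(same)
```

Each sample becomes a vector of 128 * 6 * 6 = 4608 values, which is the input size expected by the first fully connected layer.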
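During parameter initialization, convolutional and fully connected layers are handled differently. In convolutional layers, the weights use Kaiming (He) initialization and the biases are set to 0. In fully connected layers, the weights are drawn from a normal distribution with mean 0 and standard deviation 0.01, and the biases are set to 0.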
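
To see what these two schemes produce, the sketch below applies them to one standalone convolutional layer and one standalone fully connected layer, then compares the measured standard deviations with the expected ones. For kaiming_normal_ with mode='fan_out' and nonlinearity='relu', the expected standard deviation is sqrt(2 / fan_out), where fan_out is out_channels * kernel_height * kernel_width. The layer sizes are taken from the model above; the seed and variable names are assumptions for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumed seed, only to make the printed numbers repeatable

conv = nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2)
fc = nn.Linear(128 * 6 * 6, 2048)

# Convolutional layer: Kaiming normal weights, zero bias
nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')
nn.init.constant_(conv.bias, 0)

# Fully connected layer: Normal(mean=0, std=0.01) weights, zero bias
nn.init.normal_(fc.weight, 0, 0.01)
nn.init.constant_(fc.bias, 0)

fan_out = 48 * 11 * 11
expected_conv_std = (2.0 / fan_out) ** 0.5

conv_std = conv.weight.std().item()
fc_std = fc.weight.std().item()
print("conv weight std:", conv_std, "expected about", expected_conv_std)
print("fc weight std:", fc_std, "expected about 0.01")
print("non-zero biases:", int(torch.count_nonzero(conv.bias)) + int(torch.count_nonzero(fc.bias)))
```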

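2. Training stage

The training stage requires:
os, json, time, tqdm, and similar modules, for selecting file paths, reading and writing JSON files, timing the run, and tracking training progress
the torch.optim module, for optimizing the model
the transforms and datasets modules of torchvision, for image preprocessing and for dataset implementations of common datasets, respectively
The code is as follows (example):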

import os
import json

import time
import torch
import torch.nn as nn
from torchvision import transforms, datasets, utils