Deep Convolutional Neural Networks: AlexNet

飞蓬heart

Last modified: 2022-04-18 17:25:46


First published: 2022-04-14 17:25:17

Article link: https://blog.csdn.net/weixin_42622045/article/details/124176931


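This article describes the structure and working principles of the AlexNet deep convolutional neural network, including its convolutional layers, pooling layers, local response normalization, and fully connected layers. AlexNet has 8 weight layers (5 convolutional and 3 fully connected) and ends with a softmax that outputs a classification over 1000 image classes.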


Principles

[Figure: AlexNet network architecture]

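The figure above shows a 224×224 input. However, with an 11×11 kernel and a stride of 4, (224 − 11)/4 + 1 = 54.25, which is not an integer and does not match the 55×55 feature map given in the paper. With a 227×227 input, (227 − 11)/4 + 1 = 55.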
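The size calculation above can be checked with a small helper. This is only a sketch: the function name `conv_output_size` is made up for illustration, and it uses the usual formula (W − K + 2P)/S + 1 without rounding so that the non-integer case stays visible.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Output width of a convolution or pooling window, without rounding."""
    return (input_size + 2 * padding - kernel_size) / stride + 1


# 11x11 kernel, stride 4, no padding, as in the first AlexNet layer
print(conv_output_size(224, 11, 4))  # 54.25
print(conv_output_size(227, 11, 4))  # 55.0
```

Frameworks typically round a fractional result down, which is why a 224×224 input only produces a 55×55 map when some padding is added.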

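The network contains 8 layers with weights: the first 5 are convolutional layers and the remaining 3 are fully connected layers. The output of the last fully connected layer is fed to a 1000-way softmax, which produces a distribution over the 1000 class labels.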

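Convolutional layer C1
The processing flow of this layer is: convolution -> ReLU -> pooling -> normalization. (This is the order used in this walkthrough; in the original paper, normalization is applied before pooling.)

Convolution: the input is 227×227×3. 96 kernels of size 11×11×3 are applied with a stride of 4, giving a feature map of 55×55×96.
ReLU: the feature map output by the convolution is passed through the ReLU function.
Pooling: a 3×3 pooling window with a stride of 2 is used (overlapping pooling, since the stride is smaller than the window width). The output is 27×27×96, because (55 − 3)/2 + 1 = 27.
Local response normalization: applied with k = 2, n = 5, α = 10^-4, β = 0.75. The output is still 27×27×96 and is split into two groups of 27×27×48, each processed on its own GPU.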
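To make the normalization step concrete, here is a minimal pure-Python sketch of local response normalization at a single spatial position, following the formula from the AlexNet paper: each activation is divided by (k + α · Σ a_j²)^β, where the sum runs over up to n neighbouring channels. The function name and the toy input are illustrative assumptions. Library implementations may scale α differently (PyTorch's `nn.LocalResponseNorm`, for instance, divides α by n), so their results will not match this sketch exactly.

```python
def local_response_norm(activations, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize one value per channel at a single (x, y) position."""
    num_channels = len(activations)
    half = n // 2
    normalized = []
    for i, value in enumerate(activations):
        # Neighbouring channels, clipped at the first and last channel
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        square_sum = sum(activations[j] ** 2 for j in range(lo, hi + 1))
        normalized.append(value / (k + alpha * square_sum) ** beta)
    return normalized


# Toy example: 5 channels with the same activation
print(local_response_norm([1.0, 1.0, 1.0, 1.0, 1.0]))
```

The number of channels is unchanged by this operation, which is why the output shape stays 27×27×96.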
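Convolutional layer C2
The processing flow of this layer is: convolution -> ReLU -> pooling -> normalization.

Convolution: the input is 2 groups of 27×27×48. Each of the 2 groups uses 128 kernels of size 5×5×48, with padding = 2 and a stride of 1. The output feature map is 2 groups of 27×27×128, because (27 + 2×2 − 5)/1 + 1 = 27.
ReLU: the feature map output by the convolution is passed through the ReLU function.
Pooling: the pooling window is 3×3 with a stride of 2. The size after pooling is (27 − 3)/2 + 1 = 13, so the output is 13×13×256.
Local response normalization: applied with k = 2, n = 5, α = 10^-4, β = 0.75. The output is still 13×13×256 and is split into 2 groups of 13×13×128.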
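One consequence of splitting C2 into two groups is that each kernel only sees 48 of the 96 input channels, which halves the number of weights. The following sketch counts the weights (biases ignored) for the grouped and ungrouped cases; the helper name `conv_weight_count` is an assumption made for this example.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of kernel weights in a (grouped) 2D convolution, biases excluded."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    channels_per_kernel = in_channels // groups
    return out_channels * channels_per_kernel * kernel_size * kernel_size


grouped = conv_weight_count(96, 256, 5, groups=2)  # 2 groups x 128 kernels of 5x5x48
full = conv_weight_count(96, 256, 5, groups=1)     # 256 kernels of 5x5x96
print(grouped, full)  # 307200 614400
```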
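Convolutional layer C3
The processing flow of this layer is: convolution -> ReLU.

Convolution: the input is 13×13×256. The 2 groups together use 384 kernels of size 3×3×256, with padding = 1 and a stride of 1. The output feature map is 13×13×384.
ReLU: the feature map output by the convolution is passed through the ReLU function.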
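Convolutional layer C4
The processing flow of this layer is: convolution -> ReLU. This layer is similar to C3.

Convolution: the input is 13×13×384, split into two groups of 13×13×192. Each of the 2 groups uses 192 kernels of size 3×3×192, with padding = 1 and a stride of 1. The output feature map is 13×13×384, split into two groups of 13×13×192.
ReLU: the feature map output by the convolution is passed through the ReLU function.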
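Convolutional layer C5
The processing flow of this layer is: convolution -> ReLU -> pooling.

Convolution: the input is 13×13×384, split into two groups of 13×13×192. Each of the 2 groups uses 128 kernels of size 3×3×192, with padding = 1 and a stride of 1. The output feature map is 13×13×256.
ReLU: the feature map output by the convolution is passed through the ReLU function.
Pooling: the pooling window is 3×3 with a stride of 2. The size after pooling is (13 − 3)/2 + 1 = 6, so the output is 6×6×256.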
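The spatial sizes quoted for C1 through C5 can be reproduced by applying the output-size formula layer by layer. The sketch below assumes the kernel sizes, strides and padding values stated in the text and a 227×227 input; the helper `out_size` and the layer table are written for this example only.

```python
def out_size(size, kernel, stride, padding=0):
    """Output width of a convolution or pooling step (rounded down)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for every step that can change the spatial size
layers = [
    ("C1 conv", 11, 4, 0),
    ("C1 pool", 3, 2, 0),
    ("C2 conv", 5, 1, 2),
    ("C2 pool", 3, 2, 0),
    ("C3 conv", 3, 1, 1),
    ("C4 conv", 3, 1, 1),
    ("C5 conv", 3, 1, 1),
    ("C5 pool", 3, 2, 0),
]

size = 227
trace = [size]
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    trace.append(size)
    print(f"{name}: {size}x{size}")
```

The final 6×6 map with 256 channels is what the first fully connected layer receives.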
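Fully connected layer FC6
The flow of this layer is: (convolution) fully connected -> ReLU -> Dropout.

Convolution as fully connected: the input is 6×6×256, and the layer has 4096 kernels, each of size 6×6×256. Because the kernel size exactly matches the size of the input feature map, each kernel coefficient is multiplied by exactly one value of the feature map, in one-to-one correspondence, which is why this layer is called a fully connected layer. Since the kernel and the feature map have the same size, each convolution produces a single value, so the output has size 4096×1×1, i.e. 4096 neurons.
ReLU: the 4096 results are passed through the ReLU activation function, producing 4096 values.
Dropout: suppresses overfitting by randomly disconnecting some neurons, i.e. not activating them, during training.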
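The claim that a convolution whose kernel has the same size as its input behaves like a fully connected neuron can be checked on a toy example. The sketch below uses a tiny 3×2×2 feature map with random values (both the size and the values are made up for illustration) and compares the convolution result with a plain dot product over the flattened tensors. It also counts the FC6 weights implied by the sizes in the text.

```python
import random


def full_size_conv(feature_map, kernel):
    """Convolution where the kernel covers the whole input: one output value."""
    total = 0.0
    for fm_plane, k_plane in zip(feature_map, kernel):      # channels
        for fm_row, k_row in zip(fm_plane, k_plane):         # rows
            for value, weight in zip(fm_row, k_row):         # columns
                total += value * weight
    return total


def flatten(tensor):
    return [v for plane in tensor for row in plane for v in row]


def random_tensor(channels, height, width, rng):
    return [[[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(channels)]


rng = random.Random(0)
feature_map = random_tensor(3, 2, 2, rng)
kernel = random_tensor(3, 2, 2, rng)

conv_value = full_size_conv(feature_map, kernel)
dense_value = sum(x * w for x, w in zip(flatten(feature_map), flatten(kernel)))
print(conv_value, dense_value)   # the two values agree

# Weights in FC6 (biases excluded): 4096 kernels of size 6x6x256
fc6_weights = 6 * 6 * 256 * 4096
print(fc6_weights)  # 37748736
```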
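Fully connected layer FC7
The flow of this layer is: fully connected -> ReLU -> Dropout.

Fully connected: the input is a vector of length 4096.
ReLU: the 4096 results are passed through the ReLU activation function, producing 4096 values.
Dropout: suppresses overfitting by randomly disconnecting some neurons, i.e. not activating them, during training.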
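As a rough illustration of what Dropout does, the sketch below implements the "inverted dropout" variant in plain Python: during training each value is zeroed with probability p and the survivors are scaled by 1/(1 − p), while at evaluation time the input passes through unchanged. This is the convention used by `nn.Dropout` in PyTorch; the original paper instead scales the outputs at test time. The function name and the `rng` parameter are assumptions for this example.

```python
import random


def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout on a flat list of activations."""
    if not training or p == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - p)
    # Keep each value with probability 1 - p and rescale it; otherwise output zero
    return [v * keep_scale if rng.random() >= p else 0.0 for v in values]


activations = [1.0] * 8
print(dropout(activations, p=0.5, training=True, rng=random.Random(0)))
print(dropout(activations, training=False))  # unchanged in evaluation mode
```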
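Output layer
The 4096 outputs of layer 7 are fully connected to the 1000 neurons of layer 8. After training, this layer outputs 1000 floating-point values, which form the prediction.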
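The raw outputs of the last layer are turned into a probability distribution by softmax. A minimal, numerically stable sketch is shown below; the three-value input is a made-up stand-in for the 1000 outputs.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
print(probs.index(max(probs)))  # index of the predicted class: 2
```

Softmax does not change which class has the largest score, so taking the argmax of the raw outputs gives the same predicted class.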

Image classification

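Place the train and val folders in the current directory. Inside each, create one folder per class and put the corresponding images in it. The figure below shows the folders for the ai and house classes.
[Figure: ai and house class folders under train and val]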
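With this layout, `torchvision.datasets.ImageFolder` derives the labels from the folder names, sorting the class names and numbering them from 0. The sketch below mimics that mapping with the standard library only, on a temporary directory that stands in for the real data; the helper name `find_classes` is an assumption for this example.

```python
import tempfile
from pathlib import Path


def find_classes(root):
    """Map each class folder under root to an index, in sorted order."""
    class_names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(class_names)}


with tempfile.TemporaryDirectory() as tmp:
    # Recreate the train/val layout with the two class folders
    for split in ("train", "val"):
        for class_name in ("house", "ai"):
            (Path(tmp) / split / class_name).mkdir(parents=True)
    class_to_idx = find_classes(Path(tmp) / "train")

print(class_to_idx)  # {'ai': 0, 'house': 1}
```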

axlenet.py

#coding=utf-8
import torch
from torch import nn
from torchsummary import summary
torch.manual_seed(4)

# Build the AlexNet network structure (the class is named Axlenet in this code)
class Axlenet(nn.Module):   # every network must inherit from nn.Module
    def __init__(self, num=10):
        super(Axlenet,self).__init__()
        # nn.Sequential container
        self.feature = nn.Sequential(      
                       nn.Conv2d(3, 48, 11, 4, 2, bias=False),
                       nn.ReLU(True),
                       nn.MaxPool2d(3,2),
                       nn.Conv2d(48, 128, 5, 1, 2, bias=True),
                       nn.ReLU(True),
                       nn.MaxPool2d(3,2),
                       nn.Conv2d(128, 192, 3, 1, 1, bias=True),
                       nn.ReLU(True),
                       nn.Conv2d(192, 192, 3, 1, 1, bias=True),
                       nn.ReLU(True),
                       nn.Conv2d(192, 128, 3, 1, 1, bias=True),
                       nn.ReLU(True), 
                       nn.MaxPool2d(3,2))
        self.fc = nn.Sequential(
                       nn.Dropout(0.5),
                       nn.Linear(4608,2048),
                       nn.ReLU(True),
                       nn.Dropout(0.5),
                       nn.Linear(2048,2048),
                       nn.ReLU(True),
                       nn.Linear(2048,num)) 
        self.init_weight()
        
    def init_weight(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias,0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_normal_(m.weight)
                nn.init.constant_(m.bias,0)
        
    def forward(self,x):
        x = self.feature(x)
        x = torch.flatten(x,1)  # flatten every dimension from dim 1 onward
        x = self.fc(x)
        return x
    
if __name__=="__main__":
    image = torch.ones((2,3,224,224))
    net = Axlenet()
    print(net(image))
    summary(net, (3,224,224),batch_size=1,device="cpu")
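The input size of the first Linear layer, 4608, follows from the shapes produced by the feature extractor above. The sketch below redoes that arithmetic for a 224×224 input, using the kernel, stride and padding values passed to `nn.Conv2d` and `nn.MaxPool2d` in axlenet.py. Note that this network uses half the channel counts described earlier (48, 128, 192, 192, 128), so the last feature map has 128 channels. The helper `out_size` is written for this example.

```python
def out_size(size, kernel, stride, padding=0):
    """Output width of a convolution or pooling step (rounded down)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of each Conv2d / MaxPool2d in self.feature, in order
steps = [
    (11, 4, 2), (3, 2, 0),   # conv1, pool1
    (5, 1, 2), (3, 2, 0),    # conv2, pool2
    (3, 1, 1),               # conv3
    (3, 1, 1),               # conv4
    (3, 1, 1), (3, 2, 0),    # conv5, pool3
]

size = 224
for kernel, stride, padding in steps:
    size = out_size(size, kernel, stride, padding)

last_channels = 128
flattened = last_channels * size * size
print(size, flattened)  # 6 4608
```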

train_alexnet.py

#coding=utf-8
import torch
import numpy as np
from torch import nn,optim
from axlenet import Axlenet
from matplotlib import pyplot as plt
from torchvision import datasets,transforms
from torch.utils.data import DataLoader
import pickle
torch.manual_seed(4)
train_loss = []
test_loss = []
test_acc = []
best_acc = 0.0

def train(epoch,train_data_loader,device,net,loss_functoin,optm):
    net.train()   # put the model in training mode; this affects dropout and batch norm
    # One call to train() runs one epoch. With 50000 samples and a batch size of 32,
    # that is 1562 iterations, i.e. 1562 parameter updates.
    epoch_loss = 0.0
    iter_cout = len(train_data_loader) 
    for index,(inputs,targets) in enumerate(train_data_loader):
        inputs = inputs.to(device)     # move the batch to the device
        targets = targets.to(device)
        y_predict = net(inputs)   # forward pass to get the predictions
        loss = loss_functoin(y_predict, targets)   # compute the loss
        optm.zero_grad()      # reset the gradients to zero
        loss.backward()
        optm.step()
        print(f"epoch:{epoch+1}\tbatch:{index+1}\ttraining loss:{loss.item()}")
        epoch_loss += loss.item()
    train_loss.append(epoch_loss/iter_cout)

def test(test_data_loader,device,net,loss_functoin):
    global best_acc
    net.eval()   # evaluation mode: disables dropout
    test_epoch_loss = 0.0
    predict_correct_num = 0
    test_num = 0
    with torch.no_grad():   # gradients are not needed for evaluation
        for index, (inputs,targets) in enumerate(test_data_loader):
            inputs = inputs.to(device)
            targets = targets.to(device)
            test_num += len(inputs)
            y_predict = net(inputs)  # forward pass to get the predictions
            y_predict_index = torch.argmax(y_predict,1)
            loss = loss_functoin(y_predict, targets)
            test_epoch_loss += loss.item()
            predict_correct_num += (y_predict_index == targets).sum().item()
    acc = predict_correct_num/test_num   # accuracy over all evaluated samples
    print("acc",acc,test_num)
    test_acc.append(acc)
    test_loss.append(test_epoch_loss/len(test_data_loader))
    if test_acc[-1]>best_acc:
        best_acc = test_acc[-1]
        # save the model
        torch.save(net.state_dict(),'best_model.pth')
        print("y_predict_index",y_predict_index)
        print("target",targets)

if __name__ == "__main__":
    train_transform = transforms.Compose([
                             transforms.Resize(224),
                             transforms.ToTensor()])
    test_transform = transforms.Compose([
                             transforms.Resize(224),
                             transforms.ToTensor()])
    # download the dataset
    train_set = datasets.CIFAR10(root="../dataset",transform = train_transform,train=True,download=True)
    test_set = datasets.CIFAR10(root="../dataset",transform = test_transform, train=False,download=True)
    train_data_loader = DataLoader(train_set,32,True,drop_last=True)
    test_data_loader = DataLoader(test_set,100)
    #train_set
    idx_to_class = {value:key for key,value in train_set.class_to_idx.items()}
    with open('image_label.pkl','wb') as f:
        pickle.dump(idx_to_class,f)

    train_image = torch.rand((4,3,224,224))
    train_y = torch.LongTensor([1,0,2,1])
    #train_y = torch.from_numpy(np.array())
    test_iamge = torch.rand((4,3,224,224))
    test_y = torch.LongTensor([1,0,2,1])

    device = 'cuda' if torch.cuda.is_available() else 'cpu'  
    net = Axlenet()
    net = net.to(device)    # move the model to the GPU or CPU
    loss_functoin = nn.CrossEntropyLoss()
    optm = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
    epochs = 20 # number of epochs; one epoch is one full pass over the training set
    for epoch in range(epochs):
        train(epoch,train_data_loader,device,net,loss_functoin,optm)
        test(test_data_loader,device,net,loss_functoin)
    print(train_loss)
    plt.plot(range(len(train_loss)),train_loss)
    plt.show()
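When accuracy is accumulated batch by batch, as in test() above, the number of correct predictions has to be divided by the number of samples actually evaluated, which depends on the dataset and on the size of the last batch. A minimal pure-Python sketch of that bookkeeping follows; the function name and the toy batches are assumptions for this example.

```python
def accuracy(batches):
    """Accuracy over an iterable of (predictions, targets) batches."""
    correct = 0
    total = 0
    for predictions, targets in batches:
        correct += sum(1 for p, t in zip(predictions, targets) if p == t)
        total += len(targets)
    return correct / total if total else 0.0


# Two batches of different sizes: 3 of the 4 predictions are correct
toy_batches = [([0, 1, 1], [0, 1, 0]), ([1], [1])]
print(accuracy(toy_batches))  # 0.75
```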

train_main.py

import torchvision
from train_alexnet import *
from torchvision import transforms  # transforms applied to the input data
from torch.utils.data import DataLoader
torch.manual_seed(4)
# The images must be in a directory structure where each subdirectory is one label
train_data_path="./train/"
test_data_path="./val/"
# Set up the transforms
data_transform=transforms.Compose([transforms.Resize((224,224)),  # resize every image to the same 224x224 resolution
                               transforms.ToTensor(),  # convert the image to a tensor
                               # transforms.Normalize(mean=[0.485,0.456,0.406],
                               #                      std=[0.229,0.224,0.225])  # normalization parameters
                               ])
# Build the training and validation datasets
train_data=torchvision.datasets.ImageFolder(root=train_data_path,transform=data_transform)
test_data=torchvision.datasets.ImageFolder(root=test_data_path,transform=data_transform)
train_data_loader = DataLoader(train_data,32,True,drop_last=True)
test_data_loader = DataLoader(test_data,17)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = Axlenet(2)
net = net.to(device)  # move the model to the GPU or CPU
loss_functoin = nn.CrossEntropyLoss()
optm = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
epochs = 10  # number of epochs; one epoch is one full pass over the training set


for epoch in range(epochs):
    train(epoch, train_data_loader, device, net, loss_functoin, optm)
    test(test_data_loader, device, net,loss_functoin)

print(train_loss)
plt.plot(range(len(train_loss)), train_loss)
plt.show()

predict.py

#coding=utf-8
import torch,os,time
from axlenet import Axlenet
from torchvision import transforms
from PIL import Image
import pickle

# with open('image_label.pkl','rb') as f:
#     idx_to_class = pickle.load(f)
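test_image = './val/ai/20220411094533_39.37.jpg'
image = Image.open(test_image).convert('RGB')  # make sure the image has 3 channels
net = Axlenet(2)
# load the weights onto the CPU, even if they were saved from a GPU
net.load_state_dict(torch.load('best_model.pth', map_location='cpu'))
net.eval()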

train_transform = transforms.Compose([
                         transforms.Resize((224,224)),
                         transforms.ToTensor(),
    # transforms.Normalize(mean=[0.485, 0.456, 0.406],
    #                      std=[0.229, 0.224, 0.225])  # normalization parameters
])
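device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = net.to(device)   # keep the model and the input on the same device
image = train_transform(image)
image = torch.unsqueeze(image,0).to(device)   # add a batch dimension
with torch.no_grad():   # gradients are not needed for inference
    result = net(image)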
print(result)
result_index = torch.argmax(result,1)
predict_index = result_index.item()
print(predict_index)
# predict_result = idx_to_class[predict_index]
# print(predict_result)
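The commented-out lines in predict.py translate the predicted index back into a class name through a pickled dictionary. The sketch below shows that round trip with the standard library only. The `class_to_idx` mapping, the in-memory buffer standing in for image_label.pkl, and the two output scores are all made-up values for illustration.

```python
import io
import pickle

# Assumed mapping for the two class folders (sorted by name)
class_to_idx = {"ai": 0, "house": 1}
idx_to_class = {index: name for name, index in class_to_idx.items()}

# An in-memory buffer stands in for the image_label.pkl file
buffer = io.BytesIO()
pickle.dump(idx_to_class, buffer)
buffer.seek(0)
loaded = pickle.load(buffer)

# Made-up network output for one image; the largest score wins
scores = [0.3, 1.7]
predict_index = max(range(len(scores)), key=lambda i: scores[i])
print(loaded[predict_index])  # house
```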
