PyTorch Deep Learning Practice 11: Convolutional Neural Networks (Advanced)

11. Convolutional Neural Networks (Advanced) - bilibili

I have recently been following Liu Er's (刘二大人) PyTorch course on bilibili; this post shares some of my study code, for reference and discussion only.

For the basics of convolutional neural networks, see:

PyTorch Deep Learning Practice 10: Convolutional Neural Networks (Basics) - CSDN Blog

Contents

1. Inception Module

2. A CNN with Inception Modules

3. Residual Block

4. A CNN with Residual Blocks


1. Inception Module

The Inception Module, also known as the GoogLeNet Inception Module, is a convolutional network structure proposed by Google in 2014 and is the core component of GoogLeNet. It was designed to keep the network's computation and parameter count manageable, and to capture features at different scales by combining convolution kernels of several sizes.

The defining feature of the Inception Module is its parallel structure: it contains several branches, each using a different kernel size. By combining the information from these kernels, the network can capture local and more global features at the same time, which improves its expressive power.

The basic components of an Inception Module are:

  • 1x1 convolution branch: a 1x1 kernel captures correlations across channels and changes the number of channels, which reduces computational cost (see the parameter-count sketch after this list).

  • 3x3 convolution branch: a 3x3 kernel captures local features; using kernels of several sizes lets the network adapt to targets of different sizes.

  • 5x5 convolution branch: a 5x5 kernel also captures local features, similar in purpose to the 3x3 kernel, but with a larger receptive field, so it can gather information over a wider area.

  • 3x3 pooling branch: the pooling operation (average pooling in the code below) helps capture more abstract, global features.

  • Concatenation: the outputs of all branches are concatenated along the channel dimension to form the final output.
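The claim that the 1x1 reduction lowers computational cost is easy to check by counting parameters. A minimal sketch (the channel sizes 192, 16, and 32 here are illustrative only, not taken from the network below):

import torch.nn as nn

# direct 5x5 convolution: 192 -> 32 channels
direct = nn.Conv2d(192, 32, kernel_size=5, padding=2)

# 1x1 reduction first (192 -> 16), then the 5x5 convolution (16 -> 32)
reduced = nn.Sequential(
    nn.Conv2d(192, 16, kernel_size=1),
    nn.Conv2d(16, 32, kernel_size=5, padding=2),
)

def count(m):
    # total number of parameters, including biases
    return sum(p.numel() for p in m.parameters())

print(count(direct))   # 153632 parameters (192*32*5*5 + 32 biases)
print(count(reduced))  # 15920 parameters, roughly a tenfold reduction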

Definitions of the basic units:

self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)       # 1x1 branch

self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)     # 5x5 branch: 1x1 reduction first
self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)   # padding=2 keeps the spatial size

self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)     # 3x3 branch: 1x1 reduction first
self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)   # padding=1 keeps the spatial size
self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)   # two stacked 3x3 convolutions

self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)     # 1x1 applied after pooling

Branch 1, the pooling branch:

branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)  # 3x3 average pooling, spatial size preserved
branch_pool = self.branch_pool(branch_pool)

Branch 2, the 1x1 convolution:

branch1x1 = self.branch1x1(x)

Branch 3, the 5x5 convolution (after a 1x1 reduction):

branch5x5 = self.branch5x5_1(x)
branch5x5 = self.branch5x5_2(branch5x5)

Branch 4, the two stacked 3x3 convolutions (after a 1x1 reduction):

branch3x3 = self.branch3x3_1(x)
branch3x3 = self.branch3x3_2(branch3x3)
branch3x3 = self.branch3x3_3(branch3x3)

Finally, the outputs of the four branches are concatenated along the channel dimension, for a total of 16 + 24 + 24 + 24 = 88 output channels.

The concatenation code:

outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
final_out = torch.cat(outputs, dim=1)  # concatenate along the channel dimension

The full Inception Module definition:

class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)

        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)

        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)

        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)  # concatenate along the channel dimension
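A quick sanity check of the output shape (a sketch, assuming the imports from the full script below and a 10-channel input, as in the network of the next section):

import torch

x = torch.randn(1, 10, 12, 12)      # one 10-channel 12x12 feature map
incep = InceptionA(in_channels=10)
print(incep(x).shape)               # torch.Size([1, 88, 12, 12])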

2. A CNN with Inception Modules

  • First, encapsulate the Inception Module in a class (InceptionA above).
  • The network structure is: Conv1 (1→10 channels, 5x5 kernel) -> 2x2 max pooling -> Inception Module 1 -> Conv2 (88→20 channels, 5x5 kernel) -> 2x2 max pooling -> Inception Module 2 -> fully connected layer.
  • The fully connected layer has 1408 input features (obtained by flattening the convolutional output with view, as traced below) and 10 output classes.
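Tracing the shapes layer by layer shows where the 1408 comes from (assuming 28x28 MNIST inputs):

# input:         (n, 1, 28, 28)
# conv1 (5x5):   (n, 10, 24, 24)    28 - 5 + 1 = 24
# maxpool 2x2:   (n, 10, 12, 12)
# InceptionA:    (n, 88, 12, 12)    16 + 24 + 24 + 24 = 88 channels
# conv2 (5x5):   (n, 20, 8, 8)
# maxpool 2x2:   (n, 20, 4, 4)
# InceptionA:    (n, 88, 4, 4)
# view:          (n, 1408)          88 * 4 * 4 = 1408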

The complete code:

import torch
import torch.nn as nn
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F
import matplotlib.pyplot as plt

# load the dataset
batch_size=64
transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,),(0.3081,))
])

train_dataset=datasets.MNIST(root='../dataset/mnist/',
    train=True,
    download=True,  # download MNIST automatically if it is not already present
    transform=transform)
train_loader = DataLoader(train_dataset,
    shuffle=True,
    batch_size=batch_size)
test_dataset = datasets.MNIST(root='../dataset/mnist/',
    train=False,
    download=True,
    transform=transform)
test_loader = DataLoader(test_dataset,
    shuffle=False,
    batch_size=batch_size)

# build the model
class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)

        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)

        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)

        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)  # concatenate along the channel dimension
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)

        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)

        self.mp = nn.MaxPool2d(2)
        self.fc = nn.Linear(1408, 10)

    def forward(self, x):
        in_size = x.size(0)                 # batch size n
        x = F.relu(self.mp(self.conv1(x)))  # (n, 1, 28, 28) -> (n, 10, 12, 12)
        x = self.incep1(x)                  # -> (n, 88, 12, 12)
        x = F.relu(self.mp(self.conv2(x)))  # -> (n, 20, 4, 4)
        x = self.incep2(x)                  # -> (n, 88, 4, 4)
        x = x.view(in_size, -1)             # flatten to (n, 1408)
        x = self.fc(x)
        return x
model=Net()

# define the loss function
criterion = torch.nn.CrossEntropyLoss()  # cross-entropy loss, averaged over the batch by default
# define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.6)  # SGD optimizer with momentum

losses=[]
accs=[]
def train(epoch):
    running_loss=0.0
    for batch_idx,data in enumerate(train_loader,0):
        inputs,target=data
        optimizer.zero_grad()

        #forward+backward+update
        outputs=model(inputs)
        loss=criterion(outputs,target)
        loss.backward()
        optimizer.step()

        running_loss+=loss.item()
        if batch_idx%300==299:
            print('[%d %5d] loss:%.3f'%(epoch + 1, batch_idx + 1, running_loss / 300))
            losses.append(running_loss / 300)
            running_loss=0

def test():
    correct=0
    total=0
    with torch.no_grad():
        for data in test_loader:
            images,labels=data
            outputs=model(images)
            _,predicted=torch.max(outputs.data,dim=1)
            total+=labels.size(0)
            correct+=(predicted==labels).sum().item()
    acc=100*correct/total
    print('Accuracy on test set:%0.3f'%(acc))
    accs.append(acc)

if __name__=='__main__':
    for epoch in range(12):
        train(epoch)
        test()

    plt.figure(1)
    plt.plot(losses, label='loss')
    plt.xlabel('every 300 mini-batches')
    plt.ylabel('loss')
    plt.title('Variation of Cross-Entropy Loss')

    plt.figure(2)
    plt.plot(accs, label='acc')
    plt.xlabel('epoch')
    plt.ylabel('acc')
    plt.title('Accuracy Evolution')
    plt.show()
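The script above runs on the CPU. If a CUDA GPU is available, the usual PyTorch pattern is to move the model and each batch to the device; a minimal sketch of the extra lines (not part of the original script):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)  # move model parameters to the GPU if one is available

# inside train() and test(), move each batch to the same device:
inputs, target = inputs.to(device), target.to(device)
images, labels = images.to(device), labels.to(device)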

The script plots how the training loss evolves over the 12 epochs. The final test accuracy is 98.88%; the test-accuracy curve is plotted in the same way.

3. Residual Block

The Residual Block is the basic building block of Residual Networks (ResNet), used to construct deep neural networks. Its design lets the model learn the residual between input and output; the residual connection mitigates the vanishing-gradient problem and makes the network easier to train.

H(x) = x + F(x)

Here F(x) has the same dimensions as x, which addresses the vanishing-gradient problem that arises as models get deeper. (Note: the vanishing-gradient problem occurs in deep networks when gradients shrink toward zero during backpropagation; the weight updates then become tiny, the parameters of the lower layers barely change, and the model struggles to learn low-level features, hurting training overall.)

In a residual block, let the input be x, the output be y, and the residual mapping be F(x). The gradient then propagates as:

\frac{\partial Loss}{\partial x}=\frac{\partial Loss}{\partial y}\frac{\partial y}{\partial x}

Expanding with the chain rule, using y = x + F(x):

\frac{\partial Loss}{\partial x}=\frac{\partial Loss}{\partial y}(1+\frac{\partial F(x)}{\partial x})

In a plain network, if |\frac{\partial F(x)}{\partial x}| is small, then |\frac{\partial Loss}{\partial x}| is small as well, and the gradient vanishes. In a residual block, however, even when \frac{\partial F(x)}{\partial x} is small, an identity term (a 1, or an identity matrix in the multivariate case) is added to the gradient, so the gradient can still propagate effectively.
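The extra identity term is easy to observe with autograd. A minimal sketch, using a toy mapping F(x) = 0.01·x whose local gradient alone would shrink the signal (the numbers are illustrative only):

import torch

x = torch.tensor([2.0], requires_grad=True)

# plain mapping: y = F(x), so dLoss/dx picks up only dF/dx = 0.01
y = 0.01 * x
y.backward()
print(x.grad)  # tensor([0.0100])

x.grad = None  # reset the accumulated gradient

# residual mapping: y = x + F(x), so the gradient is 1 + dF/dx
y = x + 0.01 * x
y.backward()
print(x.grad)  # tensor([1.0100])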

The Residual Block used here consists of two 3x3 convolutions wrapped by a skip connection. Its definition in code:

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        # both convolutions preserve the channel count and the spatial size
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)  # add the skip connection, then apply ReLU
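Because both convolutions use padding=1 and keep the channel count, the output of the block always has the same shape as its input, so the addition x + y is well defined. A quick check (a sketch, assuming torch is imported):

import torch

x = torch.randn(1, 16, 12, 12)
block = ResidualBlock(channels=16)
print(block(x).shape)  # torch.Size([1, 16, 12, 12]), identical to the input

(When the input and output channel counts differ, standard ResNets instead pass the shortcut through a 1x1 convolution so the shapes match; that variant is not needed here.)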

4. A CNN with Residual Blocks

ResNet, short for Residual Networks, is a deep learning architecture proposed by Microsoft Research Asia to address the vanishing-gradient problem in deep neural networks. By introducing the Residual Block, ResNet makes it much easier to train very deep models.

  • First, encapsulate the Residual Block in a class (ResidualBlock above).
  • The network structure is: Conv1 (1→16 channels, 5x5 kernel) -> 2x2 max pooling -> Residual Block 1 -> Conv2 (16→32 channels, 5x5 kernel) -> 2x2 max pooling -> Residual Block 2 -> fully connected layer.
  • The fully connected layer has 512 input features (obtained by flattening the convolutional output with view, as traced below) and 10 output classes.
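Tracing the shapes shows where the 512 comes from (assuming 28x28 MNIST inputs):

# input:          (n, 1, 28, 28)
# conv1 (5x5):    (n, 16, 24, 24)
# maxpool 2x2:    (n, 16, 12, 12)
# ResidualBlock:  (n, 16, 12, 12)    shape preserved
# conv2 (5x5):    (n, 32, 8, 8)
# maxpool 2x2:    (n, 32, 4, 4)
# ResidualBlock:  (n, 32, 4, 4)
# view:           (n, 512)           32 * 4 * 4 = 512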

The complete code:

import torch
import torch.nn as nn
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F
import matplotlib.pyplot as plt

# load the dataset
batch_size=64
transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,),(0.3081,))
])

train_dataset=datasets.MNIST(root='../dataset/mnist/',
    train=True,
    download=True,  # download MNIST automatically if it is not already present
    transform=transform)
train_loader = DataLoader(train_dataset,
    shuffle=True,
    batch_size=batch_size)
test_dataset = datasets.MNIST(root='../dataset/mnist/',
    train=False,
    download=True,
    transform=transform)
test_loader = DataLoader(test_dataset,
    shuffle=False,
    batch_size=batch_size)

# build the model
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        # both convolutions preserve the channel count and the spatial size
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)  # add the skip connection, then apply ReLU
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5)

        self.rblock1 = ResidualBlock(channels=16)
        self.rblock2 = ResidualBlock(channels=32)

        self.mp = nn.MaxPool2d(2)
        self.fc = nn.Linear(512, 10)

    def forward(self, x):
        in_size = x.size(0)                 # batch size n
        x = self.mp(F.relu(self.conv1(x)))  # (n, 1, 28, 28) -> (n, 16, 12, 12)
        x = self.rblock1(x)                 # shape preserved
        x = self.mp(F.relu(self.conv2(x)))  # -> (n, 32, 4, 4)
        x = self.rblock2(x)                 # shape preserved
        x = x.view(in_size, -1)             # flatten to (n, 512)
        x = self.fc(x)
        return x
model=Net()

# define the loss function
criterion = torch.nn.CrossEntropyLoss()  # cross-entropy loss, averaged over the batch by default
# define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.6)  # SGD optimizer with momentum

losses=[]
accs=[]
def train(epoch):
    running_loss=0.0
    for batch_idx,data in enumerate(train_loader,0):
        inputs,target=data
        optimizer.zero_grad()

        #forward+backward+update
        outputs=model(inputs)
        loss=criterion(outputs,target)
        loss.backward()
        optimizer.step()

        running_loss+=loss.item()
        if batch_idx%300==299:
            print('[%d %5d] loss:%.3f'%(epoch + 1, batch_idx + 1, running_loss / 300))
            losses.append(running_loss / 300)
            running_loss=0

def test():
    correct=0
    total=0
    with torch.no_grad():
        for data in test_loader:
            images,labels=data
            outputs=model(images)
            _,predicted=torch.max(outputs.data,dim=1)
            total+=labels.size(0)
            correct+=(predicted==labels).sum().item()
    acc=100*correct/total
    print('Accuracy on test set:%0.3f'%(acc))
    accs.append(acc)

if __name__=='__main__':
    for epoch in range(12):
        train(epoch)
        test()

    plt.figure(1)
    plt.plot(losses, label='loss')
    plt.xlabel('every 300 mini-batches')
    plt.ylabel('loss')
    plt.title('Variation of Cross-Entropy Loss')

    plt.figure(2)
    plt.plot(accs, label='acc')
    plt.xlabel('epoch')
    plt.ylabel('acc')
    plt.title('Accuracy Evolution')
    plt.show()

The script plots how the training loss evolves during training. The final test accuracy is 99.07%; the test-accuracy curve is plotted in the same way.
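As a closing note, a handy way to compare the size of this network with the Inception version from section 2 is to count trainable parameters (a sketch using standard PyTorch calls):

def count_parameters(model):
    # total number of trainable parameters
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(count_parameters(Net()))  # parameter count of the network defined above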
