PyTorch Lecture 11: Advanced CNNs

Advanced CNNs

GoogLeNet

[Figure: GoogLeNet architecture]

Notice that the network contains networks within it (the Inception Modules).

Inception Module

[Figure: Inception module structure]

Why does the module need a 1×1 convolution inside?

[Figure: 1×1 convolution over multiple channels]

Running a single 1×1 convolution over the three channels fuses the information of all three channels into one channel, much like how final high-school rankings are produced by summing all subject scores and then comparing the totals.
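
To make this concrete, here is a minimal sketch (not from the original post) showing that a 1×1 convolution computes, at every pixel, a weighted sum across the input channels:

import torch

x = torch.randn(1, 3, 4, 4)  # 3 channels, 4×4 feature map
conv = torch.nn.Conv2d(3, 1, kernel_size=1, bias=False)
y = conv(x)                  # shape (1, 1, 4, 4): the 3 channels fused into 1

# manually: each output pixel is a weighted sum of the 3 channel values at that pixel
w = conv.weight.view(3)
manual = (x * w.view(1, 3, 1, 1)).sum(dim=1, keepdim=True)
print(torch.allclose(y, manual))  # True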

  • Applying a 5×5 convolution directly to the 192@28×28 input (producing 32 output channels) costs: kernel 5² × feature map 28² × 192 input channels × 32 output channels = 120,422,400 multiplications
  • First compressing the channels with a 1×1 convolution (192 → 16), then applying the 5×5 convolution, costs: 1² × 28² × 192 × 16 + 5² × 28² × 16 × 32 = 12,443,648 multiplications
  • 12,443,648 / 120,422,400 ≈ 10.33%, so the computation drops to roughly 1/10 of the original; accordingly the running time also drops to about 1/10, and a job that used to take 10 hours now takes about 1 hour, a big saving in time. (A quick sanity check of these numbers is sketched below.)
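
The following sketch (not from the original post; it just re-derives the figures above, assuming stride 1 and padded "same"-size outputs) verifies the counts:

def conv_mults(k, h, w, c_in, c_out):
    # each of the h*w*c_out output values needs k*k*c_in multiplications
    return k * k * h * w * c_in * c_out

direct = conv_mults(5, 28, 28, 192, 32)                                      # 120422400
bottleneck = conv_mults(1, 28, 28, 192, 16) + conv_mults(5, 28, 28, 16, 32)  # 12443648
print(bottleneck / direct)                                                   # ≈ 0.1033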

Corresponding code

[Figure: Inception module implementation]

  • The self.xxx lines go in __init__(); the remaining lines go in forward()
  • Padding is set so that w and h do not shrink (a small sketch follows below)
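A minimal sketch (with assumed shapes) of the padding rule: kernel_size=5 with padding=2, or kernel_size=3 with padding=1, preserves the spatial size:

import torch

x = torch.randn(1, 16, 28, 28)
print(torch.nn.Conv2d(16, 24, kernel_size=5, padding=2)(x).shape)  # torch.Size([1, 24, 28, 28])
print(torch.nn.Conv2d(16, 24, kernel_size=3, padding=1)(x).shape)  # torch.Size([1, 24, 28, 28])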

The corresponding concatenation code:

outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
return torch.cat(outputs, dim=1)

Here dim=1 means the tensors are concatenated along the channel dimension. The four tensor dimensions are b (batch), c (channel), h (height), and w (width), corresponding to dim=0, 1, 2, 3 respectively.
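
For example (a small sketch with made-up shapes), tensors that agree on every dimension except dim=1 simply stack their channels:

import torch

a = torch.randn(8, 16, 28, 28)         # (batch, channels, height, width)
b = torch.randn(8, 24, 28, 28)
print(torch.cat([a, b], dim=1).shape)  # torch.Size([8, 40, 28, 28])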

Code

import matplotlib.pyplot as plt
import torch
import torch.nn.functional as F  # needed below for F.relu and F.avg_pool2d
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms  # converts w(width)×h(height)×c(channels) images to c×w×h, i.e. moves channels to the front

batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # mean and std of the MNIST dataset, precomputed; just use these two numbers
])

train_dataset = datasets.MNIST(root='./dataset/mnist/',
                               train=True,
                               download=True,
                               transform=transform)
train_loader = DataLoader(train_dataset,
                          shuffle=True,
                          batch_size=batch_size)
test_dataset = datasets.MNIST(root='./dataset/mnist/',
                              train=False,
                              download=True,
                              transform=transform)
test_loader = DataLoader(test_dataset,
                         shuffle=False,
                         batch_size=batch_size)


#---------------------------------------------------CNN definition below-------------------------------------------------------------------------#
class InceptionA(torch.nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        self.branch1x1 = torch.nn.Conv2d(in_channels, 16, kernel_size=1)  # 16 output channels

        self.branch5x5_1 = torch.nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = torch.nn.Conv2d(16, 24, kernel_size=5, padding=2)  # 24 output channels

        self.branch3x3_1 = torch.nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = torch.nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = torch.nn.Conv2d(24, 24, kernel_size=3, padding=1)  # 24 output channels

        self.branch_pool = torch.nn.Conv2d(in_channels, 24, kernel_size=1)  # 24 output channels

    def forward(self, x):
        branch1x1 = self.branch1x1(x)  # 1×1 branch

        branch5x5 = self.branch5x5_1(x)  # 5×5 branch: 1×1 bottleneck first
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)  # 3×3 branch: 1×1 bottleneck, then two 3×3 convs
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)  # average-pool first
        branch_pool = self.branch_pool(branch_pool)  # then a 1×1 conv

        # concatenate the 4 branch results along the channel dimension
        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)  # 16 + 24 + 24 + 24 = 88 output channels
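
# A quick sanity check (a sketch, not part of the original script): InceptionA
# preserves H and W and always emits 16+24+24+24 = 88 channels, which is why
# conv2 in Net below takes 88 input channels. For example:
#   InceptionA(in_channels=10)(torch.randn(1, 10, 12, 12)).shape
#   -> torch.Size([1, 88, 12, 12])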


class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(88, 20, kernel_size=5)  # 88 = output channels of InceptionA

        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)

        self.mp = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(1408, 10)  # 1408 = 88 channels × 4 × 4 spatial positions

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.incep1(x)
        x = F.relu(self.mp(self.conv2(x)))
        x = self.incep2(x)
        x = x.view(in_size, -1)  # flatten to (batch, 1408)
        x = self.fc(x)
        return x


model = Net()

#**********************GPU-related change below*******************************#
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

#**********************GPU-related change above*******************************#
print(device)
#---------------------------------------------------CNN definition above-------------------------------------------------------------------------#
criterion = torch.nn.CrossEntropyLoss()  # cross-entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)  # momentum helps converge faster and escape local optima


def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data

        #**********************GPU-related change below*******************************#
        inputs, target = inputs.to(device), target.to(device)
        #**********************GPU-related change above*******************************#

        optimizer.zero_grad()

        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:  # print the average loss every 300 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0


def test():
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients needed during testing
        for data in test_loader:
            images, labels = data

            #**********************GPU-related change below*******************************#
            images, labels = images.to(device), labels.to(device)
            #**********************GPU-related change above*******************************#

            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim=1)  # _ holds the max values, which we never use; dim=1 takes the max across classes (row-wise)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy on test set: %d %% [%d/%d]' % (100 * correct / total, correct, total))
    return correct / total


epoch_list = []
accu_list = []

if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        accu = test()
        epoch_list.append(epoch)
        accu_list.append(accu)

    plt.plot(epoch_list, accu_list, 'o-')
    plt.xticks(range(10))
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.grid(alpha=0.4)

    plt.show()
Program output:

cuda:0
[1,   300] loss: 0.966
[1,   600] loss: 0.217
[1,   900] loss: 0.157
Accuracy on test set: 96 % [9669/10000]
[2,   300] loss: 0.125
[2,   600] loss: 0.107
[2,   900] loss: 0.095
Accuracy on test set: 97 % [9758/10000]
[3,   300] loss: 0.087
[3,   600] loss: 0.083
[3,   900] loss: 0.077
Accuracy on test set: 98 % [9802/10000]
[4,   300] loss: 0.069
[4,   600] loss: 0.067
[4,   900] loss: 0.070
Accuracy on test set: 98 % [9846/10000]
[5,   300] loss: 0.061
[5,   600] loss: 0.061
[5,   900] loss: 0.060
Accuracy on test set: 98 % [9830/10000]
[6,   300] loss: 0.055
[6,   600] loss: 0.055
[6,   900] loss: 0.052
Accuracy on test set: 98 % [9845/10000]
[7,   300] loss: 0.050
[7,   600] loss: 0.045
[7,   900] loss: 0.050
Accuracy on test set: 98 % [9835/10000]
[8,   300] loss: 0.043
[8,   600] loss: 0.047
[8,   900] loss: 0.046
Accuracy on test set: 98 % [9864/10000]
[9,   300] loss: 0.042
[9,   600] loss: 0.041
[9,   900] loss: 0.042
Accuracy on test set: 98 % [9867/10000]
[10,   300] loss: 0.039
[10,   600] loss: 0.038
[10,   900] loss: 0.042
Accuracy on test set: 98 % [9872/10000]

[Figure: accuracy vs. epoch curve]

ResNet (Residual Network)

[Figure: residual block with skip connection]

  • Adding x to the original F(x) (a skip connection) effectively mitigates the vanishing-gradient problem: the output is F(x) + x, so its gradient with respect to x contains an identity term and stays near 1 even when the gradient of F(x) is tiny (see the sketch below)
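A minimal sketch (not from the original post) of this effect: even when the residual branch F contributes almost no gradient, the skip connection keeps dy/dx close to 1:

import torch

x = torch.tensor(2.0, requires_grad=True)
F_x = 1e-6 * x ** 2  # a residual branch with a tiny gradient (dF/dx = 4e-6 here)
y = F_x + x          # the skip connection adds x back
y.backward()
print(x.grad)        # tensor(1.0000): dominated by the identity term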

ResNet code implementation

[Figure: residual block implementation]

import matplotlib.pyplot as plt
import torch
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms  # converts w(width)×h(height)×c(channels) images to c×w×h, i.e. moves channels to the front

batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # mean and std of the MNIST dataset, precomputed; just use these two numbers
])

train_dataset = datasets.MNIST(root='./dataset/mnist/',
                               train=True,
                               download=True,
                               transform=transform)
train_loader = DataLoader(train_dataset,
                          shuffle=True,
                          batch_size=batch_size)
test_dataset = datasets.MNIST(root='./dataset/mnist/',
                              train=False,
                              download=True,
                              transform=transform)
test_loader = DataLoader(test_dataset,
                         shuffle=False,
                         batch_size=batch_size)


#---------------------------------------------------CNN definition below-------------------------------------------------------------------------#
class ResidualBlock(torch.nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.channels = channels
        # both convs preserve the channel count and spatial size, so x and y can be added
        self.conv1 = torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        return F.relu(x + y)  # skip connection: add x before the final ReLU


class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(16, 32, kernel_size=5)
        self.mp = torch.nn.MaxPool2d(2)

        self.rblock1 = ResidualBlock(16)
        self.rblock2 = ResidualBlock(32)

        self.fc = torch.nn.Linear(512, 10)  # 512 = 32 channels × 4 × 4 spatial positions

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.rblock1(x)
        x = F.relu(self.mp(self.conv2(x)))
        x = self.rblock2(x)
        x = x.view(in_size, -1)  # flatten to (batch, 512)
        x = self.fc(x)
        return x


model = Net()

#**********************GPU-related change below*******************************#
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

#**********************GPU-related change above*******************************#
print(device)
#---------------------------------------------------CNN definition above-------------------------------------------------------------------------#
criterion = torch.nn.CrossEntropyLoss()  # cross-entropy loss
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)  # momentum helps converge faster and escape local optima


def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data

        #**********************GPU-related change below*******************************#
        inputs, target = inputs.to(device), target.to(device)
        #**********************GPU-related change above*******************************#

        optimizer.zero_grad()

        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:  # print the average loss every 300 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0


def test():
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients needed during testing
        for data in test_loader:
            images, labels = data

            #**********************GPU-related change below*******************************#
            images, labels = images.to(device), labels.to(device)
            #**********************GPU-related change above*******************************#

            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim=1)  # _ holds the max values, which we never use; dim=1 takes the max across classes (row-wise)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy on test set: %d %% [%d/%d]' % (100 * correct / total, correct, total))
    return correct / total


epoch_list = []
accu_list = []

if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        accu = test()
        epoch_list.append(epoch)
        accu_list.append(accu)

    plt.plot(epoch_list, accu_list, 'o-')
    plt.xticks(range(10))
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.grid(alpha=0.4)

    plt.show()
Program output:

cuda:0
[1,   300] loss: 0.529
[1,   600] loss: 0.155
[1,   900] loss: 0.114
Accuracy on test set: 96 % [9677/10000]
[2,   300] loss: 0.091
[2,   600] loss: 0.072
[2,   900] loss: 0.070
Accuracy on test set: 98 % [9841/10000]
[3,   300] loss: 0.059
[3,   600] loss: 0.062
[3,   900] loss: 0.049
Accuracy on test set: 98 % [9870/10000]
[4,   300] loss: 0.045
[4,   600] loss: 0.044
[4,   900] loss: 0.049
Accuracy on test set: 98 % [9847/10000]
[5,   300] loss: 0.043
[5,   600] loss: 0.039
[5,   900] loss: 0.038
Accuracy on test set: 98 % [9886/10000]
[6,   300] loss: 0.034
[6,   600] loss: 0.035
[6,   900] loss: 0.032
Accuracy on test set: 98 % [9886/10000]
[7,   300] loss: 0.030
[7,   600] loss: 0.031
[7,   900] loss: 0.029
Accuracy on test set: 98 % [9870/10000]
[8,   300] loss: 0.026
[8,   600] loss: 0.025
[8,   900] loss: 0.029
Accuracy on test set: 98 % [9877/10000]
[9,   300] loss: 0.023
[9,   600] loss: 0.025
[9,   900] loss: 0.023
Accuracy on test set: 98 % [9865/10000]
[10,   300] loss: 0.018
[10,   600] loss: 0.023
[10,   900] loss: 0.023
Accuracy on test set: 98 % [9873/10000]

[Figure: accuracy vs. epoch curve]

Other network architectures

[Figures: other network architectures]

Next steps for study

  1. Theory: Deep Learning (the "flower book")
  2. Read the PyTorch documentation (at least once, end to end) to learn which operations and layers are available
  3. Reproduce classic work: read the code → training, testing, data loading architecture, loss construction → implement what the paper describes yourself, iterating between reading code and writing code
  4. Pick a specific area and read papers extensively; note the tricks people use, accumulate reusable modules, and broaden your horizons; after reading enough, ideas for innovation will come

Closing acknowledgements

This concludes the deep learning fundamentals course. Since my current work is in CV, I will not need the later lectures on recurrent neural networks for now, so I am skipping them.

Many thanks to 刘二大人's course on Bilibili for its guidance, and for pointing the way for my further study.
