CNN Foundational Papers, Close Reading + Reproduction ---- VGG (Part 3)

深度不学习！！

Last modified: 2022-03-31 14:17:45

Categories: Paper Close Reading + Reproduction, Personal Notes. Tags: deep learning, python, artificial intelligence

First published: 2022-03-11 12:09:42

Article link: https://blog.csdn.net/qq_38737428/article/details/123388308

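Preface

I have already gone through the VGG paper in the previous posts; if you haven't read them, take a look:

CNN Foundational Papers, Close Reading + Reproduction ---- VGG (Part 1)
CNN Foundational Papers, Close Reading + Reproduction ---- VGG (Part 2)

Today I'll reproduce it in code.

This post implements VGG16 in PyTorch.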

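Building the network

Section 3 of the paper lists quite a few training hyperparameters, so let me note them down first.

batch size = 256
momentum = 0.9
weight decay = $5 \times 10^{-4}$
dropout = 0.5
learning rate = 0.01
epochs = 74

Let's build the network following the figure below.
[Figure: VGG network structure]
First, write out the first two convolutional layers plus the pooling layer.
The paper specifies 3 × 3 convolution kernels and 2 × 2 pooling with stride 2.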

#1
nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1,padding=1),
nn.ReLU(True),
#2
nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1,padding=1),
nn.ReLU(True),

nn.MaxPool2d(kernel_size=2,stride=2),

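There isn't much to say about these layers. The paper does have the notion of a VGG block, though, and I wasn't sure how a block should be implemented in code. After looking around, I found that most implementations simply write the layers out in sequence like this; another style appends them one by one, as below: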

# block 1
net = []
net.append(nn.Conv2d(in_channels=3, out_channels=64, padding=1, kernel_size=3, stride=1))
net.append(nn.ReLU())
net.append(nn.Conv2d(in_channels=64, out_channels=64, padding=1, kernel_size=3, stride=1))
net.append(nn.ReLU())
net.append(nn.MaxPool2d(kernel_size=2, stride=2))

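This style creates a list that holds every layer of the network and appends to it one layer at a time.

To be honest, I don't really understand what the paper means by a block. Is it just this? How is it different from writing everything out in sequence as I did above? Never mind; I'll stick with my first approach.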
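One common way to express the "block" idea in code is a small helper that returns the conv + ReLU pairs and the trailing pooling layer as a single nn.Sequential. The following is a minimal sketch under that assumption, not the paper's or this post's implementation; the name vgg_block, its parameters, and the cfg list are hypothetical and introduced only for illustration.

```python
import torch
import torch.nn as nn


def vgg_block(num_convs, in_channels, out_channels):
    """Hypothetical helper: num_convs 3x3 conv+ReLU pairs followed by one 2x2 max pool."""
    layers = []
    for _ in range(num_convs):
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1))
        layers.append(nn.ReLU(inplace=True))
        in_channels = out_channels  # later convs in the block keep the channel count
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)


# VGG16 conv part: (number of convs, output channels) for each of the five blocks
cfg = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
blocks = []
in_channels = 3
for num_convs, out_channels in cfg:
    blocks.append(vgg_block(num_convs, in_channels, out_channels))
    in_channels = out_channels
features = nn.Sequential(*blocks)

# A small 32x32 input keeps the check cheap; each of the five pools halves the size.
out = features(torch.zeros(1, 3, 32, 32))
print(out.shape)
```

The layers produced this way are the same ones as in the flat listing; the helper only removes the repetition, so either style should behave identically.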

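Build the complete network following the figure above, and don't forget the flatten and dropout in the linear layers.

Here is the code for the first part of the linear layers:

nn.Flatten(),
#14
nn.Linear(in_features=7 * 7 * 512, out_features=4096),
nn.ReLU(True),
nn.Dropout(p=0.5),  # plain Dropout: the input here is a flattened (N, 4096) tensor, not feature maps

I won't paste the whole network here because it's too long; the full code is at the end.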
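The 7 * 7 * 512 input size of the first linear layer can be checked with a little arithmetic. The sketch below assumes a 224 × 224 input and the usual output-size formula floor((n + 2p - k) / s) + 1; the helper names conv_out and pool_out are made up for this illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - kernel) // stride + 1


size = 224
convs_per_block = [2, 2, 3, 3, 3]  # VGG16 layout used in this post
for num_convs in convs_per_block:
    for _ in range(num_convs):
        size = conv_out(size)   # 3x3, stride 1, padding 1 keeps the size
    size = pool_out(size)       # 2x2, stride 2 halves the size

flat_features = size * size * 512  # 512 channels come out of the last conv block
print(size, flat_features)
```

Each 3 × 3 convolution with padding 1 leaves the spatial size unchanged, so only the five pooling layers shrink the feature map.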

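Initializing the weights

Another point in the paper: in Section 3.1 the authors note that weight initialization matters. They first initialized the deeper configurations step by step from a pre-trained shallower network, and later found that the random initialization of Glorot & Bengio (2010), i.e. Xavier initialization, works without pre-training. So I also use Xavier initialization here.

Define the weight initialization function: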

def _initialize_weights(self):
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.constant_(m.bias, 0)
        elif isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)
            # nn.init.normal_(m.weight, 0, 0.01)
            nn.init.constant_(m.bias, 0)

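for m in self.modules():

Iterates over every module in the network, including nested layers.

if isinstance(m, nn.Conv2d):
isinstance() checks whether the first argument is an instance of the type given as the second argument.

Here it checks whether the current layer is a convolutional layer.

nn.init.xavier_uniform_(m.weight):
Applies the Xavier initialization mentioned above to the layer's weights.

if m.bias is not None:
Checks whether the current layer has a bias term.

nn.init.constant_(m.bias, 0)
init.constant_() fills the tensor given as the first argument with the value given as the second.

Here it sets the bias to 0.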
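To see what these two calls do, here is a minimal sketch on a single small layer. The 8 → 4 layer size is an arbitrary assumption for illustration. With the default gain of 1, xavier_uniform_ draws weights from a uniform distribution on [-a, a], where a = sqrt(6 / (fan_in + fan_out)).

```python
import math

import torch.nn as nn

# Arbitrary small layer, chosen only so the numbers are easy to inspect
layer = nn.Linear(in_features=8, out_features=4)
nn.init.xavier_uniform_(layer.weight)
nn.init.constant_(layer.bias, 0)

# For a Linear layer, fan_in = in_features and fan_out = out_features.
bound = math.sqrt(6.0 / (8 + 4))
max_abs_weight = layer.weight.abs().max().item()
print(bound, max_abs_weight, layer.bias.tolist())
```

For a Conv2d layer the same formula applies, with fan_in and fan_out each multiplied by the kernel area (3 × 3 here).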

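The loss function, optimizer, and training function are the usual ones; the setup I used for AlexNet fits here as well, so I simply reuse it.

For a detailed explanation of the rest of the code, see the post below.

Line-by-line walkthrough of the training code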

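Analyzing the results

I can't analyze the results. The dataset is CIFAR10, and the code won't run even with the batch size set to 1: it keeps reporting that GPU memory is insufficient, and it fails on a cloud machine too. I don't know what's going on; is something wrong with my code?

I also rewrote the code in the VGG-block style and it still won't run. Maybe it's because the CIFAR10 dataset is a bit larger and the network is much deeper? If anyone who understands this can look at my code and give me some pointers, I'd appreciate it.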
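As a rough reference point for the memory question, the parameter count of the network can be worked out by hand. This back-of-envelope sketch does not diagnose the error above. It assumes float32 weights, the 1000-way output layer of the original architecture, and roughly three copies of the parameters (weights, gradients, SGD momentum buffers). Activations, which also need memory at 224 × 224 input, are not counted.

```python
# (in_channels, out_channels) of the 13 conv layers, all with 3x3 kernels
conv_cfg = [(3, 64), (64, 64), (64, 128), (128, 128),
            (128, 256), (256, 256), (256, 256),
            (256, 512), (512, 512), (512, 512),
            (512, 512), (512, 512), (512, 512)]
# (in_features, out_features) of the 3 linear layers, ImageNet-style 1000-way head
fc_cfg = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

# weights + biases of every layer
conv_params = sum(c_in * 3 * 3 * c_out + c_out for c_in, c_out in conv_cfg)
fc_params = sum(f_in * f_out + f_out for f_in, f_out in fc_cfg)
total_params = conv_params + fc_params

bytes_per_copy = total_params * 4  # float32 uses 4 bytes per value
# Assumption: weights, gradients and momentum buffers, i.e. about three copies.
approx_gib = 3 * bytes_per_copy / 1024 ** 3
print(conv_params, fc_params, total_params, round(approx_gib, 2))
```

Most of the parameters sit in the first linear layer, which is why the fully connected part dominates the total.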

import torch
from torch import optim
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn as nn
import matplotlib.pyplot as plt


batch_size = 1
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])

train_dataset = datasets.CIFAR10(root='../data/', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)
test_dataset = datasets.CIFAR10(root='../data/', train=False, download=True, transform=transform)
test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)

print("Training set size:", len(train_dataset))
print("Test set size:", len(test_dataset))

# Model definition

class VGG16(nn.Module):
    def __init__(self):
        super(VGG16, self).__init__()
        self.mode1 = nn.Sequential(
            #1
            nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1,padding=1),
            nn.ReLU(True),
            #2
            nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1,padding=1),
            nn.ReLU(True),

            nn.MaxPool2d(kernel_size=2,stride=2),
            #3
            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1,padding=1),
            nn.ReLU(True),
            #4
            nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, stride=1,padding=1),
            nn.ReLU(True),

            nn.MaxPool2d(kernel_size=2,stride=2),

            #5
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1,padding=1),
            nn.ReLU(True),
            #6
            nn.Conv2d(in_channels=256, out_channels=256, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),

            #7
            nn.Conv2d(in_channels=256, out_channels=256, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),

            nn.MaxPool2d(stride=2, kernel_size=2),

            #8
            nn.Conv2d(in_channels=256, out_channels=512, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),
            #9
            nn.Conv2d(in_channels=512, out_channels=512, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),
            #10
            nn.Conv2d(in_channels=512, out_channels=512, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),


            nn.MaxPool2d(stride=2, kernel_size=2),

            #11
            nn.Conv2d(in_channels=512, out_channels=512, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),
            #12
            nn.Conv2d(in_channels=512, out_channels=512, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),
            #13
            nn.Conv2d(in_channels=512, out_channels=512, padding=1, kernel_size=3, stride=1),
            nn.ReLU(inplace=True),

            nn.MaxPool2d(kernel_size=2, stride=2),

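            # Linear layers
            nn.Flatten(),
            #14
            nn.Linear(in_features=7 * 7 * 512, out_features=4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),  # plain Dropout for flattened features
            #15
            nn.Linear(in_features=4096, out_features=4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            #16
            # CIFAR-10 has 10 classes (the paper's ImageNet head uses 1000).
            # No ReLU after the last layer: CrossEntropyLoss expects raw logits.
            nn.Linear(in_features=4096, out_features=10),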
        )
        self._initialize_weights()

    def forward(self, input):

        x = self.mode1(input)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                nn.init.constant_(m.bias, 0)

if hasattr(torch.cuda, 'empty_cache'):
    torch.cuda.empty_cache()

model = VGG16().cuda()
# Loss function
criterion = torch.nn.CrossEntropyLoss().cuda()
# Optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01, weight_decay=0.0005, momentum=0.9)


def train(epoch):
    running_loss = 0.0
    model.train()  # make sure dropout is active during training
    for i, data in enumerate(train_loader, start=1):
        x, y = data
        x, y = x.cuda(), y.cuda()
        if i % 10 == 0:
            print("Running, current iteration:", i)
        # Zero the gradients, forward pass, compute loss, backward pass, update
        optimizer.zero_grad()
        y_pre = model(x)
        loss = criterion(y_pre, y)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    # running_loss sums the mean loss of each batch, so divide by the number
    # of batches to get the average loss for this epoch.
    avg_loss = running_loss / len(train_loader)
    print("Epoch %d of training, current loss %.5f" % (epoch + 1, avg_loss))

    return avg_loss

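def test(epoch):
    correct = 0
    total = 0
    was_training = model.training
    model.eval()  # disable dropout while evaluating
    with torch.no_grad():
        for data in test_loader:
            x, y = data
            x, y = x.cuda(), y.cuda()
            pre_y = model(x)
            # Each row of the output holds one score per class;
            # take the index of the largest score as the predicted class.
            _, pre_y = torch.max(pre_y, dim=1)  # dim=1 reduces over the class dimension

            total += y.size(0)  # size along dim 0, i.e. the number of samples

            correct += (pre_y == y).sum().item()  # element-wise comparison of tensors
    model.train(was_training)  # restore the previous mode
    print("Test for epoch %d finished, current accuracy: %d %%" % (epoch + 1, correct / total * 100))
    return correct / total * 100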
if __name__ == '__main__':
    plt_epoch = []
    loss_ll = []
    corr = []
    for epoch in range(20):
        plt_epoch.append(epoch + 1)  # for plotting
        loss_ll.append(train(epoch))  # record the training loss of each epoch for plotting
        corr.append(test(epoch))  # record the accuracy of each epoch
    # Visualization
    plt.figure(figsize=(12, 6))
    plt.subplot(1, 2, 1)
    plt.title("Training")
    plt.plot(plt_epoch, loss_ll)
    plt.xlabel("Epoch")
    plt.ylabel("Loss")

    plt.subplot(1, 2, 2)
    plt.title("Testing")
    plt.plot(plt_epoch, corr)
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy (%)")
    plt.show()

Not being able to see the results is no big deal; it doesn't get in the way of studying the paper.

The code is on GitHub: https://github.com/shitbro6/paper
