Computer Vision Notes - Basic Networks - VGGNet

Latest recommended article published on 2022-12-05 18:20:22

蓝色的杯子

Category: Computer Vision Notes; Tags: computer vision

Article link: https://blog.csdn.net/wisdomfriend/article/details/108273714

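VGGNet is a block-based network architecture proposed by the Visual Geometry Group (VGG) at the University of Oxford. In the 2014 ImageNet challenge (ILSVRC 2014) it won the localization task and placed second in the classification task. VGGNet has the following characteristics: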

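1) Uses max pooling, dropout, and ReLU.
2) Uses the concept of a block to simplify the design.
3) Convolutional layers use kernel=3 and padding=1 to preserve the input size.
4) Uses pool_size=2 with a stride of 2 (AlexNet uses pool_size=3), so each pooling layer halves the resolution.
5) Proposes a block structure in which each block is several convolutional layers followed by one max-pooling layer: the convolutional layers keep the spatial size unchanged, and the pooling layer halves it.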
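
The size claims in items 3) to 5) follow from the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. Below is a minimal sketch in plain Python, with no MXNet dependency. The helper name `conv_out` is hypothetical (it is not part of any library), and the 55-pixel input for the AlexNet-style pooling case is only an illustrative value.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer.

    Hypothetical helper for illustration; uses floor division.
    """
    return (size + 2 * padding - kernel) // stride + 1

# VGG convolution: kernel=3, padding=1, stride=1 keeps the size unchanged.
vgg_conv = conv_out(224, kernel=3, padding=1, stride=1)

# VGG pooling: pool_size=2, strides=2 halves the size.
vgg_pool = conv_out(224, kernel=2, stride=2)

# AlexNet-style overlapping pooling: pool_size=3, strides=2.
alexnet_pool = conv_out(55, kernel=3, stride=2)

print(vgg_conv, vgg_pool, alexnet_pool)  # 224 112 27
```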

The code below demonstrates this with MXNet. To install MXNet and the helper packages:

pip install d2l==0.14.3
pip install -U mxnet-cu101mkl==1.6.0.post0  
pip install gluoncv

A standard block in VGGNet:

from d2l import mxnet as d2l
from mxnet import np, npx
from mxnet.gluon import nn
npx.set_np()

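def vgg_block(num_convs, num_channels):
    """Build one VGG block: num_convs 3x3 convolutions, then a 2x2 max-pool."""
    blk = nn.Sequential()
    for _ in range(num_convs):
        # kernel_size=3 with padding=1 keeps the spatial size unchanged
        blk.add(nn.Conv2D(num_channels, kernel_size=3,
                          padding=1, activation='relu'))
    # pool_size=2 with strides=2 halves the height and width
    blk.add(nn.MaxPool2D(pool_size=2, strides=2))
    return blk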

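Comparison of VGG and AlexNet:

The convolutional part of the network connects several VGG blocks in succession, as in the figure above. The number of convolutional layers and the number of output channels of each block are the arguments passed to vgg_block. The fully connected part of the VGG network is the same as the one in AlexNet. The original VGG network has 5 convolutional blocks: the first two contain one convolutional layer each, and the last three contain two each. The first block has 64 output channels, and each subsequent block doubles the number of output channels until it reaches 512. Because the network uses 8 convolutional layers and 3 fully connected layers, it is usually called VGG-11. The structure of VGG-11 in code: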

# One (num_convs, num_channels) pair per VGG block
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
def vgg(conv_arch):
    net = nn.Sequential()
    # The convolutional part
    for (num_convs, num_channels) in conv_arch:
        net.add(vgg_block(num_convs, num_channels))
    # The fully-connected part
    net.add(nn.Dense(4096, activation='relu'), nn.Dropout(0.5),
            nn.Dense(4096, activation='relu'), nn.Dropout(0.5),
            nn.Dense(10))
    return net

net = vgg(conv_arch)
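
As a quick sanity check on the name VGG-11, the weight layers can be counted directly from `conv_arch`. This is a small sketch in plain Python; the variable names other than `conv_arch` are assumptions made for illustration, and the count of 3 dense layers is taken from the `vgg` function above.

```python
# Same architecture tuple as in the article: (num_convs, num_channels) per block.
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))

# Each block contributes num_convs convolutional layers.
num_conv_layers = sum(num_convs for num_convs, _ in conv_arch)

# The fully connected part has three Dense layers (4096, 4096, 10).
num_dense_layers = 3

# Pooling and dropout layers have no weights, so they are not counted.
num_weight_layers = num_conv_layers + num_dense_layers

print(f"{num_conv_layers} conv + {num_dense_layers} dense = VGG-{num_weight_layers}")
```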

Check the output shape of each layer:

net.initialize()
X = np.random.uniform(size=(1, 1, 224, 224))
for blk in net:
    X = blk(X)
    print(blk.name, 'output shape:\t', X.shape)

sequential1 output shape:    (1, 64, 112, 112)
sequential2 output shape:    (1, 128, 56, 56)
sequential3 output shape:    (1, 256, 28, 28)
sequential4 output shape:    (1, 512, 14, 14)
sequential5 output shape:    (1, 512, 7, 7)
dense0 output shape:         (1, 4096)
dropout0 output shape:       (1, 4096)
dense1 output shape:         (1, 4096)
dropout1 output shape:       (1, 4096)
dense2 output shape:         (1, 10)
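
The block output shapes printed above can be reproduced without running the network, because each block keeps the spatial size through its convolutions and halves it once in the pooling layer. The sketch below is plain Python; `trace_shapes` is a hypothetical helper, not an MXNet or d2l function.

```python
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))

def trace_shapes(conv_arch, in_size=224):
    """Return (channels, height, width) after each VGG block.

    Hypothetical helper: assumes size-preserving 3x3 convolutions and
    one 2x2, stride-2 max-pool per block, as in vgg_block.
    """
    shapes = []
    size = in_size
    for _, num_channels in conv_arch:
        size //= 2  # the pooling layer halves height and width
        shapes.append((num_channels, size, size))
    return shapes

shapes = trace_shapes(conv_arch)
for i, shape in enumerate(shapes, start=1):
    print(f"block {i}: {shape}")

# The first Dense layer flattens the last block's output.
channels, height, width = shapes[-1]
flat_features = channels * height * width
print("inputs to the first dense layer:", flat_features)
```

The last block yields 512 x 7 x 7, so the first fully connected layer receives 25088 input features per example.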

Load the Fashion-MNIST training data:

batch_size=128
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224)

Training code:

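# Fashion-MNIST is much simpler than ImageNet, so divide every block's
# channel count by `ratio` to reduce the computation
ratio = 4
small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch]
net = vgg(small_conv_arch)
# batch_size matches the value already used to build the data iterators
lr, num_epochs, batch_size = 0.05, 10, 128
d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr)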
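
To make the effect of `ratio` concrete, the list comprehension can be evaluated on its own. This sketch only repeats the arithmetic from the training code in plain Python; it does not build or train a network.

```python
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
ratio = 4

# Keep the number of conv layers per block, divide the channels by ratio.
small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch]

print(small_conv_arch)
# [(1, 16), (1, 32), (2, 64), (2, 128), (2, 128)]
```

The depth of the network stays at 8 convolutional layers; only the width of each block shrinks.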

On a T4 GPU, training finishes in about 30 minutes.

Training results:

loss 0.175, train acc 0.935, test acc 0.923
1814.6 examples/sec on gpu(0)

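Closing remarks:

This article was published on CSDN by 蓝色的杯子. Feel free to leave comments, and let us pursue wisdom and truth together. My email: wisdomfriend@126.com.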
