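VGGNet is a block-based network architecture proposed by the Visual Geometry Group (VGG) at the University of Oxford, UK. In the 2014 ImageNet challenge (ILSVRC 2014) it won the localization task and placed second in the classification task. VGGNet has the following characteristics: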
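1) Uses max pooling, dropout, and ReLU
2) Uses the concept of a block to simplify the design
3) Convolutional layers use kernel_size=3 and padding=1 to preserve the input size
4) Uses pool_size=2 (AlexNet uses pool_size=3); each pooling layer halves the resolution
5) Proposes a block structure in which each block is several convolutional layers followed by one max-pooling layer; the convolutional layers keep the spatial size unchanged and the pooling layer halves it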
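The size bookkeeping in points 3 to 5 follows from the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. The sketch below is plain Python with no MXNet dependency; the helper name out_size is made up for illustration.

```python
def out_size(n, kernel, padding=0, stride=1):
    """Spatial output size of a conv/pool layer (hypothetical helper)."""
    return (n + 2 * padding - kernel) // stride + 1

# A 3x3 convolution with padding 1 and stride 1 keeps the size unchanged.
same = out_size(224, kernel=3, padding=1, stride=1)
# A 2x2 max pooling with stride 2 halves the size.
halved = out_size(224, kernel=2, stride=2)
print(same, halved)  # 224 112
```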
The code below uses MXNet for the demonstration. To install MXNet:
pip install d2l==0.14.3
pip install -U mxnet-cu101mkl==1.6.0.post0
pip install gluoncv
A standard VGGNet block:
from d2l import mxnet as d2l
from mxnet import np, npx
from mxnet.gluon import nn
npx.set_np()
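def vgg_block(num_convs, num_channels):
    blk = nn.Sequential()
    # num_convs 3x3 convolutions; padding=1 keeps height and width unchanged
    for _ in range(num_convs):
        blk.add(nn.Conv2D(num_channels, kernel_size=3,
                          padding=1, activation='relu'))
    # 2x2 max pooling with stride 2 halves height and width
    blk.add(nn.MaxPool2D(pool_size=2, strides=2))
    return blk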
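Comparing the VGG network with AlexNet:
The convolutional part of the network chains together the VGG blocks shown in the figure. The number of convolutional layers and the number of output channels of each block are the arguments passed to vgg_block. The fully connected part of the VGG network is the same as the one in AlexNet. The original VGG network has 5 convolutional blocks: the first two contain one convolutional layer each, and the last three contain two convolutional layers each. The first block has 64 output channels, and each subsequent block doubles the number of output channels until it reaches 512. Because the network uses 8 convolutional layers and 3 fully connected layers, it is usually called VGG-11. The VGG-11 architecture in code: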
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
def vgg(conv_arch):
    net = nn.Sequential()
    # The convolutional part
    for (num_convs, num_channels) in conv_arch:
        net.add(vgg_block(num_convs, num_channels))
    # The fully-connected part (10 outputs for the 10 Fashion-MNIST classes)
    net.add(nn.Dense(4096, activation='relu'), nn.Dropout(0.5),
            nn.Dense(4096, activation='relu'), nn.Dropout(0.5),
            nn.Dense(10))
    return net
net = vgg(conv_arch)
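As a quick check of the layer count described above, the depth can be derived directly from conv_arch. This is a small sketch in plain Python; num_fc_layers is an illustrative name, set to 3 to match the three Dense layers in vgg.

```python
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))

# Sum the convolutional layers over all blocks.
num_conv_layers = sum(num_convs for num_convs, _ in conv_arch)
# Three Dense layers in the fully connected part (illustrative name).
num_fc_layers = 3
depth = num_conv_layers + num_fc_layers
print(num_conv_layers, depth)  # 8 11
```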
Check the output shape of each layer:
net.initialize()
X = np.random.uniform(size=(1, 1, 224, 224))
for blk in net:
    X = blk(X)
    print(blk.name, 'output shape:\t', X.shape)
sequential1 output shape: (1, 64, 112, 112)
sequential2 output shape: (1, 128, 56, 56)
sequential3 output shape: (1, 256, 28, 28)
sequential4 output shape: (1, 512, 14, 14)
sequential5 output shape: (1, 512, 7, 7)
dense0 output shape: (1, 4096)
dropout0 output shape: (1, 4096)
dense1 output shape: (1, 4096)
dropout1 output shape: (1, 4096)
dense2 output shape: (1, 10)
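The block shapes printed above can be predicted without running the network: each VGG block sets the channel count and halves the height and width. A minimal sketch in plain Python, where trace_shapes is a hypothetical helper:

```python
def trace_shapes(conv_arch, height=224, width=224):
    """Return (channels, height, width) after each VGG block."""
    shapes = []
    for _, num_channels in conv_arch:
        # Convolutions keep the size; the pooling layer halves it.
        height, width = height // 2, width // 2
        shapes.append((num_channels, height, width))
    return shapes

conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
shapes = trace_shapes(conv_arch)
channels, height, width = shapes[-1]
# Number of features flattened into the first Dense layer.
flat_features = channels * height * width
print(shapes[-1], flat_features)  # (512, 7, 7) 25088
```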
Load the Fashion-MNIST training data:
batch_size=128
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224)
Training code:
ratio = 4
small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch]
net = vgg(small_conv_arch)
lr, num_epochs, batch_size = 0.05, 10, 128
d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr)
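To see what the ratio of 4 does, the narrowed architecture can be computed on its own. The sketch below also counts convolutional parameters, assuming a single input channel as in Fashion-MNIST; conv_params is a hypothetical helper and not part of d2l.

```python
def conv_params(conv_arch, in_channels=1, kernel_size=3):
    """Count weights and biases of all conv layers (hypothetical helper)."""
    total = 0
    for num_convs, num_channels in conv_arch:
        for _ in range(num_convs):
            weights = kernel_size * kernel_size * in_channels * num_channels
            total += weights + num_channels  # one bias per output channel
            in_channels = num_channels
    return total

conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
ratio = 4
small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch]
print(small_conv_arch)
# [(1, 16), (1, 32), (2, 64), (2, 128), (2, 128)]
print(conv_params(conv_arch), conv_params(small_conv_arch))
```

Dividing every channel count by 4 cuts the convolutional parameter count by roughly a factor of 16, since most layers scale with the product of input and output channels.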
On a T4 GPU, training finishes in about 30 minutes.
Training results:
loss 0.175, train acc 0.935, test acc 0.923
1814.6 examples/sec on gpu(0)
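Closing words:
This article is published on CSDN by 蓝色的杯子. Feel free to leave a comment, and let us love wisdom and seek truth together. My email is wisdomfriend@126.com.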