Very small (3 × 3) convolution filters
1. Introduction
- Earlier improvements used a smaller receptive window size and a smaller stride in the first convolutional layer.
- Another line of improvements trained and tested the networks densely over the whole image and over multiple scales.
- This paper addresses another important aspect of ConvNet architecture design: its depth.
2. ConvNet Configurations
2.1 Architecture
- Input: a fixed-size 224 × 224 RGB image
- Preprocessing: subtract the mean RGB value, computed on the training set, from each pixel
- Passed through a stack of convolutional layers, with a very small receptive field: 3×3
- Utilise 1×1 conv filters, which can be seen as a linear transformation of the input channels (followed by non-linearity).
- Spatial padding: chosen so that the spatial resolution is preserved after convolution (1 pixel for 3 × 3 conv. layers)
- Spatial pooling is carried out by five max-pooling layers (a 2 × 2 pixel window, with stride 2)
- Three Fully-Connected (FC) layers
- All hidden layers are equipped with the rectification (ReLU) non-linearity
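The size bookkeeping implied by the list above can be checked with a few lines of Python. This is a minimal sketch, not the paper's code: the helper names are made up for illustration, and the number of conv. layers per stage, `(2, 2, 3, 3, 3)`, is assumed to be that of the 16-layer configuration, which these notes do not list.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension of a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_size(input_size=224, convs_per_stage=(2, 2, 3, 3, 3)):
    """Return the feature-map side length after each of the five pooling layers.

    convs_per_stage is an assumption (16-layer layout); it does not change
    the result, because every padded 3x3 conv preserves the resolution.
    """
    size = input_size
    sizes = []
    for n_convs in convs_per_stage:
        for _ in range(n_convs):
            # 3x3 conv, stride 1, padding 1: resolution is preserved
            size = conv_out_size(size, kernel=3, stride=1, padding=1)
        # 2x2 max-pooling with stride 2 halves the resolution
        size = conv_out_size(size, kernel=2, stride=2)
        sizes.append(size)
    return sizes


print(trace_spatial_size())  # [112, 56, 28, 14, 7]
```

With a 224 × 224 input, the five pooling layers leave a 7 × 7 map in front of the first FC layer.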
2.2 Configurations
2.3 Discussion
1. Use very small 3 × 3 receptive fields throughout the whole net
A stack of two 3×3 conv. layers (without spatial pooling in between) has an effective receptive field of 5×5;
Three 3×3 conv. layers have a 7 × 7 effective receptive field
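- Compared with a single 7 × 7 layer, a stack of three 3 × 3 layers incorporates three non-linear rectification layers instead of one, which makes the decision function more discriminative.
- It also decreases the number of parameters: with C channels in and out, three 3 × 3 layers have 3(3²C²) = 27C² weights, while a single 7 × 7 layer has 7²C² = 49C².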
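The 5 × 5 and 7 × 7 figures follow from a simple recurrence: each layer widens the receptive field by (kernel − 1) times the product of the strides of the layers before it. A minimal sketch, with a hypothetical function name and layers given as `(kernel, stride)` pairs:

```python
def effective_receptive_field(layers):
    """Side length of the input region seen by one output unit.

    layers: iterable of (kernel, stride) pairs, listed from input to output.
    """
    rf = 1    # a single unit sees itself
    jump = 1  # distance, in input pixels, between neighbouring units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


two_3x3 = effective_receptive_field([(3, 1), (3, 1)])
three_3x3 = effective_receptive_field([(3, 1), (3, 1), (3, 1)])
one_7x7 = effective_receptive_field([(7, 1)])
print(two_3x3, three_3x3, one_7x7)  # 5 7 7
```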
1 × 1 conv. layers
A way to increase the non-linearity of the decision function without affecting the receptive fields of the conv. layers.
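To make the "linear transformation of the input channels" concrete, here is a small pure-Python sketch of a 1 × 1 convolution followed by ReLU. The function name and the nested-list layout (height × width × channels) are assumptions made for illustration; a real network would use a tensor library.

```python
def conv1x1_relu(feature_map, weights, bias):
    """Apply a 1x1 convolution followed by ReLU.

    feature_map: H x W x C_in nested lists
    weights:     C_out x C_in (one row per output channel)
    bias:        C_out values
    """
    out = []
    for row in feature_map:
        out_row = []
        for pixel in row:
            # The same linear map is applied to the channel vector of every
            # pixel; neighbouring pixels are never mixed.
            out_row.append([
                max(0.0, sum(w * x for w, x in zip(w_row, pixel)) + b)
                for w_row, b in zip(weights, bias)
            ])
        out.append(out_row)
    return out


fmap = [[[1.0, 2.0], [3.0, -1.0]]]   # 1 x 2 map with 2 channels
weights = [[1.0, -1.0], [0.5, 0.5]]  # 2 output channels
bias = [0.0, 0.0]
print(conv1x1_relu(fmap, weights, bias))  # [[[0.0, 1.5], [4.0, 1.0]]]
```

Each output pixel depends only on the input pixel at the same location, which is why the receptive field is unchanged.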
Classification Framework
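Characteristics of VGG
Small convolution kernels. All kernels are replaced with 3x3 (1x1 is used only rarely), with a relatively small stride;
Computational cost
The effect on the parameter count is small
The larger the kernel, the higher the computational cost
Two stacked 3x3 convolutions have a receptive field equivalent to one 5x5 convolution, and three stacked 3x3 convolutions have a receptive field equivalent to one 7x7 convolution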
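The cost comparison can be made explicit by counting weights and multiply-accumulate operations (MACs). The sketch below is only a rough count under stated assumptions: equal input and output channel counts, stride 1, resolution-preserving padding, and biases ignored. The function names and the example sizes (64 channels, 56 × 56 map) are chosen for illustration.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in one conv layer (biases ignored)."""
    return kernel * kernel * c_in * c_out


def stack_cost(kernel, n_layers, channels, height, width):
    """Weights and MACs for n_layers identical conv layers.

    Assumes stride 1 and resolution-preserving padding, so every layer
    produces a height x width map.
    """
    weights = n_layers * conv_weights(kernel, channels, channels)
    macs = weights * height * width  # each weight is used once per position
    return weights, macs


C, H, W = 64, 56, 56  # example sizes, chosen arbitrarily
for big, n_small in ((5, 2), (7, 3)):
    w_big, m_big = stack_cost(big, 1, C, H, W)
    w_small, m_small = stack_cost(3, n_small, C, H, W)
    print(f"{n_small} x 3x3 vs 1 x {big}x{big}: "
          f"weights {w_small} vs {w_big}, MAC ratio {m_small / m_big:.2f}")
```

Under these assumptions two 3x3 layers need 18C² weights against 25C² for one 5x5 layer, and three 3x3 layers need 27C² against 49C² for one 7x7 layer; the MAC counts scale in the same proportion.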
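Receptive field
Two stacked 3x3 convolutions have a receptive field equivalent to one 5x5 convolution, and three stacked 3x3 convolutions have a receptive field equivalent to one 7x7 convolution.
Backpropagation
The size of a neuron's receptive field relative to the previous layer, or even to the input layer, determines how many neurons a parameter update affects. In segmentation problems the kernel size has some influence on the result. In the three-layer conv3x3 stack in the figure above, the last neuron is computed from 7 neurons of the first layer's input (viewed along one dimension); in other words, during backpropagation this layer affects the first 7 parameters of the first conv3x3 layer. Going back the same number of layers from the output, a larger kernel affects more of the preceding input neurons when parameters are updated.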
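Small pooling kernels. Compared with the 3x3 pooling kernels of AlexNet, VGG uses 2x2 pooling kernels throughout
This works as a kind of feature engineering: pooling gradually discards local information
Deeper layers, wider feature maps
Building on the two points above, the convolutions focus on increasing the number of channels while pooling focuses on shrinking width and height, so the architecture becomes deeper and wider while the growth in computation slows down;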
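A rough per-stage count shows how the two effects offset each other. The sketch assumes the 16-layer layout (2, 2, 3, 3, 3 conv layers with 64, 128, 256, 512, 512 output channels), which is not listed in these notes, and counts only the MACs of the 3x3 conv layers, ignoring biases, pooling and the FC layers. The function name is hypothetical.

```python
def vgg_stage_summary(input_size=224, in_channels=3,
                      stages=((2, 64), (2, 128), (3, 256), (3, 512), (3, 512))):
    """Per-stage feature-map size, channel count and conv MACs.

    stages: (number of 3x3 conv layers, output channels) per stage; the
    default is an assumed 16-layer layout. A 2x2/stride-2 max-pool follows
    every stage and halves the side length.
    """
    size, c_in = input_size, in_channels
    summary = []
    for n_convs, c_out in stages:
        macs = 0
        for _ in range(n_convs):
            macs += 3 * 3 * c_in * c_out * size * size
            c_in = c_out
        summary.append({"size": size, "channels": c_out, "macs": macs})
        size //= 2  # effect of the pooling layer that closes the stage
    return summary


for stage in vgg_stage_summary():
    print(f"{stage['size']:>3} x {stage['size']:<3} "
          f"channels={stage['channels']:<3} GMACs={stage['macs'] / 1e9:.2f}")
```

Doubling the channels of a 3x3 layer multiplies its cost by four, and halving the width and height divides it by four, so the per-stage cost stays within the same order of magnitude even though the channel count grows from 64 to 512.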
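Fully connected layers converted to convolutions
At test time, the three fully connected layers used in training are replaced by three convolutional layers that reuse the trained parameters. The resulting fully convolutional network no longer has the fixed-size constraint of the fully connected layers, so it can accept inputs of arbitrary width or height.
The size of the input image no longer needs to be considered.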
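The conversion can be illustrated on a toy example: an FC layer that reads a whole kernel_h × kernel_w feature map is the same computation as a convolution with a kernel of that size, so the same weights can slide over a larger map. In the paper the first FC layer becomes a 7 × 7 conv. layer and the last two become 1 × 1 conv. layers; the sketch below uses a 2 × 2 window and made-up weights, and the function names are hypothetical.

```python
def fc_forward(features, weights, bias):
    """Fully connected layer on an H x W x C map, flattened row by row."""
    flat = [v for row in features for pixel in row for v in pixel]
    return [sum(w * x for w, x in zip(w_row, flat)) + b
            for w_row, b in zip(weights, bias)]


def fc_as_conv(features, weights, bias, kernel_h, kernel_w):
    """Reuse the FC weights as a kernel_h x kernel_w convolution
    (stride 1, no padding), producing a map of score vectors."""
    height, width = len(features), len(features[0])
    out = []
    for top in range(height - kernel_h + 1):
        out_row = []
        for left in range(width - kernel_w + 1):
            window = [row[left:left + kernel_w]
                      for row in features[top:top + kernel_h]]
            out_row.append(fc_forward(window, weights, bias))
        out.append(out_row)
    return out


weights = [[1.0, 0.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.5]]  # 2 outputs, 2x2x1 input
bias = [0.0, 1.0]
train_size = [[[1.0], [2.0]], [[3.0], [4.0]]]          # 2 x 2 map
larger = [[[1.0], [2.0], [0.0]], [[3.0], [4.0], [1.0]]]  # 2 x 3 map

print(fc_forward(train_size, weights, bias))        # [5.0, 6.0]
print(fc_as_conv(train_size, weights, bias, 2, 2))  # [[[5.0, 6.0]]]
print(fc_as_conv(larger, weights, bias, 2, 2))      # [[[5.0, 6.0], [3.0, 4.5]]]
```

On an input of the training size the convolution returns a single score vector identical to the FC output; on a larger input it returns a map of score vectors, one per window position.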