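Innovations
VGG showed that stacking small 3x3 convolution filters with 2x2 max-pooling layers makes it possible to push network depth to 16-19 layers, which significantly improves accuracy on large-scale image recognition.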
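To illustrate why small stacked filters are attractive, here is a minimal sketch that compares a stack of 3x3 convolutions with a single larger filter covering the same receptive field. The function names and the channel width C = 256 are hypothetical choices for illustration, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    """Weight count (biases ignored) for stacked convs with `channels` in and out."""
    return num_layers * kernel * kernel * channels * channels


C = 256  # hypothetical channel width, chosen only for illustration

rf_two = stacked_receptive_field(2)    # two 3x3 layers see a 5x5 window
rf_three = stacked_receptive_field(3)  # three 3x3 layers see a 7x7 window

three_3x3 = conv_weights(3, C, num_layers=3)  # 27 * C^2 weights
one_7x7 = conv_weights(7, C)                  # 49 * C^2 weights

print("receptive field of 2 stacked 3x3 convs:", rf_two)
print("receptive field of 3 stacked 3x3 convs:", rf_three)
print("weights, three 3x3 layers:", three_3x3)
print("weights, one 7x7 layer:   ", one_7x7)
```

Under these assumptions the stack covers the same window with fewer weights, and it also applies a ReLU after every layer rather than once.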
Network architecture
Here, configurations D and E correspond to the VGG16 and VGG19 networks, respectively.
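Parameter count and accuracy comparison
Parameter count
Although the networks get progressively deeper from A to E, the parameter count grows only slightly. The reason is that the fully connected layers, which hold most of the parameters, are identical in every configuration.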
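The claim above can be checked with a short calculation. The sketch below assumes the layer lists for configurations A, D and E follow the original VGG configuration table (numbers are output channels of 3x3 convolutions, "M" is a max-pooling layer), a 224x224x3 input and a 1000-class output. The names CONFIGS, conv_params and fc_params are hypothetical helpers, not part of any library.

```python
# Layer lists assumed from the VGG configuration table.
CONFIGS = {
    "A": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "D": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"],
    "E": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
          512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}


def conv_params(config, in_channels=3, kernel=3):
    """Weights plus biases of all convolution layers in a configuration."""
    total = 0
    for layer in config:
        if layer == "M":  # pooling layers have no parameters
            continue
        total += kernel * kernel * in_channels * layer + layer
        in_channels = layer
    return total


def fc_params(flatten_size=7 * 7 * 512, hidden=4096, classes=1000):
    """Weights plus biases of the three fully connected layers."""
    return ((flatten_size * hidden + hidden)
            + (hidden * hidden + hidden)
            + (hidden * classes + classes))


totals = {}
for name, config in CONFIGS.items():
    conv, fc = conv_params(config), fc_params()
    totals[name] = conv + fc
    print(f"{name}: conv={conv:,} fc={fc:,} total={conv + fc:,} "
          f"(fc share {fc / (conv + fc):.1%})")
```

In this calculation the fully connected part is the same for all three configurations, so the total changes only through the comparatively small convolutional part.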
Accuracy
As can be seen, VGG16 has a slightly lower error rate.
Code implementation
Taking the VGG16 network as an example:
# Imports assume the Keras API bundled with TensorFlow 2.x
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

# Input: 224,224,3
image_input = Input(shape=(224, 224, 3))

# Block 1 output: 112,112,64
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(image_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

# Block 2 output: 56,56,128
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

# Block 3 output: 28,28,256
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

# Block 4 output: 14,14,512
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

# Block 5 output: 7,7,512
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

# Classifier: flatten 7,7,512 into a vector of 25088, then three dense layers
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(1000, activation='softmax', name='predictions')(x)

# Build the model; summary() prints the parameter counts
model = Model(image_input, x, name='vgg16')
model.summary()
Parameter count:
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
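The feature-map shapes noted in the code comments can be traced without building the model. This is a minimal sketch in plain Python, assuming 'same' padding for every 3x3 convolution and a 2x2, stride-2 max pool at the end of each block; trace_vgg16_shapes is a hypothetical helper name.

```python
def trace_vgg16_shapes(height=224, width=224):
    """Return the (height, width, channels) shape after the input and each block."""
    # (number of 3x3 convs, filters) for the five VGG16 blocks
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    shapes = [(height, width, 3)]
    for _num_convs, filters in blocks:
        # 'same' convolutions keep height and width; only the channel count changes.
        # The 2x2 max pool with stride 2 then halves height and width.
        height, width = height // 2, width // 2
        shapes.append((height, width, filters))
    return shapes


shapes = trace_vgg16_shapes()
for index, shape in enumerate(shapes):
    label = "input" if index == 0 else f"block{index}"
    print(label, shape)

# Size of the vector that Flatten feeds into the first dense layer
h, w, c = shapes[-1]
flatten_size = h * w * c
print("flatten size:", flatten_size)
```

The flatten size is what makes the first dense layer so large: it connects every one of those values to 4096 units.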