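In image classification, the output of the last CNN layer has shape [batch_size, img_height, img_width, channels]. The usual approach is to add a flatten layer, which reshapes it to [batch_size, h * w * channels], followed by at least one FC layer. The main drawback is that the model ends up with a very large number of parameters and overfits easily.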
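To get a rough sense of how large the flatten-plus-FC head can become, the parameter counts can be worked out by hand. The sketch below assumes an 8 x 8 x 2048 feature map and 1000 classes (illustrative sizes), and `dense_params` is a hypothetical helper written for this example, not a Keras function.

```python
# Illustrative sizes (assumptions): an 8 x 8 x 2048 feature map and 1000 classes.
height, width, channels = 8, 8, 2048
num_classes = 1000


def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a single fully connected layer."""
    return n_inputs * n_outputs + n_outputs


# Flatten, then FC: every spatial position of every channel gets its own weights.
flatten_fc = dense_params(height * width * channels, num_classes)

# FC applied to a single value per channel, which is what remains
# once the spatial dimensions have been averaged away.
pooled_fc = dense_params(channels, num_classes)

print(f"flatten + FC: {flatten_fc:,} parameters")
print(f"pooled + FC:  {pooled_fc:,} parameters")
print(f"ratio: {flatten_fc / pooled_fc:.1f}x")
```

Under these assumptions the flattened head is roughly h * w times larger, since the only difference is the number of inputs feeding the FC layer.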
To address this, researchers proposed using a pooling layer in place of the final FC layer. The following Keras examples illustrate the idea:
Method 1: Use GlobalAveragePooling2D
from keras.layers import Dense, Input, Conv2D
from keras.layers import MaxPooling2D, GlobalAveragePooling2D
x = Input(shape=[8, 8, 2048])
# Assume the last CNN layer outputs a tensor of shape (None, 8, 8, 2048)
x = GlobalAveragePooling2D(name='avg_pool')(x) # shape=(?, 2048)
# Take the mean of each feature map as the output, in place of a fully connected layer
x = Dense(1000, activation='softmax', name='predictions')(x) # shape=(?, 1000)
# 1000 is num_classes
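What the pooling step computes can be checked without Keras. The following is a minimal NumPy sketch, assuming channels-last layout and a made-up batch of 4 random feature maps; it only mirrors the arithmetic of global average pooling and is not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical input: (batch, height, width, channels), channels last.
features = rng.standard_normal((4, 8, 8, 2048))

# Global average pooling: one mean per feature map,
# i.e. average over the height and width axes.
pooled = features.mean(axis=(1, 2))

print(pooled.shape)  # (4, 2048)

# Each output entry is the mean of one 8 x 8 feature map.
first_map_mean = features[0, :, :, 0].mean()
print(np.isclose(pooled[0, 0], first_map_mean))
```

The pooling step itself has no trainable parameters, so the only weights in this head are those of the final Dense layer.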
Method 2: Choose pool_size appropriately
from keras.layers import Dense, Input, Conv2D
from keras.layers import AveragePooling2D, Flatten
x = Input(shape=[8, 8, 2048])
x = AveragePooling2D(pool_size=(8, 8), padding='valid')(x)
# Set pool_size to the feature-map size so that the output is (?, 1, 1, 2048)
x = Conv2D(1000, (1, 1), padding='same')(x) # shape=(?, 1, 1, 1000)
x = Flatten()(x) # drop the two singleton spatial axes; shape=(?, 1000)
x = Dense(1000, activation='softmax', name='predictions')(x) # shape=(?, 1000)
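The two methods are closely related: average pooling with a window that covers the whole feature map gives the same values as global average pooling, and a 1 x 1 convolution on a 1 x 1 map is the same linear map as a fully connected layer. The NumPy sketch below checks this numerically. The small sizes (16 channels, 10 outputs) and the random `kernel` and `bias` are assumptions made for illustration, and the code imitates the arithmetic rather than calling Keras.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small hypothetical sizes to keep the check fast.
batch, size, in_ch, out_ch = 2, 8, 16, 10
features = rng.standard_normal((batch, size, size, in_ch))
kernel = rng.standard_normal((in_ch, out_ch))  # 1 x 1 conv kernel without its spatial axes
bias = rng.standard_normal(out_ch)

# Average pooling with an 8 x 8 window and 'valid' padding on an 8 x 8 map:
# there is exactly one window, so the output is (batch, 1, 1, in_ch).
windows = features.reshape(batch, 1, size, 1, size, in_ch)
avg_pooled = windows.mean(axis=(2, 4))

# 1 x 1 convolution: the same linear map applied at every spatial position.
conv_out = np.einsum('bhwc,co->bhwo', avg_pooled, kernel) + bias
logits_conv = conv_out.squeeze(axis=(1, 2))  # (batch, out_ch)

# Global average pooling followed by a dense layer with the same weights.
logits_dense = features.mean(axis=(1, 2)) @ kernel + bias

print(avg_pooled.shape, logits_conv.shape)
print(np.allclose(logits_conv, logits_dense))
```

The explicit pool_size only matches global pooling when it equals the spatial size of the incoming feature map, so it has to be updated whenever the input resolution changes.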