In this article we build VGG16, an early classic network, this time on the somewhat more complex CIFAR-10 dataset. TensorFlow likewise provides an official way to load this dataset, tf.keras.datasets.cifar10.load_data(). Using the custom loader defined later in this article, the data is read as follows:
(train_images, train_labels, test_images, test_labels) = load_CIFAR('/home/user/Documents/dataset/Cifar-10')
If you load the data with the official loader instead, the labels come back as integer class indices and must be converted to one-hot format with tf.keras.utils.to_categorical(labels, 10). The load_CIFAR function provided below already returns one-hot labels, so no further conversion is needed there.
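As a quick illustration of the conversion (a minimal sketch with made-up label values):
import tensorflow as tf

labels = [3, 0]  # hypothetical integer class indices
one_hot = tf.keras.utils.to_categorical(labels, 10)
# one_hot[0] -> [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
# one_hot[1] -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]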
On first use the official loader downloads the dataset automatically, but the dataset totals about 200 MB and the official download is very slow from within China, so I recommend downloading it from the official website instead. The extracted archive does not contain individual image files; the data is stored in binary (pickled) batch files, so here is the loading code I use:
import pickle as p
import numpy as np

def load_CIFAR_batch(filename):
    """Load a single batch of CIFAR-10 from its pickled binary file."""
    with open(filename, 'rb') as f:
        datadict = p.load(f, encoding='iso-8859-1')
    X = datadict['data']
    Y = datadict['labels']
    X = X.reshape(10000, 3, 32, 32)  # stored as NCHW: 10000 images, 3 channels, 32x32 pixels
    Y = np.array(Y)
    return X, Y

def load_CIFAR(Foldername):
    train_data = np.zeros([50000, 32, 32, 3], dtype=np.float32)
    train_label = np.zeros([50000, 10], dtype=np.float32)
    test_data = np.zeros([10000, 32, 32, 3], dtype=np.float32)
    test_label = np.zeros([10000, 10], dtype=np.float32)

    # The training set is split across five batch files of 10000 images each.
    for sample in range(5):
        X, Y = load_CIFAR_batch(Foldername + "/data_batch_" + str(sample + 1))
        # Reorder from NCHW to NHWC, the layout Keras expects by default.
        for i in range(3):
            train_data[10000 * sample:10000 * (sample + 1), :, :, i] = X[:, i, :, :]
        # Convert integer labels to one-hot vectors.
        for i in range(10000):
            train_label[i + 10000 * sample][Y[i]] = 1

    X, Y = load_CIFAR_batch(Foldername + "/test_batch")
    for i in range(3):
        test_data[:, :, :, i] = X[:, i, :, :]
    for i in range(10000):
        test_label[i][Y[i]] = 1

    return train_data, train_label, test_data, test_label
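As an aside, the per-channel and per-label loops above can be collapsed into vectorized NumPy operations with identical results. A self-contained sketch with hypothetical data standing in for one loaded batch:
import numpy as np

X = np.zeros((10000, 3, 32, 32), dtype=np.uint8)  # one batch in NCHW layout
Y = np.random.randint(0, 10, size=10000)          # integer labels
batch_data = X.transpose(0, 2, 3, 1)              # NCHW -> NHWC in one step
batch_label = np.zeros((10000, 10), dtype=np.float32)
batch_label[np.arange(10000), Y] = 1              # one-hot encoding in one step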
There are already plenty of walkthroughs of the VGG paper online, so I will not go over the architecture and parameters in detail here; the network can be assembled simply, as shown below. Since our data is CIFAR-10, the final output dimension of the network is set to 10, and the hyperparameters follow the original paper: weight_decay = 5e-4 and dropout_rate = 0.5.
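The weight decay enters through Keras' L2 kernel regularizer, which adds weight_decay * sum(w ** 2) to the loss for every layer it is attached to. A minimal standalone check of the penalty value:
import tensorflow as tf
from tensorflow.keras import regularizers

reg = regularizers.l2(5e-4)
w = tf.ones((3, 3))
print(float(reg(w)))  # 5e-4 * sum(w**2) = 5e-4 * 9 = 0.0045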
from tensorflow.keras import models, regularizers
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

weight_decay = 5e-4

def VGG16():
    model = models.Sequential()
    # Block 1: two 64-channel convolutions, then 2x2 max pooling (32x32 -> 16x16)
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3), kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(MaxPooling2D((2, 2)))
    # Block 2: two 128-channel convolutions (16x16 -> 8x8)
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(MaxPooling2D((2, 2)))
    # Block 3: three 256-channel convolutions (8x8 -> 4x4)
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(MaxPooling2D((2, 2)))
    # Block 4: three 512-channel convolutions (4x4 -> 2x2)
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(MaxPooling2D((2, 2)))
    # Block 5: three 512-channel convolutions, no pooling here (stays 2x2)
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
    # Classifier: flatten the 2*2*512 = 2048 features into two fully connected layers
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation='softmax'))
    return model
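To confirm the shapes, in particular that the feature map reaching Flatten really is 2*2*512 = 2048, you can instantiate the model and print its summary:
model = VGG16()
model.summary()  # the Flatten layer should show output shape (None, 2048)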
Next, let's set up a varying learning rate. We use the callbacks parameter of model.fit: as the name suggests, we write a callback that lowers the learning rate as the training epoch grows. The original VGG paper trains with momentum SGD, starting from a learning rate of 0.01 and dropping it to one tenth each time. Here we train the network for 50 epochs (epoch_num = 50): the first 20 use 0.01, the middle 20 use 0.001, and the last 10 use 0.0001.
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import LearningRateScheduler

learning_rate = 0.01
epoch_num = 50

def scheduler(epoch):
    # Epochs [0, 20): initial rate; [20, 40): one tenth; [40, 50): one hundredth.
    if epoch < epoch_num * 0.4:
        return learning_rate
    if epoch < epoch_num * 0.8:
        return learning_rate * 0.1
    return learning_rate * 0.01

sgd = optimizers.SGD(lr=learning_rate, momentum=0.9, nesterov=True)
change_lr = LearningRateScheduler(scheduler)
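A quick sanity check of the schedule boundaries (epoch_num * 0.4 = 20 and epoch_num * 0.8 = 40, so the thresholds match the 20/20/10 split described above):
for e in [0, 19, 20, 39, 40, 49]:
    print(e, scheduler(e))
# epochs 0-19 -> 0.01, epochs 20-39 -> 0.001, epochs 40-49 -> 0.0001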
Finally, pass change_lr to model.fit through the callbacks argument when training:
model = VGG16()
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

batch_size = 128  # the text does not specify a batch size; 128 is an assumed value
model.fit(train_images, train_labels,
          batch_size=batch_size,
          epochs=epoch_num,
          callbacks=[change_lr],
          validation_data=(test_images, test_labels))
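After training, the test accuracy can also be computed directly with model.evaluate:
loss, acc = model.evaluate(test_images, test_labels, batch_size=batch_size)
print('test accuracy:', acc)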
The final test accuracy is around 83%, which is admittedly not a great result: for one thing, batch normalization, an important technique, appeared only after VGG; for another, we have not used any data augmentation. Both points will be covered in the next tutorial, on ResNet.
The complete code can be found on my GitHub:
https://github.com/Apm5/tensorflow_2.0_tutorial/blob/master/CNN/VGG16.py