The previous post covered learning rate decay and its TensorFlow 1.x implementation on MNIST; this post focuses on the implementation under version 2.0.
Prerequisites:
On a freshly configured machine (Windows 10) without Keras installed, run in cmd:
pip install keras==(the matching version)
To check which Keras version matches your TensorFlow version, see: https://docs.floydhub.com/guides/environments/
In Keras I did not find a callback that directly implements exponential decay particularly well, so I implemented one on top of the Callback class.
Exponential decay means the learning rate keeps decreasing as an exponential function of the epoch.
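In formula form, lr(epoch) = lr_base · decay_rate^epoch, optionally floored at a minimum value. The schedule can be sketched in a few lines (the names here are illustrative, not part of the training script):

```python
# Exponential decay: each epoch multiplies the learning rate by decay_rate,
# and the result never drops below min_lr.
def exponent_lr(epoch, lr_base=1e-3, decay_rate=0.9, min_lr=1e-6):
    return max(lr_base * decay_rate ** epoch, min_lr)

print([round(exponent_lr(e), 6) for e in range(4)])
# the rate shrinks geometrically: 0.001, 0.0009, 0.00081, 0.000729
```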
TensorFlow 2.0 introduces many changes relative to 1.x, so the MNIST training script needs a series of adjustments to the 1.x version. The code is as follows:
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras import backend as K
from keras.layers import Flatten, Conv2D, Dropout, Input, Dense, MaxPooling2D
from keras.models import Model
def exponent(global_epoch,
             learning_rate_base,
             decay_rate,
             min_learn_rate=0,
             ):
    # lr = base * decay_rate ** global_epoch, floored at min_learn_rate
    learning_rate = learning_rate_base * pow(decay_rate, global_epoch)
    learning_rate = max(learning_rate, min_learn_rate)
    return learning_rate
class ExponentDecayScheduler(keras.callbacks.Callback):
    """
    Inherits from Callback to schedule the learning rate.
    """
    def __init__(self,
                 learning_rate_base,
                 decay_rate,
                 global_epoch_init=0,
                 min_learn_rate=0,
                 verbose=0):
        super(ExponentDecayScheduler, self).__init__()
        # base learning rate
        self.learning_rate_base = learning_rate_base
        # global epoch counter, starting value
        self.global_epoch = global_epoch_init
        self.decay_rate = decay_rate
        # verbosity flag
        self.verbose = verbose
        self.min_learn_rate = min_learn_rate
        # learning_rates records the lr after every update, for plotting later
        self.learning_rates = []

    def on_epoch_end(self, epoch, logs=None):
        self.global_epoch = self.global_epoch + 1
        lr = K.get_value(self.model.optimizer.lr)
        self.learning_rates.append(lr)

    # update the learning rate at the start of each epoch
    def on_epoch_begin(self, epoch, logs=None):
        lr = exponent(global_epoch=self.global_epoch,
                      learning_rate_base=self.learning_rate_base,
                      decay_rate=self.decay_rate,
                      min_learn_rate=self.min_learn_rate)
        K.set_value(self.model.optimizer.lr, lr)
        if self.verbose > 0:
            print('\nEpoch %05d: setting learning '
                  'rate to %s.' % (self.global_epoch + 1, lr))
# load the MNIST handwritten-digit dataset
mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
#-----------------------------#
#   build the model
#-----------------------------#
inputs = Input([28, 28, 1])
x = Conv2D(32, kernel_size=5, padding='same', activation="relu")(inputs)
x = MaxPooling2D(pool_size=2, strides=2, padding='same')(x)
x = Conv2D(64, kernel_size=5, padding='same', activation="relu")(x)
x = MaxPooling2D(pool_size=2, strides=2, padding='same')(x)
x = Flatten()(x)
x = Dense(1024)(x)
x = Dense(256)(x)
out = Dense(10, activation='softmax')(x)
model = Model(inputs, out)
# set the optimizer, loss, and accuracy metric
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# training parameters
epochs = 10
init_epoch = 0
# number of samples per batch
batch_size = 31
# initial (maximum) learning rate
learning_rate_base = 1e-3
sample_count = len(x_train)
# learning rate scheduler
exponent_lr = ExponentDecayScheduler(learning_rate_base=learning_rate_base,
                                     global_epoch_init=init_epoch,
                                     decay_rate=0.9,
                                     min_learn_rate=1e-6
                                     )
# train with fit
model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
          verbose=1, callbacks=[exponent_lr])

plt.plot(exponent_lr.learning_rates)
plt.xlabel('Step', fontsize=20)
plt.ylabel('lr', fontsize=20)
plt.axis([0, epochs, 0, learning_rate_base * 1.1])
plt.xticks(np.arange(0, epochs, 1))
plt.grid()
plt.title('learning rate decay with exponent', fontsize=20)
plt.show()
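As a quick sanity check of the callback logic, its epoch bookkeeping can be simulated in plain Python, without Keras: `on_epoch_begin` computes and sets the rate, and `on_epoch_end` advances the counter and records the rate that was active. A sketch under the same hyperparameters as above (all names here are illustrative):

```python
base, decay, floor, epochs = 1e-3, 0.9, 1e-6, 10

recorded = []       # plays the role of self.learning_rates
global_epoch = 0
for _ in range(epochs):
    # on_epoch_begin: compute and "set" the optimizer's lr for this epoch
    lr = max(base * decay ** global_epoch, floor)
    # on_epoch_end: advance the counter and record the active lr
    global_epoch += 1
    recorded.append(lr)

print(recorded[0], recorded[-1])
# first epoch trains at 1e-3; the tenth at 1e-3 * 0.9**9 ≈ 3.87e-4
```

This matches the curve the plt.plot call above draws: ten points decaying geometrically from the base rate.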
Note that with TensorFlow 2.x the standalone MNIST module can no longer be imported: TensorFlow 2.x bundles the dataset into Keras.
So we load it with:
mint = tf.keras.datasets.mnist
(x_, y_), (x_1, y_1) = mint.load_data()
Run the script and observe the result:
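As an aside, Keras also ships a generic keras.callbacks.LearningRateScheduler callback, which can express the same exponential schedule without writing a custom Callback subclass. A minimal sketch, assuming the same hyperparameters as the script above:

```python
import keras

# schedule() receives the epoch index and returns the lr to use for that epoch
def schedule(epoch):
    return max(1e-3 * 0.9 ** epoch, 1e-6)

exp_cb = keras.callbacks.LearningRateScheduler(schedule, verbose=1)
# pass it to fit just like the custom callback:
# model.fit(x_train, y_train, epochs=10, callbacks=[exp_cb])
```

The custom class is still useful when you want the extra bookkeeping (the learning_rates list for plotting), which LearningRateScheduler does not record by itself.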
Reference:
https://blog.csdn.net/weixin_44791964/article/details/105334098