7. Keras High-Level API
7.1 Metrics
Reference: Keras高层API之Metrics
Metrics are used to evaluate the performance of the model being trained. When the model is compiled, evaluation functions are passed in through the metrics argument.
Put more plainly: to get the mean loss, you could collect every loss value and average them once at the end, but during training you often want the mean in real time. Each newly produced loss is fed into an accumulator and combined with all earlier losses to yield an up-to-date average. Metrics is the interface that produces such a real-time evaluation at every time step (see the self-contained sketch after the reset step below).
Step 1: Build a meter
acc_meter = metrics.Accuracy()
loss_meter = metrics.Mean()
Step 2: Update data
loss_meter.update_state(loss)  # feed in the data
acc_meter.update_state(y, pred)
Step 3: Get average data
print(step, 'loss:', loss_meter.result().numpy())  # produce the evaluated value
print(step, 'Evaluate Acc:', total_correct/total, acc_meter.result().numpy())
Clear the buffer
if step % 100 == 0:
    print(step, 'loss:', loss_meter.result().numpy())
    loss_meter.reset_states()  # clear the accumulated buffer and evaluate afresh

if step % 500 == 0:
    total, total_correct = 0., 0
    acc_meter.reset_states()
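A minimal, self-contained sketch of the full meter lifecycle with dummy values (the numbers here are illustrative, not from the run below):

import tensorflow as tf
from tensorflow.keras import metrics

loss_meter = metrics.Mean()
for v in [2.0, 1.0, 0.5, 0.1]:
    loss_meter.update_state(v)          # feed one value per "step"
print(loss_meter.result().numpy())      # 0.9, the mean of all values fed so far

loss_meter.reset_states()               # discard the accumulated state
loss_meter.update_state(0.3)
print(loss_meter.result().numpy())      # 0.3, earlier values are gone

acc_meter = metrics.Accuracy()
acc_meter.update_state([0, 1, 2], [0, 1, 1])  # (y_true, y_pred)
print(acc_meter.result().numpy())       # 0.6667, 2 of 3 predictions correct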
Hands-on:
import tensorflow as tf
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics


def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y


batchsz = 128
(x, y), (x_val, y_val) = datasets.mnist.load_data()
print('datasets:', x.shape, y.shape, x.min(), x.max())

db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10)

ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
ds_val = ds_val.map(preprocess).batch(batchsz)

network = Sequential([layers.Dense(256, activation='relu'),
                      layers.Dense(128, activation='relu'),
                      layers.Dense(64, activation='relu'),
                      layers.Dense(32, activation='relu'),
                      layers.Dense(10)])
network.build(input_shape=(None, 28*28))
network.summary()

optimizer = optimizers.Adam(lr=0.01)

acc_meter = metrics.Accuracy()
loss_meter = metrics.Mean()

for step, (x, y) in enumerate(db):

    with tf.GradientTape() as tape:
        # [b, 28, 28] => [b, 784]
        x = tf.reshape(x, (-1, 28*28))
        # [b, 784] => [b, 10]
        out = network(x)
        # [b] => [b, 10]
        y_onehot = tf.one_hot(y, depth=10)
        # [b]
        loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True))
        loss_meter.update_state(loss)

    grads = tape.gradient(loss, network.trainable_variables)
    optimizer.apply_gradients(zip(grads, network.trainable_variables))

    if step % 100 == 0:
        print(step, 'loss:', loss_meter.result().numpy())
        loss_meter.reset_states()

    # evaluate
    if step % 500 == 0:
        total, total_correct = 0., 0
        acc_meter.reset_states()

        # note: this inner `step` shadows the outer one and counts
        # validation batches, which is why the log prints "78 Evaluate Acc"
        for step, (x, y) in enumerate(ds_val):
            # [b, 28, 28] => [b, 784]
            x = tf.reshape(x, (-1, 28*28))
            # [b, 784] => [b, 10]
            out = network(x)
            # [b, 10] => [b]
            pred = tf.argmax(out, axis=1)
            pred = tf.cast(pred, dtype=tf.int32)
            # bool type
            correct = tf.equal(pred, y)
            # bool tensor => int tensor => numpy
            total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
            total += x.shape[0]

            acc_meter.update_state(y, pred)

        print(step, 'Evaluate Acc:', total_correct/total, acc_meter.result().numpy())
OUT:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                multiple                  200960
_________________________________________________________________
dense_1 (Dense)              multiple                  32896
_________________________________________________________________
dense_2 (Dense)              multiple                  8256
_________________________________________________________________
dense_3 (Dense)              multiple                  2080
_________________________________________________________________
dense_4 (Dense)              multiple                  330
=================================================================
Total params: 244,522
Trainable params: 244,522
Non-trainable params: 0
_________________________________________________________________
2020-05-07 20:38:46.837981: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
0 loss: 2.3376217
78 Evaluate Acc: 0.1247 0.1247
100 loss: 0.5312836
200 loss: 0.24513794
300 loss: 0.21127209
400 loss: 0.1908613
500 loss: 0.15478776
......
78 Evaluate Acc: 0.9753 0.9753
4100 loss: 0.065056
4200 loss: 0.07652673
4300 loss: 0.07414787
4400 loss: 0.077764966
4500 loss: 0.07281441
78 Evaluate Acc: 0.9708 0.9708
4600 loss: 0.055623364
7.2 Compile & Fit
These APIs wrap the manual training loop from the previous section: Keras uses compile to package the optimizer, loss, and metrics configuration, and fit to run the epoch-by-epoch training iterations.
Detailed reference: tensorflow2的compile & fit函数
7.2.1 compile
Configures the model for training.
network = Sequential([layers.Dense(256, activation='relu'),
                      layers.Dense(128, activation='relu'),
                      layers.Dense(64, activation='relu'),
                      layers.Dense(32, activation='relu'),
                      layers.Dense(10)])
network.build(input_shape=(None, 28 * 28))
network.summary()

print("--------------------Step1-----------------------")
network.compile(optimizer=optimizers.Adam(lr=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy']
                )
Where:
- optimizer configures the model's optimizer
- loss configures the model's loss function
- metrics configures how the model is evaluated, e.g. accuracy, mse; for example, from the run below:
loss: 2.3259 - accuracy: 0.1094
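The string 'accuracy' above is shorthand; as a sketch, the same configuration can be written with explicit objects (assuming the same network and imports as above):

from tensorflow.keras import losses, metrics

network.compile(optimizer=optimizers.Adam(lr=0.01),
                loss=losses.CategoricalCrossentropy(from_logits=True),
                metrics=[metrics.CategoricalAccuracy()])  # equivalent to 'accuracy' here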
7.2.2 fit
Feeds in the training data.
print("--------------------Step2-----------------------")
network.fit(db_train, epochs=5, validation_data=ds_val, validation_freq=2)
Here:
- db_train is the training dataset
- epochs is the number of full passes over the training set
- validation_data is the validation dataset
- validation_freq=2 means validation runs once every 2 epochs; in the run below this produces, e.g.: - val_loss: 0.1482 - val_accuracy: 0.9604
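fit also returns a History object whose history dict records per-epoch metrics; a small sketch (db_train as above):

history = network.fit(db_train, epochs=5, validation_data=ds_val, validation_freq=2)
print(history.history['loss'])      # training loss per epoch
print(history.history['accuracy'])  # training accuracy per epoch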
7.2.3 evaluate
Returns the model's loss and metric values in test (inference) mode.
print("--------------------Step3-----------------------")
network.evaluate(ds_val)
Here ds_val is the test dataset that gets fed in; this corresponds to the following part of the run:
--------------------Step3-----------------------
1/79 [..............................] - ETA: 0s - loss: 0.0255 - accuracy: 0.9922
10/79 [==>...........................] - ETA: 0s - loss: 0.1624 - accuracy: 0.9664
19/79 [======>.......................] - ETA: 0s - loss: 0.1717 - accuracy: 0.9589
28/79 [=========>....................] - ETA: 0s - loss: 0.1599 - accuracy: 0.9623
37/79 [=============>................] - ETA: 0s - loss: 0.1601 - accuracy: 0.9620
47/79 [================>.............] - ETA: 0s - loss: 0.1473 - accuracy: 0.9643
57/79 [====================>.........] - ETA: 0s - loss: 0.1345 - accuracy: 0.9679
66/79 [========================>.....] - ETA: 0s - loss: 0.1239 - accuracy: 0.9704
74/79 [===========================>..] - ETA: 0s - loss: 0.1142 - accuracy: 0.9723
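evaluate also returns the final values, in the order they were compiled; a small sketch:

loss, acc = network.evaluate(ds_val)  # returns [loss, accuracy] for this compile config
print('test loss:', loss, 'test acc:', acc)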
7.2.4 predict
Generates output predictions for the input samples.
print("--------------------Step4-----------------------")
sample = next(iter(ds_val))
x = sample[0]
y = sample[1] # one-hot
pred = network.predict(x) # [b, 10]
# convert back to number
y = tf.argmax(y, axis=1)
pred = tf.argmax(pred, axis=1)
print(pred)
print(y)
The output here corresponds to the following part of the run:
--------------------Step4-----------------------
tf.Tensor(
[7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7
1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0 2 9
1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9 6 0 5 4 9 9 2 1 9 4 8
7 3 9 7 9 4 4 9 2 5 4 7 6 7 9 0 5], shape=(128,), dtype=int64)
tf.Tensor(
[7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7
1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0 2 9
1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9 6 0 5 4 9 9 2 1 9 4 8
7 3 9 7 4 4 4 9 2 5 4 7 6 7 9 0 5], shape=(128,), dtype=int64)
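Note that the final Dense layer has no activation, so predict returns raw logits; argmax works on logits directly, but as a sketch, class probabilities can be recovered with a softmax:

logits = network.predict(x)           # [b, 10] raw logits (no softmax layer in the model)
prob = tf.nn.softmax(logits, axis=1)  # [b, 10] class probabilities
pred = tf.argmax(prob, axis=1)        # same result as argmax over the logits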
7.2.5 Standard workflow in practice
Code:
import tensorflow as tf
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
def preprocess(x, y):
    """
    x is a simple image, not a batch
    """
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = tf.reshape(x, [28 * 28])
    y = tf.cast(y, dtype=tf.int32)
    y = tf.one_hot(y, depth=10)
    return x, y


batchsz = 128
(x, y), (x_val, y_val) = datasets.mnist.load_data()
print('datasets:', x.shape, y.shape, x.min(), x.max())

db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(60000).batch(batchsz)
ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
ds_val = ds_val.map(preprocess).batch(batchsz)

sample = next(iter(db))
print(sample[0].shape, sample[1].shape)

network = Sequential([layers.Dense(256, activation='relu'),
                      layers.Dense(128, activation='relu'),
                      layers.Dense(64, activation='relu'),
                      layers.Dense(32, activation='relu'),
                      layers.Dense(10)])
network.build(input_shape=(None, 28 * 28))
network.summary()

print("--------------------Step1-----------------------")
network.compile(optimizer=optimizers.Adam(lr=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy']
                )
print("--------------------Step2-----------------------")
network.fit(db, epochs=5, validation_data=ds_val, validation_freq=2)
print("--------------------Step3-----------------------")
network.evaluate(ds_val)
print("--------------------Step4-----------------------")
sample = next(iter(ds_val))
x = sample[0]
y = sample[1] # one-hot
pred = network.predict(x) # [b, 10]
# convert back to number
y = tf.argmax(y, axis=1)
pred = tf.argmax(pred, axis=1)
print(pred)
print(y)
OUT:
(128, 784) (128, 10)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                multiple                  200960
_________________________________________________________________
dense_1 (Dense)              multiple                  32896
_________________________________________________________________
dense_2 (Dense)              multiple                  8256
_________________________________________________________________
dense_3 (Dense)              multiple                  2080
_________________________________________________________________
dense_4 (Dense)              multiple                  330
=================================================================
Total params: 244,522
Trainable params: 244,522
Non-trainable params: 0
_________________________________________________________________
--------------------Step1-----------------------
--------------------Step2-----------------------
Train for 469 steps, validate for 79 steps
Epoch 1/5
2020-05-07 21:00:53.795435: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
1/469 [..............................] - ETA: 14:57 - loss: 2.3189 - accuracy: 0.0859
19/469 [>.............................] - ETA: 46s - loss: 1.4378 - accuracy: 0.5074
37/469 [=>............................] - ETA: 23s - loss: 1.0796 - accuracy: 0.6372
55/469 [==>...........................] - ETA: 15s - loss: 0.8633 - accuracy: 0.7135
......(omitted)......
442/469 [===========================>..] - ETA: 0s - loss: 0.3000 - accuracy: 0.9088
457/469 [============================>.] - ETA: 0s - loss: 0.2954 - accuracy: 0.9102
469/469 [==============================] - 3s 7ms/step - loss: 0.2916 - accuracy: 0.9115
Epoch 2/5
1/469 [..............................] - ETA: 11:05 - loss: 0.2211 - accuracy: 0.9297
18/469 [>.............................] - ETA: 36s - loss: 0.1450 - accuracy: 0.9566
35/469 [=>............................] - ETA: 18s - loss: 0.1354 - accuracy: 0.9598
53/469 [==>...........................] - ETA: 12s - loss: 0.1397 - accuracy: 0.9580
......(omitted)......
451/469 [===========================>..] - ETA: 0s - loss: 0.1358 - accuracy: 0.9610
466/469 [============================>.] - ETA: 0s - loss: 0.1354 - accuracy: 0.9612
469/469 [==============================] - 3s 7ms/step - loss: 0.1353 - accuracy: 0.9611 - val_loss: 0.1373 - val_accuracy: 0.9603
Epoch 3/5
1/469 [..............................] - ETA: 10:59 - loss: 0.0504 - accuracy: 0.9844
20/469 [>.............................] - ETA: 32s - loss: 0.1069 - accuracy: 0.9734
39/469 [=>............................] - ETA: 16s - loss: 0.0908 - accuracy: 0.9756
58/469 [==>...........................] - ETA: 11s - loss: 0.0881 - accuracy: 0.9756
......(omitted)......
423/469 [==========================>...] - ETA: 0s - loss: 0.1067 - accuracy: 0.9708
440/469 [===========================>..] - ETA: 0s - loss: 0.1075 - accuracy: 0.9706
456/469 [============================>.] - ETA: 0s - loss: 0.1069 - accuracy: 0.9707
469/469 [==============================] - 3s 6ms/step - loss: 0.1072 - accuracy: 0.9707
Epoch 4/5
1/469 [..............................] - ETA: 11:17 - loss: 0.0381 - accuracy: 0.9922
20/469 [>.............................] - ETA: 33s - loss: 0.0649 - accuracy: 0.9824
37/469 [=>............................] - ETA: 18s - loss: 0.0612 - accuracy: 0.9816
54/469 [==>...........................] - ETA: 12s - loss: 0.0694 - accuracy: 0.9805
......(omitted)......
417/469 [=========================>....] - ETA: 0s - loss: 0.0957 - accuracy: 0.9736
437/469 [==========================>...] - ETA: 0s - loss: 0.0958 - accuracy: 0.9736
456/469 [============================>.] - ETA: 0s - loss: 0.0960 - accuracy: 0.9735
469/469 [==============================] - 3s 7ms/step - loss: 0.0970 - accuracy: 0.9732 - val_loss: 0.1482 - val_accuracy: 0.9604
Epoch 5/5
1/469 [..............................] - ETA: 13:31 - loss: 0.0629 - accuracy: 0.9922
14/469 [..............................] - ETA: 58s - loss: 0.0946 - accuracy: 0.9727
29/469 [>.............................] - ETA: 27s - loss: 0.0860 - accuracy: 0.9752
44/469 [=>............................] - ETA: 18s - loss: 0.0935 - accuracy: 0.9757
57/469 [==>...........................] - ETA: 14s - loss: 0.0970 - accuracy: 0.9737
......(omitted)......
432/469 [==========================>...] - ETA: 0s - loss: 0.0871 - accuracy: 0.9764
448/469 [===========================>..] - ETA: 0s - loss: 0.0875 - accuracy: 0.9765
462/469 [============================>.] - ETA: 0s - loss: 0.0874 - accuracy: 0.9764
469/469 [==============================] - 3s 7ms/step - loss: 0.0876 - accuracy: 0.9764
--------------------Step3-----------------------
1/79 [..............................] - ETA: 0s - loss: 0.0255 - accuracy: 0.9922
10/79 [==>...........................] - ETA: 0s - loss: 0.1624 - accuracy: 0.9664
19/79 [======>.......................] - ETA: 0s - loss: 0.1717 - accuracy: 0.9589
28/79 [=========>....................] - ETA: 0s - loss: 0.1599 - accuracy: 0.9623
37/79 [=============>................] - ETA: 0s - loss: 0.1601 - accuracy: 0.9620
47/79 [================>.............] - ETA: 0s - loss: 0.1473 - accuracy: 0.9643
57/79 [====================>.........] - ETA: 0s - loss: 0.1345 - accuracy: 0.9679
66/79 [========================>.....] - ETA: 0s - loss: 0.1239 - accuracy: 0.9704
74/79 [===========================>..] - ETA: 0s - loss: 0.1142 - accuracy: 0.9723
79/79 [==============================] - 0s 6ms/step - loss: 0.1189 - accuracy: 0.9712
--------------------Step4-----------------------
tf.Tensor(
[7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7
1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0 2 9
1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9 6 0 5 4 9 9 2 1 9 4 8
7 3 9 7 9 4 4 9 2 5 4 7 6 7 9 0 5], shape=(128,), dtype=int64)
tf.Tensor(
[7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7
1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0 2 9
1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9 6 0 5 4 9 9 2 1 9 4 8
7 3 9 7 4 4 4 9 2 5 4 7 6 7 9 0 5], shape=(128,), dtype=int64)
7.3 Custom Networks
For custom networks: a custom layer must inherit from layers.Layer, and a custom network must inherit from keras.Model. keras.Sequential is itself a subclass of keras.Model, which is what makes the following work:
net = Sequential([Layer])
net.build(input_shape=(...))  # equivalent to a first call net(x)
model.trainable_variables
Both the layers.Layer and keras.Model classes provide:
- __init__(), where you perform all input-independent initialization
- build(), which receives the shape of the input tensor and performs the remaining initialization
- call(), which builds the network structure and runs the forward pass
- Model only: compile / fit / evaluate / predict
When model(x) appears in code, the call method is invoked by default to run the forward pass (instantiating the class invokes __init__ by default). A custom model must implement call() with the forward-pass logic so that the model(x) syntax is supported: under the hood, model(x) invokes model.__call__(x), which in turn calls model.call().
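The MyModel code below uses a custom layer MyDense that is not defined in this excerpt; the following is a minimal sketch of such a layer, built from the pattern just described (inherit layers.Layer, create the weights, implement call() for the forward pass):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class MyDense(layers.Layer):

    def __init__(self, inp_dim, outp_dim):
        super(MyDense, self).__init__()
        # input-independent setup: create the layer's trainable weights
        self.kernel = self.add_weight('w', [inp_dim, outp_dim])
        self.bias = self.add_weight('b', [outp_dim])

    def call(self, inputs, training=None):
        # forward pass: inputs @ w + b
        return inputs @ self.kernel + self.bias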
class MyModel(keras.Model):

    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = MyDense(28*28, 256)
        self.fc2 = MyDense(256, 128)
        self.fc3 = MyDense(128, 64)
        self.fc4 = MyDense(64, 32)
        self.fc5 = MyDense(32, 10)

    def call(self, inputs, training=None):
        x = self.fc1(inputs)
        x = tf.nn.relu(x)
        x = self.fc2(x)
        x = tf.nn.relu(x)
        x = self.fc3(x)
        x = tf.nn.relu(x)
        x = self.fc4(x)
        x = tf.nn.relu(x)
        x = self.fc5(x)
        return x

network = MyModel()
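Since MyModel inherits from keras.Model, it also gets the compile / fit / evaluate / predict workflow; a sketch, assuming the imports and the db / ds_val datasets prepared as in 7.2.5:

network.compile(optimizer=optimizers.Adam(lr=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
network.fit(db, epochs=5, validation_data=ds_val, validation_freq=2)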
7.4 Saving and Loading Models
save/load weights
Lightweight: only the network parameters are saved, and the identical network must be recreated before loading.
network.save_weights('weights.ckpt') # 'weights.ckpt' is the save path
...
network.load_weights('weights.ckpt')
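A sketch of the full round trip; build_network() is a hypothetical helper that recreates the identical architecture (e.g. the Sequential model from 7.2.5):

network.save_weights('weights.ckpt')
del network

network = build_network()  # hypothetical: rebuilds the exact same model
network.compile(optimizer=optimizers.Adam(lr=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
network.load_weights('weights.ckpt')
print(network.evaluate(ds_val))  # same metrics as before saving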
save/load entire model
Brute-force save: stores the entire model state, which can be restored without recreating the network.
model.save('xx.h5') # 'xx.h5' is the save path
...
model=tf.keras.models.load_model('xx.h5')
saved_model
A general-purpose format for production environments; it can also be consumed from other languages.
tf.saved_model.save(model, 'path')
imported=tf.saved_model.load('path')
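The object returned by tf.saved_model.load is not a keras Model; inference typically goes through the exported signatures. A sketch, where x_batch is an illustrative float32 input batch shaped like the model's input:

imported = tf.saved_model.load('path')
f = imported.signatures['serving_default']  # default signature created by tf.saved_model.save
print(f(tf.constant(x_batch)))              # returns a dict of output tensors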