Low-Level API Model
First, load the dataset and perform a simple split. The character labels are one-hot encoded here to simplify the loss computation later.
import numpy as np
import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()  # DIGITS is a simple handwritten-digit dataset shipped with scikit-learn
digits_y = np.eye(10)[digits.target]  # one-hot encode the labels
# A one-hot encoding is a sparse vector in which a single element is set to 1
# and all other elements are set to 0.

# Split the dataset, reserving 20% as the test set
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits_y, test_size=0.2, random_state=1)
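The `np.eye` one-hot trick used above can be illustrated in isolation: indexing an identity matrix with the label array yields one row per label, with the 1 in the column matching that label. A minimal NumPy sketch with made-up toy labels:

```python
import numpy as np

labels = np.array([0, 2, 1])   # hypothetical toy labels for 3 classes
one_hot = np.eye(3)[labels]    # each row is the one-hot vector for one label
print(one_hot)
# Recovering the original labels is just an argmax over each row
print(one_hot.argmax(axis=1))  # -> [0 2 1]
```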
Next, use the TensorFlow 2 low-level API to build a simple neural network with 1 hidden layer. The network's input is a single handwritten-character sample, a vector of length 64; the hidden layer has 30 units, and the output layer has 10. We apply ReLU activation to the hidden layer and leave the output layer unactivated. Each output sample thus has length 10, which lines up exactly with the one-hot encoded labels above.
class Model(object):
    def __init__(self):
        # Randomly initialize the parameter tensors
        self.W1 = tf.Variable(tf.random.normal([64, 30]))
        self.b1 = tf.Variable(tf.random.normal([30]))
        self.W2 = tf.Variable(tf.random.normal([30, 10]))
        self.b2 = tf.Variable(tf.random.normal([10]))

    def __call__(self, x):
        x = tf.cast(x, tf.float32)  # cast the input to float32
        # Linear transform + ReLU activation
        fc1 = tf.nn.relu(tf.add(tf.matmul(x, self.W1), self.b1))  # fully connected layer 1
        fc2 = tf.add(tf.matmul(fc1, self.W2), self.b2)  # fully connected layer 2 (logits)
        return fc2
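A quick shape check of the forward pass above, using NumPy stand-ins for the TensorFlow ops (matmul + bias add + ReLU); the weight shapes follow the same 64 → 30 → 10 layout as the Model class, and the batch of 5 samples is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 64))   # 5 hypothetical input samples
W1, b1 = rng.normal(size=(64, 30)), rng.normal(size=(30,))
W2, b2 = rng.normal(size=(30, 10)), rng.normal(size=(10,))

fc1 = np.maximum(x @ W1 + b1, 0)  # hidden layer with ReLU
fc2 = fc1 @ W2 + b2               # output layer, no activation (logits)
print(fc1.shape, fc2.shape)       # (5, 30) (5, 10)
```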
Next, build the loss function. We use TensorFlow's tf.nn.softmax_cross_entropy_with_logits, a cross-entropy loss with Softmax built in, and then take the global mean loss with reduce_mean.
def loss_fn(model, x, y):
    preds = model(x)
    return tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=preds, labels=y))
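What softmax_cross_entropy_with_logits computes per sample can be sketched by hand: apply a (numerically stable) softmax to the logits, then take the cross-entropy against the one-hot label. The logit values here are made up for illustration:

```python
import numpy as np

logits = np.array([[2.0, 1.0, 0.1]])
label = np.array([[1.0, 0.0, 0.0]])  # one-hot: true class is 0

exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
probs = exp / exp.sum(axis=1, keepdims=True)
loss = -(label * np.log(probs)).sum(axis=1)  # cross-entropy per sample
print(loss)  # ~0.417 for this example
```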
To make it easy to evaluate classification accuracy on the test set later, we build an accuracy function by hand. tf.argmax converts the Softmax output into the corresponding character label; tf.equal then checks whether each sample is classified correctly, and reduce_mean gives the classification accuracy over all samples.
def accuracy_fn(logits, labels):
    preds = tf.argmax(logits, axis=1)  # index of the largest value, i.e. the predicted label
    labels = tf.argmax(labels, axis=1)
    return tf.reduce_mean(tf.cast(tf.equal(preds, labels), tf.float32))
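A toy illustration of the accuracy computation: argmax turns logits and one-hot labels back into class indices, and the mean of the equality mask is the fraction classified correctly. (NumPy is used in place of the tf ops; the values are invented.)

```python
import numpy as np

logits = np.array([[0.1, 2.0], [3.0, 0.5], [0.2, 0.9]])
labels = np.array([[0, 1], [0, 1], [0, 1]])  # true class is 1 for all three

preds = logits.argmax(axis=1)     # -> [1, 0, 1]
truth = labels.argmax(axis=1)     # -> [1, 1, 1]
accuracy = (preds == truth).mean()
print(accuracy)                   # 2 of 3 correct -> ~0.667
```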
Now for the most important part: the training loop. Adam is chosen as the optimizer; it is an optimization algorithm used more often in deep learning than plain SGD (stochastic gradient descent).
EPOCHS = 100  # number of training epochs
LEARNING_RATE = 0.02  # learning rate
model = Model()  # instantiate the model class
optimizer = tf.optimizers.Adam(learning_rate=LEARNING_RATE)  # Adam optimizer (created once, so its state persists across epochs)
for epoch in range(EPOCHS):
    with tf.GradientTape() as tape:  # record gradients
        loss = loss_fn(model, X_train, y_train)
    trainable_variables = [model.W1, model.b1, model.W2, model.b2]  # parameters to optimize
    grads = tape.gradient(loss, trainable_variables)  # compute the gradients
    optimizer.apply_gradients(zip(grads, trainable_variables))  # apply the gradient update
    accuracy = accuracy_fn(model(X_test), y_test)  # compute test accuracy
    # Print the metrics
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{EPOCHS}], Train loss: {loss}, Test accuracy: {accuracy}')
Output:
Epoch [10/100], Train loss: 39.54377746582031, Test accuracy: 0.4861111044883728
Epoch [20/100], Train loss: 8.057710647583008, Test accuracy: 0.75
Epoch [30/100], Train loss: 4.4153900146484375, Test accuracy: 0.8055555820465088
Epoch [40/100], Train loss: 3.0851094722747803, Test accuracy: 0.8500000238418579
Epoch [50/100], Train loss: 2.5090126991271973, Test accuracy: 0.855555534362793
Epoch [60/100], Train loss: 2.156092643737793, Test accuracy: 0.8722222447395325
Epoch [70/100], Train loss: 1.8762010335922241, Test accuracy: 0.8861111402511597
Epoch [80/100], Train loss: 1.7182233333587646, Test accuracy: 0.8916666507720947
Epoch [90/100], Train loss: 1.55845308303833, Test accuracy: 0.8972222208976746
Epoch [100/100], Train loss: 1.3927630186080933, Test accuracy: 0.8888888955116272
As the number of epochs increases, the model's classification accuracy improves steadily.
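The automatic differentiation that drives the loop above can be seen in isolation. A minimal tf.GradientTape sketch, differentiating y = x² at x = 3 (the function and value are made up for illustration):

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:  # record operations on x
    y = x * x
grad = tape.gradient(y, x)       # dy/dx = 2x = 6 at x = 3
print(grad.numpy())              # 6.0
```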
Keras Sequential Model
Create a Sequential model, then add the 2 fully connected layers it needs:
model = tf.keras.Sequential()  # create a Sequential model
model.add(tf.keras.layers.Dense(units=30, input_dim=64, activation='relu'))  # hidden layer
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))  # output layer
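Each Dense layer holds (inputs × units) weights plus one bias per unit, so the parameter count of this network can be checked by hand; the two layers above give 2,260 trainable parameters in total:

```python
hidden = 64 * 30 + 30  # weights + biases in the hidden layer -> 1950
output = 30 * 10 + 10  # weights + biases in the output layer -> 310
print(hidden + output)  # 2260, matching what model.summary() would report
```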
Now we can compile and train the model. Here tf.optimizers.Adam serves as the optimizer and tf.losses.categorical_crossentropy, the multi-class cross-entropy, as the loss function. Unlike tf.nn.softmax_cross_entropy_with_logits, tf.losses.categorical_crossentropy comes from Keras and does not include the Softmax step; instead, we build that step directly into the model, as the softmax activation on the output layer.
# Compile the model: set the optimizer, loss function, and evaluation metric
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(X_train, y_train, batch_size=64, epochs=20, validation_data=(X_test, y_test))
Output:
Train on 1437 samples, validate on 360 samples
Epoch 1/20
1437/1437 [==============================] - 2s 1ms/sample - loss: 5.7839 - accuracy: 0.0995 - val_loss: 3.4517 - val_accuracy: 0.1444
Epoch 2/20
1437/1437 [==============================] - 0s 326us/sample - loss: 2.8467 - accuracy: 0.1886 - val_loss: 2.2067 - val_accuracy: 0.2861
Epoch 3/20
1437/1437 [==============================] - 0s 275us/sample - loss: 1.9659 - accuracy: 0.3556 - val_loss: 1.6264 - val_accuracy: 0.4806
Epoch 4/20
1437/1437 [==============================] - 0s 282us/sample - loss: 1.5098 - accuracy: 0.4934 - val_loss: 1.2940 - val_accuracy: 0.5806
Epoch 5/20
1437/1437 [==============================] - 0s 273us/sample - loss: 1.1941 - accuracy: 0.6040 - val_loss: 1.0005 - val_accuracy: 0.7083
Epoch 6/20
1437/1437 [==============================] - 1s 362us/sample - loss: 0.9140 - accuracy: 0.7063 - val_loss: 0.7953 - val_accuracy: 0.7556
Epoch 7/20
1437/1437 [==============================] - 0s 194us/sample - loss: 0.7222 - accuracy: 0.7801 - val_loss: 0.6452 - val_accuracy: 0.8083
Epoch 8/20
1437/1437 [==============================] - 0s 338us/sample - loss: 0.5882 - accuracy: 0.8142 - val_loss: 0.5462 - val_accuracy: 0.8278
Epoch 9/20
1437/1437 [==============================] - 0s 223us/sample - loss: 0.4907 - accuracy: 0.8455 - val_loss: 0.4626 - val_accuracy: 0.8417
Epoch 10/20
1437/1437 [==============================] - 0s 217us/sample - loss: 0.4156 - accuracy: 0.8754 - val_loss: 0.4016 - val_accuracy: 0.8750
Epoch 11/20
1437/1437 [==============================] - 0s 284us/sample - loss: 0.3569 - accuracy: 0.8873 - val_loss: 0.3481 - val_accuracy: 0.9000
Epoch 12/20
1437/1437 [==============================] - 1s 407us/sample - loss: 0.3081 - accuracy: 0.8998 - val_loss: 0.3096 - val_accuracy: 0.9000
Epoch 13/20
1437/1437 [==============================] - 0s 281us/sample - loss: 0.2723 - accuracy: 0.9158 - val_loss: 0.2744 - val_accuracy: 0.9000
Epoch 14/20
1437/1437 [==============================] - 0s 274us/sample - loss: 0.2419 - accuracy: 0.9248 - val_loss: 0.2530 - val_accuracy: 0.9083
Epoch 15/20
1437/1437 [==============================] - 0s 281us/sample - loss: 0.2187 - accuracy: 0.9332 - val_loss: 0.2321 - val_accuracy: 0.9250
Epoch 16/20
1437/1437 [==============================] - 0s 279us/sample - loss: 0.1990 - accuracy: 0.9381 - val_loss: 0.2167 - val_accuracy: 0.9194
Epoch 17/20
1437/1437 [==============================] - 0s 276us/sample - loss: 0.1837 - accuracy: 0.9422 - val_loss: 0.2038 - val_accuracy: 0.9250
Epoch 18/20
1437/1437 [==============================] - 0s 280us/sample - loss: 0.1684 - accuracy: 0.9506 - val_loss: 0.1967 - val_accuracy: 0.9278
Epoch 19/20
1437/1437 [==============================] - 1s 358us/sample - loss: 0.1574 - accuracy: 0.9527 - val_loss: 0.1827 - val_accuracy: 0.9306
Epoch 20/20
1437/1437 [==============================] - 0s 283us/sample - loss: 0.1469 - accuracy: 0.9603 - val_loss: 0.1729 - val_accuracy: 0.9417
Keras training can use mini-batch iteration simply by specifying batch_size. Passing the test data via validation_data yields an accuracy evaluation after each epoch. As before, classification accuracy improves steadily as the epochs progress.
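With batch_size=64, the 1437 training samples are processed in ceil(1437 / 64) mini-batches per epoch, which matches the per-epoch progress bars in the log above; a quick check:

```python
import math

samples = 1437   # training set size from the split above
batch_size = 64
steps_per_epoch = math.ceil(samples / batch_size)  # last batch is partial
print(steps_per_epoch)  # 23
```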
Keras Functional Model
The most intuitive aspect of the functional model is that the inputs and outputs are laid out explicitly. Let's define one. First comes the Input layer, which the Sequential model does not have. We then pass inputs through the Dense layers and finally produce the output.
# Functional model
inputs = tf.keras.Input(shape=(64,))  # input layer
x = tf.keras.layers.Dense(30, activation='relu')(inputs)  # hidden layer
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)  # output layer
# Specify the inputs and outputs
model = tf.keras.Model(inputs=inputs, outputs=outputs)
A functional model uses tf.keras.Model to finalize the inputs and outputs. Compile and train the model as before:
# Compile the model: set the optimizer, loss function, and evaluation metric
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(X_train, y_train, batch_size=64, epochs=20, validation_data=(X_test, y_test))
Output:
Train on 1437 samples, validate on 360 samples
Epoch 1/20
1437/1437 [==============================] - 1s 683us/sample - loss: 5.2139 - accuracy: 0.1802 - val_loss: 3.3347 - val_accuracy: 0.2833
Epoch 2/20
1437/1437 [==============================] - 0s 344us/sample - loss: 2.6016 - accuracy: 0.3326 - val_loss: 1.8578 - val_accuracy: 0.4556
Epoch 3/20
1437/1437 [==============================] - 1s 410us/sample - loss: 1.3975 - accuracy: 0.5685 - val_loss: 1.1428 - val_accuracy: 0.6667
Epoch 4/20
1437/1437 [==============================] - 1s 351us/sample - loss: 0.8651 - accuracy: 0.7223 - val_loss: 0.8136 - val_accuracy: 0.7556
Epoch 5/20
1437/1437 [==============================] - 0s 336us/sample - loss: 0.6291 - accuracy: 0.8031 - val_loss: 0.6560 - val_accuracy: 0.8139
Epoch 6/20
1437/1437 [==============================] - 1s 367us/sample - loss: 0.5075 - accuracy: 0.8372 - val_loss: 0.5677 - val_accuracy: 0.8361
Epoch 7/20
1437/1437 [==============================] - 0s 289us/sample - loss: 0.4176 - accuracy: 0.8692 - val_loss: 0.5029 - val_accuracy: 0.8528
Epoch 8/20
1437/1437 [==============================] - 0s 347us/sample - loss: 0.3622 - accuracy: 0.8838 - val_loss: 0.4457 - val_accuracy: 0.8639
Epoch 9/20
1437/1437 [==============================] - 0s 304us/sample - loss: 0.3148 - accuracy: 0.9026 - val_loss: 0.4093 - val_accuracy: 0.8694
Epoch 10/20
1437/1437 [==============================] - 0s 321us/sample - loss: 0.2841 - accuracy: 0.9123 - val_loss: 0.3697 - val_accuracy: 0.8917
Epoch 11/20
1437/1437 [==============================] - 1s 360us/sample - loss: 0.2580 - accuracy: 0.9241 - val_loss: 0.3533 - val_accuracy: 0.9000
Epoch 12/20
1437/1437 [==============================] - 1s 349us/sample - loss: 0.2285 - accuracy: 0.9304 - val_loss: 0.3225 - val_accuracy: 0.9083
Epoch 13/20
1437/1437 [==============================] - 0s 278us/sample - loss: 0.2131 - accuracy: 0.9360 - val_loss: 0.3015 - val_accuracy: 0.9139
Epoch 14/20
1437/1437 [==============================] - 0s 345us/sample - loss: 0.1931 - accuracy: 0.9422 - val_loss: 0.2878 - val_accuracy: 0.9139
Epoch 15/20
1437/1437 [==============================] - 0s 340us/sample - loss: 0.1789 - accuracy: 0.9457 - val_loss: 0.2685 - val_accuracy: 0.9222
Epoch 16/20
1437/1437 [==============================] - 1s 355us/sample - loss: 0.1658 - accuracy: 0.9527 - val_loss: 0.2596 - val_accuracy: 0.9278
Epoch 17/20
1437/1437 [==============================] - 1s 376us/sample - loss: 0.1516 - accuracy: 0.9541 - val_loss: 0.2454 - val_accuracy: 0.9278
Epoch 18/20
1437/1437 [==============================] - 1s 490us/sample - loss: 0.1442 - accuracy: 0.9603 - val_loss: 0.2335 - val_accuracy: 0.9306
Epoch 19/20
1437/1437 [==============================] - 1s 475us/sample - loss: 0.1343 - accuracy: 0.9624 - val_loss: 0.2245 - val_accuracy: 0.9306
Epoch 20/20
1437/1437 [==============================] - 1s 425us/sample - loss: 0.1220 - accuracy: 0.9645 - val_loss: 0.2249 - val_accuracy: 0.9278
The Keras implementations are far simpler than the low-level API: the Sequential model took only 5 lines of code and the functional model only 6, a clear advantage.
Keras Hybrid Model
Some models, such as multi-input models, can only be built with the functional API rather than the Sequential one, and this is where the hybrid style earns its place. We can subclass tf.keras.Model to build the model. This way of defining models offers greater freedom, letting us add more intermediate components.
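For context, a hedged sketch of the kind of multi-input model the functional API makes possible: two separate inputs merged by concatenation before the output layer. The input sizes (8 and 4) and the 2-class output are made up for illustration:

```python
import tensorflow as tf

in_a = tf.keras.Input(shape=(8,))   # hypothetical first input branch
in_b = tf.keras.Input(shape=(4,))   # hypothetical second input branch
merged = tf.keras.layers.concatenate([in_a, in_b])  # join the two branches
out = tf.keras.layers.Dense(2, activation='softmax')(merged)
model = tf.keras.Model(inputs=[in_a, in_b], outputs=out)
print(model.output_shape)  # (None, 2)
```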
# Build a model class by subclassing tf.keras.Model
class Model(tf.keras.Model):
    def __init__(self):
        super(Model, self).__init__()
        self.dense_1 = tf.keras.layers.Dense(30, activation='relu')  # define the layers
        self.dense_2 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        x = self.dense_1(inputs)  # forward pass
        return self.dense_2(x)
Instantiate the model, then train and evaluate it:
model = Model()  # instantiate the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=64, epochs=10, validation_data=(X_test, y_test))
Output:
Train on 1437 samples, validate on 360 samples
Epoch 1/10
1437/1437 [==============================] - 1s 745us/sample - loss: 8.5452 - accuracy: 0.1204 - val_loss: 5.3776 - val_accuracy: 0.1917
Epoch 2/10
1437/1437 [==============================] - 1s 369us/sample - loss: 4.0322 - accuracy: 0.2415 - val_loss: 2.8589 - val_accuracy: 0.3722
Epoch 3/10
1437/1437 [==============================] - 0s 262us/sample - loss: 2.1863 - accuracy: 0.4356 - val_loss: 1.6983 - val_accuracy: 0.5306
Epoch 4/10
1437/1437 [==============================] - 0s 280us/sample - loss: 1.4610 - accuracy: 0.5825 - val_loss: 1.2183 - val_accuracy: 0.6500
Epoch 5/10
1437/1437 [==============================] - 1s 432us/sample - loss: 1.0919 - accuracy: 0.6820 - val_loss: 0.9652 - val_accuracy: 0.7111
Epoch 6/10
1437/1437 [==============================] - 1s 492us/sample - loss: 0.8874 - accuracy: 0.7411 - val_loss: 0.7968 - val_accuracy: 0.7444
Epoch 7/10
1437/1437 [==============================] - 1s 352us/sample - loss: 0.7419 - accuracy: 0.7857 - val_loss: 0.6861 - val_accuracy: 0.7778
Epoch 8/10
1437/1437 [==============================] - 0s 281us/sample - loss: 0.6400 - accuracy: 0.8177 - val_loss: 0.5927 - val_accuracy: 0.8111
Epoch 9/10
1437/1437 [==============================] - 1s 394us/sample - loss: 0.5600 - accuracy: 0.8392 - val_loss: 0.5294 - val_accuracy: 0.8194
Epoch 10/10
1437/1437 [==============================] - 0s 292us/sample - loss: 0.4985 - accuracy: 0.8553 - val_loss: 0.4603 - val_accuracy: 0.8639
The hybrid model combines the flexibility of the low-level API with the ease of use of the high-level API. In practice, we are free to compose the model structure ourselves while still using the Keras APIs to compile and train it, which is very convenient.