1. Import TensorFlow
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model
2. Load and prepare the MNIST dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# Add a channel dimension
x_train = x_train[..., tf.newaxis].astype("float32")
x_test = x_test[..., tf.newaxis].astype("float32")
tf.newaxis: inserts a new axis of size 1. Where you place it determines which dimension is added; this fixes the shape mismatch that occurs when feeding image data to a model that expects a channel dimension.
x_train originally has shape (60000, 28, 28); after tf.newaxis it becomes (60000, 28, 28, 1).
astype: converts the array's data type.
x_train.dtype was originally float64; converting it to float32 halves the memory footprint.
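The shape and dtype effects above can be checked with NumPy alone (`np.newaxis` behaves like `tf.newaxis`); a small sketch on a toy batch:

```python
import numpy as np

# A toy batch of two 28x28 "images", standing in for x_train.
x = np.zeros((2, 28, 28), dtype=np.float64)

# Placed last, np.newaxis appends a trailing channel axis of size 1,
# and astype converts float64 -> float32.
x = x[..., np.newaxis].astype("float32")

print(x.shape)  # (2, 28, 28, 1)
print(x.dtype)  # float32
```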
3. Use tf.data to batch and shuffle the dataset
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
tf.data.Dataset.from_tensor_slices
Given a set of features and the corresponding labels, this function pairs each feature with its label, yielding a dataset of (feature, label) examples: (feature_1, label_1), (feature_2, label_2), ...
shuffle: randomizes the order of the examples using a buffer; a larger buffer size gives a more thorough shuffle.
batch: groups 32 consecutive examples into one batch; the last batch may contain fewer than 32 examples.
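A pure-Python sketch (no TensorFlow) of the pair-shuffle-batch pipeline above; the helper name `shuffled_batches` is made up for illustration, and it does a full in-memory shuffle rather than the bounded-buffer shuffle tf.data actually uses:

```python
import random

def shuffled_batches(features, labels, batch_size, seed=0):
    """Sketch of from_tensor_slices(...).shuffle(...).batch(...):
    pair each feature with its label, shuffle the pairs, then cut
    them into consecutive batches; the final batch may be smaller."""
    pairs = list(zip(features, labels))   # from_tensor_slices analogue
    random.Random(seed).shuffle(pairs)    # full shuffle (tf.data uses a buffer)
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]

batches = shuffled_batches(range(10), range(10), batch_size=3)
print([len(b) for b in batches])  # [3, 3, 3, 1]
```

Note how 10 examples with batch_size=3 produce a final batch of only 1 example, matching the behavior described above.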
4. Build the tf.keras model using Keras model subclassing
class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.d2 = Dense(10)

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

# Instantiate the model
model = MyModel()
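As a sanity check on the architecture, the shapes and trainable-parameter counts can be traced by hand. This sketch assumes Conv2D's defaults of stride 1 and 'valid' padding, which are what apply here since the layer only specifies filters, kernel size, and activation:

```python
# Trace tensor shapes through MyModel and count trainable parameters.
h = w = 28                      # input spatial size (one MNIST image)
kh = kw = 3                     # Conv2D kernel, stride 1, 'valid' padding
filters, in_ch = 32, 1

conv_h, conv_w = h - kh + 1, w - kw + 1            # 26 x 26 after the conv
conv_params = kh * kw * in_ch * filters + filters  # weights + biases = 320

flat = conv_h * conv_w * filters                   # 26 * 26 * 32 = 21632
d1_params = flat * 128 + 128                       # first Dense layer
d2_params = 128 * 10 + 10                          # 10-way output layer

total = conv_params + d1_params + d2_params
print(total)  # 2770634
```

Most of the parameters live in the first Dense layer, because flattening the 26x26x32 feature map produces a 21632-wide input.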
5. Choose an optimizer and loss function for training
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()
SparseCategoricalCrossentropy
Accepts integer class labels directly (converting them to one-hot form internally) and computes the cross-entropy between the true labels and the predicted distribution.
from_logits
from_logits=True: the model output is the raw logits tensor, straight from the network.
from_logits=False: the model output is a probability distribution, i.e. it has already passed through softmax.
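What from_logits=True means can be verified in a few lines of plain Python: the loss itself applies softmax to the raw logits and then takes the negative log of the probability assigned to the true class. The helper below is an illustrative sketch, not the TensorFlow implementation:

```python
import math

def sparse_ce_from_logits(logits, label):
    """Sparse categorical cross-entropy on raw logits:
    softmax first, then -log(prob of the true class)."""
    m = max(logits)                        # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    probs = [e / sum(exps) for e in exps]  # softmax
    return -math.log(probs[label])        # integer label, no manual one-hot

logits = [2.0, 1.0, 0.1]
loss = sparse_ce_from_logits(logits, label=0)
print(round(loss, 4))  # ≈ 0.417
```

With from_logits=False, the caller would pass `probs` instead of `logits`, and the loss would only take the -log step.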
6. Select metrics to measure the model's loss and accuracy
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')
test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
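A Mean metric is just a running average over batches. A minimal pure-Python sketch (the class name StreamingMean is made up for illustration) shows the accumulate/result/reset life cycle these metric objects follow:

```python
class StreamingMean:
    """Sketch of tf.keras.metrics.Mean: accumulate values across
    batches, report the average so far, reset between epochs."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def __call__(self, value):   # mirrors calling train_loss(loss)
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset_states(self):      # wipes the state for the next epoch
        self.total, self.count = 0.0, 0

m = StreamingMean()
for batch_loss in [1.0, 0.5, 0.0]:
    m(batch_loss)
print(m.result())  # 0.5
m.reset_states()
```

This is why the training loop below calls reset_states() at the top of every epoch: without it, each epoch's numbers would be averaged together with all previous epochs.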
7. Use tf.GradientTape to train the model
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_object(labels, predictions)
    # Gradients of the loss with respect to the trainable variables
    gradients = tape.gradient(loss, model.trainable_variables)
    # The optimizer applies the gradients to update the variables
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss(loss)
    train_accuracy(labels, predictions)
tf.GradientTape
GradientTape records the operations of the forward pass and automatically computes the gradients of the loss with respect to the model's trainable variables.
training=True/False
The training argument controls layers that behave differently during training and inference, such as Dropout and BatchNormalization, which is why the same inputs can produce different predictions in the two modes.
Set training=True when training and training=False when evaluating or predicting.
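One train_step boils down to: forward pass, gradient of the loss, parameter update. A plain-Python sketch with an analytic gradient and vanilla SGD (standing in for GradientTape and Adam) makes the loop concrete:

```python
w = 0.0           # the "trainable variable"
x, y = 2.0, 6.0   # one training example: we want w*x == y, so w -> 3
lr = 0.1          # learning rate (plain SGD instead of Adam)

for _ in range(100):
    pred = w * x
    loss = (pred - y) ** 2      # forward pass (what the tape records)
    grad = 2 * (pred - y) * x   # d(loss)/dw, what tape.gradient returns
    w -= lr * grad              # apply_gradients analogue

print(round(w, 3))  # 3.0
```

GradientTape's job is to produce `grad` automatically for every trainable variable, so this hand-derived derivative never has to be written for a real model.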
8. Test the model
The training loop calls test_step, which mirrors train_step but runs the model in inference mode (training=False) and does not update any weights:

@tf.function
def test_step(images, labels):
    predictions = model(images, training=False)
    t_loss = loss_object(labels, predictions)
    test_loss(t_loss)
    test_accuracy(labels, predictions)

EPOCHS = 5

for epoch in range(EPOCHS):
    # Reset the metric state at the start of each epoch with .reset_states()
    train_loss.reset_states()
    train_accuracy.reset_states()
    test_loss.reset_states()
    test_accuracy.reset_states()

    for images, labels in train_ds:
        train_step(images, labels)

    for test_images, test_labels in test_ds:
        test_step(test_images, test_labels)

    print(
        f'Epoch {epoch + 1}, '
        f'Loss: {train_loss.result()}, '
        f'Accuracy: {train_accuracy.result() * 100}, '
        f'Test Loss: {test_loss.result()}, '
        f'Test Accuracy: {test_accuracy.result() * 100}'
    )
Epoch 1, Loss: 0.1414075791835785, Accuracy: 95.74833679199219, Test Loss: 0.06502392143011093, Test Accuracy: 97.93999481201172
Epoch 2, Loss: 0.044439468532800674, Accuracy: 98.62999725341797, Test Loss: 0.052736375480890274, Test Accuracy: 98.27999877929688
Epoch 3, Loss: 0.024131475016474724, Accuracy: 99.2066650390625, Test Loss: 0.057969432324171066, Test Accuracy: 98.0999984741211
Epoch 4, Loss: 0.013684069737792015, Accuracy: 99.55000305175781, Test Loss: 0.06542213261127472, Test Accuracy: 98.15999603271484
Epoch 5, Loss: 0.011692428961396217, Accuracy: 99.57833099365234, Test Loss: 0.06219566985964775, Test Accuracy: 98.33999633789062
The final test accuracy stabilizes at around 98%.