1. Load the data, then preprocess it: convert x to floating point and keep y in integer-label form
2. Build the model; the input_shape of the first layer must be specified
3. In compile, specify the optimizer, loss function, and metric
4. Evaluate the model; the predicted outputs must pass through softmax to get a probability representation
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1])
Here the data type of predictions is EagerTensor, and what predictions holds are the raw logits
predictions = model(x_train[:1]).numpy()
With .numpy() appended, it is converted to a numpy.ndarray
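A quick way to see the difference, as a minimal check using the model and x_train from above:

print(type(model(x_train[:1])))          # tf EagerTensor of logits
print(type(model(x_train[:1]).numpy()))  # numpy.ndarray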
x_train[:1].shape
x_train[0].shape
The two shapes differ slightly even though both take the first element: slicing with [:1] keeps the batch dimension, while indexing with [0] drops it
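A small check makes the difference concrete (MNIST images are 28x28):

print(x_train[:1].shape)  # (1, 28, 28) -- slicing keeps the batch dimension
print(x_train[0].shape)   # (28, 28)    -- indexing drops it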
x_train, x_test = x_train / 255.0, x_test / 255.0
After dividing by 255.0 the training x is float, while y remains the integer labels 0-9
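A quick sanity check of the dtypes, a minimal sketch using the arrays loaded above:

print(x_train.dtype)                 # float64 after dividing the uint8 pixels by 255.0
print(y_train.dtype)                 # uint8 -- integer class labels
print(y_train.min(), y_train.max())  # 0 9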
The network model has 4 layers (Flatten, Dense, Dropout, Dense). Only the two Dense layers carry trainable variables (a kernel and a bias each), which you can list via model.trainable_variables
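For example, the variable shapes follow from 28*28 = 784 flattened inputs, 128 hidden units, and 10 outputs:

for v in model.trainable_variables:
    print(v.name, v.shape)
# dense kernel (784, 128), dense bias (128,)
# dense_1 kernel (128, 10), dense_1 bias (10,)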
It is possible to bake the tf.nn.softmax function into the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is discouraged as it’s impossible to provide an exact and numerically stable loss calculation for all models when using a softmax output.
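For contrast, the discouraged variant would bake softmax into the last layer and pair it with from_logits=False; this sketch is only for illustration, while the logits version below is what the tutorial uses:

alt_last_layer = tf.keras.layers.Dense(10, activation='softmax')  # outputs probabilities directly
alt_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)  # less numerically stable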
Define a loss function for training using losses.SparseCategoricalCrossentropy, which takes a vector of logits and a True index and returns a scalar loss for each example.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
This loss is equal to the negative log probability of the true class: The loss is zero if the model is sure of the correct class.
This untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to -tf.math.log(1/10) ~= 2.3.
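The 2.3 figure is just the arithmetic of a uniform guess over 10 classes:

print(-tf.math.log(1/10).numpy())  # ~2.3025851, the expected loss before training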
predictions = model(x_train[:100]).numpy()
# The tf.nn.softmax function converts these logits to probabilities for each class:
pred_prob = tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_out = loss_fn(y_train[:100], predictions).numpy()
Here we took 100 samples and computed their loss: loss_out comes out to 2.29, very close to 2.3. This also tells us that loss_fn averages the per-example losses into a single scalar
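To confirm the averaging, you can compare against the functional form, which returns one loss per example; a sketch using the tensors from above:

per_example = tf.keras.losses.sparse_categorical_crossentropy(
    y_train[:100], predictions, from_logits=True)
print(per_example.shape)           # (100,) -- one loss per sample
print(per_example.numpy().mean())  # matches loss_out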
Before you start training, configure and compile the model using Keras Model.compile. Set the optimizer class to adam, set the loss to the loss_fn function you defined earlier, and specify a metric to be evaluated for the model by setting the metrics parameter to accuracy.
model.compile(optimizer='adam',
loss=loss_fn,
metrics=['accuracy'])
The Model.evaluate method checks the model's performance, usually on a "Validation-set" or "Test-set".
model.evaluate(x_test, y_test, verbose=2)
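Model.evaluate returns the loss followed by each compiled metric, so the two values can be unpacked directly:

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print("test accuracy:", test_acc)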
If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:
probability_model = tf.keras.Sequential([
model,
tf.keras.layers.Softmax()
])
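As a quick check, each row of the wrapped model's output is now a proper probability distribution:

probs = probability_model(x_test[:1])
print(tf.reduce_sum(probs, axis=1))  # sums to ~1.0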
Finally, here is the complete code
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

# 1. Load the data and scale the pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# 2. Build the model; the last layer outputs raw logits
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])

# Sanity check on 100 samples: the untrained loss should be close to 2.3
predictions = model(x_train[:100]).numpy()
pred_prob = tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_out = loss_fn(y_train[:100], predictions).numpy()

# 3. Compile, train, and evaluate
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test, verbose=2)

# 4. Wrap the trained model with a Softmax layer to output probabilities
probability_model = tf.keras.Sequential([
    model,
    tf.keras.layers.Softmax()
])
print(probability_model(x_test[:5]))
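To turn those probabilities into class predictions, a natural follow-up (not part of the original tutorial) is tf.argmax:

predicted = tf.argmax(probability_model(x_test[:5]), axis=1)
print(predicted.numpy())  # predicted digit for each of the 5 test images
print(y_test[:5])         # ground-truth labels for comparison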