1. Basic image classification: classifying images of clothing
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
1.1 Importing the dataset
We use the Fashion MNIST dataset, which contains 70,000 grayscale images in 10 categories. The images show individual articles of clothing at low resolution (28x28 pixels). Import and load the Fashion MNIST data directly from TensorFlow:
fashion_mnist = keras.datasets.fashion_mnist
# Returns four NumPy arrays
# train_images, train_labels: the training set, the data the model learns from
# test_images, test_labels: the test set, the data used to evaluate the model
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
The images are 28x28 NumPy arrays, with pixel values between 0 and 255. The labels are an array of integers between 0 and 9.
# Store the class names corresponding to labels 0-9
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
1.2 Exploring the data
# The training set has 60,000 examples
train_images.shape
# 输出 (60000, 28, 28)
len(train_labels)
# 输出 60000
train_labels
# 输出 array([9, 0, 0, ..., 3, 0, 5], dtype=uint8)
# The test set has 10,000 examples
test_images.shape
# 输出 (10000, 28, 28)
len(test_labels)
# 输出 10000
1.3 Preprocessing the data
# Inspect the first image in the training set
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()
# Scale the pixel values to the range 0 to 1
train_images = train_images / 255.0
test_images = test_images / 255.0
To verify the data, display the first 25 images from the training set:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()
1.4 Building the model
# Set up the layers by chaining simple layers together
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(10)
])
tf.keras.layers.Flatten transforms the image format from a two-dimensional array (28 x 28 pixels) into a one-dimensional array (28 x 28 = 784 pixels). Think of this layer as unstacking the rows of pixels in the image and lining them up. It has no parameters to learn; it only reformats the data.
tf.keras.layers.Dense: densely connected, or fully connected, neural layers. The first Dense layer has 128 nodes (neurons). The second (and last) layer returns a logits array of length 10; each node holds a score indicating which of the 10 classes the current image belongs to.
# Compile the model
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
- Loss function (loss): measures how accurate the model is during training; training minimizes this function to optimize model performance.
- Optimizer (optimizer): determines how the model is updated based on the data it sees and its loss function.
- Metrics (metrics): used to monitor and measure model performance during the training and testing steps.
1.5 Training the model
# Train the model
model.fit(train_images, train_labels, epochs=10)
# Evaluate accuracy
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
1.6 Making predictions with the model
The model outputs linear values (logits); attaching a softmax layer converts the logits into a normalized probability vector.
# Prediction model (append a softmax layer to the trained model)
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
# Make predictions
predictions = probability_model.predict(test_images)
# Inspect the first prediction
predictions[0]
# array([7.7300720e-06, 3.1858748e-11, 3.0451045e-07, 2.7817364e-09,
# 1.3059016e-09, 3.1923674e-04, 3.9461247e-06, 1.5980251e-02,
# 5.8933104e-08, 9.8368847e-01], dtype=float32)
# The class with the highest confidence in the prediction
np.argmax(predictions[0])
# 9
The predictions can be visualized:
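The plotting code below relies on two helper functions, plot_image and plot_value_array, which are defined in the official tutorial; a sketch following that definition:
def plot_image(i, predictions_array, true_label, img):
    true_label, img = true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(img, cmap=plt.cm.binary)
    predicted_label = np.argmax(predictions_array)
    # Blue label for a correct prediction, red for an incorrect one
    color = 'blue' if predicted_label == true_label else 'red'
    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                         100 * np.max(predictions_array),
                                         class_names[true_label]),
               color=color)

def plot_value_array(i, predictions_array, true_label):
    true_label = true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    # One bar per class; highlight the predicted class (red) and the true class (blue)
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)
    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')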
# Plot the first X test images, their predicted labels, and the true labels.
# Color correct predictions in blue and incorrect predictions in red.
num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2*num_cols, 2*i+1)
    plot_image(i, predictions[i], test_labels, test_images)
    plt.subplot(num_rows, 2*num_cols, 2*i+2)
    plot_value_array(i, predictions[i], test_labels)
plt.tight_layout()
plt.show()
# Use the trained model on a single image
# Grab an image from the test dataset.
img = test_images[1]
# Add the image to a batch where it's the only member.
img = (np.expand_dims(img,0))
print(img.shape)
# (1,28,28)
predictions_single = probability_model.predict(img)
print(predictions_single)
# [[3.0789899e-05 4.1240561e-12 9.9947554e-01 1.6958888e-09 3.4095356e-04 8.6128709e-14 1.5278466e-04 5.1959396e-17 5.5406429e-11 1.4665751e-13]]
np.argmax(predictions_single[0])
# 2
tf.keras models are optimized to make predictions on a batch, or collection, of examples at once; so even when using a single image, it needs to be added to a batch where it is the only member.
2. Basic text classification: classifying movie reviews
import tensorflow as tf
from tensorflow import keras
import numpy as np
2.1 Importing the dataset
A binary classification problem using the IMDB dataset, which contains the text of 50,000 movie reviews: 25,000 reviews are split off for training and the other 25,000 for testing. The training and test sets are balanced, containing equal numbers of positive and negative reviews.
# The num_words argument keeps the num_words most frequently occurring words in the training data
imdb = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
2.2 Exploring the data
# Each example is an array of integers representing the words of a movie review
# Each label is 0 or 1: 0 is a negative review, 1 is a positive review
print("Training entries: {}, labels: {}".format(len(train_data), len(train_labels)))
# Training entries: 25000, labels: 25000
# The review texts have been converted to integers, each representing a word
print(train_data[0])
# [1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65, 458, 4468, 66, 3941, 4, 173, 36, 256, 5, 25, 100, 43, 838, 112, 50, 670, 2, 9, 35, 480, 284, 5, 150, 4, 172, 112, 167, 2, 336, 385, 39, 4, 172, 4536, 1111, 17, 546, 38, 13, 447, 4, 192, 50, 16, 6, 147, 2025, 19, 14, 22, 4, 1920, 4613, 469, 4, 22, 71, 87, 12, 16, 43, 530, 38, 76, 15, 13, 1247, 4, 22, 17, 515, 17, 12, 16, 626, 18, 2, 5, 62, 386, 12, 8, 316, 8, 106, 5, 4, 2223, 5244, 16, 480, 66, 3785, 33, 4, 130, 12, 16, 38, 619, 5, 25, 124, 51, 36, 135, 48, 25, 1415, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 407, 16, 82, 2, 8, 4, 107, 117, 5952, 15, 256, 4, 2, 7, 3766, 5, 723, 36, 71, 43, 530, 476, 26, 400, 317, 46, 7, 4, 2, 1029, 13, 104, 88, 4, 381, 15, 297, 98, 32, 2071, 56, 26, 141, 6, 194, 7486, 18, 4, 226, 22, 21, 134, 476, 26, 480, 5, 144, 30, 5535, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 1334, 88, 12, 16, 283, 5, 16, 4472, 113, 103, 32, 15, 16, 5345, 19, 178, 32]
# Movie reviews have different lengths, so the examples differ in length
len(train_data[0]), len(train_data[1])
# (218, 189)
# How to convert the integers back to words
# A dictionary mapping words to integer indices
word_index = imdb.get_word_index()
# The first indices are reserved
word_index = {k:(v+3) for k,v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2 # unknown
word_index["<UNUSED>"] = 3
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
def decode_review(text):
    return ' '.join([reverse_word_index.get(i, '?') for i in text])
# Decode the text of the first review
decode_review(train_data[0])
# "<START> this film ... <UNK> ... <UNK> ... <UNK> ... <UNK> ..."
2.3 Preprocessing the data
The reviews (arrays of integers) must be converted to tensors before being fed into the neural network. There are two ways to do this:
- One-hot encode the arrays, using 0/1 to mark whether each word occurs, and feed them into a Dense layer capable of handling floating-point vector data as the first layer of the network. The drawback is heavy memory use (a num_words * num_reviews matrix).
- Pad the arrays so every input has the same length, create an integer tensor of shape max_length * num_reviews, and use an embedding layer (Embedding) capable of handling this shape as the first layer of the network (the approach chosen here).
# pad_sequences standardizes the lengths of the movie reviews
# Pad with "<PAD>", which maps to 0 in the integer arrays
train_data = keras.preprocessing.sequence.pad_sequences(train_data,
value=word_index["<PAD>"],
padding='post',
maxlen=256)
test_data = keras.preprocessing.sequence.pad_sequences(test_data,
value=word_index["<PAD>"],
padding='post',
maxlen=256)
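A quick check, as in the tutorial, that padding produced equal-length examples:
len(train_data[0]), len(train_data[1])
# (256, 256)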
2.4 Building the model
# The input shape is the vocabulary size used for the movie reviews (10,000 words)
vocab_size = 10000
model = keras.Sequential()
model.add(keras.layers.Embedding(vocab_size, 16))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.summary()
- Embedding layer: takes the integer-encoded vocabulary and looks up the embedding vector for each word index. The vectors are learned as the model trains. They add a dimension to the output array, giving dimensions (batch, sequence, embedding), i.e. (reviews per batch, review length 256, 16).
- GlobalAveragePooling1D: returns a fixed-length output vector for each example by averaging over the sequence dimension. This lets the model handle variable-length input in the simplest possible way.
- Dense layer: the fixed-length output vector is piped through a fully connected layer with 16 hidden nodes.
- Dense layer: a single output node with a sigmoid activation emits a float between 0 and 1, representing a probability or confidence level.
* The Embedding layer represents an object as a numeric vector, raising or lowering its dimensionality; here it turns each input batch of shape batch*256 (max_length) into data of shape batch*256*16.
* Pooling in GlobalAveragePooling1D means aggregating multi-dimensional data into a single value; common aggregations include AveragePooling and MaxPooling. GlobalAveragePooling1D returns one fixed-length output vector per example by taking the mean (a shape check follows below).
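A minimal shape check of the Embedding + GlobalAveragePooling1D pipeline described above, using toy tensors (an illustration, not tutorial code):
sample = tf.constant([[4, 7, 1, 0]])                       # (batch=1, sequence=4) word indices
embedded = keras.layers.Embedding(input_dim=10, output_dim=16)(sample)
print(embedded.shape)                                      # (1, 4, 16): one 16-d vector per word
pooled = keras.layers.GlobalAveragePooling1D()(embedded)
print(pooled.shape)                                        # (1, 16): averaged over the sequence axis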
# Configure the loss function and optimizer
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
'binary_crossentropy': since this is a binary classification problem and the model outputs a probability, the 'binary_crossentropy' loss function, which is better suited to dealing with probabilities, is chosen.
2.5 Training the model
First, create a validation set:
# Split a validation set off the training set
x_val = train_data[:10000]
partial_x_train = train_data[10000:]
y_val = train_labels[:10000]
partial_y_train = train_labels[10000:]
# Train the model
# The history object holds a dictionary recording everything that happened during training
history = model.fit(partial_x_train,                 # training examples
                    partial_y_train,                 # training labels
                    epochs=40,                       # number of training epochs
                    batch_size=512,                  # batch size
                    validation_data=(x_val, y_val),  # validation set
                    verbose=1)
# Evaluate the model
results = model.evaluate(test_data, test_labels, verbose=2)
print(results)
# [0.32977813482284546, 0.8728799819946289]
The dictionary in the history object has four entries: the training loss (loss) and accuracy (accuracy), and the validation loss (val_loss) and accuracy (val_accuracy).
# Inspect history
history_dict = history.history
history_dict.keys()
# dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
# Plot how accuracy and loss change over time
import matplotlib.pyplot as plt
acc = history_dict['accuracy']
val_acc = history_dict['val_accuracy']
loss = history_dict['loss']
val_loss = history_dict['val_loss']
epochs = range(1, len(acc) + 1)
# "bo" means "blue dot"
plt.plot(epochs, loss, 'bo', label='Training loss')
# "b" means "solid blue line"
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.clf() # clear the figure
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
3. Text classification with TF Hub
Train using a pre-trained text embedding model from TensorFlow Hub (tfhub.dev).
# Install the required libraries
pip install tensorflow-hub
pip install tensorflow-datasets
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds
print("Version: ", tf.__version__)
print("Eager mode: ", tf.executing_eagerly())
print("Hub version: ", hub.__version__)
print("GPU is", "available" if tf.config.experimental.list_physical_devices("GPU") else "NOT AVAILABLE")
# Import the dataset
# Split the training set into 60% and 40% to end up with 15,000 examples
# for training, 10,000 examples for validation and 25,000 examples for testing.
train_data, validation_data, test_data = tfds.load(
name="imdb_reviews",
split=('train[:60%]', 'train[60%:]', 'test'),
as_supervised=True)
# Load a pre-trained text embedding from TensorFlow Hub as a Keras layer
embedding = "https://tfhub.dev/google/nnlm-en-dim50/2"
hub_layer = hub.KerasLayer(embedding, input_shape=[],
                           dtype=tf.string, trainable=True)
# Grab a small batch of examples to try the layer on
train_examples_batch, train_labels_batch = next(iter(train_data.batch(10)))
hub_layer(train_examples_batch[:3])
# Build the model
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Dense(16, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()  # inspect the model architecture
# Compile the model
model.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
# Train the model
history = model.fit(train_data.shuffle(10000).batch(512),
epochs=10,
validation_data=validation_data.batch(512),
verbose=1)
# Evaluate the model
results = model.evaluate(test_data.batch(512), verbose=2)
for name, value in zip(model.metrics_names, results):
    print("%s: %.3f" % (name, value))
4. Regression: predicting fuel efficiency
In a regression problem, the goal is to predict the output of a continuous value, such as a price or a probability, whereas a classification problem selects one class from a list of classes. This example uses the Auto MPG dataset to build a model that predicts the fuel efficiency of a car from its number of cylinders, displacement, horsepower, and weight.
# Use seaborn for the pairplot matrix
pip install -q seaborn
import pathlib
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
4.1 Importing the dataset
# Download the dataset
dataset_path = keras.utils.get_file("auto-mpg.data", "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
dataset_path
# Import the dataset with pandas
column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',
'Acceleration', 'Model Year', 'Origin']
raw_dataset = pd.read_csv(dataset_path, names=column_names,
na_values = "?", comment='\t',
sep=" ", skipinitialspace=True)
4.2 Exploring the data
dataset = raw_dataset.copy()
dataset.tail()
4.3 Preprocessing the data
The dataset contains some unknown values:
# Check for missing values
dataset.isna().sum()
"""
MPG 0
Cylinders 0
Displacement 0
Horsepower 6
Weight 0
Acceleration 0
Model Year 0
Origin 0
dtype: int64
"""
# Drop those rows
dataset = dataset.dropna()
The "Origin" column is categorical rather than numeric; convert it to one-hot encoding:
origin = dataset.pop('Origin')
dataset['USA'] = (origin == 1)*1.0
dataset['Europe'] = (origin == 2)*1.0
dataset['Japan'] = (origin == 3)*1.0
dataset.tail()
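As an aside, the same one-hot columns can be produced with pandas get_dummies; a sketch of this equivalent alternative (not the tutorial's code):
# origin is the Series popped from the dataset above
dummies = pd.get_dummies(origin.map({1: 'USA', 2: 'Europe', 3: 'Japan'}))
dummies.tail()  # USA / Europe / Japan indicator columns, matching the ones built above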
# Split the data into training and test sets
train_dataset = dataset.sample(frac=0.8,random_state=0)
test_dataset = dataset.drop(train_dataset.index)
# Have a quick look at the joint distribution of samples in the training set
sns.pairplot(train_dataset[["MPG", "Cylinders", "Displacement", "Weight"]], diag_kind="kde")
# Check the overall statistics
train_stats = train_dataset.describe()
train_stats.pop("MPG")
train_stats = train_stats.transpose()
train_stats
# Separate the target label (MPG) from the features
train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')
# Normalize the features
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)
4.4 Building the model
def build_model():
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=[len(train_dataset.keys())]),
        layers.Dense(64, activation='relu'),
        layers.Dense(1)
    ])
    optimizer = tf.keras.optimizers.RMSprop(0.001)
    model.compile(loss='mse',
                  optimizer=optimizer,
                  metrics=['mae', 'mse'])
    return model
model = build_model()
model.summary()
'''
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 64) 640
_________________________________________________________________
dense_1 (Dense) (None, 64) 4160
_________________________________________________________________
dense_2 (Dense) (None, 1) 65
=================================================================
Total params: 4,865
Trainable params: 4,865
Non-trainable params: 0
_________________________________________________________________
'''
- Mean squared error (MSE) is a common loss function for regression problems (classification problems use different loss functions).
- Likewise, the evaluation metrics for regression differ from those for classification; a common regression metric is mean absolute error (MAE).
The model ultimately outputs an array with dtype=float32.
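The untrained model can already be called; as in the official tutorial, running it on a batch of 10 examples from the training data confirms the output shape and dtype:
example_batch = normed_train_data[:10]
example_result = model.predict(example_batch)
example_result  # an array of shape (10, 1) with dtype=float32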
4.5 Training the model
# Show training progress by printing one dot per completed epoch (new line every 100 epochs)
class PrintDot(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs):
        if epoch % 100 == 0:
            print('')
        print('.', end='')
EPOCHS = 1000
# Train the model
history = model.fit(
normed_train_data, train_labels,
epochs=EPOCHS, validation_split = 0.2, verbose=0,
callbacks=[PrintDot()])
# Use the statistics stored in the history object to visualize the model's training progress
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()
def plot_history(history):
    hist = pd.DataFrame(history.history)
    hist['epoch'] = history.epoch

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Abs Error [MPG]')
    plt.plot(hist['epoch'], hist['mae'], label='Train Error')
    plt.plot(hist['epoch'], hist['val_mae'], label='Val Error')
    plt.ylim([0, 5])
    plt.legend()

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Square Error [$MPG^2$]')
    plt.plot(hist['epoch'], hist['mse'], label='Train Error')
    plt.plot(hist['epoch'], hist['val_mse'], label='Val Error')
    plt.ylim([0, 20])
    plt.legend()
    plt.show()
plot_history(history)
Since the validation error stops decreasing and starts rising after about 100 epochs, update the model.fit call to stop training automatically when the validation score stops improving.
model = build_model()
# The patience parameter is the number of epochs to check for improvement
# Use an EarlyStopping callback that tests the training condition after every epoch
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)
history = model.fit(normed_train_data, train_labels, epochs=EPOCHS,
validation_split = 0.2, verbose=0, callbacks=[early_stop, PrintDot()])
plot_history(history)
4.6 Making predictions with the model
# Check the model's performance on the test set
loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=2)
print("Testing set Mean Abs Error: {:5.2f} MPG".format(mae))
'''
3/3 - 0s - loss: 5.9941 - mae: 1.8809 - mse: 5.9941
Testing set Mean Abs Error: 1.88 MPG
'''
# Make predictions using data from the test set
test_predictions = model.predict(normed_test_data).flatten()
plt.scatter(test_labels, test_predictions)
plt.xlabel('True Values [MPG]')
plt.ylabel('Predictions [MPG]')
plt.axis('equal')
plt.axis('square')
plt.xlim([0,plt.xlim()[1]])
plt.ylim([0,plt.ylim()[1]])
_ = plt.plot([-100, 100], [-100, 100])
# Look at the error distribution
error = test_predictions - test_labels
plt.hist(error, bins = 25)
plt.xlabel("Prediction Error [MPG]")
_ = plt.ylabel("Count")
- When numeric input features have values with different ranges, each feature should be scaled independently to the same range.
- If there is not much training data, one approach is to prefer a small network with few hidden layers to avoid overfitting.
- Early stopping is an effective technique for preventing overfitting.
5. Overfitting and underfitting
Preventing overfitting: use a more complete dataset, or apply techniques such as regularization.
5.1 Setup
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import regularizers
!pip install git+https://github.com/tensorflow/docs
import tensorflow_docs as tfdocs
import tensorflow_docs.modeling
import tensorflow_docs.plots
from IPython import display
from matplotlib import pyplot as plt
import numpy as np
import pathlib
import shutil
import tempfile
logdir = pathlib.Path(tempfile.mkdtemp())/"tensorboard_logs"
shutil.rmtree(logdir, ignore_errors=True)
5.2 The Higgs dataset
The dataset contains 11,000,000 examples, each with 28 features and a binary class label.
# Download the dataset
gz = tf.keras.utils.get_file('HIGGS.csv.gz', 'http://mlphysics.ics.uci.edu/data/higgs/HIGGS.csv.gz')
FEATURES = 28
# The tf.data.experimental.CsvDataset class can read csv records directly from a gzip file, with no intermediate decompression step
# It returns a list of scalars for each record
ds = tf.data.experimental.CsvDataset(gz,[float(),]*(FEATURES+1), compression_type="GZIP")
# Repack each record into a (features, label) pair
def pack_row(*row):
    label = row[0]
    features = tf.stack(row[1:], 1)
    return features, label
# Rather than repacking each row individually, TensorFlow can apply pack_row to batches of 10,000 examples at a time and then unbatch
packed_ds = ds.batch(10000).map(pack_row).unbatch()
# Inspect the resulting packed_ds
for features, label in packed_ds.batch(1000).take(1):
    print(features[0])
    plt.hist(features.numpy().flatten(), bins=101)
'''
packed_ds
tf.Tensor(
[ 0.8692932 -0.6350818 0.22569026 0.32747006 -0.6899932 0.75420225
-0.24857314 -1.0920639 0. 1.3749921 -0.6536742 0.9303491
1.1074361 1.1389043 -1.5781983 -1.0469854 0. 0.65792954
-0.01045457 -0.04576717 3.1019614 1.35376 0.9795631 0.97807616
0.92000484 0.72165745 0.98875093 0.87667835], shape=(28,), dtype=float32)
'''
# Use the first 1,000 examples for validation and the following 10,000 for training
N_VALIDATION = int(1e3)
N_TRAIN = int(1e4)
BUFFER_SIZE = int(1e4)
BATCH_SIZE = 500
STEPS_PER_EPOCH = N_TRAIN//BATCH_SIZE
# Dataset.take takes the given number of examples
# Dataset.skip skips over the given number
# cache ensures the loader does not need to re-read the data from the file on every epoch
validate_ds = packed_ds.take(N_VALIDATION).cache()
train_ds = packed_ds.skip(N_VALIDATION).take(N_TRAIN).cache()
train_ds
# <CacheDataset element_spec=(TensorSpec(shape=(28,), dtype=tf.float32, name=None), TensorSpec(shape=(), dtype=tf.float32, name=None))>
# Use the Dataset.batch method to create batches of a size suited to training
validate_ds = validate_ds.batch(BATCH_SIZE)
train_ds = train_ds.shuffle(BUFFER_SIZE).repeat().batch(BATCH_SIZE)
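A quick peek at one training batch confirms the pipeline delivers the expected shapes (a sanity check, not tutorial code):
for features, labels in train_ds.take(1):
    print(features.shape, labels.shape)  # (500, 28) (500,)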
5.3 Demonstrating overfitting
Gradually reducing the learning rate during training often yields better results; use tf.keras.optimizers.schedules to reduce the learning rate over time:
lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
0.001,
decay_steps=STEPS_PER_EPOCH*1000,
decay_rate=1,
staircase=False)
def get_optimizer():
    return tf.keras.optimizers.Adam(lr_schedule)
# Plot the learning rate as it changes over the epochs
step = np.linspace(0,100000)
lr = lr_schedule(step)
plt.figure(figsize = (8,6))
plt.plot(step/STEPS_PER_EPOCH, lr)
plt.ylim([0,max(plt.ylim())])
plt.xlabel('Epoch')
_ = plt.ylabel('Learning Rate')
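With decay_rate=1 this schedule is hyperbolic: lr(step) = 0.001 / (1 + step / (STEPS_PER_EPOCH * 1000)), i.e. half the base rate at epoch 1000 and one third at epoch 2000. A quick spot-check (an assumed check, not tutorial code):
print(lr_schedule(0).numpy())                       # 0.001
print(lr_schedule(STEPS_PER_EPOCH * 1000).numpy())  # 0.0005: half the base rate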
Set up the early-stopping callback tf.keras.callbacks.EarlyStopping to avoid wasting training time.
def get_callbacks(name):
    return [
        tfdocs.modeling.EpochDots(),
        tf.keras.callbacks.EarlyStopping(monitor='val_binary_crossentropy', patience=200),
        tf.keras.callbacks.TensorBoard(logdir/name),
    ]
Use the same model.compile and model.fit settings for all the models:
def compile_and_fit(model, name, optimizer=None, max_epochs=10000):
    if optimizer is None:
        optimizer = get_optimizer()
    model.compile(optimizer=optimizer,
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=[
                      tf.keras.losses.BinaryCrossentropy(
                          from_logits=True, name='binary_crossentropy'),
                      'accuracy'])
    model.summary()
    history = model.fit(
        train_ds,
        steps_per_epoch=STEPS_PER_EPOCH,
        epochs=max_epochs,
        validation_data=validate_ds,
        callbacks=get_callbacks(name),
        verbose=0)
    return history
Training tiny_model:
tiny_model = tf.keras.Sequential([
layers.Dense(16, activation='elu', input_shape=(FEATURES,)),
layers.Dense(1)
])
size_histories = {}
size_histories['Tiny'] = compile_and_fit(tiny_model, 'sizes/Tiny')
'''
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 16) 464
dense_1 (Dense) (None, 1) 17
=================================================================
Total params: 481
Trainable params: 481
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.4915, binary_crossentropy:0.8589, loss:0.8589, val_accuracy:0.4730, val_binary_crossentropy:0.8619, val_loss:0.8619,
...
'''
# Plot the training process
plotter = tfdocs.plots.HistoryPlotter(metric = 'binary_crossentropy', smoothing_std=10)
plotter.plot(size_histories)
plt.ylim([0.5, 0.7])
Training small_model:
small_model = tf.keras.Sequential([
# `input_shape` is only required here so that `.summary` works.
layers.Dense(16, activation='elu', input_shape=(FEATURES,)),
layers.Dense(16, activation='elu'),
layers.Dense(1)
])
size_histories['Small'] = compile_and_fit(small_model, 'sizes/Small')
'''
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_2 (Dense) (None, 16) 464
dense_3 (Dense) (None, 16) 272
dense_4 (Dense) (None, 1) 17
=================================================================
Total params: 753
Trainable params: 753
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.4831, binary_crossentropy:0.7411, loss:0.7411, val_accuracy:0.4670, val_binary_crossentropy:0.7131, val_loss:0.7131,
...
'''
Training medium_model:
medium_model = tf.keras.Sequential([
layers.Dense(64, activation='elu', input_shape=(FEATURES,)),
layers.Dense(64, activation='elu'),
layers.Dense(64, activation='elu'),
layers.Dense(1)
])
size_histories['Medium'] = compile_and_fit(medium_model, "sizes/Medium")
'''
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_5 (Dense) (None, 64) 1856
dense_6 (Dense) (None, 64) 4160
dense_7 (Dense) (None, 64) 4160
dense_8 (Dense) (None, 1) 65
=================================================================
Total params: 10,241
Trainable params: 10,241
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.4828, binary_crossentropy:0.7027, loss:0.7027, val_accuracy:0.5230, val_binary_crossentropy:0.6887, val_loss:0.6887,
...
'''
Training large_model:
large_model = tf.keras.Sequential([
layers.Dense(512, activation='elu', input_shape=(FEATURES,)),
layers.Dense(512, activation='elu'),
layers.Dense(512, activation='elu'),
layers.Dense(512, activation='elu'),
layers.Dense(1)
])
size_histories['large'] = compile_and_fit(large_model, "sizes/large")
'''
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_9 (Dense) (None, 512) 14848
dense_10 (Dense) (None, 512) 262656
dense_11 (Dense) (None, 512) 262656
dense_12 (Dense) (None, 512) 262656
dense_13 (Dense) (None, 1) 513
=================================================================
Total params: 803,329
Trainable params: 803,329
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.5092, binary_crossentropy:0.8567, loss:0.8567, val_accuracy:0.5720, val_binary_crossentropy:0.6973, val_loss:0.6973,
...
'''
Compare the training and validation losses of the four model sizes:
plotter.plot(size_histories)
a = plt.xscale('log')
plt.xlim([5, max(plt.xlim())])
plt.ylim([0.5, 0.7])
plt.xlabel("Epochs [Log Scale]")
5.4 Strategies to prevent overfitting
The training runs above were logged to TensorBoard; copy the Tiny model's training log to serve as a baseline for comparison:
shutil.rmtree(logdir/'regularizers/Tiny', ignore_errors=True)
shutil.copytree(logdir/'sizes/Tiny', logdir/'regularizers/Tiny')
regularizer_histories = {}
regularizer_histories['Tiny'] = size_histories['Tiny']
Method 1: regularization
Regularization constrains the complexity of a network by forcing its weights to take only small values, which makes the distribution of weights more regular. It works by adding a cost associated with large weights to the network's loss function:
- L1 regularization: the added cost is proportional to the absolute value of the weight coefficients; it pushes weights toward zero, encouraging a sparse model.
- L2 regularization: the added cost is proportional to the square of the weight coefficients; it shrinks the weights but does not make the model sparse.
# Add L2 weight regularization
l2_model = tf.keras.Sequential([
layers.Dense(512, activation='elu',
kernel_regularizer=regularizers.l2(0.001),
input_shape=(FEATURES,)),
layers.Dense(512, activation='elu',
kernel_regularizer=regularizers.l2(0.001)),
layers.Dense(512, activation='elu',
kernel_regularizer=regularizers.l2(0.001)),
layers.Dense(512, activation='elu',
kernel_regularizer=regularizers.l2(0.001)),
layers.Dense(1)
])
regularizer_histories['l2'] = compile_and_fit(l2_model, "regularizers/l2")
'''
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_14 (Dense) (None, 512) 14848
dense_15 (Dense) (None, 512) 262656
dense_16 (Dense) (None, 512) 262656
dense_17 (Dense) (None, 512) 262656
dense_18 (Dense) (None, 1) 513
=================================================================
Total params: 803,329
Trainable params: 803,329
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.5052, binary_crossentropy:0.8148, loss:2.3360, val_accuracy:0.4760, val_binary_crossentropy:0.6928, val_loss:2.1343,
...
'''
l2(0.001) means that every coefficient in the layer's weight matrix adds 0.001 * weight_coefficient_value**2 to the network's total loss.
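These penalties can be inspected directly: a layer's regularization losses are collected in its losses attribute. A minimal sketch, assuming the FEATURES constant and the imports from the setup above:
reg_layer = layers.Dense(512, kernel_regularizer=regularizers.l2(0.001))
reg_layer(tf.ones((1, FEATURES)))  # calling the layer builds it
print(reg_layer.losses)            # the l2 penalty tensor that is added to the total loss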
Method 2: adding Dropout layers
A Dropout layer randomly sets a fraction of the layer's output features to zero during training.
dropout_model = tf.keras.Sequential([
layers.Dense(512, activation='elu', input_shape=(FEATURES,)),
layers.Dropout(0.5),
layers.Dense(512, activation='elu'),
layers.Dropout(0.5),
layers.Dense(512, activation='elu'),
layers.Dropout(0.5),
layers.Dense(512, activation='elu'),
layers.Dropout(0.5),
layers.Dense(1)
])
regularizer_histories['dropout'] = compile_and_fit(dropout_model, "regularizers/dropout")
'''
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_19 (Dense) (None, 512) 14848
dropout (Dropout) (None, 512) 0
dense_20 (Dense) (None, 512) 262656
dropout_1 (Dropout) (None, 512) 0
dense_21 (Dense) (None, 512) 262656
dropout_2 (Dropout) (None, 512) 0
dense_22 (Dense) (None, 512) 262656
dropout_3 (Dropout) (None, 512) 0
dense_23 (Dense) (None, 1) 513
=================================================================
Total params: 803,329
Trainable params: 803,329
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.5034, binary_crossentropy:0.8014, loss:0.8014, val_accuracy:0.5600, val_binary_crossentropy:0.7065, val_loss:0.7065,
...
'''
Method 3: combining L2 regularization with Dropout layers
combined_model = tf.keras.Sequential([
layers.Dense(512, kernel_regularizer=regularizers.l2(0.0001),
activation='elu', input_shape=(FEATURES,)),
layers.Dropout(0.5),
layers.Dense(512, kernel_regularizer=regularizers.l2(0.0001),
activation='elu'),
layers.Dropout(0.5),
layers.Dense(512, kernel_regularizer=regularizers.l2(0.0001),
activation='elu'),
layers.Dropout(0.5),
layers.Dense(512, kernel_regularizer=regularizers.l2(0.0001),
activation='elu'),
layers.Dropout(0.5),
layers.Dense(1)
])
regularizer_histories['combined'] = compile_and_fit(combined_model, "regularizers/combined")
'''
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_24 (Dense) (None, 512) 14848
dropout_4 (Dropout) (None, 512) 0
dense_25 (Dense) (None, 512) 262656
dropout_5 (Dropout) (None, 512) 0
dense_26 (Dense) (None, 512) 262656
dropout_6 (Dropout) (None, 512) 0
dense_27 (Dense) (None, 512) 262656
dropout_7 (Dropout) (None, 512) 0
dense_28 (Dense) (None, 1) 513
=================================================================
Total params: 803,329
Trainable params: 803,329
Non-trainable params: 0
_________________________________________________________________
Epoch: 0, accuracy:0.5102, binary_crossentropy:0.7920, loss:0.9501, val_accuracy:0.5270, val_binary_crossentropy:0.6840, val_loss:0.8413,
...
'''
Besides the methods above, data augmentation (commonly used for images) and batch normalization can also be applied; a sketch of batch normalization follows below.
Data augmentation: Data augmentation | TensorFlow Core; batch normalization: tf.keras.layers.BatchNormalization | TensorFlow Core v2.9.1
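A minimal sketch of what batch normalization might look like in the dense stack used above (an illustration, not part of the tutorial):
bn_model = tf.keras.Sequential([
    layers.Dense(512, activation='elu', input_shape=(FEATURES,)),
    layers.BatchNormalization(),  # normalize activations between layers
    layers.Dense(512, activation='elu'),
    layers.BatchNormalization(),
    layers.Dense(1)
])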
6. Saving and loading models
The packages for reading HDF5 files need to be installed:
pip install pyyaml h5py # Required to save models in HDF5 format
import os
import tensorflow as tf
from tensorflow import keras
Get a trained model:
# Import the dataset (only the first 1,000 examples are used)
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
train_labels = train_labels[:1000]
test_labels = test_labels[:1000]
train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
# Build the model
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=[tf.metrics.SparseCategoricalAccuracy()])
    return model
# Create a basic model instance
model = create_model()
# Display the model's architecture
model.summary()
'''
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
dropout (Dropout) (None, 512) 0
dense_1 (Dense) (None, 10) 5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
'''
1. Saving the model during training (using checkpoints)
The tf.keras.callbacks.ModelCheckpoint callback allows the model to be saved both during and at the end of training.
# Create a tf.keras.callbacks.ModelCheckpoint callback that saves only the weights during training
checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
save_weights_only=True,
verbose=1)
# Train the model with the new callback
model.fit(train_images,
train_labels,
epochs=10,
validation_data=(test_images, test_labels),
callbacks=[cp_callback])
# This creates a collection of TensorFlow checkpoint files that are updated at the end of each epoch
os.listdir(checkpoint_dir)
# ['cp.ckpt.data-00000-of-00001', 'cp.ckpt.index', 'checkpoint']
# Rebuild a fresh, untrained model and evaluate it, then load the weights from the checkpoint and re-evaluate
model = create_model() # Create a basic model instance
loss, acc = model.evaluate(test_images, test_labels, verbose=2) # Evaluate the model
print("Untrained model, accuracy: {:5.2f}%".format(100 * acc))
'''
32/32 - 0s - loss: 2.4002 - sparse_categorical_accuracy: 0.0930 - 261ms/epoch - 8ms/step
Untrained model, accuracy: 9.30%
'''
model.load_weights(checkpoint_path) # Loads the weights
loss, acc = model.evaluate(test_images, test_labels, verbose=2) # Re-evaluate the model
print("Restored model, accuracy: {:5.2f}%".format(100 * acc))
'''
32/32 - 0s - loss: 0.3860 - sparse_categorical_accuracy: 0.8750 - 75ms/epoch - 2ms/step
Restored model, accuracy: 87.50%
'''
Checkpoint callback options:
# Include the epoch in the file name (uses `str.format`)
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
batch_size = 32
# Create a callback that saves the model's weights every 5 epochs
cp_callback = tf.keras.callbacks.ModelCheckpoint(
filepath=checkpoint_path,
verbose=1,
save_weights_only=True,
save_freq=5*batch_size)
# Create a new model instance
model = create_model()
# Save the weights using the `checkpoint_path` format
model.save_weights(checkpoint_path.format(epoch=0))
# Train the model with the new callback
model.fit(train_images,
train_labels,
epochs=50,
batch_size=batch_size,
callbacks=[cp_callback],
validation_data=(test_images, test_labels),
verbose=0)
'''
Epoch 5: saving model to training_2/cp-0005.ckpt
Epoch 10: saving model to training_2/cp-0010.ckpt
......
Epoch 50: saving model to training_2/cp-0050.ckpt
<keras.callbacks.History at 0x7fd55c465e80>
'''
# Look at the resulting checkpoints and choose the latest one
# The default TensorFlow format keeps only the 5 most recent checkpoints
os.listdir(checkpoint_dir)
'''
['cp-0015.ckpt.index',
'cp-0050.ckpt.index',
'cp-0025.ckpt.data-00000-of-00001',
'cp-0035.ckpt.data-00000-of-00001',
'cp-0045.ckpt.index',
'cp-0010.ckpt.data-00000-of-00001',
'cp-0045.ckpt.data-00000-of-00001',
'cp-0005.ckpt.index',
'cp-0040.ckpt.data-00000-of-00001',
'cp-0015.ckpt.data-00000-of-00001',
'cp-0000.ckpt.data-00000-of-00001',
'cp-0010.ckpt.index',
'cp-0025.ckpt.index',
'cp-0030.ckpt.index',
'cp-0000.ckpt.index',
'cp-0050.ckpt.data-00000-of-00001',
'cp-0020.ckpt.index',
'checkpoint',
'cp-0040.ckpt.index',
'cp-0020.ckpt.data-00000-of-00001',
'cp-0035.ckpt.index',
'cp-0030.ckpt.data-00000-of-00001',
'cp-0005.ckpt.data-00000-of-00001']
'''
latest = tf.train.latest_checkpoint(checkpoint_dir)
latest
'''
'training_2/cp-0050.ckpt'
'''
# Reset the model and load the latest checkpoint
# Create a new model instance
model = create_model()
# Load the previously saved weights
model.load_weights(latest)
# Re-evaluate the model
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100 * acc))
The checkpoint files store the model weights in binary form. A checkpoint contains:
- One or more shards that contain the model's weights
- An index file that indicates which weights are stored in which shard
When training a model on a single machine, you get one shard with the suffix .data-00000-of-00001.
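To see what a checkpoint actually stores, the variable names and shapes can be listed with tf.train.list_variables (a short sketch using the latest checkpoint found above):
for name, shape in tf.train.list_variables(latest):
    print(name, shape)  # layer weights and optimizer slots, with their shapes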
2. Manually saving weights with model.save_weights
By default, model.save_weights uses the TensorFlow checkpoint format with a .ckpt extension; passing a filename ending in .h5 saves the weights in HDF5 format instead.
# Save the weights
model.save_weights('./checkpoints/my_checkpoint')
# Create a new model instance
model = create_model()
# Restore the weights
model.load_weights('./checkpoints/my_checkpoint')
# Evaluate the model
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100 * acc))
3. Saving the entire model with model.save
model.save saves a model's architecture, weights, and training configuration in a single file/folder. A model can be saved in two different file formats (SavedModel and HDF5).
SavedModel format
The SavedModel format is another way to serialize models: a directory containing a protobuf (.pb) binary file together with a TensorFlow checkpoint. A model saved in this format can be restored with tf.keras.models.load_model.
# Create and train a model
model = create_model()
model.fit(train_images, train_labels, epochs=5)
# Save the entire model in the SavedModel format
!mkdir -p saved_model
model.save('saved_model/my_model')
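Saving produces a directory rather than a single file; per the official tutorial, it contains the protobuf file plus the variables (assuming the saved_model/my_model path above):
!ls saved_model/my_model
# assets  keras_metadata.pb  saved_model.pb  variables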
# Load the saved model
new_model = tf.keras.models.load_model('saved_model/my_model')
new_model.summary()
'''
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_10 (Dense) (None, 512) 401920
dropout_5 (Dropout) (None, 512) 0
dense_11 (Dense) (None, 10) 5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
'''
# Evaluate the restored model
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)
print('Restored model, accuracy: {:5.2f}%'.format(100 * acc))
print(new_model.predict(test_images).shape)
'''
32/32 - 0s - loss: 0.4311 - sparse_categorical_accuracy: 0.8660 - 171ms/epoch - 5ms/step
Restored model, accuracy: 86.60%
32/32 [==============================] - 0s 1ms/step
(1000, 10)
'''
HDF5 format
# Create and train a new model
model = create_model()
model.fit(train_images, train_labels, epochs=5)
# Save the entire model to an HDF5 file
model.save('my_model.h5')
# Recreate the exact same model, including its weights and optimizer
new_model = tf.keras.models.load_model('my_model.h5')
new_model.summary()
'''
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_12 (Dense) (None, 512) 401920
dropout_6 (Dropout) (None, 512) 0
dense_13 (Dense) (None, 10) 5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
'''
# Evaluate the restored model
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)
print('Restored model, accuracy: {:5.2f}%'.format(100 * acc))
'''
32/32 - 0s - loss: 0.4390 - sparse_categorical_accuracy: 0.8540 - 167ms/epoch - 5ms/step
Restored model, accuracy: 85.40%
'''
7. Hyperparameter tuning with Keras Tuner
The Keras Tuner library helps pick the optimal set of hyperparameters for a TensorFlow program, a process called hyperparameter tuning. Hyperparameters come in two types: model hyperparameters, which influence model selection (such as the number and width of hidden layers), and algorithm hyperparameters, which influence the speed and quality of the learning algorithm (such as the learning rate for stochastic gradient descent (SGD) or the number of neighbors for a k-nearest-neighbors (KNN) classifier).
Using the clothing image classification task as an example:
import tensorflow as tf
from tensorflow import keras
# Install and import Keras Tuner
pip install -q -U keras-tuner
import keras_tuner as kt
# 1. Download and prepare the dataset
(img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()
# Normalize the pixel values
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0
# 2. Build the model
# When building a model for hyperparameter tuning, define both the model architecture and the hyperparameter search space
# The model can be built with a model-builder function or by subclassing the Keras Tuner API's HyperModel class
# A model-builder function returns a compiled model
def model_builder(hp):
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))
    # Tune the number of units in the first Dense layer, choosing a value between 32 and 512
    hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
    model.add(keras.layers.Dense(units=hp_units, activation='relu'))
    model.add(keras.layers.Dense(10))
    # Tune the optimizer's learning rate: 0.01, 0.001, or 0.0001
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
# 3. Instantiate the tuner
# Keras Tuner provides four tuners: RandomSearch, Hyperband, BayesianOptimization, and Sklearn
# To use the Hyperband tuner, specify the hypermodel, the objective to optimize, and the maximum number of epochs to train (max_epochs)
tuner = kt.Hyperband(model_builder,             # the hypermodel
                     objective='val_accuracy',  # the objective to optimize
                     max_epochs=10,             # maximum number of training epochs
factor=3,
directory='my_dir',
project_name='intro_to_kt')
# Early stopping: create a callback that stops training early once the validation loss stops improving
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
# Search for the best hyperparameters
tuner.search(img_train, label_train, epochs=50, validation_split=0.2, callbacks=[stop_early])
# Get the optimal hyperparameters
best_hps=tuner.get_best_hyperparameters(num_trials=1)[0]
print(f"""
The hyperparameter search is complete.
The optimal number of units in the first densely-connected layer is {best_hps.get('units')}
and the optimal learning rate for the optimizer is {best_hps.get('learning_rate')}.
""")
'''
Trial 30 Complete [00h 00m 39s]
val_accuracy: 0.8665833473205566
Best val_accuracy So Far: 0.8912500143051147
Total elapsed time: 00h 08m 13s
INFO:tensorflow:Oracle triggered exit
The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is 448 and the optimal learning rate for the optimizer
is 0.001.
'''
# 4. Train the model
# Use the hyperparameters found by the search to determine the optimal number of epochs to train the model
model = tuner.hypermodel.build(best_hps)
history = model.fit(img_train, label_train, epochs=50, validation_split=0.2)
val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))
# Re-instantiate the hypermodel and train it with the optimal number of epochs
hypermodel = tuner.hypermodel.build(best_hps)
hypermodel.fit(img_train, label_train, epochs=best_epoch, validation_split=0.2)
# 5. Evaluate the model on the test data
eval_result = hypermodel.evaluate(img_test, label_test)
print("[test loss, test accuracy]:", eval_result)