Overview
- Model walkthrough
- Building a fully connected neural network
- Optimizing the fully connected model to improve accuracy
- The LeNet-5 model
Model walkthrough
- Inspecting the MNIST dataset
# Import Keras
from tensorflow import keras
import matplotlib.pyplot as plt
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
# Print the sizes of the training and test sets
print(train_images.shape, test_images.shape)
# Print the first image's pixel values and label, then plot the image
print(train_images[0])
print(train_labels[0])
plt.imshow(train_images[0])
plt.show()
MNIST dataset information
Contents of the first image (a 28x28 array; its label is 5)
- Flattening the images to one dimension
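The original training data consists of 60,000 images of 28x28 pixels. After flattening (row by row), it becomes 60,000 vectors of 28*28 = 784 dimensions each. The 10,000 test images are flattened the same way.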
train_images = train_images.reshape((60000, 28 * 28)).astype('float')
test_images = test_images.reshape((10000, 28 * 28)).astype('float')
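To make the row-by-row flattening concrete, here is a minimal sketch on a toy array. The two 2x3 "images" are made up for illustration and stand in for the 28x28 MNIST images; only NumPy is assumed.

```python
import numpy as np

# Two tiny 2x3 "images" stand in for the 60000 28x28 MNIST images.
images = np.array([[[1, 2, 3],
                    [4, 5, 6]],
                   [[7, 8, 9],
                    [10, 11, 12]]], dtype=np.uint8)

# reshape keeps the first axis (one entry per image) and lays each image's
# rows end to end (row-major order), as reshape((60000, 28 * 28)) does.
flat = images.reshape((images.shape[0], -1)).astype('float')

print(flat.shape)  # (2, 6)
print(flat[0])     # [1. 2. 3. 4. 5. 6.]
```

The first row of each image comes first in the flattened vector, followed by the second row, and so on.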
- One-hot encoding
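One-hot encoding, also known as 1-of-N encoding, uses an N-bit state register to encode N states: each state has its own register bit, and exactly one bit is active at any time.
One-hot encoding represents a categorical variable as a binary vector. The categorical values are first mapped to integers, and each integer is then represented as a binary vector that is all zeros except at the index of that integer, which is set to 1. Put simply, one-hot encoding turns an N-class label into N independent binary (0/1) indicators.
train_labels = keras.utils.to_categorical(train_labels)
test_labels = keras.utils.to_categorical(test_labels)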
The first training label (5) after one-hot encoding:
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
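The encoding described above can be sketched in a few lines of NumPy. The helper `one_hot` is a hypothetical name used only for illustration; it is not part of Keras, and it assumes the labels are integers in the range 0 to num_classes - 1.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    # For row i, set the column given by labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

encoded = one_hot([5, 0, 4], num_classes=10)  # example labels
print(encoded[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

For the label 5, this gives the same vector as the one shown above.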
- Defining the network model
# Build the fully connected neural network model
model = keras.Sequential()
model.add(keras.layers.Dense(units=15, activation='relu', input_shape=(28*28, ),))
model.add(keras.layers.Dense(units=10, activation='softmax'))
# Set the optimizer and loss function
# (learning_rate replaces the deprecated lr argument)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'])
# Train with fit: epochs is the number of passes over the training set,
# batch_size is the number of samples per gradient update
model.fit(train_images, train_labels, epochs=20, batch_size=128, verbose=2)
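The output layer uses softmax and the loss is categorical cross-entropy. The sketch below shows what those two pieces compute, written in plain NumPy rather than Keras. The function names and the example scores are made up for illustration, and the clipping constant `eps` is an assumption, not a value taken from the article.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Batch mean of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

logits = np.array([[2.0, 1.0, 0.1]])  # made-up scores for 3 classes
y_true = np.array([[1.0, 0.0, 0.0]])  # one-hot label: class 0

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs, loss)
```

Because the label is one-hot, only the predicted probability of the true class contributes to the loss; the loss shrinks toward 0 as that probability approaches 1.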
- Evaluating the model
# Evaluate on the test set
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
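With one-hot labels, the accuracy metric counts a prediction as correct when the class with the highest predicted probability matches the labeled class. A minimal NumPy sketch of that calculation follows; the helper `accuracy` and the small arrays are hypothetical examples, not output from the model above.

```python
import numpy as np

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples whose highest-probability class matches the label."""
    true_classes = np.argmax(y_true_one_hot, axis=1)
    pred_classes = np.argmax(y_pred_probs, axis=1)
    return float(np.mean(true_classes == pred_classes))

y_true = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]])
y_pred = np.array([[0.10, 0.80, 0.10],
                   [0.30, 0.60, 0.10],   # predicts class 1, label is class 0
                   [0.20, 0.20, 0.60],
                   [0.05, 0.90, 0.05]])

print(accuracy(y_true, y_pred))  # 0.75
```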
Building a fully connected neural network
# @function: MNIST dataset
# @Description:一只萤火虫
from tensorflow import keras
import matplotlib.pyplot as plt
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
# # Print the sizes of the MNIST training and test sets, print the first image's data, and plot it
# print(train_images.shape, test_images.shape)
# # Show the first image
# print(train_images[0])
# print(train_labels[0])
# plt.imshow(train_images[0])
# plt.show()
# Flatten each 2D image into a 1D vector, then one-hot encode the labels
train_images = train_images.reshape((60000, 28 * 28)).astype('float')
test_images = test_images.reshape((10000, 28 * 28)).astype('float')
# print(train_images[0])
train_labels = keras.utils.to_categorical(train_labels)
test_labels = keras.utils.to_categorical(test_labels)
# print(train_labels[0])
# Build the fully connected neural network model
model = keras.Sequential()
model.add(keras.layers.Dense(units=15, activation='relu', input_shape=(28*28, ),))
model.add(keras.layers.Dense(units=10, activation='softmax'))
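# Set the optimizer and loss function
# (learning_rate replaces the deprecated lr argument)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'])
# Train with fit: epochs is the number of passes over the training set,
# batch_size is the number of samples per gradient update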
model.fit(train_images, train_labels, epochs=20, batch_size=128, verbose=2)
# # Show the model structure: neurons per layer and number of trainable parameters
# print(model.summary())
# # Show the predictions for the first five test images
# p = model.predict(test_images[:5])
# print(p, test_labels[:5])
# Evaluate on the test set
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
Training and test results:
Epoch 1/20
2021-06-23 10:24:08.827897: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-06-23 10:24:09.552317: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
469/469 - 4s - loss: 2.8857 - accuracy: 0.3163
……
……
Epoch 19/20
469/469 - 1s - loss: 0.3623 - accuracy: 0.9068
Epoch 20/20
469/469 - 1s - loss: 0.3578 - accuracy: 0.9089
313/313 [==============================] - 1s 2ms/step - loss: 0.4241 - accuracy: 0.9053
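The step counts in this log follow from the dataset sizes and batch sizes. The short calculation below reproduces them, assuming the last partial batch still counts as a step and that `evaluate` was run with Keras's default batch size of 32, since the script does not pass one. `steps_per_epoch` is a hypothetical helper name.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover all samples, counting a final partial batch."""
    return math.ceil(num_samples / batch_size)

train_steps = steps_per_epoch(60000, 128)  # fit(..., batch_size=128)
test_steps = steps_per_epoch(10000, 32)    # evaluate() with its default batch size

print(train_steps, test_steps)  # 469 313
```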
Optimizing the model to improve accuracy
- Overfitting
As more fully connected layers and more neurons are added, the number of trainable parameters grows quickly.
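A rough calculation shows how fast the parameter count grows. Each Dense layer has one weight per input-output pair plus one bias per output. The sketch below compares the 784-15-10 network built earlier with the 784-128-32-10 network used in the optimized code of this article; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(layer_sizes):
    """Total weights and biases in a stack of Dense layers with the given sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

small = dense_params([784, 15, 10])       # first model in this article
large = dense_params([784, 128, 32, 10])  # optimized model in this article

print(small, large)  # 11935 104938
```

Going from one hidden layer of 15 units to hidden layers of 128 and 32 units increases the trainable parameters by roughly a factor of nine.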
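During training, past a certain point the training error keeps falling while the test error starts to rise. The model has then entered the overfitting regime: it is more complex than the problem requires, so it performs well on the training set but poorly on the test set. Measures to prevent overfitting are needed at this point.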
- Model optimization
Regularization and Dropout are used to prevent overfitting.
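As a rough illustration of the two techniques, the NumPy sketch below computes an L1 penalty (a factor times the sum of absolute weights, added to the loss) and applies inverted dropout (zero a random fraction of activations during training and rescale the rest). The function names, the weight values, and the dropout rate of 0.5 are assumptions chosen to make the effect visible; they are not the settings used in this article.

```python
import numpy as np

def l1_penalty(weights, factor):
    """L1 regularization term: factor * sum(|w|), added to the training loss."""
    return factor * float(np.sum(np.abs(weights)))

def dropout(activations, rate, rng):
    """Inverted dropout: drop each unit with probability `rate`, then rescale
    the kept units by 1 / (1 - rate) so the expected activation is unchanged."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)

weights = np.array([[0.5, -1.5], [2.0, -1.0]])  # made-up weights, sum(|w|) = 5.0
penalty = l1_penalty(weights, factor=0.0001)

activations = np.ones((1000, 32))
dropped = dropout(activations, rate=0.5, rng=rng)

print(penalty)
print(dropped.mean())
```

The L1 term pushes weights toward zero, while dropout prevents the network from relying on any single unit; at inference time dropout is switched off.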
- Code after optimization:
# @function: MNIST dataset
# @Description:一只萤火虫
from tensorflow import keras
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
# Flatten each 2D image into a 1D vector, then one-hot encode the labels
train_images = train_images.reshape((60000, 28 * 28)).astype('float')
test_images = test_images.reshape((10000, 28 * 28)).astype('float')
# print(train_images[0])
train_labels = keras.utils.to_categorical(train_labels)
test_labels = keras.utils.to_categorical(test_labels)
# print(train_labels[0])
# Build the optimized neural network model
model = keras.Sequential()
model.add(keras.layers.Dense(units=128, activation='relu', input_shape=(28*28, ),
                             kernel_regularizer=keras.regularizers.l1(0.0001)))
model.add(keras.layers.Dropout(0.01))
model.add(keras.layers.Dense(units=32, activation='relu',
                             kernel_regularizer=keras.regularizers.l1(0.0001)))
model.add(keras.layers.Dropout(0.01))
model.add(keras.layers.Dense(units=10, activation='softmax'))
# Set the optimizer and loss function
# (learning_rate replaces the deprecated lr argument)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'])
# Train with fit: epochs is the number of passes over the training set,
# batch_size is the number of samples per gradient update
model.fit(train_images, train_labels, epochs=20, batch_size=128, verbose=2)
# Evaluate on the test set
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
Training and test results:
Epoch 1/20
2021-06-23 10:36:08.883801: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-06-23 10:36:09.384156: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
469/469 - 3s - loss: 2.4091 - accuracy: 0.6500
Epoch 2/20
469/469 - 2s - loss: 0.8640 - accuracy: 0.8288
……
……
Epoch 17/20
469/469 - 2s - loss: 0.2044 - accuracy: 0.9707
Epoch 18/20
469/469 - 2s - loss: 0.2053 - accuracy: 0.9704
Epoch 19/20
469/469 - 2s - loss: 0.2034 - accuracy: 0.9702
Epoch 20/20
469/469 - 2s - loss: 0.1981 - accuracy: 0.9717
313/313 [==============================] - 1s 3ms/step - loss: 0.2491 - accuracy: 0.9619
The LeNet-5 model
# @function: LeNet-5 convolutional network
# @Description:一只萤火虫
import time
from tensorflow import keras
# Start time
start = time.time()
# Load the dataset
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
# Reshape to (samples, 28, 28, 1), scale pixel values to [0, 1], and one-hot encode the labels
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float')/255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float')/255
train_labels = keras.utils.to_categorical(train_labels)
test_labels = keras.utils.to_categorical(test_labels)
# Build a LeNet-5-style network (3x3 kernels here; the original LeNet-5 uses 5x5 kernels)
model = keras.Sequential([
    keras.layers.Conv2D(filters=6, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.AveragePooling2D((2, 2)),
    keras.layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu'),
    keras.layers.AveragePooling2D((2, 2)),
    keras.layers.Conv2D(filters=120, kernel_size=(3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(84, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])
# learning_rate replaces the deprecated lr argument
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001), loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'])
# Train with fit: epochs is the number of passes over the training set,
# batch_size is the number of samples per gradient update
model.fit(train_images, train_labels, epochs=10, batch_size=128, verbose=2)
# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print("\n", "test_loss:", test_loss, "test_accuracy:", test_accuracy)
# End time
end = time.time()
print("\n", "Execution Time:", end - start, "s")
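To see how the image shrinks through the convolution and pooling layers above, the sketch below traces the output shape of each layer by hand. It assumes the Keras defaults the script relies on: no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size for AveragePooling2D. The helper names are made up for illustration.

```python
def conv_out(size, kernel):
    """Output width of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width of unpadded pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

# (layer kind, window size, number of filters) in the order used above
layers = [
    ("conv", 3, 6),
    ("pool", 2, None),
    ("conv", 3, 16),
    ("pool", 2, None),
    ("conv", 3, 120),
]

size, channels = 28, 1  # the 28x28x1 input image
shapes = []
for kind, window, filters in layers:
    if kind == "conv":
        size, channels = conv_out(size, window), filters
    else:
        size = pool_out(size, window)
    shapes.append((size, size, channels))

flattened = size * size * channels  # length of the vector produced by Flatten

print(shapes)     # [(26, 26, 6), (13, 13, 6), (11, 11, 16), (5, 5, 16), (3, 3, 120)]
print(flattened)  # 1080
```

The Flatten layer therefore passes a 1080-dimensional vector to the Dense(84) layer; uncommenting a `model.summary()` call in the script would be one way to check these shapes.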
Training and test results:
Epoch 1/10
2021-06-23 10:50:59.639936: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-06-23 10:51:00.364340: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-06-23 10:51:00.375769: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-06-23 10:51:01.893352: I tensorflow/core/platform/windows/subprocess.cc:308] SubProcess ended with return code: 0
2021-06-23 10:51:01.963482: I tensorflow/core/platform/windows/subprocess.cc:308] SubProcess ended with return code: 0
469/469 - 9s - loss: 0.3203 - accuracy: 0.9038
……
……
Epoch 9/10
469/469 - 3s - loss: 0.0193 - accuracy: 0.9938
Epoch 10/10
469/469 - 3s - loss: 0.0169 - accuracy: 0.9945
313/313 [==============================] - 1s 4ms/step - loss: 0.0312 - accuracy: 0.9902
test_loss: 0.0312091913074255 test_accuracy: 0.9901999831199646
Execution Time: 48.035783767700195 s