LeNet-5 Model and Its Implementation in TensorFlow 2.9.0
The LeNet-5 model
LeNet-5 consists of five parts: two convolution-pooling blocks plus three fully connected layers.
Since the pooling and activation layers have no learnable parameters, it can also be described as a five-layer neural network.
A TensorFlow implementation based on the LeNet-5 model is given below.
Layer 1: convolution-pooling block
The input is the original image, with size 32×32×1.
The convolution layer uses 5×5 filters, with 6 kernels, padding 0, and stride 1.
The output size of this layer is (32−5+0)/1+1=28, with 6 channels.
This convolution layer has 1×5×5×6+6=156 parameters in total, of which 6 are bias terms.
The layer's output is 28×28×6.
The convolution layer uses the ReLU activation function.
(Subsampling) This layer applies 2×2 max pooling with padding 0 and stride 2.
After pooling, the output is 14×14×6, since (28−2+0)/2+1=14.
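To double-check the arithmetic in this block, here is a minimal sketch; the helpers conv_output_size and conv_params are hypothetical, introduced only for illustration:
def conv_output_size(n, f, p, s):
    # valid-style convolution: (input - filter + 2*padding) / stride + 1
    return (n - f + 2 * p) // s + 1

def conv_params(in_channels, f, kernels):
    # each kernel holds f*f*in_channels weights, plus one bias per kernel
    return in_channels * f * f * kernels + kernels

print(conv_output_size(32, 5, 0, 1))  # 28
print(conv_params(1, 5, 6))           # 156
print(conv_output_size(28, 2, 0, 2))  # 14 (pooling follows the same formula)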
Layer 2: convolution-pooling block
The input is the feature map output by the first block, with size 14×14×6.
The convolution layer uses 5×5 filters, with 16 kernels, padding 0, and stride 1.
The output size of this layer is (14−5+0)/1+1=10, with 16 channels.
This convolution layer has 6×5×5×16+16=2,416 parameters in total, of which 16 are bias terms.
The layer's output is 10×10×16.
The convolution layer uses the ReLU activation function.
(Subsampling) This layer applies 2×2 max pooling with padding 0 and stride 2.
After pooling, the output is 5×5×16, since (10−2+0)/2+1=5.
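The same hypothetical helpers from the first block confirm these numbers:
print(conv_output_size(14, 5, 0, 1))  # 10
print(conv_params(6, 5, 16))          # 2416
print(conv_output_size(10, 2, 0, 2))  # 5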
Layer 3: fully connected layer
The input is the feature map output by the second block, with size 5×5×16.
The feature map is first flattened into one dimension before being passed to the fully connected layer, giving 5×5×16=400 input features.
This fully connected layer (fc-1 below) contains 120 neurons.
Number of parameters: 400×120+120=48,120.
This fully connected layer uses the ReLU activation function.
The output is 120 feature values.
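The parameter count of a fully connected layer follows the pattern inputs × neurons + biases. Here is a minimal sketch; dense_params is a hypothetical helper introduced only for illustration:
def dense_params(inputs, neurons):
    # one weight per input-neuron pair, plus one bias per neuron
    return inputs * neurons + neurons

print(dense_params(400, 120))  # 48120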
Layer 4: fully connected layer
The input is the 120 feature values output by the third layer.
This fully connected layer (fc-2 below) contains 84 neurons.
Number of parameters: 120×84+84=10,164.
This fully connected layer uses the ReLU activation function.
The output is 84 feature values.
Layer 5: fully connected layer
The input is the 84 feature values output by the fourth layer.
This fully connected layer (fc-3 below) contains 10 neurons (it is the output layer, so the neuron count matches the number of classes: digits 0-9, 10 classes in total).
Number of parameters: 84×10+10=850.
This layer uses the softmax activation function.
The output is 10 probability values, one for each of the 10 classes.
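Summing all five learnable layers reproduces the total that model.summary() reports further below, 61,706 trainable parameters. This sketch reuses the hypothetical helpers defined above:
total = (conv_params(1, 5, 6) + conv_params(6, 5, 16)
         + dense_params(400, 120) + dense_params(120, 84) + dense_params(84, 10))
print(total)  # 61706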
The implementation code is as follows:
import tensorflow as tf
from tensorflow.keras import Sequential, layers, optimizers, models
import numpy as np
print(tf.__version__)
# Load the dataset
mnist = tf.keras.datasets.mnist
(trainImage, trainLabel), (testImage, testLabel) = mnist.load_data()
for i in [trainImage, trainLabel, testImage, testLabel]:
    print(i.shape)
2.9.0
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
The MNIST dataset consists of handwritten digits; each digit is a grayscale image of 28×28 pixels.
This size is standardized, chosen mainly with storage space, computational resources, and ease of use in mind.
Since the goal of this article is to reproduce the LeNet network, the images are restored to 32×32 either by zero-padding or by resizing.
# Prepare the dataset and arrange the input image format
# Expand dimensions to match the Conv2D layer's input: (28, 28) -> (28, 28, 1)
x_train = trainImage[..., np.newaxis]
x_test = testImage[..., np.newaxis]
# x_train = tf.reshape(trainImage, (60000, 28, 28, 1))
# x_test = tf.reshape(testImage, (10000, 28, 28, 1))
for i in [x_train, trainLabel, x_test, testLabel]:
    print(i.shape)
(60000, 28, 28, 1)
(60000,)
(10000, 28, 28, 1)
(10000,)
# Zero padding:
x_train_padded = np.pad(x_train, ((0, 0), (2, 2), (2, 2), (0, 0)), mode='constant', constant_values=0)
x_test_padded = np.pad(x_test, ((0, 0), (2, 2), (2, 2), (0, 0)), mode='constant', constant_values=0)
# Use TensorFlow to resize the 28x28 images to 32x32
x_train_resized = tf.image.resize(x_train, [32, 32]).numpy()
x_test_resized = tf.image.resize(x_test, [32, 32]).numpy()
for i in [x_train_padded, trainLabel, x_test_padded, testLabel]:
    print(i.shape)
for i in [x_train_resized, trainLabel, x_test_resized, testLabel]:
    print(i.shape)
(60000, 32, 32, 1)
(60000,)
(10000, 32, 32, 1)
(10000,)
(60000, 32, 32, 1)
(60000,)
(10000, 32, 32, 1)
(10000,)
# Network definition
network = models.Sequential([
    # Layer 1: convolution-pooling block
    layers.Conv2D(6,
                  (5, 5),
                  activation='relu',        # activation function, e.g. 'relu', 'sigmoid', 'tanh'
                  input_shape=(32, 32, 1),  # input_shape: usually specified on the model's first layer
                  padding='valid',          # padding: 'valid' (no padding) or 'same' (pad to keep the input size)
                  strides=(1, 1)),
    layers.MaxPool2D((2, 2), strides=(2, 2), padding='valid'),
    # Layer 2: convolution-pooling block
    layers.Conv2D(16, (5, 5), activation='relu', padding='valid', strides=(1, 1)),
    layers.MaxPool2D((2, 2), strides=(2, 2), padding='valid'),
    # Flatten to one dimension for the fully connected network
    layers.Flatten(),
    # First fully connected layer
    layers.Dense(120, activation='relu'),
    # Second fully connected layer
    layers.Dense(84, activation='relu'),
    # Third fully connected layer, the output layer
    layers.Dense(10, activation='softmax')
])
network.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_2 (Conv2D) (None, 28, 28, 6) 156
max_pooling2d_1 (MaxPooling (None, 14, 14, 6) 0
2D)
conv2d_3 (Conv2D) (None, 10, 10, 16) 2416
max_pooling2d_2 (MaxPooling (None, 5, 5, 16) 0
2D)
flatten_1 (Flatten) (None, 400) 0
dense_3 (Dense) (None, 120) 48120
dense_4 (Dense) (None, 84) 10164
dense_5 (Dense) (None, 10) 850
=================================================================
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0
_________________________________________________________________
# Compile the model
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)  # set the learning rate
network.compile(optimizer=optimizer,                     # optimizer
                loss='sparse_categorical_crossentropy',  # loss function
                metrics=['accuracy'])                    # evaluation metric
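sparse_categorical_crossentropy is used because trainLabel holds integer class indices (0-9). If the labels were one-hot encoded instead, categorical_crossentropy would be the matching loss; a commented-out sketch of that alternative (not used in this article):
# trainLabel_onehot = tf.keras.utils.to_categorical(trainLabel, num_classes=10)
# network.compile(optimizer=optimizer,
#                 loss='categorical_crossentropy',
#                 metrics=['accuracy'])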
# Train the model
network.fit(x_train_padded, trainLabel, epochs=5, validation_split=0.1)  # trained on CPU, so 5 epochs are enough for a demonstration
Epoch 1/5
1688/1688 [==============================] - 26s 15ms/step - loss: 0.4310 - accuracy: 0.9133 - val_loss: 0.1263 - val_accuracy: 0.9650
Epoch 2/5
1688/1688 [==============================] - 28s 17ms/step - loss: 0.0927 - accuracy: 0.9711 - val_loss: 0.0691 - val_accuracy: 0.9797
Epoch 3/5
1688/1688 [==============================] - 26s 16ms/step - loss: 0.0693 - accuracy: 0.9788 - val_loss: 0.0720 - val_accuracy: 0.9790
Epoch 4/5
1688/1688 [==============================] - 25s 15ms/step - loss: 0.0582 - accuracy: 0.9822 - val_loss: 0.0555 - val_accuracy: 0.9850
Epoch 5/5
1688/1688 [==============================] - 26s 16ms/step - loss: 0.0504 - accuracy: 0.9848 - val_loss: 0.0521 - val_accuracy: 0.9867
<keras.callbacks.History at 0x1610265b320>
# Evaluate the model
test_loss, test_acc = network.evaluate(x_test_padded, testLabel)
print(f'Test loss: {test_loss}')
print(f'Test accuracy: {test_acc}')
313/313 [==============================] - 2s 6ms/step - loss: 0.0680 - accuracy: 0.9805
Test loss: 0.06804163753986359
Test accuracy: 0.9804999828338623
# Save the model
network.save('./LeNet_model/lenet_mnist')
print('lenet_mnist model saved')
del network
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 2 of 2). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: ./LeNet_model/lenet_mnist\assets
lenet_mnist model saved
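network.save with a directory path writes the TensorFlow SavedModel format shown above. As an alternative (an assumption about preference, not part of the original run), Keras can also serialize the whole model to a single HDF5 file:
# network.save('./LeNet_model/lenet_mnist.h5')  # single-file HDF5 format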
# Load the model
import tensorflow as tf
from tensorflow import keras
# Load the network
model = keras.models.load_model('./LeNet_model/lenet_mnist')
model.summary()
# Continue training the loaded model
# assuming train_images and train_labels are the training data
# train_images has shape (None, 32, 32, 1)
# train_labels has shape (None,)
# model.fit(train_images, train_labels, epochs=3)
# Evaluate the loaded model
# assuming test_images and test_labels are the test data
# test_images has shape (None, 32, 32, 1)
# test_labels has shape (None,)
# test_loss, test_acc = model.evaluate(test_images, test_labels)
# print(f'Test accuracy: {test_acc}')
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_2 (Conv2D) (None, 28, 28, 6) 156
max_pooling2d_1 (MaxPooling (None, 14, 14, 6) 0
2D)
conv2d_3 (Conv2D) (None, 10, 10, 16) 2416
max_pooling2d_2 (MaxPooling (None, 5, 5, 16) 0
2D)
flatten_1 (Flatten) (None, 400) 0
dense_3 (Dense) (None, 120) 48120
dense_4 (Dense) (None, 84) 10164
dense_5 (Dense) (None, 10) 850
=================================================================
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0
_________________________________________________________________
# Use the loaded model to predict data
# assuming new_images is the new data to predict
# new_images has shape (num_samples, 32, 32, 1)
predictions = model.predict(x_test_padded)
# predictions has shape (num_samples, num_classes): per-sample predicted probabilities
print(predictions.shape)
print(predictions[0])
# If you need concrete class labels, use the following
predicted_classes = tf.argmax(predictions, axis=1)
print(predicted_classes.shape)
print(predicted_classes[0])
print(testLabel[0])
313/313 [==============================] - 2s 6ms/step
(10000, 10)
[1.3117158e-14 1.0117748e-10 3.0029070e-11 7.0179540e-10 1.4930468e-13
6.9455276e-13 9.2901420e-16 1.0000000e+00 1.1821545e-09 2.3939153e-08]
(10000,)
tf.Tensor(7, shape=(), dtype=int64)
7
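As a final sanity check, the predicted classes can be compared against the ground-truth labels directly; this minimal sketch should agree with the accuracy reported by model.evaluate above:
import numpy as np
acc = np.mean(predicted_classes.numpy() == testLabel)
print(f'Accuracy from predictions: {acc}')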