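First, consider the problems with fully connected neural networks:
- A fully connected network needs a Flatten layer, which flattens each 28×28 image (an image that carries information along both the horizontal and vertical axes) into a one-dimensional vector of shape (None, 784), discarding the spatial structure.
- A fully connected network may have few layers, yet its parameter count is large, because every neuron in a layer is connected to every neuron in the previous layer.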
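To make the parameter problem concrete, here is a minimal sketch that compares the parameter count of one dense layer with that of one convolution layer. The 128-unit hidden layer is an assumed size chosen only for illustration, and the helper names `dense_params` and `conv2d_params` are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution layer."""
    return kernel_h * kernel_w * in_channels * filters + filters


# Assumed for illustration: a 128-unit hidden layer on the flattened 28x28 image.
dense_count = dense_params(28 * 28, 128)
# 16 filters of size 3x3 applied to a single-channel image.
conv_count = conv2d_params(3, 3, 1, 16)

print(dense_count)  # 100480
print(conv_count)   # 160
```

A convolution kernel is reused at every image position, so its parameter count does not depend on the image size, whereas the dense layer grows with every input pixel.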
Handwritten digit recognition with a convolutional neural network
1. Load the data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD,Adam
%matplotlib inline
(x_train,y_train),(x_test,y_test) = tf.keras.datasets.mnist.load_data(path='mnist.npz')
# Preprocessing
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]
y_train = tf.keras.utils.to_categorical(y_train)  # convert labels to one-hot vectors
y_test = tf.keras.utils.to_categorical(y_test)
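For readers unfamiliar with one-hot encoding, the sketch below reproduces the idea in plain NumPy. The helper `one_hot` is hypothetical and only mirrors what `to_categorical` does for integer class labels; it is not the Keras implementation.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


encoded = one_hot([5, 0, 4], num_classes=10)
print(encoded.shape)  # (3, 10)
print(encoded[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```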
2. Build the CNN model
model = tf.keras.models.Sequential([
    # First 2D convolution: 16 kernels of size 3×3, ReLU activation
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    # Max pooling
    tf.keras.layers.MaxPooling2D((2, 2)),
    # Second convolution: 32 kernels of size 3×3
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    # Third convolution: 16 kernels of size 3×3
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),
    # Flatten the feature maps into a vector
    tf.keras.layers.Flatten(),
    # Fully connected layer with 20 units
    tf.keras.layers.Dense(20, activation='relu'),
    # Fully connected output layer with 10 units, one per digit class
    tf.keras.layers.Dense(10, activation='softmax')
])
print(model.summary())
# Model: "sequential"
# _________________________________________________________________
# Layer (type) Output Shape Param #
# =================================================================
# conv2d (Conv2D) (None, 26, 26, 16) 160
# _________________________________________________________________
# max_pooling2d (MaxPooling2D) (None, 13, 13, 16) 0
# _________________________________________________________________
# conv2d_1 (Conv2D) (None, 11, 11, 32) 4640
# _________________________________________________________________
# max_pooling2d_1 (MaxPooling2 (None, 5, 5, 32) 0
# _________________________________________________________________
# conv2d_2 (Conv2D) (None, 3, 3, 16) 4624
# _________________________________________________________________
# flatten (Flatten) (None, 144) 0
# _________________________________________________________________
# dense (Dense) (None, 20) 2900
# _________________________________________________________________
# dense_1 (Dense) (None, 10) 210
# =================================================================
# Total params: 12,534
# Trainable params: 12,534
# Non-trainable params: 0
# _________________________________________________________________
# None
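Parameter counts:
- First convolution layer (16 kernels of 3×3, 1 input channel): 3 × 3 × 1 × 16 + 16 = 160
- Max pooling layers (MaxPooling2D): 0
- Second convolution layer (32 kernels of 3×3, 16 input channels): 3 × 3 × 16 × 32 + 32 = 4640
- Third convolution layer (16 kernels of 3×3, 32 input channels): 3 × 3 × 32 × 16 + 16 = 4624
- Flatten layer: 0
- First fully connected layer: 144 × 20 + 20 = 2900
- Second fully connected layer: 20 × 10 + 10 = 210
- Total parameters: 160 + 4640 + 4624 + 2900 + 210 = 12534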
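The output shapes and parameter counts in the summary can be checked by hand. The sketch below does this in plain Python, assuming the Keras defaults used by this model (no padding, convolution stride 1, pooling stride equal to the pool size); all helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1


def pool_out(size, pool):
    """Output length of non-overlapping pooling; leftover rows/columns are dropped."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a square-kernel 2D convolution."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


size = 28
size = conv_out(size, 3)  # 26
size = pool_out(size, 2)  # 13
size = conv_out(size, 3)  # 11
size = pool_out(size, 2)  # 5
size = conv_out(size, 3)  # 3
flat = size * size * 16   # the last convolution outputs 16 channels

total = (
    conv_params(3, 1, 16)
    + conv_params(3, 16, 32)
    + conv_params(3, 32, 16)
    + dense_params(flat, 20)
    + dense_params(20, 10)
)

print(flat)   # 144
print(total)  # 12534
```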
print(x_train.shape)
x_train = x_train.reshape(-1, 28, 28, 1)  # -1 lets NumPy infer the number of images; the trailing 1 is the channel axis
x_train.shape
# out>>>
# (60000, 28, 28)
# (60000, 28, 28, 1)
x_test = x_test.reshape(-1,28,28,1)
x_test.shape
# out>>> (10000, 28, 28, 1)
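As a small aside on the reshape step, the sketch below shows on a dummy array that `reshape(-1, 28, 28, 1)` only appends a channel axis and leaves the pixel values untouched. The `batch` array is a hypothetical stand-in for a few grayscale images, not the MNIST data.

```python
import numpy as np

# Hypothetical stand-in for a small batch of 28x28 grayscale images.
batch = np.arange(5 * 28 * 28, dtype=np.float32).reshape(5, 28, 28)

reshaped = batch.reshape(-1, 28, 28, 1)    # -1 is inferred as 5
expanded = np.expand_dims(batch, axis=-1)  # equivalent: append a trailing axis

print(reshaped.shape)                       # (5, 28, 28, 1)
print(np.array_equal(reshaped, expanded))   # True
```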
3. Train the model
# Set the optimizer, loss function, and metric
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.categorical_crossentropy,
    metrics=['acc']
)
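To show what the categorical cross-entropy loss measures, here is a simplified NumPy sketch. It is not the Keras implementation; the function below, the `eps` clipping value, and the three-class toy data are assumptions made for illustration.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid log(0); eps is an assumed small constant.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(per_sample))


# Toy example with 3 classes: the first prediction is confident and correct,
# the second is correct but less confident.
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.5, 0.3, 0.2]])

loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))  # 0.4581
```

Only the probability assigned to the true class contributes to each sample's loss, so the loss falls toward 0 as that probability approaches 1.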
history = model.fit(x_train,y_train,epochs=10)
history.history['acc']
# out>>>
# [0.9101667,
# 0.9771,
# 0.9823,
# 0.98545,
# 0.9877,
# 0.98895,
# 0.99011666,
# 0.991,
# 0.99233335,
# 0.99245]
# Plot the training accuracy per epoch
plt.plot(history.history['acc'])
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.show()