1. Problem Definition
(1) Read images of handwritten Arabic digits, treating each pixel of an image as a feature. The data come from the MNIST dataset: 60,000 training samples plus another 10,000 test samples, each a 28×28 bitmap of a single Arabic digit.
(2) Build a neural network model and use gradient descent to solve for the model's parameter values, commonly called weights.
(3) Use the model to estimate the probability that each image is each of the digits 0–9, and take the digit with the highest probability as the prediction.
2. Writing the Code
The machine learning workflow is shown below:
(Figure: machine learning workflow)
Handwritten Arabic Digit Recognition: Complete Walkthrough
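Step (3) can be sketched in a few lines: given a model's raw outputs (logits) for one image, softmax turns them into probabilities over the digits 0–9, and the prediction is the index of the largest one. The logits below are made-up numbers purely for illustration.

```python
import numpy as np

# Hypothetical raw outputs (logits) for one image, one score per digit 0-9
logits = np.array([1.2, 0.3, 0.8, 2.5, 0.1, 0.4, 0.2, 5.0, 0.6, 0.9])

# Softmax: exponentiate and normalize so the scores sum to 1
probs = np.exp(logits) / np.exp(logits).sum()

# The predicted digit is the index of the highest probability
prediction = int(np.argmax(probs))
print(prediction)  # -> 7, since index 7 holds the largest logit (5.0)
```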
Step 1: Load the MNIST handwritten digit dataset
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
plt.switch_backend('TkAgg')

mnist = tf.keras.datasets.mnist
# Load the MNIST handwritten digit dataset
(x_train, y_train), (x_test, y_text) = mnist.load_data()
# Dimensions of the training/test x/y data
print(x_train.shape, y_train.shape, x_test.shape, y_text.shape)
Execution result: we get 60,000 training samples and 10,000 test samples; each sample is a 28×28 bitmap of one Arabic digit. Note that the dimensions and size of the data must match the model's input specification. The output:
(60000, 28, 28) (60000,) (10000, 28, 28) (10000,)
Step 2: Explore the dataset
# Digits of the first 10 training images
print(y_train[:10])
# Show the pixel values of the first image
print(x_train[0])
The output is shown below. Each pixel value lies between 0 and 255; this is a grayscale image in which 0 is black (the background) and 255 is white (the pen stroke), the same convention as RGB color codes.
[[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 3 18 18 18 126 136 175 26 166
255 247 127 0 0 0 0]
[ 0 0 0 0 0 0 0 0 30 36 94 154 170 253 253 253 253 253 225 172 253
242 195 64 0 0 0 0]
[ 0 0 0 0 0 0 0 49 238 253 253 253 253 253 253 253 253 251 93 82 82 56
39 0 0 0 0 0]
[ 0 0 0 0 0 0 0 18 219 253 253 253 253 253 198 182 247 241 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 80 156 107 253 253 205 11 0 43 154 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 14 1 154 253 90 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 139 253 190 2 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 11 190 253 70 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 35 241 225 160 108 1 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 81 240 253 253 119 25 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 45 186 253 253 150 27 0
0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 93 252 253 187 0
0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 249 253 249 64
0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 46 130 183 253 253 207 2
0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 39 148 229 253 253 253 250 182 0
0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 24 114 221 253 253 253 253 201 78 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 23 66 213 253 253 253 253 198 81 2 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 18 171 219 253 253 253 253 195 80 9 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 55 172 226 253 253 253 253 244 133 11 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 136 253 253 253 212 135 132 16 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]]
Convert every nonzero value to 1 to get a black-and-white rendering:
# Convert nonzero values to 1
data = x_train[0].copy()
data[data > 0] = 1
# Print the binarized 2-D array row by row
text_image = []
for i in range(data.shape[0]):
    text_image.append(str(data[i]))
print(text_image)
['[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0]',
'[0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0]',
'[0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0]',
'[0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]',
'[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]']
Display the image of the first training sample:
# First training sample
x2 = x_train[0, :, :]
# Draw the bitmap; cmap='gray' selects the grayscale colormap
plt.imshow(x2, cmap='gray')
# Hide the axes
plt.axis('off')
plt.show()
Step 3: Feature engineering: scale the features into the (0, 1) interval. Feature scaling can improve model accuracy and speed up convergence. Here we use min-max normalization: (x - min) / (max - min).
# Feature scaling via min-max normalization; for MNIST min = 0 and max = 255
x_train_norm, x_test_norm = x_train / 255.0, x_test / 255.0
print(x_test_norm[0])
[[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0.32941176 0.7254902
0.62352941 0.59215686 0.23529412 0.14117647 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0.87058824 0.99607843
0.99607843 0.99607843 0.99607843 0.94509804 0.77647059 0.77647059 0.77647059 0.77647059
0.77647059 0.77647059 0.77647059 0.77647059 0.66666667 0.20392157 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0.2627451 0.44705882
0.28235294 0.44705882 0.63921569 0.89019608 0.99607843 0.88235294 0.99607843 0.99607843
0.99607843 0.98039216 0.89803922 0.99607843 0.99607843 0.54901961 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.06666667 0.25882353 0.05490196 0.2627451 0.2627451
0.2627451 0.23137255 0.08235294 0.9254902 0.99607843 0.41568627 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0.3254902 0.99215686 0.81960784 0.07058824 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0.08627451 0.91372549 1. 0.3254902 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0.50588235 0.99607843 0.93333333 0.17254902 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0.23137255 0.97647059 0.99607843 0.24313725 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0.52156863 0.99607843 0.73333333 0.01960784 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.03529412
0.80392157 0.97254902 0.22745098 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.49411765
0.99607843 0.71372549 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0.29411765 0.98431373
0.94117647 0.22352941 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0.0745098 0.86666667 0.99607843
0.65098039 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.01176471 0.79607843 0.99607843 0.85882353
0.1372549 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.14901961 0.99607843 0.99607843 0.30196078
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.12156863 0.87843137 0.99607843 0.45098039 0.00392157
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.52156863 0.99607843 0.99607843 0.20392157 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0.23921569 0.94901961 0.99607843 0.99607843 0.20392157 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0.4745098 0.99607843 0.99607843 0.85882353 0.15686275 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0.4745098 0.99607843 0.81176471 0.07058824 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
[0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]]
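As a quick sanity check, the min-max formula (x - min) / (max - min) reduces to x / 255 whenever the pixel values span the full 0–255 range. A toy array (made up here, not real MNIST data) shows the two forms agree:

```python
import numpy as np

# Toy "image" covering the full uint8 range 0..255
x = np.array([0, 18, 126, 255], dtype=np.uint8)

# General min-max normalization
minmax = (x - x.min()) / (x.max() - x.min())

# The shortcut used above, valid when min = 0 and max = 255
shortcut = x / 255.0

print(minmax)
print(np.allclose(minmax, shortcut))  # -> True
```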
Step 4: Split the data into a training set and a test set. MNIST already ships pre-split, so this step is not needed here.
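For datasets that do not come pre-split, the split can be done by shuffling indices and slicing. The sketch below uses a made-up toy array rather than MNIST:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy dataset: 100 samples with 4 features each, plus integer labels
X = rng.normal(size=(100, 4))
y = rng.integers(0, 10, size=100)

# Shuffle the indices, then take 80% for training and 20% for testing
idx = rng.permutation(len(X))
cut = int(len(X) * 0.8)
X_tr, X_te = X[idx[:cut]], X[idx[cut:]]
y_tr, y_te = y[idx[:cut]], y[idx[cut:]]

print(X_tr.shape, X_te.shape)  # -> (80, 4) (20, 4)
```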
Step 5: Build the model structure
Keras offers two kinds of models: the sequential model (Sequential Model) and the Functional API model. The sequential model, tf.keras.models.Sequential, suits simple architectures in which the layers execute one after another in order; the Functional API supports more complex structures, including multiple input or output layers, and allows branching. Here we use the sequential model.
# Build the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
① Flatten layer: flattens the 28×28-pixel image into a one-dimensional array (28×28 = 784 features).
② Dense layer: takes the previous layer's output as input and outputs 128 neurons, effectively 128 regression lines, each over the 784 input features. The layer width is typically set to a multiple of 4.
③ Dropout layer: similar to regularization, it aims to avoid overfitting by randomly dropping a fraction (0.2) of the neurons during each training pass. This means fewer parameters are updated at a time, and the result approximates an average over many sub-models, reducing sensitivity to extreme values and thereby correcting overfitting.
④ Second Dense layer: the output layer. Since we need to recognize the ten digits 0–9, the output size is set to 10. The softmax activation function converts the outputs into probabilities, i.e. a probability for each of the digits 0–9, from which the largest is taken as the prediction.
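The layer sizes above determine the parameter count directly: each Dense neuron has one weight per input plus a bias, while Flatten and Dropout add no trainable parameters. The arithmetic below reproduces the totals that model.summary() would report for this architecture:

```python
# Flatten: 28*28 = 784 inputs, no trainable parameters
inputs = 28 * 28

# First Dense layer: 128 neurons, each with 784 weights and 1 bias
dense1 = inputs * 128 + 128   # -> 100480

# Dropout: no trainable parameters

# Output Dense layer: 10 neurons, each with 128 weights and 1 bias
dense2 = 128 * 10 + 10        # -> 1290

total = dense1 + dense2
print(total)                  # -> 101770
```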
Step 6: Define the optimizer, loss function, and evaluation metrics
# Set the loss function, optimizer, and evaluation metrics
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
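sparse_categorical_crossentropy takes integer labels directly (e.g. 7) rather than one-hot vectors, and for a single sample it is simply the negative log of the probability the model assigned to the true class. A numpy sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax output for one image (the 10 values sum to 1)
probs = np.array([0.01, 0.01, 0.02, 0.05, 0.01, 0.02, 0.03, 0.80, 0.03, 0.02])
label = 7  # integer label, exactly what sparse_categorical_crossentropy expects

# Sparse categorical crossentropy for one sample: -log(p[true class])
loss_sparse = -np.log(probs[label])

# Equivalent computation with a one-hot label (plain categorical_crossentropy)
one_hot = np.eye(10)[label]
loss_onehot = -np.sum(one_hot * np.log(probs))

print(round(loss_sparse, 4))
print(np.isclose(loss_sparse, loss_onehot))  # -> True
```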
Step 7: Train the model
# Train the model
history = model.fit(x_train_norm, y_train, epochs=5, validation_split=0.2)
① validation_split: carves a portion (here 0.2) of the training set off as a validation set; during training, accuracy and loss are computed on it.
② epochs: the number of training cycles to run; one full forward and backward pass over all the data is one epoch.
③ The output is shown below: each epoch reports the training loss and accuracy along with the validation loss and accuracy.
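The "1500/1500" in the log below is the number of batches per epoch, and it follows from the settings above: 20% of the 60,000 training samples are held out for validation, and Keras's default batch size is 32. A quick check:

```python
# 60,000 training samples, with 20% held out for validation
train_samples = int(60_000 * (1 - 0.2))

# Keras's fit() uses a default batch size of 32
batch_size = 32
steps_per_epoch = train_samples // batch_size

print(train_samples)    # -> 48000
print(steps_per_epoch)  # -> 1500, matching the "1500/1500" in the log
```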
Epoch 1/5
1500/1500 [==============================] - 8s 5ms/step - loss: 0.3246 - accuracy: 0.9071 - val_loss: 0.1548 - val_accuracy: 0.9568
Epoch 2/5
1500/1500 [==============================] - 7s 5ms/step - loss: 0.1555 - accuracy: 0.9538 - val_loss: 0.1183 - val_accuracy: 0.9648
Epoch 3/5
1500/1500 [==============================] - 7s 4ms/step - loss: 0.1172 - accuracy: 0.9646 - val_loss: 0.1081 - val_accuracy: 0.9686
Epoch 4/5
1500/1500 [==============================] - 6s 4ms/step - loss: 0.0956 - accuracy: 0.9706 - val_loss: 0.0949 - val_accuracy: 0.9721
Epoch 5/5
1500/1500 [==============================] - 6s 4ms/step - loss: 0.0802 - accuracy: 0.9755 - val_loss: 0.0862 - val_accuracy: 0.9747
Plot the accuracy over the course of training:
plt.figure(figsize=(8, 6))
plt.plot(history.history['accuracy'], 'r', label='training accuracy')
plt.plot(history.history['val_accuracy'], 'g', label='validation accuracy')
plt.legend()
plt.show()
Plot the loss over the course of training:
plt.figure(figsize=(8, 6))
plt.plot(history.history['loss'], 'r', label='training loss')
plt.plot(history.history['val_loss'], 'g', label='validation loss')
plt.legend()
plt.show()
Step 8: Score the model using the evaluate() function, which takes the test set and computes the loss and accuracy.
# Evaluate
score = model.evaluate(x_test_norm, y_text, verbose=0)
for name, value in zip(model.metrics_names, score):
    print(f'{name}: {value:.4f}')
The result: loss: 0.0763, accuracy: 0.9781
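The accuracy that evaluate() reports is just the fraction of test images whose argmax prediction matches the label. A small numpy sketch with made-up softmax outputs:

```python
import numpy as np

# Made-up softmax outputs for 4 images (10 classes each)
probs = np.array([
    [0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03],  # predicts 2
    [0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.80, 0.03, 0.03],  # predicts 7
    [0.70, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.04, 0.04, 0.04],  # predicts 0
    [0.02, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.04, 0.04],  # predicts 1
])
labels = np.array([2, 7, 0, 9])  # the last label deliberately mismatches

# Accuracy = fraction of samples where the argmax prediction equals the label
preds = probs.argmax(axis=1)
accuracy = (preds == labels).mean()
print(accuracy)  # -> 0.75 (3 of 4 correct)
```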
The complete code:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
plt.switch_backend('TkAgg')

mnist = tf.keras.datasets.mnist
# Load the MNIST handwritten digit dataset
(x_train, y_train), (x_test, y_text) = mnist.load_data()
# Dimensions of the training/test x/y data
print(x_train.shape, y_train.shape, x_test.shape, y_text.shape)
# Digits of the first 10 training images
# print(y_train[:10])
# print(x_train[0])
# Convert nonzero values to 1
data = x_train[0].copy()
data[data > 0] = 1
# Print the binarized 2-D array row by row
text_image = []
for i in range(data.shape[0]):
    text_image.append(str(data[i]))
print(text_image)
# First training sample
x2 = x_train[0, :, :]
# Draw the bitmap; cmap='gray' selects the grayscale colormap
plt.imshow(x2, cmap='gray')
# Hide the axes
plt.axis('off')
plt.show()
# Feature scaling via min-max normalization; for MNIST min = 0 and max = 255
x_train_norm, x_test_norm = x_train / 255.0, x_test / 255.0
print(x_test_norm[0])
# Build the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
# Set the loss function, optimizer, and evaluation metrics
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train the model
history = model.fit(x_train_norm, y_train, epochs=5, validation_split=0.2)
# Plot training/validation accuracy
plt.figure(figsize=(8, 6))
plt.plot(history.history['accuracy'], 'r', label='training accuracy')
plt.plot(history.history['val_accuracy'], 'g', label='validation accuracy')
plt.legend()
plt.show()
# Plot training/validation loss
plt.figure(figsize=(8, 6))
plt.plot(history.history['loss'], 'r', label='training loss')
plt.plot(history.history['val_loss'], 'g', label='validation loss')
plt.legend()
plt.show()
# Evaluate
score = model.evaluate(x_test_norm, y_text, verbose=0)
for name, value in zip(model.metrics_names, score):
    print(f'{name}: {value:.4f}')
That wraps up this TensorFlow example of recognizing handwritten Arabic digits. Feel free to discuss and learn from it together, and if you spot any mistakes, corrections are very welcome.