TensorFlow MNIST Handwritten Digit Recognition

Latest recommended article published on 2022-12-23 23:45:26

这个昵称叫什么好呢

Tags: tensorflow, neural networks, deep learning

Article link: https://blog.csdn.net/weixin_36342174/article/details/107857621


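Dataset overview:

The MNIST dataset was introduced by Yann LeCun (now a professor at New York University) and his co-authors in the paper
"Gradient-based learning applied to document recognition".
It is one of the classic entry-level datasets in machine learning and pattern recognition, and is often described as the Hello World of deep learning. The dataset can be downloaded from its official website. It consists of four files:
60,000 training images with their labels, and 10,000 test images with their labels.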

Loading the dataset:

1. Reading with TensorFlow's bundled helper

from tensorflow.examples.tutorials.mnist import input_data
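mnist = input_data.read_data_sets("/home/xxxx/Downloads/Mnist", one_hot=True)

"/home/xxxx/Downloads/Mnist" is the directory where the downloaded dataset is stored. Note that the tensorflow.examples.tutorials module ships only with TensorFlow 1.x.
The loaded data is split into 55,000 training, 5,000 validation and 10,000 test samples, all stored as ndarray.
Visualize the first 15 training images: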

import matplotlib.pyplot as plt
import numpy as np
for i in range(15):
     plt.subplot(3,5,i+1)
     plt.imshow(mnist.train.images[i].reshape(28,28),cmap = 'Greys')
     plt.title('Label:'+str(np.nonzero(mnist.train.labels[i])[0][0]))
     plt.xticks([])
     plt.yticks([])
plt.show()

The result:
[Figure: the first 15 images of mnist.train with their labels]

2. Reading with a custom function

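import gzip
import os

import numpy as np

def load_data(data_folder):
  # Read the four gzip-compressed MNIST files from data_folder into ndarrays
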
  files = [
      'train-labels-idx1-ubyte.gz', 'train-images-idx3-ubyte.gz',
      't10k-labels-idx1-ubyte.gz', 't10k-images-idx3-ubyte.gz'
  ]

  paths = []
  for fname in files:
    paths.append(os.path.join(data_folder,fname))

  with gzip.open(paths[0], 'rb') as lbpath:
    y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)

  with gzip.open(paths[1], 'rb') as imgpath:
    x_train = np.frombuffer(
        imgpath.read(), np.uint8, offset=16).reshape(len(y_train), 28, 28)

  with gzip.open(paths[2], 'rb') as lbpath:
    y_test = np.frombuffer(lbpath.read(), np.uint8, offset=8)

  with gzip.open(paths[3], 'rb') as imgpath:
    x_test = np.frombuffer(
        imgpath.read(), np.uint8, offset=16).reshape(len(y_test), 28, 28)

  return (x_train, y_train), (x_test, y_test)

(train_images, train_labels), (test_images, test_labels) = load_data('/home/xxxx/Downloads/Mnist')
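
The offsets 8 and 16 passed to np.frombuffer skip the IDX file headers: a 4-byte magic number followed by one 4-byte big-endian size per dimension (one for label files, three for image files). As a rough sketch, assuming the standard IDX layout used by the MNIST files, the header can be parsed instead of hard-coded. The helper name parse_idx and the synthetic buffers are made up for this illustration and are not part of the original code.

```python
import struct

import numpy as np


def parse_idx(buf):
    """Decode an uncompressed IDX buffer of unsigned bytes into an ndarray.

    parse_idx is a hypothetical helper used for illustration only.
    """
    # Magic number: two zero bytes, a data-type code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack('>HBB', buf[:4])
    if zero != 0 or dtype_code != 0x08:  # 0x08 means unsigned byte
        raise ValueError('not an unsigned-byte IDX buffer')
    # One big-endian 32-bit size per dimension follows the magic number.
    dims = struct.unpack('>' + 'I' * ndim, buf[4:4 + 4 * ndim])
    header_size = 4 + 4 * ndim  # 8 for label files, 16 for image files
    data = np.frombuffer(buf, dtype=np.uint8, offset=header_size)
    return data.reshape(dims)


# Synthetic buffers standing in for real files (made-up values).
fake_labels = struct.pack('>HBBI', 0, 0x08, 1, 3) + bytes([5, 0, 4])
fake_images = struct.pack('>HBBIII', 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)

labels = parse_idx(fake_labels)
images = parse_idx(fake_images)
print(labels, images.shape)
```

With real data, the bytes returned by gzip.open(...).read() would be passed in place of the synthetic buffers.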

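'/home/xxxx/Downloads/Mnist' is the directory where the downloaded dataset is stored. The loaded data is split into 60,000 training samples and 10,000 test samples. Visualize the first 15 training images: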

fig = plt.figure()
for i in range(15):
     plt.subplot(3,5,i+1)
     plt.imshow(train_images[i], cmap='Greys')       # draw in grayscale
     plt.title("Label:" + str(train_labels[i]))      # show the label as the title
     # hide the x and y axis ticks
     plt.xticks([])
     plt.yticks([])
plt.show()

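The result:
[Figure: the first 15 images returned by the custom loader with their labels]
The two figures look different because read_data_sets sets aside the first 5,000 training samples as a validation set, so mnist.train starts at the 5,001st image, whereas the custom loader has no validation split and starts at the first. Visualizing the first 15 validation images confirms this: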

for i in range(15):
     plt.subplot(3,5,i+1)
     plt.imshow(mnist.validation.images[i].reshape(28,28),cmap = 'Greys')
     plt.title('Label:'+str(np.nonzero(mnist.validation.labels[i])[0][0]))
     plt.xticks([])
     plt.yticks([])
plt.show()

The result:
[Figure: the first 15 images of mnist.validation with their labels]
Now the two figures match.
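
The match follows from how read_data_sets builds its splits: by default it reserves the first 5,000 training samples as the validation set and keeps the rest for training. Below is a minimal NumPy sketch of that behaviour using small made-up arrays. split_validation is a hypothetical helper, not a TensorFlow function.

```python
import numpy as np


def split_validation(images, labels, validation_size):
    """Reserve the first validation_size samples as a validation set."""
    val = (images[:validation_size], labels[:validation_size])
    train = (images[validation_size:], labels[validation_size:])
    return train, val


# Toy stand-in for the 60,000 training samples: 12 samples tagged by their index.
all_images = np.arange(12).reshape(12, 1)
all_labels = np.arange(12)

(train_x, train_y), (val_x, val_y) = split_validation(all_images, all_labels, 5)
print(val_y)    # the samples at the start of the raw training file
print(train_y)  # everything after the validation block
```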

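There are other ways to write a custom loader. Whichever is used, the data must be converted to ndarray. Images must be shaped 28×28 for visualization, whereas the neural network expects flattened vectors of 28×28 = 784 values.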

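Data preparation

Flatten each image to 28×28 = 784 values, giving a 60000×784 matrix for the training set and a 10000×784 matrix for the test set, and convert the labels to one-hot encoding. This step is not needed with method 1, which already returns flattened images and one-hot labels: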

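from tensorflow.keras import utils

# Flatten each 28x28 image into a 784-element float32 vector
train_images = train_images.reshape(60000, 784).astype('float32')
test_images = test_images.reshape(10000, 784).astype('float32')
# label, count = np.unique(train_labels,    (this line is truncated in the source)

n_classes = 10
# Convert the integer labels 0-9 into one-hot vectors
train_labels = utils.to_categorical(train_labels, n_classes)
test_labels = utils.to_categorical(test_labels, n_classes)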
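
One-hot encoding is simple to reproduce in plain NumPy, which can help when checking what to_categorical returns. The sketch below is an illustrative equivalent, not the Keras implementation, and the helper names one_hot and from_one_hot are made up. It also shows the reverse step used in the plotting code of method 1, where np.nonzero recovers the digit from a one-hot row.

```python
import numpy as np


def one_hot(labels, n_classes):
    """Map integer labels to one-hot rows by indexing an identity matrix."""
    return np.eye(n_classes, dtype=np.float32)[labels]


def from_one_hot(encoded):
    """Recover integer labels: the position of the single 1 in each row."""
    return np.argmax(encoded, axis=1)


labels = np.array([5, 0, 4, 1, 9], dtype=np.uint8)  # made-up sample labels
encoded = one_hot(labels, 10)

print(encoded[0])                     # row with a 1 at index 5
print(from_one_hot(encoded))          # back to the integer labels
print(np.nonzero(encoded[0])[0][0])   # same idea as in the plotting code
```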

Normalization (optional):

train_images /= 255
test_images /= 255
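
The reshape, cast and scaling steps can be bundled into one function. This is only a sketch, with a made-up helper name (prepare_images) and random data in place of MNIST. Note that the cast to float32 has to come before the division, because NumPy rejects in-place division on a uint8 array.

```python
import numpy as np


def prepare_images(images):
    """Flatten (N, 28, 28) uint8 images to (N, 784) float32 values in [0, 1]."""
    n = images.shape[0]
    flat = images.reshape(n, -1).astype('float32')  # cast first, then divide
    return flat / 255.0


# Random pixels standing in for real MNIST images (made-up data).
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

x = prepare_images(raw)
print(x.shape, x.dtype, float(x.min()), float(x.max()))
```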

The data is now ready. The next step is to build and train the model.

Defining the network

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import regularizers

model = Sequential()
model.add(Dense(500, activation='relu', kernel_regularizer=regularizers.l2(0.001), name="Dense_1",input_shape=(784,)))
model.add(Dropout(0.1))
model.add(Dense(10, activation='softmax', kernel_regularizer=regularizers.l2(0.001), name="Dense_2"))

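The network is built with Sequential. Dense is a fully connected layer: the input has 784 features and the hidden layer has 500 units with ReLU activation. Both Dense layers use L2 weight regularization, with a Dropout layer between them. Each layer is given a name so that it can be identified when the model is saved and loaded. The output layer uses softmax to produce a probability for each of the 10 classes.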
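
To make the layer sizes concrete, here is a rough NumPy sketch of the forward pass such a network computes at inference time (Dropout is inactive then, and the L2 term only affects the training loss). The weights are random placeholders rather than trained values, and the function names are illustrative, not Keras internals.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row maximum before exponentiating for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(x, w1, b1, w2, b2):
    """784 -> 500 (ReLU) -> 10 (softmax)."""
    hidden = relu(x @ w1 + b1)
    return softmax(hidden @ w2 + b2)


rng = np.random.default_rng(0)
w1, b1 = rng.normal(0.0, 0.05, size=(784, 500)), np.zeros(500)  # placeholder weights
w2, b2 = rng.normal(0.0, 0.05, size=(500, 10)), np.zeros(10)

x = rng.random((3, 784))            # three made-up flattened images
probs = forward(x, w1, b1, w2, b2)

n_params = w1.size + b1.size + w2.size + b2.size
print(probs.shape, probs.sum(axis=1), n_params)
```

The parameter count works out to 784×500 + 500 + 500×10 + 10 = 397,510, which should agree with what model.summary() reports for the two Dense layers.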

Training the network

Compiling the model

import tensorflow as tf

# tf.train.GradientDescentOptimizer is the TensorFlow 1.x optimizer API
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.001))

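Training the model

history = model.fit(train_images, train_labels, batch_size=128, epochs=5, verbose=2, validation_data=(test_images, test_labels))

epochs sets the number of passes over the training set. Training for more epochs can reduce the loss further. The loss used here is categorical cross-entropy.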
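
Categorical cross-entropy averages the negative log of the probability the model assigns to the true class. A minimal NumPy sketch with made-up predictions follows. The eps clipping constant is an assumption added for numerical safety and is not taken from the original.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


y_true = np.eye(10)[[3, 7]]                       # two one-hot labels
uniform = np.full((2, 10), 0.1)                   # a model that knows nothing
confident = np.where(y_true == 1, 0.91, 0.01)     # a model that is mostly right

loss_uniform = categorical_crossentropy(y_true, uniform)
loss_confident = categorical_crossentropy(y_true, confident)
print(loss_uniform, loss_confident)
```

The uniform case gives ln(10), about 2.30, which is roughly where an untrained 10-class model starts. Seeing a loss near that value after training suggests the model has learned very little.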

Visualizing the training metrics

fig = plt.figure()
plt.subplot(2,1,1)
# the history key is 'acc' in TensorFlow 1.x; TensorFlow 2.x uses 'accuracy'
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train','test'], loc='lower right')

plt.subplot(2,1,2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train','test'], loc='upper right')
plt.tight_layout()

plt.show()

The result:
[Figure: Model Accuracy and Model Loss per epoch, train vs. test]

Saving the model

model_dir = '/home/xxxx/Documents/my_model_weights-01.h5'
model.save(model_dir)

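'/home/xxxx/Documents/my_model_weights-01.h5' is the save path. The model is stored as an .h5 (HDF5) file, which can be inspected with HDFView:
[Figure: the saved .h5 file opened in HDFView]

Prediction

Reading the test images: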

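import numpy as np
from PIL import Image

def read_images(path):
    # Load as 8-bit grayscale and resize to the 28x28 network input size
    img = Image.open(path).convert('L')
    img1 = np.array(img.resize((28,28))).astype(np.float32)
    threshold = 50
    # Invert so that a dark digit on a light background becomes a bright digit on black
    img1 = 255 - img1
    # Binarize: pixels below the threshold become 0, all others 255
    img1 = np.where(img1 < threshold, 0, 255).astype(np.float32)
    # Flatten to a 1x784 row vector and scale to the range [0, 1]
    nm_img = img1.reshape([1,784])
    img_ready = np.multiply(nm_img,1.0/255.0)

    return img_ready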

image_dir = ['/home/xxxx/Documents/pics/0.png',
        '/home/xxxx/Documents/pics/2.bmp',
        '/home/xxxx/Documents/pics/2.png',
        '/home/xxxx/Documents/pics/3.png',
        '/home/xxxx/Documents/pics/4.bmp',
        '/home/xxxx/Documents/pics/4.png',
        '/home/xxxx/Documents/pics/5.png',
        '/home/xxxx/Documents/pics/6.bmp',
        '/home/xxxx/Documents/pics/7.png',
        '/home/xxxx/Documents/pics/9.png']

image = np.zeros((10,784))
for i in range(len(image_dir)):
    image[i] = read_images(image_dir[i]).astype(np.float32)

    if (i == 1 or i == 4 or i == 7):        # handle the .bmp images
        image[i] = 1- image[i]

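The test images are black digits on a white background and, after thresholding, contain only the values 0 and 1, whereas the training data holds fractional values between 0 and 1 (because of the division by 255). I think binarizing the training data in the same way might improve accuracy.
Three of my test images are .bmp files with white digits on a black background, so they are inverted once more.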
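
The idea of binarizing the training data to match the test images could be tried along these lines. This is an untested suggestion sketched in NumPy. The helper names and the 0.5 threshold are assumptions, and whether it actually improves accuracy would have to be measured.

```python
import numpy as np


def binarize(images, threshold=0.5):
    """Map values in [0, 1] to exactly 0.0 or 1.0 (threshold is an assumed value)."""
    return (images >= threshold).astype(np.float32)


def invert(images):
    """Swap foreground and background for images scaled to [0, 1]."""
    return 1.0 - images


sample = np.array([[0.0, 0.2, 0.5, 0.9, 1.0]], dtype=np.float32)  # made-up pixel row

binary = binarize(sample)
flipped = invert(binary)
print(binary, flipped)
```

Applied to the tutorial's data, binarize would be called on train_images and test_images after the division by 255.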

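Loading the model:

from tensorflow.keras.models import load_model
loaded_model = load_model(model_dir)   # use a separate name so the load_model function is not overwritten

Predicting:

predict = loaded_model.predict(image)

predict holds the class probabilities for each test image. The class with the highest probability is taken as the prediction:

idx = np.argmax(predict,axis=1)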
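
As a small illustration with made-up probabilities (not actual model output), np.argmax picks the index of the largest entry in each row, and np.max gives the corresponding confidence. The names predict_demo, idx_demo and conf_demo are used only in this sketch.

```python
import numpy as np

# Two made-up softmax outputs, one row per image; each row sums to 1.
predict_demo = np.array([
    [0.05, 0.05, 0.70, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02],
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.02, 0.01],
])

idx_demo = np.argmax(predict_demo, axis=1)   # predicted digit per row
conf_demo = np.max(predict_demo, axis=1)     # probability of that digit

print(idx_demo, conf_demo)
```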

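Printing the predictions

print('Image 1 (digit 0) was recognized as:',idx[0])
print('Image 2 (digit 2) was recognized as:',idx[1])
print('Image 3 (digit 2) was recognized as:',idx[2])
print('Image 4 (digit 3) was recognized as:',idx[3])
print('Image 5 (digit 4) was recognized as:',idx[4])
print('Image 6 (digit 4) was recognized as:',idx[5])
print('Image 7 (digit 5) was recognized as:',idx[6])
print('Image 8 (digit 6) was recognized as:',idx[7])
print('Image 9 (digit 7) was recognized as:',idx[8])
print('Image 10 (digit 9) was recognized as:',idx[9])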
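
The ten print statements can also be written as a loop that compares each prediction against the digit each file is meant to show, which makes it easy to report an accuracy as well. In this sketch the expected digits are taken from the print statements, while idx_demo is a made-up stand-in for idx so that the snippet runs on its own.

```python
import numpy as np

expected = np.array([0, 2, 2, 3, 4, 4, 5, 6, 7, 9])   # digits shown in the ten test files
idx_demo = np.array([0, 2, 2, 3, 4, 4, 5, 6, 1, 9])   # made-up predictions with one mistake

for n, (want, got) in enumerate(zip(expected, idx_demo), start=1):
    status = 'ok' if want == got else 'wrong'
    print('Image %d (digit %d) was recognized as: %d [%s]' % (n, want, got, status))

# Fraction of the ten images that were classified correctly.
accuracy = float(np.mean(expected == idx_demo))
print('accuracy:', accuracy)
```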

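The result:
[Figure: printed predictions for the ten test images]
This prediction was obtained with a model trained for 500 epochs, not the 5 shown in the fit call above.

That is the whole process. I am a beginner, so please forgive anything that is not well written.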
