Fully Connected Neural Network: MNIST Handwritten Digit Recognition

爱吃菠菜

Category: tensorflow. Tags: MNIST handwritten digit recognition, TensorFlow, convolutional neural network

Article link: https://blog.csdn.net/weixin_43765314/article/details/86770741

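The MNIST Dataset

The MNIST dataset contains 70,000 images of handwritten digits, white strokes on a black background: 55,000 for training, 5,000 for validation, and 10,000 for testing. Each image is 28×28 pixels. A pure black pixel has the value 0 and a pure white pixel has the value 1; for floating-point values in between, the closer to 1, the whiter the pixel. Each label is a one-dimensional array of length 10, in which the value at index i is the probability that the image shows digit i.

Each image is first flattened into a one-dimensional array of length 28×28 = 784 before being fed to the network, and each image has a corresponding label array of length 10 (784 features, 10 label entries).

For example:

The image [0. 0. 0. 0. 0.231 0.235 0.459 ... 0.219 0. 0. 0. 0.] has the label [0,0,0,0,0,0,1,0,0,0]. The element at index 6 is 1, meaning the digit 6 has probability 100%, so the recognition result for this image is 6.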
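The flattening and the one-hot label described above can be sketched in plain Python. This is only an illustration of the data layout, not how the dataset loader works internally; the helper names `one_hot`, `decode` and `flatten` are made up for this sketch.

```python
def one_hot(digit, num_classes=10):
    """Return a one-hot label list with 1.0 at index `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit out of range")
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]


def decode(label):
    """Return the index of the largest entry, i.e. the digit the label stands for."""
    return max(range(len(label)), key=lambda i: label[i])


def flatten(image):
    """Flatten a 2-D list of pixel rows into one 1-D list, row by row."""
    return [pixel for row in image for pixel in row]


label = one_hot(6)
image = [[0.0] * 28 for _ in range(28)]  # an all-black 28x28 stand-in image
features = flatten(image)
print(label)                          # 1.0 only at index 6
print(decode(label), len(features))   # digit and feature count
```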

Some building blocks used below:

# Read the dataset with one-hot encoded labels; mnist is split into train, validation and test sets:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./data/", one_hot=True)

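tf.get_collection("")	Returns a list of all values stored in the named collection
tf.add_n([ ])	Adds the tensors in the list element-wise
tf.cast(x,dtype)	Converts x to type dtype
tf.argmax(x,axis)	Returns the index of the largest value along dimension axis
os.path.join()	Joins its string arguments using the platform's path rules
string.split("")	Splits the string on the given separator and returns a list
with tf.Graph().as_default() as g:	Adds the nodes defined inside the block to the computation graph g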
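As a small illustration of the two string helpers in the table, the sketch below builds a checkpoint path and pulls the step number back out of it, which is how the test module later recovers the training step. The function names are hypothetical; the "-<step>" suffix mirrors what `saver.save(..., global_step=...)` appends to the path prefix.

```python
import os

MODEL_SAVE_PATH = "./model/"   # same values as in mnist_backward.py
MODEL_NAME = "mnist_model"


def checkpoint_prefix(step):
    """Build a path such as ./model/mnist_model-49001 (hypothetical helper)."""
    return os.path.join(MODEL_SAVE_PATH, MODEL_NAME) + "-" + str(step)


def step_from_checkpoint(path):
    """Take the file name after the last '/', then the part after the last '-'."""
    return path.split("/")[-1].split("-")[-1]


path = checkpoint_prefix(49001)
print(path)
print(step_from_checkpoint(path))
```

Note that the extracted step is a string, which is why the test module prints it with `%s`.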

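During backpropagation the model is saved every fixed number of steps. Each save produces three files: a .meta file holding the current graph structure, a .index file holding the current parameter names, and a .data file holding the current parameter values. This can be written as: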

saver = tf.train.Saver()  # instantiate the saver used to store the model
with tf.Session() as sess:
    for i in range(STEPS):
        if i % 1000 == 0:
            saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)

To test the network, the trained model has to be loaded, which can be written as:

ckpt = tf.train.get_checkpoint_state(MODEL_SAVE_PATH)  # MODEL_SAVE_PATH is the directory the model was saved to
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)

Loading the moving averages of the parameters:

ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY)  # the moving-average decay rate
ema_restore = ema.variables_to_restore()
saver = tf.train.Saver(ema_restore)
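For context on what is being restored here: each trainable variable has a "shadow" copy that is updated as shadow = decay * shadow + (1 - decay) * variable. When a step counter is passed, as the training module does with global_step, the effective decay is min(decay, (1 + step) / (10 + step)), so the average follows the variable closely early in training. The plain-Python sketch below illustrates that arithmetic only; the function names are made up and it does not touch TensorFlow.

```python
def ema_decay(base_decay, num_updates=None):
    """Effective decay; with a step counter it is capped early in training."""
    if num_updates is None:
        return base_decay
    return min(base_decay, (1.0 + num_updates) / (10.0 + num_updates))


def ema_update(shadow, value, decay):
    """One moving-average update of the shadow value."""
    return decay * shadow + (1.0 - decay) * value


shadow = 0.0
for step, value in enumerate([1.0, 1.0, 1.0]):
    shadow = ema_update(shadow, value, ema_decay(0.99, step))
    print(step, shadow)
```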

The network is evaluated by computing its accuracy on a set of data:

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

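Walking through this: tf.argmax() returns the index of the largest value in y and in y_, i.e. the predicted digit and the true digit. tf.equal() compares the two and returns True where they match and False where they differ. tf.cast() turns these booleans into 1.0 and 0.0, and tf.reduce_mean() averages them, which gives the accuracy.

For example, with two samples where one prediction is wrong (tf.equal() gives False, cast to 0) and the other is right (True, cast to 1), the mean is 0.5, so the accuracy is 0.5.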
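The two-sample example can be reproduced in plain Python. This is a sketch of the same argmax, compare, cast, average sequence with made-up three-class data; it is not the TensorFlow code itself.

```python
def argmax(row):
    """Index of the largest value in one row."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(predictions, labels):
    """Fraction of rows whose predicted index equals the labelled index."""
    correct = [float(argmax(p) == argmax(t)) for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)


y = [[0.1, 0.7, 0.2],    # predicts class 1
     [0.8, 0.1, 0.1]]    # predicts class 0
y_ = [[0.0, 1.0, 0.0],   # true class 1 (prediction correct)
      [0.0, 0.0, 1.0]]   # true class 2 (prediction wrong)
print(accuracy(y, y_))   # 0.5
```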

Fully Connected Neural Network

Forward propagation module mnist_forward.py:

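import tensorflow as tf

INPUT_NODE = 784   # each image has 784 features
OUTPUT_NODE = 10   # the output covers digits 0-9, 10 label entries
LAYER1_NODE = 500  # the hidden layer has 500 nodes

# Define the weights
def get_weight(shape, regularizer):
    # Truncated normal random values (far outliers are redrawn), standard deviation 0.1
    w = tf.Variable(tf.truncated_normal(shape, stddev=0.1))
    # If a regularization coefficient is given, add the L2 penalty
    # (coefficient * sum of squared weights / 2) to the 'losses' collection
    if regularizer is not None:
        tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w
# The returned w is the weight tensor of this layer

# Define the biases
def get_bias(shape):
    b = tf.Variable(tf.zeros(shape))  # all-zero tensor
    return b
# The returned b is the bias tensor of this layer

# Build the computation graph
def forward(x, regularizer):
    w1 = get_weight([INPUT_NODE, LAYER1_NODE], regularizer)
    b1 = get_bias([LAYER1_NODE])
    y1 = tf.nn.relu(tf.matmul(x, w1) + b1)     # ReLU activation introduces non-linearity

    w2 = get_weight([LAYER1_NODE, OUTPUT_NODE], regularizer)
    b2 = get_bias([OUTPUT_NODE])
    y = tf.matmul(y1, w2) + b2                 # the output layer has no activation (raw logits)
    return y
# The returned y is a tensor of shape [number of rows in x, 10]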
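To make the arithmetic of forward() concrete, here is a plain-Python sketch of the same two-layer computation on tiny stand-in sizes (3 inputs, 2 hidden nodes, 2 outputs instead of 784, 500 and 10). The weights are hand-picked for illustration, not trained values, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    """Add the bias vector to every row."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def relu(m):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in m]


def forward(x, w1, b1, w2, b2):
    y1 = relu(add_bias(matmul(x, w1), b1))   # hidden layer with ReLU
    return add_bias(matmul(y1, w2), b2)      # output layer, no activation


x = [[1.0, 2.0, 3.0]]                          # one sample, 3 features
w1 = [[1.0, -1.0], [0.0, 1.0], [1.0, -1.0]]    # shape [3, 2]
b1 = [0.0, 0.0]
w2 = [[1.0, 2.0], [3.0, 4.0]]                  # shape [2, 2]
b2 = [0.5, -0.5]
print(forward(x, w1, b1, w2, b2))              # [[4.5, 7.5]]
```

The hidden pre-activations are [4, -2]; ReLU zeroes the negative one, so only the first hidden node contributes to the output.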

Backpropagation module mnist_backward.py:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import mnist_forward
import os

BATCH_SIZE = 200
LEARNING_RATE_BASE = 0.1          # base learning rate
LEARNING_RATE_DECAY = 0.99        # learning-rate decay rate
REGULARIZER = 0.0001              # regularization coefficient
STEPS = 50000                     # total number of training steps
MOVING_AVERAGE_DECAY = 0.99       # moving-average decay rate
MODEL_SAVE_PATH="./model/"        # directory the model is saved to
MODEL_NAME="mnist_model"          # file name prefix of the saved model


def backward(mnist):

    x = tf.placeholder(tf.float32, [None, mnist_forward.INPUT_NODE])    # None leaves the batch size open; each row has 784 features
    y_ = tf.placeholder(tf.float32, [None, mnist_forward.OUTPUT_NODE])  # ground-truth labels, 10 entries each
    y = mnist_forward.forward(x, REGULARIZER)                           # y has shape [number of rows in x, 10]
    global_step = tf.Variable(0, trainable=False)                       # training step counter, marked as not trainable

    # Cross-entropy and loss. The sparse variant expects class indices, hence tf.argmax(y_, 1).
    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))  # softmax turns the logits into a probability distribution,
    cem = tf.reduce_mean(ce)                                                                # which is compared with the labels; then average over the batch
    loss = cem + tf.add_n(tf.get_collection('losses'))                                      # loss = cross-entropy plus the sum of the regularization terms

    # Exponentially decaying learning rate
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,                             # base learning rate
        global_step,                                    # current training step
        mnist.train.num_examples / BATCH_SIZE,          # mnist.train.num_examples is 55000, so the rate is updated once every num_examples / BATCH_SIZE steps (one pass over the training set)
        LEARNING_RATE_DECAY,                            # learning-rate decay rate
        staircase=True)

    # Training step: plain gradient descent
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)

    # Moving averages
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    ema_op = ema.apply(tf.trainable_variables())  # tf.trainable_variables() collects all trainable parameters into a list
    with tf.control_dependencies([train_step, ema_op]):  # run the training step and the moving-average update together
        train_op = tf.no_op(name='train')

    saver = tf.train.Saver()  # instantiate the saver used to store the model
    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        ckpt = tf.train.get_checkpoint_state(MODEL_SAVE_PATH)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)   # load a saved model so that training resumes from the last checkpoint

        for i in range(STEPS):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)       # pixel values and labels of BATCH_SIZE samples
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: xs, y_: ys})
            if i % 1000 == 0:
                print("After %d training step(s), loss on training batch is %g." % (step, loss_value))
                saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)


def main():
    mnist = input_data.read_data_sets("./data/", one_hot=True)  # read the dataset with one-hot labels; mnist is split into train, validation and test sets
    backward(mnist)

if __name__ == '__main__':
    main()
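The staircase learning-rate schedule set up in backward() follows the formula rate = base * decay_rate ^ floor(global_step / decay_steps). The sketch below reproduces that formula in plain Python with the constants from this module; the function is a stand-in written for illustration, not the TensorFlow implementation.

```python
LEARNING_RATE_BASE = 0.1
LEARNING_RATE_DECAY = 0.99
DECAY_STEPS = 55000 / 200   # num_examples / BATCH_SIZE = 275 steps per pass


def exponential_decay(base_rate, global_step, decay_steps, decay_rate, staircase=True):
    """Learning rate after `global_step` steps."""
    exponent = global_step / decay_steps
    if staircase:
        exponent = global_step // decay_steps   # whole number of completed intervals
    return base_rate * decay_rate ** exponent


for step in (0, 274, 275, 550):
    print(step, exponential_decay(LEARNING_RATE_BASE, step, DECAY_STEPS, LEARNING_RATE_DECAY))
```

With staircase=True the rate stays at 0.1 for the first 275 steps, drops to 0.099 at step 275, and so on; with staircase=False it would shrink a little on every step.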

Running it shows the loss going down:

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting ./data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting ./data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting ./data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting ./data/t10k-labels-idx1-ubyte.gz
After 1 training step(s), loss on training batch is 2.94652.
After 1001 training step(s), loss on training batch is 0.290871.
After 2001 training step(s), loss on training batch is 0.252061.
After 3001 training step(s), loss on training batch is 0.296243.
After 4001 training step(s), loss on training batch is 0.252405.
...
...
...
After 46001 training step(s), loss on training batch is 0.129093.
After 47001 training step(s), loss on training batch is 0.13056.
After 48001 training step(s), loss on training batch is 0.128999.
After 49001 training step(s), loss on training batch is 0.12736.
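As a reference point for reading these numbers: the logged loss is the batch-mean softmax cross-entropy plus the regularization terms. A model that assigned equal probability to all ten digits would have a cross-entropy of ln 10, about 2.303. The plain-Python sketch below computes the cross-entropy for a single sample; the function names are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(logits, label_index):
    """Negative log of the probability given to the correct class."""
    return -math.log(softmax(logits)[label_index])


uniform = [0.0] * 10                       # equal score for every digit
confident = [0.0] * 6 + [10.0] + [0.0] * 3  # strongly favours digit 6
print(sparse_softmax_cross_entropy(uniform, 6))
print(sparse_softmax_cross_entropy(confident, 6))
```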

Test module mnist_test.py:

#coding:utf-8
import time
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import mnist_forward
import mnist_backward
TEST_INTERVAL_SECS = 5  # delay between two evaluations, in seconds

def test(mnist):
    with tf.Graph().as_default() as g:                                              # rebuild the computation graph
        x = tf.placeholder(tf.float32, [None, mnist_forward.INPUT_NODE])            # placeholders for x and y_
        y_ = tf.placeholder(tf.float32, [None, mnist_forward.OUTPUT_NODE])
        y = mnist_forward.forward(x, None)                                          # forward pass gives y

        # Saver that restores the moving averages of the parameters from the saved model
        ema = tf.train.ExponentialMovingAverage(mnist_backward.MOVING_AVERAGE_DECAY)
        ema_restore = ema.variables_to_restore()
        saver = tf.train.Saver(ema_restore)

        correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))            # recognition accuracy of the model
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        while True:
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(mnist_backward.MODEL_SAVE_PATH)      # look up the latest checkpoint
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess, ckpt.model_checkpoint_path)                       # no initialization; load the trained model, assigning the moving averages to the parameters
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]  # extract the training step from the checkpoint file name
                    accuracy_score = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})  # x: test images, y_: test labels
                    print("After %s training step(s), test accuracy = %g" % (global_step, accuracy_score))
                else:
                    print('No checkpoint file found')
                    return
            time.sleep(TEST_INTERVAL_SECS)

def main():
    mnist = input_data.read_data_sets("./data/", one_hot=True)
    test(mnist)

if __name__ == '__main__':
    main()

Output:

Extracting ./data/train-images-idx3-ubyte.gz
Extracting ./data/train-labels-idx1-ubyte.gz
Extracting ./data/t10k-images-idx3-ubyte.gz
Extracting ./data/t10k-labels-idx1-ubyte.gz
After 49001 training step(s), test accuracy = 0.9805
After 49001 training step(s), test accuracy = 0.9805
After 49001 training step(s), test accuracy = 0.9805
After 49001 training step(s), test accuracy = 0.9805

Application module app.py, which recognizes handwritten digits:

#coding:utf-8

import tensorflow as tf
import numpy as np
from PIL import Image
import mnist_backward
import mnist_forward

def restore_model(testPicArr):
    with tf.Graph().as_default() as tg:
        x = tf.placeholder(tf.float32, [None, mnist_forward.INPUT_NODE])  # only x needs a placeholder
        y = mnist_forward.forward(x, None)                                # compute the output y
        preValue = tf.argmax(y, 1)                                        # predicted digit

        variable_averages = tf.train.ExponentialMovingAverage(mnist_backward.MOVING_AVERAGE_DECAY)
        variables_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variables_to_restore)                      # saver that restores the moving averages

        with tf.Session() as sess:
            ckpt = tf.train.get_checkpoint_state(mnist_backward.MODEL_SAVE_PATH)  # look up the saved model
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)           # restore w and the other parameters into this session

                preValue = sess.run(preValue, feed_dict={x: testPicArr})  # feed the image
                return preValue
            else:
                print("No checkpoint file found")
                return -1

def pre_pic(picName):
    img = Image.open(picName)                            # open the image
    reIm = img.resize((28, 28), Image.LANCZOS)           # resize to 28x28 with anti-aliasing (Image.ANTIALIAS in older Pillow, removed in Pillow 10)
    im_arr = np.array(reIm.convert('L'))                 # convert to grayscale, then to a NumPy array
    threshold = 50                                       # values below 50 become 0, all others become 255
    for i in range(28):                                  # the input is black on white, so invert it
        for j in range(28):
            im_arr[i][j] = 255 - im_arr[i][j]
            if (im_arr[i][j] < threshold):
                im_arr[i][j] = 0
            else:
                im_arr[i][j] = 255

    nm_arr = im_arr.reshape([1, 784])                  # reshape to 1 row, 784 columns
    nm_arr = nm_arr.astype(np.float32)                 # convert the pixels to floats
    img_ready = np.multiply(nm_arr, 1.0/255.0)         # scale from the range 0-255 to floats in 0.0-1.0

    return img_ready                                   # image ready for recognition

def application():
    testNum = input("input the number of test pictures:")  # how many pictures to recognize
    for i in range(int(testNum)):
        testPic = input("the path of test picture:")   # path and file name of the picture
        testPicArr = pre_pic(testPic)                  # preprocess the picture
        preValue = restore_model(testPicArr)           # feed it to the network
        print("The prediction number is:", preValue)

def main():
    application()

if __name__ == '__main__':
    main()
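What the preprocessing does to individual pixels (invert, threshold at 50, scale to the 0.0 to 1.0 range) can be checked on a tiny grayscale grid. This is a plain-Python sketch of the same per-pixel arithmetic, using a hypothetical `preprocess` function and a 2×2 stand-in instead of a real 28×28 image.

```python
def preprocess(gray_rows, threshold=50):
    """Invert, binarize and normalize a grayscale image given as rows of 0-255 ints."""
    out = []
    for row in gray_rows:
        for pixel in row:
            inverted = 255 - pixel                          # black-on-white to white-on-black
            binary = 0 if inverted < threshold else 255     # suppress faint noise
            out.append(binary / 255.0)                      # scale to 0.0 or 1.0
    return out


# White paper (255), black ink (0), and two light-gray pixels on either side of the threshold.
tiny = [[255, 0],
        [210, 200]]
print(preprocess(tiny))   # [0.0, 1.0, 0.0, 1.0]
```

The pixel 210 inverts to 45, which is below the threshold and becomes background, while 200 inverts to 55 and is kept as a stroke.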

Recognition results:

input the number of test pictures:10
the path of test picture:pic/0.png
The prediction number is: [0]
the path of test picture:pic/1.png
The prediction number is: [1]
the path of test picture:pic/2.png
The prediction number is: [2]
the path of test picture:pic/3.png
The prediction number is: [3]
the path of test picture:pic/4.png
The prediction number is: [4]
the path of test picture:pic/5.png
The prediction number is: [5]
the path of test picture:pic/6.png
The prediction number is: [6]
the path of test picture:pic/7.png
The prediction number is: [7]
the path of test picture:pic/8.png
The prediction number is: [8]
the path of test picture:pic/9.png
The prediction number is: [9]
