Implementing MNIST Handwritten Digit Recognition with TensorFlow and the LeNet-5 Model

AI大杂烩

Article link: https://blog.csdn.net/yanchujian88/article/details/80559936


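Preface: It has been a long time since my last blog post, and I have not yet kept my promise to write up some small image recognition projects. I spent April and early May looking for an internship, which is why blogging fell behind. I kept studying throughout, though; my motivation dipped a little, but I did not give up on anything I needed to learn. So here I am, dutifully posting again. Another day I will share some advice on how to find an internship that suits you.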

----------------------------------- I am a lovely dividing line -------------------------------------

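About TensorFlow (in one sentence): TensorFlow is a general-purpose computation framework built by the Google Brain team, led by Jeff Dean, as an improvement on DistBelief, Google's first-generation internal deep learning system.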

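About LeNet-5: The LeNet-5 model was proposed by Professor Yann LeCun in the 1998 paper "Gradient-Based Learning Applied to Document Recognition", and it was the first convolutional neural network successfully applied to digit recognition. LeNet-5 has seven layers in total: convolution, pooling, convolution, pooling, fully connected, fully connected, fully connected. For details on each layer, see other authors' blogs; I will not repeat them here.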
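As a rough illustration of that seven-layer sequence, the sketch below traces feature-map sizes through the classic LeNet-5 configuration. The 32x32 input, the 6 and 16 feature maps, and the 120/84/10 unit counts are taken from the 1998 paper rather than from this post, and the helper names `conv_valid` and `pool` are made up for the example.

```python
def conv_valid(size, kernel, stride=1):
    """Output width of a convolution without padding."""
    return (size - kernel) // stride + 1

def pool(size, window=2):
    """Output width of non-overlapping subsampling."""
    return size // window

size = 32                   # classic LeNet-5 uses 32x32 inputs
c1 = conv_valid(size, 5)    # C1: 6 feature maps
s2 = pool(c1)               # S2: 6 feature maps
c3 = conv_valid(s2, 5)      # C3: 16 feature maps
s4 = pool(c3)               # S4: 16 feature maps
c5 = conv_valid(s4, 5)      # C5: 120 maps of width 1, fully connected in effect

layers = [("C1", c1, 6), ("S2", s2, 6), ("C3", c3, 16),
          ("S4", s4, 16), ("C5", c5, 120)]
for name, width, depth in layers:
    print("%s: %dx%dx%d" % (name, width, width, depth))
print("F6: 84 units, output: 10 units")
```

Because C5 applies a 5x5 kernel to a 5x5 map, it behaves like a fully connected layer, which is why the sequence can be described as ending in three fully connected layers.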

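About MNIST: MNIST is a very well-known handwritten digit recognition dataset and a subset of the NIST dataset. It contains 60,000 images as training data and 10,000 images as test data. Each image in MNIST represents one digit from 0 to 9. Using the LeNet-5 model described above for MNIST handwritten digit recognition can reach roughly 99.2% accuracy. Reaching that level of accuracy before the 21st century is impressive by any measure.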

Without further ado, here is the code:

(1) Step 1: implement the forward propagation of the convolutional neural network

## Forward propagation of the convolutional neural network (save as mnist_inference.py)
## Written against the TensorFlow 1.x API (tf.variable_scope, tf.get_variable).
## This is a LeNet-5-style network with six layers: conv, pool, conv, pool and two fully connected layers.

import tensorflow as tf

INPUT_NODE=784
OUTPUT_NODE=10

IMAGE_SIZE=28
NUM_CHANNELS=1
NUM_LABELS=10

# Filter size and depth of the first convolutional layer
CONV1_DEEP=32
CONV1_SIZE=5

# Filter size and depth of the second convolutional layer
CONV2_DEEP=64
CONV2_SIZE=5

# Number of nodes in the fully connected layer
FC_SIZE=512

# Define the forward propagation of the convolutional neural network.
# train: True enables dropout; regularizer: function applied to the fully connected weights, or None.
def inference(input_tensor,train,regularizer):
    with tf.variable_scope('layer1-conv1'):
        conv1_weights=tf.get_variable("weights",[CONV1_SIZE,CONV1_SIZE,NUM_CHANNELS,CONV1_DEEP],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases=tf.get_variable("bias",[CONV1_DEEP],initializer=tf.constant_initializer(0.0))

        # 5x5 filters with depth 32, stride 1, and zero padding ('SAME')
        conv1=tf.nn.conv2d(input_tensor,conv1_weights,strides=[1,1,1,1],padding='SAME')
        relu1=tf.nn.relu(tf.nn.bias_add(conv1,conv1_biases))

    ## Layer 2: max pooling with a 2x2 window and stride 2
    with tf.name_scope('layer2-pool1'):
        pool1=tf.nn.max_pool(relu1,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

    ## Layer 3: second convolutional layer; its input is a 14*14*32 tensor
    with tf.variable_scope('layer3-conv2'):
        conv2_weights=tf.get_variable("weights",[CONV2_SIZE,CONV2_SIZE,CONV1_DEEP,CONV2_DEEP],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases=tf.get_variable("bias",[CONV2_DEEP],initializer=tf.constant_initializer(0.0))

        # 5x5 filters with depth 64, stride 1, and zero padding ('SAME')
        conv2=tf.nn.conv2d(pool1,conv2_weights,strides=[1,1,1,1],padding='SAME')
        relu2=tf.nn.relu(tf.nn.bias_add(conv2,conv2_biases))

    ## Layer 4: second max pooling layer
    with tf.name_scope('layer4-pool2'):
        pool2=tf.nn.max_pool(relu2,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

    # Static shape of the pooled output: [batch, height, width, depth]
    pool_shape=pool2.get_shape().as_list()
    nodes=pool_shape[1]*pool_shape[2]*pool_shape[3]

    # Flatten the layer 4 output into a batch of vectors with tf.reshape.
    # pool_shape[0] must be known, so the input placeholder needs a fixed batch size.
    reshaped=tf.reshape(pool2,[pool_shape[0],nodes])

    # Layer 5: first fully connected layer
    with tf.variable_scope('layer5-fc1'):
        fc1_weights=tf.get_variable("weight",[nodes,FC_SIZE],
                                    initializer=tf.truncated_normal_initializer(stddev=0.1))
        # Only the fully connected weights are regularized
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc1_weights))
        fc1_biases=tf.get_variable("bias",[FC_SIZE],initializer=tf.constant_initializer(0.1))

        fc1=tf.nn.relu(tf.matmul(reshaped,fc1_weights)+fc1_biases)
        # Dropout (keep probability 0.5) is applied during training only
        if train:
            fc1=tf.nn.dropout(fc1,0.5)

    ## Layer 6: second fully connected layer, producing one logit per class
    with tf.variable_scope('layer6-fc2'):
        fc2_weights=tf.get_variable("weight",[FC_SIZE,NUM_LABELS],
                                    initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc2_weights))
        fc2_biases=tf.get_variable("bias",[NUM_LABELS],
                                   initializer=tf.constant_initializer(0.1))
        logit=tf.matmul(fc1,fc2_weights)+fc2_biases

    ## Return the output of layer 6 (softmax is applied later, inside the loss)
    return logit
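
To make the tensor sizes in the forward pass concrete, here is a small TensorFlow-free sketch that walks the same layer sequence and counts trainable parameters. It assumes the constants defined above (28x28x1 input, 5x5 filters with depths 32 and 64, 512 fully connected nodes, 10 labels); `same_out` and `FC_NODES` are names invented for this example.

```python
import math

def same_out(size, stride):
    """Output width under 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

IMAGE_SIZE, NUM_CHANNELS, NUM_LABELS = 28, 1, 10
CONV1_SIZE, CONV1_DEEP = 5, 32
CONV2_SIZE, CONV2_DEEP = 5, 64
FC_NODES = 512  # size of the first fully connected layer in the code above

conv1 = same_out(IMAGE_SIZE, 1)   # stride-1 convolution keeps the width
pool1 = same_out(conv1, 2)        # stride-2 pooling halves it
conv2 = same_out(pool1, 1)
pool2 = same_out(conv2, 2)
nodes = pool2 * pool2 * CONV2_DEEP  # length of the flattened vector fed to fc1

# Weights plus biases for each layer that owns variables
params = {
    "conv1": CONV1_SIZE * CONV1_SIZE * NUM_CHANNELS * CONV1_DEEP + CONV1_DEEP,
    "conv2": CONV2_SIZE * CONV2_SIZE * CONV1_DEEP * CONV2_DEEP + CONV2_DEEP,
    "fc1": nodes * FC_NODES + FC_NODES,
    "fc2": FC_NODES * NUM_LABELS + NUM_LABELS,
}
total = sum(params.values())

print("widths:", conv1, pool1, conv2, pool2, "flattened nodes:", nodes)
print(params, "total:", total)
```

Almost all of the parameters sit in the first fully connected layer, which is also where the code applies dropout and L2 regularization.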

(2) Step 2: train the neural network on the dataset

## LeNet-5 training procedure (save as mnist_train.py)
## Written against the TensorFlow 1.x API (tf.placeholder, tf.contrib, tf.Session).
import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

import mnist_inference

## Network hyperparameters
BATCH_SIZE=100
LEARNING_RATE_BASE=0.01  # base learning rate
LEARNING_RATE_DECAY=0.99  # learning rate decay factor
REGULARIZATION_RATE=0.0001  # weight of the L2 regularization term
TRAINING_STEPS=30000
MOVING_AVERAGE_DECAY=0.99
## Path and file name used to save the model
MODEL_SAVE_PATH="/path/to/model/"
MODEL_NAME="model.ckpt"

## Define the training procedure
def train(mnist):
    # The batch dimension is fixed because inference() reshapes using the static batch size
    x=tf.placeholder(tf.float32,[
        BATCH_SIZE,
        mnist_inference.IMAGE_SIZE,
        mnist_inference.IMAGE_SIZE,
        mnist_inference.NUM_CHANNELS],
        name='x-input')

    y_=tf.placeholder(tf.float32,[None,mnist_inference.OUTPUT_NODE],name='y-input')
    regularizer=tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    y=mnist_inference.inference(x,True,regularizer)
    global_step=tf.Variable(0,trainable=False)
    # Initialize the moving-average class with the decay rate and the step counter
    variable_averages=tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step)

    # Apply the moving average to every trainable network parameter
    variables_averages_op=variable_averages.apply(tf.trainable_variables())

    # Cross-entropy measures the gap between predictions and true labels;
    # tf.argmax converts the one-hot labels into class indices
    cross_entropy=tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,labels=tf.argmax(y_,1))
    # Mean cross-entropy over all examples in the current batch
    cross_entropy_mean=tf.reduce_mean(cross_entropy)

    # Total loss = cross-entropy loss + the L2 regularization losses collected in inference()
    loss=cross_entropy_mean+tf.add_n(tf.get_collection('losses'))
    # Exponentially decaying learning rate, decayed once per pass over the training set
    learning_rate=tf.train.exponential_decay(LEARNING_RATE_BASE,global_step,mnist.train.num_examples/BATCH_SIZE,LEARNING_RATE_DECAY,staircase=True)
    train_step=tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step)
    # Group the gradient step and the moving-average update into a single op
    with tf.control_dependencies([train_step,variables_averages_op]):
        train_op=tf.no_op(name='train')

    # Saver used to persist the model
    saver=tf.train.Saver()
    ## Start a session and run the training loop
    with tf.Session() as sess:
        tf.global_variables_initializer().run()
        print("**************** Start training ************************")

        # Train the network iteratively
        for i in range(TRAINING_STEPS):
            xs,ys=mnist.train.next_batch(BATCH_SIZE)
            # Reshape the flat 784-value images into [batch, 28, 28, 1]
            reshaped_xs=np.reshape(xs,(BATCH_SIZE,
                                       mnist_inference.IMAGE_SIZE,
                                       mnist_inference.IMAGE_SIZE,
                                       mnist_inference.NUM_CHANNELS))
            _,loss_value,step=sess.run([train_op,loss,global_step],
                                       feed_dict={x:reshaped_xs,y_:ys})

            # Every 1000 steps, report the batch loss and save a checkpoint
            if i%1000==0:
                print("After %d training step(s),loss on training batch is %g."%(step,loss_value))
                saver.save(sess,os.path.join(MODEL_SAVE_PATH,MODEL_NAME),global_step=global_step)

def main(argv=None):
    mnist=input_data.read_data_sets("MNIST_data/",one_hot=True)
    train(mnist)

if __name__=='__main__':
    tf.app.run()
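
The learning-rate schedule in the training script can be reproduced without TensorFlow. The sketch below follows the staircase form of exponential decay, using the base rate 0.01 and decay factor 0.99 from the script. The value `decay_steps=550` is an assumption: it presumes `mnist.train.num_examples` is 55000 (the loader's default split) divided by the batch size of 100. The function name `decayed_learning_rate` is made up for this example.

```python
def decayed_learning_rate(step, base=0.01, decay=0.99, decay_steps=550):
    """Staircase exponential decay: base * decay ** floor(step / decay_steps)."""
    return base * decay ** (step // decay_steps)

# decay_steps=550 assumes 55000 training examples and a batch size of 100
for step in (0, 549, 550, 5500, 30000):
    print(step, decayed_learning_rate(step))
```

Under these assumptions the rate drops by 1% once per pass over the training data, so it changes only slowly across the 30000 training steps.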

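Training this model took me a little over an hour (I used the CPU build of TensorFlow, so it was slow; you can try the GPU build). You can also lower the number of training steps and see how it behaves. The loss I got from training was not quite what I had expected, although I checked the code several times and found nothing wrong.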
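One way to sanity-check the printed loss is to compute its two ingredients by hand. The sketch below is a plain-Python illustration, not the training code: `softmax_cross_entropy` and `l2_penalty` are names invented here, and the example logits and weights are arbitrary. It shows that a model spreading probability evenly over 10 digits has a cross-entropy of ln(10), about 2.303, and that the L2 term has the form scale * sum(w^2) / 2 for each regularized weight tensor.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example given raw logits and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]

def l2_penalty(weights, scale):
    """L2 regularization term: scale * sum(w^2) / 2."""
    return scale * sum(w * w for w in weights) / 2.0

# Uniform logits over 10 classes versus a confident, correct prediction
uniform = softmax_cross_entropy([0.0] * 10, label=3)
confident = softmax_cross_entropy([0.0] * 3 + [8.0] + [0.0] * 6, label=3)

print("uniform:", uniform, "ln(10):", math.log(10))
print("confident:", confident)
print("l2 penalty:", l2_penalty([0.1, -0.2, 0.3], scale=0.0001))
```

Because the printed training loss is the sum of the batch cross-entropy and the L2 terms over the fully connected weights, it stays above the cross-entropy alone. With about 1.6 million weights in the first fully connected layer, that constant offset can be noticeable, so a loss that looks high is not necessarily a sign of a bug.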

(3) Step 3: test the neural network on the dataset. The code is as follows:

# Evaluate the network (TensorFlow 1.x API)
import time
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

### Load the constants and the forward propagation function defined in mnist_inference.py and mnist_train.py
import mnist_inference
import mnist_train

# Wait 10 seconds between evaluations of the latest saved model
EVAL_INTERVAL_SECS = 10

def evaluate(mnist):
    with tf.Graph().as_default() as g:   # make g the default graph
        # Define the input and output formats; the batch size is the size of the validation set
        x = tf.placeholder(tf.float32, [mnist.validation.images.shape[0],
                                        mnist_inference.IMAGE_SIZE,
                                        mnist_inference.IMAGE_SIZE,
                                        mnist_inference.NUM_CHANNELS], name='x-input1')
        y_ = tf.placeholder(tf.float32, [None, mnist_inference.OUTPUT_NODE], name='y-input')

        xs = mnist.validation.images
        # As in training, reshape the input data into a four-dimensional array
        reshaped_xs = np.reshape(xs, (mnist.validation.images.shape[0],
                                      mnist_inference.IMAGE_SIZE,
                                      mnist_inference.IMAGE_SIZE,
                                      mnist_inference.NUM_CHANNELS))
        validate_feed = {x: reshaped_xs, y_: mnist.validation.labels}

        # Compute the forward pass by calling the shared function.
        # Overfitting is not a concern at evaluation time, so dropout is off and the regularizer is None.
        y = mnist_inference.inference(x, False, None)

        # Use the forward pass to compute accuracy. To classify unseen examples,
        # tf.argmax(y, 1) gives the predicted class of each input.
        correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        # Cast the boolean array to floats (True is 1, False is 0) and take the mean;
        # the mean is the accuracy of the network on this set of data.
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        # Load the model by renaming variables: variables_to_restore() returns a dict that maps
        # the moving-average names (those under /ExponentialMovingAverage) to the variables,
        # which is the form tf.train.Saver needs to load the averaged values.
        variable_averages = tf.train.ExponentialMovingAverage(mnist_train.MOVING_AVERAGE_DECAY)
        variable_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variable_to_restore)  # these values are read from the checkpoint

        # The original design evaluates every EVAL_INTERVAL_SECS seconds in a `while True:` loop
        # to track accuracy during training. To go easy on a personal computer, this version
        # only evaluates the latest saved model on the validation set, twice.
        for i in range(2):
            with tf.Session() as sess:
                # tf.train.get_checkpoint_state reads the checkpoint file
                # to find the file name of the latest model in the directory
                ckpt = tf.train.get_checkpoint_state(mnist_train.MODEL_SAVE_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    # Load the model, including all moving-average values
                    saver.restore(sess, ckpt.model_checkpoint_path)
                    # Recover the training step at save time from the file name
                    global_step = ckpt.model_checkpoint_path.split('-')[-1]
                    # No variables are initialized here: inference() only declares them,
                    # and their values come from the checkpoint
                    accuracy_score = sess.run(accuracy, feed_dict=validate_feed)
                    print("After %s training steps, validation accuracy = %g" % (global_step, accuracy_score))
                else:
                    print("No checkpoint file found")
                    return
                # time.sleep() suspends the calling thread for the given number of seconds
                time.sleep(EVAL_INTERVAL_SECS)

def main(argv=None):
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    evaluate(mnist)

if __name__=='__main__':
    tf.app.run()
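
The evaluation script restores the moving-average (shadow) values of the weights rather than the raw weights. As a minimal sketch of what those values are, the function below applies the update rule used by `tf.train.ExponentialMovingAverage`, including the cap on the decay that applies when a step counter is supplied, as the training script does with `global_step`. The function name `ema_update` and the toy sequence of values are made up for this example.

```python
def ema_update(shadow, value, decay, num_updates=None):
    """One moving-average step; num_updates enables the warm-up cap on decay."""
    if num_updates is not None:
        # Early in training the effective decay is smaller, so the average moves faster
        decay = min(decay, (1.0 + num_updates) / (10.0 + num_updates))
    return decay * shadow + (1.0 - decay) * value

shadow = 0.0  # the shadow value starts at the variable's initial value
for step, value in enumerate([1.0, 1.0, 1.0, 1.0]):
    shadow = ema_update(shadow, value, decay=0.99, num_updates=step)
    print(step, round(shadow, 4))
```

With a decay of 0.99 and no step counter, the shadow value would move only 1% of the way toward the current weight per update; the cap lets it catch up quickly during the first steps.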

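The results I obtained were reasonably good.

This concludes the introduction to the LeNet-5 model. Because the post is already long, I leave the optimizer, loss function, and activation functions used in the code for readers to explore at their own pace.

Statement: every post on this blog is carefully written by me; please credit the source when reposting.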
