《深度学习之TensorFlow》 reading notes (2): MNIST handwritten digit recognition

子涣_new

Category: deep learning. Tags: MNIST, TENSORFLOW, handwritten digit recognition

Article link: https://blog.csdn.net/bufanwangzi/article/details/89233373

MNIST handwritten digit recognition

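This is practically the "hello world" of machine learning, or of artificial intelligence in general.
《深度学习之TensorFlow》 also uses it as the first hands-on training example after covering basic TensorFlow syntax. I did not read the syntax chapters closely: that material works best as a dictionary, to be looked up whenever something is unfamiliar, and it can also be found online. The end of the previous note points to a detailed, searchable Chinese TensorFlow reference that covers all the functions. Here is the link to the previous note:
《深度学习之TensorFlow》 reading notes (1): y=2x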

Preparing the data

import pylab
# Note: tensorflow.examples.tutorials ships only with TensorFlow 1.x.
from tensorflow.examples.tutorials.mnist import input_data
number = input_data.read_data_sets("MNIST_data/", one_hot=True)

print('Input data:', number.train.images)
print('Training set shape:', number.train.images.shape)

im = number.train.images[1]	# take the image at index 1, i.e. the second row of the data
im = im.reshape(-1, 28)	# reshape that 784-value row into a 28*28 matrix
pylab.imshow(im)  # pylab is matplotlib's MATLAB-style interface; imshow draws the matrix as an image
pylab.show()      # open the figure window and display it

print('Test set shape:', number.test.images.shape)
print('Validation set shape:', number.validation.images.shape)

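Explanation: the code downloads the handwritten digit dataset over the network, displays one sample as an image, and prints the shape of the training set; the last two lines print the shapes of the test and validation sets.
Each image is 28×28 pixels, which flattens to a one-dimensional array of 784 values.
The labels are one-hot encoded: the ten digits 0 through 9 are represented as [1,0,0,0,0,0,0,0,0,0] … [0,0,0,0,0,0,0,0,0,1].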
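A minimal NumPy sketch of the two layouts described above. It uses a made-up image rather than the real dataset, and the helper name `one_hot` is hypothetical, introduced only for illustration:

```python
import numpy as np

def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1.0 at index `label`."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[label] = 1.0
    return vec

# A stand-in for one row of the image data: 784 pixel values in [0, 1].
flat_image = np.arange(784, dtype=np.float32) / 783.0
image_2d = flat_image.reshape(-1, 28)  # -1 lets NumPy infer the number of rows

print(image_2d.shape)  # (28, 28)
print(one_hot(0))      # label 0: the 1 sits in the first position
print(one_hot(9))      # label 9: the 1 sits in the last position
```

The same `reshape(-1, 28)` call is what turns a dataset row back into something `imshow` can draw.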

Building the model (graph construction)

Forward model

import tensorflow as tf  # import the TensorFlow library (1.x API)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import pylab

tf.reset_default_graph()
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # MNIST image data, 28*28=784 values per image
y = tf.placeholder(tf.float32, [None, 10]) # digits 0-9 => 10 classes

# Set model weights
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Build the model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # softmax classification

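Explanation: x and y are the inputs: x holds the image data and y holds the labels (after one-hot encoding, each label takes ten values).
W is the weight matrix and b is the bias vector.

In my own practice I wrote the model in expanded form: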

s1 = tf.matmul(x, W) + b
pred = tf.nn.softmax(s1)

Two main functions are used here: tf.matmul and tf.nn.softmax.

tf.matmul

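matmul is matrix multiplication, so the number of columns of the first argument must equal the number of rows of the second; otherwise the product cannot be computed.
A separate post covers how it differs from tf.multiply: "The difference between tf.matmul() and tf.multiply()".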
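To make the shape rule concrete, here is a small sketch that uses NumPy's `np.matmul` and `np.multiply` as stand-ins for `tf.matmul` and `tf.multiply`; for 2-D inputs they follow the same shape rules. All values are made up:

```python
import numpy as np

x = np.ones((2, 784), dtype=np.float32)        # a batch of 2 flattened images
W = np.full((784, 10), 0.5, dtype=np.float32)  # weights: 784 inputs, 10 classes
b = np.zeros(10, dtype=np.float32)

# Matrix product: the 784 columns of x line up with the 784 rows of W.
logits = np.matmul(x, W) + b                   # b is broadcast over the batch
print(logits.shape)                            # (2, 10)

# Element-wise product: shapes must match (or broadcast); nothing is summed.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
c = np.array([[10.0, 20.0], [30.0, 40.0]])
elementwise = np.multiply(a, c)                # [[10, 40], [90, 160]]
matrix_prod = np.matmul(a, c)                  # [[70, 100], [150, 220]]
print(elementwise)
print(matrix_prod)
```

The `[None, 784]` placeholder multiplied by the `[784, 10]` variable therefore yields one row of 10 scores per input image.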

tf.nn.softmax

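tf.nn.softmax maps N inputs to N outputs. The inputs can take any real value; each output lies in [0,1] and the outputs sum to 1. In effect it converts each input into its probability among all the cases.
The formula is:
$S_j = \frac{e^{a_j}}{\sum_{k=1}^{N} e^{a_k}}$
With the formula it is easy to follow: take the exponential of every input, sum them, and compute each input's share of that total. Exponentiating guarantees that every term is positive and also widens the gaps between values. A detailed explanation is linked here: "Understanding tf.nn.softmax".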
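The formula can be checked by hand with a few lines of NumPy. This is a sketch of the math, not TensorFlow's implementation; subtracting the maximum is a common numerical-stability step that is assumed here and does not change the result:

```python
import numpy as np

def softmax(a):
    """Softmax over the last axis: S_j = exp(a_j) / sum_k exp(a_k)."""
    # Shifting by the max leaves the ratios unchanged but avoids overflow.
    shifted = a - np.max(a, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

scores = np.array([1.0, 2.0, 3.0])
probs = softmax(scores)
print(probs)        # three positive values, largest for the largest score
print(probs.sum())  # the outputs sum to 1 (up to rounding)
```

Note how the gap between the scores 2 and 3 becomes a much larger gap between their probabilities, which is the "widening" effect of the exponential.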

Backpropagation model

cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), axis=1))

# Hyperparameters
learning_rate = 0.01
# Use the gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

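Here the cross entropy between pred and y is computed; the link above also covers it.
By my second read I had mostly forgotten what cross entropy is, so here is a brief note:
First, cross entropy is a technical term with a fixed formula:
$H(x,y)=-\sum_{i=1}^{n} x_i \log(y_i)$
It measures how close the two distributions x and y are: the smaller the value, the closer they are. Concretely, for each case multiply the true probability by the log of the computed probability, sum over all cases, and negate.
Here it evaluates how close the predicted class probabilities are to the image's label.
Reference: "Normalization (softmax), information entropy, cross entropy"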
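A small NumPy sketch of the same computation as the `cost` expression, with made-up predictions. The function name `cross_entropy` is hypothetical; it mirrors the sum over classes followed by the mean over the batch:

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    """Mean over the batch of -sum_i y_true_i * log(y_pred_i)."""
    per_example = -np.sum(y_true * np.log(y_pred), axis=1)
    return np.mean(per_example)

y_true = np.array([[0.0, 1.0, 0.0]])     # one-hot label: class 1
good = np.array([[0.05, 0.90, 0.05]])    # confident and correct
bad = np.array([[0.70, 0.20, 0.10]])     # most of the mass on the wrong class

print(cross_entropy(y_true, good))  # -log(0.9), about 0.105
print(cross_entropy(y_true, bad))   # -log(0.2), about 1.609
```

With a one-hot label only the term for the true class survives, so the cost reduces to the negative log of the probability assigned to the correct digit.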
The optimizer then uses gradient descent to find the parameters that minimize cost.
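What the optimizer does can be sketched in plain NumPy on a tiny made-up problem. This is an illustration under stated assumptions (random data, 4 features, 3 classes, full-batch updates), not what `tf.train.GradientDescentOptimizer` executes internally; it relies on the standard result that the gradient of softmax cross entropy with respect to the logits is `pred - y`:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 4))            # 20 made-up samples, 4 features
labels = rng.integers(0, 3, size=20)    # 3 made-up classes
y = np.eye(3)[labels]                   # one-hot labels

W = np.zeros((4, 3))
b = np.zeros(3)
learning_rate = 0.01

def forward(x, W, b):
    """Softmax of the linear scores x @ W + b."""
    logits = x @ W + b
    exps = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

def cost_fn(pred, y):
    """Mean cross entropy over the batch."""
    return np.mean(-np.sum(y * np.log(pred), axis=1))

costs = []
for step in range(100):
    pred = forward(x, W, b)
    costs.append(cost_fn(pred, y))
    # Gradient of the mean cross entropy with respect to the logits.
    grad_logits = (pred - y) / len(x)
    # Move each parameter a small step against its gradient.
    W -= learning_rate * (x.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)

print(costs[0], costs[-1])  # the cost after 100 steps is lower than at the start
```

Each step nudges W and b in the direction that lowers the cost, which is the same pattern the training log shows over many more updates.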

tf.reduce_xxx

These functions apply an operation along a given dimension of the target tensor. For details see the linked post: "The tf.reduce_mean function in TensorFlow".
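The effect of the axis argument is easiest to see on a small matrix. The sketch below uses `np.sum` and `np.mean` as stand-ins for `tf.reduce_sum` and `tf.reduce_mean`, which treat `axis` the same way:

```python
import numpy as np

t = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

row_sums = np.sum(t, axis=1)    # collapse the columns: one value per row
col_means = np.mean(t, axis=0)  # collapse the rows: one value per column
overall = np.mean(t)            # no axis: reduce everything to one scalar

print(row_sums)   # [ 6. 15.]
print(col_means)  # [2.5 3.5 4.5]
print(overall)    # 3.5
```

In the cost expression, the sum with `axis=1` produces one cross-entropy value per image, and the mean without an axis averages those values over the batch.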

Training

training_epochs = 25
batch_size = 100
display_step = 1
saver = tf.train.Saver()
model_path = "log/521model.ckpt"  # path for saving the model
# Start the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())# Initializing OP

    # Start the training loop
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Iterate over the whole dataset
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
            # This divides by total_batch on every batch. Coming from microcontrollers,
            # where a floating-point division per iteration is painfully slow, I would
            # accumulate first and divide once after the inner loop:
            # avg_cost += c
        # avg_cost = avg_cost / total_batch
        # Print progress during training
        if (epoch+1) % display_step == 0:
            print ("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print( " Finished!")
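As a side check on the averaging remark in the loop, the following sketch compares dividing inside the loop with dividing once afterwards. The per-batch cost values are made up:

```python
import math

batch_costs = [2.0, 1.5, 1.0, 0.5]   # made-up per-batch cost values
total_batch = len(batch_costs)

# Variant 1: divide inside the loop, as in the training code.
avg_inside = 0.0
for c in batch_costs:
    avg_inside += c / total_batch

# Variant 2: accumulate first, then divide once after the loop.
avg_after = 0.0
for c in batch_costs:
    avg_after += c
avg_after = avg_after / total_batch

print(avg_inside, avg_after)
print(math.isclose(avg_inside, avg_after))
```

Both variants give the same average up to floating-point rounding, so the choice only affects how many divisions are performed.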

Running the training code prints the following:

Instructions for updating:
Use tf.cast instead.
Epoch: 0001 cost= 7.358892203
Epoch: 0002 cost= 3.799467236
Epoch: 0003 cost= 2.743201640
Epoch: 0004 cost= 2.231650459
Epoch: 0005 cost= 1.925649457
Epoch: 0006 cost= 1.720448702
Epoch: 0007 cost= 1.572598832
Epoch: 0008 cost= 1.460496307
Epoch: 0009 cost= 1.371913151
Epoch: 0010 cost= 1.300113221
Epoch: 0011 cost= 1.240353869
Epoch: 0012 cost= 1.189888259
Epoch: 0013 cost= 1.146416261
Epoch: 0014 cost= 1.108545423
Epoch: 0015 cost= 1.075053699
Epoch: 0016 cost= 1.045288327
Epoch: 0017 cost= 1.018502454
Epoch: 0018 cost= 0.994177655
Epoch: 0019 cost= 0.972133954
Epoch: 0020 cost= 0.951984188
Epoch: 0021 cost= 0.933389442
Epoch: 0022 cost= 0.916102866
Epoch: 0023 cost= 0.900071669
Epoch: 0024 cost= 0.885050849
Epoch: 0025 cost= 0.871106860
 Finished!

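After 25 epochs the cost is 0.87. From the way the model is built, a smaller cost is better, and the log shows the cost decreasing steadily as training proceeds.
However, 0.87 is not the accuracy; accuracy has to be measured on the test dataset. Testing the model, along with saving, loading, and using it, is covered in the next post, since this one has already been patched too many times.
Link: 《深度学习之TensorFlow》 reading notes (3): MNIST handwritten digit recognition, part 2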
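To illustrate why cost and accuracy are different numbers, here is a toy NumPy sketch with made-up predictions for four samples and three classes. It is not the evaluation code from the book or from the next post, only a way to see the two quantities side by side:

```python
import numpy as np

# Made-up softmax outputs for 4 samples and 3 classes.
pred = np.array([[0.60, 0.30, 0.10],
                 [0.20, 0.50, 0.30],
                 [0.40, 0.35, 0.25],
                 [0.10, 0.20, 0.70]])
labels = np.array([0, 1, 1, 2])
y = np.eye(3)[labels]                    # one-hot labels

# Cost: mean cross entropy, sensitive to how confident each prediction is.
cost = np.mean(-np.sum(y * np.log(pred), axis=1))
# Accuracy: fraction of samples whose most likely class matches the label.
accuracy = np.mean(np.argmax(pred, axis=1) == np.argmax(y, axis=1))

print(cost)
print(accuracy)  # 0.75: three of the four samples are classified correctly
```

The cost depends on the probabilities themselves, while accuracy only looks at which class has the highest probability, so one cannot be read off from the other.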

Full code:

import tensorflow as tf  # import the TensorFlow library (1.x API)
import pylab
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

#print ('Input data:',mnist.train.images)
#print ('Training set shape:',mnist.train.images.shape)
#
#import pylab 
#im = mnist.train.images[1]
#im = im.reshape(-1,28)
## pylab.imshow(im)
#pylab.show()
#
#
#print ('Test set shape:',mnist.test.images.shape)
#print ('Validation set shape:',mnist.validation.images.shape)




tf.reset_default_graph()
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784])  # MNIST image data, 28*28=784 values per image
y = tf.placeholder(tf.float32, [None, 10])  # digits 0-9 => 10 classes

# Set model weights
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Build the model
s1 = tf.matmul(x, W) + b
pred = tf.nn.softmax(s1)
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), axis=1))
# Hyperparameters
learning_rate = 0.01
# Use the gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
training_epochs = 50  # the log shown earlier comes from a 25-epoch run
batch_size = 100
display_step = 1
saver = tf.train.Saver()
model_path = "log/521model.ckpt"  # path for saving the model

# Start the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())# Initializing OP

    # Start the training loop
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Iterate over the whole dataset
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Accumulate the per-batch loss
            avg_cost += c

        # Average once per epoch, then print progress
        avg_cost = avg_cost / total_batch
        if (epoch+1) % display_step == 0:
            print ("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print( " Finished!")
