Building an LSTM Training Model for MNIST Handwritten Digits with TensorFlow


yangblyaa


Categories: Artificial Intelligence, Neural Networks, RNN, LSTM, Cross-Entropy, softmax. Tags: Artificial Intelligence, RNN, MNIST, LSTM, tensorflow

Article link: https://blog.csdn.net/yangblyaa/article/details/82832374


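Over the past two days I moved on to hands-on RNN practice.
I started by studying the MNIST handwritten digit image data.
The 60,000 training images are stored in the idx3-ubyte format.
The file layout has three parts (from the start of the file):
Part 1: magic number
4 bytes: magic number (it identifies the file type rather than acting as a file name: the third byte 0x08 means unsigned-byte data and the fourth byte 0x03 means three dimensions)
Part 2: image count and image dimensions
4 bytes: number of images
4 bytes: number of rows (image height in pixels)
4 bytes: number of columns (image width in pixels)
Part 3: pixel data
1 byte per pixel; each image is 28x28 pixels.

The header fields are big-endian integers. Shown in hexadecimal, for example:
0000  0803  0000  ea60  0000  001C  0000  001C
magic number 2051 | number of images 60000 | rows 28 | columns 28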
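The header layout above can be checked with a few lines of standard-library Python. This is a minimal sketch, not part of the original post: the function name `parse_idx3_header` is hypothetical, and the input is just the 16 example bytes shown above rather than a real file.

```python
import struct

def parse_idx3_header(header: bytes):
    """Decode the 16-byte header of an idx3-ubyte file (hypothetical helper)."""
    # '>' selects big-endian byte order; each 'I' is one unsigned 32-bit integer.
    magic, n_images, n_rows, n_cols = struct.unpack(">IIII", header[:16])
    if magic != 2051:  # 0x00000803: unsigned-byte data, 3 dimensions
        raise ValueError("not an idx3-ubyte file, magic number = %d" % magic)
    return n_images, n_rows, n_cols

# The example header from the text, written as hexadecimal.
example_header = bytes.fromhex("00000803" "0000ea60" "0000001c" "0000001c")
n_images, n_rows, n_cols = parse_idx3_header(example_header)
print(n_images, n_rows, n_cols)

# The pixel data starts right after the header; image i occupies
# n_rows * n_cols bytes starting at this offset.
def image_offset(i, n_rows, n_cols):
    return 16 + i * n_rows * n_cols
```

For a real file, the same function could be applied to the first 16 bytes read in binary mode, assuming the file has already been decompressed.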


The LSTM model is as follows:

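# Note: this script uses the TensorFlow 1.x API (tf.placeholder, tf.Session, tensorflow.examples.tutorials).
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt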

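# 1. Load the data
MNIST_data="H:\\工作\\算法及深度学习\\PY_code\\2-9月神经网络编程\\手写数据集"
mnist=input_data.read_data_sets(MNIST_data,one_hot=True)    # Load the MNIST data with one-hot labels. I have not yet looked into the structure of the returned object. Later, mnist.train.next_batch(batch_size) extracts the images as a [batch_size,784] matrix.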

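# 2. Define the hyperparameters
lr=0.001
epoch=2   # number of passes over the training samples
n_samples=60000   # size of the full MNIST training set; note that mnist.train holds 55000 of them by default, because read_data_sets sets 5000 aside for validation
batch_size=128
n_hidden=128
n_input=28 # the input at each time step is a [1,28] vector, i.e. one row of 28 pixels
n_step=28 # 28 time steps in total, one per image row
n_class=10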

# 3. Define the placeholders
with tf.name_scope('Inputs'):
    X=tf.placeholder(tf.float32,[None,n_step,n_input])
    Y=tf.placeholder(tf.float32,[None,n_class])   # Y is one-hot, matching the labels fed in below.

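# 4. Define the parameters
with tf.name_scope('Weight'):
    W={'in':tf.Variable(tf.random_normal([n_input,n_hidden])),'out':tf.Variable(tf.random_normal([n_hidden,n_class]))}   # W is initialized with tf.random_normal(). The shape= keyword can be omitted because shape is the function's first positional parameter.
with tf.name_scope('Bias'):
    b={'in':tf.Variable(tf.constant(0.1,shape=[1,n_hidden])),'out':tf.Variable(tf.constant(0.1,shape=[1,n_class]))}     # Initialized with the constant 0.1. A shape of [n_hidden] would behave the same as [1,n_hidden] here because of broadcasting. shape= must be written out because shape is not the first parameter of tf.constant(value, dtype=None, shape=None, ...).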

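# 5. Build the RNN model: a linear layer, an LSTM layer, and a linear output layer (all before the softmax)
def RNN(X,W,b):
    # 5.1 Reshape the input:
    with tf.name_scope('layer_1_Wx_plus_b'):
        X=tf.reshape(X,[-1,n_input]) # reshape from [128,28,28] to [128*28,28] so that it can be multiplied by the matrix W['in'].

        # 5.2 First layer (linear): compute the cell input X_cell. Andrew Ng's teaching example does not have this layer.
        X_cell=tf.matmul(X,W['in'])+b['in']  # shape [128*28,128]
        X_cell=tf.reshape(X_cell,[-1,n_step,n_hidden])  # X_cell is the cell input, which the LSTM layer uses to compute its gates and states.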

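    # 5.3 Second layer, the cell layer:
    with tf.name_scope('cell'):
        lstm_cell=tf.nn.rnn_cell.BasicLSTMCell(n_hidden,forget_bias=1.0,state_is_tuple=True)
        # forget_bias=1.0 is the bias added to the forget gate (my own understanding; see the official documentation). With state_is_tuple=True the two states are kept separately in a tuple; with False they are concatenated. 莫烦 (Morvan) uses tf.contrib.rnn.BasicLSTMCell.

    # 5.3.1 Define the initial state (a<0> in Andrew Ng's notation, a zero vector)
    with tf.name_scope('initial_state'):
        init_state=lstm_cell.zero_state(batch_size,tf.float32)  # fixed to batch_size, so every batch fed in must contain exactly batch_size samples


    # 5.3.2 Cell output (the outputs of all steps and the state of the last step):
    with tf.name_scope('layer_2_RNN'):
        outputs,final_state=tf.nn.dynamic_rnn(lstm_cell,X_cell,initial_state=init_state,dtype=tf.float32)
        # outputs has shape [batch_size,n_step,cell_state_size]; cell_state_size is n_hidden in this example.
        # final_state is the state after the last step: a tuple (c,h) whose two elements each have shape [batch_size,cell_state_size].
        # final_state[1] (h) equals outputs[:,-1,:]. With a GRU there is a single state, so state==h<t> and state==outputs[:,-1,:].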


    # 5.4 Third layer, the linear output layer: the final prediction is h·W + b, with h taken from final_state
    with tf.name_scope('layer_3_Wst_plus_b'):
        results=tf.matmul(final_state[1],W['out'])+b['out']   # the state tuple is (c<t>,h<t>), so final_state[1] is h<t>.

    return results

# 6. Define the loss
pred=RNN(X,W,b)
with tf.name_scope('Loss'):
    loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y,logits=pred))      # keyword arguments make the order unambiguous: labels=Y, logits=pred

# 7. Define back-propagation, i.e. the optimizer
with tf.name_scope('Train'):
    train_opti=tf.train.AdamOptimizer(lr).minimize(loss)

# 8. Define the accuracy
correct_pre_matrix=tf.equal(tf.argmax(pred,1),tf.argmax(Y,1))        # tf.equal yields a vector such as [True,True,False,True,...,False]; True means the predicted class equals the label at that index. The result is bool and must be cast to float32.
with tf.name_scope('accuracy'):
    accuracy=tf.reduce_mean(tf.cast(correct_pre_matrix,tf.float32))

# 9. Global variable initialization
init = tf.global_variables_initializer()

accuracylist=[]

# 10. Run
with tf.Session() as sess:
    Writer = tf.summary.FileWriter("output", sess.graph)  # created after the session exists, because it uses sess.graph
    sess.run(init)
    step=0
    STEP=[]
    while step*batch_size<epoch*n_samples:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        batch_xs = batch_xs.reshape([batch_size, n_step, n_input])
        sess.run(train_opti,feed_dict={X:batch_xs,Y:batch_ys})

        if step%50==0:
            accu=sess.run(accuracy,feed_dict={X:batch_xs,Y:batch_ys})   # accuracy on the current training batch
            print(accu)
            accuracylist.append(accu)
            STEP.append(step)
        step += 1
    Writer.close()  # flush the graph summary to disk


plt.plot(STEP,accuracylist,'r-',lw=2)
plt.show()
print('ok')
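To make steps 6 and 8 more concrete, here is a small pure-Python sketch of what the loss and accuracy ops compute for one batch: a softmax over the logits, the cross-entropy against a one-hot label, and the fraction of matching argmax positions. It is an illustration only, not the TensorFlow implementation; all function names and the toy three-class batch are assumptions made for the example.

```python
import math

def softmax(logits):
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

def cross_entropy(one_hot_label, logits):
    # With a one-hot label only the true class contributes: -log(p_true).
    probs = softmax(logits)
    return -math.log(probs[argmax(one_hot_label)])

def batch_loss_and_accuracy(labels, logits_batch):
    # Mean cross-entropy (like tf.reduce_mean over the batch) and
    # the share of samples whose predicted class matches the label.
    losses = [cross_entropy(y, z) for y, z in zip(labels, logits_batch)]
    hits = [1.0 if argmax(z) == argmax(y) else 0.0
            for y, z in zip(labels, logits_batch)]
    return sum(losses) / len(losses), sum(hits) / len(hits)

# Toy batch: 2 samples, 3 classes (hypothetical values).
labels = [[1, 0, 0], [0, 1, 0]]
logits_batch = [[2.0, 0.5, 0.1], [1.5, 0.2, 0.3]]
mean_loss, acc = batch_loss_and_accuracy(labels, logits_batch)
print(mean_loss, acc)
```

In this toy batch the first sample is classified correctly and the second is not, so the accuracy is 0.5. With ten classes and all-zero logits the per-sample loss would be log(10), which is roughly where an untrained classifier with near-uniform predictions starts.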

Accuracy output (measured on the current training batch every 50 steps):

0.1875
0.796875
0.84375
0.8671875
0.90625
0.890625
0.953125
0.921875
0.9140625
0.984375
0.9453125
0.953125
0.9609375
0.96875
0.9765625
0.984375
0.9765625
0.96875
0.9765625
ok
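The number of printed values follows directly from the loop condition `step*batch_size<epoch*n_samples` and the `step%50==0` check. The sketch below replays only that arithmetic, without any training; the helper name `logged_steps` and its parameter `log_every` are hypothetical.

```python
def logged_steps(epoch, n_samples, batch_size, log_every):
    """Replay the training loop's counter and collect the steps that get logged."""
    steps = []
    step = 0
    while step * batch_size < epoch * n_samples:
        if step % log_every == 0:
            steps.append(step)
        step += 1
    return step, steps

# Same values as the hyperparameters in the script.
total_steps, steps = logged_steps(epoch=2, n_samples=60000,
                                  batch_size=128, log_every=50)
print(total_steps, len(steps), steps[-1])  # 938 19 900
```

So the loop runs 938 training steps and logs at steps 0, 50, ..., 900, which matches the 19 accuracy values listed above.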
