【TensorFlow】Verifying That an LSTM Can Selectively Forget

Overview

Long Short-Term Memory (LSTM) is a type of recurrent neural network designed specifically to address the long-term dependency problem of plain RNNs. By gating its cell state, an LSTM alleviates the vanishing-gradient problem that ordinary RNNs suffer from, so it can handle much longer time series than a plain RNN.

An LSTM can selectively forget and selectively keep information along a time series. Below is a small experiment that verifies this.
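
Concretely, the "selective" behavior comes from the gates of the LSTM cell. For reference, these are the standard equations implemented by BasicLSTMCell, with \sigma the sigmoid and \odot element-wise multiplication:

    f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)             % forget gate
    i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)             % input gate
    \tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)      % candidate state
    c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    % new cell state

Where the forget gate f_t is near 0, the old cell state c_{t-1} is discarded (forgotten); where it is near 1, the old state is carried forward.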

Verification

The training data is:

    train_x = [[[1], [2], [5], [6]],
               [[5], [7], [7], [8]],
               [[3], [4], [5], [7]]]
    train_y = [[1, 3, 7, 11],
               [5, 12, 14, 15],
               [3, 7, 9, 12]]

The correspondence is: each element of train_y is the sum of the current train_x element and the one before it (the element before the first is taken as 0): 1 = 0+1, 3 = 1+2, 7 = 2+5, 11 = 5+6, and so on.
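
One way to see the rule at a glance is to regenerate the labels from the inputs (make_labels is a hypothetical helper for illustration only, not part of the model):

    def make_labels(xs):
        """y[i] = x[i-1] + x[i], with the element before the first taken as 0."""
        flat = [v[0] for v in xs]  # strip the inner one-element lists
        return [(flat[i - 1] if i > 0 else 0) + flat[i] for i in range(len(flat))]

    print(make_labels([[1], [2], [5], [6]]))  # -> [1, 3, 7, 11]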

We train the model on train_x and train_y, then use the trained model to predict on test_x, to see whether the LSTM network can discover the relationship between train_x and train_y.

The full implementation, rnn-example.py:

import tensorflow as tf
from tensorflow.contrib import rnn


class SeriesPredictor:

    def __init__(self, input_dim, seq_size, hidden_dim=10):
        # Hyperparameters
        self.input_dim = input_dim
        self.seq_size = seq_size
        self.hidden_dim = hidden_dim

        # Weight variables and input placeholders
        self.W_out = tf.Variable(tf.random_normal([hidden_dim, 1]), name='W_out')
        self.b_out = tf.Variable(tf.random_normal([1]), name='b_out')
        # shape: (batch_size, seq_size time steps, 1 value per step)
        self.x = tf.placeholder(tf.float32, [None, seq_size, input_dim])
        # one label per time step (seq_size labels)
        self.y = tf.placeholder(tf.float32, [None, seq_size])

        # Cost optimizer
        self.cost = tf.reduce_mean(tf.square(self.model() - self.y))
        self.train_op = tf.train.AdamOptimizer().minimize(self.cost)

        # Auxiliary ops
        self.saver = tf.train.Saver()

    def model(self):
        """
        :param x: Input of size [T, batch_size, input_size]
        :param W: Matrix of fully-connected output layer weights
        :param b: Vector of fully-connected output layer biases
        """
        cell = rnn.BasicLSTMCell(self.hidden_dim)
        # run the LSTM over the whole sequence; outputs has shape (?, seq_size, hidden_dim) = (?, 4, 10)
        outputs, states = tf.nn.dynamic_rnn(cell, self.x, dtype=tf.float32)
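        # states is an LSTMStateTuple (c, h): the final cell state and final hidden state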
        # number of examples in the current batch
        num_examples = tf.shape(self.x)[0]
        # tf.expand_dims inserts a new axis; the commented one-liner below
        # is equivalent to the two explicit steps that follow
        # W_repeated = tf.tile(tf.expand_dims(self.W_out, 0), [num_examples, 1, 1])

        # tf_expand has shape (1, hidden_dim, 1) = (1, 10, 1)
        tf_expand = tf.expand_dims(self.W_out, 0)
        # tf_tile has shape (?, hidden_dim, 1): one copy of W_out per example
        tf_tile = tf.tile(tf_expand, [num_examples, 1, 1])
        # out has shape (?, seq_size, 1) = (?, 4, 1)
        out = tf.matmul(outputs, tf_tile) + self.b_out
        # squeeze away the trailing size-1 dimension; specifying axis=2 keeps
        # the batch dimension intact even when the batch holds a single example
        out = tf.squeeze(out, axis=2)
        return out

    def train(self, train_x, train_y):
        with tf.Session() as sess:
            tf.get_variable_scope().reuse_variables()
            sess.run(tf.global_variables_initializer())
            # number of training iterations
            for i in range(1000):
                _, mse = sess.run([self.train_op, self.cost], feed_dict={self.x: train_x, self.y: train_y})
                if i % 100 == 0:
                    print(i, mse)
            save_path = self.saver.save(sess, './model')
            print('Model saved to {}'.format(save_path))

    def test(self, test_x):
        with tf.Session() as sess:
            tf.get_variable_scope().reuse_variables()
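            # reuse the variables created in __init__ when model() is called again below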
            self.saver.restore(sess, './model')
            output = sess.run(self.model(), feed_dict={self.x: test_x})
            return output


if __name__ == '__main__':
    predictor = SeriesPredictor(input_dim=1, seq_size=4, hidden_dim=10)
    train_x = [[[1], [2], [5], [6]],
               [[5], [7], [7], [8]],
               [[3], [4], [5], [7]]]
    train_y = [[1, 3, 7, 11],
               [5, 12, 14, 15],
               [3, 7, 9, 12]]
    predictor.train(train_x, train_y)

    test_x = [[[1], [2], [3], [4]],
              [[4], [5], [6], [7]]]
    actual_y = [[1, 3, 5, 7],
                [4, 9, 11, 13]]
    pred_y = predictor.test(test_x)

    print("\nlets run some tests!\n")

    for i, x in enumerate(test_x):
        print('when the input is {}'.format(x))
        print('the ground truth output should be {}'.format(actual_y[i]))
        print('and the model thinks it is {}\n'.format(pred_y[i]))
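
As an aside, the tf.tile / tf.matmul combination in model() simply applies the same hidden_dim -> 1 output layer at every time step. A common equivalent formulation (a sketch using the same variable names, not part of the script above) flattens the time dimension instead:

    # equivalent per-time-step projection, without tiling W_out
    outputs_flat = tf.reshape(outputs, [-1, self.hidden_dim])    # (batch*seq, hidden_dim)
    out_flat = tf.matmul(outputs_flat, self.W_out) + self.b_out  # (batch*seq, 1)
    out = tf.reshape(out_flat, [num_examples, self.seq_size])    # (batch, seq_size)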

Results:

0 93.94602
100 37.57643
200 12.853694
300 5.2746544
400 3.4352033
500 2.7549746
600 2.1920292
700 1.6691321
800 1.2889558
900 0.9742842
Model saved to ./model

let's run some tests!

when the input is [[1], [2], [3], [4]]
the ground truth output should be [1, 3, 5, 7]
and the model thinks it is [0.86799127 2.593877   4.958681   6.8895497 ]

when the input is [[4], [5], [6], [7]]
the ground truth output should be [4, 9, 11, 13]
and the model thinks it is [ 4.114984  9.15321  11.73832  12.63355 ]

As the output shows, the predicted values are close to the ground-truth labels. Because the data set is small, the predictions still carry some error, but the LSTM has clearly picked up the pattern in the training data.
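
To put a number on "close", the mean absolute error of the run above can be computed directly from the printed predictions (a small numpy sketch; the values are copied from the output):

    import numpy as np

    actual = np.array([[1, 3, 5, 7], [4, 9, 11, 13]], dtype=np.float32)
    pred = np.array([[0.86799127, 2.593877, 4.958681, 6.8895497],
                     [4.114984, 9.15321, 11.73832, 12.63355]])
    print(np.mean(np.abs(pred - actual)))  # roughly 0.26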
