Using RNN & LSTM in TensorFlow

man_world

Article link: https://blog.csdn.net/mzpmzk/article/details/80573338

I. RNN & LSTM basic cell classes

1. The basic RNN cell

class tf.contrib.rnn.BasicRNNCell(num_units, activation=None, reuse=None, name=None)
Parameters:

num_units: int, the number of units in the RNN cell.
activation: Nonlinearity to use. Default: tanh.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.

Returns:

An instantiated basic RNN cell with num_units hidden units.

Common attributes:

state_size: size(s) of state(s) used by this cell; here it equals the number of hidden units.
output_size: size of outputs produced by this cell.
Note: for this class, state_size always equals output_size.

Common methods:

call(inputs, state): advances one time step and returns (output, new_state); for this cell the two are the same hidden-state value.
zero_state(batch_size, dtype): returns an all-zero tensor of shape [batch_size, state_size].

Code example

import tensorflow as tf

cell = tf.contrib.rnn.BasicRNNCell(num_units=128)
print(cell.state_size)  # 128

inputs = tf.placeholder(shape=[32, 100], dtype=tf.float32)  # 32 is the batch_size
h0 = cell.zero_state(batch_size=32, dtype=tf.float32)  # all-zero initial state of shape (batch_size, state_size)

# Call the cell object itself rather than cell.call(): __call__ creates the
# cell's variables first and then advances one step along the time axis.
output, h1 = cell(inputs, h0)
print(h1.shape)      # (32, 128)
print(output.shape)  # (32, 128); for this cell the output is the new state
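
To make the one-step update concrete, here is a minimal NumPy sketch of what a basic RNN cell computes in a single step. It is an illustration under stated assumptions rather than TensorFlow's implementation: the helper name `basic_rnn_step`, the random weights and the sizes are hypothetical, and the layout (a single kernel applied to the concatenation of input and previous state) is assumed to mirror BasicRNNCell.

```python
import numpy as np

def basic_rnn_step(x, h_prev, kernel, bias):
    """One time step: new_h = tanh([x, h_prev] @ kernel + bias)."""
    concat = np.concatenate([x, h_prev], axis=1)  # [batch, input_depth + num_units]
    new_h = np.tanh(concat @ kernel + bias)       # [batch, num_units]
    return new_h, new_h                           # output and new state are the same array

# Hypothetical sizes chosen to match the example above.
batch_size, input_depth, num_units = 32, 100, 128
rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, input_depth))
h0 = np.zeros((batch_size, num_units))            # plays the role of zero_state
kernel = rng.standard_normal((input_depth + num_units, num_units)) * 0.01
bias = np.zeros(num_units)

output, h1 = basic_rnn_step(x, h0, kernel, bias)
print(h1.shape)      # (32, 128)
print(output is h1)  # True
```

The sketch returns the same array twice, which is why state_size and output_size coincide for this kind of cell.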

2. The basic LSTM cell

class tf.contrib.rnn.BasicLSTMCell(num_units, forget_bias=1.0, state_is_tuple=True, activation=None, reuse=None, name=None)
Parameters:

num_units: int, the number of units in the RNN cell.
forget_bias: float, The bias added to forget gates. Must set to 0.0 manually when restoring from CudnnLSTM-trained checkpoints.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis.
activation: Nonlinearity to use. Default: tanh.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.

Returns:

An instantiated basic LSTM cell (lstm_cell) with num_units hidden units.

Common attributes:

state_size: size(s) of state(s) used by this cell. With state_is_tuple=True this is LSTMStateTuple(c=num_units, h=num_units); with state_is_tuple=False it is 2 * num_units.
output_size: size of outputs produced by this cell; equals num_units.
Note: unlike BasicRNNCell, state_size does not equal output_size here, because the state holds both c and h.

Common methods:

call(inputs, state): returns (new_h, new_state), where new_state is an LSTMStateTuple containing c and h.
zero_state(batch_size, dtype): returns an all-zero state. Because state_size is LSTMStateTuple(c=num_units, h=num_units), the result is an LSTMStateTuple of two zero tensors, each of shape [batch_size, num_units].

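Definition of BasicLSTMCell's call function
- The returned state combines new_c and new_h, while the output is new_h alone.
- For a classification problem, a separate Softmax layer must be added on top of new_h to obtain the final class probabilities.

# i, j, f, o: pre-activations of the input gate, new input, forget gate and output gate
new_c = c * sigmoid(f + self._forget_bias) + sigmoid(i) * self._activation(j)
new_h = self._activation(new_c) * sigmoid(o)

if self._state_is_tuple:
  new_state = LSTMStateTuple(new_c, new_h)
else:
  new_state = array_ops.concat([new_c, new_h], 1)
return new_h, new_state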
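
The update equations above can be sketched in NumPy as follows, including the separate Softmax layer mentioned for classification. This is a minimal sketch under stated assumptions, not the library's code: the names `basic_lstm_step`, `softmax` and `w_out`, the random weights and the sizes are all hypothetical, and the four gates are assumed to come from one kernel applied to the concatenated input and previous h, split in the order i, j, f, o.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def basic_lstm_step(x, c, h, kernel, bias, forget_bias=1.0):
    """One LSTM step following the quoted update equations."""
    gates = np.concatenate([x, h], axis=1) @ kernel + bias  # [batch, 4 * num_units]
    i, j, f, o = np.split(gates, 4, axis=1)                 # each [batch, num_units]
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    return new_h, (new_c, new_h)  # output, then the (c, h) state tuple

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical sizes and random weights, for illustration only.
batch_size, input_depth, num_units, num_classes = 4, 10, 8, 3
rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, input_depth))
c0 = np.zeros((batch_size, num_units))
h0 = np.zeros((batch_size, num_units))
kernel = rng.standard_normal((input_depth + num_units, 4 * num_units)) * 0.1
bias = np.zeros(4 * num_units)

new_h, (new_c, state_h) = basic_lstm_step(x, c0, h0, kernel, bias)

# Hypothetical classification head: a dense projection of new_h followed by softmax.
w_out = rng.standard_normal((num_units, num_classes)) * 0.1
probs = softmax(new_h @ w_out)
print(new_h.shape, probs.shape)  # (4, 8) (4, 3)
```

Note that the output new_h is also the h part of the returned state, while c is carried only inside the state.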

Code example

import tensorflow as tf

lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_units=128)
print(lstm_cell.output_size)  # 128
print(lstm_cell.state_size)   # LSTMStateTuple(c=128, h=128)

inputs = tf.placeholder(shape=[32, 100], dtype=tf.float32)  # 32 is the batch_size
h0 = lstm_cell.zero_state(batch_size=32, dtype=tf.float32)
print(h0)
# LSTMStateTuple(c=<tf.Tensor 'BasicLSTMCellZeroState/zeros:0' shape=(32, 128) dtype=float32>, h=<tf.Tensor 'BasicLSTMCellZeroState/zeros_1:0' shape=(32, 128) dtype=float32>)

# Call the cell object itself rather than lstm_cell.call(): __call__ creates the
# cell's variables first and then advances one step along the time axis.
new_h, new_state = lstm_cell(inputs, h0)
print(new_h.shape)        # (32, 128)
print(new_state.h.shape)  # (32, 128)
print(new_state.c.shape)  # (32, 128)

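II. Running multiple steps at once: tf.nn.dynamic_rnn

Purpose: a basic RNNCell advances only one time step per call; dynamic_rnn removes that limitation.
Function: TensorFlow provides tf.nn.dynamic_rnn, which is equivalent to calling the cell n times. That is, it takes $(h_0, x_1, x_2, \ldots, x_n)$ and directly yields $(h_1, h_2, \ldots, h_n)$.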

1. RNN

tf.nn.dynamic_rnn(cell, inputs, initial_state=None, sequence_length=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None)

Parameters:

cell: an RNNCell instance.
inputs: the RNN input sequence.
initial_state: the initial state of the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
sequence_length: an int vector of shape [batch_size]; each value is the length (number of valid time steps) of the corresponding sequence, e.g. sequence_length=tf.fill([batch_size], time_steps) when every sequence has the full length.
time_major: defaults to False, in which case the input and output tensors have shape [batch_size, max_time, depth]. When True, it avoids transposes at the beginning and end of the RNN calculation, and the input and output tensors have shape [max_time, batch_size, depth].
scope: VariableScope for the created subgraph; defaults to "rnn".
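
The two layouts selected by time_major differ only in the order of the first two axes. The short NumPy sketch below shows the relationship; the array sizes are hypothetical and the arrays merely stand in for real input tensors.

```python
import numpy as np

# Hypothetical sizes for illustration.
batch_size, max_time, depth = 32, 20, 100
batch_major = np.zeros((batch_size, max_time, depth))  # layout used when time_major=False

# Swapping the first two axes gives the time-major layout.
time_major = np.transpose(batch_major, (1, 0, 2))
print(time_major.shape)     # (20, 32, 100)

# In the time-major layout, indexing the first axis yields one time step for the whole batch.
print(time_major[0].shape)  # (32, 100)
```

Feeding data that is already time-major is what lets time_major=True skip the transposes mentioned above.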

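Outputs (outputs, state):

outputs: the outputs of all time_steps steps, of shape [batch_size, max_time, cell.output_size] (with the default time_major=False).
state: the state after the last step, of shape [batch_size, cell.state_size]; for an LSTM cell it is an LSTMStateTuple holding c and h.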
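
The idea of "calling the cell n times" can be sketched as a plain loop over the time axis. The NumPy code below is a minimal illustration, not TensorFlow's implementation: `rnn_step` and `unroll` are hypothetical helpers using a simple tanh cell, and the handling of sequence_length (zero outputs and an unchanged state once a sequence has ended) is written to match the behaviour described in the dynamic_rnn documentation.

```python
import numpy as np

def rnn_step(x, h_prev, kernel, bias):
    """Hypothetical basic RNN step; the output and the new state are identical."""
    new_h = np.tanh(np.concatenate([x, h_prev], axis=1) @ kernel + bias)
    return new_h, new_h

def unroll(inputs, h0, kernel, bias, sequence_length):
    """Apply rnn_step along axis 1 of a [batch, max_time, depth] array."""
    batch_size, max_time, _ = inputs.shape
    outputs = np.zeros((batch_size, max_time, h0.shape[1]))
    state = h0
    for t in range(max_time):
        out, new_state = rnn_step(inputs[:, t, :], state, kernel, bias)
        active = (t < sequence_length)[:, None]        # [batch, 1] mask of unfinished sequences
        outputs[:, t, :] = np.where(active, out, 0.0)  # zero the outputs past a sequence's end
        state = np.where(active, new_state, state)     # keep the last valid state afterwards
    return outputs, state

# Hypothetical sizes, weights and lengths.
batch_size, max_time, depth, num_units = 3, 5, 4, 6
rng = np.random.default_rng(0)
inputs = rng.standard_normal((batch_size, max_time, depth))
kernel = rng.standard_normal((depth + num_units, num_units)) * 0.5
bias = np.zeros(num_units)
h0 = np.zeros((batch_size, num_units))
sequence_length = np.array([5, 3, 1])

outputs, state = unroll(inputs, h0, kernel, bias, sequence_length)
print(outputs.shape)  # (3, 5, 6)
print(state.shape)    # (3, 6)
```

In this sketch the returned state of each batch element equals its output at the last valid step, which is the sense in which state is "the state after the last step".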

[Figure: visualization of the transpose ops in the computation graph when time_major=False]

2. BLSTM

tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, inputs, initial_state_fw=None, initial_state_bw=None, sequence_length=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None)

Parameters:

Compared with 1 above, the only additions are a backward LSTMCell instance and a backward initial state. The inputs are the same; information simply flows in both directions.

Outputs (outputs, output_states):

outputs:

The outputs of all time_steps steps, as a tuple (output_fw, output_bw) holding the forward and backward results. output_fw has shape [batch_size, max_time, cell_fw.output_size] and output_bw has shape [batch_size, max_time, cell_bw.output_size].
It returns a tuple instead of a single concatenated Tensor. If the concatenated one is preferred, the forward and backward outputs can be concatenated as tf.concat(outputs, 2)
output_states: a tuple (output_state_fw, output_state_bw) containing the final states of the forward and backward passes.
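
The relationship between the forward outputs, the backward outputs and their concatenation can be sketched in NumPy. This is a simplified illustration under stated assumptions: `run_rnn` is a hypothetical tanh RNN rather than an LSTM, every sequence is assumed to have the full length (so no sequence_length handling), and the weights are random.

```python
import numpy as np

def run_rnn(inputs, kernel, bias):
    """Unroll a hypothetical tanh RNN over [batch, max_time, depth]; return all outputs and the final state."""
    batch_size, max_time, _ = inputs.shape
    state = np.zeros((batch_size, bias.shape[0]))
    outputs = []
    for t in range(max_time):
        state = np.tanh(np.concatenate([inputs[:, t, :], state], axis=1) @ kernel + bias)
        outputs.append(state)
    return np.stack(outputs, axis=1), state

# Hypothetical sizes and weights.
batch_size, max_time, depth, num_units = 2, 4, 3, 5
rng = np.random.default_rng(0)
inputs = rng.standard_normal((batch_size, max_time, depth))
kernel_fw = rng.standard_normal((depth + num_units, num_units)) * 0.5
kernel_bw = rng.standard_normal((depth + num_units, num_units)) * 0.5
bias = np.zeros(num_units)

# Forward direction: process the sequence as given.
output_fw, state_fw = run_rnn(inputs, kernel_fw, bias)

# Backward direction: reverse time, run, then reverse the outputs so index t lines up again.
output_bw_reversed, state_bw = run_rnn(inputs[:, ::-1, :], kernel_bw, bias)
output_bw = output_bw_reversed[:, ::-1, :]

outputs = (output_fw, output_bw)
concatenated = np.concatenate(outputs, axis=2)  # analogous to tf.concat(outputs, 2)
print(concatenated.shape)  # (2, 4, 10)
```

In this sketch the backward final state corresponds to the backward output at time index 0, since the backward pass finishes at the start of the sequence.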

III. Stacking multiple layers: MultiRNNCell

A single-layer RNN often has limited capacity, so multiple layers are needed. In TensorFlow, the tf.nn.rnn_cell.MultiRNNCell class stacks RNNCells.

import tensorflow as tf

# Hypothetical input batch of shape [batch_size, max_time, depth]
data = tf.placeholder(tf.float32, shape=[None, 50, 100])

# Create 2 LSTMCells with 128 and 256 hidden units respectively
rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [128, 256]]

# create a RNN cell composed sequentially of a number of RNNCells
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)

# 'outputs' is a tensor of shape [batch_size, max_time, 256]
# 'state' is a N-tuple where N is the number of LSTMCells containing a
# tf.contrib.rnn.LSTMStateTuple for each cell
outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                   inputs=data,
                                   dtype=tf.float32)
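
What stacking means for a single time step can be sketched in NumPy: each layer's output becomes the next layer's input, and every layer keeps its own state. The code is a minimal illustration with assumptions stated up front: `rnn_step` and `multi_cell_step` are hypothetical helpers, simple tanh cells stand in for the LSTMCells, and the weights are random.

```python
import numpy as np

def rnn_step(x, h_prev, kernel, bias):
    """Hypothetical basic RNN step standing in for an LSTM cell."""
    new_h = np.tanh(np.concatenate([x, h_prev], axis=1) @ kernel + bias)
    return new_h, new_h

def multi_cell_step(x, states, params):
    """Run one time step through a stack of cells; layer k's output is layer k+1's input."""
    new_states = []
    layer_input = x
    for state, (kernel, bias) in zip(states, params):
        layer_input, new_state = rnn_step(layer_input, state, kernel, bias)
        new_states.append(new_state)
    return layer_input, tuple(new_states)

# Hypothetical sizes: two layers with 128 and 256 units, as in the example above.
batch_size, depth = 32, 100
layer_sizes = [128, 256]
rng = np.random.default_rng(0)

params, in_size = [], depth
for size in layer_sizes:
    params.append((rng.standard_normal((in_size + size, size)) * 0.01, np.zeros(size)))
    in_size = size  # the next layer consumes this layer's output

x = rng.standard_normal((batch_size, depth))
states = tuple(np.zeros((batch_size, size)) for size in layer_sizes)

output, new_states = multi_cell_step(x, states, params)
print(output.shape)                   # (32, 256)
print([s.shape for s in new_states])  # [(32, 128), (32, 256)]
```

The output depth is set by the last layer (256 here), and the state is a tuple with one entry per layer, matching the comments in the TensorFlow example.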
