Recurrent Neural Network Module Functions

Author: 且行且安~

Article link: https://blog.csdn.net/qq_20412595/article/details/83148192

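Preface: Once the basic principles of recurrent neural networks are clear, we rarely implement them from scratch. More often we call the ready-made modules that TensorFlow provides, so the remaining question is what the parameters of these RNN functions actually mean.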

1. tf.nn.rnn_cell.BasicLSTMCell

__init__(
    num_units,
    forget_bias=1.0,
    state_is_tuple=True,
    activation=None,
    reuse=None,
    name=None,
    dtype=None,
    **kwargs)

Args:

num_units: int, The number of units in the LSTM cell.
forget_bias: float, The bias added to forget gates (see above). Must set to 0.0 manually when restoring from CudnnLSTM-trained checkpoints.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. The latter behavior will soon be deprecated.
activation: Activation function of the inner states. Default: tanh. It could also be string that is within Keras activation function names.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
dtype: Default dtype of the layer (default of None means use the type of the first input). Required when build is called before call.
**kwargs: Dict, keyword named properties for common layer attributes, like trainable etc when constructing the cell from configs of get_config().
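To make forget_bias and state_is_tuple more concrete, here is a minimal pure-Python sketch of a single LSTM step for one example. It is not TensorFlow's implementation: the function name lstm_step, the per-gate dictionaries of weights, and the gate labels "i", "j", "f", "o" are assumptions chosen for illustration. It only shows where the forget bias is added and that the returned state is a (c, h) pair whose h is also the step output.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, state, weights, biases, forget_bias=1.0):
    """One LSTM step for a single example (illustrative, not TensorFlow code).

    x: list of input features.
    state: (c, h) tuple of lists, each with num_units entries.
    weights[g][k]: weights over concat(x, h) for gate g, unit k.
    biases[g][k]: scalar bias for gate g, unit k.
    Gates: "i" input, "j" candidate, "f" forget, "o" output.
    """
    c_prev, h_prev = state
    xh = x + h_prev  # concatenate the input and the previous hidden state

    def pre(gate, k):
        return sum(w * v for w, v in zip(weights[gate][k], xh)) + biases[gate][k]

    c_new, h_new = [], []
    for k in range(len(c_prev)):
        i = sigmoid(pre("i", k))
        j = math.tanh(pre("j", k))
        f = sigmoid(pre("f", k) + forget_bias)  # forget_bias enters here
        o = sigmoid(pre("o", k))
        c = f * c_prev[k] + i * j
        c_new.append(c)
        h_new.append(o * math.tanh(c))
    return h_new, (c_new, h_new)  # (output, state); the output equals h


# With all-zero weights the forget gate reduces to sigmoid(forget_bias).
num_units, num_inputs = 2, 3
zeros_w = {g: [[0.0] * (num_inputs + num_units) for _ in range(num_units)] for g in "ijfo"}
zeros_b = {g: [0.0] * num_units for g in "ijfo"}
state = ([1.0, -1.0], [0.0, 0.0])
out, (c, h) = lstm_step([0.5, 0.1, -0.2], state, zeros_w, zeros_b, forget_bias=1.0)
```

With forget_bias=1.0 the untrained cell keeps roughly 73% of the previous cell state (sigmoid(1.0)), compared with 50% when forget_bias=0.0, which is the usual motivation for a positive default.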

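This module was already covered in detail in the introduction to LSTM, so it is not expanded on here. The parameter descriptions above are quoted from the official TensorFlow documentation.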

2. tf.nn.rnn_cell.MultiRNNCell

__init__(
    cells,
    state_is_tuple=True)

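Create a RNN cell composed sequentially of a number of RNNCells. In other words, this module builds a single RNN cell out of several RNNCells applied in order, which gives a deep (multi-layer) recurrent neural network.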

Args:

cells: list of RNNCells that will be composed in this order.
state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). If False, the states are all concatenated along the column axis. This latter behavior will soon be deprecated.
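The "composed sequentially" behaviour can be sketched in plain Python: at each time step the output of one cell becomes the input of the next, and the combined state is a tuple with one entry per cell. The helper names make_tanh_cell and multi_cell_step and the toy update rule are hypothetical and stand in for real RNNCells; this is a sketch of the idea, not TensorFlow's code.

```python
import math


def make_tanh_cell(num_units, weight=0.1):
    """Return a toy RNN cell (hypothetical helper, not a TensorFlow RNNCell)."""
    def cell(x, h):
        # Every unit sees the sum of the input and of the previous state.
        h_new = [math.tanh(weight * (sum(x) + sum(h))) for _ in range(num_units)]
        return h_new, h_new  # (output, new_state)
    cell.num_units = num_units
    return cell


def multi_cell_step(cells, x, states):
    """Run one time step through the stacked cells, in list order.

    states holds one entry per cell, mirroring state_is_tuple=True.
    """
    new_states = []
    layer_input = x
    for cell, state in zip(cells, states):
        layer_input, new_state = cell(layer_input, state)  # output feeds the next layer
        new_states.append(new_state)
    return layer_input, tuple(new_states)


cells = [make_tanh_cell(4), make_tanh_cell(2)]
zero_states = tuple([0.0] * c.num_units for c in cells)
output, states = multi_cell_step(cells, [1.0, 2.0, 3.0], zero_states)
```

The output size is that of the last cell (2 here), while states keeps one entry per layer, which is why the state is an n-tuple with n = len(cells).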

3. tf.nn.dynamic_rnn

tf.nn.dynamic_rnn(
    cell,
    inputs,
    sequence_length=None,
    initial_state=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None)

Example 1: create a BasicRNNCell

# create a BasicRNNCell
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)

# defining initial state
initial_state = rnn_cell.zero_state(batch_size, dtype=tf.float32)

# 'outputs' is a tensor of shape [batch_size, max_time, cell_state_size]
# 'state' is a tensor of shape [batch_size, cell_state_size]
outputs, state = tf.nn.dynamic_rnn(rnn_cell, input_data,
                                   initial_state=initial_state,
                                   dtype=tf.float32)

Example 2: create 2 LSTMCells

# create 2 LSTMCells
rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [128, 256]]

# create a RNN cell composed sequentially of a number of RNNCells
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)

# 'outputs' is a tensor of shape [batch_size, max_time, 256]
# 'state' is a N-tuple where N is the number of LSTMCells containing a
# tf.contrib.rnn.LSTMStateTuple for each cell
outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                   inputs=data,
                                   dtype=tf.float32)

Args (quoted from the official documentation):

cell: An instance of RNNCell.
inputs: The RNN inputs. If time_major == False (default), this must be a Tensor of shape:[batch_size, max_time, ...], or a nested tuple of such elements. If time_major == True, this must be a Tensor of shape: [max_time, batch_size, ...], or a nested tuple of such elements. This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all the inputs, but otherwise the ranks and other shape components may differ. In this case, input to cell at each time-step will replicate the structure of these tuples, except for the time dimension (from which the time is taken). The input to cell at each time step will be a Tensor or (possibly nested) tuple of Tensors each with dimensions [batch_size, ...].
sequence_length: (optional) An int32/int64 vector sized [batch_size]. Used to copy-through state and zero-out outputs when past a batch element's sequence length. It lets you extract the last valid state and properly padded outputs, so it is more for correctness than performance.
initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
parallel_iterations: (Default: 32). The number of iterations to run in parallel. Those operations which do not have any temporal dependency and can be run in parallel, will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.
swap_memory: Transparently swap the tensors produced in forward inference but needed for back prop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.
time_major: The shape format of the inputs and outputs Tensors. If true, these Tensors must be shaped [max_time, batch_size, depth]. If false, these Tensors must be shaped [batch_size, max_time, depth]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
scope: VariableScope for the created subgraph; defaults to "rnn".
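The two layouts selected by time_major differ only in the order of the first two axes. As a small illustration with nested Python lists (the helper name to_time_major is hypothetical; in TensorFlow the same swap would be done with a transpose), converting batch-major data to time-major looks like this:

```python
def to_time_major(batch_major):
    """Swap the first two axes of a nested list: [batch, time, ...] -> [time, batch, ...]."""
    return [list(step) for step in zip(*batch_major)]


# 2 examples, 3 time steps, depth 1
batch_major = [
    [[1.0], [2.0], [3.0]],
    [[4.0], [5.0], [6.0]],
]
time_major = to_time_major(batch_major)
# time_major[t][b] holds the same features as batch_major[b][t]
```

Applying the same swap twice returns the original layout, which is the extra work at the beginning and end of the RNN calculation that time_major=True avoids.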

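One important observation: the two results of tf.nn.dynamic_rnn (outputs and state) are related to each other. Many thanks to the author of the following post for the worked example, whose explanation is borrowed here: https://blog.csdn.net/u010960155/article/details/81707498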

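If cell is an LSTM, state is a tuple (an LSTMStateTuple) holding $C_{t}$ and $h_{t}$, where $h_{t}$ equals the output at the last valid time step of each sequence. Viewed as an array, state has shape [2, batch_size, cell.output_size], and outputs has shape [batch_size, max_time, cell.output_size]. Without sequence_length, state[1][i, :] == outputs[i, -1, :] for every batch index i. With sequence_length, state[1][i, :] == outputs[i, sequence_length[i] - 1, :], because outputs past that step are zeroed while the state is copied through.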

An example:

# Requires TensorFlow 1.x (tf.contrib and tf.Session were removed in 2.x)
import tensorflow as tf
import numpy as np


def dynamic_rnn(rnn_type='lstm'):
    # Create the input: 3 is the batch size, 6 is the maximum number of
    # time steps (max_time), and 4 is the feature dimension at each step
    X = np.random.randn(3, 6, 4)
    # The second sequence has an actual length of 4, so zero out its padding
    X[1, 4:] = 0
    # Actual lengths of the three sequences
    X_lengths = [6, 4, 6]
    rnn_hidden_size = 5
    if rnn_type == 'lstm':
        cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True)
    else:
        cell = tf.contrib.rnn.GRUCell(num_units=rnn_hidden_size)
    outputs, last_states = tf.nn.dynamic_rnn(
        cell=cell,
        dtype=tf.float64,
        sequence_length=X_lengths,
        inputs=X)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        o1, s1 = session.run([outputs, last_states])
        print(np.shape(o1))
        print(o1)
        print(np.shape(s1))
        print(s1)


if __name__ == '__main__':
    dynamic_rnn(rnn_type='lstm')

Output:

(3, 6, 5)
[[[ 0.0146346 -0.04717453 -0.06930042 -0.06065602 0.02456717]
[-0.05580321 0.08770171 -0.04574306 -0.01652854 -0.04319528]
[ 0.09087799 0.03535907 -0.06974291 -0.03757408 -0.15553619]
[ 0.10003044 0.10654698 0.21004055 0.13792148 -0.05587583]
[ 0.13547596 -0.014292 -0.0211154 -0.10857875 0.04461256]
[ 0.00417564 -0.01985144 0.00050634 -0.13238986 0.14323784]]

[[ 0.04893576 0.14289175 0.17957205 0.09093887 -0.0507192 ]
[ 0.17696126 0.09929577 0.21185635 0.20386451 0.11664373]
[ 0.15658667 0.03952745 -0.03425637 0.00773833 -0.03546742]
[-0.14002582 -0.18578786 -0.08373584 -0.25964601 0.04090167]
[ 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 0. ]]

[[ 0.18564152 0.01531695 0.13752453 0.17188506 0.19555427]
[ 0.13703949 0.14272294 0.21313036 0.07417354 0.0477547 ]
[ 0.23021792 0.04455495 0.10204565 0.17159792 0.34148467]
[ 0.0386402 0.0387848 0.02134559 0.00110381 0.08414687]
[ 0.01386241 -0.02629686 -0.0733538 -0.03194245 0.13606553]
[ 0.01859433 -0.00585316 -0.04007138 0.03811594 0.21708331]]]
(2, 3, 5)
LSTMStateTuple(

c=array([[ 0.00909146, -0.03747076, 0.0008946 , -0.23459786, 0.29565899],
[-0.18409266, -0.30463044, -0.28033809, -0.49032542, 0.12597639],
[ 0.04494702, -0.01359631, -0.06706629, 0.06766361, 0.40794032]]),

h=array([[ 0.00417564, -0.01985144, 0.00050634, -0.13238986, 0.14323784],
[-0.14002582, -0.18578786, -0.08373584, -0.25964601, 0.04090167],
[ 0.01859433, -0.00585316, -0.04007138, 0.03811594, 0.21708331]]))
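In the printed result, h for the second example equals outputs[1][3], its last valid step since X_lengths[1] is 4, while outputs[1][4] and outputs[1][5] are all zeros. The bookkeeping behind this can be sketched without TensorFlow. The names toy_cell and run_rnn and the toy update rule below are assumptions for illustration only; the sketch just mimics the documented behaviour of sequence_length (zero-out outputs, copy-through state).

```python
import math


def toy_cell(x, h):
    """Toy RNN cell (hypothetical): each unit mixes the input sum with its own state."""
    h_new = [math.tanh(0.5 * sum(x) + 0.5 * h_k) for h_k in h]
    return h_new, h_new  # (output, new_state)


def run_rnn(cell, inputs, sequence_length, num_units):
    """Batch-major unrolling: inputs[b][t] is the feature list of example b at step t."""
    outputs, final_states = [], []
    for example, length in zip(inputs, sequence_length):
        state = [0.0] * num_units
        example_outputs = []
        for t, x in enumerate(example):
            if t < length:
                out, state = cell(x, state)
            else:
                out = [0.0] * num_units  # zero-out outputs; the state is copied through
            example_outputs.append(out)
        outputs.append(example_outputs)
        final_states.append(state)
    return outputs, final_states


inputs = [
    [[1.0], [2.0], [3.0]],  # full length 3
    [[1.0], [2.0], [0.0]],  # padded: real length 2
]
lengths = [3, 2]
outputs, final_states = run_rnn(toy_cell, inputs, lengths, num_units=2)
```

For the padded example the final state matches outputs[1][lengths[1] - 1] rather than outputs[1][-1], so indexing the last time step directly is only safe when every sequence has the full length.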

4. tf.contrib.layers.fully_connected

tf.contrib.layers.fully_connected(
    inputs,
    num_outputs,
    activation_fn=tf.nn.relu,
    normalizer_fn=None,
    normalizer_params=None,
    weights_initializer=initializers.xavier_initializer(),
    weights_regularizer=None,
    biases_initializer=tf.zeros_initializer(),
    biases_regularizer=None,
    reuse=None,
    variables_collections=None,
    outputs_collections=None,
    trainable=True,
    scope=None)

Args:

inputs: A tensor of at least rank 2 and static value for the last dimension; i.e. [batch_size, depth], [None, None, None, channels].
num_outputs: Integer or long, the number of output units in the layer.
activation_fn: Activation function. The default value is a ReLU function. Explicitly set it to None to skip it and maintain a linear activation.
normalizer_fn: Normalization function to use instead of biases. If normalizer_fn is provided then biases_initializer and biases_regularizer are ignored and biases are not created nor added. default set to None for no normalizer function
normalizer_params: Normalization function parameters.
weights_initializer: An initializer for the weights.
weights_regularizer: Optional regularizer for the weights.
biases_initializer: An initializer for the biases. If None skip biases.
biases_regularizer: Optional regularizer for the biases.
reuse: Whether or not the layer and its variables should be reused. To be able to reuse the layer scope must be given.
variables_collections: Optional list of collections for all the variables or a dictionary containing a different list of collections per variable.
outputs_collections: Collection to add the outputs.
trainable: If True also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
scope: Optional scope for variable_scope.
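At its core the layer computes activation_fn(inputs · W + b) over the last dimension. A minimal pure-Python sketch for a single example follows; this local fully_connected function, its explicit weights and biases arguments, and the sample numbers are assumptions for illustration, and it leaves out normalizers, initializers, regularizers and variable scopes entirely.

```python
def relu(v):
    return max(0.0, v)


def fully_connected(inputs, weights, biases, activation_fn=relu):
    """Compute activation_fn(x . W + b) for one example (illustrative sketch).

    inputs: list of depth values.
    weights: depth x num_outputs nested list.
    biases: list of num_outputs values.
    Pass activation_fn=None to keep the layer linear.
    """
    outputs = []
    for j in range(len(biases)):
        z = sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        outputs.append(z if activation_fn is None else activation_fn(z))
    return outputs


x = [1.0, -2.0]
W = [[1.0, 0.5, -1.0],
     [0.25, 1.0, -1.0]]
b = [0.0, 0.0, 0.0]
with_relu = fully_connected(x, W, b)                   # default ReLU
linear = fully_connected(x, W, b, activation_fn=None)  # no activation
```

Here linear is [0.5, -1.5, 1.0] and with_relu is [0.5, 0.0, 1.0]. Because ReLU is the default, activation_fn=None has to be passed explicitly whenever raw linear outputs are needed, for example when the values feed a loss that applies its own softmax.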

For now these are the most commonly used RNN-related function modules; more will be added here when more advanced recurrent networks come up.

References:

https://blog.csdn.net/u010960155/article/details/81707498
