tf.nn.rnn_cell.BasicLSTMCell
__init__(
num_units,
forget_bias=1.0,
state_is_tuple=True,
activation=None,
reuse=None,
name=None,
dtype=None,
**kwargs
)
Args:
num_units: int, the number of units in the LSTM cell.
forget_bias: float, the bias added to forget gates (see above). Must be set to 0.0 manually when restoring from CudnnLSTM-trained checkpoints.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. The latter behavior will soon be deprecated.
activation: Activation function of the inner states. Default: tanh. It can also be a string naming a Keras activation function.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
name: String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
dtype: Default dtype of the layer (default of None means use the type of the first input). Required when build is called before call.
**kwargs: Dict, keyword named properties for common layer attributes, like trainable etc., when constructing the cell from the configs of get_config(). When restoring from CudnnLSTM-trained checkpoints, CudnnCompatibleLSTMCell must be used instead.
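A minimal sketch of constructing such a cell and running it for a single step (TF 1.x; the batch and input sizes below are made up for illustration):

import tensorflow as tf

batch_size, input_depth, hidden = 32, 10, 128   # illustrative sizes only

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=hidden,
                                    forget_bias=1.0,
                                    state_is_tuple=True)

x_t = tf.placeholder(tf.float32, [batch_size, input_depth])   # input for one time step
init_state = cell.zero_state(batch_size, tf.float32)          # LSTMStateTuple(c, h)

# One cell call; for BasicLSTMCell the output equals new_state.h.
output, new_state = cell(x_t, init_state)

print(cell.state_size)    # LSTMStateTuple(c=128, h=128) because state_is_tuple=True
print(cell.output_size)   # 128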
tf.nn.rnn_cell.MultiRNNCell
num_units = [128, 64]
cells = [BasicLSTMCell(num_units=n) for n in num_units]
stacked_rnn_cell = MultiRNNCell(cells)
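Note that the stacked cell's state has one entry per layer. With the two-layer cell above, a quick illustrative check:

# state_size is a tuple with one LSTMStateTuple per layer:
# (LSTMStateTuple(c=128, h=128), LSTMStateTuple(c=64, h=64))
print(stacked_rnn_cell.state_size)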
tf.nn.dynamic_rnn
tf.nn.dynamic_rnn(
cell,
inputs,
sequence_length=None,
initial_state=None,
dtype=None,
parallel_iterations=None,
swap_memory=False,
time_major=False,
scope=None
)
Args:
cell: An instance of RNNCell.
inputs: The RNN inputs. If time_major == False (default), this must be a Tensor of shape [batch_size, max_time, ...], or a nested tuple of such elements. If time_major == True, this must be a Tensor of shape [max_time, batch_size, ...], or a nested tuple of such elements. This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all the inputs, but otherwise the ranks and other shape components may differ. In this case, the input to cell at each time step will replicate the structure of these tuples, except for the time dimension (from which the time is taken). The input to cell at each time step will be a Tensor or (possibly nested) tuple of Tensors, each with dimensions [batch_size, ...].
sequence_length: (optional) An int32/int64 vector sized [batch_size]. Used to copy-through state and zero out outputs when past a batch element's sequence length. This parameter enables users to extract the last valid state and properly padded outputs, so it is provided for correctness.
initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or the RNN state has a heterogeneous dtype.
parallel_iterations: (Default: 32). The number of iterations to run in parallel. Operations that do not have any temporal dependency and can be run in parallel will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.
swap_memory: Transparently swap the tensors produced in forward inference but needed for back prop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.
time_major: The shape format of the inputs and outputs Tensors. If true, these Tensors must be shaped [max_time, batch_size, depth]. If false, these Tensors must be shaped [batch_size, max_time, depth]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
scope: VariableScope for the created subgraph; defaults to "rnn".
Returns:
A pair (outputs, state) where:
- outputs: The RNN output Tensor. If time_major == False (default), this will be a Tensor shaped [batch_size, max_time, cell.output_size]. If time_major == True, this will be a Tensor shaped [max_time, batch_size, cell.output_size]. Note: if cell.output_size is a (possibly nested) tuple of integers or TensorShape objects, then outputs will be a tuple having the same structure as cell.output_size, containing Tensors having shapes corresponding to the shape data in cell.output_size.
- state: The final state. If cell.state_size is an int, this will be shaped [batch_size, cell.state_size]. If it is a TensorShape, this will be shaped [batch_size] + cell.state_size. If it is a (possibly nested) tuple of ints or TensorShape, this will be a tuple having the corresponding shapes. If the cells are LSTMCells, state will be a tuple containing an LSTMStateTuple for each cell.
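A minimal sketch of how these arguments and return values fit together (the batch, time and depth sizes are made up for illustration):

import tensorflow as tf

batch_size, max_time, depth, hidden = 4, 7, 5, 16   # illustrative sizes only

inputs = tf.placeholder(tf.float32, [batch_size, max_time, depth])   # batch-major
seq_len = tf.placeholder(tf.int32, [batch_size])   # true length of each sequence

cell = tf.nn.rnn_cell.BasicLSTMCell(hidden)
outputs, state = tf.nn.dynamic_rnn(cell, inputs,
                                   sequence_length=seq_len,
                                   dtype=tf.float32)   # dtype required: no initial_state given

# outputs: [batch_size, max_time, hidden]; positions past seq_len are zeroed out.
# state:   LSTMStateTuple(c, h), each [batch_size, hidden], taken at each
#          sequence's last valid step rather than at max_time.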
Note: when max_time = 1, outputs is simply the h component of state.
The figure above shows an LSTM (Long Short-Term Memory) cell and the figure below a GRU (Gated Recurrent Unit); as the figures show, the GRU state has only h and no c.
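To make these two remarks concrete, a small sketch (reusing the inputs placeholder from the sketch above) comparing the final states returned by dynamic_rnn for an LSTM and a GRU:

lstm_out, lstm_state = tf.nn.dynamic_rnn(
    tf.nn.rnn_cell.BasicLSTMCell(16), inputs, dtype=tf.float32, scope="lstm")
gru_out, gru_state = tf.nn.dynamic_rnn(
    tf.nn.rnn_cell.GRUCell(16), inputs, dtype=tf.float32, scope="gru")

# lstm_state is an LSTMStateTuple(c, h); gru_state is a single tensor (h only).
# With max_time = 1 (and no sequence_length), lstm_out[:, 0, :] equals
# lstm_state.h, and gru_out[:, 0, :] equals gru_state.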
On top of the LSTM network's output, add one more fully connected layer to produce predictions or compute the loss.
tf.contrib.layers.fully_connected
tf.contrib.layers.fully_connected(
inputs,
num_outputs,
activation_fn=tf.nn.relu,
normalizer_fn=None,
normalizer_params=None,
weights_initializer=initializers.xavier_initializer(),
weights_regularizer=None,
biases_initializer=tf.zeros_initializer(),
biases_regularizer=None,
reuse=None,
variables_collections=None,
outputs_collections=None,
trainable=True,
scope=None
)
Args:
inputs: A tensor of at least rank 2 with a static value for the last dimension; e.g. [batch_size, depth] or [None, None, None, channels].
num_outputs: Integer or long, the number of output units in the layer.
activation_fn: Activation function. The default value is a ReLU function. Explicitly set it to None to skip it and maintain a linear activation.
normalizer_fn: Normalization function to use instead of biases. If normalizer_fn is provided then biases_initializer and biases_regularizer are ignored and biases are neither created nor added. Default is None for no normalizer function.
normalizer_params: Normalization function parameters.
weights_initializer: An initializer for the weights.
weights_regularizer: Optional regularizer for the weights.
biases_initializer: An initializer for the biases. If None, skip biases.
biases_regularizer: Optional regularizer for the biases.
reuse: Whether or not the layer and its variables should be reused. To be able to reuse the layer, scope must be given.
variables_collections: Optional list of collections for all the variables, or a dictionary containing a different list of collections per variable.
outputs_collections: Collection to add the outputs to.
trainable: If True, also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
scope: Optional scope for variable_scope.
Returns:
The tensor variable representing the result of the series of operations.
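A brief sketch of using fully_connected as the regression head on top of the RNN outputs (outputs here refers to the dynamic_rnn result from the sketches above; the layer is applied to the last dimension, so rank-3 input is fine per the inputs description):

# outputs: [batch_size, max_time, hidden] from dynamic_rnn.
predictions = tf.contrib.layers.fully_connected(
    inputs=outputs,
    num_outputs=1,
    activation_fn=None)   # None => linear activation (no ReLU)
# predictions: [batch_size, max_time, 1]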
tf.nn.rnn_cell.DropoutWrapper
Args:
cell: an RNNCell; a projection to output_size is added to it.
input_keep_prob: unit Tensor or float between 0 and 1, input keep probability; if it is constant and 1, no input dropout will be added.
output_keep_prob: unit Tensor or float between 0 and 1, output keep probability; if it is constant and 1, no output dropout will be added.
state_keep_prob: unit Tensor or float between 0 and 1, state keep probability; if it is constant and 1, no state dropout will be added. State dropout is performed on the outgoing states of the cell. Note that the state components to which dropout is applied when state_keep_prob is in (0, 1) are also determined by the argument dropout_state_filter_visitor (e.g. by default dropout is never applied to the c component of an LSTMStateTuple).
variational_recurrent: Python bool. If True, then the same dropout pattern is applied across all time steps per run call. If this parameter is set, input_size must be provided.
input_size: (optional) (possibly nested tuple of) TensorShape objects containing the depth(s) of the input tensors expected to be passed in to the DropoutWrapper. Required and used iff variational_recurrent = True and input_keep_prob < 1.
dtype: (optional) The dtype of the input, state, and output tensors. Required and used iff variational_recurrent = True.
seed: (optional) integer, the randomness seed.
dropout_state_filter_visitor: (optional), default: (see below). Function that takes any hierarchical level of the state and returns a scalar or depth=1 structure of Python booleans describing which terms in the state should be dropped out. In addition, if the function returns True, dropout is applied across this sublevel. If the function returns False, dropout is not applied across this entire sublevel. Default behavior: perform dropout on all terms except the memory (c) state of LSTMCellState objects, and don't try to apply dropout to TensorArray objects:

def dropout_state_filter_visitor(s):
    if isinstance(s, LSTMCellState):
        # Never perform dropout on the c state.
        return LSTMCellState(c=False, h=True)
    elif isinstance(s, TensorArray):
        return False
    return True
lstm_cells = [tf.nn.rnn_cell.DropoutWrapper(
tf.nn.rnn_cell.BasicLSTMCell(HIDDEN_SIZE),
output_keep_prob=dropout_keep_prob) for _ in range(NUM_LAYERS)]
cell = tf.nn.rnn_cell.MultiRNNCell(lstm_cells)
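If the same dropout mask should be reused at every time step (variational recurrent dropout), input_size and dtype become required, as described above. A sketch with a hypothetical EMBEDDING_SIZE for the input depth:

cell = tf.nn.rnn_cell.DropoutWrapper(
    tf.nn.rnn_cell.BasicLSTMCell(HIDDEN_SIZE),
    input_keep_prob=dropout_keep_prob,
    output_keep_prob=dropout_keep_prob,
    state_keep_prob=dropout_keep_prob,
    variational_recurrent=True,                    # same mask at every time step
    input_size=tf.TensorShape([EMBEDDING_SIZE]),   # depth of the inputs (hypothetical)
    dtype=tf.float32)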
A complete example
cell = tf.contrib.rnn.GRUCell(num_units=rnn_hidden_size)   # GRU: the state is a single h tensor
outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,   # required because no initial_state is passed
    inputs=X)           # X: [batch_size, max_time, features]
# Linear head: one prediction per time step, shape [batch_size, max_time, 1].
predictions = tf.contrib.layers.fully_connected(outputs, 1, activation_fn=None)
loss = tf.losses.mean_squared_error(labels=y, predictions=predictions)
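The snippet assumes X, y and rnn_hidden_size are already defined; a sketch of the kind of definitions it expects (the shapes are made up), which would go before the cell construction:

rnn_hidden_size = 32
# X: [batch_size, max_time, features]; float64 to match dtype=tf.float64 above.
X = tf.placeholder(tf.float64, [None, 20, 8])
# y: one regression target per time step, matching predictions of shape [batch, max_time, 1].
y = tf.placeholder(tf.float64, [None, 20, 1])

A training op such as tf.train.AdamOptimizer(learning_rate).minimize(loss) can then be added after the loss.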