Core code for building the network in TensorFlow/nmt

Latest recommended article published on 2024-08-07 00:16:08

xiewenbo

Article link: https://blog.csdn.net/xiewenbo/article/details/80336846

tf.contrib.rnn.BasicLSTMCell

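The basic LSTM recurrent network cell.

The implementation is based on http://arxiv.org/abs/1409.2329

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.

It does not allow cell clipping or a projection layer, and it does not use peep-hole connections: it is the basic baseline.

For more advanced models, please use the full LSTMCell.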
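
To make the role of forget_bias concrete, here is a minimal sketch of one LSTM step for a single unit and a single scalar input, written in plain Python rather than TensorFlow. The names `lstm_step`, `sigmoid` and `zero_w` are hypothetical and only serve the illustration; the real cell works on batched tensors with a fused weight matrix.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w, forget_bias=1.0):
    """One step of a scalar toy LSTM (1 unit, 1 input).

    w maps each gate name ("i", "f", "o", "g") to (w_x, w_h, b).
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h + b

    i = sigmoid(pre("i"))                # input gate
    f = sigmoid(pre("f") + forget_bias)  # forget gate, shifted by forget_bias
    o = sigmoid(pre("o"))                # output gate
    g = math.tanh(pre("g"))              # candidate cell value
    new_c = f * c + i * g
    new_h = o * math.tanh(new_c)
    return new_h, new_c

# With all weights at zero the forget gate depends only on forget_bias.
zero_w = {name: (0.0, 0.0, 0.0) for name in "ifog"}
_, c_biased = lstm_step(0.0, 0.0, 1.0, zero_w, forget_bias=1.0)
_, c_plain = lstm_step(0.0, 0.0, 1.0, zero_w, forget_bias=0.0)
```

With forget_bias=1.0 the untrained gate keeps sigmoid(1), roughly 73%, of the previous cell state instead of 50%, which is the "forget less at the start of training" effect described above.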

def __init__(self, num_units, forget_bias=1.0, input_size=None):

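Initialize the basic LSTM cell.

Args:

num_units: int, the number of units in the LSTM cell.
forget_bias: float, the bias added to the forget gate.
input_size: int, the dimensionality of the inputs to the LSTM cell. Defaults to num_units.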

tf.contrib.rnn.LayerNormBasicLSTMCell

Implementations of batch normalization and layer normalization for LSTMs in TensorFlow

tf.contrib.rnn.NASCell

tf.contrib.rnn.DropoutWrapper

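def dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
# x: the input tensor
# keep_prob: as the name suggests, the probability that each element is kept. It can be a tensor, so you can feed 0.5 during training and 1.0 during testing.
# return: x with dropout applied. Use it during training; no dropout is needed at test time.
# Example (size, out_size and batch_size are integers defined elsewhere):
w = tf.get_variable("w1", shape=[size, out_size])
x = tf.placeholder(tf.float32, shape=[batch_size, size])
x = tf.nn.dropout(x, keep_prob=0.5)
y = tf.matmul(x, w)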
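
What keep_prob does can be sketched without TensorFlow. The following is a plain-Python illustration of inverted dropout on a list of floats, not the library implementation; the function name `dropout` here and the explicit `rng` argument are assumptions made for the example. Kept elements are scaled by 1 / keep_prob so that the expected sum stays the same.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each element with probability keep_prob and scale the kept
    elements by 1 / keep_prob; dropped elements become 0.0."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
x = [1.0] * 8
train_out = dropout(x, keep_prob=0.5, rng=rng)  # each entry is 0.0 or 2.0
test_out = dropout(x, keep_prob=1.0, rng=rng)   # identical to x
```

Feeding keep_prob=1.0 at test time therefore turns the operation into the identity, which is why the same graph can serve both training and testing.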

def __init__(self, cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None)

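Create a cell with dropout added on its inputs and/or outputs.

Dropout is never applied to the state.

Args:

cell: an RNNCell; a projection to output_size is added to it.
input_keep_prob: unit Tensor or float between 0 and 1, the probability of keeping the input; if it is the constant float 1, no input dropout is added.
output_keep_prob: unit Tensor or float between 0 and 1, the probability of keeping the output; if it is the constant float 1, no output dropout is added.
seed: optional integer, the random seed.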
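
The wrapper behaviour described above (dropout on inputs and outputs, state left alone) can be sketched in plain Python. `ToyCell` and `ToyDropoutWrapper` are hypothetical stand-ins operating on lists of floats; they are not the TensorFlow classes and only mirror the argument names.

```python
import random

class ToyCell:
    """Hypothetical stand-in for an RNNCell: output = input + state."""
    def __call__(self, inputs, state):
        output = [x + s for x, s in zip(inputs, state)]
        return output, output  # (output, new_state)

class ToyDropoutWrapper:
    def __init__(self, cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None):
        self._cell = cell
        self._input_keep_prob = input_keep_prob
        self._output_keep_prob = output_keep_prob
        self._rng = random.Random(seed)

    def _dropout(self, values, keep_prob):
        if keep_prob >= 1.0:  # a keep probability of 1 means no dropout
            return list(values)
        return [v / keep_prob if self._rng.random() < keep_prob else 0.0
                for v in values]

    def __call__(self, inputs, state):
        inputs = self._dropout(inputs, self._input_keep_prob)
        output, new_state = self._cell(inputs, state)
        output = self._dropout(output, self._output_keep_prob)
        return output, new_state  # the state is returned without dropout

wrapped = ToyDropoutWrapper(ToyCell(), output_keep_prob=0.5, seed=1)
output, new_state = wrapped([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

Here new_state is the full, undropped output of the inner cell, while output has some entries zeroed and the rest doubled.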

tf.contrib.rnn.MultiRNNCell

An RNN cell composed sequentially of multiple simple cells.

def __init__(self, cells):

Create an RNN cell composed sequentially of a number of RNNCells.

Args:

cells: list of RNNCells that will be composed in this order.

cells[i+1].input_size==cells[i].output_size
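
Sequential composition means that the output of layer i is the input of layer i+1, while every layer keeps its own state. A minimal plain-Python sketch, using hypothetical `ScaleCell` and `ToyMultiCell` classes on scalars instead of tensors:

```python
class ScaleCell:
    """Hypothetical toy cell: output = factor * input + state; new state = output."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, inputs, state):
        output = self.factor * inputs + state
        return output, output

class ToyMultiCell:
    def __init__(self, cells):
        self._cells = list(cells)

    def __call__(self, inputs, state):
        # state is a tuple with one entry per layer
        new_states = []
        current = inputs
        for cell, cell_state in zip(self._cells, state):
            current, new_state = cell(current, cell_state)
            new_states.append(new_state)
        return current, tuple(new_states)

stack = ToyMultiCell([ScaleCell(2.0), ScaleCell(3.0)])
output, states = stack(1.0, (0.0, 0.0))
```

The first layer produces 2.0, the second consumes it and produces 6.0, and the returned state is the tuple of both layers' states.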

def __call__(self, inputs, state, scope=None):

def get_variable(self, name, shape=None, dtype=dtypes.float32, initializer=None, regularizer=None, reuse=None, trainable=True, collections=None, caching_device=None):

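Gets an existing variable with these parameters or creates a new one.

If a variable with the given name is already stored, that existing variable is returned. Otherwise, a new variable is created.

Set reuse to True when you only want to reuse existing variables.

Set reuse to False when you only want to create new variables.

If reuse is None (the default), both cases are allowed: an existing variable is returned if there is one, otherwise a new one is created.

If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, a new UniformUnitScalingInitializer is used. If initializer is a Tensor, it is used as the value and the shape is derived from the initializer.

Args:

name: the name of the new or existing variable.
shape: the shape of the new or existing variable.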
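
The three values of reuse can be illustrated with a dictionary-backed store. This is a simplified sketch under the semantics listed above, not TensorFlow's variable store; `ToyVariableStore` and its dict-based "variables" are hypothetical.

```python
class ToyVariableStore:
    def __init__(self):
        self._vars = {}

    def get_variable(self, name, shape=None, reuse=None):
        exists = name in self._vars
        if reuse is True and not exists:
            raise ValueError("Variable %s does not exist" % name)
        if reuse is False and exists:
            raise ValueError("Variable %s already exists" % name)
        if not exists:
            # Stand-in for creating and initializing a real variable.
            self._vars[name] = {"name": name, "shape": shape}
        return self._vars[name]

store = ToyVariableStore()
created = store.get_variable("w1", shape=[2, 3])  # reuse=None: creates
fetched = store.get_variable("w1")                # reuse=None: returns existing
shared = store.get_variable("w1", reuse=True)     # must already exist
```

Calling `store.get_variable("w1", reuse=False)` at this point raises ValueError, because the name is already taken.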

tf.contrib.rnn.ResidualWrapper

Methods
__init__
__init__(
cell,
residual_fn=None
)
Constructs a ResidualWrapper for cell.

Args:
cell: An instance of RNNCell.
residual_fn: (Optional) The function to map raw cell inputs and raw cell outputs to the actual cell outputs of the residual network. Defaults to calling nest.map_structure on (lambda i, o: i + o), inputs and outputs.
__call__
__call__(
inputs,
state,
scope=None
)
Run the cell and then apply the residual_fn on its inputs to its outputs.

Args:
inputs: cell inputs.
state: cell state.
scope: optional cell scope.
Returns:
Tuple of cell outputs and new state.
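
A plain-Python sketch of the residual idea, assuming inputs and outputs are lists of the same length. `ToyResidualWrapper` and `double_cell` are hypothetical names for illustration, and the default residual function here is a simple element-wise sum rather than nest.map_structure.

```python
def double_cell(inputs, state):
    """Hypothetical toy cell: doubles its inputs and passes the state through."""
    return [2.0 * x for x in inputs], state

class ToyResidualWrapper:
    def __init__(self, cell, residual_fn=None):
        self._cell = cell
        # Default: add the raw inputs to the raw outputs element-wise.
        self._residual_fn = residual_fn or (
            lambda i, o: [a + b for a, b in zip(i, o)])

    def __call__(self, inputs, state):
        outputs, new_state = self._cell(inputs, state)
        return self._residual_fn(inputs, outputs), new_state

residual = ToyResidualWrapper(double_cell)
output, new_state = residual([1.0, 2.0], "state")
```

The cell alone would return [2.0, 4.0]; with the residual connection the wrapper returns [3.0, 6.0] and leaves the state unchanged.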

tf.contrib.rnn.AttentionCellWrapper

Methods

__init__
__init__(
cell,
attn_length,
attn_size=None,
attn_vec_size=None,
input_size=None,
state_is_tuple=True,
reuse=None
)
Create a cell with attention.

Args:
cell: an RNNCell, an attention is added to it.
attn_length: integer, the size of an attention window.
attn_size: integer, the size of an attention vector. Equal to cell.output_size by default.
attn_vec_size: integer, the number of convolutional features calculated on attention state and a size of the hidden layer built from base cell state. Equal to attn_size by default.
input_size: integer, the size of a hidden linear layer, built from inputs and attention. Derived from the input tensor by default.
state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). If False, the states are all concatenated along the column axis.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.

tf.contrib.rnn.DeviceWrapper

def _single_cell(unit_type, num_units, forget_bias, dropout, mode,
                 residual_connection=False, device_str=None, residual_fn=None):
  """Create an instance of a single RNN cell."""
  # dropout (= 1 - keep_prob) is set to 0 during eval and infer
  dropout = dropout if mode == tf.contrib.learn.ModeKeys.TRAIN else 0.0

  # Cell Type
  if unit_type == "lstm":
    utils.print_out("  LSTM, forget_bias=%g" % forget_bias, new_line=False)
    single_cell = tf.contrib.rnn.BasicLSTMCell(
        num_units,
        forget_bias=forget_bias)
  elif unit_type == "gru":
    utils.print_out("  GRU", new_line=False)
    single_cell = tf.contrib.rnn.GRUCell(num_units)
  elif unit_type == "layer_norm_lstm":
    utils.print_out("  Layer Normalized LSTM, forget_bias=%g" % forget_bias,
                    new_line=False)
    single_cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
        num_units,
        forget_bias=forget_bias,
        layer_norm=True)
  elif unit_type == "nas":
    utils.print_out("  NASCell", new_line=False)
    single_cell = tf.contrib.rnn.NASCell(num_units)
  else:
    raise ValueError("Unknown unit type %s!" % unit_type)

  # Dropout (= 1 - keep_prob)
  if dropout > 0.0:
    single_cell = tf.contrib.rnn.DropoutWrapper(
        cell=single_cell, input_keep_prob=(1.0 - dropout))
    utils.print_out("  %s, dropout=%g " %(type(single_cell).__name__, dropout),
                    new_line=False)

  # Residual
  if residual_connection:
    single_cell = tf.contrib.rnn.ResidualWrapper(
        single_cell, residual_fn=residual_fn)
    utils.print_out("  %s" % type(single_cell).__name__, new_line=False)

  # Device Wrapper
  if device_str:
    single_cell = tf.contrib.rnn.DeviceWrapper(single_cell, device_str)
    utils.print_out("  %s, device=%s" %
                    (type(single_cell).__name__, device_str), new_line=False)

  return single_cell
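
The order in which `_single_cell` stacks its wrappers can be traced without TensorFlow. The helper below, `wrapper_plan`, is a hypothetical illustration that only returns names; it assumes the mode is passed as a plain string with "train" standing in for tf.contrib.learn.ModeKeys.TRAIN.

```python
def wrapper_plan(unit_type, dropout, mode,
                 residual_connection=False, device_str=None):
    """Return the cell and wrapper names in the order they are applied."""
    base = {
        "lstm": "BasicLSTMCell",
        "gru": "GRUCell",
        "layer_norm_lstm": "LayerNormBasicLSTMCell",
        "nas": "NASCell",
    }
    if unit_type not in base:
        raise ValueError("Unknown unit type %s!" % unit_type)

    # Dropout is switched off outside of training.
    dropout = dropout if mode == "train" else 0.0

    plan = [base[unit_type]]
    if dropout > 0.0:
        plan.append("DropoutWrapper(input_keep_prob=%g)" % (1.0 - dropout))
    if residual_connection:
        plan.append("ResidualWrapper")
    if device_str:
        plan.append("DeviceWrapper(%s)" % device_str)
    return plan

train_plan = wrapper_plan("lstm", 0.2, "train", True, "/gpu:0")
infer_plan = wrapper_plan("gru", 0.2, "infer")
```

The innermost element comes first: the base cell is wrapped by dropout, then by the residual connection, then pinned to a device, so the residual sum is computed on the already-dropped inputs.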