The Difference Between static_rnn and dynamic_rnn

sgdd123

Article link: https://blog.csdn.net/sgdd123/article/details/99722745

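Both static_rnn and dynamic_rnn connect training data to an RNN. The difference is that dynamic_rnn supports data with different numbers of time steps, whereas static_rnn requires every input to have the same number of time steps.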
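An RNN can be structured in several ways: one-to-many, many-to-many, or many-to-one. These variations do not affect the RNN's parameters, because the same set of parameters is applied at every time step. The number of iterations corresponds to the number of time steps in the input sequence, which is not the same thing as the input dimension: for 3-dimensional data sampled at 10 consecutive time points, the number of time steps is 10 and the input dimension is 3. Which outputs are used in the loss function determines the output structure. Every iteration actually produces an output, but if the program uses only the output of the last iteration in the loss function, the network is many-to-one.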
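So once num_hidden, the argument passed to BasicLSTMCell, has fixed the parameter dimensions, the network can in principle handle input with any number of time steps; only the number of iterations changes. static_rnn does not support this, however, because it unrolls the network for a fixed number of steps when the graph is built. In that case dynamic_rnn should be used to relate the inputs to the outputs.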
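
The parameter-sharing argument above can be illustrated with a small NumPy sketch. It uses a plain tanh RNN rather than an LSTM to keep the code short, and the names run_rnn, w_xh, w_hh, and b_h are made up for this illustration; none of them come from TensorFlow.

```python
import numpy as np

def run_rnn(inputs, w_xh, w_hh, b_h):
    """Run a plain tanh RNN over inputs of shape [time_step, input_size]."""
    hidden = np.zeros(w_hh.shape[0])
    outputs = []
    for x_t in inputs:  # one iteration per time step
        # The same weights are reused at every step.
        hidden = np.tanh(x_t @ w_xh + hidden @ w_hh + b_h)
        outputs.append(hidden)
    return np.stack(outputs)  # [time_step, num_hidden]

rng = np.random.default_rng(0)
input_size, num_hidden = 3, 5

# One set of parameters; its shape depends only on input_size and num_hidden.
w_xh = rng.normal(size=(input_size, num_hidden))
w_hh = rng.normal(size=(num_hidden, num_hidden))
b_h = np.zeros(num_hidden)

# Two sequences with different numbers of time steps.
short_seq = rng.normal(size=(10, input_size))
long_seq = rng.normal(size=(25, input_size))

short_out = run_rnn(short_seq, w_xh, w_hh, b_h)  # 10 iterations
long_out = run_rnn(long_seq, w_xh, w_hh, b_h)    # 25 iterations

# Many-to-one: only the output of the last iteration would enter the loss.
many_to_one = short_out[-1]
```

Nothing in the weight shapes refers to the sequence length, which is why only the loop count changes between the two calls.
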
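The two functions also differ in the shape of their input and output data. The data we usually train on is a three-dimensional array whose dimensions mean [batch_size, time_step, input_size]. This layout matches the input shape that dynamic_rnn expects, so the data can be passed in directly, as in the following program: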

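import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Sample data: 2 sequences, 3 time steps each, 4 features per step
x = np.array([[[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]],
              [[111, 112, 113, 114], [121, 122, 123, 124], [131, 132, 133, 134]]])
batch_size = 2
time_step = 3
input_size = 4

xs = tf.placeholder(tf.float32, shape=[batch_size, time_step, input_size])

num_hidden = 128
num_class = 2
weights = tf.Variable(tf.random_normal([num_hidden, num_class]))
biases = tf.Variable(tf.random_normal([num_class]))

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_hidden)
# outputs: [batch_size, time_step, num_hidden]
outputs, states = tf.nn.dynamic_rnn(lstm_cell, xs, dtype=tf.float32)
# Make the result time-major: [time_step, batch_size, num_hidden]
outputs = tf.transpose(outputs, [1, 0, 2])

# outputs[-1] is the last time step: [batch_size, num_hidden]
y = tf.matmul(outputs[-1], weights) + biases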

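Note in particular that, for both static_rnn and dynamic_rnn, the outputs have the same layout as the inputs. The outputs returned by dynamic_rnn therefore have shape [batch_size, time_step, output_size]. Taking outputs[-1] directly would give a tensor of shape [time_step, output_size], whereas the shape we want is [batch_size, output_size], that is, the last time step for every sample. This is why tf.transpose(outputs, [1, 0, 2]) is applied to swap the first and second dimensions of outputs.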
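
The shape reasoning above can be checked with a small NumPy sketch. The array here is a stand-in filled with consecutive integers, not real RNN output, and the sizes are chosen only for illustration.

```python
import numpy as np

batch_size, time_step, output_size = 2, 3, 4

# Stand-in for RNN outputs laid out as [batch_size, time_step, output_size].
outputs = np.arange(batch_size * time_step * output_size).reshape(
    batch_size, time_step, output_size)

# Indexing the batch-major array picks the last sample, not the last step.
last_sample = outputs[-1]                       # (time_step, output_size)

# Swap the first two axes so that time comes first, then index.
time_major = np.transpose(outputs, (1, 0, 2))   # (time_step, batch_size, output_size)
last_step = time_major[-1]                      # (batch_size, output_size)

# Slicing the time axis directly yields the same values.
same = np.array_equal(last_step, outputs[:, -1, :])
```

Slicing with outputs[:, -1, :] reaches the same result without a transpose; the transpose form is shown because it mirrors the program above.
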

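static_rnn, by contrast, expects a list of two-dimensional tensors, each of shape [batch_size, input_size], with one list element per time step. There are two ways to build this list. The first method: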

A = tf.transpose(xs, [1, 0, 2])      # swap batch_size and time_step: (time_step, batch_size, input_size)
A = tf.reshape(A, [-1, input_size])  # (time_step * batch_size, input_size)
A = tf.split(A, time_step, 0)        # list of time_step tensors, each (batch_size, input_size)

The second method:

A = tf.unstack(xs, num=time_step, axis=1)  # list of time_step tensors, each (batch_size, input_size)
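
As a sanity check, the two methods above can be mirrored in NumPy to confirm that they produce the same list. This is a sketch under the assumption that np.transpose, reshape, and np.split behave like their TensorFlow counterparts for this case, and that slicing along axis 1 stands in for tf.unstack.

```python
import numpy as np

x = np.array([[[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]],
              [[111, 112, 113, 114], [121, 122, 123, 124], [131, 132, 133, 134]]])
batch_size, time_step, input_size = x.shape

# Method 1: transpose, flatten, then split into one chunk per time step.
a = np.transpose(x, (1, 0, 2))             # (time_step, batch_size, input_size)
a = a.reshape(-1, input_size)              # (time_step * batch_size, input_size)
method_1 = np.split(a, time_step, axis=0)  # time_step arrays of (batch_size, input_size)

# Method 2: take one slice per time step along axis 1.
method_2 = [x[:, t, :] for t in range(time_step)]

# Both lists hold the same per-step batches.
identical = all(np.array_equal(m1, m2) for m1, m2 in zip(method_1, method_2))
```

The first element of either list holds the first time step of both samples, i.e. the rows [11, 12, 13, 14] and [111, 112, 113, 114].
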

The complete program is:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Sample data: 2 sequences, 3 time steps each, 4 features per step
x = np.array([[[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]],
              [[111, 112, 113, 114], [121, 122, 123, 124], [131, 132, 133, 134]]])
batch_size = 2
time_step = 3
input_size = 4

xs = tf.placeholder(tf.float32, shape=[batch_size, time_step, input_size])

# Convert the batch-major tensor into a list with one entry per time step
A = tf.transpose(xs, [1, 0, 2])      # (time_step, batch_size, input_size)
A = tf.reshape(A, [-1, input_size])  # (time_step * batch_size, input_size)
A = tf.split(A, time_step, 0)        # list of time_step tensors, each (batch_size, input_size)

num_hidden = 128
num_class = 2
weights = tf.Variable(tf.random_normal([num_hidden, num_class]))
biases = tf.Variable(tf.random_normal([num_class]))

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_hidden, forget_bias=1.0)
_init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
# outputs is a list of time_step tensors, each (batch_size, num_hidden)
outputs, states = tf.nn.static_rnn(lstm_cell, A, initial_state=_init_state)

# outputs[-1] is already the last time step, so no transpose is needed
y = tf.matmul(outputs[-1], weights) + biases