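Written on August 10, 2016.
When the state is not a tuple, the code looks like the following. At the very start of an epoch, we feed an all-zero array into initial_state, then overwrite state with the final_state that comes out, and move on to the next iteration.
However, when the LSTM is set to state is tuple, TensorFlow returns an LSTMStateTuple instead of a single tensor. You cannot call eval() on it to get its values, and you cannot feed it directly either!
In the code below, tl.layers.initialize_rnn_state uses eval to obtain the all-zero array for the initial state.
Source: https://github.com/zsdonghao/tensorlayer/blob/master/tutorial_ptb_lstm.py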
state1 = tl.layers.initialize_rnn_state(lstm1.initial_state)
state2 = tl.layers.initialize_rnn_state(lstm2.initial_state)
for step, (x, y) in enumerate(tl.iterate.ptb_iterator(train_data,
                                                      batch_size, num_steps)):
    feed_dict = {input_data: x, targets: y,
                 lstm1.initial_state: state1,
                 lstm2.initial_state: state2,
                 }
    # For training, enable dropout
    feed_dict.update(network.all_drop)
    _cost, state1, state2, _ = sess.run([cost,
                                         lstm1.final_state,
                                         lstm2.final_state,
                                         train_op],
                                        feed_dict=feed_dict
                                        )
    costs += _cost
    iters += num_steps
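
The state-carrying pattern itself does not depend on TensorFlow, so here is a minimal sketch of it in plain Python. Note that run_step and run_epoch are hypothetical stand-ins I made up for illustration (run_step plays the role of sess.run), not TensorLayer or TensorFlow functions, and the arithmetic inside run_step is a toy, not an LSTM.

```python
# Toy illustration of carrying recurrent state across batches.
# NOT TensorFlow/TensorLayer code: run_step is a hypothetical stand-in for sess.run.

def run_step(inputs, state):
    """Hypothetical recurrent step: returns (cost, final_state)."""
    new_state = [0.5 * s + x for s, x in zip(state, inputs)]
    cost = sum(v * v for v in new_state)
    return cost, new_state


def run_epoch(batches, state_size):
    """Start from an all-zero state and feed each final state into the next step."""
    state = [0.0] * state_size  # all-zero state at the start of the epoch
    costs = 0.0
    for inputs in batches:
        # The returned final state overwrites `state` for the next iteration.
        cost, state = run_step(inputs, state)
        costs += cost
    return costs, state


batches = [[1.0, 2.0], [0.0, 0.0], [2.0, 0.0]]
total_cost, last_state = run_epoch(batches, state_size=2)
```

The only point here is the loop shape: zeros go in once per epoch, and after that every step consumes the state produced by the step before it.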
To solve this, when the state is a tuple, tl.layers.initialize_rnn_state returns a tuple containing the all-zero arrays for the corresponding cell state and hidden state.
Then, when feeding, the cell state and the hidden state also have to be fed separately.
Source: https://github.com/zsdonghao/tensorlayer/blob/master/tutorial_ptb_lstm_state_is_tuple.py
state1 = tl.layers.initialize_rnn_state(lstm1.initial_state)
state2 = tl.layers.initialize_rnn_state(lstm2.initial_state)
for step, (x, y) in enumerate(tl.iterate.ptb_iterator(train_data,
                                                      batch_size, num_steps)):
    feed_dict = {input_data: x, targets: y,
                 lstm1.initial_state.c: state1[0],
                 lstm1.initial_state.h: state1[1],
                 lstm2.initial_state.c: state2[0],
                 lstm2.initial_state.h: state2[1],
                 }
    # For training, enable dropout
    feed_dict.update(network.all_drop)
    _cost, state1_c, state1_h, state2_c, state2_h, _ = \
        sess.run([cost,
                  lstm1.final_state.c,
                  lstm1.final_state.h,
                  lstm2.final_state.c,
                  lstm2.final_state.h,
                  train_op],
                 feed_dict=feed_dict
                 )
    state1 = (state1_c, state1_h)
    state2 = (state2_c, state2_h)
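
With more layers, writing out every .c and .h entry by hand gets repetitive. One possible way to tidy it up is a small helper that builds the feed entries for one layer. This is only a sketch under assumptions: state_feed is a hypothetical helper of mine, not part of TensorLayer; StateTuple is a plain namedtuple that merely mimics the c and h fields of LSTMStateTuple; and strings stand in for the real placeholders so the snippet runs without TensorFlow.

```python
from collections import namedtuple

# Stand-in with the same field names (c, h) as TensorFlow's LSTMStateTuple.
# Assumption for illustration only; the real object holds tensors.
StateTuple = namedtuple("StateTuple", ["c", "h"])


def state_feed(initial_state, values):
    """Hypothetical helper: map one layer's c and h placeholders to their values."""
    c_value, h_value = values
    return {initial_state.c: c_value, initial_state.h: h_value}


# Strings stand in for placeholders; lists stand in for the all-zero arrays.
lstm1_initial = StateTuple(c="lstm1_c", h="lstm1_h")
lstm2_initial = StateTuple(c="lstm2_c", h="lstm2_h")
state1 = ([0.0, 0.0], [0.0, 0.0])
state2 = ([0.0], [0.0])

feed_dict = {}
feed_dict.update(state_feed(lstm1_initial, state1))
feed_dict.update(state_feed(lstm2_initial, state2))
```

In the real loop, the same helper would be called with lstm1.initial_state and the (c, h) tuple rebuilt from the previous sess.run, so the four explicit entries collapse into one update per layer.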