- Keywords: memory cell, recurrent neural network
- Problem description: when training a recurrent neural network built with fluid.layers.DynamicRNN, an error is raised: "First matrix's width must be equal with second matrix's height."
- Error message:
<ipython-input-7-fd22a596e844> in train(use_cuda, train_program, params_dirname)
41 event_handler=event_handler,
42 reader=train_reader,
---> 43 feed_order=feed_order)
/opt/conda/envs/py35-paddle1.0.0/lib/python3.5/site-packages/paddle/fluid/contrib/trainer.py in train(self, num_epochs, event_handler, reader, feed_order)
403 else:
404 self._train_by_executor(num_epochs, event_handler, reader,
--> 405 feed_order)
406
407 def test(self, reader, feed_order):
/opt/conda/envs/py35-paddle1.0.0/lib/python3.5/site-packages/paddle/fluid/contrib/trainer.py in _train_by_executor(self, num_epochs, event_handler, reader, feed_order)
481 exe = executor.Executor(self.place)
482 reader = feeder.decorate_reader(reader, multi_devices=False)
--> 483 self._train_by_any_executor(event_handler, exe, num_epochs, reader)
484
485 def _train_by_any_executor(self, event_handler, exe, num_epochs, reader):
/opt/conda/envs/py35-paddle1.0.0/lib/python3.5/site-packages/paddle/fluid/contrib/trainer.py in _train_by_any_executor(self, event_handler, exe, num_epochs, reader)
510 fetch_list=[
511 var.name
--> 512 for var in self.train_func_outputs
513 ])
514 else:
/opt/conda/envs/py35-paddle1.0.0/lib/python3.5/site-packages/paddle/fluid/executor.py in run(self, program, feed, fetch_list, feed_var_name, fetch_var_name, scope, return_numpy, use_program_cache)
468
469 self._feed_data(program, feed, feed_var_name, scope)
--> 470 self.executor.run(program.desc, scope, 0, True, True)
471 outs = self._fetch_data(fetch_list, fetch_var_name, scope)
472 if return_numpy:
EnforceNotMet: Enforce failed. Expected x_mat_dims[1] == y_mat_dims[0], but received x_mat_dims[1]:64 != y_mat_dims[0]:128.
First matrix's width must be equal with second matrix's height. at [/paddle/paddle/fluid/operators/mul_op.cc:59]
PaddlePaddle Call Stacks:
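The failing check (x_mat_dims[1]:64 != y_mat_dims[0]:128) is the ordinary matrix-multiplication shape rule enforced by mul_op: the first matrix's width must equal the second matrix's height. A minimal NumPy sketch (not Paddle code, shapes chosen to mirror the report) shows the same constraint:

```python
import numpy as np

x = np.random.rand(1, 64)      # an activation of width 64, as in the faulty fc
w = np.random.rand(128, 128)   # weights expecting an input of width 128

try:
    x @ w                      # 64 != 128 -> ValueError, the same mismatch
except ValueError:
    print("shape mismatch: width 64 != height 128")

x_ok = np.random.rand(1, 128)  # once both sides agree on 128, the product works
out = x_ok @ w
print(out.shape)               # (1, 128)
```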
- Reproducing the problem: inside a recurrent block defined with rnn.block, a memory cell is created with rnn.memory and its size set to 128, then a fully connected layer is created with fluid.layers.fc and its size set to 64. Training then fails with the error above. The faulty code is as follows:
emb = fluid.layers.embedding(input=ipt, size=[input_dim, 128], is_sparse=True)
sentence = fluid.layers.fc(input=emb, size=128, act='tanh')
rnn = fluid.layers.DynamicRNN()
with rnn.block():
    word = rnn.step_input(sentence)
    prev = rnn.memory(shape=[128])
    hidden = fluid.layers.fc(input=[word, prev], size=64, act='relu')
    rnn.update_memory(prev, hidden)
    rnn.output(hidden)
last = fluid.layers.sequence_last_step(rnn())
out = fluid.layers.fc(input=last, size=2, act='softmax')
- Solution: the error occurs because the size of the memory cell does not match the size of the fully connected layer. Some users assume these two sizes can be chosen independently, but that is wrong: rnn.update_memory replaces the memory with the fc output at every step, so the two sizes must be identical. The correct code is as follows:
emb = fluid.layers.embedding(input=ipt, size=[input_dim, 128], is_sparse=True)
sentence = fluid.layers.fc(input=emb, size=128, act='tanh')
rnn = fluid.layers.DynamicRNN()
with rnn.block():
    word = rnn.step_input(sentence)
    prev = rnn.memory(shape=[128])
    hidden = fluid.layers.fc(input=[word, prev], size=128, act='relu')
    rnn.update_memory(prev, hidden)
    rnn.output(hidden)
last = fluid.layers.sequence_last_step(rnn())
out = fluid.layers.fc(input=last, size=2, act='softmax')
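To see why the fc size must equal the memory size, consider what one step computes. A minimal NumPy sketch of a single recurrent step (illustrative names and shapes; it assumes fc with a list input sums one projection per input, which is how Fluid's fc combines multiple inputs):

```python
import numpy as np

def rnn_step(word, prev, w_word, w_prev, b):
    # One recurrent step: project the step input and the memory, sum,
    # then apply relu. The output width is fixed by the weight matrices.
    return np.maximum(word @ w_word + prev @ w_prev + b, 0.0)

batch, emb_size, mem_size = 4, 128, 128
w_word = np.random.rand(emb_size, mem_size)  # projection of the step input
w_prev = np.random.rand(mem_size, mem_size)  # projection of the memory
b = np.zeros(mem_size)

prev = np.zeros((batch, mem_size))           # like rnn.memory(shape=[128])
word = np.random.rand(batch, emb_size)       # one step of the sequence
hidden = rnn_step(word, prev, w_word, w_prev, b)

# update_memory(prev, hidden) stores hidden as next step's memory,
# which is only well-defined when the shapes match.
assert hidden.shape == prev.shape
```

If the output width were 64 instead of 128, the second step would multiply a width-64 hidden state against the width-128 memory projection, which is exactly the mismatch in the error message.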