Training a seq2seq network on a single GPU works fine, but running it in parallel across multiple GPUs fails.
During training, it always raises this error:
InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [12,737] and labels shape [14]
[[{{node GPU_2/sequence_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}} = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:2"](GPU_2/sequence_loss/Reshape, GPU_2/sequence_loss/Reshape_1/_1659)]]
[[{{node update_parameters/clip_by_global_norm/mul_468/_19885}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_148755_update_parameters/clip_by_global_norm/mul_468", tens
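
The node names in the traceback (sequence_loss/Reshape and sequence_loss/Reshape_1) suggest that the loss flattens logits to [batch * time, vocab] and labels to [batch * time] before the cross-entropy op, so 12 versus 14 would mean the two tensors disagree on the number of time steps on GPU 2. One possible cause, which is an assumption and not something confirmed by the log, is that the decoder on each GPU stops at the longest sequence in its own shard while the targets are still padded to the longest sequence in the full batch. The sketch below reproduces the shapes in plain Python with made-up data; the shard size of 2, the helper names, and the token values are all hypothetical.

```python
def flattened_shapes(batch_size, logits_steps, label_steps, vocab_size):
    """Shapes the loss would pass to the sparse cross-entropy op after flattening."""
    logits_shape = (batch_size * logits_steps, vocab_size)
    labels_shape = (batch_size * label_steps,)
    return logits_shape, labels_shape


def trim_targets(targets, max_steps):
    """Cut every padded target row down to max_steps time steps."""
    return [row[:max_steps] for row in targets]


# Hypothetical shard on one GPU: 2 sequences whose real lengths are 6 and 4,
# but whose targets were padded to 7 steps (assumed longest length in the full batch).
shard_lengths = [6, 4]
shard_targets = [
    [5, 9, 3, 8, 2, 1, 0],  # 0 is the padding id in this sketch
    [7, 4, 6, 1, 0, 0, 0],
]
vocab_size = 737  # taken from the logits shape in the error message

# Assumption: the decoder produces as many steps as the longest sequence in the shard.
decoder_steps = max(shard_lengths)
target_steps = len(shard_targets[0])

mismatch = flattened_shapes(len(shard_targets), decoder_steps, target_steps, vocab_size)
print("before trimming:", mismatch)

# Trimming the targets to the shard's own maximum length makes the first dimensions agree.
trimmed = trim_targets(shard_targets, decoder_steps)
fixed = flattened_shapes(len(trimmed), decoder_steps, len(trimmed[0]), vocab_size)
print("after trimming:", fixed)
```

If this is indeed the cause, the equivalent change in the model would be to slice the targets and loss weights on each GPU to that shard's maximum sequence length before computing the loss. Printing the shapes of the logits and targets on each tower is a quick way to check whether the assumption holds.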