Predicting the next word using the LSTM PTB model TensorFlow example

I am trying to use the TensorFlow LSTM model to make next word predictions.

As described in this related question (which has no accepted answer), the example contains pseudocode for extracting next word probabilities:

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size, lstm.state_size])

loss = 0.0
for current_batch_of_words in words_in_dataset:
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)

    # The LSTM output can be used to make next word predictions
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)
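To check my understanding of that last step, here is a tiny self-contained NumPy sketch of what I assume probabilities and np.argmax give you for a single time step (the vocabulary size and logit values are made up):

import numpy as np

# hypothetical logits for one time step over a toy vocabulary of 5 words
logits = np.array([2.0, 0.5, -1.0, 0.1, 1.5])

# softmax turns the logits into a probability distribution over the vocabulary
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# the predicted next word is the vocabulary index with the highest probability
predicted_word_id = int(np.argmax(probs))
print(probs, predicted_word_id)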

I am confused about how to interpret the probabilities vector. I modified the __init__ of PTBModel in ptb_word_lm.py to store the probabilities and logits:

class PTBModel(object):
    """The PTB model."""

    def __init__(self, is_training, config):
        # General definition of LSTM (unrolled)
        # identical to tensorflow example ...
        # omitted for brevity ...

        # computing the logits (also from example code)
        logits = tf.nn.xw_plus_b(output,
                                 tf.get_variable("softmax_w", [size, vocab_size]),
                                 tf.get_variable("softmax_b", [vocab_size]))
        loss = seq2seq.sequence_loss_by_example([logits],
                                                [tf.reshape(self._targets, [-1])],
                                                [tf.ones([batch_size * num_steps])],
                                                vocab_size)
        self._cost = cost = tf.reduce_sum(loss) / batch_size
        self._final_state = states[-1]

        # my addition: storing the probabilities and logits
        self.probabilities = tf.nn.softmax(logits)
        self.logits = logits

        # more model definition ...

Then I print some information about them in the run_epoch function:

def run_epoch(session, m, data, eval_op, verbose=True):
    """Runs the model on the given data."""
    # first part of function unchanged from example

    for step, (x, y) in enumerate(reader.ptb_iterator(data, m.batch_size,
                                                      m.num_steps)):
        # evaluate probability and logit tensors too:
        cost, state, probs, logits, _ = session.run(
            [m.cost, m.final_state, m.probabilities, m.logits, eval_op],
            {m.input_data: x,
             m.targets: y,
             m.initial_state: state})
        costs += cost
        iters += m.num_steps

        if verbose and step % (epoch_size // 10) == 10:
            print("%.3f perplexity: %.3f speed: %.0f wps, n_iters: %s" %
                  (step * 1.0 / epoch_size, np.exp(costs / iters),
                   iters * m.batch_size / (time.time() - start_time), iters))
            chosen_word = np.argmax(probs, 1)
            print("Probabilities shape: %s, Logits shape: %s" %
                  (probs.shape, logits.shape))
            print(chosen_word)
            print("Batch size: %s, Num steps: %s" % (m.batch_size, m.num_steps))

    return np.exp(costs / iters)

This produces output like the following:

0.000 perplexity: 741.577 speed: 230 wps, n_iters: 220

(20, 10000) (20, 10000)

[ 14 1 6 589 1 5 0 87 6 5 3 5 2 2 2 2 6 2 6 1]

Batch size: 1, Num steps: 20

I expected the probs vector to be an array of probabilities, with one for each word in the vocabulary (e.g., with shape (1, vocab_size)), meaning I could get the predicted word using np.argmax(probs, 1), as suggested in the other question.

However, the first dimension of the vector is actually equal to the number of steps in the unrolled LSTM (20 if the small config settings are used), and I am not sure what to do with that. To get at the predicted word, do I just use the last value (because it is the output of the final step)? Or is there something else I am missing?
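To make the question concrete, here is a small NumPy sketch of what I mean by "use the last value" (I am assuming the rows of probs are ordered by time step, which may well be wrong):

import numpy as np

batch_size, num_steps, vocab_size = 1, 20, 10000

# probs as returned by session.run appears to have shape (batch_size * num_steps, vocab_size)
probs = np.random.rand(batch_size * num_steps, vocab_size)
probs /= probs.sum(axis=1, keepdims=True)

# option A: one predicted word per unrolled step (what np.argmax(probs, 1) currently gives me)
per_step_predictions = np.argmax(probs, 1)    # shape (20,)

# option B: only the prediction from the final step of the unrolled LSTM
last_step_prediction = int(np.argmax(probs[-1]))

print(per_step_predictions, last_step_prediction)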

I tried to understand how the predictions are made and evaluated by looking at the implementation of seq2seq.sequence_loss_by_example, which must perform this evaluation, but it ends up calling gen_nn_ops._sparse_softmax_cross_entropy_with_logits, which does not seem to be included in the github repo, so I am not sure where else to look.
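For what it is worth, my current understanding is that the loss for each position is just the negative log probability assigned to the target word; here is a toy NumPy version of what I think sequence_loss_by_example computes (this is my assumption, not something I could verify in the TensorFlow source):

import numpy as np

# made-up probabilities for 3 positions over a 5-word vocabulary, plus their target word ids
probs = np.array([[0.1, 0.6, 0.1, 0.1, 0.1],
                  [0.2, 0.2, 0.2, 0.2, 0.2],
                  [0.7, 0.1, 0.1, 0.05, 0.05]])
targets = np.array([1, 3, 0])

# cross-entropy per position: -log(probability the model assigned to the correct word)
losses = -np.log(probs[np.arange(len(targets)), targets])
print(losses, losses.sum())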

I am quite new to both TensorFlow and LSTMs, so any help is much appreciated!
