Running ptb_word_lm.py under Win7 + Python 3 + TensorFlow 1.0.0 alpha produces the following problems:
1. rnn_cell cannot be found;
2. seq2seq cannot be found;
3. assorted other errors.
The fix: in TF 1.0 the RNN cell classes moved from tf.nn.rnn_cell to tf.contrib.rnn, and the sequence loss moved to tf.contrib.legacy_seq2seq, so replace the old references as follows:
from tensorflow.contrib import rnn
tf.nn.rnn_cell.BasicLSTMCell           ->  rnn.BasicLSTMCell
tf.nn.rnn_cell.DropoutWrapper          ->  rnn.DropoutWrapper
tf.nn.rnn_cell.MultiRNNCell            ->  rnn.MultiRNNCell
tf.nn.seq2seq.sequence_loss_by_example ->  tf.contrib.legacy_seq2seq.sequence_loss_by_example
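For illustration, a minimal sketch of what the renamed calls look like in the model-building code (size, config, is_training, logits, targets, batch_size, and num_steps are names taken from the original script and shown only for context; the exact surrounding code may differ slightly):

import tensorflow as tf
from tensorflow.contrib import rnn

# old: tf.nn.rnn_cell.BasicLSTMCell
lstm_cell = rnn.BasicLSTMCell(size, forget_bias=0.0, state_is_tuple=True)
# old: tf.nn.rnn_cell.DropoutWrapper (only applied during training)
if is_training and config.keep_prob < 1:
    lstm_cell = rnn.DropoutWrapper(lstm_cell, output_keep_prob=config.keep_prob)
# old: tf.nn.rnn_cell.MultiRNNCell
cell = rnn.MultiRNNCell([lstm_cell] * config.num_layers, state_is_tuple=True)

# old: tf.nn.seq2seq.sequence_loss_by_example
loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
    [logits],
    [tf.reshape(targets, [-1])],
    [tf.ones([batch_size * num_steps], dtype=tf.float32)])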
In addition, the reader code raises an error:
TypeError: a bytes-like object is required, not 'str'
Fix:
change line 30 to f.read().decode("utf-8").replace("\n", "<eos>").split()
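For context, the surrounding helper in reader.py then looks like this (a sketch of the _read_words function; under Python 3, GFile.read() returns bytes here, so it must be decoded before the str-based replace/split):

import tensorflow as tf

def _read_words(filename):
    with tf.gfile.GFile(filename, "r") as f:
        # decode the raw bytes so replace()/split() operate on str under Python 3
        return f.read().decode("utf-8").replace("\n", "<eos>").split()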
The corrected ptb_word_lm.py source is pasted below:
# -*- coding: utf-8 -*-
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Example / benchmark for building a PTB LSTM model.
Trains the model described in:
(Zaremba, et al.) Recurrent Neural Network Regularization
http://arxiv.org/abs/1409.2329
There are 3 supported model configurations:
===========================================
| config | epochs | train | valid  | test
===========================================
| small  | 13     | 37.99 | 121.39 | 115.91
| medium | 39     | 48.45 |  86.16 |  82.07
| large  | 55     | 37.87 |  82.62 |  78.29
The exact results may vary depending on the random initialization.
The hyperparameters used in the model:
- init_scale - the initial scale of the weights
- learning_rate - the initial value of the learning rate
- max_grad_norm - the maximum permissible norm of the gradient
- num_layers - the number of LSTM layers
- num_steps - the number of unrolled steps of LSTM
- hidden_size - the number of LSTM units