TensorFlow 2.0: Generating Shakespeare-Style Text


We will use the Shakespeare dataset provided by Andrej Karpathy in "The Unreasonable Effectiveness of Recurrent Neural Networks". Given a sequence of characters from this data ("Shakespear"), we train a model to predict the next character in the sequence ("e"). Longer passages of text can then be generated by calling the model repeatedly.

1. Importing the Data

Please refer to Part 8 of "A Summary of Data Loading and Preprocessing Methods in TensorFlow 2.0": importing text (for text generation).
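Since that article is not reproduced here, below is a minimal sketch of the preprocessing the rest of this post assumes. The variable names (vocab, char2idx, idx2char, dataset, BATCH_SIZE, examples_per_epoch) are the ones the later code expects; the download URL and the values of seq_length and BUFFER_SIZE follow the official TensorFlow text-generation tutorial and are assumptions, not part of the original article.

```python
import os
import numpy as np
import tensorflow as tf

# Download the Shakespeare text used by the official tutorial.
path = tf.keras.utils.get_file(
    'shakespeare.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
text = open(path, 'rb').read().decode(encoding='utf-8')

vocab = sorted(set(text))                        # unique characters in the text
char2idx = {c: i for i, c in enumerate(vocab)}   # character -> integer id
idx2char = np.array(vocab)                       # integer id -> character
text_as_int = np.array([char2idx[c] for c in text])

seq_length = 100
examples_per_epoch = len(text) // (seq_length + 1)

# Cut the text into sequences of seq_length + 1 characters.
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    # "Shakespear" -> input "Shakespea", target "hakespear"
    return chunk[:-1], chunk[1:]

dataset = sequences.map(split_input_target)

BATCH_SIZE = 64
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
```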

2. Building the Model

We define the model with tf.keras.Sequential. In this example it consists of three layers:

  • tf.keras.layers.Embedding: the input layer. A trainable lookup table that maps each character id to a vector with embedding_dim dimensions.
  • tf.keras.layers.GRU: a type of RNN whose size is specified by units=rnn_units.
  • tf.keras.layers.Dense: the output layer, with vocab_size outputs.
vocab_size = len(vocab)  # number of unique characters in the vocabulary
embedding_dim = 256  # embedding dimension
rnn_units = 1024  # number of RNN units

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model
model = build_model(vocab_size=len(vocab),
                    embedding_dim=embedding_dim,
                    rnn_units=rnn_units,
                    batch_size=BATCH_SIZE)
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (64, None, 256)           16640     
_________________________________________________________________
gru (GRU)                    (64, None, 1024)          3938304   
_________________________________________________________________
dense (Dense)                (64, None, 65)            66625     
=================================================================
Total params: 4,021,569
Trainable params: 4,021,569
Non-trainable params: 0
_________________________________________________________________

For each character, the model looks up its embedding, runs the GRU for one time step with the embedding as input, and applies the dense layer to produce logits predicting the log-likelihood of the next character.
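To make these shapes concrete, here is a quick sanity check on one batch (a sketch; input_example_batch and target_example_batch are assumed to come from the dataset built in section 1):

```python
# Run one untrained forward pass to inspect the output shape.
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    # (batch_size, sequence_length, vocab_size), e.g. (64, 100, 65)
    print(example_batch_predictions.shape)

# The dense layer outputs logits (unnormalized scores), one per vocabulary
# character; sampling from them gives a candidate next character for every
# position of the first sequence in the batch.
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()
print(sampled_indices)
```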

3. Training

3.1 Compiling the Model

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
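Before training, it can be worth checking the loss of the still-untrained model (a sketch reusing the example batch from the shape check above). With 65 characters in the vocabulary, a random model should give a mean loss close to ln(65) ≈ 4.17:

```python
# Per-character crossentropy of the untrained model on one batch.
example_batch_loss = tf.keras.losses.sparse_categorical_crossentropy(
    target_example_batch, example_batch_predictions, from_logits=True)
# A uniform 65-way guess would score roughly ln(65) ≈ 4.17.
print("mean loss:", example_batch_loss.numpy().mean())
```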

3.2 Configuring Checkpoints

# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'

# Name pattern of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

3.3 Training the Model

EPOCHS = 10  # number of passes over the dataset
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])
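If you want to see how training went, the History object returned by model.fit records the loss per epoch and can be plotted (a minimal sketch; matplotlib is not used elsewhere in this post):

```python
import matplotlib.pyplot as plt

# Plot the per-epoch training loss recorded by model.fit.
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()
```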

4. Prediction

The following code block is used to generate text:

  • First choose a start string, initialize the RNN state, and set the number of characters to generate.
  • Use the start string and the RNN state to obtain the prediction distribution for the next character.
  • Then sample from this categorical distribution to get the index of the predicted character, and use that character as the model's next input.
  • The RNN state returned by the model is fed back into it, so the model now has more context than a single character. After the next character is predicted, the updated RNN state is fed back again; in this way the model keeps gathering more context from the previously predicted characters.

The goal here is: given an input sample of n characters, the model outputs, for every input position, a prediction of the character that follows it. During generation we keep the prediction for the last position, sample a character from it, feed that character back into the model as the next input, and repeat this step a specified number of times.

Because the batch size must be specified when the GRU's hidden state is set up, the model can only accept that fixed batch size once it has been built.

To run the model with a different batch_size, we need to rebuild the model and restore the weights from a checkpoint.

4.1 Rebuilding the Model

model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)

# Restore the weights from the most recent checkpoint
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))

model.build(tf.TensorShape([1, None]))
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (1, None, 256)            16640     
_________________________________________________________________
gru_1 (GRU)                  (1, None, 1024)           3938304   
_________________________________________________________________
dense_1 (Dense)              (1, None, 65)             66625     
=================================================================
Total params: 4,021,569
Trainable params: 4,021,569
Non-trainable params: 0
_________________________________________________________________

4.2 Generating Text

def generate_text(model, start_string):
  # Evaluation step (generate text using the trained model)

  # Number of characters to generate
  num_generate = 1000

  # Convert the start string to numbers (vectorize)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # List used to store the generated characters
  text_generated = []

  # The batch size here is 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # Remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # Sample the next character from the categorical distribution
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # Pass the predicted character, together with the previous hidden state,
      # to the model as its next input
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))

print(generate_text(model, start_string=u"ROMEO: "))

The start string here contains 7 characters, so the model initially outputs a tensor of shape (1, 7, 65). We only need the last two dimensions, (7, 65), so tf.squeeze removes the batch dimension. tf.random.categorical then samples an index from the distribution in each row; we keep only the sample for the last position, add the batch dimension back, and feed it in as the next input, repeating these steps until the specified 1000 characters have been generated.
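A common optional tweak, not used in the code above, is to divide the logits by a temperature before sampling: values below 1.0 make the generated text more predictable, values above 1.0 make it more surprising. A minimal sketch (the helper name and the value 0.8 are illustrative assumptions):

```python
def sample_next_id(model, input_eval, temperature=0.8):
    """Sample the next character id, scaling the logits by a temperature."""
    predictions = model(input_eval)            # (1, seq_len, vocab_size)
    predictions = tf.squeeze(predictions, 0)   # (seq_len, vocab_size)
    predictions = predictions / temperature    # scale the logits
    return tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
```

Inside the generation loop, `predicted_id = sample_next_id(model, input_eval, temperature=0.8)` would replace the prediction and sampling lines.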

The final result looks like this:

ROMEO: it may be see, I say.
Elong where I have sea loved for such heart
As of all desperate in your colls?
On how much to purwed esumptrues as we,
But taker appearing our great Isabel,;
Of your brother's needs.
I cannot but one hour, by nimwo and ribs
After 't? O Pedur, break our manory,
The shadot bestering eyes write; onfility;
Indeed I am possips And feated with others and throw it?

CAPULET:
O, not the ut with mine own sort.
But, with your souls, sir, well we would he,
And videwith the sungesoy begins, revell;
Much it in secart.

PROSPERO:
Villain, I darry record;
In sea--lodies, nor that I do I were stir,
You appointed with that sed their o tailor and hope left fear'd,
I so; that your looks stand up,
Comes I truly see this last weok not the
sul us.

CAMILLO:
You did and ever sea,
Into these hours: awake! Ro with mine enemies,
Were werx'd in everlawacted man been to alter
As Lewis could smile to his.

Farthus:
Marry! I'll do lose a man see me
To no drinking often hat back on an illing mo