Transformers Code Notes Series 2 (GenerationMixin Generation Module)

真炎破天

Article link: https://blog.csdn.net/u012409283/article/details/121991586

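This post walks through the key parameters involved in text generation with the Transformers library and what each one does. They include input_ids, decoder_input_ids, and others, ranging from basic length settings to beam search and sampling controls.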

Usage Example

(transformers code to be added)
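As a stand-in, here is a minimal sketch of calling generate() with a few of the parameters covered in this post. It assumes torch and transformers are installed. To avoid downloading weights it builds a tiny, randomly initialized GPT-2, so the generated ids are meaningless; the model sizes, special token ids, and prompt ids are made up for illustration only.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

torch.manual_seed(0)

# Tiny random model (illustrative sizes, not a real checkpoint).
config = GPT2Config(
    vocab_size=100, n_positions=32, n_embd=32, n_layer=2, n_head=2,
    bos_token_id=0, eos_token_id=1, pad_token_id=2,
)
model = GPT2LMHeadModel(config).eval()

# Hypothetical prompt token ids; a real use case would come from a tokenizer.
input_ids = torch.tensor([[5, 17, 42]])
attention_mask = torch.ones_like(input_ids)

# Greedy search: do_sample=False and a single beam.
greedy = model.generate(
    input_ids, attention_mask=attention_mask,
    max_new_tokens=5, do_sample=False,
)

# Sampling with temperature, top-k and top-p filtering.
sampled = model.generate(
    input_ids, attention_mask=attention_mask,
    max_new_tokens=5, do_sample=True,
    temperature=0.8, top_k=10, top_p=0.9,
)

# Beam search with a repetition constraint.
beam = model.generate(
    input_ids, attention_mask=attention_mask,
    max_new_tokens=5, num_beams=3, early_stopping=True,
    no_repeat_ngram_size=2,
)

print(greedy.shape, sampled.shape, beam.shape)
```

Each call returns a tensor whose rows start with the prompt ids followed by the generated ids. With a pretrained checkpoint, the same keyword arguments apply unchanged.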

Parameters

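input_ids: for encoder-decoder models (e.g. T5), input_ids is the encoder input and is used to compute the encoder_outputs features; for decoder-only models (e.g. GPT-3), input_ids is the prompt. If input_ids is None, it is initialized with bos_token_id.
decoder_input_ids: the input to the decoder module of an encoder-decoder model.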
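logits_processor: a list of custom logits processors, applied at each generation step to modify the prediction scores of the language modeling head.
max_new_tokens: the maximum number of new tokens to generate, not counting the prompt. It serves the same purpose as max_length, so the two should not be set together.
max_length: the maximum length of the generated sequence. For decoder-only models this includes the prompt tokens.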
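do_sample: bool, whether to use sampling. Defaults to False, in which case greedy search is used.
early_stopping: whether to stop beam search once at least num_beams finished sentences are available per batch.
num_beams: the number of beams for beam search. 1 means no beam search.
temperature: the value used to scale the next-token logits before softmax. Values below 1.0 sharpen the distribution, values above 1.0 flatten it.
top_k: the number of highest-probability vocabulary tokens to keep for top-k filtering.
top_p: if set below 1.0, only the smallest set of most probable tokens whose probabilities add up to top_p or more is kept for sampling.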
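repetition_penalty: penalty applied to tokens that have already been generated. 1.0 means no penalty. See https://arxiv.org/pdf/1909.05858.pdf for details.
length_penalty: exponential penalty on the sequence length, used with beam-based generation. The summed log-probability of a hypothesis is divided by length ** length_penalty. Because log-probabilities are negative, larger values favor longer sequences and smaller values favor shorter ones. The default of 1.0 normalizes the score by length; 0.0 applies no length normalization.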
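no_repeat_ngram_size: if set to a value greater than 0, no n-gram of that size may occur more than once in the output.
encoder_no_repeat_ngram_size: n-grams of this size that occur in encoder_input_ids may not occur in decoder_input_ids.
bad_words_ids: a list of token-id sequences that are not allowed to be generated.
use_cache: whether the model should reuse the cached key/value attention states from previous steps to speed up decoding.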
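num_beam_groups: the number of groups to divide num_beams into so that beams in different groups stay diverse (diverse beam search). See https://arxiv.org/pdf/1610.02424.pdf
diversity_penalty: the value subtracted from a beam's score when it generates the same token as a beam from another group at a given step. Only effective when group beam search is enabled.
output_scores: whether to return the prediction scores of the generated sequence.
synced_gpus: whether to keep running the generation loop until max_length, which is needed for DeepSpeed ZeRO stage 3.
model_kwargs: extra keyword arguments forwarded to the model's forward function. For an encoder-decoder model, encoder-specific arguments are passed without a prefix, while decoder-specific arguments must be prefixed with 'decoder_'.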
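To make the score-shaping parameters above more concrete, here is a rough pure-Python sketch of what repetition_penalty, temperature, top_k, top_p, no_repeat_ngram_size, and length_penalty do to the next-token scores. This is a simplified illustration on plain lists, not the library's implementation; all function names are hypothetical.

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty):
    """Lower the logits of tokens that were already generated."""
    out = list(logits)
    for tok in set(generated_ids):
        # Dividing a positive logit or multiplying a negative one both reduce it.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def filtered_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Next-token probabilities after temperature, top-k, then top-p."""
    scaled = [x / temperature for x in logits]
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]  # keep only the k best tokens
    probs = softmax([scaled[i] for i in order])  # renormalize over survivors
    kept, cumulative = [], 0.0
    for idx, p in zip(order, probs):
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:  # smallest set reaching the top_p mass
            break
    total = sum(p for _, p in kept)
    return {idx: p / total for idx, p in kept}

def banned_next_tokens(generated_ids, ngram_size):
    """Tokens that would complete an n-gram already present in the output."""
    if ngram_size <= 0 or len(generated_ids) < ngram_size:
        return set()
    prefix = tuple(generated_ids[len(generated_ids) - ngram_size + 1:])
    banned = set()
    for start in range(len(generated_ids) - ngram_size + 1):
        ngram = tuple(generated_ids[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned

def beam_score(sum_logprobs, length, length_penalty=1.0):
    """Length-normalized beam score: summed log-probability / length ** penalty."""
    return sum_logprobs / (length ** length_penalty)

logits = [2.0, 1.0, 0.0, -1.0]
print(filtered_distribution(logits, temperature=0.5, top_k=2))
print(banned_next_tokens([1, 2, 3, 1], ngram_size=2))
```

Sampling with do_sample=True then draws the next token from the filtered distribution, while the banned tokens are excluded before that draw by setting their scores to negative infinity.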