beam_output = model.generate(
    **model_inputs,
    max_new_tokens=40,
    num_beams=5,
    early_stopping=True,  # generation is finished when all beam hypotheses reach the EOS token
)
Language models often fall into repetition during generation. The CTRL paper proposes a simple remedy: keep a record of the tokens that have already been generated, and when predicting the next token, artificially lower the scores of those tokens so that they are less likely to be sampled again:
$$
p_i=\frac{\exp\left(x_i /(T \cdot I(i \in g))\right)}{\sum_j \exp\left(x_j /(T \cdot I(j \in g))\right)} \qquad I(c)=\theta \text{ if } c \text{ is True else } 1
$$
where $T$ is the temperature, $g$ is the list of already generated tokens, and $\theta \geq 1$ is the penalty coefficient.
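The formula above can be sketched in plain Python. This is an illustrative implementation of the CTRL penalty as written (divide the logit of every already-generated token by $\theta$ before the softmax), with hypothetical names; note that the Hugging Face `RepetitionPenaltyLogitsProcessor` additionally handles negative logits differently (multiplying rather than dividing them, so the penalty always pushes the score down).

```python
import math

def penalized_probs(logits, generated, theta=1.2, temperature=1.0):
    """CTRL-style repetition penalty (illustrative sketch).

    logits: raw scores x_i for each token id.
    generated: set of token ids already produced (the list g).
    theta: penalty coefficient, theta >= 1; theta = 1 disables the penalty.
    """
    scaled = []
    for i, x in enumerate(logits):
        # I(i in g) = theta for already-generated tokens, else 1.
        penalty = theta if i in generated else 1.0
        scaled.append(x / (temperature * penalty))
    # Softmax over the penalized scores.
    z = sum(math.exp(s) for s in scaled)
    return [math.exp(s) / z for s in scaled]
```

In practice you rarely implement this yourself: passing `repetition_penalty=1.2` to `model.generate` applies the same idea through the built-in logits processor.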