[Huggingface] The pad_to_multiple_of option in DataCollatorForSeq2Seq

Deno_V

Tags: deep learning, natural language processing, python

Article link: https://blog.csdn.net/weixin_44839047/article/details/133873072

The official explanation is:

pad_to_multiple_of (int, optional):
If set will pad the sequence to a multiple of the provided value.
This is especially useful to enable the use of Tensor Cores on NVIDIA hardware with compute capability >=7.5 (Volta).

The code looks like this:

# Relevant part of the DataCollatorForSeq2Seq implementation
if self.pad_to_multiple_of is not None:
    # Round the longest label length up to the next multiple of pad_to_multiple_of
    max_label_length = (
        (max_label_length + self.pad_to_multiple_of - 1)
        // self.pad_to_multiple_of
        * self.pad_to_multiple_of
    )
# pad_to_multiple_of is also forwarded to tokenizer.pad
features = self.tokenizer.pad(
    features,
    padding=self.padding,
    max_length=self.max_length,
    pad_to_multiple_of=self.pad_to_multiple_of,
    return_tensors=return_tensors,
)
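
The rounding expression in the collator code above is integer ceiling division: it gives the smallest multiple of pad_to_multiple_of that is not less than the longest label length. Here is a minimal sketch in plain Python to make the arithmetic concrete. The helpers `round_up_to_multiple` and `pad_labels` are hypothetical names made up for illustration, not part of transformers, and the right-side padding with -100 is an assumption based on the collator's default label_pad_token_id.

```python
def round_up_to_multiple(length: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= `length`."""
    # Same integer trick as in the collator: add (multiple - 1), floor-divide, multiply back.
    return (length + multiple - 1) // multiple * multiple


def pad_labels(labels, pad_to_multiple_of=None, label_pad_token_id=-100):
    """Right-pad every label list to a common length.

    Hypothetical helper for illustration only; it is not the library's code.
    """
    max_label_length = max(len(seq) for seq in labels)
    if pad_to_multiple_of is not None:
        max_label_length = round_up_to_multiple(max_label_length, pad_to_multiple_of)
    # Append the padding id until each row reaches the common length.
    return [seq + [label_pad_token_id] * (max_label_length - len(seq)) for seq in labels]


batch = [[1, 2, 3], [4, 5, 6, 7, 8, 9, 10, 11, 12]]  # label lengths 3 and 9
padded = pad_labels(batch, pad_to_multiple_of=8)
print([len(seq) for seq in padded])  # the longest length 9 is rounded up to 16
```

With pad_to_multiple_of left as None, the same batch would be padded only to the longest length, 9. A length that is already a multiple of 8 stays unchanged.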