处理成对比较数据和训练-CSDN博客

本文链接：https://blog.csdn.net/apple_63504356/article/details/139743898

pairwise库包含了几个关键组件，主要用于处理成对比较数据和训练。这些组件包括数据整理器（PairwiseDataCollatorWithPadding）和一个专门的训练器（PairwisePeftTrainer），这些组件主要用于训练模型进行成对选择，比如在推荐系统或排序任务中评估哪个项目更优。下面是对文件中代码的编写思路、编写目的和具体作用的详细分析：

编写思路

继承和扩展：PairwiseDataCollatorWithPadding类继承了DynamicDataCollatorWithPadding，添加了针对成对数据处理的逻辑。PairwisePeftTrainer类则继承自PeftTrainer，专门计算成对比较的损失函数。
自定义数据整理器：修改数据整理器的__call__方法来适应成对数据的处理需求，确保每个batch中包含两组数据——被选择的数据和被拒绝的数据。
损失函数的定制：在PairwisePeftTrainer中重写compute_loss方法，使用成对逻辑来计算损失，即比较被选择项与被拒绝项的评分差异。

编写目的

支持成对数据训练：针对需要成对比较的任务（如排序或推荐），提供特定的数据处理和训练支持。
优化模型性能：通过精确的损失计算，帮助模型更好地学习区分不同项的优劣，提升模型的决策能力。
提高训练的灵活性：使训练过程能够处理成对数据，提供更多的训练自定义选项，如返回损失和其他输出，以便进行更细致的性能调试和评估。

作用

PairwiseDataCollatorWithPadding：

确保数据整理过程中能够处理成对的输入格式，每个batch包含等量的被选择和被拒绝的示例。
通过继承DynamicDataCollatorWithPadding，保持了在批数据中动态填充的功能，同时增加了处理成对数据结构的能力。

class PairwiseDataCollatorWithPadding(DynamicDataCollatorWithPadding):
    r"""
    Data collator for pairwise data.
    """

    def __call__(self, features: Sequence[Dict[str, Union[torch.Tensor, Sequence[int]]]]) -> Dict[str, torch.Tensor]:
        r"""
        Pads batched data to the longest sequence in the batch.

        We generate 2 * n examples where the first n examples represent chosen examples and
        the last n examples represent rejected examples.
        """
        features = [{"input_ids": feature[key]} for key in ("accept_ids", "reject_ids") for feature in features]
        return super().__call__(features)

PairwisePeftTrainer：

特别设计来计算成对数据的损失，这对于需要评估两个选项并选择最佳选项的任务至关重要。
允许训练过程中返回额外的输出，例如计算的损失、接受项的评分和拒绝项的评分，有助于分析模型表现。
支持自定义训练行为，可根据具体需求调整损失函数和输出。

class PairwisePeftTrainer(PeftTrainer):
    r"""
    Inherits PeftTrainer to compute pairwise loss.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.can_return_loss = True # override property to return eval_loss

    def compute_loss(self, model, inputs, return_outputs=False):
        
        batch_size = inputs["input_ids"].size(0) // 2
        _, _, values = model(**inputs)
        r_accept, r_reject = values[:, -1].split(batch_size, dim=0)
        loss = -torch.log(torch.sigmoid(r_accept - r_reject)).mean()
        return (loss, [loss, r_accept, r_reject]) if return_outputs else loss

pairwise库通过提供专门的数据整理器和训练器，为成对比较任务提供了必要的工具，使得训练模型可以有效地学习如何从两个选项中选择更好的一个