A Detailed Reimplementation of Loss Functions (PyTorch Version)

夏天是冰红茶

Last modified 2024-07-11 20:14:10

First published 2024-01-25 19:54:13

Article link: https://blog.csdn.net/m0_62919535/article/details/135846638

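This article explains the role of loss functions in machine learning and deep learning, focusing on the concepts, formulas, and PyTorch usage of L1Loss, mean squared error (L2Loss), binary cross-entropy (BCELoss), and cross-entropy loss (CrossEntropyLoss). These loss functions measure the gap between model predictions and actual labels, and the model is optimized by minimizing them.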

What Is a Loss Function

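In machine learning and deep learning, a loss function measures the difference between a model's predictions and the true labels. It quantifies how far the model's predictions on the training samples deviate from those labels. Training adjusts the model parameters to minimize the value of the loss function, with the aim of improving the model's accuracy and generalization.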
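To make "adjusting the parameters to minimize the loss" concrete, here is a minimal sketch. The toy data, the function name `fit_slope`, the learning rate, and the step count are all assumptions chosen for illustration; the sketch only shows the usual loop of computing a loss, backpropagating, and updating a parameter.

```python
import torch
import torch.nn as nn


def fit_slope(steps=200, lr=0.05):
    """Fit w in y_hat = w * x by minimizing the mean squared error."""
    x = torch.tensor([1.0, 2.0, 3.0, 4.0])
    y = 2.0 * x  # toy targets generated with a true slope of 2
    w = torch.zeros(1, requires_grad=True)  # parameter to learn, starts at 0
    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD([w], lr=lr)

    losses = []
    for _ in range(steps):
        optimizer.zero_grad()        # clear gradients from the previous step
        loss = criterion(w * x, y)   # how far the predictions are from the labels
        loss.backward()              # gradient of the loss with respect to w
        optimizer.step()             # move w in the direction that lowers the loss
        losses.append(loss.item())
    return w.item(), losses


if __name__ == "__main__":
    w, losses = fit_slope()
    print("learned slope:", w)
    print("first loss:", losses[0], "last loss:", losses[-1])
```

The loss recorded at the last step should be far smaller than at the first step, and the learned slope should end up close to 2.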

Common Loss Functions

Each reimplementation below is checked against the official PyTorch implementation.

L1Loss

This is the mean absolute error (MAE), defined as:

$$L_1 = \frac{1}{N}\sum_{i=1}^{N}\left| y_i - \hat{y}_i \right|$$

Here $y_i$ is the true label of sample $i$ and $\hat{y}_i$ is the model's prediction for sample $i$. Averaging the absolute error over all samples gives the L1 loss.

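import torch
import torch.nn as nn


class L1Loss(nn.Module):
    """Mean absolute error, written by hand for comparison with nn.L1Loss."""

    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        # Average of the element-wise absolute differences
        return torch.mean(torch.abs(input - target))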

The test code is as follows:

if __name__ == "__main__":
    criterion1 = nn.L1Loss()  # official implementation
    criterion2 = L1Loss()     # hand-written implementation

    input_data = torch.tensor([2.0, 3.0, 4.0, 5.0])
    target_data = torch.tensor([4.0, 5.0, 6.0, 7.0])
    loss1 = criterion1(input_data, target_data)
    print(loss1)
    loss2 = criterion2(input_data, target_data)
    print(loss2)

Both print tensor(2.)
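The official `nn.L1Loss` also accepts a `reduction` argument (`"none"`, `"mean"`, or `"sum"`), while the hand-written class always averages. The sketch below is one possible way to extend the comparison; the class name `L1LossWithReduction` and the helper `compare_reductions` are hypothetical names introduced only for this example.

```python
import torch
import torch.nn as nn


class L1LossWithReduction(nn.Module):
    """Hypothetical variant of the hand-written L1 loss with a `reduction` option."""

    def __init__(self, reduction="mean"):
        super().__init__()
        if reduction not in ("none", "mean", "sum"):
            raise ValueError(f"unsupported reduction: {reduction}")
        self.reduction = reduction

    def forward(self, input, target):
        abs_err = torch.abs(input - target)  # element-wise |prediction - label|
        if self.reduction == "mean":
            return abs_err.mean()
        if self.reduction == "sum":
            return abs_err.sum()
        return abs_err  # "none": keep one loss value per element


def compare_reductions():
    """Run the official and hand-written losses side by side for each reduction."""
    input_data = torch.tensor([2.0, 3.0, 4.0, 5.0])
    target_data = torch.tensor([4.0, 5.0, 6.0, 7.0])
    results = {}
    for reduction in ("none", "mean", "sum"):
        official = nn.L1Loss(reduction=reduction)(input_data, target_data)
        custom = L1LossWithReduction(reduction)(input_data, target_data)
        results[reduction] = (official, custom)
    return results


if __name__ == "__main__":
    for name, (official, custom) in compare_reductions().items():
        print(name, official, custom)
```

With the same four-element data as the test, every element has an absolute error of 2, so `"sum"` gives 8 and `"mean"` gives 2.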

L2Loss

This is the mean squared error (MSE), defined as:

$$L_2 = \frac{1}{N}\sum_{i=1}^{N}\left( y_i - \hat{y}_i \right)^2$$

Here $y_i$ is the true label of sample $i$ and $\hat{y}_i$ is the model's prediction for sample $i$. The loss is the average squared difference between each element of the predicted output and the corresponding element of the target (ground truth).

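import torch
import torch.nn as nn


class L2Loss(nn.Module):
    """Mean squared error, written by hand for comparison with nn.MSELoss."""

    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        # Average of the element-wise squared differences
        return torch.mean(torch.pow(input - target, 2))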

The test code is as follows:

if __name__ == "__main__":
    criterion1 = nn.MSELoss()  # official implementation
    criterion2 = L2Loss()      # hand-written implementation

    input_data = torch.tensor([2.0, 3.0, 4.0, 5.0])
    target_data = torch.tensor([4.0, 5.0, 6.0, 7.0])
    loss1 = criterion1(input_data, target_data)
    print(loss1)
    loss2 = criterion2(input_data, target_data)
    print(loss2)

Both print tensor(4.)
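Since the loss exists to be minimized, it can help to look at its gradient too. The sketch below reuses the numbers from the test and compares the gradient that autograd computes for `nn.MSELoss` with the analytic form $2(\hat{y}_i - y_i)/N$. The function name `mse_gradient` is hypothetical and used only for this illustration.

```python
import torch
import torch.nn as nn


def mse_gradient():
    # Same numbers as the test; requires_grad lets autograd track the prediction.
    input_data = torch.tensor([2.0, 3.0, 4.0, 5.0], requires_grad=True)
    target_data = torch.tensor([4.0, 5.0, 6.0, 7.0])

    loss = nn.MSELoss()(input_data, target_data)
    loss.backward()  # fills input_data.grad with d(loss)/d(input)

    # Analytic gradient of the mean squared error: 2 * (prediction - label) / N
    analytic = 2 * (input_data.detach() - target_data) / input_data.numel()
    return input_data.grad, analytic


if __name__ == "__main__":
    autograd_grad, analytic_grad = mse_gradient()
    print("autograd:", autograd_grad)
    print("analytic:", analytic_grad)
```

Every prediction is 2 below its label and $N = 4$, so each gradient entry is $2 \cdot (-2) / 4 = -1$. The negative sign says that raising the prediction lowers the loss.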

BCELoss

Binary cross-entropy loss (BCE loss) is also known as log loss.

$$\text{BCELoss} = -\frac{1}{N}\sum_{i=1}^{N}\left( y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \right)$$

Here $y_i$ is the true label of sample $i$ (0 or 1) and $\hat{y}_i$ is the model's predicted probability that sample $i$ belongs to the positive class. The loss measures, element by element, how far the predicted probabilities are from the targets (ground truth) in terms of log-probability.

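import torch
import torch.nn as nn


class BCELoss(nn.Module):
    """Binary cross-entropy that takes raw logits and applies sigmoid itself."""

    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        # Unlike nn.BCELoss, which expects probabilities, this maps logits to (0, 1) first
        prob = torch.sigmoid(input)
        loss = -(target * torch.log(prob) + (1 - target) * torch.log(1 - prob))
        return loss.mean()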

The test code is as follows:

if __name__ == "__main__":
    criterion1 = nn.BCELoss()  # official implementation, expects probabilities
    criterion2 = BCELoss()     # hand-written implementation, expects logits
    input_data = torch.randn((5,))
    print(input_data)
    target_data = torch.randint(0, 2, (5,), dtype=torch.float32)
    print(target_data)
    # Apply sigmoid before the official loss so both see the same probabilities
    loss1 = criterion1(torch.sigmoid(input_data), target_data)
    loss2 = criterion2(input_data, target_data)
    print("PyTorch BCELoss:", loss1.item())
    print("MY BCELoss:", loss2.item())

tensor([-2.0343, -1.5186, 1.6389, 0.4658, 0.6823])
tensor([1., 0., 1., 1., 1.])

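For the input and target printed above, both implementations output 0.6857892274856567. The inputs are random, so the value differs from run to run.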
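Because the hand-written class applies the sigmoid itself, it is closer in behaviour to PyTorch's `nn.BCEWithLogitsLoss` than to `nn.BCELoss`. The sketch below checks that claim and also shows one case where the two differ: a very large logit. The function names and the seed are assumptions made for this illustration.

```python
import torch
import torch.nn as nn


def naive_bce_from_logits(logits, target):
    """Sigmoid followed by the BCE formula, mirroring the hand-written class."""
    prob = torch.sigmoid(logits)
    loss = -(target * torch.log(prob) + (1 - target) * torch.log(1 - prob))
    return loss.mean()


def compare_with_logits_loss():
    torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable
    logits = torch.randn(5)
    target = torch.randint(0, 2, (5,), dtype=torch.float32)
    naive = naive_bce_from_logits(logits, target)
    fused = nn.BCEWithLogitsLoss()(logits, target)
    return naive, fused


def extreme_logit_case():
    # A very confident wrong prediction: logit 100 for a negative sample.
    logits = torch.tensor([100.0])
    target = torch.tensor([0.0])
    naive = naive_bce_from_logits(logits, target)   # sigmoid saturates to 1, so log(1 - 1) = -inf
    fused = nn.BCEWithLogitsLoss()(logits, target)  # computed in a numerically stable form
    return naive, fused


if __name__ == "__main__":
    print(compare_with_logits_loss())
    print(extreme_logit_case())
```

For ordinary logits the two values agree up to floating-point error. For the extreme logit, the naive version returns `inf` in float32, while `nn.BCEWithLogitsLoss` returns a finite value near 100. That is one reason to prefer the fused loss when the model outputs raw logits.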

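When the true label is 1 ($y_i = 1$), the sample is a positive example, so we want the predicted probability $\hat{y}_i$ to be as close to 1 as possible. The larger $\hat{y}_i$ is, the closer $\log(\hat{y}_i)$ is to 0 and the smaller $-\log(\hat{y}_i)$ becomes. The loss term in this case is therefore $-\log(\hat{y}_i)$.

When the true label is 0 ($y_i = 0$), the sample is a negative example, so we want the predicted probability $\hat{y}_i$ to be as close to 0 as possible. The larger $1 - \hat{y}_i$ is, the closer $\log(1 - \hat{y}_i)$ is to 0 and the smaller $-\log(1 - \hat{y}_i)$ becomes. The loss term in this case is $-(1 - y_i)\log(1 - \hat{y}_i)$, which reduces to $-\log(1 - \hat{y}_i)$ because $y_i = 0$.

Each label is either 0 or 1, so only one of the two terms is nonzero for any given sample. Adding the two terms covers both cases in a single expression, and averaging over the $N$ samples gives the BCELoss formula above.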
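The two cases can be checked numerically with a few probabilities. This is a small sketch in plain Python; the helper name `bce_term` is hypothetical and computes the loss contribution of a single sample.

```python
import math


def bce_term(y, y_hat):
    """Loss contribution of one sample: -(y*log(y_hat) + (1-y)*log(1-y_hat))."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


if __name__ == "__main__":
    for y_hat in (0.1, 0.5, 0.9):
        print(f"y_hat={y_hat}: loss if y=1 -> {bce_term(1, y_hat):.4f}, "
              f"loss if y=0 -> {bce_term(0, y_hat):.4f}")
```

For a positive sample the loss shrinks as the predicted probability moves from 0.1 to 0.9, and for a negative sample it grows. At 0.5 both cases give $\log 2 \approx 0.6931$.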

CrossEntropyLoss

Cross-entropy loss (CrossEntropyLoss) is a loss function commonly used for multi-class classification in deep learning. It measures the difference between the probability distribution produced by the model and the true labels.

$$\text{CrossEntropyLoss}(x, y) = -\frac{1}{N}\sum_{i=1}^{N}\log\left(\frac{\exp(x_{i, y_i})}{\sum_{j=1}^{C}\exp(x_{i, j})}\right)$$

Here $x_{i,j}$ is the model's raw output (logit) for sample $i$ and class $j$, $y_i$ is the true class index of sample $i$, $C$ is the number of classes, and $N$ is the number of samples.

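import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        # input: raw logits of shape (N, C); target: class indices of shape (N,)
        # log-softmax followed by negative log-likelihood equals cross-entropy
        return nn.NLLLoss()(F.log_softmax(input, dim=1), target)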

The test code is as follows:

if __name__ == "__main__":
    criterion1 = nn.CrossEntropyLoss()  # official implementation
    criterion2 = CrossEntropyLoss()     # hand-written implementation

    input_data = torch.randn((3, 5))         # N = 3 samples, C = 5 classes
    target_data = torch.randint(0, 5, (3,))  # true class index for each sample
    loss1 = criterion1(input_data, target_data)
    loss2 = criterion2(input_data, target_data)
    print("PyTorch CrossEntropyLoss:", loss1.item())
    print("Custom CrossEntropyLoss:", loss2.item())

In the author's run, both printed 2.0007288455963135. The input is random, so the exact value changes from run to run.

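Numerator $\exp(x_{i, y_i})$: the exponential of the model's raw output for the correct class of sample $i$. We want this to be as large as possible, because we want the model to be more confident in the correct class.

Denominator $\sum_{j=1}^{C}\exp(x_{i, j})$: the sum of the exponentials of the model's raw outputs over all classes for sample $i$. It normalizes the raw outputs into a probability distribution. Dividing by this sum gives a probability for each class, which expresses the model's relative confidence in that class.

Logarithm: we take the log of the true-class probability obtained above. This maps a probability in $(0, 1]$ to a real number in $(-\infty, 0]$, which is convenient for numerical optimization. The closer the estimated probability of the true label is to 1, the closer its log is to 0, so the negative log that appears in the loss is the quantity we want to be as small as possible.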
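The three parts described above (numerator, denominator, logarithm) can be written out literally and compared with PyTorch's built-in `F.cross_entropy`. This is only an illustrative sketch: the function names and the seed are assumptions, and the literal form is not how the library computes the loss internally.

```python
import torch
import torch.nn.functional as F


def cross_entropy_step_by_step(logits, target):
    """Follow the formula literally: numerator, denominator, log, then average."""
    rows = torch.arange(logits.size(0))
    numerator = torch.exp(logits[rows, target])  # exp(x_{i, y_i}) for each sample i
    denominator = torch.exp(logits).sum(dim=1)   # sum over j of exp(x_{i, j})
    prob_true_class = numerator / denominator    # softmax probability of the true class
    return -torch.log(prob_true_class).mean()    # negative log, averaged over N samples


def compare_with_builtin():
    torch.manual_seed(0)  # arbitrary seed so the comparison is repeatable
    logits = torch.randn(3, 5)          # N = 3 samples, C = 5 classes
    target = torch.randint(0, 5, (3,))  # true class index per sample
    manual = cross_entropy_step_by_step(logits, target)
    builtin = F.cross_entropy(logits, target)
    return manual, builtin


if __name__ == "__main__":
    manual, builtin = compare_with_builtin()
    print("step by step:", manual.item())
    print("F.cross_entropy:", builtin.item())
```

The two values should agree up to floating-point error for moderate logits. The literal form can overflow when logits are very large, because `torch.exp` is applied directly. That is why practical implementations, including the `log_softmax` call in the hand-written class, work in log space.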
