【PyTorch】4.2 Loss Functions (Part 1)


Task overview:

Learn the principles of weight initialization; introduce the relationship between loss functions, cost functions, and objective functions; and study the cross-entropy loss function.

Details:

This section covers the connections and differences among loss functions, cost functions, and objective functions, then studies the cross-entropy loss used in the RMB banknote binary-classification task. While explaining cross-entropy loss, it also analyzes the relationships among self-information, entropy, relative entropy, and cross-entropy. Finally, it covers four loss functions:

  1. nn.CrossEntropyLoss

  2. nn.NLLLoss

  3. nn.BCELoss

  4. nn.BCEWithLogitsLoss

I. The Concept of Loss Functions

A loss function measures the discrepancy between the model's output and the ground truth for a single sample:

$Loss = f(\hat{y}, y)$

A cost function averages the loss over the whole training set:

$Cost = \frac{1}{N}\sum_{i=1}^{N} f(\hat{y}_i, y_i)$

An objective function adds a regularization term to the cost:

$Obj = Cost + Regularization$

In practice, "loss function" is often used loosely for all three; what training actually minimizes is the objective. In PyTorch, loss functions inherit from `nn.Module` (see the debugging walkthrough below).

The `size_average` and `reduce` parameters are deprecated and will be removed; do not use them. Use `reduction` instead.
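For example (a minimal sketch; the old `size_average=True`/`reduce=True` defaults correspond to `reduction='mean'`):

import torch.nn as nn

# deprecated style -- emits a warning and will be removed:
# criterion = nn.CrossEntropyLoss(size_average=True)

# equivalent modern style:
criterion = nn.CrossEntropyLoss(reduction='mean')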

Test code:

The full code is in: 【PyTorch学习】3.1 模型创建与nn.Module

......
# parameter settings
......
# ============================ step 1/5 data ============================
......

# build MyDataset instance
......
# build DataLoader
......
# ============================ step 2/5 model ============================
......
# ============================ step 3/5 loss function ============================
loss_function = nn.CrossEntropyLoss()                                                   # choose loss function
# ============================ step 4/5 optimizer ============================
......                                                                                  # choose optimizer
......                                                                                  # set learning-rate decay policy
# ============================ step 5/5 training ============================
......
for epoch in range(MAX_EPOCH):
    ......
    for i, data in enumerate(train_loader):
        # forward
        ......
        # backward
        optimizer.zero_grad()
        loss = loss_function(outputs, labels)
        loss.backward()
        # update weights
        ......
        # track classification results
        # print training info
        ......

Set breakpoints at `loss_function = nn.CrossEntropyLoss()` and at `loss = loss_function(outputs, labels)`, then debug to understand the mechanism.

First, run the debugger to `loss_function = nn.CrossEntropyLoss()` and "step into" it.

class CrossEntropyLoss(_WeightedLoss):
    r"""This criterion combines :func:`nn.LogSoftmax` and 
	.......
    Examples::

        >>> loss = nn.CrossEntropyLoss()
        >>> input = torch.randn(3, 5, requires_grad=True)
        >>> target = torch.empty(3, dtype=torch.long).random_(5)
        >>> output = loss(input, target)
        >>> output.backward()
    """
    __constants__ = ['weight', 'ignore_index', 'reduction']

    def __init__(self, weight=None, size_average=None, ignore_index=-100,
                 reduce=None, reduction='mean'):
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input, target):
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)

"step into"进入:super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)

class _WeightedLoss(_Loss):
    def __init__(self, weight=None, size_average=None, reduce=None, reduction='mean'):
        super(_WeightedLoss, self).__init__(size_average, reduce, reduction)
        self.register_buffer('weight', weight)

`_WeightedLoss` inherits from `_Loss`.

"step into"进入:super(_WeightedLoss, self).__init__(size_average, reduce, reduction)

class _Loss(Module):
    def __init__(self, size_average=None, reduce=None, reduction='mean'):
        super(_Loss, self).__init__()
        if size_average is not None or reduce is not None:
            self.reduction = _Reduction.legacy_get_string(size_average, reduce)
        else:
            self.reduction = reduction

`_Loss` in turn inherits from `Module`, so every loss function is itself an `nn.Module`.
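A quick check (a minimal sketch) confirms this inheritance chain:

import torch.nn as nn

criterion = nn.CrossEntropyLoss()
print(isinstance(criterion, nn.Module))   # True: losses are Modules,
                                          # so criterion(...) goes through Module.__call__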

Continuing the debugger, the next breakpoint is `loss = loss_function(outputs, labels)`. "Step into" it: because a loss is a `Module`, the call lands in `Module.__call__`.

"Step into" `result = self.forward(*input, **kwargs)`: this dispatches to `CrossEntropyLoss.forward`, which calls `F.cross_entropy`. Stepping into `cross_entropy` shows it checking `reduction` and carrying out the computation.

II. The Cross-Entropy Loss Function

The relationship between cross-entropy and relative entropy:

Self-information: $I(x) = -\log P(x)$

Entropy: $H(P) = -\sum_i P(x_i)\log P(x_i)$

Relative entropy (KL divergence): $D_{KL}(P \| Q) = \sum_i P(x_i)\log\frac{P(x_i)}{Q(x_i)}$

Cross-entropy: $H(P, Q) = -\sum_i P(x_i)\log Q(x_i) = H(P) + D_{KL}(P \| Q)$

Here $P$ is the true distribution (the training set / sample distribution) and $Q$ is the model's distribution. Since the training set is fixed, $H(P)$ is a constant, and constants can be ignored during optimization. Therefore, optimizing the cross-entropy is equivalent to optimizing the relative entropy.
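The decomposition can be verified numerically (a minimal sketch with made-up distributions):

import torch

P = torch.tensor([0.7, 0.2, 0.1])    # "true" distribution
Q = torch.tensor([0.5, 0.3, 0.2])    # model distribution

H_P  = -(P * P.log()).sum()          # entropy H(P)
H_PQ = -(P * Q.log()).sum()          # cross-entropy H(P, Q)
KL   = (P * (P / Q).log()).sum()     # relative entropy D_KL(P || Q)

print(H_PQ, H_P + KL)                # both print the same value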

III. Loss Functions in PyTorch

1. nn.CrossEntropyLoss

`nn.CrossEntropyLoss` combines `nn.LogSoftmax` and `nn.NLLLoss`: it applies a softmax to turn raw scores into a probability distribution and then computes the cross-entropy. For a sample with raw scores $x$ and true class $class$:

$\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)$

With per-class weights:

$\text{loss}(x, class) = weight[class]\left(-x[class] + \log\sum_j \exp(x[j])\right)$

Main parameters: `weight` (per-class loss weight), `ignore_index` (a class index to ignore), and `reduction` (`'none'`, `'sum'`, or `'mean'`).

1.1 Testing the three `reduction` modes

Test code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

# fake data
inputs = torch.tensor([[1, 2], [1, 3], [1, 3]], dtype=torch.float)
target = torch.tensor([0, 1, 1], dtype=torch.long)

# ----------------------------------- CrossEntropy loss: reduction -----------------------------------
# flag = 0
flag = 1
if flag:
    # def loss function
    loss_f_none = nn.CrossEntropyLoss(weight=None, reduction='none')
    loss_f_sum = nn.CrossEntropyLoss(weight=None, reduction='sum')
    loss_f_mean = nn.CrossEntropyLoss(weight=None, reduction='mean')

    # forward
    loss_none = loss_f_none(inputs, target)
    loss_sum = loss_f_sum(inputs, target)
    loss_mean = loss_f_mean(inputs, target)

    # view
    print("Cross Entropy Loss:\n ", loss_none, loss_sum, loss_mean)

Output:

Cross Entropy Loss:
  tensor([1.3133, 0.1269, 0.1269]) tensor(1.5671) tensor(0.5224)
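The first element can be reproduced by hand from the formula above (a minimal sketch):

import torch

# sample 0: raw scores x = [1, 2], target class 0
# loss = -x[0] + log(sum_j exp(x[j]))
x = torch.tensor([1., 2.])
print(-x[0] + torch.log(torch.exp(x).sum()))   # tensor(1.3133)

The 'sum' result is the total of the three per-sample losses, and 'mean' divides by the sample count: 1.5671 / 3 = 0.5224.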

1.2 Testing `weight`

Test code:

# ----------------------------------- weight -----------------------------------
# flag = 0
flag = 1
if flag:
    # def loss function
    weights = torch.tensor([1, 2], dtype=torch.float)
    # weights = torch.tensor([0.7, 0.3], dtype=torch.float)

    loss_f_none_w = nn.CrossEntropyLoss(weight=weights, reduction='none')
    loss_f_sum = nn.CrossEntropyLoss(weight=weights, reduction='sum')
    loss_f_mean = nn.CrossEntropyLoss(weight=weights, reduction='mean')

    # forward
    loss_none_w = loss_f_none_w(inputs, target)
    loss_sum = loss_f_sum(inputs, target)
    loss_mean = loss_f_mean(inputs, target)

    # view
    print("\nweights: ", weights)
    print(loss_none_w, loss_sum, loss_mean)

Output:

weights:  tensor([1., 2.])
tensor([1.3133, 0.2539, 0.2539]) tensor(1.8210) tensor(0.3642)

For comparison, the loss values without `weight` were:

Cross Entropy Loss:
  tensor([1.3133, 0.1269, 0.1269]) tensor(1.5671) tensor(0.5224)

Computation: $1.8210 = 1 \times 1.3133 + 2 \times 0.1269 + 2 \times 0.1269$

This is because 1.3133 belongs to class 0, whose weight is 1, while the two 0.1269 values belong to class 1, whose weight is 2.

With `weight` set, `'mean'` divides by the total weight of the targets, not by the sample count: $0.3642 = \frac{1.8210}{1 + 2 + 2}$
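The same numbers fall out of the 'none' output directly (a minimal sketch):

import torch

loss_none_w = torch.tensor([1.3133, 0.2539, 0.2539])    # weighted per-sample losses
target_w = torch.tensor([1., 2., 2.])                   # weight of each sample's target class

print(loss_none_w.sum())                   # tensor(1.8210) -- the 'sum' result
print(loss_none_w.sum() / target_w.sum())  # tensor(0.3642) -- the weighted 'mean'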

2. nn.NLLLoss

`nn.NLLLoss` implements the negative log-likelihood loss $\ell_n = -w_{y_n}\, x_{n, y_n}$: it simply negates (and optionally weights) the input value at the target class index. It applies no log or softmax itself, so it expects log-probabilities (e.g. the output of `nn.LogSoftmax`) as input.

Test:

# ----------------------------------- 2 NLLLoss -----------------------------------
# flag = 0
flag = 1
if flag:

    weights = torch.tensor([1, 1], dtype=torch.float)

    loss_f_none_w = nn.NLLLoss(weight=weights, reduction='none')
    loss_f_sum = nn.NLLLoss(weight=weights, reduction='sum')
    loss_f_mean = nn.NLLLoss(weight=weights, reduction='mean')

    # forward
    loss_none_w = loss_f_none_w(inputs, target)
    loss_sum = loss_f_sum(inputs, target)
    loss_mean = loss_f_mean(inputs, target)

    # view
    print("\nweights: ", weights)
    print("NLL Loss", loss_none_w, loss_sum, loss_mean)

Output:

weights:  tensor([1., 1.])
NLL Loss tensor([-1., -3., -3.]) tensor(-7.) tensor(-2.3333)

Each value is just the negated input at the target index: sample 0 has target 0 and input [1, 2], giving -1; samples 1 and 2 have target 1 and input [1, 3], giving -3. The losses are negative only because the inputs here are raw scores rather than log-probabilities.
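Fed log-probabilities, `nn.NLLLoss` reproduces `nn.CrossEntropyLoss` exactly (a minimal sketch reusing the fake data from above):

# log-softmax first, then NLLLoss -- equivalent to CrossEntropyLoss
log_probs = F.log_softmax(inputs, dim=1)
print(nn.NLLLoss(reduction='none')(log_probs, target))
# tensor([1.3133, 0.1269, 0.1269]) -- identical to the CrossEntropyLoss 'none' output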

3. nn.BCELoss

`nn.BCELoss` computes binary cross-entropy element-wise: $\ell_n = -w_n\left[y_n \log x_n + (1 - y_n)\log(1 - x_n)\right]$. Its inputs must lie in $[0, 1]$, so a sigmoid is typically applied first.

Test code:

# ----------------------------------- 3 BCE Loss -----------------------------------
# flag = 0
flag = 1
if flag:
    inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
    target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)

    target_bce = target

    # squash the inputs into (0, 1) with sigmoid, as BCELoss requires
    inputs = torch.sigmoid(inputs)

    weights = torch.tensor([1, 1], dtype=torch.float)

    loss_f_none_w = nn.BCELoss(weight=weights, reduction='none')
    loss_f_sum = nn.BCELoss(weight=weights, reduction='sum')
    loss_f_mean = nn.BCELoss(weight=weights, reduction='mean')

    # forward
    loss_none_w = loss_f_none_w(inputs, target_bce)
    loss_sum = loss_f_sum(inputs, target_bce)
    loss_mean = loss_f_mean(inputs, target_bce)

    # view
    print("\nweights: ", weights)
    print("BCE Loss", loss_none_w, loss_sum, loss_mean)

Output:

weights:  tensor([1., 1.])
BCE Loss tensor([[0.3133, 2.1269],
        [0.1269, 2.1269],
        [3.0486, 0.0181],
        [4.0181, 0.0067]]) tensor(11.7856) tensor(1.4732)
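The first element can be checked by hand (a minimal sketch): for x = 1 and y = 1, the loss is $-\log(\sigma(1))$:

import torch

print(-torch.log(torch.sigmoid(torch.tensor(1.))))   # tensor(0.3133)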

4. nn.BCEWithLogitsLoss

`nn.BCEWithLogitsLoss` combines a sigmoid layer and `nn.BCELoss` in a single, numerically more stable step: $\ell_n = -w_n\left[y_n \log \sigma(x_n) + (1 - y_n)\log(1 - \sigma(x_n))\right]$. Do not apply a sigmoid to the inputs yourself. Its extra `pos_weight` parameter scales the positive-sample term, which is useful for imbalanced classes.

Test code:

# ----------------------------------- 4 BCE with Logits Loss -----------------------------------
# flag = 0
flag = 1
if flag:
    inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
    target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)

    target_bce = target

    # no sigmoid here -- BCEWithLogitsLoss applies it internally
    # inputs = torch.sigmoid(inputs)

    weights = torch.tensor([1, 1], dtype=torch.float)

    loss_f_none_w = nn.BCEWithLogitsLoss(weight=weights, reduction='none')
    loss_f_sum = nn.BCEWithLogitsLoss(weight=weights, reduction='sum')
    loss_f_mean = nn.BCEWithLogitsLoss(weight=weights, reduction='mean')

    # forward
    loss_none_w = loss_f_none_w(inputs, target_bce)
    loss_sum = loss_f_sum(inputs, target_bce)
    loss_mean = loss_f_mean(inputs, target_bce)

    # view
    print("\nweights: ", weights)
    print(loss_none_w, loss_sum, loss_mean)


# --------------------------------- pos weight

# flag = 0
flag = 1
if flag:
    inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
    target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)

    target_bce = target

    # no sigmoid here either
    # inputs = torch.sigmoid(inputs)

    weights = torch.tensor([1], dtype=torch.float)
    pos_w = torch.tensor([1], dtype=torch.float)        # change to 3 to see the effect below

    loss_f_none_w = nn.BCEWithLogitsLoss(weight=weights, reduction='none', pos_weight=pos_w)
    loss_f_sum = nn.BCEWithLogitsLoss(weight=weights, reduction='sum', pos_weight=pos_w)
    loss_f_mean = nn.BCEWithLogitsLoss(weight=weights, reduction='mean', pos_weight=pos_w)

    # forward
    loss_none_w = loss_f_none_w(inputs, target_bce)
    loss_sum = loss_f_sum(inputs, target_bce)
    loss_mean = loss_f_mean(inputs, target_bce)

    # view
    print("\npos_weights: ", pos_w)
    print(loss_none_w, loss_sum, loss_mean)

Output:

weights:  tensor([1., 1.])
tensor([[0.3133, 2.1269],
        [0.1269, 2.1269],
        [3.0486, 0.0181],
        [4.0181, 0.0067]]) tensor(11.7856) tensor(1.4732)

pos_weights:  tensor([1.])
tensor([[0.3133, 2.1269],
        [0.1269, 2.1269],
        [3.0486, 0.0181],
        [4.0181, 0.0067]]) tensor(11.7856) tensor(1.4732)

At this point the loss values are identical: `pos_weight=1` changes nothing, and both match the `nn.BCELoss` result on sigmoid-squashed inputs, confirming the equivalence. When `pos_w = torch.tensor([3], dtype=torch.float)` is set instead:

weights:  tensor([1., 1.])
tensor([[0.3133, 2.1269],
        [0.1269, 2.1269],
        [3.0486, 0.0181],
        [4.0181, 0.0067]]) tensor(11.7856) tensor(1.4732)

pos_weights:  tensor([3.])
tensor([[0.9398, 2.1269],
        [0.3808, 2.1269],
        [3.0486, 0.0544],
        [4.0181, 0.0201]]) tensor(12.7158) tensor(1.5895)

The loss at each positive position (where the target is 1) is scaled up by a factor of 3, e.g. $0.9398 = 3 \times 0.3133$, while the positions with target 0 are unchanged.
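A quick hand-check of the first element (a minimal sketch):

import torch

# with pos_weight = 3, the positive term is tripled: 3 * -log(sigmoid(1))
print(3 * -torch.log(torch.sigmoid(torch.tensor(1.))))   # tensor(0.9398)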
