Loss functions in PyTorch

小傻ssy

Last modified: 2024-04-07 11:50:29

First published: 2022-03-08 13:13:16

Article link: https://blog.csdn.net/qq_36348142/article/details/123349337

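1. CrossEntropyLoss() and NLLLoss()

nn.CrossEntropyLoss() is the composition of log_softmax() and NLLLoss(): it applies log_softmax() to the raw scores and then passes the result to NLLLoss().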

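import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.NLLLoss()
# node_logits: raw scores of shape (N, 2), e.g. [[0, 1], [1, 0], ..., [0, 1]]
node_logits = torch.tensor([[0., 1.], [1., 0.], [0., 1.]])
# y: class indices of shape (N,), e.g. [1, 0, ..., 1]
y = torch.tensor([1, 0, 1])
# NLLLoss expects log-probabilities, so apply log_softmax first
log_probs = F.log_softmax(node_logits, dim=1)
loss = criterion(log_probs, y)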
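
The relationship stated above can be checked numerically. The following is a minimal sketch; the tensor values are made up for illustration and are not from the original post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 2)        # made-up raw scores, shape (N, 2)
y = torch.tensor([1, 0, 1, 1])    # made-up class indices, shape (N,)

# Path 1: CrossEntropyLoss applied directly to the raw scores
ce = nn.CrossEntropyLoss()(logits, y)

# Path 2: log_softmax followed by NLLLoss
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), y)

print(torch.allclose(ce, nll))
```

Both paths use the default reduction ('mean'), so the two scalars should agree up to floating-point error.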

2. Label Smoothing

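class SmoothLoss(nn.Module):
    def __init__(self, smooth=0.0):
        super().__init__()
        self.smooth = smooth
        self.criterion = nn.KLDivLoss(reduction='batchmean')

    def forward(self, logits, labels):
        # logits: (N, 2) log-probabilities (nn.KLDivLoss expects log-space input)
        # labels: (N,) values in [0, 1]
        tdata = labels.detach().tolist()
        one_hot = torch.zeros(logits.size(), device=logits.device)
        for key, value in enumerate(tdata):
            if value:
                # Non-zero label: use it as the probability of class 1
                one_hot[key] = torch.tensor([1 - value, value])
            else:
                # Zero label: move a little mass (smooth) onto class 1
                one_hot[key] = torch.tensor([1 - self.smooth, self.smooth])
        gtruth = one_hot.detach()
        loss = self.criterion(logits, gtruth)
        return loss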

Another implementation, somewhat more elegant. Both versions use nn.KLDivLoss().

class LabelSmooth(nn.Module):
    def __init__(self, size, pad_idx, smooth):
        super().__init__()
        # reduction='sum' replaces the deprecated size_average=False
        self.criterion = nn.KLDivLoss(reduction='sum')
        self.padding_idx = pad_idx
        self.smooth = smooth
        self.confidence = 1 - smooth
        self.size = size  # number of classes
        self.true_dist = None

    def forward(self, x, target):
        # x: (N, size) log-probabilities; target: (N,) class indices
        true_dist = x.detach().clone()
        # Spread the smoothing mass over every class except the target and the padding class
        true_dist.fill_(self.smooth / (self.size - 2))
        true_dist.scatter_(1, target.unsqueeze(1), self.confidence)
        true_dist[:, self.padding_idx] = 0
        # Zero out the rows whose target is the padding index
        mask = torch.nonzero(target == self.padding_idx)
        if mask.numel() > 0:
            true_dist.index_fill_(0, mask.squeeze(1), 0.0)
        self.true_dist = true_dist
        return self.criterion(x, true_dist)

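So nn.KLDivLoss() takes an input and a target of the same size and computes the loss elementwise between them, whereas nn.NLLLoss() and nn.CrossEntropyLoss() take an input of shape (N, class_num) and a target of class indices, so the target has shape (N,).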
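
The shape difference can be seen side by side in a small sketch. The sizes and values below are made up for illustration. With a one-hot target distribution, nn.KLDivLoss(reduction='batchmean') should give the same value as nn.NLLLoss(), because the entropy term of a one-hot distribution is zero.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
N, C = 4, 3                                      # made-up batch size and class count
log_probs = F.log_softmax(torch.randn(N, C), dim=1)

idx_target = torch.tensor([0, 2, 1, 2])          # class indices, shape (N,)
dist_target = F.one_hot(idx_target, num_classes=C).float()  # distribution, shape (N, C)

# NLLLoss: input (N, C), target (N,)
nll = nn.NLLLoss()(log_probs, idx_target)

# KLDivLoss: input (N, C), target (N, C), both the same size
kl = nn.KLDivLoss(reduction='batchmean')(log_probs, dist_target)

print(log_probs.shape, idx_target.shape, dist_target.shape)
print(torch.allclose(nll, kl))
```

Once the target is smoothed, it is no longer one-hot and cannot be expressed as a vector of indices, which is why the label-smoothing classes above use nn.KLDivLoss().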

Note: the code in this article is not meant to be used as-is. The goal is to show how these three losses differ, so I have simplified it.
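
For something closer to directly usable, recent PyTorch releases (1.10 and later, as far as I know) accept a label_smoothing argument in nn.CrossEntropyLoss. The sketch below compares it with a hand-built smoothed target; the values are made up. Note that the built-in spreads the smoothing mass uniformly over all C classes, which is not the same scheme as the two classes above (they use size-2 and a KL divergence), so this is a related alternative rather than a drop-in equivalent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
N, C, eps = 4, 3, 0.1                  # made-up sizes and smoothing factor
logits = torch.randn(N, C)             # raw scores, not log-probabilities
y = torch.tensor([0, 2, 1, 2])         # class indices, shape (N,)

# Built-in label smoothing
builtin = nn.CrossEntropyLoss(label_smoothing=eps)(logits, y)

# Manual version: (1 - eps) on the true class plus eps / C on every class
smoothed = F.one_hot(y, num_classes=C).float() * (1 - eps) + eps / C
manual = -(smoothed * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(torch.allclose(builtin, manual))
```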

3. Cross-entropy loss vs. MSE loss

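Cross-entropy loss is mainly used for classification, and MSE loss for regression. Cross-entropy does not explicitly take the ranking of the class probabilities into account. For a loss that does consider ranking, torch.nn.MarginRankingLoss is recommended for the binary (pairwise) case; for multi-class ranking you need to implement the loss yourself.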
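
As a minimal sketch of the pairwise case, torch.nn.MarginRankingLoss takes two score tensors and a target of +1 or -1 that says which of the two should rank higher. The scores and the margin below are made up for illustration.

```python
import torch
import torch.nn as nn

x1 = torch.tensor([0.8, 0.2, 0.5])        # made-up scores for the first item of each pair
x2 = torch.tensor([0.3, 0.6, 0.5])        # made-up scores for the second item of each pair
target = torch.tensor([1.0, 1.0, -1.0])   # +1: x1 should rank higher; -1: x2 should

margin = 0.1
loss = nn.MarginRankingLoss(margin=margin)(x1, x2, target)

# The same quantity written out: mean(max(0, -target * (x1 - x2) + margin))
manual = torch.clamp(-target * (x1 - x2) + margin, min=0).mean()

print(torch.allclose(loss, manual))
```

A pair contributes zero loss only when the preferred item already leads by at least the margin.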

GPT-4's answer:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiClassRankingLoss(nn.Module):
    def __init__(self, margin=1.0):
        super(MultiClassRankingLoss, self).__init__()
        self.margin = margin

    def forward(self, scores, targets):
        """
        scores: scores predicted by the model, shape (batch_size, num_classes)
        targets: ground-truth class indices, shape (batch_size,)
        """
        # Get the batch size and the number of classes
        batch_size, num_classes = scores.size()

        # Convert targets to a one-hot encoding
        targets_one_hot = torch.zeros_like(scores).scatter_(1, targets.unsqueeze(1), 1)

        # Hinge term per class: margin + score - (score of the correct class).
        # It is positive whenever a class is not at least `margin` below the correct class.
        loss = F.relu(self.margin + scores - scores.gather(1, targets.unsqueeze(1)).expand_as(scores))

        # Only the wrong classes matter, so zero out the entry of the correct class
        loss = torch.where(targets_one_hot == 1, torch.zeros_like(loss), loss)

        # Return the mean over all batch_size * num_classes entries
        return loss.mean()

# Example usage
batch_size, num_classes = 3, 5
scores = torch.tensor([[2.5, 1.0, 0.5, 6.0, 3.0],
                       [3.5, 2.5, 1.0, 0.5, 4.0],
                       [1.0, 2.0, 3.0, 4.0, 5.0]], dtype=torch.float32)  # predicted scores
targets = torch.tensor([3, 4, 4], dtype=torch.long)  # ground-truth classes

# Instantiate the multi-class ranking loss
criterion = MultiClassRankingLoss(margin=1.0)

# Compute the loss
loss = criterion(scores, targets)
print(f"Loss: {loss.item()}")

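Although cross-entropy loss itself does not deal with the ranking of probabilities, in some settings, such as information retrieval or recommender systems, we may care more about whether the predicted probabilities are ranked correctly than about plain accuracy. In those cases, a ranking loss can be used as the training objective, and ranking metrics such as Mean Average Precision (MAP) can be used to evaluate how well the model orders its predictions.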

4. torch.nn.MultiMarginLoss

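torch.nn.MultiMarginLoss works on scores (also called logits), not probabilities. Its goal is to make the score of the correct class (the positive class) exceed the score of every wrong class (the negative classes) by at least a specified margin. Note that it operates on the scores output directly by the model, not on probabilities produced by softmax.

The basic idea is that, for each sample, a loss is incurred whenever the correct class's score does not exceed another class's score by at least the margin. This pushes the model to adjust its parameters to widen the gap between the correct-class score and the wrong-class scores, which in turn improves classification accuracy.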
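
The description above can be written out by hand and compared with the built-in. This is a minimal sketch assuming p=1 and margin=1.0; it reuses the example scores and targets from section 3.

```python
import torch
import torch.nn as nn

scores = torch.tensor([[2.5, 1.0, 0.5, 6.0, 3.0],
                       [3.5, 2.5, 1.0, 0.5, 4.0],
                       [1.0, 2.0, 3.0, 4.0, 5.0]])  # raw scores, shape (N, C)
targets = torch.tensor([3, 4, 4])                   # class indices, shape (N,)
margin = 1.0

builtin = nn.MultiMarginLoss(p=1, margin=margin)(scores, targets)

# Manual version: sum over wrong classes of max(0, margin - s_correct + s_i), divided by C
correct = scores.gather(1, targets.unsqueeze(1))        # (N, 1), broadcasts over classes
hinge = torch.clamp(margin - correct + scores, min=0)   # (N, C)
hinge.scatter_(1, targets.unsqueeze(1), 0.0)            # drop the correct-class entry
manual = (hinge.sum(dim=1) / scores.size(1)).mean()

print(torch.allclose(builtin, manual))
```

Under these settings the manual formula is the same computation as the MultiClassRankingLoss class in section 3, since averaging over all N * C entries equals dividing each row sum by C and then averaging over the batch.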
