Output-related
Loss Functions
| Loss | Description |
| --- | --- |
| nn.L1Loss | Creates a criterion that measures the mean absolute error (MAE) between each element in the input x and target y. |
| nn.MSELoss | Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input x and target y. |
| nn.CrossEntropyLoss | This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class. |
| nn.CTCLoss | The Connectionist Temporal Classification loss. |
| nn.NLLLoss | The negative log likelihood loss. |
| nn.PoissonNLLLoss | Negative log likelihood loss with Poisson distribution of target. |
| nn.KLDivLoss | The Kullback-Leibler divergence loss measure. |
| nn.BCELoss | Creates a criterion that measures the Binary Cross Entropy between the target and the output. |
| nn.BCEWithLogitsLoss | This loss combines a Sigmoid layer and the BCELoss in one single class. |
| nn.MarginRankingLoss | Creates a criterion that measures the loss given inputs x1, x2, two 1D mini-batch Tensors, and a label 1D mini-batch tensor y (containing 1 or -1). |
| nn.HingeEmbeddingLoss | Measures the loss given an input tensor x and a labels tensor y (containing 1 or -1). |
| nn.MultiLabelMarginLoss | Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x (a 2D mini-batch Tensor) and output y (which is a 2D Tensor of target class indices). |
| nn.SmoothL1Loss | Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise. |
| nn.SoftMarginLoss | Creates a criterion that optimizes a two-class classification logistic loss between input tensor x and target tensor y (containing 1 or -1). |
| nn.MultiLabelSoftMarginLoss | Creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy, between input x and target y of size (N, C). |
| nn.CosineEmbeddingLoss | Creates a criterion that measures the loss given input tensors x1, x2 and a Tensor label y with values 1 or -1. |
| nn.MultiMarginLoss | Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x (a 2D mini-batch Tensor) and output y (which is a 1D tensor of target class indices, 0 ≤ y ≤ x.size(1) − 1). |
| nn.TripletMarginLoss | Creates a criterion that measures the triplet loss given input tensors x1, x2, x3 and a margin with a value greater than 0. |
| nn.TripletMarginWithDistanceLoss | Creates a criterion that measures the triplet loss given input tensors a, p, and n (representing anchor, positive, and negative examples, respectively), and a nonnegative, real-valued function ("distance function") used to compute the relationship between the anchor and positive example ("positive distance") and the anchor and negative example ("negative distance"). |
Parameters

reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the weighted mean of the output is taken; 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime specifying either of those two args will override reduction. Default: 'mean'.
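A minimal sketch of the three reduction modes, using nn.MSELoss (the choice of loss is arbitrary; any elementwise loss behaves the same way):

```python
import torch
from torch import nn

input = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([0.0, 2.0, 5.0])

# 'none': per-element losses, same shape as the input
print(nn.MSELoss(reduction='none')(input, target))  # tensor([1., 0., 4.])
# 'mean': average over all elements (the default)
print(nn.MSELoss(reduction='mean')(input, target))  # tensor(1.6667)
# 'sum': total over all elements
print(nn.MSELoss(reduction='sum')(input, target))   # tensor(5.)
```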
Differences between BCELoss/BCEWithLogitsLoss and CrossEntropyLoss
BCEWithLogitsLoss = Sigmoid + BCELoss: when the network's last layer applies nn.Sigmoid, use BCELoss; when it does not, use BCEWithLogitsLoss.
BCELoss/BCEWithLogitsLoss are used for single-label binary classification or multi-label binary classification. The output and target have shape (batch, C), where batch is the number of samples and C is the number of classes. Each of a sample's C values is passed through a sigmoid independently, mapping it into (0, 1), so the C values of a sample are mutually independent and do not necessarily sum to 1. Each value is the probability that the sample carries the corresponding label. For single-label binary classification, the output and target can simply have shape (batch, 1).
Also note that torch.nn.BCELoss is a class whose forward is implemented by calling the function F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction).
[torch.nn.modules.loss — PyTorch 2.1 documentation]
[TORCH.NN.FUNCTIONAL.BINARY_CROSS_ENTROPY]
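Both claims above are easy to verify: nn.BCEWithLogitsLoss on raw logits matches an explicit sigmoid followed by nn.BCELoss, and nn.BCELoss matches the functional F.binary_cross_entropy it delegates to. A minimal sketch:

```python
import torch
from torch import nn
import torch.nn.functional as F

logits = torch.randn(4, 3)              # raw scores, no sigmoid applied
target = torch.empty(4, 3).random_(2)   # independent 0/1 labels per class

a = nn.BCEWithLogitsLoss()(logits, target)                 # sigmoid built in
b = nn.BCELoss()(torch.sigmoid(logits), target)            # explicit sigmoid
c = F.binary_cross_entropy(torch.sigmoid(logits), target)  # functional form
print(torch.allclose(a, b), torch.allclose(b, c))  # True True
```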
CrossEntropyLoss is used for multi-class classification. The output (logits) has shape (batch, C), where batch is the number of samples and C is the number of classes; the target is either a (batch,) tensor of class indices or a (batch, C) tensor of class probabilities. The C classes are mutually exclusive and jointly normalized: a softmax is taken over each sample's C values, so they always sum to 1, and the largest value indicates the predicted class. For binary classification this means the output has shape (batch, 2).
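To make the shape contrast concrete, here is a small sketch (the shapes and values are illustrative):

```python
import torch
from torch import nn

logits = torch.randn(2, 3)  # batch = 2, C = 3

# Multi-label: independent classes, float 0/1 targets of shape (batch, C)
multi_label_target = torch.tensor([[1., 0., 1.], [0., 1., 0.]])
print(nn.BCEWithLogitsLoss()(logits, multi_label_target))

# Multi-class: mutually exclusive classes, integer class indices of shape (batch,)
multi_class_target = torch.tensor([2, 0])
print(nn.CrossEntropyLoss()(logits, multi_class_target))
```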
Two output neurons vs. a single output neuron for binary classification
For binary classification, the BERT representation is typically projected down to one dimension and passed through a sigmoid, or down to two dimensions and passed through a softmax. With a two-dimensional output, applying a sigmoid to each dimension independently and applying a softmax give different results: in the first case the outputs of all neurons may sum to more than 1, while in the second case they always sum to exactly 1 (see the sketch after the links below).
[神经网络进行二分类时,输出层使用两个神经元和只使用一个神经元,模型的性能有何差异]
[二分类问题输出一个节点还是两个节点_IT莫莫的博客-CSDN博客]
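A minimal demo of the normalization difference on a two-dimensional output:

```python
import torch

logits = torch.tensor([[2.0, 1.5]])    # one sample, two output neurons

sig = torch.sigmoid(logits)            # each neuron squashed independently
soft = torch.softmax(logits, dim=-1)   # jointly normalized over the two neurons
print(sig.sum())   # ~1.6984: the sum can exceed 1
print(soft.sum())  # exactly 1
```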
Computing the loss at training time:
1. With a single neuron, the output is passed through a sigmoid and the final 0/1 decision is made by thresholding at 0.5; the cross-entropy loss used is F.binary_cross_entropy (a sketch of this case follows the two-neuron snippet below).
2. With two neurons, the output is passed through a softmax, and the cross-entropy loss used is CrossEntropyLoss.
Computing the loss with two neurons:
```python
elif self.config.problem_type == "single_label_classification":
    loss_fct = CrossEntropyLoss()
    loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
```
Note: the view calls here [PyTorch:tensor-基本操作_view] change neither the data nor the shapes: logits stay (batch_size, num_labels) and labels stay (batch_size,).
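For comparison, a hypothetical single-neuron counterpart (not part of the snippet above); nn.BCEWithLogitsLoss is used here so the sigmoid is built into the loss:

```python
import torch
from torch import nn

logits = torch.randn(8, 1)           # (batch_size, 1): one output neuron
labels = torch.empty(8).random_(2)   # (batch_size,): 0/1 labels

loss_fct = nn.BCEWithLogitsLoss()    # sigmoid + binary cross-entropy in one
loss = loss_fct(logits.view(-1), labels.float())  # flatten to (batch_size,)
```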
At evaluation or prediction time
With two neurons there is no need to apply a softmax: since softmax is monotonic, you can compare the two neurons directly and take the argmax.
For example, computing metrics with two neurons:
```python
# You can define your custom compute_metrics function. It takes an `EvalPrediction`
# object (a namedtuple with a predictions and label_ids field) and has to return
# a dictionary mapping string to float.
def compute_metrics(p: EvalPrediction):
    preds = p.predictions[0] if isinstance(p.predictions, tuple) else p.predictions
    preds = np.squeeze(preds) if is_regression else np.argmax(preds, axis=1)
    return metric.compute(predictions=preds, references=p.label_ids)
```
For example, predicting with two neurons:
```python
predictions = trainer.predict(predict_dataset, metric_key_prefix="predict").predictions
predictions = np.squeeze(predictions) if is_regression else np.argmax(predictions, axis=1)
```
Note: with only a single neuron, the output must go through a sigmoid before it can be compared against 0.5 (equivalently, since sigmoid is monotonic with sigmoid(0) = 0.5, you can threshold the raw logit at 0).
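A minimal sketch of single-neuron prediction (variable names are illustrative):

```python
import torch

logits = torch.randn(8, 1)                 # (batch_size, 1) raw outputs

probs = torch.sigmoid(logits).squeeze(-1)  # probabilities in (0, 1)
preds = (probs > 0.5).long()               # threshold at 0.5

# Equivalent shortcut: sigmoid(x) > 0.5 exactly when x > 0
preds_alt = (logits.squeeze(-1) > 0).long()
assert torch.equal(preds, preds_alt)
```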
Examples
nn.CrossEntropyLoss example
```python
import torch
from torch import nn

# Example of target with class indices
loss = nn.CrossEntropyLoss()
input = torch.randn(3, 5, requires_grad=True)
# input.size(): torch.Size([3, 5])
target = torch.empty(3, dtype=torch.long).random_(5)
# target.size(): torch.Size([3])
output = loss(input, target)
output.backward()

# Example of target with class probabilities
input = torch.randn(3, 5, requires_grad=True)
# input.size(): torch.Size([3, 5])
target = torch.randn(3, 5).softmax(dim=1)
# target.size(): torch.Size([3, 5])
output = loss(input, target)
output.backward()
```
nn.BCELoss example
```python
m = nn.Sigmoid()
loss = nn.BCELoss()
input = torch.randn(3, requires_grad=True)
target = torch.empty(3).random_(2)
output = loss(m(input), target)
output.backward()
```
nn.functional.binary_cross_entropy example
```python
import torch
import torch.nn.functional as F

# Typical usage in a training loop (assumes `dataloader` provides the labels and
# `outputs` holds model outputs already passed through a sigmoid):
#   labels = dataloader["label"]
#   predictions = outputs.squeeze().contiguous()
#   loss = F.binary_cross_entropy(predictions, labels, reduction='mean')

# Self-contained example:
input = torch.randn(3, 2, requires_grad=True)
target = torch.rand(3, 2, requires_grad=False)
loss = F.binary_cross_entropy(torch.sigmoid(input), target)
loss.backward()
```
from: -柚子皮-