Commonly used loss criteria include mean squared error, cross-entropy, and absolute error. PyTorch provides the following loss function definitions:
MSELoss | NLLLoss
PoissonNLLLoss | L1Loss
SmoothL1Loss | KLDivLoss
BCELoss | BCEWithLogitsLoss
CrossEntropyLoss | MultiLabelMarginLoss
MultiLabelSoftMarginLoss | MultiMarginLoss
HingeEmbeddingLoss | CosineEmbeddingLoss
TripletMarginLoss | SoftMarginLoss
CTCLoss | NLLLoss2d (deprecated)
01
Mean squared error loss (MSE):
Call form:
torch.nn.MSELoss()
The instantiated loss module is then called as loss(input, target), where the input and target tensors must have the same shape.
Usage example:
# the examples below assume these two imports
import torch
import torch.nn as nn

loss = nn.MSELoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
output = loss(input, target)
output.backward()
02
Negative log-likelihood loss (NLL):
Call form:
torch.nn.NLLLoss()
The instantiated loss module is called as loss(input, target), where the input has size N x C (log-probabilities over C classes) and every target entry is a class index satisfying 0 <= value < C.
Usage example:
m = nn.LogSoftmax(dim=1)
loss = nn.NLLLoss()
# input is of size N x C = 3 x 5
input = torch.randn(3, 5, requires_grad=True)
# each element in target must satisfy 0 <= value < C
target = torch.tensor([1, 0, 4])
output = loss(m(input), target)
output.backward()
Output:
tensor(2.2114, grad_fn=<NllLossBackward>)
A related function, the negative log-likelihood loss for a Poisson-distributed target:
torch.nn.PoissonNLLLoss
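The article gives no example for this one; a minimal usage sketch (not from the original text) follows. By default the input is interpreted as the log of the expected Poisson rate:

import torch
import torch.nn as nn

loss = nn.PoissonNLLLoss()
# log_input=True by default, so the input is the log of the expected rate
log_input = torch.randn(5, 2, requires_grad=True)
target = torch.randn(5, 2)
output = loss(log_input, target)
output.backward()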
03
Mean absolute error loss (MAE):
Call form:
torch.nn.L1Loss
where the input and target tensors must have the same shape.
(Note: torch.nn.NLLLoss2d from the table above is deprecated; use NLLLoss instead.)
Usage example:
loss = nn.L1Loss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
output = loss(input, target)
output.backward()
Output:
tensor(1.0578, grad_fn=<L1LossBackward>)
A similar loss function, the smooth L1 loss:
torch.nn.SmoothL1Loss
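As a sketch (not in the original article), SmoothL1Loss is used exactly like L1Loss above; it behaves like L2 near zero and like L1 for large errors:

import torch
import torch.nn as nn

loss = nn.SmoothL1Loss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
output = loss(input, target)
output.backward()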
04
KL-divergence loss (KLD):
Call form:
torch.nn.KLDivLoss
where the input and target tensors have the same shape; the input is expected to contain log-probabilities and the target probabilities.
Usage example:
loss = nn.KLDivLoss()
# the input should hold log-probabilities, the target probabilities
input = torch.randn(3, 5, requires_grad=True).log_softmax(dim=1)
target = torch.randn(3, 5).softmax(dim=1)
output = loss(input, target)
output.backward()
Output:
tensor(0.1479, grad_fn=<KlDivBackward>)
05
Cross-entropy loss (CE):
Call form:
torch.nn.CrossEntropyLoss
The input has size N x C and the target contains class indices, as with NLLLoss. It is effectively the combination of nn.LogSoftmax and nn.NLLLoss. Verification example:
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
loss1 = nn.CrossEntropyLoss()
output = loss1(input, target)
output.backward()
print(output)
m = nn.LogSoftmax(dim=1)
output1 = m(input)
loss2 = nn.NLLLoss()
output2 = loss2(output1, target)
output2.backward()
print(output2)
Output:
tensor(2.8411, grad_fn=<NllLossBackward>)
tensor(2.8411, grad_fn=<NllLossBackward>)
Other cross-entropy loss functions:
Binary cross-entropy:
torch.nn.BCELoss
Binary cross-entropy with logits:
torch.nn.BCEWithLogitsLoss
This one is effectively the combination of Sigmoid and BCELoss, which can be verified along the lines of the example above; a sketch follows.
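A small verification sketch (assumed setup, mirroring the CrossEntropyLoss check above): applying Sigmoid followed by BCELoss should give the same value as BCEWithLogitsLoss:

import torch
import torch.nn as nn

input = torch.randn(3, requires_grad=True)
target = torch.empty(3).random_(2)          # binary targets: 0 or 1
loss1 = nn.BCEWithLogitsLoss()
print(loss1(input, target))
loss2 = nn.BCELoss()
print(loss2(torch.sigmoid(input), target))  # prints the same value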
06
Multi-class, multi-label hinge loss (MMH):
Call form:
torch.nn.MultiLabelMarginLoss
The input and target tensors have the same shape; the target contains class indices, padded with -1.
Usage example:
loss = nn.MultiLabelMarginLoss()
x = torch.FloatTensor([[0.1, 0.2, 0.4, 0.8]])
# the target classes are 3 and 0; -1 pads the target to the same size as the input (the 1 after -1 is ignored)
y = torch.LongTensor([[3, 0, -1, 1]])
print(loss(x, y))
Output:
tensor(0.8500)
Other similar functions:
Multi-label one-versus-all max-entropy loss:
torch.nn.MultiLabelSoftMarginLoss
Multi-class hinge loss (see the sketch below):
torch.nn.MultiMarginLoss
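A sketch of MultiMarginLoss (not from the original article), reusing the scores from the MultiLabelMarginLoss example but with a single target class per sample:

import torch
import torch.nn as nn

loss = nn.MultiMarginLoss()
x = torch.FloatTensor([[0.1, 0.2, 0.4, 0.8]])
y = torch.LongTensor([3])   # one class index per sample
# with margin=1: ((1-0.8+0.1) + (1-0.8+0.2) + (1-0.8+0.4)) / 4 = 0.325
print(loss(x, y))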
07
Hinge embedding loss:
Call form:
torch.nn.HingeEmbeddingLoss
The input and target tensors have the same shape. It measures whether two inputs are similar and is mainly used for nonlinear embedding learning and semi-supervised learning; the input is typically a pairwise L1 distance.
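The article gives no example here; a minimal sketch, assuming the inputs are precomputed pairwise distances:

import torch
import torch.nn as nn

loss = nn.HingeEmbeddingLoss(margin=1.0)
# x holds pairwise L1 distances between two embeddings
x = torch.tensor([0.3, 1.5])
y = torch.tensor([1, -1])   # 1: similar pair, -1: dissimilar pair
# the similar pair contributes its distance (0.3); the dissimilar pair contributes max(0, margin - 1.5) = 0
print(loss(x, y))           # mean = 0.15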
A similar function, the cosine embedding loss:
torch.nn.CosineEmbeddingLoss
It serves the same purpose as HingeEmbeddingLoss but uses the cosine distance as its metric.
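A minimal sketch (assumed shapes, not from the original article): the two embeddings are passed directly and the loss compares their cosine similarity against the label:

import torch
import torch.nn as nn

loss = nn.CosineEmbeddingLoss()
x1 = torch.randn(3, 128, requires_grad=True)
x2 = torch.randn(3, 128, requires_grad=True)
y = torch.tensor([1, -1, 1])   # 1: pair should be similar, -1: dissimilar
output = loss(x1, x2, y)
output.backward()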
08
Margin ranking loss, between two 1D input tensors x1, x2 and a 1D label tensor y with values in {1, -1}:
Call form:
torch.nn.MarginRankingLoss
The two inputs and the label tensor have the same shape.
Usage example:
x1 = torch.tensor([1.0, 1.0])
x2 = torch.tensor([-1.0, 1.0])
y = torch.tensor([1, -1])
loss = torch.nn.MarginRankingLoss()
print(loss(x1, x2, y))
A similar function, the triplet margin loss (torch.nn.TripletMarginLoss):
Paper:
http://www.bmva.org/bmvc/2016/papers/paper119/index.html
Example:
triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)
anchor = torch.randn(100, 128, requires_grad=True)
positive = torch.randn(100, 128, requires_grad=True)
negative = torch.randn(100, 128, requires_grad=True)
output = triplet_loss(anchor, positive, negative)
output.backward()
print(output)
Output:
tensor(1.0572, grad_fn=<MeanBackward0>)
09
Two-class logistic loss:
Call form:
torch.nn.SoftMarginLoss
The input and target tensors have the same shape, and the target values should be 1 or -1. It computes the two-class logistic loss between input and target.
Usage example:
loss = nn.SoftMarginLoss()
input = torch.randn(3, 5, requires_grad=True)
# targets must be 1 or -1
target = torch.randint(0, 2, (3, 5)).float() * 2 - 1
output = loss(input, target)
output.backward()
10
Connectionist temporal classification loss (CTC):
It computes the loss between a continuous (unsegmented) time series and a target sequence.
Call form:
torch.nn.CTCLoss
For details, see the paper:
https://www.cs.toronto.edu/~graves/icml_2006.pdf
Usage example:
# Targets are to be padded
T = 50      # Input sequence length
C = 20      # Number of classes (including blank)
N = 16      # Batch size
S = 30      # Target sequence length of longest target in batch (padding length)
S_min = 10  # Minimum target length, for demonstration purposes

# Initialize random batch of input vectors, for *size = (T, N, C)
input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()

# Initialize random batch of targets (0 = blank, 1:C = classes)
target = torch.randint(low=1, high=C, size=(N, S), dtype=torch.long)

input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long)
target_lengths = torch.randint(low=S_min, high=S, size=(N,), dtype=torch.long)
ctc_loss = nn.CTCLoss()
loss = ctc_loss(input, target, input_lengths, target_lengths)
loss.backward()

# Targets are to be un-padded
T = 50      # Input sequence length
C = 20      # Number of classes (including blank)
N = 16      # Batch size

# Initialize random batch of input vectors, for *size = (T, N, C)
input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()
input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long)

# Initialize random batch of targets (0 = blank, 1:C = classes)
target_lengths = torch.randint(low=1, high=T, size=(N,), dtype=torch.long)
target = torch.randint(low=1, high=C, size=(sum(target_lengths),), dtype=torch.long)
ctc_loss = nn.CTCLoss()
loss = ctc_loss(input, target, input_lengths, target_lengths)
loss.backward()
The examples above are mostly taken from the PyTorch documentation.