The CrossEntropyLoss function, which combines the nn.LogSoftmax and nn.NLLLoss functions

Latest recommended article published on 2023-05-24 19:53:20

一只tobey

Category: pytorch

Article link: https://blog.csdn.net/zz2230633069/article/details/90489098


For a single sample: label is an integer (at least 0 and at most C-1) that says which class x belongs to, and y is the one-hot encoding of label.

Example: x is one sample from a 4-class problem: x=[1, 33.1, 77.02, 3.78], label=2, y=[0,0,1,0], so here C=4

$loss(x,y)=-\sum_{j=0}^{C-1}y_{j}\log\hat{y}_{j}=-\sum_{j=0}^{C-1}y_{j}\log\,(softmax(x)_{j})=-\log\,(softmax(x)[label])=-\log\frac{\exp(x[label])}{\sum_{j=0}^{C-1}\exp(x[j])}$
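The single-sample formula can be checked numerically. The following is a minimal sketch, assuming PyTorch is installed; it reuses the example values x and label=2 from the text, and the variable names are only illustrative. It compares the one-hot form, the index form, and the built-in `F.cross_entropy`.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1.0, 33.1, 77.02, 3.78])  # raw scores for C = 4 classes
label = 2
y = torch.tensor([0.0, 0.0, 1.0, 0.0])      # one-hot encoding of label

log_probs = F.log_softmax(x, dim=0)

# One-hot form: -sum_j y_j * log(softmax(x)_j)
loss_one_hot = -(y * log_probs).sum()

# Index form: -log(softmax(x)[label])
loss_index = -log_probs[label]

# Built-in function; a batch dimension of size 1 is added for it
loss_builtin = F.cross_entropy(x.unsqueeze(0), torch.tensor([label]))

print(loss_one_hot.item(), loss_index.item(), loss_builtin.item())
```

Because x[2] is far larger than the other scores, softmax(x)[2] is almost exactly 1 and all three values come out very close to 0.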

For multiple samples:

Example: X=[ [1, 2.22, 3.35, 4], [5, 6, 7.2, 8] ], target=[1,0], Y=[ [0, 1, 0, 0], [ 1, 0, 0, 0] ], so here C=4 and N=2

$loss(X,Y)=-\sum_{i=0}^{N-1}\sum_{j=0}^{C-1}y^{(i)}_{j}\log\hat{y}^{(i)}_{j}=-\sum_{i=0}^{N-1}\sum_{j=0}^{C-1}y^{(i)}_{j}\log\,(softmax(x^{(i)})_{j})=-\sum_{i=0}^{N-1}\log\,(softmax(x^{(i)})[target[i]])=-\sum_{i=0}^{N-1}\log\frac{\exp(x^{(i)}[target[i]])}{\sum_{j=0}^{C-1}\exp(x^{(i)}[j])}$

This is the summed form, which corresponds to reduction='sum'; with the default reduction='mean' the sum is additionally divided by N.
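The batch formula can be checked the same way. This is a small sketch, assuming PyTorch; it uses the X and target values from the example, and the names `per_sample`, `loss_sum`, and `loss_mean` are illustrative.

```python
import torch
import torch.nn.functional as F

X = torch.tensor([[1.0, 2.22, 3.35, 4.0],
                  [5.0, 6.0, 7.2, 8.0]])   # N = 2 samples, C = 4 classes
target = torch.tensor([1, 0])

# Per-sample losses: -log(softmax(x_i)[target[i]])
log_probs = F.log_softmax(X, dim=1)
per_sample = -log_probs[torch.arange(X.shape[0]), target]

# The double sum in the formula corresponds to reduction='sum'
loss_sum = F.cross_entropy(X, target, reduction='sum')
# The default reduction='mean' divides that sum by N
loss_mean = F.cross_entropy(X, target, reduction='mean')

print(per_sample, loss_sum, loss_mean)
```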

class CrossEntropyLoss(_WeightedLoss):
    __constants__ = ['weight', 'ignore_index', 'reduction']

    def __init__(self, weight=None, size_average=None, ignore_index=-100,
                 reduce=None, reduction='mean'):
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    @weak_script_method
    def forward(self, input, target):
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)

def cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100,
                  reduce=None, reduction='mean'):
    if size_average is not None or reduce is not None:
        reduction = _Reduction.legacy_get_string(size_average, reduce)

    # nll_loss below is the function you can call as F.nll_loss(), and log_softmax is the function you can call as F.log_softmax()
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)

This criterion combines :func:`nn.LogSoftmax` and :func:`nn.NLLLoss` in one single class.
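The equivalence stated in the docstring can be verified directly. A minimal sketch, assuming PyTorch; the random logits and the target values are arbitrary illustration data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(3, 5)           # raw scores: 3 samples, 5 classes
target = torch.tensor([1, 0, 4])     # class index for each sample

# One step: CrossEntropyLoss on raw scores
combined = nn.CrossEntropyLoss()(logits, target)

# Two steps: LogSoftmax over the class dimension, then NLLLoss
two_step = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)

print(torch.allclose(combined, two_step))  # True
```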

It is well suited to classification problems with C classes.

The optional argument weight is a 1-D tensor of length C that assigns a weight to each class. It is useful when the training set is imbalanced.
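How weight enters the result with reduction='mean' can be shown with a short sketch, assuming PyTorch. The weight values here are hypothetical. Each sample's loss is scaled by the weight of its target class, and the mean divides by the sum of those weights rather than by the number of samples.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)                 # 4 samples, C = 3 classes
target = torch.tensor([0, 2, 2, 1])
weight = torch.tensor([1.0, 2.0, 0.5])     # hypothetical per-class weights, length C

# Unweighted per-sample losses
per_sample = F.cross_entropy(logits, target, reduction='none')

# Weighted mean: sum(w_i * l_i) / sum(w_i), with w_i = weight[target[i]]
w = weight[target]
manual = (w * per_sample).sum() / w.sum()

builtin = F.cross_entropy(logits, target, weight=weight)
print(manual, builtin)
```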

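The argument input is a tensor of raw, unnormalized scores for each class. For example, the output of a fully connected layer is raw and unnormalized; applying softmax normalizes it, because the values of each sample then sum to 1 across the classes. Its shape is (batch, C) or (batch, C, d1, d2, ..., dk) with k>=1.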

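If the argument ignore_index is specified, the criterion also accepts this value as a target (it does not have to be a valid class index). Targets equal to ignore_index are skipped: they contribute nothing to the loss or to the gradient.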
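A small sketch of the effect of ignore_index, assuming PyTorch and the default value -100. With reduction='mean', the result should match the loss computed on only the samples whose target is not ignored.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)
target = torch.tensor([0, -100, 2, -100])   # -100 is the default ignore_index

# Samples whose target equals ignore_index are skipped
loss_all = F.cross_entropy(logits, target, ignore_index=-100)

# Same result when those samples are removed by hand
keep = target != -100
loss_kept = F.cross_entropy(logits[keep], target[keep])

print(loss_all, loss_kept)
```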

The arguments size_average and reduce are deprecated and replaced by the argument reduction. Its default 'mean' takes the average of the loss; the other options are 'none' and 'sum'.

The argument target has shape (batch) or (batch, d1, d2, ..., dk), and each of its values is a class index.

Output: if reduction='none', the output has the same shape as target; otherwise it is a scalar.
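The shapes described above can be inspected with a short sketch, assuming PyTorch; the sizes are arbitrary.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 4, 3, 3)             # (batch, C, d1, d2)
target = torch.randint(0, 4, (2, 3, 3))      # (batch, d1, d2), values in [0, C)

loss_none = F.cross_entropy(logits, target, reduction='none')
loss_sum = F.cross_entropy(logits, target, reduction='sum')
loss_mean = F.cross_entropy(logits, target, reduction='mean')

print(loss_none.shape)  # torch.Size([2, 3, 3]), same as target
print(loss_sum.shape)   # torch.Size([]), a scalar
print(loss_mean.shape)  # torch.Size([]), a scalar
```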

Examples::

    >>> loss = nn.CrossEntropyLoss()
    >>> input = torch.randn(3, 5, requires_grad=True)
    >>> target = torch.empty(3, dtype=torch.long).random_(5)
    >>> output = loss(input, target)
    >>> output.backward()

The loss for a single sample is computed as follows:

$loss(x,class)=-\log\frac{\exp(x[class])}{\sum_{j=0}^{C-1}\exp(x[j])}=-x[class]+\log(\sum_{j=0}^{C-1}\exp(x[j]))=-\sum_{j=0}^{C-1}y_{j}\log\hat{y}_{j}=-\sum_{j=0}^{C-1}y_{j}\log\,(softmax(x)_{j})=-\log\,(softmax(x)[class])$
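The second form in the chain, -x[class] plus the log of the summed exponentials, maps onto `torch.logsumexp`. A minimal sketch, assuming PyTorch; the sample values are illustrative and `cls` is a hypothetical name for the class index.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([2.5, -2.0, 0.8989])   # raw scores for C = 3 classes
cls = 0                                 # class index of this sample

# -x[class] + log(sum_j exp(x[j]))
loss_lse = -x[cls] + torch.logsumexp(x, dim=0)

# Reference value from the built-in function (batch dimension of size 1 added)
loss_ref = F.cross_entropy(x.unsqueeze(0), torch.tensor([cls]))

print(loss_lse.item(), loss_ref.item())  # both approximately 0.193
```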

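The vector y above is the one-hot encoding of the sample's label. With 3 classes, a sample with label=2 has y=001; y=100 means class 0, y=010 means class 1, and y=001 means class 2, so the label is always in the range 0-2. Every position of y is 0 except the position of the class, which is 1, i.e. y[class]=1. For a batch passed to this function as input and target, an example is: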

import torch
import torch.nn as nn

label = torch.tensor([0, 0, 1])
fc_out = torch.tensor([
    [2.5, -2, 0.8989],
    [3, 0.8, -842],
    [0.00000000000001, 2, 4.9]
])

# Loss computed by the built-in criterion
loss = nn.CrossEntropyLoss()
l = loss(fc_out, label)
print(l)  # tensor(1.0862)

# Re-implement the criterion by hand
def CrossEntropyloss(input, label, reduction='mean'):
    # First normalize the raw scores, i.e. apply softmax over the class dimension
    exp_input = torch.exp(input)
    sum_input = torch.sum(exp_input, 1)
    sum_input = sum_input.reshape(-1, 1)
    softmax_input = exp_input / sum_input
    # Then compute the cross-entropy loss
    idx = torch.arange(input.shape[0])  # row indices of the input
    # Add a small constant 1e-7 so that a softmax value of 0 does not break the log;
    # the result is the negative log-softmax
    neg_log_softmax_input = -torch.log(softmax_input + 1e-7)
    # Up to the sign (and the 1e-7), getting from input to neg_log_softmax_input
    # is what nn.LogSoftmax(dim=1)(input) does.
    # The indexing and reduction below are what nn.NLLLoss does.
    # [row, column] indexing picks, for each sample, the value at its class;
    # that value is the loss of the sample
    per_sample_loss = neg_log_softmax_input[idx, label]
    # The branches only illustrate the reduction argument; with the default 'mean'
    # the function is simply: return per_sample_loss.mean()
    if reduction == 'mean':
        loss = per_sample_loss.mean()  # average of the per-sample losses
    elif reduction == 'sum':
        loss = per_sample_loss.sum()  # total of the per-sample losses
    elif reduction == 'none':
        loss = per_sample_loss  # one loss per sample, no reduction
    else:
        raise ValueError('reduction must be one of {"mean","sum","none"}')
    return loss

loss = CrossEntropyloss(fc_out, label)
print(loss)  # tensor(1.0862)
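One caveat about the hand-written version: it exponentiates the raw scores directly, so very large scores overflow. The sketch below, assuming PyTorch with float32 tensors and made-up scores around 1000, contrasts that naive computation with the built-in function, which works in log space.

```python
import torch
import torch.nn.functional as F

big = torch.tensor([[1000.0, 1001.0, 999.0]])  # one sample with very large raw scores
target = torch.tensor([1])

# Naive softmax: exp(1000) overflows to inf in float32, and inf / inf is nan
exp_big = torch.exp(big)
naive = -torch.log(exp_big / exp_big.sum(dim=1, keepdim=True))[0, 1]

# The built-in function stays finite for the same input
stable = F.cross_entropy(big, target)

print(torch.isnan(naive).item())  # True
print(stable.item())              # approximately 0.408
```

For this reason it is usually safer to rely on F.log_softmax or torch.logsumexp than to build softmax from torch.exp by hand.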

The loss function torch.nn.NLLLoss

class NLLLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'weight', 'reduction']

    def __init__(self, weight=None, size_average=None, ignore_index=-100,
                 reduce=None, reduction='mean'):
        super(NLLLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    @weak_script_method
    def forward(self, input, target):
        return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)

This is the negative log-likelihood loss. Like the cross-entropy loss above, it is suited to classification problems with C classes.

The argument weight is a 1-D tensor of length C that assigns a weight to each class; if it is not given, all weights default to 1.

The arguments size_average, ignore_index, reduce, reduction, and target have the same definition and behavior as for the cross-entropy loss above.

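The argument input must contain log-probabilities, i.e. the log of the probability of each class, with shape (batch, C) or (batch, C, d1, d2, ..., dk). Log-probabilities are easy to obtain: add one more layer at the end of the network so that input=nn.LogSoftmax(dim=1)(last_output). If you do not want to add an extra layer, use nn.CrossEntropyLoss() instead.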
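The two options can be put side by side in a small sketch, assuming PyTorch. The layer `backbone` and the tensor `features` are hypothetical stand-ins for a real network and its input.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
backbone = nn.Linear(8, 3)            # hypothetical last layer producing raw scores, C = 3
features = torch.randn(4, 8)          # hypothetical batch of 4 feature vectors
target = torch.tensor([0, 2, 1, 1])

# Option A: append a LogSoftmax layer and use NLLLoss
model_a = nn.Sequential(backbone, nn.LogSoftmax(dim=1))
log_probs = model_a(features)
loss_a = nn.NLLLoss()(log_probs, target)

# Option B: keep the raw scores and use CrossEntropyLoss
loss_b = nn.CrossEntropyLoss()(backbone(features), target)

print(loss_a.item(), loss_b.item())   # the two values agree
print(log_probs.exp().sum(dim=1))     # each row of probabilities sums to 1
```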

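This loss can be written as follows, where x is the input of log-probabilities and w is the weight vector (all ones if weight is not given):

$l_{i}=-w_{target[i]}\,x^{(i)}[target[i]],\quad i=0,\dots,N-1$

With reduction='none' the output is the vector $(l_{0},\dots,l_{N-1})$; with 'sum' it is $\sum_{i=0}^{N-1}l_{i}$; with 'mean' it is $\sum_{i=0}^{N-1}l_{i}\,/\,\sum_{i=0}^{N-1}w_{target[i]}$.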

For higher-dimensional input, such as images, it computes the NLL loss of every pixel.
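The per-pixel behavior can be reproduced by hand with `gather` along the class dimension. This is a sketch under the assumption of a 4-D input of shape (N, C, H, W) and a target of shape (N, H, W); the sizes and random values are arbitrary.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 4, 3, 3
log_probs = F.log_softmax(torch.randn(N, C, H, W), dim=1)  # log-probabilities per pixel
target = torch.randint(0, C, (N, H, W))                    # class index per pixel

# For every pixel, pick the log-probability of its target class and negate it
per_pixel = -log_probs.gather(1, target.unsqueeze(1)).squeeze(1)  # shape (N, H, W)

builtin_none = F.nll_loss(log_probs, target, reduction='none')
builtin_mean = F.nll_loss(log_probs, target)  # default 'mean' averages over all pixels

print(per_pixel.shape, builtin_mean)
```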

    Examples::

        >>> m = nn.LogSoftmax(dim=1)
        >>> loss = nn.NLLLoss()
        >>> # input is of size N x C = 3 x 5
        >>> input = torch.randn(3, 5, requires_grad=True)
        >>> # each element in target has to have 0 <= value < C
        >>> target = torch.tensor([1, 0, 4])
        >>> output = loss(m(input), target)
        >>> output.backward()
        >>>
        >>>
        >>> # 2D loss example (used, for example, with image inputs)
        >>> N, C = 5, 4  # C = 4 means a 4-class problem
        >>> loss = nn.NLLLoss()
        >>> # input is of size N x Channel x height x width
        >>> data = torch.randn(N, 16, 10, 10)
        >>> conv = nn.Conv2d(16, C, (3, 3))
        >>> m = nn.LogSoftmax(dim=1)
        >>> # each element in target has to have 0 <= value < C
        >>> target = torch.empty(N, 8, 8, dtype=torch.long).random_(0, C)
        >>> output = loss(m(conv(data)), target)
        >>> # The line above is equivalent to output = nn.CrossEntropyLoss()(conv(data), target).
        >>> # Here input has shape (batch, Classes, H_out, W_out) and target has shape (batch, H_out, W_out),
        >>> # so each of the H_out x W_out output pixels is classified.
        >>> output.backward()