PyTorch Basics: Loss Functions

A loss function can be written formally as $L_{loss} = \sum_{i=1}^{N_b} \mathrm{criterion}(y^{*}, y)$, where $y^{*} \in \mathbb{R}^{B \times C}$ is the model's predicted output, $y \in \mathbb{R}^{B \times C}$ is the ground-truth label, $B$ is the batch size, and $C$ is the dimension of the fully connected output layer; when the loss is computed one sample at a time, $B = 1$. Depending on the nature of the problem, loss functions fall roughly into two categories: classification losses and regression losses.
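In PyTorch, summing versus averaging over the batch is controlled by the reduction argument that the built-in losses accept; a quick illustration with arbitrary tensors:

import torch
import torch.nn as nn

y_ = torch.randn(4, 3)  # arbitrary predictions
y = torch.randn(4, 3)   # arbitrary targets

# 'none' keeps the per-element terms; 'sum' and 'mean' aggregate them
per_elem = nn.MSELoss(reduction='none')(y_, y)
total = nn.MSELoss(reduction='sum')(y_, y)
average = nn.MSELoss(reduction='mean')(y_, y)  # the default

print(per_elem.shape)                            # torch.Size([4, 3])
print(torch.isclose(total, per_elem.sum()))      # tensor(True)
print(torch.isclose(average, per_elem.mean()))   # tensor(True)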

1. Classification losses
1.1 Binary classification: the BCELoss family

$L_{loss}(y', y) = -\sum_{i=1}^{N_b}\left[\, y_i \log(y'_i) + (1 - y_i)\log(1 - y'_i) \,\right]$

import torch
import torch.nn as nn

def tensor_info(tensor):
    print('tensor type: {}'.format(tensor.type()))
    print('tensor value: {}'.format(tensor.data))
    print('tensor shape: {}'.format(tensor.shape))

criterion = nn.BCELoss()
batchsize = 2
num_class = 2
y_ = torch.randn(batchsize, num_class)                    # raw model outputs
y = torch.empty(batchsize, num_class).random_(num_class)  # random 0/1 float targets
loss = criterion(nn.Sigmoid()(y_), y)                     # BCELoss expects probabilities in [0, 1]

tensor_info(y_)
tensor_info(y)
tensor_info(loss)
"""
tensor type: torch.FloatTensor
tensor value: tensor([[-0.0734,  1.1474],
        			  [-0.1513, -0.3409]])
        			  
tensor shape: torch.Size([2, 2])
tensor type: torch.FloatTensor
tensor value: tensor([[0., 1.],
        			  [0., 0.]])
        			  
tensor shape: torch.Size([2, 2])
tensor type: torch.FloatTensor
tensor value: 0.5225892663002014
tensor shape: torch.Size([])
"""

note:

  1. BCELoss is used for binary classification; both input and target must be torch.FloatTensor, and the model output must go through a sigmoid first so that every value lies in $[0, 1]$;

  2. For binary problems, BCELoss is often reported to train more stably than CrossEntropyLoss;

  3. When the two classes are imbalanced, consider BCEWithLogitsLoss, which takes raw logits as input (the sigmoid is fused into the loss for numerical stability) and accepts a weight argument:

    w_0 = 1
    w_1 = 5
    class_weights = torch.FloatTensor([w_0, w_1])  # a list, not an index; Variable() is no longer needed
    criterion = nn.BCEWithLogitsLoss(weight=class_weights)
    ...
    loss = criterion(y_, y)  # y_ are raw logits; no explicit Sigmoid
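For class imbalance specifically, newer PyTorch versions also expose a pos_weight argument on nn.BCEWithLogitsLoss that rescales only the positive-class term; a minimal sketch (the 5:1 ratio below is an arbitrary assumption):

# pos_weight holds one value per output unit and multiplies the positive term;
# here we assume positives are roughly 5x rarer than negatives
pos_weight = torch.full((2,), 5.0)
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

y_ = torch.randn(2, 2)            # raw logits
y = torch.empty(2, 2).random_(2)  # 0/1 float targets
loss = criterion(y_, y)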
    
1.2 Multi-class classification: the CrossEntropyLoss family

$L_{loss}(y', y) = -\log\left(\dfrac{\exp(y'_{[y]})}{\sum_{j}\exp(y'_{[j]})}\right) = -y'_{[y]} + \log\left(\sum_{j}\exp(y'_{[j]})\right)$

criterion = nn.CrossEntropyLoss()
batchsize = 2
num_class = 3
y_ = torch.randn(batchsize, num_class)                           # raw logits
y = torch.empty(batchsize, dtype=torch.long).random_(num_class)  # class indices
loss = criterion(y_, y)  # CrossEntropyLoss applies log-softmax internally; do not softmax first

tensor_info(y_)
tensor_info(y)
tensor_info(loss)
"""
tensor type: torch.FloatTensor
tensor value: tensor([[ 0.9964,  0.7243, -1.0832],
        			  [ 1.2502,  0.9600, -0.1909]])
tensor shape: torch.Size([2, 3])

tensor type: torch.LongTensor
tensor value: tensor([2, 0])
tensor shape: torch.Size([2])

tensor type: torch.FloatTensor
tensor value: 1.1623895168304443
tensor shape: torch.Size([])
"""

note:

  1. CrossEntropyLoss works for binary as well as multi-class problems; the target tensor must be torch.LongTensor with shape $y \in \mathbb{R}^{B}$, holding class indices rather than one-hot vectors (the loss indexes the correct class internally), and the model output should be raw logits, since the log-softmax is applied inside the loss;

  2. Because BCELoss tends to train more stably than CrossEntropyLoss, the former is usually preferred for binary classification, while for multi-class problems only the latter applies;

  3. When the multi-class data are imbalanced, consider the negative log-likelihood loss nn.NLLLoss, whose weight argument assigns per-class weights; CrossEntropyLoss is equivalent to LogSoftmax followed by NLLLoss, as the sketch after this snippet shows:

    criterion = nn.NLLLoss()  # pass weight=... for per-class weighting
    ...
    loss = criterion(nn.LogSoftmax(dim=1)(y_), y)
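A quick sketch of that equivalence (arbitrary logits; the two results agree up to floating-point error):

y_ = torch.randn(2, 3)    # arbitrary logits
y = torch.tensor([2, 0])  # class indices

ce = nn.CrossEntropyLoss()(y_, y)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(y_), y)
print(torch.isclose(ce, nll))  # tensor(True)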
    
2. Regression losses
2.1 L1 loss (MAE)

$L_{loss}(y', y) = |y - y'|$

criterion = nn.L1Loss()
batchsize = 2
data_dim = 5
y_ = torch.randn(batchsize, data_dim)
y = torch.randn(batchsize, data_dim)
loss = criterion(y_, y)

tensor_info(y_)
tensor_info(y)
tensor_info(loss)
"""
tensor type: torch.FloatTensor
tensor value: tensor([[-0.8535, -0.3021,  0.2806,  0.6997, -0.3428],
        			  [ 1.0466, -0.7761,  1.5299,  1.8677,  0.3375]])
tensor shape: torch.Size([2, 5])

tensor type: torch.FloatTensor
tensor value: tensor([[ 0.4172,  0.3862,  1.9460,  0.3330, -0.6183],
        			  [ 0.4837, -0.8353,  0.4653, -0.3128,  1.7366]])
tensor shape: torch.Size([2, 5])

tensor type: torch.FloatTensor
tensor value: 0.953281581401825
tensor shape: torch.Size([])
"""

note:

  1. The input and target of L1 loss must have the same shape;

  2. L1 loss is not smooth at zero (and, analogously, L1 regularization tends to produce sparse features); L2 loss is sensitive to outliers, and with gradient descent it can cause exploding gradients;

  3. nn.SmoothL1Loss offers a compromise between L1 loss and L2 loss (see the check after the snippet below); its expression is: $L_{loss}(y', y) = \begin{cases} 0.5(y' - y)^2 & \text{if } |y' - y| < 1 \\ |y' - y| - 0.5 & \text{otherwise} \end{cases}$

    criterion = nn.SmoothL1Loss()
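A minimal check of the piecewise definition against the built-in loss (arbitrary values; the default switch-over threshold of 1.0 is assumed):

y_ = torch.tensor([0.3, 2.0, -1.5])
y = torch.tensor([0.0, 0.0, 0.0])

diff = (y_ - y).abs()
manual = torch.where(diff < 1, 0.5 * diff ** 2, diff - 0.5).mean()
builtin = nn.SmoothL1Loss()(y_, y)
print(torch.isclose(manual, builtin))  # tensor(True)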
    
2.2 L2 loss (MSE)

$L_{loss}(y', y) = (y' - y)^2$

criterion = nn.MSELoss()
batchsize = 2
data_dim = 5
y_ = torch.randn(batchsize, data_dim)
y = torch.randn(batchsize, data_dim)
loss = criterion(y_, y)

tensor_info(y_)
tensor_info(y)
tensor_info(loss)

"""
tensor type: torch.FloatTensor
tensor value: tensor([[-0.9645, -1.3637, -0.3499,  0.1778,  1.4501],
        			  [ 0.0399, -0.7981,  0.2331, -0.8327, -0.1414]])
tensor shape: torch.Size([2, 5])

tensor type: torch.FloatTensor
tensor value: tensor([[ 0.6230,  0.6931,  0.0585, -0.1514, -1.6614],
        			  [-0.8120, -0.3299, -0.0762, -1.5901,  1.2696]])
tensor shape: torch.Size([2, 5])

tensor type: torch.FloatTensor
tensor value: 2.0312931537628174
tensor shape: torch.Size([])
"""
3. One-hot encoding

When we want to add an extra loss term alongside CrossEntropyLoss, the class-index labels need to be one-hot encoded so the new term can be combined with the others; this lays the groundwork for designing custom loss functions.

An efficient and concise one-hot conversion looks like this:

def make_one_hot(label, classes):
    # label: LongTensor of shape [B, H, W] holding class indices
    label = label.unsqueeze(dim=1)  # -> [B, 1, H, W]
    tensor_info(label)
    tensor = torch.zeros(label.size()[0], classes,
                         label.size()[2], label.size()[3]).scatter_(1, label, 1)
    tensor_info(tensor)
    return tensor  # -> [B, C, H, W], one-hot along the channel dimension

class_num = 2
batch_size = 2
label = torch.LongTensor(batch_size, 3, 3).random_() % class_num
tensor = make_one_hot(label, class_num)

"""
tensor type: torch.LongTensor
tensor value: tensor([[[[1, 0, 0],
          				[0, 1, 0],
          				[1, 0, 1]]],

        			[[[0, 0, 0],
          				[0, 0, 0],
          				[0, 1, 1]]]])
tensor shape: torch.Size([2, 1, 3, 3])

tensor type: torch.FloatTensor
tensor value: tensor([[[[0., 1., 1.],
                          [1., 0., 1.],
                          [0., 1., 0.]],
                         [[1., 0., 0.],
                          [0., 1., 0.],
                          [1., 0., 1.]]],

                        [[[1., 1., 1.],
                          [1., 1., 1.],
                          [1., 0., 0.]],
                         [[0., 0., 0.],
                          [0., 0., 0.],
                          [0., 1., 1.]]]])
tensor shape: torch.Size([2, 2, 3, 3])
"""

note:

  1. This pattern is mostly used to one-hot encode segmentation labels: the annotated GroundTruth has shape $y \in \mathbb{R}^{B \times H \times W}$ while the prediction has shape $y' \in \mathbb{R}^{B \times C \times H \times W}$, so $y$ must be one-hot encoded before the two can be compared element-wise; a built-in alternative is shown below.
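For reference, PyTorch 1.1+ also ships a built-in torch.nn.functional.one_hot that produces the same encoding; it places the class dimension last, so a permute is needed to reach the [B, C, H, W] layout used above:

import torch.nn.functional as F

label = torch.randint(0, 2, (2, 3, 3))         # [B, H, W] class indices
one_hot = F.one_hot(label, num_classes=2)      # [B, H, W, C]
one_hot = one_hot.permute(0, 3, 1, 2).float()  # [B, C, H, W]
print(one_hot.shape)                           # torch.Size([2, 2, 3, 3])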
4. Two ways to define a custom loss
4.1 Inheriting from nn.Module
class MyLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        # equivalent to nn.MSELoss with the default 'mean' reduction
        return torch.mean(torch.pow(input - target, 2))

criterion = MyLoss()

batchsize = 2
data_dim = 5
y_ = torch.randn(batchsize,data_dim)
y = torch.randn(batchsize, data_dim)
loss = criterion(y_, y)

tensor_info(y_)
tensor_info(y)
tensor_info(loss)

"""
tensor type: torch.FloatTensor
tensor value: tensor([[-1.0173,  0.4739, -0.7022, -1.2392, -0.9483],
        			  [-0.8169,  1.3850, -0.5899, -0.1689, -0.6612]])
tensor shape: torch.Size([2, 5])

tensor type: torch.FloatTensor
tensor value: tensor([[ 0.6348, -0.9740,  1.2326,  0.5315, -1.0824],
        		      [-0.8435,  0.6862,  0.3101, -0.1409,  0.8937]])
tensor shape: torch.Size([2, 5])

tensor type: torch.FloatTensor
tensor value: 1.543942928314209
tensor shape: torch.Size([])
"""
4.2 Defining the loss as a plain function
def myLoss(input, target):
    return torch.mean(torch.pow(input-target, 2))
...
loss = myLoss(y_, y)
...

note:

  1. A loss that inherits from nn.Module must override the forward method and express the computation with torch operations, which makes the design relatively flexible; a plain function amounts to calling torch operations directly, needs no forward method to maintain, and is used like an ordinary function call;
  2. Backpropagation always goes through loss.backward(); both styles of custom loss support it, since both are built from torch operations that autograd can differentiate, as the check below shows.
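A quick check that gradients flow through such a custom loss (arbitrary data; requires_grad marks the tensor we differentiate with respect to):

y_ = torch.randn(2, 5, requires_grad=True)  # stand-in for a model output
y = torch.randn(2, 5)

loss = myLoss(y_, y)  # the plain-function loss defined above
loss.backward()       # autograd differentiates the torch ops
print(y_.grad.shape)  # torch.Size([2, 5])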
4.3 Two common custom loss functions

FocalLoss
$FL(p_t) = -(1 - p_t)^{\gamma}\log(p_t)$

class FocalLoss(nn.Module):
    def __init__(self, gamma=2, alpha=None, ignore_index=255, size_average=True):
        super(FocalLoss, self).__init__()
        self.gamma = gamma
        self.size_average = size_average
        # reduction='none' keeps the per-sample CE terms so they can be re-weighted
        self.CE_loss = nn.CrossEntropyLoss(reduction='none',
                                           ignore_index=ignore_index, weight=alpha)

    def forward(self, output, target):
        logpt = self.CE_loss(output, target)      # -log(p_t) per sample
        pt = torch.exp(-logpt)                    # recover p_t
        loss = ((1 - pt) ** self.gamma) * logpt   # down-weight easy examples
        if self.size_average:
            return loss.mean()
        return loss.sum()

criterion = FocalLoss()

batchsize = 2
data_dim = 5
y_ = torch.randn(batchsize, data_dim)                           # raw logits
y = torch.empty(batchsize, dtype=torch.long).random_(data_dim)  # class indices
loss = criterion(y_, y)  # the internal CrossEntropyLoss applies softmax itself

tensor_info(y_)
tensor_info(y)
tensor_info(loss)

"""
tensor type: torch.FloatTensor
tensor value: tensor([[ 0.1728,  1.1785,  0.2764, -0.3511,  0.4180],
        			  [ 0.3613,  0.7521,  1.2390,  2.0650, -0.6268]])
tensor shape: torch.Size([2, 5])

tensor type: torch.LongTensor
tensor value: tensor([2, 2])
tensor shape: torch.Size([2])

tensor type: torch.FloatTensor
tensor value: 1.0486319065093994
tensor shape: torch.Size([])
"""

DICE Loss
$L_{loss}(y', y) = 1 - 2 \times \dfrac{|\, y' \cap y \,|}{|y'| + |y|}$

import torch.nn.functional as F

class DiceLoss(nn.Module):
    def __init__(self, smooth=1., ignore_index=255):
        super(DiceLoss, self).__init__()
        self.ignore_index = ignore_index
        self.smooth = smooth

    def forward(self, output, target):
        # remap ignored labels to a valid class before one-hot encoding
        if self.ignore_index not in range(target.min().item(), target.max().item()):
            if (target == self.ignore_index).sum() > 0:
                target[target == self.ignore_index] = target.min()
        target = make_one_hot(target, classes=output.size()[1])  # [B, C, H, W]
        output = F.softmax(output, dim=1)
        output_flat = output.contiguous().view(-1)
        target_flat = target.contiguous().view(-1)
        intersection = (output_flat * target_flat).sum()
        # smooth avoids division by zero when both sets are empty
        loss = 1 - ((2. * intersection + self.smooth) /
                    (output_flat.sum() + target_flat.sum() + self.smooth))
        return loss

criterion = DiceLoss()
batchsize = 2
data_dim = 5
y_ = torch.randn(batchsize, data_dim, 3, 3)                           # [B, C, H, W] logits
y = torch.empty(batchsize, 3, 3, dtype=torch.long).random_(data_dim)  # [B, H, W] labels
loss = criterion(y_, y)

tensor_info(y_)
tensor_info(y)
tensor_info(loss)

"""
tensor type: torch.LongTensor
tensor value: tensor([[[[0, 3, 1],
          [0, 2, 2],
          [1, 0, 1]]],
        [[[2, 1, 2],
          [3, 3, 2],
          [1, 3, 4]]]])
tensor shape: torch.Size([2, 1, 3, 3])

tensor type: torch.FloatTensor
tensor value: tensor([[[[1., 0., 0.],
                      [1., 0., 0.],
                      [0., 1., 0.]],
                     [[0., 0., 1.],
                      [0., 0., 0.],
                      [1., 0., 1.]],
                     [[0., 0., 0.],
                      [0., 1., 1.],
                      [0., 0., 0.]],
                     [[0., 1., 0.],
                      [0., 0., 0.],
                      [0., 0., 0.]],
                     [[0., 0., 0.],
                      [0., 0., 0.],
                      [0., 0., 0.]]],
                    [[[0., 0., 0.],
                      [0., 0., 0.],
                      [0., 0., 0.]],
                     [[0., 1., 0.],
                      [0., 0., 0.],
                      [1., 0., 0.]],
                     [[1., 0., 1.],
                      [0., 0., 1.],
                      [0., 0., 0.]],
                     [[0., 0., 0.],
                      [1., 1., 0.],
                      [0., 1., 0.]],
                     [[0., 0., 0.],
                      [0., 0., 0.],
                      [0., 0., 1.]]]])
tensor shape: torch.Size([2, 5, 3, 3])

tensor type: torch.FloatTensor
tensor value: tensor([[[[ 0.2699,  2.0570,  0.3527],
                      [ 0.1577, -0.4064,  0.1343],
                      [ 1.5966,  1.7491,  1.0151]],

                     [[-0.8926,  0.1622,  1.9066],
                      [ 0.5218,  0.4823, -1.1344],
                      [-1.0118, -0.8615, -2.1888]],

                     [[-0.3432, -0.3939,  0.1995],
                      [-0.1927,  0.1906, -0.9791],
                      [-0.7473, -1.4993,  0.3817]],

                     [[ 1.9844, -0.3772,  0.0379],
                      [-0.3522,  0.3117,  3.4582],
                      [ 0.1093, -1.1035,  1.7196]],

                     [[-0.3047, -0.0412,  0.4407],
                      [ 0.1961,  0.7687,  0.2264],
                      [-0.7968, -3.2159,  1.1114]]],


                    [[[ 0.2529, -0.2005,  1.4892],
                      [-0.6280, -0.5346, -0.8372],
                      [ 2.1497, -0.9360,  0.4647]],

                     [[ 0.1600, -0.4615, -0.0581],
                      [-0.8772, -2.2099, -0.4701],
                      [-0.0854, -0.6858,  1.1420]],

                     [[-0.5037, -1.4045,  0.3457],
                      [ 0.4000,  0.8670,  0.2310],
                      [ 0.1687,  2.2899,  1.3715]],

                     [[ 0.6839,  0.0109, -1.9138],
                      [-0.9788, -0.9355,  0.8609],
                      [ 1.4093, -0.5079,  0.1082]],

                     [[ 0.8306, -0.9631, -0.8329],
                      [-0.0351, -1.1003,  0.2656],
                      [-1.8068, -0.5764, -1.0488]]]])
tensor shape: torch.Size([2, 5, 3, 3])

tensor type: torch.LongTensor
tensor value: tensor([[[0, 3, 1],
         [0, 2, 2],
         [1, 0, 1]],

        [[2, 1, 2],
         [3, 3, 2],
         [1, 3, 4]]])
tensor shape: torch.Size([2, 3, 3])

tensor type: torch.FloatTensor
tensor value: 0.8061555624008179
tensor shape: torch.Size([])

"""