F.nll_loss vs. F.cross_entropy (how to fix a loss that goes NaN)

While doing transfer learning with vgg16, the fit function used F.nll_loss() as the loss function:

import torch
import torch.nn.functional as F
from torch.autograd import Variable  # deprecated since PyTorch 0.4; kept to match the original code

def fit(epoch, model, data_loader, phase='training', volatile=False):
    # epoch is not used inside the function; the caller's loop passes it in
    # only so the signature mirrors the outer training loop
    if phase == 'training':
        model.train()
    if phase == 'validation':
        model.eval()
        volatile = True
    running_loss = 0.0
    running_correct = 0
    for batch_idx, (data, target) in enumerate(data_loader):
        if is_cuda:  # global flag set where the model is built
            data, target = data.cuda(), target.cuda()
        # volatile=True was old PyTorch's way of disabling autograd for inference
        # (the counterpart of requires_grad). It must be passed by keyword:
        # Variable's second positional argument is requires_grad, not volatile.
        # On PyTorch >= 0.4, wrap the validation pass in torch.no_grad() instead.
        data, target = Variable(data, volatile=volatile), Variable(target)
        if phase == 'training':
            optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        # accumulate the summed (un-averaged) loss so we can divide by the dataset size at the end
        running_loss += F.nll_loss(output, target, reduction='sum').item()
        preds = output.data.max(dim=1, keepdim=True)[1]
        # eq() compares predictions and targets element-wise
        running_correct += preds.eq(target.data.view_as(preds)).cpu().sum()
        if phase == 'training':
            loss.backward()
            optimizer.step()
    loss = running_loss / len(data_loader.dataset)
    accuracy = 100. * running_correct / len(data_loader.dataset)
    print(f"{phase} loss is {loss:5.2} and {phase} accuracy is "
          f"{running_correct}/{len(data_loader.dataset)} {accuracy:10.4}")
    return loss, accuracy
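
For context, a minimal sketch of the setup this fit function expects. is_cuda and optimizer are module-level globals in the original code; train_loader/valid_loader, the 2-class head, and the hyperparameters here are illustrative assumptions, not from the original:

import torch
from torchvision import models

is_cuda = torch.cuda.is_available()
model = models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False                  # freeze the convolutional backbone
model.classifier[6] = torch.nn.Linear(4096, 2)   # replace the head for a 2-class task (assumed)
if is_cuda:
    model = model.cuda()
optimizer = torch.optim.SGD(model.classifier.parameters(), lr=0.0001, momentum=0.5)

for epoch in range(1, 21):
    epoch_loss, epoch_accuracy = fit(epoch, model, train_loader, phase='training')
    val_loss, val_accuracy = fit(epoch, model, valid_loader, phase='validation')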

The training log:
[screenshot: training log, loss is nan in every epoch]
After training, every loss came out as nan. After some reading, the likely causes are:
1. The learning rate is too high.
2. The loss function does not match the network's output (e.g., a loss that expects log-probabilities is fed raw logits).
3. For regression problems, a division by zero may have occurred; adding a small epsilon term can fix it.
4. The data itself may contain NaN; check both input and target with numpy.any(numpy.isnan(x)) (see the sketch after this list).
5. The target must be something the loss function can actually evaluate, e.g. with a sigmoid-activated output the target should be greater than 0. Again, check the dataset.
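
A quick way to run check 4 on a batch before training, as a sketch (train_loader is assumed; torch.isnan works on tensors directly, so the numpy detour is optional):

import numpy
import torch

def assert_finite(x, name):
    # fail fast if the tensor contains NaN or +/-Inf anywhere
    if torch.isnan(x).any() or torch.isinf(x).any():
        raise ValueError(f"{name} contains NaN/Inf values")

data, target = next(iter(train_loader))   # train_loader is assumed
assert_finite(data, 'input')
assert_finite(target.float(), 'target')   # cast: isnan needs a float tensor on old PyTorch
print(numpy.any(numpy.isnan(data.numpy())))  # the equivalent numpy check from item 4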

The dataset checked out fine, but the usage notes for F.nll_loss() pointed me in the right direction: NLLLoss expects a vector of log-probabilities plus a target label as input. It does not compute the log-probabilities for us, so the network's last layer should be a log_softmax. Looking back at my earlier networks that had trained fine, sure enough they all ended in log_softmax. The pretrained models.vgg16(), however, ends in a plain linear layer that outputs raw logits, and rather than rework its internal structure I switched the loss function to F.cross_entropy(), which applies log_softmax itself.
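
The relationship is easy to verify on random tensors (a standalone sketch, unrelated to the training code above): F.cross_entropy on raw logits equals F.nll_loss on log_softmax output, while F.nll_loss on raw logits just returns -logits[target], an unbounded quantity the optimizer can drive toward -inf, which is where the nan came from.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)         # raw scores, like the vgg16 classifier output
target = torch.tensor([1, 5, 0, 7])

# cross_entropy is log_softmax + nll_loss in a single call
a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(a, b))         # True

# nll_loss on raw logits silently computes -logits[target] instead of a real loss
print(F.nll_loss(logits, target))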

[screenshot: training log after switching to F.cross_entropy]
A note on how the two loss functions work:
I had planned to write this up myself, but while researching I ran across a very detailed article, so I'll just link it here (lazy, I know...):
详解NLLLoss和CrossEntropyLoss (a detailed breakdown of NLLLoss and CrossEntropyLoss)
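
The one-line summary, for a raw logit vector z and class index y (a condensed restatement of that article's math, in plain notation):

    NLLLoss(x, y) = -x[y]                                  # x must already be log-probabilities
    CrossEntropyLoss(z, y) = -log( exp(z[y]) / Σ_j exp(z[j]) )
                           = NLLLoss(log_softmax(z), y)

That is, CrossEntropyLoss = LogSoftmax + NLLLoss, which is exactly why it handles vgg16's raw logits while NLLLoss alone does not.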
