What Actually Happens Inside torch.nn.CrossEntropyLoss

Latest recommended article published on 2024-03-11 19:54:12

Linky1990

Category: Deep Learning

Article link: https://blog.csdn.net/liangjiu2009/article/details/107769512

Assumptions

Suppose there are three classes and the model output is output = model(input), which gives the following output vector (the raw logits):
$[o_1, o_2, o_3]$

Suppose target is 2 (the zero-based index of the third class), so its one-hot encoding is $[0, 0, 1]$.

torch.nn.CrossEntropyLoss computes the loss from output and target as follows:

Procedure

The output values first pass through Softmax, which gives the predicted probability of each class:

$[S_1, S_2, S_3]$

Taking the log of these probabilities gives the LogSoftmax result:
$[\log(S_1), \log(S_2), \log(S_3)]$
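The two-step "softmax, then log" description is the easiest way to read this step, but PyTorch also provides torch.log_softmax, which computes the same quantity in one call and is generally preferred for numerical stability. The following is a minimal pure-Python sketch of why that matters. The helper names log_softmax_naive and log_softmax_stable are hypothetical and written only for this illustration, and the logits are the rounded values from the printed output later in this post.

```python
import math

def log_softmax_naive(logits):
    # Literal two-step version: softmax first, then log.
    exps = [math.exp(o) for o in logits]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def log_softmax_stable(logits):
    # Log-sum-exp form: log S_i = o_i - (m + log(sum_j exp(o_j - m))), with m = max(o).
    m = max(logits)
    lse = m + math.log(sum(math.exp(o - m) for o in logits))
    return [o - lse for o in logits]

logits = [1.2372, -0.9604, 1.5415]
naive = log_softmax_naive(logits)
stable = log_softmax_stable(logits)
# On ordinary inputs the two forms agree.
print(naive)
print(stable)

# With very large logits, math.exp overflows in the naive form,
# while the stable form only ever exponentiates non-positive numbers.
big = [1000.0, 999.0, 998.0]
try:
    log_softmax_naive(big)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True
stable_big = log_softmax_stable(big)
print(naive_overflowed, stable_big)
```

Subtracting the maximum logit before exponentiating leaves the result unchanged mathematically, because the shift cancels between numerator and denominator of the softmax.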

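Finally, the LogSoftmax result and the target (conceptually, its one-hot encoding) are passed to NLLLoss, which gives
$-\log(S_3)$
Because the one-hot encoding of target is $[0, 0, 1]$, only $\log(S_3)$ is kept, and its sign is then flipped.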
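Chaining the three steps for this example gives a single closed-form expression. The block below is a worked restatement of the steps above in the same symbols ($o_j$ for the logits, $S_3$ for the softmax probability of the target class); it adds no assumptions beyond the standard definition of softmax.

```latex
\text{loss}
  = -\log(S_3)
  = -\log\frac{e^{o_3}}{\sum_{j=1}^{3} e^{o_j}}
  = -o_3 + \log\sum_{j=1}^{3} e^{o_j}
```

The last form shows that the loss shrinks as $o_3$ grows relative to the other logits, and that it is always non-negative because $S_3 \le 1$.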

Code verification

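import torch
import torch.nn as nn

if __name__ == "__main__":

    # Fix the seed so the random logits are reproducible.
    torch.manual_seed(2020)
    output = torch.randn((1, 3))  # one sample, three classes
    print("output: \t\t", output)

    target = torch.tensor([2])  # class index of the correct class
    print("target: \t\t", target)

    # Step 1: softmax over the class dimension.
    ss = torch.softmax(output, dim=1)
    print("softmax: \t\t", ss)

    # Step 2: log of the softmax probabilities.
    logsoftmax = torch.log(ss)
    print("logsoftmax: \t\t", logsoftmax)

    # Step 3: pick the entry at the target index and negate it.
    nllloss = -logsoftmax[0][2]
    print("nllloss : \t\t", nllloss)

    # Reference value computed by nn.CrossEntropyLoss directly.
    celoss = nn.CrossEntropyLoss()(output, target)
    print("CrossEntropyLoss : \t", celoss)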

The printed output is:

output:                  tensor([[ 1.2372, -0.9604,  1.5415]])
target:                  tensor([2])
softmax:                 tensor([[0.4054, 0.0450, 0.5496]])
logsoftmax:              tensor([[-0.9029, -3.1005, -0.5986]])
nllloss :                tensor(0.5986)
CrossEntropyLoss :       tensor(0.5986)

The manually computed value matches the output of nn.CrossEntropyLoss, which confirms the procedure.
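The check above uses a single sample. With a batch, nn.CrossEntropyLoss by default (reduction="mean") averages the per-sample values of $-\log(S_{\text{target}})$. The sketch below illustrates this; the batch size, seed, and target indices are arbitrary values chosen for the example and do not come from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 3)            # 4 samples, 3 classes (illustrative values)
targets = torch.tensor([2, 0, 1, 2])  # one class index per sample

# Per-sample loss: -log S_target, picked out by indexing each row at its target.
log_probs = torch.log_softmax(logits, dim=1)
per_sample = -log_probs[torch.arange(4), targets]

# nn.CrossEntropyLoss uses reduction="mean" by default, so it returns the average.
batch_loss = nn.CrossEntropyLoss()(logits, targets)

print(per_sample)
print(per_sample.mean(), batch_loss)
```

Passing reduction="none" to nn.CrossEntropyLoss would return the per-sample values instead of their mean.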

Summary

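According to the official documentation: "This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class." nn.LogSoftmax takes a single input, output, applies softmax to it, and then takes the log. nn.NLLLoss takes two inputs: the log-probabilities produced by nn.LogSoftmax, and target. Conceptually, it one-hot encodes target, keeps the log-probability at the position where the one-hot vector is non-zero, and negates it.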
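The summary can be checked directly with the two modules it names. The sketch below compares nn.LogSoftmax followed by nn.NLLLoss against nn.CrossEntropyLoss, and also spells out the one-hot view explicitly. The logits are the rounded values from the printed output above, so the result agrees with 0.5986 only to about four decimal places; the explicit one-hot masking is an illustration of the idea, not a claim about how PyTorch implements NLLLoss internally.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

output = torch.tensor([[1.2372, -0.9604, 1.5415]])  # rounded logits from the post
target = torch.tensor([2])

# Two-module version: LogSoftmax over the class dimension, then NLLLoss.
log_probs = nn.LogSoftmax(dim=1)(output)
two_step = nn.NLLLoss()(log_probs, target)

# Single-module version.
one_step = nn.CrossEntropyLoss()(output, target)

# One-hot view: mask the log-probabilities with the one-hot target,
# sum over classes, negate, then average over the batch.
one_hot = F.one_hot(target, num_classes=3)  # tensor([[0, 0, 1]])
masked = -(one_hot * log_probs).sum(dim=1).mean()

print(two_step, one_step, masked)
```

All three values coincide, which is the equivalence the documentation sentence describes.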
