The cross-entropy formula
The input x has shape (batch_size, num_class). The loss is usually averaged over the batch (the default reduction='mean'), and backward() is then called on the result.
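A minimal sketch of this usage pattern (the logits and targets below are made-up values):

```python
import torch
import torch.nn as nn

# Made-up batch: 2 samples, 3 classes.
x = torch.randn(2, 3, requires_grad=True)  # logits, shape (batch_size, num_class)
target = torch.tensor([0, 2])              # one class index per sample
loss = nn.CrossEntropyLoss()(x, target)    # default reduction='mean' averages over the batch
loss.backward()                            # gradients flow back to the logits
print(loss, x.grad.shape)
```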
The formula
Here, class refers to one particular target class (a weighted example follows the list):
- Without weights:
$$loss(x, class) = -\log\frac{\exp(x[class])}{\sum_{j}\exp(x[j])}$$
- With weights:
$$loss(x, class) = weight[class] \times \left(-\log\frac{\exp(x[class])}{\sum_{j}\exp(x[j])}\right)$$
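A short sketch of the weighted form, with made-up weight values. Note that the default reduction='mean' divides by the sum of the target weights rather than the batch size, so reduction='sum' shows the weight[class] factor most directly:

```python
import torch
import torch.nn as nn

# Class 0 weighted 2x; classes 1 and 2 weighted 1x (illustrative values).
weights = torch.tensor([2.0, 1.0, 1.0])
criterion = nn.CrossEntropyLoss(weight=weights, reduction='sum')
x = torch.tensor([[-0.7715, -0.6205, -0.2562]])  # logits, shape [1, 3]
target = torch.tensor([0])
print(criterion(x, target))  # tensor(2.6894) = 2 * 1.3447
```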
Computing cross-entropy by hand
```python
import torch
import torch.nn as nn

entropy = nn.CrossEntropyLoss()
input = torch.tensor([[-0.7715, -0.6205, -0.2562]])  # raw logits, shape: [1, 3]
target = torch.tensor([0])                           # class index, shape: [1]
output = entropy(input, target)
print(output)
```
Out:
```
tensor(1.3447)
```
The hand calculation:
$$-\log\frac{\exp(x[0])}{\sum_{j}\exp(x[j])} = 0.7715 + \log\bigl(\exp(-0.7715) + \exp(-0.6205) + \exp(-0.2562)\bigr) = 1.3447$$
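The same arithmetic in a few lines of PyTorch, for verification:

```python
import torch

x = torch.tensor([-0.7715, -0.6205, -0.2562])
# -log(exp(x[0]) / sum_j exp(x[j])) = log(sum_j exp(x[j])) - x[0]
loss = torch.log(torch.exp(x).sum()) - x[0]
print(loss)  # tensor(1.3447), matching nn.CrossEntropyLoss
```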
How cross-entropy is implemented in the source
CrossEntropyLoss()=LogSoftmax()+NLLLoss()
```python
m = nn.LogSoftmax(dim=1)  # log-softmax along the class dimension
loss = nn.NLLLoss()
input = m(input)
print('input:', input)
output = loss(input, target)
print('output:', output)
```
Out:
```
input: tensor([[-1.3447, -1.1937, -0.8294]])
output: tensor(1.3447)
```
The formula for nn.LogSoftmax():
$$\log\frac{\exp(x)}{\sum_{i}\exp(x_i)}$$
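To check this, log-softmax should match taking the log of the softmax (the former being the numerically stabler route):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[-0.7715, -0.6205, -0.2562]])
print(F.log_softmax(x, dim=1))         # tensor([[-1.3447, -1.1937, -0.8294]])
print(torch.log(F.softmax(x, dim=1)))  # same values, computed less stably
```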
The formula for nn.NLLLoss():
$$loss_n = -w_n \, x_{n, y_n}$$
where $x_{n, y_n}$ is the log-probability at sample $n$'s target class $y_n$, and $w_n$ is the weight of that class (1 by default).
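In other words, NLLLoss just picks out the negative log-probability at the target index, which a quick check confirms:

```python
import torch
import torch.nn as nn

log_probs = torch.tensor([[-1.3447, -1.1937, -0.8294]])  # LogSoftmax output from above
target = torch.tensor([0])
print(nn.NLLLoss()(log_probs, target))  # tensor(1.3447)
print(-log_probs[0, target[0]])         # tensor(1.3447), the same element
```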