Loss functions in torch.nn
(All formulas are taken from the source code)
Continuously updated…
CrossEntropyLoss:
The source code expresses the formula in three ways:
$$\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)$$
In other words, there is no need to apply softmax to the network output first, since the cross-entropy computation performs it internally, as the sketch below shows.
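A minimal sketch verifying this, assuming a toy batch of one sample with three classes: `nn.CrossEntropyLoss` applied to raw logits matches the formula above computed by hand.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 0.5, -1.0]])  # raw network output: 1 sample, 3 classes
target = torch.tensor([0])                 # ground-truth class index

# CrossEntropyLoss takes raw logits; it applies log-softmax internally
loss = nn.CrossEntropyLoss()(logits, target)

# Manual computation of -x[class] + log(sum_j exp(x[j]))
manual = -logits[0, target[0]] + torch.log(torch.exp(logits[0]).sum())

print(loss.item(), manual.item())  # the two values match
```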
$$\text{loss}(x, class) = weight[class] \left(-x[class] + \log\left(\sum_j \exp(x[j])\right)\right)$$
This is the weighted version; a sketch follows.
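A sketch of the weighted form, assuming hand-picked class weights and a made-up two-sample batch; `reduction='none'` exposes each sample's loss so it can be checked against the formula.

```python
import torch
import torch.nn as nn

weight = torch.tensor([1.0, 2.0, 0.5])     # one weight per class (illustrative values)
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.3, 1.2, -0.7]])
target = torch.tensor([1, 2])

# reduction='none' returns weight[class] * (-x[class] + logsumexp(x)) per sample
per_sample = nn.CrossEntropyLoss(weight=weight, reduction='none')(logits, target)

# Manual computation following the weighted formula
picked = logits.gather(1, target.unsqueeze(1)).squeeze(1)  # x[class] for each sample
manual = weight[target] * (-picked + torch.logsumexp(logits, dim=1))

print(per_sample, manual)  # elementwise equal
```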
$$\text{loss} = \frac{\sum^{N}_{i=1} \text{loss}(i, class[i])}{\sum^{N}_{i=1} weight[class[i]]}$$
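This is the batch average used by the default `reduction='mean'`: note that the denominator is the sum of the selected class weights, not the batch size N. A quick check, assuming random logits:

```python
import torch
import torch.nn as nn

weight = torch.tensor([1.0, 2.0, 0.5])
logits = torch.randn(4, 3)
target = torch.tensor([0, 1, 1, 2])

mean_loss = nn.CrossEntropyLoss(weight=weight)(logits, target)  # reduction='mean' by default
per_sample = nn.CrossEntropyLoss(weight=weight, reduction='none')(logits, target)

# The mean divides by the summed weights of the target classes, not by N=4
print(mean_loss.item(), (per_sample.sum() / weight[target].sum()).item())
```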
NLLLoss:
Here $x$ is the network output and $y$ is the target.
The source code describes this as the negative log-likelihood loss, designed to work together with a LogSoftmax layer; it is the likelihood function of a multinomial distribution, so on multi-class problems it should be equivalent to cross entropy:
$$\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_{y_n} x_{n,y_n}, \quad w_{c} = \text{weight}[c] \cdot \mathbb{1}\{c \not= \text{ignore\_index}\}$$
$$\ell(x, y) = \begin{cases} \sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n}} l_n, & \text{if reduction} = \text{`mean';}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{`sum'.} \end{cases}$$
In practice, if the last layer is LogSoftmax, then NLLLoss on its output gives the same result as CrossEntropyLoss on the raw logits, because CrossEntropyLoss applies log-softmax internally; in both cases the result is the cross-entropy loss, as the sketch below shows.
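A sketch of this equivalence, assuming random logits for a 4-sample, 3-class batch:

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 3)                 # raw network output
target = torch.tensor([0, 2, 1, 1])

# Path 1: LogSoftmax as the last layer, then NLLLoss
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, target)

# Path 2: CrossEntropyLoss directly on the raw logits
ce = nn.CrossEntropyLoss()(logits, target)

print(nll.item(), ce.item())  # identical up to floating-point error
```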