Loss Functions
A loss function measures the discrepancy between the model's output and the ground-truth label.
Loss Function: computes the discrepancy for a single sample
$\text{Loss} = f(\hat{y}, y)$
Cost Function: computes the average loss over the whole sample set
$\text{Cost} = \frac{1}{N} \sum_{i=1}^{N} f(\hat{y}_i, y_i)$
Objective Function: the cost plus a regularization term that constrains model complexity
$\text{Obj} = \text{Cost} + \text{Regularization}$
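The three definitions above can be sketched numerically. This is a minimal illustration, assuming a mean-squared-error loss and L2 regularization; the data, weights, and the coefficient `lam` are all made up for the example:

```python
# Obj = Cost + Regularization, with Cost = average per-sample loss.
# Hypothetical data: MSE loss, L2 regularization, made-up coefficient `lam`.
y_hat = [2.5, 0.0, 2.1]   # model outputs
y     = [3.0, -0.5, 2.0]  # ground-truth labels
w     = [0.5, -1.2]       # model weights

N = len(y)
losses = [(p - t) ** 2 for p, t in zip(y_hat, y)]  # Loss: one value per sample
cost   = sum(losses) / N                            # Cost: mean over the sample set
reg    = sum(v ** 2 for v in w)                     # L2 regularization term
lam    = 0.01                                       # regularization strength (assumed)
obj    = cost + lam * reg                           # objective actually minimized
print(cost, reg, obj)
```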
nn.CrossEntropyLoss
Function: combines nn.LogSoftmax() and nn.NLLLoss() to compute the cross entropy
Main parameters:
weight: per-class weight applied to the loss
ignore_index: class index to be ignored in the loss
reduction: reduction mode, one of none / sum / mean
none: return the loss per element
sum: sum over all elements, return a scalar
mean: weighted mean over all elements (weighted by weight if given), return a scalar
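The parameters above can be exercised on a toy batch. A small sketch with made-up logits and targets, also checking that nn.CrossEntropyLoss matches nn.LogSoftmax followed by nn.NLLLoss:

```python
import torch
import torch.nn as nn

# Toy batch: 3 samples, 3 classes (logits and targets are made up)
inputs = torch.tensor([[1.0, 2.0, 3.0],
                       [1.0, 2.0, 3.0],
                       [1.0, 2.0, 3.0]])
target = torch.tensor([0, 1, 2])

# The three reduction modes
for reduction in ("none", "sum", "mean"):
    loss = nn.CrossEntropyLoss(reduction=reduction)(inputs, target)
    print(reduction, loss)  # "none" keeps per-sample losses; the others return scalars

# CrossEntropyLoss is LogSoftmax followed by NLLLoss
ce  = nn.CrossEntropyLoss()(inputs, target)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(inputs), target)
assert torch.allclose(ce, nll)
```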
Cross entropy = entropy + relative entropy, i.e. $H(P, Q) = H(P) + D_{KL}(P \| Q)$; since the label distribution $P$ is fixed, minimizing the cross entropy is equivalent to minimizing the KL divergence.
Cross entropy
$H(P, Q) = -\sum_{i=1}^{N} P(x_i) \log Q(x_i)$
Self-information
$I(x) = -\log[p(x)]$
Entropy
$H(P) = E_{x \sim P}[I(x)] = -\sum_{i=1}^{N} P(x_i) \log P(x_i)$
Relative entropy (KL divergence)
$D_{KL}(P \| Q) = E_{x \sim P}\left[\log \frac{P(x)}{Q(x)}\right] = \sum_{i=1}^{N} P(x_i) \log \frac{P(x_i)}{Q(x_i)}$
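The decomposition cross entropy = entropy + relative entropy can be verified numerically. A minimal sketch with two hypothetical discrete distributions:

```python
import math

# Hypothetical discrete distributions: P = labels, Q = model predictions
P = [0.7, 0.2, 0.1]
Q = [0.6, 0.3, 0.1]

H_PQ = -sum(p * math.log(q) for p, q in zip(P, Q))     # cross entropy H(P, Q)
H_P  = -sum(p * math.log(p) for p in P)                # entropy H(P)
D_KL = sum(p * math.log(p / q) for p, q in zip(P, Q))  # relative entropy D_KL(P || Q)

# H(P, Q) = H(P) + D_KL(P || Q)
assert math.isclose(H_PQ, H_P + D_KL)
```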