Focal Loss: Notes on the Code and the Math Behind It

Facias

Last modified: 2023-02-23 15:51:48


Column: Deep Learning (ongoing notes). Tags: deep learning, computer vision, object detection, python, pytorch

First published: 2023-02-11 11:46:43

Article link: https://blog.csdn.net/weixin_45453121/article/details/128978806


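Focal Loss is a loss function designed to address the positive/negative sample imbalance in one-stage object detectors such as RetinaNet. It reweights positive and negative samples, reducing the contribution of easy samples and increasing the influence of hard ones, which improves training. Before computing Focal Loss, the IoU between each anchor and each annotation is computed, positive and negative samples are defined from that IoU, and every sample is assigned a label. A positive-class factor α and a hard-sample focusing parameter γ then modify the standard cross-entropy loss so that it is more sensitive to hard samples.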


Focal Loss Code Walkthrough Notes

References: RetinaNet code walkthrough
 Focal Loss explained
 RetinaNet in detail

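How Focal Loss Works

Focal Loss mainly addresses the imbalance between positive and negative samples in one-stage object detectors. Anchor-based detectors generate a large number of anchors; each anchor is compared with every annotation (ground truth) by IoU, and that IoU decides whether the anchor is a positive or a negative sample.
For example, in RetinaNet every cell generates 9 anchors, so a 256x256 feature map has 256x256x9 anchors. Each anchor is compared with every annotation: if its IoU with every annotation is below 0.4, the anchor is a negative sample; if its largest IoU over all annotations is at least 0.5, it is a positive sample; anchors in between are discarded. That is: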

IoU >= 0.5: positive sample
IoU < 0.4: negative sample
IoU in [0.4, 0.5): discarded
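
The thresholding rule above can be sketched in a few lines. This is a minimal plain-Python illustration, not the RetinaNet code; the function name `label_anchor` and its return strings are hypothetical, chosen only for this example.

```python
def label_anchor(max_iou, neg_thresh=0.4, pos_thresh=0.5):
    """Classify one anchor from its largest IoU over all annotations."""
    if max_iou >= pos_thresh:
        return "positive"
    if max_iou < neg_thresh:
        return "negative"
    return "discarded"  # IoU falls in [0.4, 0.5)


# Largest IoU of four hypothetical anchors
labels = [label_anchor(v) for v in (0.72, 0.10, 0.45, 0.50)]
print(labels)
```

Note that the boundary value 0.5 counts as positive and 0.4 counts as discarded, matching the half-open interval [0.4, 0.5).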

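So before computing Focal Loss, we need the generated anchors, the known annotations, and the model outputs:

 anchors          # [num_anchors, 4]; the 4 values are x_ctr, y_ctr, h, w
 classification   # [num_anchors, num_classes]; class probabilities predicted by the model
 regression       # [num_anchors, 4]; box regression predicted by the model
 annotations      # [num_annotations, 5]; ground-truth labels: x_ctr, y_ctr, h, w, class label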

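Next, compute the IoU between every anchor and every annotation, and for each anchor take the largest IoU together with the index of the annotation that produced it:

IoU = calc_iou(anchors, annotations[:, :4])       # [num_anchors, num_annotations]

IoU_max, IoU_argmax = torch.max(IoU, dim=1)
# IoU_max: for each anchor, the largest IoU over all annotations
# IoU_argmax: for each anchor, the index of the annotation that gives that largest IoU
# both have shape [num_anchors]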
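
The post does not show `calc_iou`, so the following is only a sketch of what that step computes, written in plain Python for a single pair of boxes. It assumes the (x_ctr, y_ctr, h, w) layout stated above; the helper name `iou_center` and the sample boxes are hypothetical.

```python
def iou_center(box_a, box_b):
    """IoU of two boxes given as (x_ctr, y_ctr, h, w)."""
    def corners(box):
        x, y, h, w = box
        return x - w / 2, y - h / 2, x + w / 2, y + h / 2

    ax1, ay1, ax2, ay2 = corners(box_a)
    bx1, by1, bx2, by2 = corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two hypothetical anchors and two hypothetical annotations (last value = class index)
anchors = [(5.0, 5.0, 10.0, 10.0), (50.0, 50.0, 10.0, 10.0)]
annotations = [(5.0, 5.0, 10.0, 10.0, 0), (7.5, 5.0, 10.0, 10.0, 2)]

results = []
for anchor in anchors:
    ious = [iou_center(anchor, ann[:4]) for ann in annotations]
    iou_max = max(ious)                # plays the role of IoU_max
    iou_argmax = ious.index(iou_max)   # plays the role of IoU_argmax
    results.append((iou_max, iou_argmax))
print(results)
```

The first anchor coincides with annotation 0, so its largest IoU is 1.0 at index 0; the second anchor overlaps nothing, so its largest IoU is 0.0 and it would become a negative sample.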

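Computing the positive/negative labels (targets)

Focal Loss is a modification of the standard cross-entropy loss, which is:
$\mathrm{CE}(p, y)=-[y \log (p)+(1-y) \log (1-p)]$
Here y is the positive/negative label (y=1 for a positive sample) and p, in [0,1], is the predicted probability of the positive class. When the true label is positive, a larger p gives a smaller CE: the more confidently the model predicts a positive, the smaller the loss. Negative samples work the same way with 1-p. The formula can be rewritten as
$\mathrm{CE}(p, y)=\left\{\begin{array}{ll} -\log (p) & \text { if } y=1 \\ -\log (1-p) & \text { otherwise. } \end{array}\right.$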
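
To make the behaviour of the formula concrete, here is a small plain-Python sketch; the function name `cross_entropy` and the sample probabilities are illustrative choices, not part of the original code.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction p in (0, 1) and label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# For a positive sample (y=1), the loss falls as p rises
for p in (0.9, 0.5, 0.1):
    print(f"y=1, p={p}: CE = {cross_entropy(p, 1):.3f}")
```

A negative sample with p=0.1 gets the same loss as a positive sample with p=0.9, since both are equally confident correct predictions.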

First, build the positive/negative label matrix targets:

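targets = torch.ones(classification.shape) * -1
# [num_anchors, num_classes]; every entry starts at -1, which marks a discarded anchor

targets[torch.lt(IoU_max, 0.4), :] = 0
# IoU < 0.4: negative samples. torch.lt gives a bool mask of shape [num_anchors];
# the rows of targets that belong to negative anchors are set to 0 for every class

positive_indices = torch.ge(IoU_max, 0.5)
# IoU >= 0.5: positive samples, as a bool mask of shape [num_anchors]

num_positive_anchors = positive_indices.sum()   # number of positive samples

assigned_annotations = annotations[IoU_argmax, :]
# [num_anchors, 5]: for each anchor, the box coordinates and class of the
# annotation with which it has the largest IoU

targets[positive_indices, :] = 0

# one-hot encode the class of each positive anchor
targets[positive_indices, assigned_annotations[positive_indices, 4].long()] = 1
# assigned_annotations[positive_indices, 4] has shape [num_positive_anchors]; each element
# is a class index. For example, with the three classes car, person and truck the indices
# are 0, 1 and 2. This line converts the class of every positive anchor in targets to a
# one-hot row according to its annotation.
# The final targets is a [num_anchors, num_classes] matrix whose entries are 0 or 1
# (rows of discarded anchors keep the value -1):
 anchor            car            person           truck
 1                  0                0               0
 2                  1                0               0
 3                  0                1               0
 4                  0                0               1
The row of a negative sample is all zeros. The row of a positive sample follows its true label: for example, if the annotation with which an anchor has its largest IoU (at least 0.5) is labelled car, then in that anchor's row of targets the car column is 1 and the other columns are 0.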
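
The same assignment can be traced by hand on a tiny case. The sketch below uses plain Python lists instead of tensors; `build_target_row` and the sample IoU values are hypothetical and only mirror the logic described above.

```python
NUM_CLASSES = 3  # car, person, truck -> indices 0, 1, 2


def build_target_row(iou_max, class_index, num_classes=NUM_CLASSES):
    """Return one row of targets for a single anchor."""
    if iou_max < 0.4:                 # negative sample: all zeros
        return [0] * num_classes
    if iou_max >= 0.5:                # positive sample: one-hot row
        row = [0] * num_classes
        row[class_index] = 1
        return row
    return [-1] * num_classes         # discarded anchor keeps the initial -1


# (largest IoU, class index of the best-matching annotation) for five hypothetical anchors
anchor_info = [(0.10, 1), (0.80, 0), (0.65, 1), (0.55, 2), (0.45, 0)]
targets = [build_target_row(iou, cls) for iou, cls in anchor_info]
for row in targets:
    print(row)
```

The first four rows reproduce the table above; the fifth anchor falls in [0.4, 0.5) and stays at -1.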

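Computing the weight matrix

Balanced cross-entropy weight

Focal Loss targets the imbalance between positive and negative samples. It introduces a positive-class factor $\alpha \in[0,1]$ and a negative-class factor $1-\alpha$ to offset the difference in their numbers:
$\mathrm{CE}(p, y)=-\left\{\begin{array}{cc} \alpha \log (p) & \text { if }(y=1) \\ (1-\alpha) \log (1-p) & \text { otherwise } \end{array}\right.$
The authors' experiments found α = 0.25 to work best.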

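alpha = 0.25
alpha_factor = torch.ones(targets.shape) * alpha   # start with alpha everywhere
alpha_factor = torch.where(torch.eq(targets, 1.), alpha_factor, 1. - alpha_factor)
# resulting alpha_factor, shape [num_anchors, num_classes]:
 anchor            car            person           truck
 1               1-alpha          1-alpha         1-alpha
 2                alpha           1-alpha         1-alpha
 3               1-alpha           alpha          1-alpha
 4               1-alpha          1-alpha          alpha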
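
As a quick check of the table, the selection rule can be reproduced with plain Python lists. This is a sketch only: `alpha_row` is a hypothetical helper, and the targets are the four example rows used earlier.

```python
ALPHA = 0.25


def alpha_row(target_row, alpha=ALPHA):
    """alpha where the target is 1, and 1 - alpha everywhere else."""
    return [alpha if t == 1 else 1 - alpha for t in target_row]


targets = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
alpha_factor = [alpha_row(row) for row in targets]
for row in alpha_factor:
    print(row)
```

With α = 0.25, every positive entry is weighted 0.25 and every other entry 0.75.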

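Hard-sample weight

Easy and hard samples are also severely imbalanced, which lets easy samples dominate the loss. To counter this, a weight $(1-p)^{\gamma}$ is introduced for positives (and $p^{\gamma}$ for negatives), where γ is a hyperparameter; this weight reduces the loss contribution of easily classified samples:
$\mathrm{FL}(p, y)=\left\{\begin{array}{ll} -\alpha(1-p)^{\gamma} \log (p) & \text { if } y=1 \\ -(1-\alpha) p^{\gamma} \log (1-p) & \text { otherwise } \end{array}\right.$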
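My own summary of the math behind this method follows; it may contain mistakes.
When y=1 the true label is positive and the model predicts probability p. The ratio of the CE loss to the FL loss is:
$-\alpha \log (p) : -\alpha(1-p)^{\gamma} \log (p)$
which simplifies to:
$f_1 = 1/(1-p)^{\gamma}$

Take γ=2 and y=1 (a positive sample). If p is large the sample is easy, and the ratio f1 grows without bound as p approaches 1, so FL shrinks the loss of an easy sample by a very large factor. If p is small the sample is hard, and f1 stays close to 1, so FL leaves the loss of a hard sample almost unchanged. In other words, easy samples account for a much larger share of the total loss under CE than they do under FL.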

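When y=0, the ratio of the CE loss to the FL loss is:
$-(1-\alpha) \log (1-p) : -(1-\alpha) p^{\gamma} \log (1-p)$
that is:
$1/p^{\gamma}$

With γ=2 the same reasoning applies: an easy negative has a small p and therefore a large ratio, while a hard negative has p close to 1 and a ratio close to 1.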
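
As a worked check of the two ratios with γ=2 (the probabilities 0.9 and 0.1 are picked only for illustration):
$y=1,\ p=0.9:\quad 1/(1-p)^{\gamma}=1/0.1^{2}=100$
$y=1,\ p=0.1:\quad 1/(1-p)^{\gamma}=1/0.9^{2} \approx 1.23$
$y=0,\ p=0.1:\quad 1/p^{\gamma}=1/0.1^{2}=100$
$y=0,\ p=0.9:\quad 1/p^{\gamma}=1/0.9^{2} \approx 1.23$
In both cases the confident, correct prediction (the easy sample) has its loss cut by a factor of 100, while the confident, wrong prediction (the hard sample) is cut by only about 1.23.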

The theory above is my own summary, and corrections are welcome. The experimental explanation below is more intuitive.

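When y=1, the larger the predicted p, the easier the sample is to predict (an easy sample), and the larger the ratio of CE loss to FL loss becomes.
A borrowed figure of experimental results illustrates this:
[Figure: experimental results comparing CE loss and FL loss]

With p = 0.9 and y=1 (an easy sample), the CE loss is 400 times the FL loss;
with p = 0.1 and y=1 (a hard sample), the CE loss is about 4.9 times the FL loss.
These two numbers match unweighted CE compared against FL with α = 0.25 and γ = 2, that is, a ratio of 1/(α(1-p)^γ).
Put together: easy samples vastly outnumber hard ones, so under CE they dominate the total loss. FL shrinks the loss of this easy sample about 81 times more than that of the hard sample (400 / 4.9), which greatly reduces the easy-to-hard ratio of contributions and raises the weight of hard samples in the total loss.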
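
The 400x and 4.9x figures can be reproduced numerically. The sketch below assumes that they compare unweighted CE against FL with α = 0.25 and γ = 2; that assumption is inferred from the numbers, and the function names are hypothetical.

```python
import math

ALPHA, GAMMA = 0.25, 2.0


def ce_positive(p):
    """Unweighted cross-entropy for a positive sample."""
    return -math.log(p)


def fl_positive(p, alpha=ALPHA, gamma=GAMMA):
    """Focal loss for a positive sample."""
    return -alpha * (1 - p) ** gamma * math.log(p)


for p in (0.9, 0.1):
    ratio = ce_positive(p) / fl_positive(p)
    print(f"p={p}: CE={ce_positive(p):.4f}  FL={fl_positive(p):.6f}  CE/FL={ratio:.1f}")
```

The log term cancels in the ratio, leaving 1/(α(1-p)^γ): 1/(0.25 x 0.01) = 400 for p = 0.9 and 1/(0.25 x 0.81), about 4.9, for p = 0.1.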

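The code that computes this weight:

focal_weight = torch.where(torch.eq(targets, 1.), 1. - classification, classification)
# 1 - p where the target is 1, p everywhere else
focal_weight = alpha_factor * torch.pow(focal_weight, gamma)
# alpha * (1 - p)^gamma for positives, (1 - alpha) * p^gamma for negatives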

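Computing the total classification loss

bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification))
cls_loss = focal_weight * bce
# discarded anchors (targets == -1) must not contribute to the loss
cls_loss = torch.where(torch.ne(targets, -1.0), cls_loss, torch.zeros_like(cls_loss))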
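
Putting the pieces together, the whole classification loss can be sketched element by element in plain Python. Several details here are assumptions beyond the snippet above: the probabilities are clamped by a hypothetical `eps` so that log() stays finite, discarded entries (-1) contribute zero, and the sum is normalized by the number of positive anchors as described in the Focal Loss paper. The function names and sample values are illustrative.

```python
import math


def focal_loss_element(p, target, alpha=0.25, gamma=2.0, eps=1e-4):
    """Focal loss of one (anchor, class) entry."""
    if target == -1:
        return 0.0                       # discarded anchor: no contribution
    p = min(max(p, eps), 1.0 - eps)      # clamp so log() never sees 0
    if target == 1:
        return -alpha * (1 - p) ** gamma * math.log(p)
    return -(1 - alpha) * p ** gamma * math.log(1 - p)


def classification_loss(probs, targets):
    """Sum of element losses divided by the number of positive anchors."""
    total = sum(
        focal_loss_element(p, t)
        for prob_row, target_row in zip(probs, targets)
        for p, t in zip(prob_row, target_row)
    )
    num_positive = sum(1 for row in targets if 1 in row)
    return total / max(num_positive, 1)


# Three hypothetical anchors: one negative, one positive (car), one discarded
probs = [[0.1, 0.2, 0.1], [0.9, 0.1, 0.1], [0.3, 0.3, 0.3]]
targets = [[0, 0, 0], [1, 0, 0], [-1, -1, -1]]
loss = classification_loss(probs, targets)
print(loss)
```

Dividing by at least 1 keeps the loss defined for an image that has no positive anchors.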
