Paper Notes: Focal Loss for Dense Object Detection - ICCV 2017

CiLin-Yan

Last modified 2022-04-11 21:44:51

Category: Object Recognition. Tags: deep learning, cnn, computer vision

First published 2022-03-06 10:08:39

Article link: https://blog.csdn.net/weixin_43791477/article/details/123306269

`2017 RetinaNet` Focal Loss for Dense Object Detection

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár. ICCV, 2017.

1. Contributions

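The paper proposes the Focal Loss function to mitigate the extreme imbalance between foreground and background samples. The Focal Loss is designed to address the one-stage object detection scenario in which there is an extreme imbalance between foreground and background classes during training (e.g., 1:1000). (Earlier detectors such as Faster R-CNN, SSD, and YOLO instead rely on Hard Negative Mining or similar sampling heuristics, i.e., rules for selecting which positive and negative samples are used.)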

2. Experiment

For the first time, a one-stage network surpasses two-stage networks in accuracy.

3. Focal loss

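Figure 1: Focal Loss as a function of $p_t$ for several values of $\gamma \in [0,5]$; the blue (top) curve, $\gamma = 0$, is the CE loss.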

3.1. Cross Entropy Loss

$CE(p_t) = - \log(p_t),\quad \text{where}\quad p_{\mathrm{t}}= \begin{cases}p & \text { if } y=1 \\ 1-p & \text { otherwise }\end{cases}$

The CE loss can be seen as the blue (top) curve in Figure 1. One notable property of this loss, which can be easily seen in its plot, is that even examples that are easily classified ( $p_t \gg .5$ ) incur a loss with non-trivial magnitude. When summed over a large number of easy examples, these small loss values can overwhelm the rare class.
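The effect described above can be checked with a little arithmetic. The following is a minimal sketch, not taken from the paper: the batch sizes and the two $p_t$ values are hypothetical numbers chosen only to illustrate how many small losses can outweigh a few large ones.

```python
import math

def cross_entropy(p_t: float) -> float:
    """CE(p_t) = -log(p_t) for a single example."""
    return -math.log(p_t)

# Hypothetical batch (illustrative numbers, not from the paper):
# many easy background anchors and a few hard foreground anchors.
n_easy, p_t_easy = 100_000, 0.99
n_hard, p_t_hard = 100, 0.10

easy_total = n_easy * cross_entropy(p_t_easy)
hard_total = n_hard * cross_entropy(p_t_hard)

print(f"per-example CE: easy={cross_entropy(p_t_easy):.4f}, hard={cross_entropy(p_t_hard):.4f}")
print(f"summed CE:      easy={easy_total:.1f}, hard={hard_total:.1f}")
```

Each easy example contributes far less than each hard one, yet under these assumed counts the summed loss of the easy examples is the larger of the two.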

3.2. $\alpha$ -Balanced CE Loss

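A common way to handle class imbalance is to introduce a weighting factor $\alpha \in [0,1]$ for the foreground class and $1-\alpha$ for the background class, with $\alpha_t$ defined analogously to $p_t$. In practice, $\alpha$ can be set from the inverse class frequency $\displaystyle rate=\frac{n_{background\_proposal}}{n_{all\_proposal}}$ , or treated as a hyperparameter chosen by cross-validation. The authors do not set $\alpha$ from the inverse class frequency: in their experiments with this loss, $\alpha=0.75$ gives the highest $m\text{AP}$ , whereas the inverse class frequency $\displaystyle rate$ seen during training would typically be $0.999$ or higher.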
$CE(p_t) = -\alpha_t \log(p_t)$

This loss is a simple extension to CE. The authors treat the $\alpha$ -Balanced CE Loss as the experimental baseline for Focal Loss.
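As a minimal sketch of the $\alpha$ -balanced CE formula for a single binary prediction: the function name `alpha_balanced_ce` and its signature are illustrative assumptions, and the default `alpha=0.75` is simply the value these notes report as best for this loss.

```python
import math

def alpha_balanced_ce(p: float, y: int, alpha: float = 0.75) -> float:
    """alpha-balanced CE for one example: -alpha_t * log(p_t).

    p: predicted probability of the foreground class.
    y: 1 for foreground, 0 for background.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # weight of the true class
    return -alpha_t * math.log(p_t)

# The same prediction confidence is weighted differently per class.
print(alpha_balanced_ce(0.5, y=1))  # foreground, weight alpha
print(alpha_balanced_ce(0.5, y=0))  # background, weight 1 - alpha
```

Note that the weight depends only on the class label, not on how confident the prediction is, so easy and hard examples of the same class are scaled identically.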

3.3. Focal Loss

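During training, easily classified negatives make up most of the total loss and dominate the gradient. The $\alpha$ -Balanced CE Loss balances the importance of positive and negative samples, but it does not distinguish easy samples from hard ones. The authors therefore add a modulating factor $\left(1-p_{\mathrm{t}}\right)^{\gamma}$ to the CE Loss, with a tunable focusing parameter $\gamma \ge 0$ . The Focal Loss is defined as: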
$\mathrm{FL}\left(p_{\mathrm{t}}\right)=-\left(1-p_{\mathrm{t}}\right)^{\gamma} \log \left(p_{\mathrm{t}}\right)$
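Figure 1 plots the Focal Loss for several values of $\gamma \in [0,5]$ . Two properties of the Focal Loss are worth noting.

First, when an example is misclassified and $p_t$ is small, the modulating factor is close to 1 and the loss is essentially unaffected. As $p_t \rightarrow 1$ , the factor goes to 0 and the loss for well-classified examples is down-weighted.
Second, the focusing parameter $\gamma$ smoothly adjusts the rate at which easy examples are down-weighted. When $\gamma=0$ , FL is equivalent to CE, and as $\gamma$ increases, the effect of the modulating factor increases as well ( $\gamma=2$ works best in the authors' experiments).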
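The two properties can be made concrete by comparing CE and FL at a few values of $p_t$ . This is a small sketch with hypothetical helper names (`focal_loss_from_pt`, `down_weighting`); the ratio CE/FL is just $1/(1-p_t)^{\gamma}$ , so with $\gamma=2$ an example at $p_t=0.9$ is down-weighted by roughly 100x and one at $p_t \approx 0.968$ by roughly 1000x, while a hard example at $p_t=0.1$ is barely changed.

```python
import math

def focal_loss_from_pt(p_t: float, gamma: float = 2.0) -> float:
    """FL(p_t) = -(1 - p_t)^gamma * log(p_t) for one example."""
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def down_weighting(p_t: float, gamma: float = 2.0) -> float:
    """Ratio CE(p_t) / FL(p_t), which simplifies to 1 / (1 - p_t)^gamma."""
    return 1.0 / (1.0 - p_t) ** gamma

# From a hard example (p_t = 0.1) to very easy ones (p_t near 1).
for p_t in (0.1, 0.5, 0.9, 0.968):
    ce = -math.log(p_t)
    fl = focal_loss_from_pt(p_t, gamma=2.0)
    print(f"p_t={p_t:<5}  CE={ce:.5f}  FL={fl:.5f}  CE/FL={down_weighting(p_t):.1f}")
```

Setting `gamma=0.0` makes the modulating factor equal to 1, so the function reduces to plain CE.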

In practice we use an $\alpha$ -balanced variant of the focal loss:
$\mathrm{FL}\left(p_{\mathrm{t}}\right)=-\alpha_{\mathrm{t}}\left(1-p_{\mathrm{t}}\right)^{\gamma} \log \left(p_{\mathrm{t}}\right)$
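A minimal pure-Python sketch of the $\alpha$ -balanced focal loss follows. The names `alpha_focal_loss` and `batch_focal_loss`, the `eps` clamp, and the defaults are assumptions for illustration, not the authors' code. The defaults $\alpha=0.25$ , $\gamma=2$ follow the combination the paper reports as working best together, and the batch helper normalizes by the number of foreground examples, mirroring the paper's normalization by the number of anchors assigned to a ground-truth box.

```python
import math
from typing import Sequence

def alpha_focal_loss(p: float, y: int, alpha: float = 0.25,
                     gamma: float = 2.0, eps: float = 1e-12) -> float:
    """FL = -alpha_t * (1 - p_t)^gamma * log(p_t) for one binary prediction.

    p: predicted probability of the foreground class.
    y: 1 for foreground, 0 for background.
    eps: assumed clamp that keeps log() finite at p = 0 or p = 1.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def batch_focal_loss(probs: Sequence[float], labels: Sequence[int],
                     alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Sum of per-example losses divided by the number of foreground examples."""
    total = sum(alpha_focal_loss(p, y, alpha, gamma) for p, y in zip(probs, labels))
    num_pos = max(1, sum(labels))  # avoid dividing by zero when there is no foreground
    return total / num_pos

# One foreground example and two easy background examples.
print(batch_focal_loss([0.9, 0.1, 0.2], [1, 0, 0]))
```

In a real detector this would be applied to per-anchor, per-class sigmoid outputs with a tensor library; the scalar version here only mirrors the formula.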

4. Drawbacks

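Focal Loss is sensitive to label noise. It gives relatively higher weight to hard examples, i.e., those the model classifies poorly. When the labels are noisy, a mislabeled example also looks hard, so Focal Loss up-weights hard examples and mislabeled examples alike. If a large share of the labels is wrong, Focal Loss therefore performs poorly.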
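The noise sensitivity can be illustrated with a toy calculation. This is a sketch under made-up assumptions: 99 correctly labeled examples that the model classifies well ( $p_t=0.9$ ) and one mislabeled example where the model is right about the true class, so its $p_t$ with respect to the wrong label is only 0.05. The code compares the share of the total loss that the mislabeled example takes under CE and under FL with $\gamma=2$ .

```python
import math

def ce(p_t: float) -> float:
    """Cross entropy for one example."""
    return -math.log(p_t)

def fl(p_t: float, gamma: float = 2.0) -> float:
    """Focal loss (without alpha) for one example."""
    return (1.0 - p_t) ** gamma * ce(p_t)

# Hypothetical batch: values chosen only for illustration.
clean_p_t = [0.9] * 99   # correctly labeled, well classified
noisy_p_t = 0.05         # mislabeled, so it looks like a very hard example

def noisy_share(loss_fn) -> float:
    """Fraction of the total batch loss contributed by the mislabeled example."""
    noisy = loss_fn(noisy_p_t)
    total = noisy + sum(loss_fn(p) for p in clean_p_t)
    return noisy / total

share_ce = noisy_share(ce)
share_fl = noisy_share(fl)
print(f"share of loss from the mislabeled example: CE={share_ce:.3f}, FL={share_fl:.3f}")
```

Under these assumed numbers the single mislabeled example accounts for a much larger fraction of the loss with FL than with CE, which is the mechanism behind the drawback described above.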