Paper Reading: FreeAnchor: Learning to Match Anchors for Visual Object Detection

贾小树

Categories: Paper Reading, Object Detection

Article link: https://blog.csdn.net/j879159541/article/details/114947950

1. Paper Overview

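This paper is not an anchor-free detector; it is an improvement built on RetinaNet. Conventionally, positive and negative samples are assigned according to the IoU between hand-crafted anchors and the ground-truth (GT) boxes. The authors argue that this is suboptimal and instead select positive anchors adaptively: among the many anchors that have a high IoU with an object, the network itself chooses which ones serve as positives (this is what "free anchor" means). As far as I can tell, this is done with maximum likelihood estimation. The procedure selects the most representative anchor from a "bag" of anchors for each object and defines the likelihood of each anchor bag as the largest anchor confidence within it. Maximizing this likelihood ensures that at least one anchor has high confidence for both classification and localization of the object, while most anchors with large classification or localization errors are classified as background. During training, the likelihood is converted into a loss function, which then drives both the training of the CNN-based detector and the object-anchor matching. (Roughly speaking, the focal loss is replaced by the authors' own free anchor loss.)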
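To make the "bag of anchors" idea concrete, here is a minimal sketch of how a candidate bag could be built for one object by ranking anchors by IoU and keeping the top few. The function names `iou` and `build_anchor_bag`, the `bag_size` parameter, and the toy boxes are all hypothetical and chosen for illustration; this is not the released implementation.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_anchor_bag(gt_box, anchors, bag_size):
    """Return indices of the `bag_size` anchors with the highest IoU to gt_box.

    `bag_size` is a hypothetical hyperparameter for this sketch.
    """
    ranked = sorted(
        range(len(anchors)),
        key=lambda j: iou(gt_box, anchors[j]),
        reverse=True,
    )
    return ranked[:bag_size]


# Toy data (made up): one ground-truth box and four anchors.
gt = (1, 1, 11, 11)
anchors = [
    (0, 0, 10, 10),
    (1, 1, 12, 12),
    (5, 5, 15, 15),
    (20, 20, 30, 30),
]
bag = build_anchor_bag(gt, anchors, bag_size=2)
print(bag)  # indices of the two anchors that overlap the object most
```

The bag only narrows down the candidates; which anchor inside the bag ends up acting as the positive is left to training.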

Note: I did not fully understand this paper, and I have not yet found a good explanatory write-up to refer to.

Code is available at https://github.com/zhangxiaosong18/FreeAnchor

To fulfill these objectives, we formulate object-anchor matching as a maximum likelihood estimation (MLE) procedure [8, 9], which selects the most representative anchor from a "bag" of anchors for each object. We define the likelihood probability of each anchor bag as the largest anchor confidence within it. Maximizing the likelihood probability guarantees that there exists at least one anchor, which has high confidence for both object classification and localization. Meanwhile, most anchors, which have large classification or localization error, are classified as background. During training, the likelihood probability is converted into a loss function, which then drives CNN-based detector training and object-anchor matching.
The contributions of this work are concluded as follows:
• We formulate detector training as an MLE procedure and update hand-crafted anchor assignment to free anchor matching. The proposed approach breaks the IoU restriction, allowing objects to flexibly select anchors under the principle of maximum likelihood.
• We define a detection customized likelihood, and implement joint optimization of object classification and localization in an end-to-end mechanism. Maximizing the likelihood drives network learning to match optimal anchors and guarantees compatibility with the NMS procedure.
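The bag likelihood described in the excerpt above can be sketched in a few lines. The assumptions here are mine: each anchor's confidence is modeled as the product of a classification confidence and a localization confidence (so it is high only when both are high), the loss is the negative log of the bag likelihood, and all names and numbers are hypothetical. It illustrates the quoted description only, not the released code.

```python
import math


def anchor_confidences(cls_conf, loc_conf, bag):
    """Joint confidence per anchor in the bag.

    Assumption for this sketch: joint confidence = cls_conf * loc_conf.
    """
    return {j: cls_conf[j] * loc_conf[j] for j in bag}


def bag_likelihood(cls_conf, loc_conf, bag):
    """Likelihood of a bag = the largest anchor confidence within it."""
    return max(anchor_confidences(cls_conf, loc_conf, bag).values())


def bag_loss(cls_conf, loc_conf, bag, eps=1e-12):
    """Negative log-likelihood of the bag; `eps` avoids log(0)."""
    return -math.log(max(bag_likelihood(cls_conf, loc_conf, bag), eps))


# Toy confidences (made up) for three anchors in one object's bag.
cls_conf = [0.9, 0.3, 0.8]   # classification confidence per anchor
loc_conf = [0.2, 0.9, 0.7]   # localization confidence per anchor
bag = [0, 1, 2]

conf = anchor_confidences(cls_conf, loc_conf, bag)
best_anchor = max(conf, key=conf.get)  # the anchor the bag effectively selects
print(best_anchor, bag_likelihood(cls_conf, loc_conf, bag), bag_loss(cls_conf, loc_conf, bag))
```

In this toy case anchor 0 classifies well but localizes poorly, and anchor 1 is the opposite; only anchor 2 is good at both, so it determines the bag likelihood. Minimizing the loss pushes at least one anchor per object toward high confidence on both tasks.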

2. Why hand-crafted IoU-based anchor assignment is unreasonable

On the one hand, for objects of acentric features, e.g., slender objects, the most representative features are not close to object centers. A spatially aligned anchor might correspond to fewer representative features, which deteriorate classification and localization capabilities.
On the other hand, it is infeasible to match proper anchors/features for objects using IoU when multiple objects come together.
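For contrast, the hand-crafted rule being criticized can be sketched as a fixed IoU threshold test. The thresholds below (positive at IoU >= 0.5, background below 0.4, ignored in between) follow the commonly described RetinaNet defaults and are an assumption here, as are the function names and toy boxes.

```python
BACKGROUND = -1  # anchor treated as a negative sample
IGNORED = -2     # anchor excluded from the loss


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_by_iou(anchors, gt_boxes, pos_thr=0.5, neg_thr=0.4):
    """Label each anchor with a GT index, BACKGROUND, or IGNORED.

    The decision depends only on box overlap, never on what the
    network actually predicts from that anchor's features.
    """
    labels = []
    for anchor in anchors:
        ious = [iou(anchor, gt) for gt in gt_boxes]
        best = max(range(len(gt_boxes)), key=lambda i: ious[i])
        if ious[best] >= pos_thr:
            labels.append(best)
        elif ious[best] < neg_thr:
            labels.append(BACKGROUND)
        else:
            labels.append(IGNORED)
    return labels


# Toy data (made up): one object and four anchors.
gt_boxes = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10), (0, 0, 10, 6), (0, 0, 9, 5), (20, 20, 30, 30)]
labels = assign_by_iou(anchors, gt_boxes)
print(labels)
```

Because the rule looks only at geometry, an anchor that overlaps an object well is made positive even when the features it covers are unrepresentative, which is the weakness the two points above describe.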

3. Three criteria for matching anchors to the GT

We propose a learning-to-match approach for object detection, and target at discarding hand-crafted anchor assignment while optimizing learning procedures of visual object detection from three specific aspects. First, to achieve a high recall rate, the detector is required to guarantee that for each object at least one anchor's prediction is close to the ground-truth. Second, in order to achieve high detection precision, the detector needs to classify anchors with poor localization (large bounding box regression error) into background. Third, the predictions of anchors should be compatible with the non-maximum suppression (NMS) procedure, i.e., the higher the classification score is, the more accurate the localization is. Otherwise, an anchor with accurate localization but low classification score could be suppressed when using the NMS process.
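The third criterion is easy to see with a toy greedy NMS. In the sketch below (function names, threshold, and boxes are all made up for illustration), the prediction that matches the ground truth exactly has the lower classification score and is suppressed by a less accurate but higher-scoring neighbor.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only
    if it does not overlap an already kept box by more than iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


gt = (0, 0, 10, 10)
boxes = [
    (0, 0, 10, 10),  # perfectly localized prediction
    (2, 0, 12, 10),  # shifted prediction
]
scores = [0.6, 0.9]  # the shifted box has the higher classification score

keep = nms(boxes, scores)
print(keep)                                    # only the shifted box survives
print([iou(boxes[i], gt) for i in range(2)])   # localization quality of each box
```

NMS ranks by classification score alone, so unless the score tracks localization quality, the better box can be the one that gets discarded.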