Abstract:
This paper proposes SAPD, a detector with a favorable speed/accuracy trade-off. It studies two questions:
how to make the anchor-free detection head better?
how to utilize the power of the feature pyramid better?
Two softened optimization (training) strategies are proposed:
1) soft-weighted anchor points
2) soft-selected pyramid levels
Experiments show that our simple SAPD pushes the speed/accuracy trade-off to a new level, achieving a single-model single-scale AP of 47.4% on COCO while running 5x faster than other detectors with comparable accuracy.
———————————————————————————————————————————————
Object detection methods are divided into anchor-based and anchor-free; anchor-free methods fall into two main categories, anchor-point and key-point detection.
Anchor-point methods:
Bounding boxes are encoded/decoded as anchor points via point-to-boundary distances; each anchor point is a pixel on a pyramid feature map carrying the corresponding distance information.
Advantages: 1. simple framework 2. fast training 3. benefits from larger backbones 4. flexible feature level selection
Disadvantage: at the same image scale, mAP is lower
Key-point methods:
Predict the key points of the bounding box.
Advantage: high mAP even with a small input image size
Disadvantages: downsampling cannot be too aggressive; they need a single high-resolution feature map with repeated top-down and bottom-up processing, so FLOPs, training time, and memory consumption are large, and compatibility with pre-trained backbones is poor.
Without DCN (deformable convolution): 5x faster
With DCN (deformable convolution): higher mAP
General anchor-point detector framework
1. Feature pyramid: level l has a resolution of input-size × (1/s_l), where s_l = 2^l.
2. Head: two subnetworks, classification (each anchor point outputs the probability of K classes) and regression (4 values).
3. Supervision targets: p^l_ij denotes the feature at location (i, j) on level l, with i = 0, 1, …, W/s_l − 1 and j = 0, 1, …, H/s_l − 1. Each p^l_ij has a corresponding image-space location (X^l_ij, Y^l_ij), where X^l_ij = s_l(i + 0.5) and Y^l_ij = s_l(j + 0.5).
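The index-to-image-coordinate mapping above can be sketched as a minimal helper (purely illustrative; the function name is mine, not from the paper):

```python
# Map a feature-map index (i, j) on pyramid level l to its
# image-space anchor-point location, per X^l_ij = s_l * (i + 0.5).
def anchor_point_location(level, i, j):
    stride = 2 ** level          # s_l = 2^l
    x = stride * (i + 0.5)
    y = stride * (j + 0.5)
    return x, y

# Example: on level 3 (stride 8), index (0, 0) maps to the
# center of the first 8x8 cell.
print(anchor_point_location(3, 0, 0))  # (4.0, 4.0)
```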
Positive sample assignment: for a ground-truth box B = (c, x, y, w, h), the box is shrunk around its center into a valid region B_v = (c, x, y, εw, εh); only anchor points falling inside B_v are treated as positive.
The final normalized left, top, right, and bottom distances are:
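The formula that belongs here did not survive extraction; the following is a hedged reconstruction assuming the FSAF-style encoding (distances from the anchor point to the four box boundaries, normalized by the stride s_l and a constant z, with z = 4.0 in FSAF):

```latex
d^{l}_{ij} = \frac{1}{z\, s_l}\Big( X^{l}_{ij} - x_{\min},\; Y^{l}_{ij} - y_{\min},\; x_{\max} - X^{l}_{ij},\; y_{\max} - Y^{l}_{ij} \Big)
```

where (x_min, y_min) and (x_max, y_max) are the top-left and bottom-right corners of B.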
- Loss functions: focal loss for classification, IoU loss for regression
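As a sketch of the two losses named above, here are single-sample versions (the α and γ values are the common focal-loss defaults, not taken from this paper, and the IoU loss is the plain 1 − IoU form):

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one binary prediction p in (0, 1) against label y in {0, 1}."""
    pt = p if y == 1 else 1 - p          # probability of the true class
    a = alpha if y == 1 else 1 - alpha   # class-balancing factor
    return -a * (1 - pt) ** gamma * math.log(pt)

def iou_loss(box_a, box_b):
    """1 - IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return 1.0 - inter / (area_a + area_b - inter)
```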
Two novel strategies
1. SW (soft weighting)
The weighting formula uses a parameter η to control how quickly the weight falls off.
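A minimal sketch of the soft-weighting idea, assuming an FCOS-style centerness score raised to the power η (an illustrative assumption; the paper's exact formula may differ):

```python
def soft_weight(d_left, d_top, d_right, d_bottom, eta=2.0):
    """Down-weight anchor points far from the box center.
    Uses a centerness-style score in [0, 1]; eta controls the fall-off.
    NOTE: illustrative assumption, not the paper's exact formula."""
    c = ((min(d_left, d_right) / max(d_left, d_right))
         * (min(d_top, d_bottom) / max(d_top, d_bottom)))
    return c ** eta

# At the box center all four distances are equal -> weight 1.0;
# the weight decays toward 0 as the point approaches a boundary.
```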
2. SS (soft selection)![在这里插入图片描述](https://i-blog.csdnimg.cn/blog_migrate/9d416c487d4aae1eebc3c95b898698ec.png)
Instance-dependent: each instance is fed through every pyramid level -> RoI Align -> concat -> meta-selection network -> a vector of per-level weights
The meta-selection network is jointly trained with the detector. Cross-entropy loss is used for optimization, and the ground truth is a one-hot vector indicating which pyramid level has the minimal loss, as defined in the FSAF module.
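The pipeline above can be sketched as a toy numpy example (shapes, the single linear layer, and all dimensions are made up for illustration; the real meta-selection network is a small learned network operating on RoI-Aligned features):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def meta_select(per_level_feats, w, b):
    """per_level_feats: list of L flattened RoI-Aligned feature vectors,
    one per pyramid level. Concatenate them, apply one linear layer,
    and softmax to get a soft weight per level."""
    x = np.concatenate(per_level_feats)   # concat across levels
    logits = w @ x + b                    # (L,) per-level scores
    return softmax(logits)                # soft per-level weights, sum to 1

rng = np.random.default_rng(0)
L, d = 5, 16                              # 5 pyramid levels, toy feature dim
feats = [rng.standard_normal(d) for _ in range(L)]
w = rng.standard_normal((L, L * d))
b = np.zeros(L)
probs = meta_select(feats, w, b)          # one soft weight per pyramid level
```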
Loss![在这里插入图片描述](https://i-blog.csdnimg.cn/blog_migrate/a79fc6cdc3b6a68324907a50b59c26e6.png)
———————————————————————————————————————————————
Ablation studies
The results show that at higher pyramid levels, larger instances tend to receive higher weights. Most instances can be learned with no more than two levels; only very rare instances need more than two, e.g., the couch in the top-right sub-image of Figure 7. This is consistent with the results in Table 4.
Joint training of the meta-selection network has a negligible effect on performance.
Some might attribute this to multi-task training producing the strong results, but the meta-selection network here is only jointly trained; its weights are not used during evaluation. In other words, the feature selection strategy is the same as in the baseline FSAF module.
SAPD is robust and efficient
Upgrading the backbone easily improves AP and AR.
Compared with anchor-based methods, experiments show that SAPD not only runs fast (thanks to its simple head structure) but is also more accurate, outperforming even methods that combine anchor-based and anchor-free branches by significant margins.
Without DCN, our fastest SAPD version based on ResNet-50 reaches 14.9 FPS while maintaining 41.7% AP.
With DCN, our SAPD forms an upper envelope over recent state-of-the-art anchor-based and anchor-free detectors.
———————————————————————————————————————————————
Supplement
FSAF