Paper notes | What makes for effective detection proposals?

bea_tree

Article link: https://blog.csdn.net/bea_tree/article/details/51851804


Surveys of proposal methods cited in Faster R-CNN:

J. Hosang, R. Benenson, and B. Schiele, "How good are detection proposals, really?" in British Machine Vision Conference (BMVC), 2014.
J. Hosang, R. Benenson, P. Dollár, and B. Schiele, "What makes for effective detection proposals?" IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015.
N. Chavali, H. Agrawal, A. Mahendru, and D. Batra, "Object-Proposal Evaluation Protocol is 'Gameable'," arXiv:1505.05836, 2015.

This site is good and many of the author's posts are well written, so it is worth following:
http://zhangliliang.com/2015/05/19/paper-note-object-proposal-review-pami15/

Authors

Jan Hosang, Rodrigo Benenson, Piotr Dollár, and Bernt Schiele

Abstract

The paper compares 12 proposal methods and introduces average recall (AR) as an evaluation metric.

1 Introduction

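With sliding windows, roughly $10^4$ to $10^5$ windows have to be classified per object, and handling multiple objects and different bounding-box aspect ratios adds several more orders of magnitude. This is why detection proposal methods are needed.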
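To make the window count concrete, here is a rough sketch that counts sliding windows placed on a regular grid. The function name, the 64-pixel base size, the stride, and the scale and aspect-ratio sets are all illustrative assumptions, not values taken from the paper.

```python
def count_sliding_windows(img_w, img_h, base=64, stride=4,
                          scales=(1.0,), aspect_ratios=(1.0,)):
    """Count windows on a stride grid for every (scale, aspect ratio) pair.

    Each window has area (base * scale)^2 and width/height ratio `ar`.
    All defaults are illustrative assumptions.
    """
    total = 0
    for s in scales:
        for ar in aspect_ratios:
            w = round(base * s * ar ** 0.5)
            h = round(base * s / ar ** 0.5)
            if w > img_w or h > img_h:
                continue  # this window shape does not fit in the image
            nx = (img_w - w) // stride + 1  # horizontal positions
            ny = (img_h - h) // stride + 1  # vertical positions
            total += nx * ny
    return total


single = count_sliding_windows(500, 375)
multi = count_sliding_windows(500, 375,
                              scales=(0.5, 1.0, 2.0, 4.0),
                              aspect_ratios=(0.5, 1.0, 2.0))
print(single, multi)
```

Even a single scale on a 500x375 image gives several thousand windows under these assumptions, and each extra scale or aspect ratio adds another full grid of positions.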

2 Detection proposal methods

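The authors divide the methods into two classes:
1. Grouping methods, e.g. Selective Search: break the image into pieces, then merge them.
2. Window scoring methods, e.g. Objectness: generate many candidate windows, score each one, and keep the high-scoring windows.
An overview first:
[Figure: overview of the detection proposal methods]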

2.1 Grouping proposal methods

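The authors split these into three types: grouping superpixels (SP), graph cut (GC), and edge contours (EC). Each is described below:
1. Selective Search (SP): see http://blog.csdn.net/mao_kun/article/details/50576003 for details. For what superpixels are, see http://www.kev-smith.com/papers/SLIC_Superpixels.pdf , which looks rather neat. SS has no learned parameters; the merging rules are hand-defined.
2. RandomizedPrim's (SP): uses the same features as SS, but the merging is randomized and the merge probabilities are learned. It is faster.
3. Rantalankila (SP): uses the same merging strategy as SS but different features; the resulting segments are then used as seeds for graph cuts.
4. Chang (SP): combines saliency and objectness to merge superpixels into a segmentation.
5. CPMC (GC): for background on graph cut see http://blog.csdn.net/zouxy09/article/details/8532111 . It avoids an initial segmentation, computes graph cuts with several different seeds and unaries, and ranks the results using a large pool of features.
6. Endres (GC): builds a hierarchical segmentation from occlusion boundaries, runs graph cuts with different seeds and parameters, and ranks the proposals so as to favour diversity.
7. Rigor (GC): essentially an improved CPMC. It is faster because it reuses computation across multiple graph cuts and uses fast edge detection.
8. Geodesic (EC): EC methods segment directly from edge contours. This one starts from an over-segmentation produced by a fast edge detector; classifiers are used to place the seeds of a geodesic distance transform, and the level sets of each distance transform define the proposals. See the original paper for the details.
9. MCG (EC): builds on fast edges with a fast multiscale hierarchical segmentation, merges regions based on edge strength, and ranks the results by cues such as size, location, shape, and edge strength.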
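The shared idea behind the superpixel grouping methods (start from small regions, merge repeatedly, emit one box per merged region) can be sketched as below. This is a toy illustration and not any of the listed algorithms: the region tuple layout, the single scalar "colour" feature, and the decision to consider every pair rather than only neighbouring regions are all simplifying assumptions.

```python
def merge_boxes(a, b):
    """Smallest box (x1, y1, x2, y2) enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def greedy_grouping(regions):
    """Toy hierarchical grouping.

    regions: list of (box, pixel_count, mean_colour) tuples (hypothetical layout).
    Returns the box of every region that appears in the merge hierarchy.
    """
    regions = list(regions)
    proposals = [box for box, _, _ in regions]
    while len(regions) > 1:
        # Pick the most similar pair. Real methods only consider
        # neighbouring regions and use much richer similarity measures.
        i, j = min(
            ((i, j) for i in range(len(regions)) for j in range(i + 1, len(regions))),
            key=lambda ij: abs(regions[ij[0]][2] - regions[ij[1]][2]),
        )
        (box_a, n_a, col_a), (box_b, n_b, col_b) = regions[i], regions[j]
        merged = (
            merge_boxes(box_a, box_b),
            n_a + n_b,
            (col_a * n_a + col_b * n_b) / (n_a + n_b),  # size-weighted mean colour
        )
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
        proposals.append(merged[0])
    return proposals


toy_regions = [
    ((0, 0, 10, 10), 100, 0.10),
    ((10, 0, 20, 10), 100, 0.12),
    ((50, 50, 60, 60), 100, 0.90),
]
grouped = greedy_grouping(toy_regions)
print(grouped)
```

Starting from n regions, this produces 2n - 1 boxes: the n initial ones plus one for each merge.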

2.2 Window scoring proposal methods

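These methods are generally fast but less accurate, so some of them further refine the generated windows.
1. Objectness: first selects proposals from salient locations in the image, then scores them using cues such as colour, edges, location, size, and the strong superpixel straddling cue.
2. Rahtu: the initial pool of proposals consists of single superpixels, pairs and triplets of superpixels, and multiple randomly sampled boxes. They are scored as in Objectness, with additional low-level features, and the authors stress the importance of NMS.
3. Bing: a classifier trained on edge features and applied to sliding windows. It is very fast, but it has been pointed out that the classifier itself contributes little.
4. EdgeBoxes (EC): also starts from sliding windows, but scores them using object boundary estimates obtained from decision forests, followed by a series of refinements. It has no learned parameters and also uses NMS.
5. Feng: scores sliding windows by saliency; the authors introduce new saliency measures (including how easily a window blends with the background).
6. Zhang: trains a cascade of SVM classifiers on category-specific data, applied to gradient features of the windows; how well it generalizes is unclear.
7. RandomizedSeed: scores candidate boxes using superpixel maps computed from multiple random seeds. The scoring resembles the strong superpixel straddling cue in Objectness, and the authors stress the importance of using multiple superpixel maps.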
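Most of the window scoring methods above share one skeleton: score every candidate window, sort by score, and suppress near-duplicates with NMS. A minimal sketch of that skeleton follows; `score_fn` is a hypothetical stand-in for whatever cue a given method uses, and the thresholds are arbitrary assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def top_proposals(windows, score_fn, iou_thresh=0.7, max_out=100):
    """Greedy NMS: keep a window only if it does not overlap a kept one too much."""
    kept = []
    for w in sorted(windows, key=score_fn, reverse=True):
        if all(iou(w, k) <= iou_thresh for k in kept):
            kept.append(w)
            if len(kept) == max_out:
                break
    return kept


windows = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
# Hypothetical scorer that simply prefers windows further to the left.
kept = top_proposals(windows, score_fn=lambda b: -b[0], iou_thresh=0.5)
print(kept)
```

The NMS threshold trades off diversity against localization: a low threshold spreads the proposals out, while a high threshold keeps many near-duplicates around each high-scoring location.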

2.3 Alternative proposal methods

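ShapeSharing: as the name suggests, shapes are shared between exemplars and the test image by matching edges (no learned parameters); the results are then merged and refined using graph cut.
Multibox: uses a neural network to directly regress a fixed number of proposals, with biases used to diversify the proposal locations. It works well.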

2.4 Baseline proposal methods

The authors use four methods as baselines: Uniform, Gaussian, SlidingWindow, and Superpixels.
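As an illustration of how simple such a baseline can be, here is a sketch of a uniform random box sampler. My understanding is that the Uniform baseline samples the box centre, square-root area, and log aspect ratio uniformly; treat that parameterisation as an assumption. The sampling ranges below are placeholders, not the values estimated in the paper.

```python
import math
import random


def uniform_proposals(img_w, img_h, n, sqrt_area_range=(20.0, 300.0),
                      log_ar_range=(-1.0, 1.0), seed=0):
    """Sample n random boxes (x1, y1, x2, y2), clipped to the image.

    Centre, sqrt(area) and log(width/height) are each drawn uniformly.
    The ranges are placeholder assumptions.
    """
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        cx = rng.uniform(0, img_w)
        cy = rng.uniform(0, img_h)
        side = rng.uniform(*sqrt_area_range)         # sqrt of the box area
        ar = math.exp(rng.uniform(*log_ar_range))    # width / height
        w = side * math.sqrt(ar)
        h = side / math.sqrt(ar)
        boxes.append((
            max(0.0, cx - w / 2), max(0.0, cy - h / 2),
            min(float(img_w), cx + w / 2), min(float(img_h), cy + h / 2),
        ))
    return boxes


boxes = uniform_proposals(500, 375, n=50)
print(boxes[:3])
```

A Gaussian variant would replace the three uniform draws with samples from a distribution fitted to training boxes; baselines like these show how much of a method's recall comes from dataset statistics alone.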

2.5 Proposals versus cascades

A cascade uses fast but inaccurate classifiers to discard bad proposals.

3 Proposal Repeatability

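Repeatability is the ability of an algorithm to produce similar proposals under various perturbations. The authors test it by perturbing image quality, illumination, blur, noise, and so on, compare the proposals using IoU, and define repeatability as the area under the recall vs. IoU threshold curve.
To avoid the influence of large windows, the authors split the proposals into 10 groups and average over the groups to get the final result. CPMC and Endres are too slow, so they were left out of this experiment.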
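The repeatability score described above can be sketched as follows. This is a simplified reading under stated assumptions: the proposals from the perturbed image are assumed to be already mapped back into the reference image's coordinates, the grouping into 10 size bins is omitted, and the function names and the 100-step integration are my own choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def repeatability(ref_boxes, perturbed_boxes, steps=100):
    """Area under the recall vs. IoU threshold curve, thresholds in [0, 1]."""
    if not ref_boxes:
        return 0.0
    # Best match in the perturbed image for every reference proposal.
    best = [max((iou(r, p) for p in perturbed_boxes), default=0.0)
            for r in ref_boxes]

    def recall(t):
        return sum(1 for b in best if b >= t) / len(best)

    rs = [recall(i / steps) for i in range(steps + 1)]
    # Trapezoid rule over the evenly spaced thresholds.
    return sum((rs[i] + rs[i + 1]) / 2 for i in range(steps)) / steps


same = repeatability([(0, 0, 10, 10)], [(0, 0, 10, 10)])
disjoint = repeatability([(0, 0, 10, 10)], [(50, 50, 60, 60)])
print(same, disjoint)
```

A method that returns exactly the same boxes after the perturbation scores close to 1, and one whose boxes no longer overlap the originals scores close to 0.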
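Results:
1. Scale: large effect; Bing does somewhat better.
2. JPEG artefacts: Bing is best.
3. Rotation: all methods are about the same.
4. Illumination: similar trends; Bing does better.
5. Blur: similar.
6. Salt and pepper noise: this factor has a fairly large effect.
Conclusion: Bing performs best on repeatability, and EdgeBoxes is also good.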

4 Proposal Recall

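Three evaluations:
1. Recall as a function of the IoU threshold, for a fixed number of proposals.
2. Recall as a function of the number of proposals, for a fixed IoU threshold.
3. The average recall (AR) proposed by the authors: recall averaged over IoU thresholds between 0.5 and 1, as a function of the number of proposals.
Conclusions:
MCG, EdgeBoxes, SelectiveSearch, Rigor, and Geodesic perform best, with SS the most efficient; with fewer than 1000 proposals, MCG, Endres, and CPMC do very well.
The methods fall into two groups: (1) well localized, where recall decreases gradually as the IoU threshold increases; (2) poorly localized, where recall drops rapidly. Bing, Rahtu, Objectness, and EdgeBoxes belong to the second group, Bing most severely.
Measured by AR, MCG performs best; Endres and EdgeBoxes do well with few proposals, while Rigor and SS do well with many proposals.
The methods also behave similarly across different datasets.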
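AR for a fixed set of proposals can be computed without sampling thresholds. If recall at threshold o is the fraction of ground truth boxes whose best-matching proposal has IoU of at least o, then integrating over o from 0.5 to 1 gives max(IoU_i - 0.5, 0) for each ground truth box i, and dividing by the interval length 0.5 gives a factor of 2. The sketch below follows that reading; the function names are my own.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_recall(gt_boxes, proposals):
    """Recall averaged over IoU thresholds in [0.5, 1].

    Equals 2 * mean(max(best_iou - 0.5, 0)) over the ground truth boxes.
    """
    if not gt_boxes:
        return 0.0
    total = 0.0
    for g in gt_boxes:
        best = max((iou(g, p) for p in proposals), default=0.0)
        total += max(best - 0.5, 0.0)
    return 2.0 * total / len(gt_boxes)


gt = [(0, 0, 10, 10)]
print(average_recall(gt, [(0, 0, 10, 10)]))   # perfect localization
print(average_recall(gt, [(0, 0, 10, 7.5)]))  # best IoU of 0.75
```

To get the AR vs. number of proposals curve, call `average_recall` on the top-k proposals for increasing k. Unlike recall at a single threshold, AR rewards both finding the object and localizing it tightly.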

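5 Using the detection proposals

Comparison of the proposal methods when used with different detectors:
[Figure: detection results of each proposal method with different detectors]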

5.2 LM-LLDA

Top 5: MCG, SS, EdgeBoxes, Geodesic, Rigor.
For objects such as bicycles, none of these methods performs very well.

5.3 R-CNN

The pretrained R-CNN and Fast R-CNN models were both trained on SS proposals; the last row shows the results after retraining.

5.4 AR

AR correlates well with mAP.

5.5 Tuning

Using AR to guide the tuning of a proposal method (EdgeBoxes) gave better results.

5.6 Detection with oracles

all ground truth annotations
NMS
Both oracles improve the results, with NMS giving the largest gain.

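6 Discussion

Top methods:
SS: merges superpixels
Rigor: graph cut
MCG: generates hierarchical segmentations
EdgeBoxes: scores windows
1. These methods work on different principles but perform similarly.
2. Localization accuracy (IoU) and recall both matter.
3. Repeatability: no method has very good repeatability. Better repeatability may improve final performance, but not necessarily.
4. Important properties: repeatability, recall, localisation, speed.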
