Understanding the Faster R-CNN Paper

qq_29631521

Code link

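First, a backbone network (for example VGG16) produces the feature map, and the RPN then generates anchors on top of it. Anchors are hand-designed candidate boxes with a fixed set of scales and aspect ratios; with 3 scales and 3 ratios, for instance, there are 9 anchors. Centering these anchors in turn on every position of the feature map yields more than 20,000 anchors. The code that slides the anchors over the feature map is shown below (from proposal_layer.py):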

shift_x = np.arange(0, feat_width) * self._feat_stride   # np.arange(0, feat_width) yields the integers 0 .. feat_width - 1, i.e. it walks over the feature-map columns
shift_y = np.arange(0, feat_height) * self._feat_stride
shift_x, shift_y = np.meshgrid(shift_x, shift_y)
shifts = torch.from_numpy(np.vstack((shift_x.ravel(), shift_y.ravel(),
                                  shift_x.ravel(), shift_y.ravel())).transpose())
shifts = shifts.contiguous().type_as(scores).float()
A = self._num_anchors # number of anchors per position
K = shifts.size(0) # number of rows in shifts, one per feature-map position
self._anchors = self._anchors.type_as(scores) # cast _anchors to the same type as scores
anchors = self._anchors.view(1, A, 4) + shifts.view(K, 1, 4)  # self._anchors holds the hand-designed anchors; adding shifts offsets them, which is what slides them over the feature map
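The excerpt above depends on class state (`self._anchors`, `self._feat_stride`), so it cannot be run on its own. The following is a minimal standalone NumPy sketch of the same idea. The function names, the base size of 16, the ratios (0.5, 1, 2), the scales (8, 16, 32), and the 37 x 50 feature map are assumptions chosen for illustration, and the base anchors are simply centred on the origin rather than reproducing any particular repository's generate_anchors routine.

```python
import numpy as np

def make_base_anchors(base_size=16, ratios=(0.5, 1.0, 2.0), scales=(8, 16, 32)):
    """Return len(ratios) * len(scales) anchors as (x1, y1, x2, y2), centred on the origin."""
    anchors = []
    for ratio in ratios:              # ratio is height / width
        for scale in scales:
            side = base_size * scale  # side length of the square anchor at this scale
            w = side / np.sqrt(ratio)
            h = side * np.sqrt(ratio)
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors, dtype=np.float32)

def shift_anchors(base_anchors, feat_height, feat_width, feat_stride=16):
    """Place a copy of every base anchor at every feature-map position."""
    shift_x = np.arange(feat_width) * feat_stride
    shift_y = np.arange(feat_height) * feat_stride
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)
    shifts = np.stack([shift_x.ravel(), shift_y.ravel(),
                       shift_x.ravel(), shift_y.ravel()], axis=1)   # (K, 4)
    # (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4)
    all_anchors = base_anchors[None, :, :] + shifts[:, None, :]
    return all_anchors.reshape(-1, 4)

base = make_base_anchors()
anchors = shift_anchors(base, feat_height=37, feat_width=50)
print(base.shape, anchors.shape)
```

With a stride of 16, neighbouring positions on the feature map are 16 pixels apart in the input image, so the same 9 shapes are repeated on a regular grid. Larger input images give larger feature maps, which is how the count reaches the 20,000-plus figure mentioned above.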

The RPN involves two components: AnchorTargetCreator and ProposalCreator.
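AnchorTargetCreator then samples 256 positive and negative examples from the 20,000-plus anchors to train the RPN. The sampling rules are:
1> For each ground-truth bounding box (gt_bbox), the anchor with the highest overlap (IoU) with it is taken as a positive sample.
2> Among the remaining anchors, any anchor whose IoU with some gt_bbox exceeds 0.7 is also taken as a positive sample. The number of positive samples is capped at 128.
3> Anchors whose IoU with the gt_bboxes is below 0.3 are randomly chosen as negative samples. Positive and negative samples together total 256.
Note: AnchorTargetCreator is used only during training. The corresponding code is in anchor_target_layer.py.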
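The three sampling rules can be sketched in NumPy as follows. This is an illustrative sketch, not the code in anchor_target_layer.py: the function names `box_iou` and `assign_anchor_labels`, the label convention (1 positive, 0 negative, -1 ignored), and the random toy boxes are all assumptions. Only the thresholds 0.7 and 0.3 and the counts 128 and 256 come from the text.

```python
import numpy as np

def box_iou(boxes_a, boxes_b):
    """Pairwise IoU between (N, 4) and (M, 4) boxes in (x1, y1, x2, y2) format."""
    top_left = np.maximum(boxes_a[:, None, :2], boxes_b[None, :, :2])
    bottom_right = np.minimum(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    inter = np.prod(np.clip(bottom_right - top_left, 0, None), axis=2)
    area_a = np.prod(boxes_a[:, 2:] - boxes_a[:, :2], axis=1)
    area_b = np.prod(boxes_b[:, 2:] - boxes_b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def assign_anchor_labels(anchors, gt_boxes, n_sample=256, pos_iou=0.7,
                         neg_iou=0.3, max_pos=128, rng=None):
    """Return one label per anchor: 1 = positive, 0 = negative, -1 = ignored."""
    rng = np.random.default_rng() if rng is None else rng
    iou = box_iou(anchors, gt_boxes)          # (num_anchors, num_gt)
    max_iou = iou.max(axis=1)                 # best overlap of each anchor
    labels = np.full(len(anchors), -1, dtype=np.int64)

    labels[max_iou < neg_iou] = 0             # rule 3: negative candidates
    labels[iou.argmax(axis=0)] = 1            # rule 1: best anchor for each gt box
    labels[max_iou > pos_iou] = 1             # rule 2: IoU above 0.7

    # Cap the positives at max_pos by ignoring a random surplus.
    pos_idx = np.flatnonzero(labels == 1)
    if len(pos_idx) > max_pos:
        drop = rng.choice(pos_idx, size=len(pos_idx) - max_pos, replace=False)
        labels[drop] = -1

    # Fill the rest of the batch with randomly kept negatives.
    n_neg = n_sample - np.count_nonzero(labels == 1)
    neg_idx = np.flatnonzero(labels == 0)
    if len(neg_idx) > n_neg:
        drop = rng.choice(neg_idx, size=len(neg_idx) - n_neg, replace=False)
        labels[drop] = -1
    return labels

# Toy data: 2000 random boxes stand in for anchors (an assumption for the demo).
rng = np.random.default_rng(0)
xy = rng.uniform(0, 500, size=(2000, 2))
wh = rng.uniform(20, 200, size=(2000, 2))
anchors = np.hstack([xy, xy + wh])
gt_boxes = np.array([[100.0, 100.0, 260.0, 240.0],
                     [300.0, 50.0, 420.0, 330.0]])
labels = assign_anchor_labels(anchors, gt_boxes, rng=rng)
print((labels == 1).sum(), (labels == 0).sum())
```

Anchors labelled -1 contribute nothing to the RPN loss. Rule 1 is what guarantees every ground-truth box at least one positive anchor even when no anchor reaches the 0.7 threshold.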
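The anchors generated in the first step are turned into RoIs by ProposalCreator. (RoI stands for region of interest; my own understanding is that an RoI is a region formed by many candidate boxes clustered together, with each region corresponding to one object.) The procedure is: 1> compute the probability that each of the 20,000-plus anchors is foreground, keep the 12,000 with the highest probability, and refine their positions with the predicted regression offsets; 2> apply NMS to obtain 2,000 RoIs. Note: ProposalCreator is used in both training and testing. In the code the order is slightly different: the 20,000-plus anchors are first refined using the predicted box offsets, then their foreground probabilities are sorted in descending order and the top 12,000 are kept, and finally NMS is applied and the top 2,000 are retained as the RoIs. The corresponding code is in anchor_target_layer.py and proposal_layer.py.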
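The order used in the code (refine, sort, keep top 12,000, NMS, keep top 2,000) can be sketched as below. This is a simplified NumPy illustration under stated assumptions: the function names are hypothetical, the NMS threshold of 0.7 and the clipping of boxes to the image are not stated in the text and are assumed here, and the greedy `nms` is a plain reference version rather than the compiled one real implementations use.

```python
import numpy as np

def decode_boxes(anchors, deltas):
    """Apply (dx, dy, dw, dh) offsets to anchors given as (x1, y1, x2, y2)."""
    w = anchors[:, 2] - anchors[:, 0]
    h = anchors[:, 3] - anchors[:, 1]
    cx = anchors[:, 0] + 0.5 * w
    cy = anchors[:, 1] + 0.5 * h
    new_cx = cx + deltas[:, 0] * w
    new_cy = cy + deltas[:, 1] * h
    new_w = w * np.exp(deltas[:, 2])
    new_h = h * np.exp(deltas[:, 3])
    return np.stack([new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
                     new_cx + 0.5 * new_w, new_cy + 0.5 * new_h], axis=1)

def nms(boxes, scores, iou_thresh):
    """Greedy NMS; returns the kept indices, highest score first."""
    order = np.argsort(-scores)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        order = rest[iou <= iou_thresh]   # drop boxes overlapping box i too much
    return np.array(keep, dtype=np.int64)

def propose_rois(anchors, deltas, fg_scores, img_h, img_w,
                 pre_nms_top_n=12000, post_nms_top_n=2000, nms_thresh=0.7):
    boxes = decode_boxes(anchors, deltas)                # 1. refine the anchors
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, img_w)   # assumed: clip to image
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, img_h)
    order = np.argsort(-fg_scores)[:pre_nms_top_n]       # 2. top-N by foreground score
    boxes, fg_scores = boxes[order], fg_scores[order]
    keep = nms(boxes, fg_scores, nms_thresh)[:post_nms_top_n]   # 3. NMS, then top-M
    return boxes[keep]

# Tiny example: the first two boxes overlap heavily, so NMS keeps only one of them.
anchors = np.array([[10.0, 10.0, 110.0, 110.0],
                    [12.0, 12.0, 112.0, 112.0],
                    [200.0, 200.0, 300.0, 260.0]])
deltas = np.zeros((3, 4))
fg_scores = np.array([0.9, 0.8, 0.7])
rois = propose_rois(anchors, deltas, fg_scores, img_h=600, img_w=800)
print(rois)
```

Taking the top 12,000 before NMS keeps the quadratic-cost suppression step affordable, while the cap of 2,000 afterwards bounds the number of RoIs passed to the next stage.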
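Fast R-CNN stage. From the 2,000 RoIs produced by the RPN, ProposalTargetCreator selects 128 RoIs, which then go through RoI pooling and are used to train Fast R-CNN. Note: ProposalTargetCreator is used only during training. The corresponding code is in proposal_target_layer_cascaded.py.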
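The text does not say how the 128 RoIs are chosen, so the sketch below fills that in with assumed values: an RoI counts as foreground when its best IoU with a ground-truth box is at least 0.5, at most 25% of the batch is foreground, and the ground-truth boxes themselves are appended to the RoIs. These settings are common in Faster R-CNN implementations but are assumptions here, as are the function names and the random toy boxes; this is not the code in proposal_target_layer_cascaded.py.

```python
import numpy as np

def box_iou(boxes_a, boxes_b):
    """Pairwise IoU between (N, 4) and (M, 4) boxes in (x1, y1, x2, y2) format."""
    top_left = np.maximum(boxes_a[:, None, :2], boxes_b[None, :, :2])
    bottom_right = np.minimum(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    inter = np.prod(np.clip(bottom_right - top_left, 0, None), axis=2)
    area_a = np.prod(boxes_a[:, 2:] - boxes_a[:, :2], axis=1)
    area_b = np.prod(boxes_b[:, 2:] - boxes_b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def sample_rois(rois, gt_boxes, gt_labels, n_sample=128, fg_fraction=0.25,
                fg_thresh=0.5, rng=None):
    """Pick n_sample RoIs and give each a class label (0 = background)."""
    rng = np.random.default_rng() if rng is None else rng
    iou = box_iou(rois, gt_boxes)
    best_gt = iou.argmax(axis=1)      # index of the best-matching gt box per RoI
    max_iou = iou.max(axis=1)

    fg_idx = np.flatnonzero(max_iou >= fg_thresh)   # assumed foreground rule
    bg_idx = np.flatnonzero(max_iou < fg_thresh)    # assumed background rule
    n_fg = min(int(n_sample * fg_fraction), len(fg_idx))
    n_bg = min(n_sample - n_fg, len(bg_idx))
    fg_idx = rng.permutation(fg_idx)[:n_fg]
    bg_idx = rng.permutation(bg_idx)[:n_bg]

    keep = np.concatenate([fg_idx, bg_idx])
    labels = gt_labels[best_gt[keep]].copy()
    labels[n_fg:] = 0                 # background RoIs get class 0
    return rois[keep], labels

# Toy data: 300 random boxes stand in for RPN output (an assumption for the demo).
rng = np.random.default_rng(0)
xy = rng.uniform(0, 500, size=(300, 2))
wh = rng.uniform(20, 200, size=(300, 2))
gt_boxes = np.array([[100.0, 100.0, 260.0, 240.0],
                     [300.0, 50.0, 420.0, 330.0]])
gt_labels = np.array([7, 15])         # hypothetical class indices in 1..20
rois = np.vstack([np.hstack([xy, xy + wh]), gt_boxes])
sampled_rois, sampled_labels = sample_rois(rois, gt_boxes, gt_labels, rng=rng)
print(sampled_rois.shape, (sampled_labels > 0).sum())
```

The sampled RoIs are what RoI pooling crops from the feature map, and the labels are the classification targets for the Fast R-CNN head.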
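In the RPN, NMS has already been applied once to the anchors; at test time, Fast R-CNN applies NMS once more.
In the RPN, the anchor positions have already been adjusted once by regression; in the Fast R-CNN stage the RoIs are adjusted by regression a second time.
In the RPN stage the classification is binary (foreground/background), whereas the Fast R-CNN stage is a 21-way classification (the 20 classes of the VOC dataset plus a background class).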
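For reference, the regression targets used in both stages follow the box parameterisation given in the Faster R-CNN paper. Here $(x, y, w, h)$ are the centre coordinates, width and height of the target box, and the subscript $a$ marks the reference box: the anchor in the RPN stage and, by the same scheme, the RoI in the Fast R-CNN stage.

```latex
\begin{aligned}
t_x &= (x - x_a) / w_a, & t_y &= (y - y_a) / h_a, \\
t_w &= \log(w / w_a),   & t_h &= \log(h / h_a).
\end{aligned}
```

Refining a box means inverting these relations: the centre is shifted by $t_x w_a$ and $t_y h_a$, and the width and height are scaled by $e^{t_w}$ and $e^{t_h}$.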

References
[Paper walkthrough] https://zhuanlan.zhihu.com/p/31426458
https://zhuanlan.zhihu.com/p/32404424
[Anchor explanation] https://blog.csdn.net/ture_dream/article/details/76824889
[RoI pooling explained] https://blog.csdn.net/auto1993/article/details/78514071
