Faster R-CNN Study Notes

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun

Abstract

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features: using the recently popular terminology of neural networks with “attention” mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.

 

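Reference blog posts:

1. Faster R-CNN paper translation (Chinese-English side by side)

2. [RCNN series] [Very detailed walkthrough]

3. Classic network walkthrough series (1): RegionProposal+CNN (rcnn)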

 

RCNN

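RCNN (Regions with CNN) is a representative method that, following DPM (Deformable Parts Model), applies deep learning to multi-object detection. The RCNN pipeline breaks down into the following steps (as shown in Figure 3.16):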

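Step 1: First, segment the original image into many small regions using an image segmentation method, then examine these regions. Two adjacent regions are merged if their color histograms are similar, their texture histograms are similar, the merged region would still be small, or the merged region would fill a large share of its bounding box. This procedure is run in several color spaces (such as RGB and HSV) to reduce the chance of missing a candidate. The resulting image patches are the region proposals, roughly 2000 per image.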
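The merge criteria in Step 1 resemble the similarity measures used by selective search. Below is a minimal Python sketch under that assumption. The function names, the region dictionary layout, the (x1, y1, x2, y2) box convention, and the equal weighting of the four terms are illustrative choices, not something stated in these notes.

```python
def histogram_intersection(hist_a, hist_b):
    """Similarity of two L1-normalised histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def size_similarity(size_a, size_b, image_size):
    """Close to 1 when the merged region would still be small."""
    return 1.0 - (size_a + size_b) / image_size


def fill_similarity(box_a, box_b, size_a, size_b, image_size):
    """Close to 1 when the two regions fill most of their joint bounding box."""
    x1, y1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    x2, y2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    joint_area = (x2 - x1) * (y2 - y1)
    return 1.0 - (joint_area - size_a - size_b) / image_size


def merge_score(region_a, region_b, image_size):
    """Sum of the four criteria; higher means the pair is merged earlier.

    Each region is a dict with hypothetical keys:
    'color', 'texture' (normalised histograms), 'size' (pixels), 'box'.
    """
    return (
        histogram_intersection(region_a["color"], region_b["color"])
        + histogram_intersection(region_a["texture"], region_b["texture"])
        + size_similarity(region_a["size"], region_b["size"], image_size)
        + fill_similarity(region_a["box"], region_b["box"],
                          region_a["size"], region_b["size"], image_size)
    )
```

In a greedy grouping loop, the pair of adjacent regions with the highest score would be merged first, and every region produced along the way would be recorded as a candidate box.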

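Step 2: Preprocess the extracted region proposals by rescaling each one to a uniform 227 × 227 pixels, then feed them into a convolutional neural network for feature extraction, which yields a 4096-dimensional feature vector per region.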
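The rescaling in Step 2 ignores the aspect ratio of the proposal: every crop is stretched to the same fixed size. The sketch below illustrates this with nearest-neighbor sampling on a plain list-of-rows image. The function name `warp_region` and the (x1, y1, x2, y2) box convention with exclusive right and bottom edges are assumptions made for illustration; a real pipeline would use an image library and a smoother interpolation.

```python
def warp_region(image, box, out_size=227):
    """Crop `box` from `image` and stretch it to out_size x out_size.

    `image` is a list of rows; `box` is (x1, y1, x2, y2) with x2 and y2 exclusive.
    Nearest-neighbor sampling keeps the sketch dependency-free.
    """
    x1, y1, x2, y2 = box
    crop_w, crop_h = x2 - x1, y2 - y1
    warped = []
    for row in range(out_size):
        src_y = y1 + row * crop_h // out_size  # map output row to source row
        warped.append([
            image[src_y][x1 + col * crop_w // out_size]  # map output col to source col
            for col in range(out_size)
        ])
    return warped
```

The warped crop is what would then be passed to the CNN; the 4096-dimensional vector comes from the network itself and is not modeled here.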

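Step 3: Classify each candidate box. In a trained model, every class has its own feature-vector representation. The 4096-dimensional feature extracted from a candidate box is passed through a linear binary SVM classifier for each class. The decision process is illustrated in Figure 3.16: is the feature an airplane? No. Is it a monitor? No. Is it a person? Yes. The criterion is the distance between the extracted feature vector and the feature vector of the class in question (for a linear SVM, this amounts to the signed score of the feature against that class's weight vector).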
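The per-class yes/no decisions in Step 3 can be sketched as one linear SVM score per class. The example below uses 3-dimensional toy features instead of 4096 dimensions, and the weights, biases, and the "background" fallback label are made up for illustration.

```python
def svm_score(features, weights, bias):
    """Linear SVM decision value: positive means 'yes' for this class."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def classify_region(features, class_svms):
    """Ask every binary SVM in turn and return the best-scoring class.

    `class_svms` maps a class name to a (weights, bias) pair.
    If no class scores above zero, the region is treated as background.
    """
    scores = {
        name: svm_score(features, weights, bias)
        for name, (weights, bias) in class_svms.items()
    }
    best = max(scores, key=scores.get)
    label = best if scores[best] > 0 else "background"
    return label, scores


# Toy values only; a trained model would supply 4096-dimensional weights.
toy_svms = {
    "airplane": ([-1.0, 0.0, 0.0], 0.0),
    "monitor": ([0.0, 1.0, -1.0], 0.0),
    "person": ([0.5, 0.0, 1.0], -1.0),
}
label, scores = classify_region([1.0, 0.0, 2.0], toy_svms)
```

With these toy numbers the airplane and monitor classifiers give negative scores and only the person classifier is positive, mirroring the "No, No, Yes" sequence described above.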

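Step 4: Once classification is finished, draw a box around each detected object in the input image. Sometimes the class prediction is correct but the localization is not very accurate, that is, the candidate box does not overlap the true object well. The candidate boxes then need to be corrected: each candidate box is scored by how much of the real object it contains, edge detection is run with the Canny operator, and the highest-scoring candidate box is kept. Note that the original R-CNN paper performs this refinement with a class-specific bounding-box regressor and removes overlapping detections with non-maximum suppression rather than with Canny edge detection.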
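The idea of scoring boxes by overlap and keeping only the best one can be sketched with intersection over union (IoU) and greedy non-maximum suppression. This is one common way to implement the selection, not necessarily the exact procedure these notes have in mind; the function names, the 0.3 threshold, and the (x1, y1, x2, y2) box convention are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Keep the highest-scoring box in each cluster of overlapping boxes.

    Returns the indices of the kept boxes, best score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Drop box i if it overlaps too much with an already kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, two heavily overlapping boxes around the same person collapse to the one with the higher classifier score, while a distant box around another object is kept.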

 

 

 

 

 
