Abstract:
Previous region-based detectors (Fast/Faster R-CNN) repeatedly apply a per-region subnetwork to hundreds (or thousands) of proposals. R-FCN instead shares almost all computation across the entire image, avoiding that redundancy. To make this work, R-FCN proposes position-sensitive score maps to address a dilemma between translation invariance in image classification and translation variance in object detection: roughly, classification wants features that are invariant to where an object is, while detection needs features that change with an object's position and shape. R-FCN adopts a fully convolutional classification network (ResNet) as its backbone. Performance: "We show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is achieved at a test-time speed of 170ms per image, 2.5-20× faster than the Faster R-CNN counterpart."
Introduction:
Prevalent deep-learning object detectors are split by the RoI pooling layer into two subnetworks: first, a shared fully convolutional subnetwork that is independent of RoIs; second, a per-RoI subnetwork whose computation is not shared. This decomposition was historically motivated by classification architectures (AlexNet and VGG Nets). (Details omitted.)
Method:
R-FCN uses an RPN to generate candidate RoIs; the RPN and R-FCN share their feature computation. The last convolutional layer produces a bank of k² position-sensitive score maps for each category, and thus has a k²(C+1)-channel output layer with C object categories (+1 for background). The bank of k² score maps corresponds to a k×k spatial grid describing relative positions. For example, with k×k = 3×3, the 9 score maps encode the cases of {top-left, top-center, top-right, …, bottom-right} of an object category.
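To make the k²(C+1) channel layout concrete, here is a minimal sketch of the channel bookkeeping. The `channel_index` helper and the position-major ordering are assumptions for illustration (the paper does not fix a particular channel order):

```python
def channel_index(i, j, c, k, num_classes):
    """Hypothetical helper: index of the output channel holding the
    score map for relative position (i, j) and class c, assuming a
    position-major layout: all (C+1) class maps for one grid cell
    are stored contiguously."""
    C1 = num_classes + 1  # +1 for background
    return (i * k + j) * C1 + c

# Example: PASCAL VOC has C = 20 classes; with k = 3 the last conv
# layer outputs k^2 * (C + 1) = 9 * 21 = 189 score-map channels.
k, C = 3, 20
total_channels = k * k * (C + 1)
```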
R-FCN ends with a position-sensitive RoI pooling layer. This layer aggregates the outputs of the last convolutional layer and generates scores for each RoI.
Position-sensitive score maps & Position-sensitive RoI pooling. To explicitly encode position information into each RoI, we divide each RoI rectangle into k×k bins by a regular grid. For an RoI rectangle of size w×h, a bin is of size ≈ w/k × h/k. In our method, the last convolutional layer is constructed to produce k² score maps for each category. Inside the (i, j)-th bin (0 ≤ i, j ≤ k−1), we define a position-sensitive RoI pooling operation that pools only over the (i, j)-th score map. Concretely, the pooling is average pooling: for category c, the response of the (i, j)-th bin is the average of the (i, j)-th score map for class c over that bin's area.
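The position-sensitive pooling described above can be sketched in NumPy. This is a minimal, single-RoI sketch under simplifying assumptions: integer RoI coordinates on the feature map, the position-major channel layout `(i*k + j)*(C+1) + c`, and average pooling followed by the paper's average voting over the k² bin scores:

```python
import numpy as np

def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling, a minimal NumPy sketch.

    score_maps: (k*k*(C+1), H, W) output of the last conv layer
    roi: (x0, y0, w, h) in feature-map coordinates (assumed integer)
    Returns: (C+1,) per-class scores after averaging the k*k bins.
    """
    x0, y0, w, h = roi
    C1 = num_classes + 1
    scores = np.zeros((C1, k, k))
    for i in range(k):
        for j in range(k):
            # Bin (i, j) covers a region of size roughly w/k x h/k.
            x_lo = x0 + int(np.floor(j * w / k))
            x_hi = x0 + int(np.ceil((j + 1) * w / k))
            y_lo = y0 + int(np.floor(i * h / k))
            y_hi = y0 + int(np.ceil((i + 1) * h / k))
            for c in range(C1):
                # Pool bin (i, j) ONLY from its dedicated score map.
                m = score_maps[(i * k + j) * C1 + c]
                scores[c, i, j] = m[y_lo:y_hi, x_lo:x_hi].mean()
    # Vote: average the k*k position-sensitive bin scores per class.
    return scores.mean(axis=(1, 2))
```

The key difference from ordinary RoI pooling is the indexing: each bin reads from its own dedicated score map rather than from a shared feature map, which is what forces the maps to specialize to relative positions (top-left of an object, bottom-right, etc.).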