Paper notes: R-FCN: Object Detection via Region-based Fully Convolutional Networks

Author: 这题我会啊

Article link: https://blog.csdn.net/Love_wanling/article/details/64443277

Background

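ResNet works very well on classification, but it cannot be carried over directly to detection. The authors argue that this is because classification favors translation invariance, while detection requires translation variance.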

We propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection.
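To make the invariance/variance dilemma concrete, here is a toy sketch of my own, not code from the paper. It compares a plain global average with a position-sensitive score on a hypothetical 2x2 grid. The names `K`, `aligned_maps`, `global_average_score` and `position_sensitive_score` are made up for illustration, and the whole map is treated as a single RoI.

```python
import numpy as np

K = 2  # hypothetical 2x2 grid of relative positions (the paper calls this k)

def aligned_maps(size=4):
    """Build K*K maps where the map for bin (i, j) fires only inside bin (i, j)."""
    maps = np.zeros((K * K, size, size))
    b = size // K
    for i in range(K):
        for j in range(K):
            maps[i * K + j, i * b:(i + 1) * b, j * b:(j + 1) * b] = 1.0
    return maps

def global_average_score(maps):
    """Translation-invariant summary: average over all channels and positions."""
    return float(maps.mean())

def position_sensitive_score(maps):
    """Bin (i, j) of the RoI reads only from its own map, then the bins are averaged."""
    _, h, w = maps.shape
    bh, bw = h // K, w // K
    scores = [
        maps[i * K + j, i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].mean()
        for i in range(K)
        for j in range(K)
    ]
    return float(np.mean(scores))

maps = aligned_maps()
shifted = np.roll(maps, shift=2, axis=2)  # circularly shift the responses 2 columns

print(global_average_score(maps), global_average_score(shifted))
print(position_sensitive_score(maps), position_sensitive_score(shifted))
```

The global average cannot tell the two inputs apart, while the position-sensitive score drops once the responses no longer line up with the bins. That is the kind of translation variance a detector needs in order to score how well a box fits an object.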

Network architecture

In short, R-FCN is an RPN plus a classification network, where the classification network has the following structure:
ResNet + position-sensitive score maps + position-sensitive RoI pooling
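As a rough sketch of the middle stage, the position-sensitive score maps can be seen as one extra convolutional layer on top of the backbone features, with k²(C+1) output channels. The snippet below assumes a 1x1 convolution, written as an `einsum`, and uses random arrays in place of real ResNet features. The sizes, `depth`, and the function name `position_sensitive_score_maps` are illustrative assumptions.

```python
import numpy as np

def position_sensitive_score_maps(features, weights):
    """1x1 convolution: (D, H, W) features -> (k*k*(C+1), H, W) score maps."""
    return np.einsum("od,dhw->ohw", weights, features)

k, num_classes, depth = 3, 20, 8  # hypothetical sizes; depth stands in for the backbone width
rng = np.random.default_rng(0)
features = rng.standard_normal((depth, 5, 6))  # stand-in for the ResNet output
weights = rng.standard_normal((k * k * (num_classes + 1), depth))

score_maps = position_sensitive_score_maps(features, weights)
print(score_maps.shape)  # (189, 5, 6): 3*3 bins times (20 classes + background)
```

This layer runs once over the whole image, so every RoI reuses the same score maps.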

The k² position-sensitive scores then vote on the RoI. In this paper we simply vote by averaging the scores, producing a (C+1)-dimensional vector for each RoI: $r_c(\theta)=\sum_{i,j}r_c(i,j|\theta)$. Then we compute the softmax responses across categories: $s_c(\theta)=e^{r_c(\theta)}/\sum_{c'=0}^{C}e^{r_{c'}(\theta)}$. They are used for evaluating the cross-entropy loss during training and for ranking the RoIs during inference.
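The pooling and voting steps can be sketched in NumPy as below. This is my own minimal version under stated assumptions, not the authors' implementation. It assumes the channel for bin (i, j) and class c sits at index (i*k + j)*(C+1) + c, that the RoI is given in integer feature-map coordinates, and that each bin is average-pooled. The vote uses the mean over bins, following the word "averaging" in the quote; this differs from the sum in the formula only by the constant factor k². `ps_roi_pool` and `vote_and_softmax` are hypothetical names.

```python
import numpy as np

def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling.

    score_maps: (k*k*(C+1), H, W); roi: (x0, y0, w, h) in feature-map pixels.
    Returns r_c(i, j) as an array of shape (C+1, k, k).
    """
    c1 = num_classes + 1
    x0, y0, w, h = roi
    # Assumed layout: channel (i*k + j)*c1 + c holds bin (i, j) of class c.
    maps = score_maps.reshape(k, k, c1, *score_maps.shape[1:])
    pooled = np.zeros((c1, k, k))
    for i in range(k):        # bin row
        for j in range(k):    # bin column
            ys = y0 + int(np.floor(i * h / k))
            ye = y0 + int(np.ceil((i + 1) * h / k))
            xs = x0 + int(np.floor(j * w / k))
            xe = x0 + int(np.ceil((j + 1) * w / k))
            # Each bin reads only from its own group of (C+1) channels.
            pooled[:, i, j] = maps[i, j, :, ys:ye, xs:xe].mean(axis=(1, 2))
    return pooled

def vote_and_softmax(pooled):
    """Average the k*k bin scores per class, then softmax across the C+1 classes."""
    r = pooled.mean(axis=(1, 2))
    e = np.exp(r - r.max())  # subtract the max for numerical stability
    return e / e.sum()

# Toy usage: k=3, C=2 (plus background); class 1 responds everywhere.
k, C = 3, 2
toy = np.zeros((k, k, C + 1, 12, 12))
toy[:, :, 1] = 1.0
toy = toy.reshape(k * k * (C + 1), 12, 12)
probs = vote_and_softmax(ps_roi_pool(toy, (2, 3, 6, 6), k, C))
print(probs)
```

There are no learnable weights in either function, which is why the per-RoI cost stays small: all the learning happens in the convolutional layers that produce `score_maps`.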

Advantages

All learnable weight layers are convolutional and are computed on the entire image; the per-RoI computational cost is negligible.
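Accepts images of arbitrary size.
Removes the fully connected layers. This is excellent; I have always felt that SPPNet and RoI pooling still introduce some error, because they compress the features.
The position-sensitive score maps implicitly do something similar to CRAFT: pooling is done separately for each class, which improves accuracy.
The 3x3 voting mechanism adds robustness. Each vote is a binary decision (yes or no) about a single object class rather than a classification over all classes, so 3x3 is good enough.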