Object Detection: A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Article link: https://blog.csdn.net/zhangjunhit/article/details/78892923

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection
ECCV2016
https://github.com/zhaoweicai/mscnn

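The paper first points out the problems Faster R-CNN has with small-object detection and analyzes their cause. It then proposes two remedies: 1) extract proposals from feature maps at several different scales, and 2) upsample the feature map used for detection.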

the MS-CNN achieves speeds of 10 fps on KITTI (1250×375) and 15 fps on Caltech (640×480) images

First, the problem with the RPN in Faster R-CNN.
How does the RPN extract proposals? It slides a fixed set of filters over a fixed set of convolutional feature maps:
the RPN generates proposals of multiple scales by sliding a fixed set of filters over a fixed set of convolutional feature maps.

This causes a mismatch: object scale varies, but the filter receptive field is fixed. Small-object detection suffers the most.
This creates an inconsistency between the sizes of objects, which are variable, and filter receptive fields, which are fixed
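
To make the "fixed receptive field" point concrete, here is a minimal sketch that computes the theoretical receptive field of a stack of convolution and pooling layers. The function name and the layer list are hypothetical and chosen only for illustration; they are not the network from the paper.

```python
def receptive_field(layers):
    """Theoretical receptive field of a layer stack.

    layers: list of (kernel_size, stride) tuples, input side first.
    Returns (receptive_field, jump), where jump is the distance in input
    pixels between two neighbouring units of the final feature map.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k-1) jumps
        jump *= stride             # strides compound multiplicatively
    return rf, jump


# Hypothetical stack: two 3x3 convolutions followed by a 2x2 pooling layer.
print(receptive_field([(3, 1), (3, 1), (2, 2)]))
```

Whatever the size of the object in the image, every unit on that feature map sees the same input window, which is the inconsistency the paper describes.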


For object detection, the paper proposes a unified multi-scale deep CNN, denoted the multi-scale CNN (MS-CNN).
It consists of two parts: an object proposal network and an accurate detection network.
3 Multi-scale Object Proposal Network
3.1 Multi-scale Detection
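The paper's figure compares the following multi-scale detection strategies:
(a) a single classifier with multi-scale input images: the most accurate, but computationally very expensive
(b) multiple classifiers with a single-scale input image: more efficient, somewhat less accurate
(c) a middle ground between (a) and (b): a few classifiers and a few input image scales
(d) synthesized multi-scale feature maps with a single classifier
(e) R-CNN: candidate regions are normalized to a common scale
(f) RPN: multiple anchor templates
(g) the multi-scale strategy proposed in this paper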

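In the paper's proposal network architecture, proposals are extracted from feature maps at several depths. The point is that earlier feature maps can detect small objects, while later feature maps can detect large ones.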
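
The idea of letting each feature map handle its own range of object sizes can be sketched as follows. This is only an illustration: the template sizes and the nearest-size rule (in log scale) are assumptions made for this example, not values or rules taken from the paper.

```python
import math

# Hypothetical template sizes in pixels, one per detection branch, ordered
# from the earliest (finest) feature map to the latest (coarsest) one.
BRANCH_TEMPLATE_SIZES = [40, 80, 160, 320]


def pick_branch(box_w, box_h, template_sizes=BRANCH_TEMPLATE_SIZES):
    """Return the index of the branch whose template size best matches the box.

    The box size is taken as sqrt(w * h), and "best" means the smallest
    absolute log-ratio between box size and template size.
    """
    size = math.sqrt(box_w * box_h)
    return min(
        range(len(template_sizes)),
        key=lambda i: abs(math.log(size / template_sizes[i])),
    )


print(pick_branch(30, 50))    # a small object goes to an early branch
print(pick_branch(300, 300))  # a large object goes to a late branch
```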

4 Object Detection Network
The detection network uses a deconvolution layer to upsample the feature map.
To the best of our knowledge, this is the first application of deconvolution to jointly improve the speed and accuracy of an object detector.
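
As a rough illustration of how deconvolution (transposed convolution) enlarges a feature map, the sketch below applies the standard output-size formula for a transposed convolution without output padding or dilation. The kernel, stride and padding values are assumptions chosen to give a 2x upsampling; the notes above do not state the paper's exact settings.

```python
def deconv_output_size(in_size, kernel, stride, pad=0):
    """Spatial output size of a transposed convolution along one dimension.

    Standard formula, assuming no output padding and no dilation:
        out = (in - 1) * stride - 2 * pad + kernel
    """
    return (in_size - 1) * stride - 2 * pad + kernel


# Assumed settings: kernel 4, stride 2, padding 1 doubles the resolution.
print(deconv_output_size(7, kernel=4, stride=2, pad=1))
```

Doubling the feature map instead of the input image gives small objects more feature-map cells at a much lower cost than upsampling the image itself.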

The detection network also uses context, that is, a band surrounding the candidate region: the context region is 1.5 times larger than the object region.
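
A minimal sketch of deriving a context box from an object box follows. It assumes that "1.5 times larger" means each side is scaled by 1.5 around the same center and that the result is clipped to the image; both are interpretations made for this example, and the function name is hypothetical.

```python
def context_box(x1, y1, x2, y2, scale=1.5, img_w=None, img_h=None):
    """Enlarge a box around its center by `scale`, optionally clipping to the image.

    Boxes are (x1, y1, x2, y2) in pixels. Scaling the side lengths (rather
    than the area) by 1.5 is an assumption of this sketch.
    """
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    nx1, ny1, nx2, ny2 = cx - half_w, cy - half_h, cx + half_w, cy + half_h
    if img_w is not None:
        nx1, nx2 = max(0.0, nx1), min(float(img_w), nx2)
    if img_h is not None:
        ny1, ny2 = max(0.0, ny1), min(float(img_h), ny2)
    return nx1, ny1, nx2, ny2


print(context_box(100, 100, 200, 200))
```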