「Computer Vision」Note on SSD: Single Shot MultiBox Detector

小锋子Shawn

QQ Group: 428014259
Sina Weibo：小锋子Shawn
Tencent E-mail：403568338@qq.com
http://blog.csdn.net/dgyuanshaofeng/article/details/81048365

Authors: Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
Affiliations: UNC Chapel Hill, Zoox Inc., Google Inc., University of Michigan, Ann Arbor

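SSD is the work of Wei Liu, Google's Christian Szegedy, and colleagues. It is a single-stage object detection model (a single deep neural network) that balances detection accuracy and speed.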

0 Abstract

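I do not fully understand this sentence: “discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location”. Does it mean that the output space of detection boxes is discretized into a set of default boxes? The method is evaluated on the VOC, COCO, and ILSVRC datasets, where its accuracy is competitive and it is very fast. For 300 $\times$ 300 input, SSD achieves 74.3% mAP on the VOC2007 test set at 59 FPS on an Nvidia Titan X. For 512 $\times$ 512 input, SSD achieves 76.9% mAP, outperforming Faster R-CNN.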
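One way to read the quoted sentence is that every cell of a feature map gets a small, fixed set of reference boxes, and the network only predicts class scores and offsets relative to them. The sketch below illustrates that reading. The scale rule (linearly spaced between `s_min = 0.2` and `s_max = 0.9`), the box shapes `w = s * sqrt(ar)`, `h = s / sqrt(ar)`, and the extra square box at scale `sqrt(s_k * s_{k+1})` follow my understanding of the paper rather than anything in this note. The function names are made up for illustration, and the released Caffe code may set scales differently for some layers.

```python
import math

def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales, one per feature map, relative to image size."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]

def default_boxes_for_map(fmap_size, scale, next_scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return normalized (cx, cy, w, h) default boxes for a square feature map."""
    # One box shape per aspect ratio: wider for ar > 1, taller for ar < 1.
    shapes = [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]
    # Extra square box at an intermediate scale (assumed from the paper).
    extra = math.sqrt(scale * next_scale)
    shapes.append((extra, extra))

    boxes = []
    for i in range(fmap_size):          # rows
        for j in range(fmap_size):      # columns
            cx = (j + 0.5) / fmap_size  # box center sits at the cell center
            cy = (i + 0.5) / fmap_size
            for w, h in shapes:
                boxes.append((cx, cy, w, h))
    return boxes

scales = default_box_scales(6)
boxes = default_boxes_for_map(fmap_size=3, scale=scales[4], next_scale=scales[5])
print(len(boxes))  # 3 * 3 locations * 4 shapes = 36
```

Under this reading, "discretizing the output space" means the detector never regresses a box from scratch: it picks among this finite grid of boxes and refines the ones that match an object.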

1 Introduction

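The standard recipe for object detection is: first, hypothesize bounding boxes; second, resample pixels or features for each box; third, apply a high-quality classifier. This recipe runs from the Selective Search work through SPPnet, Fast R-CNN, and Faster R-CNN.
The source of the speedup is: “the fundamental improvement in speed comes from eliminating bounding box proposals and the subsequent pixel or feature resampling stage”. In other words, the first two steps of the recipe are dropped. This idea draws on OverFeat and YOLO.
The improvements are as follows. First, small convolutional filters predict the object categories and box offsets. Second, detections at different aspect ratios use separate filters. Third, in a spirit similar to U-Net, these separate filters (predictors) are applied to feature maps from later stages of the network to handle objects at multiple scales; in a spirit similar to GoogLeNet, side branches act as auxiliary classifiers, each representing a different scale. See Figure 1, where the dashed lines indicate the newly added feature channels.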
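To make the "small filters on several feature maps" idea concrete, here is a rough count of what the predictors output. The layer sizes and boxes-per-location values are the SSD300 configuration as I understand it from the paper, not something stated in this note. `SSD300_LAYERS` and the helper names are hypothetical, and 21 classes assumes the 20 VOC categories plus background.

```python
# (feature map side, default boxes per location) for the six prediction layers.
# Assumed from the SSD300 setup in the paper; not taken from this note.
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

def predictor_channels(boxes_per_loc, num_classes):
    """Output channels of the small conv predictors on one feature map.

    Each default box gets num_classes scores and 4 box offsets.
    """
    return boxes_per_loc * num_classes, boxes_per_loc * 4

def total_default_boxes(layers):
    """Total number of default boxes over all prediction layers."""
    return sum(side * side * k for side, k in layers)

num_classes = 21  # 20 VOC classes + background (assumption)
for side, k in SSD300_LAYERS:
    cls_ch, loc_ch = predictor_channels(k, num_classes)
    print(f"{side}x{side}: {k} boxes/location, "
          f"{cls_ch} class channels, {loc_ch} offset channels")

print(total_default_boxes(SSD300_LAYERS))  # 8732
```

Large feature maps contribute most of the boxes and cover small objects, while the 1 $\times$ 1 map contributes only a handful of boxes that span most of the image.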

[1] SSD: Single Shot MultiBox Detector. ECCV 2016. [ECCV paper] [arXiv paper] [Caffe code]