SSD (Single Shot MultiBox Detector) Theory Study Notes

小_小_杨_

Article link: https://blog.csdn.net/u012235274/article/details/52212346

一、Performance of SSD and other mainstream algorithms.

Method(VOC2007 test)	FPS	mAP
SSD	59	72.1%
Faster R-CNN	7	73.2%
YOLO	45	63.4%

According to the SSD paper, the fundamental speed improvement of SSD comes from eliminating bounding box proposals and the subsequent pixel or feature resampling stage. SSD is not the first work to do this, but it adds a series of improvements.
These improvements include:
1、using small convolutional filters to predict object categories and offsets in bounding box locations;
2、applying these filters to multiple feature maps from the later stages of a network in order to perform detection at multiple scales.

二、Contributions of SSD

1、It is faster and more accurate than the previous state-of-the-art single-shot detector (YOLO), and it is roughly as accurate as slower techniques that perform explicit region proposals and pooling, including Faster R-CNN (72.1% vs. 73.2% mAP in the table above).
2、The core of SSD is predicting category scores and bounding box offsets for a fixed set of prior bounding boxes (with different aspect ratios), using small convolutional filters applied to multiple feature maps. Using multiple feature maps improves accuracy.
3、End-to-end training.
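
The "fixed set of prior bounding boxes" in item 2 can be illustrated with a small sketch. This is not the authors' code: the function name `make_prior_boxes`, the scale value, and the aspect ratios are assumptions chosen for illustration. Each cell of a feature map gets one box per aspect ratio, centered on the cell, with width `scale * sqrt(ar)` and height `scale / sqrt(ar)` so that all boxes at one scale share the same area.

```python
import math


def make_prior_boxes(m, n, scale, aspect_ratios):
    """Return prior boxes as (cx, cy, w, h), normalized to the image size.

    m, n: feature map height and width in cells.
    scale: box size relative to the image (illustrative value, an assumption).
    aspect_ratios: width/height ratios; one prior box per ratio per cell.
    """
    boxes = []
    for i in range(m):              # rows of the feature map
        for j in range(n):          # columns of the feature map
            cx = (j + 0.5) / n      # box center = cell center
            cy = (i + 0.5) / m
            for ar in aspect_ratios:
                w = scale * math.sqrt(ar)
                h = scale / math.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes


# Hypothetical 3x3 feature map with k = 3 aspect ratios per cell.
priors = make_prior_boxes(m=3, n=3, scale=0.5, aspect_ratios=[1.0, 2.0, 0.5])
print(len(priors))  # 27, i.e. m * n * k
```

Because the set of boxes depends only on the feature map size and the chosen ratios, it is fixed before training; the network only learns scores and offsets relative to these boxes.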

三、SSD convolutional predictor

[Figure: SSD convolutional predictor branches (image not included)]
As shown in the figure, each selected feature map has three branches:
1、prior bounding box offset prediction
2、prior bounding box confidence prediction
3、prior bounding boxes
For example, suppose the pool6 feature map has size m×n×c. The basic element for predicting the parameters of a potential detection is a small 3×3×c kernel that produces either an offset prediction or a confidence prediction. At each of the m×n locations, pool6_mbox_priorbox produces k prior bounding boxes. So the number of prior bounding boxes is m×n×k, the number of outputs (num_output) of pool6_mbox_loc (a convolution layer) is 4k, and the number of outputs of pool6_mbox_conf (a convolution layer) is classes×k.
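
The arithmetic in this example can be checked with a few lines of Python. The helper `head_sizes` and the concrete numbers (a 3×3×256 feature map, k = 6, 21 classes counting background) are assumptions for illustration, not values taken from the article or from the real pool6 layer.

```python
def head_sizes(m, n, c, k, num_classes):
    """Sizes of the three branches for one m x n x c feature map.

    k: prior boxes per location; num_classes: number of class scores per box
    (here assumed to include the background class).
    """
    return {
        "num_priors": m * n * k,                 # boxes from the priorbox branch
        "loc_channels": 4 * k,                   # 4 offsets per prior box
        "conf_channels": num_classes * k,        # one score per class per box
        # Each output channel is one 3x3xc kernel (biases not counted).
        "loc_weights": 3 * 3 * c * 4 * k,
        "conf_weights": 3 * 3 * c * num_classes * k,
    }


sizes = head_sizes(m=3, n=3, c=256, k=6, num_classes=21)
print(sizes["num_priors"])     # 54
print(sizes["loc_channels"])   # 24
print(sizes["conf_channels"])  # 126
```

Note that the channel counts depend only on k and the number of classes, not on m or n, which is why the same kind of small convolutional predictor can be attached to feature maps of different sizes.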

四、Training

1、 Matching strategy
At training time, we need to establish the correspondence between the ground truth boxes and the default boxes (the prior bounding boxes described above). First, each ground truth box is matched to the default box with the best Jaccard overlap. This is the matching approach used by the original MultiBox, and it ensures that each ground truth box has exactly one matched default box. Unlike MultiBox, SSD then also matches default boxes to any ground truth whose Jaccard overlap with them is higher than a threshold (0.5). Adding these matches simplifies the learning problem: it allows the network to predict high confidence for multiple overlapping default boxes rather than requiring it to pick only the one with maximum overlap.
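
The two-step matching described above can be sketched in plain Python. This is a simplified illustration, not the reference implementation: the function names, the (xmin, ymin, xmax, ymax) box format, and the toy boxes are assumptions, and conflicts where two ground truths prefer the same default box are resolved naively (the later one wins).

```python
def jaccard(a, b):
    """Jaccard overlap (IoU) of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match(ground_truths, default_boxes, threshold=0.5):
    """For each default box, return the index of its matched ground truth or -1."""
    matches = [-1] * len(default_boxes)
    if not ground_truths or not default_boxes:
        return matches
    overlaps = [[jaccard(g, d) for d in default_boxes] for g in ground_truths]

    # Step 1 (MultiBox-style): each ground truth takes its best default box.
    for gi, row in enumerate(overlaps):
        best_d = max(range(len(row)), key=row.__getitem__)
        if row[best_d] > 0:
            matches[best_d] = gi

    # Step 2 (SSD addition): every still-unmatched default box is matched to
    # its best ground truth if the overlap exceeds the threshold.
    for di in range(len(default_boxes)):
        if matches[di] != -1:
            continue
        best_g = max(range(len(ground_truths)), key=lambda gi: overlaps[gi][di])
        if overlaps[best_g][di] > threshold:
            matches[di] = best_g
    return matches


gts = [(0.0, 0.0, 0.5, 0.5)]
defaults = [
    (0.0, 0.0, 0.5, 0.5),  # overlap 1.0: best match (step 1)
    (0.0, 0.0, 0.5, 0.4),  # overlap 0.8: above threshold (step 2)
    (0.5, 0.5, 1.0, 1.0),  # overlap 0.0: unmatched
    (0.0, 0.0, 0.3, 0.3),  # overlap 0.36: below threshold, unmatched
]
print(match(gts, defaults))  # [0, 0, -1, -1]
```

With step 1 alone, only the first default box would be a positive example; step 2 adds the second one, which is the "multiple overlapping default boxes" case mentioned in the text.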
2、Training objective