Deep-Learning-Based Object Detection (DET): SSD

Original post, 2016-08-30 23:31:02


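SSD (Single Shot MultiBox Detector) is an end-to-end model for object detection and recognition. A bit of gossip first: it comes from the Google camp, and its authors include authors of GoogLeNet. The model aims at fast, high-accuracy detection. It needs no separate step to compute candidate bounding boxes, yet it reaches comparable accuracy at a much higher speed: the reported figures are 58 FPS and 72.1% mAP.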

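Let us first look at the model as a whole. Its lowest layers are a classic VGG16 network (which can be swapped for ResNet). The convolutional layer conv4_3, the fully connected layer fc7, and the three convolutional stages above them (conv6, conv7 and conv8) each branch into three kinds of nodes: mbox_conf, mbox_loc and priorbox (call them X nodes). The forward log in this post additionally shows a pool6 branch. A concat per kind then merges the X nodes coming from the different layers, and the concatenated outputs are fed together into the final classification decision.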




For more detail, see the log output of one forward pass below.

[INFO 2016-08-30 21:58:39.619143 21429 net.cpp:540] Forwarding data
[INFO 2016-08-30 21:58:39.622481 21429 net.cpp:540] Forwarding data_data_0_split
[INFO 2016-08-30 21:58:39.622514 21429 net.cpp:540] Forwarding conv1_1
[INFO 2016-08-30 21:58:39.627096 21429 net.cpp:540] Forwarding relu1_1
[INFO 2016-08-30 21:58:39.627473 21429 net.cpp:540] Forwarding conv1_2
[INFO 2016-08-30 21:58:39.631721 21429 net.cpp:540] Forwarding relu1_2
[INFO 2016-08-30 21:58:39.631757 21429 net.cpp:540] Forwarding pool1
[INFO 2016-08-30 21:58:39.632096 21429 net.cpp:540] Forwarding conv2_1
[INFO 2016-08-30 21:58:39.634774 21429 net.cpp:540] Forwarding relu2_1
[INFO 2016-08-30 21:58:39.634809 21429 net.cpp:540] Forwarding conv2_2
[INFO 2016-08-30 21:58:39.639045 21429 net.cpp:540] Forwarding relu2_2
[INFO 2016-08-30 21:58:39.639080 21429 net.cpp:540] Forwarding pool2
[INFO 2016-08-30 21:58:39.639394 21429 net.cpp:540] Forwarding conv3_1
[INFO 2016-08-30 21:58:39.642501 21429 net.cpp:540] Forwarding relu3_1
[INFO 2016-08-30 21:58:39.642535 21429 net.cpp:540] Forwarding conv3_2
[INFO 2016-08-30 21:58:39.647202 21429 net.cpp:540] Forwarding relu3_2
[INFO 2016-08-30 21:58:39.647235 21429 net.cpp:540] Forwarding conv3_3
[INFO 2016-08-30 21:58:39.650738 21429 net.cpp:540] Forwarding relu3_3
[INFO 2016-08-30 21:58:39.650770 21429 net.cpp:540] Forwarding pool3
[INFO 2016-08-30 21:58:39.651074 21429 net.cpp:540] Forwarding conv4_1
[INFO 2016-08-30 21:58:39.655285 21429 net.cpp:540] Forwarding relu4_1
[INFO 2016-08-30 21:58:39.655323 21429 net.cpp:540] Forwarding conv4_2
[INFO 2016-08-30 21:58:39.660395 21429 net.cpp:540] Forwarding relu4_2
[INFO 2016-08-30 21:58:39.660429 21429 net.cpp:540] Forwarding conv4_3
[INFO 2016-08-30 21:58:39.665523 21429 net.cpp:540] Forwarding relu4_3
[INFO 2016-08-30 21:58:39.665555 21429 net.cpp:540] Forwarding conv4_3_relu4_3_0_split
[INFO 2016-08-30 21:58:39.665570 21429 net.cpp:540] Forwarding pool4
[INFO 2016-08-30 21:58:39.665881 21429 net.cpp:540] Forwarding conv5_1
[INFO 2016-08-30 21:58:39.668714 21429 net.cpp:540] Forwarding relu5_1
[INFO 2016-08-30 21:58:39.668748 21429 net.cpp:540] Forwarding conv5_2
[INFO 2016-08-30 21:58:39.671761 21429 net.cpp:540] Forwarding relu5_2
[INFO 2016-08-30 21:58:39.671807 21429 net.cpp:540] Forwarding conv5_3
[INFO 2016-08-30 21:58:39.675269 21429 net.cpp:540] Forwarding relu5_3
[INFO 2016-08-30 21:58:39.675302 21429 net.cpp:540] Forwarding pool5
[INFO 2016-08-30 21:58:39.675624 21429 net.cpp:540] Forwarding fc6
[INFO 2016-08-30 21:58:39.685935 21429 net.cpp:540] Forwarding relu6
[INFO 2016-08-30 21:58:39.685971 21429 net.cpp:540] Forwarding fc7
[INFO 2016-08-30 21:58:39.688531 21429 net.cpp:540] Forwarding relu7
[INFO 2016-08-30 21:58:39.688565 21429 net.cpp:540] Forwarding fc7_relu7_0_split
[INFO 2016-08-30 21:58:39.688580 21429 net.cpp:540] Forwarding conv6_1
[INFO 2016-08-30 21:58:39.691439 21429 net.cpp:540] Forwarding conv6_1_relu
[INFO 2016-08-30 21:58:39.691473 21429 net.cpp:540] Forwarding conv6_2
[INFO 2016-08-30 21:58:39.695135 21429 net.cpp:540] Forwarding conv6_2_relu
[INFO 2016-08-30 21:58:39.695169 21429 net.cpp:540] Forwarding conv6_2_conv6_2_relu_0_split
[INFO 2016-08-30 21:58:39.695183 21429 net.cpp:540] Forwarding conv7_1
[INFO 2016-08-30 21:58:39.698765 21429 net.cpp:540] Forwarding conv7_1_relu
[INFO 2016-08-30 21:58:39.698796 21429 net.cpp:540] Forwarding conv7_2
[INFO 2016-08-30 21:58:39.701938 21429 net.cpp:540] Forwarding conv7_2_relu
[INFO 2016-08-30 21:58:39.702193 21429 net.cpp:540] Forwarding conv7_2_conv7_2_relu_0_split
[INFO 2016-08-30 21:58:39.702220 21429 net.cpp:540] Forwarding conv8_1
[INFO 2016-08-30 21:58:39.704677 21429 net.cpp:540] Forwarding conv8_1_relu
[INFO 2016-08-30 21:58:39.704716 21429 net.cpp:540] Forwarding conv8_2
[INFO 2016-08-30 21:58:39.707798 21429 net.cpp:540] Forwarding conv8_2_relu
[INFO 2016-08-30 21:58:39.707839 21429 net.cpp:540] Forwarding conv8_2_conv8_2_relu_0_split
[INFO 2016-08-30 21:58:39.707859 21429 net.cpp:540] Forwarding pool6
[INFO 2016-08-30 21:58:39.707926 21429 net.cpp:540] Forwarding pool6_pool6_0_split
[INFO 2016-08-30 21:58:39.707947 21429 net.cpp:540] Forwarding conv4_3_norm
[INFO 2016-08-30 21:58:39.711788 21429 net.cpp:540] Forwarding conv4_3_norm_conv4_3_norm_0_split
[INFO 2016-08-30 21:58:39.711818 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_loc
[INFO 2016-08-30 21:58:39.714972 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_loc_perm
[INFO 2016-08-30 21:58:39.717313 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_loc_flat
[INFO 2016-08-30 21:58:39.717339 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_conf
[INFO 2016-08-30 21:58:39.724395 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_conf_perm
[INFO 2016-08-30 21:58:39.731096 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_conf_flat
[INFO 2016-08-30 21:58:39.731127 21429 net.cpp:540] Forwarding conv4_3_norm_mbox_priorbox
[INFO 2016-08-30 21:58:39.731290 21429 net.cpp:540] Forwarding fc7_mbox_loc
[INFO 2016-08-30 21:58:39.733963 21429 net.cpp:540] Forwarding fc7_mbox_loc_perm
[INFO 2016-08-30 21:58:39.737503 21429 net.cpp:540] Forwarding fc7_mbox_loc_flat
[INFO 2016-08-30 21:58:39.737527 21429 net.cpp:540] Forwarding fc7_mbox_conf
[INFO 2016-08-30 21:58:39.746902 21429 net.cpp:540] Forwarding fc7_mbox_conf_perm
[INFO 2016-08-30 21:58:39.750918 21429 net.cpp:540] Forwarding fc7_mbox_conf_flat
[INFO 2016-08-30 21:58:39.750946 21429 net.cpp:540] Forwarding fc7_mbox_priorbox
[INFO 2016-08-30 21:58:39.751056 21429 net.cpp:540] Forwarding conv6_2_mbox_loc
[INFO 2016-08-30 21:58:39.753976 21429 net.cpp:540] Forwarding conv6_2_mbox_loc_perm
[INFO 2016-08-30 21:58:39.756206 21429 net.cpp:540] Forwarding conv6_2_mbox_loc_flat
[INFO 2016-08-30 21:58:39.756239 21429 net.cpp:540] Forwarding conv6_2_mbox_conf
[INFO 2016-08-30 21:58:39.763130 21429 net.cpp:540] Forwarding conv6_2_mbox_conf_perm
[INFO 2016-08-30 21:58:39.764664 21429 net.cpp:540] Forwarding conv6_2_mbox_conf_flat
[INFO 2016-08-30 21:58:39.764689 21429 net.cpp:540] Forwarding conv6_2_mbox_priorbox
[INFO 2016-08-30 21:58:39.764760 21429 net.cpp:540] Forwarding conv7_2_mbox_loc
[INFO 2016-08-30 21:58:39.768630 21429 net.cpp:540] Forwarding conv7_2_mbox_loc_perm
[INFO 2016-08-30 21:58:39.772903 21429 net.cpp:540] Forwarding conv7_2_mbox_loc_flat
[INFO 2016-08-30 21:58:39.772927 21429 net.cpp:540] Forwarding conv7_2_mbox_conf
[INFO 2016-08-30 21:58:39.777669 21429 net.cpp:540] Forwarding conv7_2_mbox_conf_perm
[INFO 2016-08-30 21:58:39.781180 21429 net.cpp:540] Forwarding conv7_2_mbox_conf_flat
[INFO 2016-08-30 21:58:39.781205 21429 net.cpp:540] Forwarding conv7_2_mbox_priorbox
[INFO 2016-08-30 21:58:39.781263 21429 net.cpp:540] Forwarding conv8_2_mbox_loc
[INFO 2016-08-30 21:58:39.783634 21429 net.cpp:540] Forwarding conv8_2_mbox_loc_perm
[INFO 2016-08-30 21:58:39.788920 21429 net.cpp:540] Forwarding conv8_2_mbox_loc_flat
[INFO 2016-08-30 21:58:39.788944 21429 net.cpp:540] Forwarding conv8_2_mbox_conf
[INFO 2016-08-30 21:58:39.793294 21429 net.cpp:540] Forwarding conv8_2_mbox_conf_perm
[INFO 2016-08-30 21:58:39.797371 21429 net.cpp:540] Forwarding conv8_2_mbox_conf_flat
[INFO 2016-08-30 21:58:39.797397 21429 net.cpp:540] Forwarding conv8_2_mbox_priorbox
[INFO 2016-08-30 21:58:39.797449 21429 net.cpp:540] Forwarding pool6_mbox_loc
[INFO 2016-08-30 21:58:39.800542 21429 net.cpp:540] Forwarding pool6_mbox_loc_perm
[INFO 2016-08-30 21:58:39.804468 21429 net.cpp:540] Forwarding pool6_mbox_loc_flat
[INFO 2016-08-30 21:58:39.804493 21429 net.cpp:540] Forwarding pool6_mbox_conf
[INFO 2016-08-30 21:58:39.808717 21429 net.cpp:540] Forwarding pool6_mbox_conf_perm
[INFO 2016-08-30 21:58:39.812292 21429 net.cpp:540] Forwarding pool6_mbox_conf_flat
[INFO 2016-08-30 21:58:39.812317 21429 net.cpp:540] Forwarding pool6_mbox_priorbox
[INFO 2016-08-30 21:58:39.812382 21429 net.cpp:540] Forwarding mbox_loc
[INFO 2016-08-30 21:58:39.812604 21429 net.cpp:540] Forwarding mbox_conf
[INFO 2016-08-30 21:58:39.812834 21429 net.cpp:540] Forwarding mbox_priorbox
[INFO 2016-08-30 21:58:39.819844 21429 net.cpp:540] Forwarding mbox_conf_reshape
[INFO 2016-08-30 21:58:39.819871 21429 net.cpp:540] Forwarding mbox_conf_softmax
[INFO 2016-08-30 21:58:39.820596 21429 net.cpp:540] Forwarding mbox_conf_flatten
[INFO 2016-08-30 21:58:39.820647 21429 net.cpp:540] Forwarding detection_out
[INFO 2016-08-30 21:58:39.832866 21429 net.cpp:540] Forwarding detection_eval
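
The `_perm`, `_flat` and final `mbox_loc` / `mbox_conf` / `mbox_priorbox` entries in the log suggest that each branch is permuted, flattened, and then concatenated across layers. The sketch below only does the length bookkeeping for that concat. The feature-map sizes, the number of priors per location, and the class count are assumptions for a 300x300 input and are not read from the log; `SOURCE_LAYERS`, `flat_lengths` and `concat_lengths` are hypothetical names.

```python
# Hypothetical per-layer settings: (layer name, feature-map side, priors per location).
# These values are assumptions for a 300x300 input, not taken from the log above.
SOURCE_LAYERS = [
    ("conv4_3_norm", 38, 3),
    ("fc7", 19, 6),
    ("conv6_2", 10, 6),
    ("conv7_2", 5, 6),
    ("conv8_2", 3, 6),
    ("pool6", 1, 6),
]
NUM_CLASSES = 21  # assumed: 20 object classes plus background


def flat_lengths(side, priors_per_loc, num_classes):
    """Number of priors and flattened loc/conf lengths for one source layer."""
    num_priors = side * side * priors_per_loc
    return num_priors, num_priors * 4, num_priors * num_classes


def concat_lengths(layers, num_classes):
    """Sum the per-layer lengths, mimicking the mbox_loc / mbox_conf concat."""
    total_priors = total_loc = total_conf = 0
    for _name, side, priors_per_loc in layers:
        n, loc_len, conf_len = flat_lengths(side, priors_per_loc, num_classes)
        total_priors += n
        total_loc += loc_len
        total_conf += conf_len
    return total_priors, total_loc, total_conf


if __name__ == "__main__":
    priors, loc_len, conf_len = concat_lengths(SOURCE_LAYERS, NUM_CLASSES)
    print(priors, loc_len, conf_len)
```

Under these assumed settings, every prior contributes 4 numbers to the location vector and one score per class to the confidence vector, which is why the two concatenated vectors differ in length while describing the same set of boxes.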

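SSD uses many small convolution kernels (1x1, 3x3), not only for classification but also for regressing bounding-box positions. Separate filters handle objects of different aspect ratios, and these filters are applied to several later feature maps to perform multi-scale detection.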

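SSD defines a set of default bounding boxes, four of them: tall, wide, large square and small square. They are placed at every location of feature maps of different sizes (4x4, 8x8); in other words, convolution is used to cover all m*n locations of an m*n*p feature map. During training these boxes are matched against the ground-truth boxes: for each box the network predicts 4 offset values relative to the ground truth and c class probabilities, and the ground-truth class determines which boxes count as TP and which as FP. The overall model loss is a weighted sum of the localization loss and the classification confidence loss. At inference time, non-maximum suppression produces the final detections.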
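
The weighted loss described above can be sketched in a few lines. This is a minimal illustration, assuming Smooth L1 for the localization term, softmax cross-entropy for the confidence term, a weight `alpha`, and normalization by the number of positive boxes; the function names are hypothetical and the sketch is not the Caffe implementation.

```python
import math


def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear elsewhere."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def softmax_cross_entropy(logits, label):
    """Negative log-probability of `label` under a softmax over `logits`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


def multibox_loss(pred_offsets, target_offsets, logits, labels, alpha=1.0):
    """Weighted sum of confidence and localization loss, divided by #positives.

    labels[i] == 0 marks a background (negative) box; only positive boxes
    contribute to the localization term.
    """
    num_pos = sum(1 for lab in labels if lab > 0)
    if num_pos == 0:
        return 0.0
    loc = sum(
        smooth_l1(p - t)
        for pred, tgt, lab in zip(pred_offsets, target_offsets, labels)
        if lab > 0
        for p, t in zip(pred, tgt)
    )
    conf = sum(softmax_cross_entropy(lg, lab) for lg, lab in zip(logits, labels))
    return (conf + alpha * loc) / num_pos
```

Negative boxes still contribute to the confidence term, which is why the balance between positives and negatives matters for training.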

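Using boxes of different shapes on feature maps of several resolutions discretizes the parameter space of boxes, which improves computational efficiency. The ground-truth information, both class and position, has to be assigned explicitly to specific network outputs so that the loss function and backpropagation work end to end. During training the ground truth must therefore be matched to boxes: any box whose Jaccard overlap with a ground truth exceeds 0.5 is matched to it, and every ground truth must have at least one matching box. In addition, when there are many candidate boxes there are usually many FPs as well, so the TP and FP counts become unbalanced. The candidates are therefore sorted by classification confidence and only the top ones are kept, so that the ratio of FP to TP is at most 3:1.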
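
The matching rule and the 3:1 selection can be sketched as follows. This is a simplified illustration, not the reference implementation: boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples, negatives are ranked by their confidence loss (an assumption about what "sorted by classification confidence" means in practice), and all function names are hypothetical. If two ground truths claim the same box in step 1, the later one wins in this sketch.

```python
def jaccard(a, b):
    """Jaccard overlap (IoU) of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def match_priors(priors, gt_boxes, threshold=0.5):
    """Return, for each prior, the index of its matched ground truth or -1."""
    matches = [-1] * len(priors)
    if not priors or not gt_boxes:
        return matches
    # Step 1: every ground truth claims its best-overlapping prior,
    # so that each ground truth has at least one match.
    for g, gt in enumerate(gt_boxes):
        best = max(range(len(priors)), key=lambda i: jaccard(priors[i], gt))
        matches[best] = g
    # Step 2: remaining priors match a ground truth if the overlap > threshold.
    for i, prior in enumerate(priors):
        if matches[i] != -1:
            continue
        overlaps = [jaccard(prior, gt) for gt in gt_boxes]
        g = max(range(len(gt_boxes)), key=overlaps.__getitem__)
        if overlaps[g] > threshold:
            matches[i] = g
    return matches


def mine_hard_negatives(matches, conf_loss, neg_pos_ratio=3):
    """Keep the highest-loss unmatched priors, at most neg_pos_ratio per positive."""
    num_pos = sum(1 for m in matches if m != -1)
    negatives = [i for i, m in enumerate(matches) if m == -1]
    negatives.sort(key=lambda i: conf_loss[i], reverse=True)
    return negatives[: neg_pos_ratio * num_pos]
```

Only the returned negatives, together with the matched positives, would then enter the confidence loss.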

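On detecting objects at multiple scales. We know that low-level feature maps capture fine image detail and can thus improve semantic segmentation quality, while high-level feature maps can smooth the segmentation result. Detection therefore combines low-level and high-level feature maps. Feature maps from different layers have different receptive field sizes, which is the key point; see Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Object detectors emerge in deep scene CNNs. In: ICLR (2015). There is, however, no need to build boxes of several sizes on one feature map. Instead, each layer's feature map learns to detect objects at one particular scale, so a given feature map has boxes of a single scale. For example, the boxes on the 8x8 feature map cannot detect the larger dog (see the figure below). From the lowest to the highest layer, the box scales are spread evenly between 0.2 and 0.95. To handle aspect ratio as well, each layer's box is expanded into 6 boxes with aspect ratios {1, 1+, 2, 3, 1/2, 1/3}.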
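
The scale and aspect-ratio rule can be made concrete with a short sketch. It assumes scales are expressed relative to the image side, that a box with aspect ratio a has width s*sqrt(a) and height s/sqrt(a), and that the "1+" entry is an extra square whose scale is the geometric mean of the current and next layer's scales. The function names are hypothetical.

```python
import math


def layer_scales(num_layers, s_min=0.2, s_max=0.95):
    """Evenly spaced box scales (relative to the image side), lowest layer first."""
    if num_layers == 1:
        return [s_min]
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + step * k for k in range(num_layers)]


def boxes_for_layer(scale, next_scale,
                    aspect_ratios=(1.0, 2.0, 3.0, 0.5, 1.0 / 3.0)):
    """Return (width, height) pairs for one feature-map location.

    The sixth box, written "1+" in the text, is assumed here to be a square
    with scale sqrt(scale * next_scale).
    """
    boxes = [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]
    extra = math.sqrt(scale * next_scale)
    boxes.insert(1, (extra, extra))  # keep the {1, 1+, 2, 3, 1/2, 1/3} order
    return boxes


if __name__ == "__main__":
    scales = layer_scales(6)
    for w, h in boxes_for_layer(scales[0], scales[1]):
        print(round(w, 3), round(h, 3))
```

With six layers the step between consecutive scales is (0.95 - 0.2) / 5 = 0.15. Every box except the "1+" one has the same area as the square of that layer's scale, so changing the aspect ratio changes the shape but not the size.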

  


In a sense, SSD combines the ideas of RPN and YOLO, namely:

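1) The anchor idea from RPN. Applying 256 3x3 filters to a feature map amounts to describing, at every location of the feature map, the features of 9 kinds of anchor box in 256 dimensions. The position of the sliding window gives the location relative to the original image, and the regressed box gives a finer location relative to that window. RPN reduces computation by a factor of 256 (by operating on the feature map instead of the original image).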

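2) The regression idea from YOLO: the object's position and class are regressed directly from the features, without ROI pooling for classification and feature extraction.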
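
The two ideas can be tied together in a small sketch: the feature-map cell fixes a coarse anchor position in the image, and regressed offsets refine it. The stride of 16, the cell-center convention `(col + 0.5) * stride`, and the center/size offset parameterization (shift scaled by anchor size, size scaled by an exponential) are assumptions chosen for illustration; the function names are hypothetical.

```python
import math


def anchor_at(col, row, width, height, stride=16):
    """Anchor centered on feature-map cell (col, row), in image pixels.

    stride=16 is an assumed downsampling factor between image and feature map.
    Returns (cx, cy, w, h).
    """
    return ((col + 0.5) * stride, (row + 0.5) * stride, width, height)


def decode(anchor, offsets):
    """Apply regressed offsets (tx, ty, tw, th) to an anchor (cx, cy, w, h)."""
    cx, cy, w, h = anchor
    tx, ty, tw, th = offsets
    return (cx + tx * w, cy + ty * h, w * math.exp(tw), h * math.exp(th))


def to_corners(box):
    """Convert (cx, cy, w, h) to (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


if __name__ == "__main__":
    anchor = anchor_at(2, 3, 128, 64)
    print(to_corners(decode(anchor, (0.5, -0.25, 0.0, 0.0))))
```

With a stride of 16, one feature-map cell stands for a 16x16 patch of the image, which gives 256 times fewer positions to evaluate than working on the original image directly.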


Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.
