Combining SqueezeNet with Faster RCNN


The paper was submitted to ICLR 2017.



  • More efficient distributed training
  • Less overhead when exporting new models to clients
  • Feasible FPGA and embedded deployment

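In other words, a smaller model trains more efficiently in a distributed setting, is cheaper to push to clients as a replacement model, and is easier to deploy on FPGAs and embedded devices.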

  • Replace 3x3 filters with 1x1 filters.
  • Decrease the number of input channels to 3x3 filters.
  • Downsample late in the network so that convolution layers have large activation maps.

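  1. Replace 3x3 kernels with 1x1 kernels, because a 1x1 kernel has 1/9 the parameters of a 3x3 kernel;
  2. Reduce the number of input channels fed to the 3x3 kernels. The parameter count of a layer is determined by the number of input channels, the number of kernels, and the kernel size, so reducing the number of 3x3 kernels alone is not enough; the number of input channels must be reduced as well. In the paper, the authors use a squeeze layer for this purpose;
  3. Move the pooling layers later in the network to obtain larger feature maps. The authors argue that pooling with a large stride early in the network shrinks every subsequent feature map, whereas larger feature maps lead to higher accuracy.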
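The arithmetic behind the first two strategies can be checked with a few lines of Python. This is a minimal sketch: `conv_params` and `fire_params` are hypothetical helper names, biases are ignored, and the channel sizes (128 in, squeeze 16, expand 64 + 64) are illustrative assumptions rather than values taken from this post.

```python
def conv_params(in_channels, num_filters, kernel_size):
    """Weights in a square convolution layer: in_channels * num_filters * k * k."""
    return in_channels * num_filters * kernel_size * kernel_size


def fire_params(in_channels, squeeze, expand1x1, expand3x3):
    """Weights in a Fire module: a 1x1 squeeze layer, then parallel 1x1 and 3x3 expand layers."""
    return (conv_params(in_channels, squeeze, 1)
            + conv_params(squeeze, expand1x1, 1)
            + conv_params(squeeze, expand3x3, 3))


# Strategy 1: a 1x1 layer has 1/9 of the weights of a 3x3 layer of the same width.
ratio = conv_params(128, 128, 1) / conv_params(128, 128, 3)

# Strategies 1 and 2 together: map 128 channels to 128 channels
# with a plain 3x3 layer versus a Fire module.
plain = conv_params(128, 128, 3)
fire = fire_params(128, squeeze=16, expand1x1=64, expand3x3=64)

print("1x1 / 3x3 weight ratio:", ratio)
print("plain 3x3 layer:", plain, "weights")
print("Fire module:", fire, "weights")
print("reduction factor:", plain / fire)
```

Most of the saving comes from the squeeze layer: the 3x3 kernels only see 16 input channels instead of 128.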


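Based on these ideas, the authors propose the Fire Module: a squeeze layer of 1x1 kernels followed by an expand layer that mixes 1x1 and 3x3 kernels. Some details of the design: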




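  • So that the feature maps produced by the 3x3 kernels have the same size as those produced by the 1x1 kernels, the 3x3 convolution uses a padding of 1 (mainly so that the two outputs can be concatenated)
  • Both the squeeze layer and the expand layer are followed by a ReLU activation
  • Dropout with a ratio of 0.5 is applied after fire9
  • Fully connected layers are removed, following the NIN design
  • Training uses a polynomial learning-rate policy (I switched to the step policy when using the network for detection)
  • Caffe does not support a single convolution layer that contains both 1x1 and 3x3 kernels, so the expand layer is implemented as two separate convolution layers whose outputs are concatenated along the channel dimension. This is mathematically equivalent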
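The equivalence claimed in the last point can be verified numerically. The sketch below is a naive pure-Python convolution, written only for illustration (all function names are hypothetical, and integer weights are used so the comparison is exact). It compares the Caffe-style implementation (a 1x1 layer with no padding and a 3x3 layer with padding 1, concatenated along the channel axis) against a single 3x3 layer in which each 1x1 weight sits at the centre of an otherwise zero 3x3 kernel.

```python
import random


def conv2d(x, weights, pad):
    """Naive stride-1 convolution. x is [C][H][W]; weights is [K][C][k][k]."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(weights[0][0])
    out_h = height + 2 * pad - k + 1
    out_w = width + 2 * pad - k + 1
    out = []
    for filt in weights:
        fmap = [[0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                total = 0
                for c in range(channels):
                    for di in range(k):
                        for dj in range(k):
                            r, s = i + di - pad, j + dj - pad
                            if 0 <= r < height and 0 <= s < width:  # zero padding
                                total += filt[c][di][dj] * x[c][r][s]
                fmap[i][j] = total
        out.append(fmap)
    return out


def embed_1x1_in_3x3(weights_1x1):
    """Place each 1x1 weight at the centre of an otherwise zero 3x3 kernel."""
    return [[[[0, 0, 0], [0, ch[0][0], 0], [0, 0, 0]] for ch in filt]
            for filt in weights_1x1]


rng = random.Random(0)


def rand_tensor(*dims):
    """Nested lists of small random integers with the given dimensions."""
    if len(dims) == 1:
        return [rng.randint(-3, 3) for _ in range(dims[0])]
    return [rand_tensor(*dims[1:]) for _ in range(dims[0])]


squeezed = rand_tensor(2, 5, 5)   # squeeze-layer output: 2 channels, 5x5
w1 = rand_tensor(3, 2, 1, 1)      # expand 1x1: 3 filters
w3 = rand_tensor(4, 2, 3, 3)      # expand 3x3: 4 filters

# Two separate layers, concatenated along the channel axis (list concatenation).
two_layers = conv2d(squeezed, w1, pad=0) + conv2d(squeezed, w3, pad=1)

# One 3x3 layer holding all seven filters.
one_layer = conv2d(squeezed, embed_1x1_in_3x3(w1) + w3, pad=1)

print("channels:", len(two_layers), "size:", len(two_layers[0]), "x", len(two_layers[0][0]))
print("identical outputs:", two_layers == one_layer)
```

The same run also shows why the 3x3 branch needs a padding of 1: without it the two branches would produce feature maps of different sizes and could not be concatenated.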



II. Combining SqueezeNet with Faster RCNN

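I first tried alt-opt training, but unfortunately the results were very poor and basically unusable. I then switched to end2end training, starting from the ZFNet end2end solver that ships with the official Faster RCNN code. Unfortunately again, after roughly 400 iterations the network reported: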

loss = NAN
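
The text mentions two learning-rate policies: poly for the original SqueezeNet training and step for detection. As a reference when adjusting the solver after a divergence like this one, here is a minimal sketch of how the two schedules evolve, using the formulas given in the comments of Caffe's solver definition. The values of `base_lr`, `gamma`, `stepsize`, `power` and `max_iter` are illustrative assumptions, not the settings used in this experiment.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Caffe 'step' policy: base_lr * gamma ^ floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


def poly_lr(base_lr, power, max_iter, iteration):
    """Caffe 'poly' policy: base_lr * (1 - iter / max_iter) ^ power."""
    return base_lr * (1.0 - iteration / max_iter) ** power


# Assumed settings, for illustration only.
for iteration in (0, 400, 50000, 99999):
    print(iteration,
          step_lr(0.001, gamma=0.1, stepsize=50000, iteration=iteration),
          poly_lr(0.001, power=1.0, max_iter=100000, iteration=iteration))
```

With step, the rate stays at `base_lr` until the first step boundary, whereas poly starts decaying from the first iteration.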

name: "Alex_Squeeze_v1.1"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 4"
  }
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "drop9"
  type: "Dropout"
  bottom: "fire9/concat"
  top: "fire9/concat"
  dropout_param {
    dropout_ratio: 0.5
  }
}

#========= RPN ============

layer {
  name: "rpn_conv/3x3"
  type: "Convolution"
  bottom: "fire9/concat"
  top: "rpn/output"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 256
    kernel_size: 3 pad: 1 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 4"
  }
}
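
The `num_classes` value passed to the proposal target layer fixes the width of the regression targets, and therefore the output sizes the detection head must have. The sketch below is modelled on how py-faster-rcnn expands per-RoI targets into class-specific slots; it is a simplified pure-Python illustration, `expand_bbox_targets` is a hypothetical name, and the inside weight of 1.0 is an assumption (the real value comes from the training config).

```python
def expand_bbox_targets(labels, deltas, num_classes):
    """Spread each RoI's 4 regression targets into a row of width 4 * num_classes.

    labels[i] is the class of RoI i (0 = background) and
    deltas[i] is its (dx, dy, dw, dh) regression target.
    """
    targets, inside_weights = [], []
    for label, delta in zip(labels, deltas):
        row = [0.0] * (4 * num_classes)
        weight = [0.0] * (4 * num_classes)
        if label > 0:  # background RoIs contribute no regression loss
            start = 4 * label
            row[start:start + 4] = delta
            weight[start:start + 4] = [1.0] * 4  # assumed inside weight
        targets.append(row)
        inside_weights.append(weight)
    return targets, inside_weights


NUM_CLASSES = 4  # as in param_str above; background counts as class 0

targets, weights = expand_bbox_targets(
    labels=[0, 2],
    deltas=[(0.0, 0.0, 0.0, 0.0), (0.1, -0.2, 0.3, 0.05)],
    num_classes=NUM_CLASSES,
)
print("row width:", len(targets[0]))
print("slot of class 2:", targets[1][8:12])
```

With `num_classes` set to 4, each row is 16 wide, so a `bbox_pred` layer trained against these targets needs 16 outputs and `cls_score` needs 4.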

#===================== RCNN =============

layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "fire9/concat"
  bottom: "rois"
  top: "roi_pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}
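
To make the role of `spatial_scale` and the 7x7 output concrete, here is a minimal sketch of how an RoI in image coordinates is projected onto the feature map and divided into pooling bins. It is modelled on the bin arithmetic of Caffe's ROIPooling layer but is not that layer's code; `roi_pool_bins` is a hypothetical name, and the 38x50 feature-map size is an assumption (roughly a 600x800 image at stride 16).

```python
import math


def roi_pool_bins(roi, spatial_scale=0.0625, pooled_h=7, pooled_w=7,
                  feat_h=38, feat_w=50):
    """Return the (hstart, hend, wstart, wend) feature-map window of every pooling bin.

    roi is (x1, y1, x2, y2) in input-image pixels; end indices are exclusive.
    """
    # Project the corners onto the feature map (round half up; coordinates are non-negative).
    x1, y1, x2, y2 = (int(math.floor(v * spatial_scale + 0.5)) for v in roi)
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)

    def clip(value, limit):
        return min(max(value, 0), limit)

    bins = []
    for ph in range(pooled_h):
        # floor(ph * roi_h / pooled_h) and ceil((ph + 1) * roi_h / pooled_h), in exact integer arithmetic
        hstart = clip(ph * roi_h // pooled_h + y1, feat_h)
        hend = clip(-(-(ph + 1) * roi_h // pooled_h) + y1, feat_h)
        for pw in range(pooled_w):
            wstart = clip(pw * roi_w // pooled_w + x1, feat_w)
            wend = clip(-(-(pw + 1) * roi_w // pooled_w) + x1, feat_w)
            bins.append((hstart, hend, wstart, wend))
    return bins


bins = roi_pool_bins((64, 32, 287, 255))
print("number of bins:", len(bins))
print("first bin:", bins[0])
print("last bin:", bins[-1])
```

Each bin is max-pooled to a single value, so every RoI yields a fixed 7x7 grid per channel regardless of its size. Neighbouring bins may overlap by one cell because the start is floored and the end is ceiled.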

layer {
  name: "conv1_last"
  type: "Convolution"
  bottom: "roi_pool5"
  top: "conv1_last"
  param { lr_mult: 1.0 }
  param { lr_mult: 1.0 }
  convolution_param {
    num_output: 1000
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      mean: 0.0
      std: 0.01
    }
  }
}
layer {
  name: "relu/conv1_last"
  type: "ReLU"
  bottom: "conv1_last"
  top: "relu/conv1_last"
}

layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "cls_score"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 4  # must equal num_classes (was 5, inconsistent with 'num_classes': 4)
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "bbox_pred"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 16  # 4 * num_classes (was 20, which does not match the 16-wide bbox_targets)
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss_cls"
  type: "SoftmaxWithLoss"
  bottom: "cls_score"
  bottom: "labels"
  propagate_down: 1
  propagate_down: 0
  top: "loss_cls"
  loss_weight: 1
}
layer {
  name: "loss_bbox"
  type: "SmoothL1Loss"
  bottom: "bbox_pred"
  bottom: "bbox_targets"
  bottom: "bbox_inside_weights"
  bottom: "bbox_outside_weights"
  top: "loss_bbox"
  loss_weight: 1
}
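
The `loss_bbox` layer combines its four bottoms as sketched below. This is a plain-Python illustration of the weighted Smooth L1 loss as used in py-faster-rcnn, not the layer's own code. It assumes `sigma` is 1 (no `smooth_l1_loss_param` is set in the layer above) and that the sum is normalised by the number of RoIs.

```python
def smooth_l1(x, sigma=1.0):
    """0.5 * (sigma * x)^2 if |x| < 1 / sigma^2, otherwise |x| - 0.5 / sigma^2."""
    sigma2 = sigma * sigma
    if abs(x) < 1.0 / sigma2:
        return 0.5 * sigma2 * x * x
    return abs(x) - 0.5 / sigma2


def smooth_l1_loss(pred, target, inside_w, outside_w, sigma=1.0):
    """Weighted Smooth L1, summed over all elements and divided by the number of RoIs."""
    total = 0.0
    for p_row, t_row, in_row, out_row in zip(pred, target, inside_w, outside_w):
        for p, t, w_in, w_out in zip(p_row, t_row, in_row, out_row):
            # inside weights mask the difference, outside weights scale the loss
            total += w_out * smooth_l1(w_in * (p - t), sigma)
    return total / len(pred)


loss = smooth_l1_loss(
    pred=[[0.5, 3.0]],
    target=[[0.0, 0.0]],
    inside_w=[[1.0, 1.0]],
    outside_w=[[1.0, 1.0]],
)
print("loss:", loss)
```

The quadratic region keeps gradients small near zero, while the linear region bounds the gradient magnitude for large errors, which makes the regression less sensitive to outliers than a plain L2 loss.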

III. SqueezeNet + Faster RCNN + OHEM


#====== RoI Proposal ====================
layer {
  name: "rpn_cls_prob"
  type: "Softmax"
  bottom: "rpn_cls_score_reshape"
  top: "rpn_cls_prob"
}
layer {
  name: 'rpn_cls_prob_reshape'
  type: 'Reshape'
  bottom: 'rpn_cls_prob'
  top: 'rpn_cls_prob_reshape'
  reshape_param { shape { dim: 0 dim: 140 dim: -1 dim: 0 } }
}
layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rpn_rois'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 4"
  }
}
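
The `reshape_param` of `rpn_cls_prob_reshape` uses Caffe's two special dimension values: 0 copies the corresponding bottom dimension and -1 is inferred from the remaining element count. The sketch below resolves such a shape in plain Python. `caffe_reshape` is a hypothetical helper, and the input shape is an assumption: it supposes the usual py-faster-rcnn layout (a 2-channel softmax over background/foreground), in which case `dim: 140` would correspond to 70 anchors, here on a 38x50 feature map.

```python
def caffe_reshape(bottom_shape, dims):
    """Resolve a Caffe reshape_param: 0 copies the bottom dimension, -1 is inferred."""
    out = [bottom_shape[i] if d == 0 else d for i, d in enumerate(dims)]
    total = 1
    for size in bottom_shape:
        total *= size
    if -1 in out:
        known = 1
        for d in out:
            if d != -1:
                known *= d
        if out.count(-1) != 1 or total % known != 0:
            raise ValueError("shape cannot be inferred")
        out[out.index(-1)] = total // known
    return tuple(out)


# Assumed softmax output: (batch, 2, anchors * height, width) with 70 anchors.
rpn_cls_prob_shape = (1, 2, 70 * 38, 50)
reshaped = caffe_reshape(rpn_cls_prob_shape, (0, 140, -1, 0))
print(reshaped)
```

After the reshape, each of the 140 channels holds one background or foreground score map, which is the layout the proposal layer reads.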
## Readonly RoI Network ##
######### Start ##########
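layer {
  name: "roi_pool5_readonly"
  type: "ROIPooling"
  bottom: "fire9/concat"
  bottom: "rois"
  top: "pool5_readonly"
  propagate_down: false
  propagate_down: false
  roi_pooling_param {
    # was 6x6; must match roi_pool5 (7x7), otherwise the shared
    # cls_score_w / bbox_pred_w weights have mismatched shapes
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}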
layer {
  name: "conv1_last_readonly"
  type: "Convolution"
  bottom: "pool5_readonly"
  top: "conv1_last_readonly"
  propagate_down: false
  param {
    name: "conv1_last_w"
  }
  param {
    name: "conv1_last_b"
  }
  convolution_param {
    num_output: 1000
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      mean: 0.0
      std: 0.01
    }
  }
}
layer {
  name: "relu/conv1_last_readonly"
  type: "ReLU"
  bottom: "conv1_last_readonly"
  top: "relu/conv1_last_readonly"
  propagate_down: false
}
layer {
  name: "cls_score_readonly"
  type: "InnerProduct"
  bottom: "relu/conv1_last_readonly"
  top: "cls_score_readonly"
  propagate_down: false
  param {
    name: "cls_score_w"
  }
  param {
    name: "cls_score_b"
  }
  inner_product_param {
    num_output: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred_readonly"
  type: "InnerProduct"
  bottom: "relu/conv1_last_readonly"
  top: "bbox_pred_readonly"
  propagate_down: false
  param {
    name: "bbox_pred_w"
  }
  param {
    name: "bbox_pred_b"
  }
  inner_product_param {
    num_output: 16
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "cls_prob_readonly"
  type: "Softmax"
  bottom: "cls_score_readonly"
  top: "cls_prob_readonly"
  propagate_down: false
}
layer {
  name: "hard_roi_mining"
  type: "Python"
  bottom: "cls_prob_readonly"
  bottom: "bbox_pred_readonly"
  bottom: "rois"
  bottom: "labels"
  bottom: "bbox_targets"
  bottom: "bbox_inside_weights"
  bottom: "bbox_outside_weights"
  top: "rois_hard"
  top: "labels_hard"
  top: "bbox_targets_hard"
  top: "bbox_inside_weights_hard"
  top: "bbox_outside_weights_hard"
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  python_param {
    module: "roi_data_layer.layer"
    layer: "OHEMDataLayer"
    param_str: "'num_classes': 4"
  }
}
########## End ###########
## Readonly RoI Network ##
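
The read-only network above only does a forward pass; its outputs are used to rank RoIs by loss so that the trainable head sees the hardest ones. The following is a simplified sketch of that selection idea, not the code of `OHEMDataLayer`: `select_hard_rois` is a hypothetical name, the per-RoI box loss is passed in precomputed, and the NMS step that OHEM uses to drop near-duplicate RoIs before selection is omitted.

```python
import math


def select_hard_rois(cls_prob, labels, bbox_loss, batch_size):
    """Return the indices of the batch_size RoIs with the highest total loss.

    cls_prob[i][k] is the read-only network's probability of class k for RoI i,
    labels[i] is its ground-truth class and bbox_loss[i] its box-regression loss.
    """
    losses = []
    for i, (probs, label) in enumerate(zip(cls_prob, labels)):
        cls_loss = -math.log(max(probs[label], 1e-12))  # cross-entropy of the true class
        losses.append((cls_loss + bbox_loss[i], i))
    losses.sort(key=lambda pair: pair[0], reverse=True)  # hardest first
    return [index for _, index in losses[:batch_size]]


hard = select_hard_rois(
    cls_prob=[[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    labels=[0, 0, 1],
    bbox_loss=[0.0, 0.0, 0.1],
    batch_size=2,
)
print("selected RoIs:", hard)
```

RoI 0 is classified confidently and correctly, so it is dropped; the two RoIs the network currently gets wrong or is unsure about are the ones passed on as `rois_hard`.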
#===================== RCNN =============
layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "fire9/concat"
  bottom: "rois_hard"
  top: "roi_pool5"
  propagate_down: true
  propagate_down: false
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}
layer {
  name: "conv1_last"
  type: "Convolution"
  bottom: "roi_pool5"
  top: "conv1_last"
  param {
    lr_mult: 1.0
    name: "conv1_last_w"
  }
  param {
    lr_mult: 1.0
    name: "conv1_last_b"
  }
  convolution_param {
    num_output: 1000
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      mean: 0.0
      std: 0.01
    }
  }
}
layer {
  name: "relu/conv1_last"
  type: "ReLU"
  bottom: "conv1_last"
  top: "relu/conv1_last"
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "cls_score"
  param {
    lr_mult: 1
    name: "cls_score_w"
  }
  param {
    lr_mult: 2
    name: "cls_score_b"
  }
  inner_product_param {
    num_output: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "bbox_pred"
  param {
    lr_mult: 1
    name: "bbox_pred_w"
  }
  param {
    lr_mult: 2
    name: "bbox_pred_b"
  }
  inner_product_param {
    num_output: 16
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss_cls"
  type: "SoftmaxWithLoss"
  bottom: "cls_score"
  bottom: "labels_hard"
  propagate_down: true
  propagate_down: false
  top: "loss_cls"
  loss_weight: 1
}
layer {
  name: "loss_bbox"
  type: "SmoothL1Loss"
  bottom: "bbox_pred"
  bottom: "bbox_targets_hard"
  bottom: "bbox_inside_weights_hard"
  bottom: "bbox_outside_weights_hard"
  top: "loss_bbox"
  loss_weight: 1
  propagate_down: true   # was false; bbox_pred needs the gradient to train the regressor
  propagate_down: false
  propagate_down: false
  propagate_down: false
}
