DSOD: Learning Deeply Supervised Object Detectors from Scratch

Key Problems

  • Limited structure design space: reusing a pre-trained backbone constrains the detector's architecture.
  • Learning bias
    • Because the loss functions and category distributions of classification and detection differ, the two tasks have different optimization spaces. Learning may therefore be biased toward a local minimum that is not optimal for the detection task.
  • Domain mismatch
  • State-of-the-art object detectors rely heavily on off-the-shelf networks pre-trained on large-scale classification datasets such as ImageNet.
  • Transferring pre-trained models from classification to detection across discrepant domains is even more difficult.

Architecture

(architecture figures omitted)

Principles

  • Training a detection network from scratch requires a proposal-free framework.
  • Deep Supervision
    • Transition w/o Pooling Layer: introduced to increase the number of dense blocks without reducing the final feature-map resolution.
  • Stem Block
    • The stem block reduces the information loss from raw input images.
  • Dense Prediction Structure
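The resolution bookkeeping behind these principles can be sketched with plain size arithmetic. This is a minimal illustration of why a transition layer without pooling preserves feature-map resolution while a standard pooling transition halves it; the specific kernel/stride choices in the stem below are assumptions for illustration, not the paper's exact configuration.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a conv/pool layer (floor formula)."""
    return (size + 2 * pad - kernel) // stride + 1

def stem(size):
    # Hypothetical DSOD-like stem: one stride-2 3x3 conv, two stride-1
    # 3x3 convs, then a 2x2 stride-2 max-pool (downsamples input by 4x).
    size = conv_out(size, 3, 2, 1)
    size = conv_out(size, 3, 1, 1)
    size = conv_out(size, 3, 1, 1)
    return conv_out(size, 2, 2, 0)

def transition(size, with_pooling):
    # A 1x1 conv never changes resolution; only the optional
    # 2x2 stride-2 pool halves the feature map.
    size = conv_out(size, 1, 1, 0)
    return conv_out(size, 2, 2, 0) if with_pooling else size

s = stem(300)
print(s)                    # 75: 300x300 input after the stem
print(transition(s, True))  # 37: pooling transition halves the map
print(transition(s, False)) # 75: transition w/o pooling keeps resolution
```

Because the "w/o pooling" transition is resolution-neutral, extra dense blocks can be stacked behind it without shrinking the final feature map — which is the stated purpose of the layer.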

Contributions

  • DSOD is a simple yet efficient framework that can learn object detectors from scratch.
  • DSOD is fairly flexible, so that we can tailor various network structures for different computing platforms such as server, desktop, mobile and even embedded devices.
  • We present DSOD, to the best of our knowledge the first framework that can train object detection networks from scratch with state-of-the-art performance.
  • We introduce and validate a set of principles to design efficient object detection networks from scratch through step-by-step ablation studies.
  • We show that DSOD achieves state-of-the-art performance on three standard benchmarks (PASCAL VOC 2007, 2012, and MS COCO) with real-time processing speed and more compact models.

Experiments

(experiment result figures omitted)

Others

  • A well-designed network structure can outperform state-of-the-art solutions without using pre-trained models.
  • only the proposal-free method (the 3rd category) can converge successfully without the pre-trained models.
    • RoI pooling generates features for each region proposal, which hinders gradients from being smoothly back-propagated from the region level to the convolutional feature maps.
    • Proposal-based methods work well with pre-trained models because the parameter initialization is good for the layers before RoI pooling; this does not hold when training from scratch.