[Paper note] PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

  • paper
  • Authors: Kye-Hyeon Kim, Sanghoon Hong, Byungseok Roh, Yeongjae Cheon, and Minje Park
  • Intel Imaging and Camera Technology

Highlight

  • Speeds up detection (to real time) with a more efficient feature-extraction CNN, without losing too much accuracy.
  • The network is smaller and more efficient than ResNet and can serve as a substitute for it.

Main structure

  • C.ReLU: Concatenated ReLU, used in the early stages of the CNN to reduce the amount of computation.
    • In early stages, output nodes tend to be paired, i.e. one node’s activation is the opposite of another’s.
    • C.ReLU halves the number of output channels of the convolution and restores them by concatenating the outputs with their negation.
  • Inception
    • Has not yet been widely applied.
    • Cost-efficient building block for multi-scale representation.
    • 1x1 Conv can preserve the receptive field of the previous layer.
  • Multi-scale representation like HyperNet
    • Concatenate the output of the last layer and of two intermediate layers, whose sizes are 2x and 4x that of the last layer.
    • Use the 2x layer as the reference scale: down-scale (pooling) the 4x layer and up-scale (interpolation) the last layer.
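
The two ideas above, C.ReLU and the HyperNet-style multi-scale concatenation, can be sketched in plain Python. This is only an illustration of the data flow, not the paper's implementation: the function names are hypothetical, feature maps are nested lists, 2x2 max pooling is assumed for the down-scaling, and nearest-neighbour repetition stands in for the interpolation step. Any learned parts of the real layers are left out.

```python
def crelu(channels):
    """C.ReLU sketch: append the negation of every channel, then apply ReLU.

    `channels` is a list of channels, each a flat list of activations.
    The output has twice as many channels as the input.
    """
    negated = [[-v for v in ch] for ch in channels]
    return [[max(0.0, v) for v in ch] for ch in channels + negated]


def max_pool_2x2(fmap):
    """Down-scale a 2D map (list of rows, even sizes) by 2 with 2x2 max pooling."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def upsample_2x(fmap):
    """Up-scale a 2D map by 2 (nearest neighbour, assumed here for simplicity)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def multi_scale_concat(feat_4x, feat_2x, feat_1x):
    """Bring all inputs to the 2x reference scale and concatenate the channels.

    Each argument is a list of 2D channels; feat_4x is four times and
    feat_2x twice the spatial size of feat_1x (the last layer).
    """
    pooled = [max_pool_2x2(ch) for ch in feat_4x]
    upsampled = [upsample_2x(ch) for ch in feat_1x]
    return pooled + feat_2x + upsampled


# C.ReLU: 2 input channels become 4 output channels.
paired = crelu([[1.0, -2.0], [-0.5, 3.0]])

# Multi-scale concatenation: one channel per scale, all brought to 2x2.
fused = multi_scale_concat(
    feat_4x=[[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]],
    feat_2x=[[[1, 2], [3, 4]]],
    feat_1x=[[[7]]],
)
```

With C.ReLU the preceding convolution only needs to compute half of the channels, which is where the saving comes from; the concatenation itself is cheap.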

Experiment

  • Training details
    • Add batch normalization layers before all ReLU layers.
    • Plateau-detection-based learning rate policy.
      • Measure the moving average of the loss.
      • Treat training as on-plateau if the improvement of this average is below a threshold.
      • Decrease the learning rate by a constant factor when on-plateau.
    • Number of proposals = 200
  • VOC2007 & VOC2012 performance
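
The plateau-based learning rate policy from the training details can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: the class name `PlateauLR`, the window size, the threshold, and the decay factor are all hypothetical, and the moving average is approximated by the mean loss over consecutive fixed-size windows.

```python
class PlateauLR:
    """Decay the learning rate when the averaged loss stops improving."""

    def __init__(self, lr, factor=0.5, window=2, threshold=0.05):
        self.lr = lr                # current learning rate
        self.factor = factor        # constant decay factor (assumed value)
        self.window = window        # number of losses per average (assumed)
        self.threshold = threshold  # minimum improvement to count as progress
        self.buffer = []
        self.prev_avg = None

    def step(self, loss):
        """Record one loss value and return the learning rate to use next."""
        self.buffer.append(loss)
        if len(self.buffer) < self.window:
            return self.lr
        avg = sum(self.buffer) / self.window
        self.buffer = []
        # On-plateau: the average improved by less than the threshold.
        if self.prev_avg is not None and self.prev_avg - avg < self.threshold:
            self.lr *= self.factor
        self.prev_avg = avg
        return self.lr


# Made-up loss curve: fast progress first, then a plateau.
scheduler = PlateauLR(lr=0.1)
losses = [1.0, 0.8, 0.6, 0.5, 0.54, 0.52, 0.53, 0.52]
history = [scheduler.step(loss) for loss in losses]
```

In this made-up run the learning rate stays at its initial value while the windowed average keeps dropping, and is halved each time two consecutive windows differ by less than the threshold.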