Reading Note: Deformable Part-based Fully Convolutional Network for Object Detection


TITLE: Deformable Part-based Fully Convolutional Network for Object Detection

AUTHOR: Taylor Mordan, Nicolas Thome, Matthieu Cord, Gilles Henaff

FROM: arXiv:1707.06175

CONTRIBUTIONS

  1. The Deformable Part-based Fully Convolutional Network (DP-FCN) is proposed: an end-to-end model that integrates ideas from DPM (Deformable Part-based Models) into region-based deep ConvNets for object detection.
  2. A new deformable part-based RoI pooling layer is introduced. It explicitly selects discriminative elements of objects around region proposals by simultaneously optimizing the latent displacements of all parts.
  3. A deformation-aware localization module is designed, which exploits the configuration of the parts to refine localization.

METHOD

R-FCN is the work closest to DP-FCN. Both build on Faster R-CNN, in which an RPN (Region Proposal Network) generates object proposals and a dedicated pooling layer extracts features for classification and localization. The architecture of DP-FCN is illustrated in the following figure. A deformable part-based RoI pooling layer follows a fully convolutional network (FCN) backbone, and two branches then predict category and location respectively. The output of the backbone FCN is similar to that of R-FCN: it has k²(C+1) channels, corresponding to k×k parts and C categories plus background.
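The channel count above can be checked with a few lines of Python. This is only a sketch: the helper names `num_score_maps` and `score_map_index` are hypothetical, and the part-major channel ordering is an assumption, since the note does not say how the channels are laid out.

```python
def num_score_maps(k: int, num_classes: int) -> int:
    """Number of backbone output channels: one score map per (part, class)
    pair, where the class axis holds C categories plus background."""
    return k * k * (num_classes + 1)


def score_map_index(i: int, j: int, c: int, k: int, num_classes: int) -> int:
    """Channel index of the score map for part (i, j) and class c.

    ASSUMPTION: part-major ordering (all classes of part (0, 0) first,
    then part (0, 1), and so on). The real layout may differ.
    """
    if not (0 <= i < k and 0 <= j < k and 0 <= c <= num_classes):
        raise ValueError("part or class index out of range")
    return (i * k + j) * (num_classes + 1) + c
```

For example, with the illustrative values k = 7 and C = 20, `num_score_maps(7, 20)` gives 49 × 21 = 1029 channels.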

Figure: DP-FCN architecture

Deformable part-based RoI pooling

For each input channel, as in DPM, a transformation spreads high responses to nearby locations while taking the deformation costs into account.
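To make the idea of spreading responses concrete, here is a brute-force sketch on a single score map. It assumes a quadratic deformation cost `lam * (dx² + dy²)` and a square search window of radius `max_disp`. The function name and both parameters are illustrative, not taken from the paper's implementation.

```python
def spread_responses(score_map, max_disp=1, lam=1.0):
    """Each output cell takes the best response found within max_disp cells,
    minus a deformation cost that grows with the distance moved.

    ASSUMPTIONS: quadratic cost lam * (dx^2 + dy^2), square search window.
    """
    h, w = len(score_map), len(score_map[0])
    out = [[float("-inf")] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        # Response at the displaced location, penalised by distance.
                        cand = score_map[yy][xx] - lam * (dx * dx + dy * dy)
                        if cand > out[y][x]:
                            out[y][x] = cand
    return out
```

A single strong peak then leaks into its neighbours with a lower score the further away they are, which is what lets a part "find" a nearby high response.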

Figure: Deformable part-based RoI pooling

In my understanding, the output of the RPN plays the role of the root filter in DPM. Each region proposal is evenly divided into k×k sub-regions, one per part, and each sub-region is then allowed to shift, subject to a deformation cost. The displacements computed during the forward pass are stored and used to backpropagate gradients at the same locations.
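The pooling step described above might look roughly like the following for one class. This is a simplified sketch, not the authors' code. It assumes the RoI is given in feature-map cells, lies inside the map, and has sides divisible by k. It also assumes average pooling inside each bin and a quadratic deformation cost. All names are hypothetical.

```python
def avg_in_window(fmap, top, left, bin_h, bin_w):
    """Average of fmap over the bin_h x bin_w window starting at (top, left)."""
    total = 0.0
    for y in range(top, top + bin_h):
        for x in range(left, left + bin_w):
            total += fmap[y][x]
    return total / (bin_h * bin_w)


def deformable_part_pool(part_maps, roi, k, max_disp=1, lam=0.1):
    """part_maps[i][j] is the 2-D score map of part (i, j) for one class.
    roi = (x0, y0, x1, y1) in feature-map cells, x1 and y1 exclusive.
    Returns (pooled, disps): k x k pooled scores and the (dx, dy) chosen per part.
    """
    x0, y0, x1, y1 = roi
    bin_h, bin_w = (y1 - y0) // k, (x1 - x0) // k
    if bin_h < 1 or bin_w < 1:
        raise ValueError("RoI too small for a k x k grid")
    pooled = [[0.0] * k for _ in range(k)]
    disps = [[(0, 0)] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            fmap = part_maps[i][j]
            h, w = len(fmap), len(fmap[0])
            top, left = y0 + i * bin_h, x0 + j * bin_w
            # Start from the undeformed bin; it wins ties.
            best = avg_in_window(fmap, top, left, bin_h, bin_w)
            best_d = (0, 0)
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    t, l = top + dy, left + dx
                    if t < 0 or l < 0 or t + bin_h > h or l + bin_w > w:
                        continue  # shifted bin would leave the map
                    score = avg_in_window(fmap, t, l, bin_h, bin_w) - lam * (dx * dx + dy * dy)
                    if score > best:
                        best, best_d = score, (dx, dy)
            pooled[i][j] = best
            disps[i][j] = best_d  # kept so gradients can be routed to the same cells
    return pooled, disps
```

The stored `disps` correspond to the displacements the note mentions: in a real layer they would tell the backward pass which shifted cells receive the gradient.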

Classification and localization predictions with deformable parts

Predictions are made by two sibling branches, one for classification and one for relocalization of region proposals, as is common practice. The classification branch simply consists of average pooling followed by a softmax layer.
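As a rough illustration of the classification branch, the sketch below averages the pooled part scores of each class and applies a softmax. The input layout (one list of k×k part scores per class, background at index 0) and the function name `classify` are assumptions made for the example.

```python
import math


def classify(part_scores):
    """part_scores[c] holds the k*k pooled part scores for class c
    (ASSUMPTION: index 0 is background). Returns class probabilities."""
    # Average pooling over parts gives one logit per class.
    logits = [sum(parts) / len(parts) for parts in part_scores]
    # Numerically stable softmax.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The predicted class is simply the index of the largest probability.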

Deformation-aware localization refinement

For location prediction, each part predicts 4 values. In addition, the displacements are fed through two fully connected layers, and the result is element-wise multiplied with those first values to yield the final localization output for this class.
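One way to picture this refinement for a single class is sketched below. Several details are assumptions not stated in this note: the per-part box values are averaged over parts to get the "first values", the displacements are flattened into one vector of length 2k², and the hidden layer uses a ReLU. The weights `w1`, `b1`, `w2`, `b2` stand in for learned parameters.

```python
def linear(x, weights, bias):
    """Plain fully connected layer: weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def refine_localization(part_boxes, disps, w1, b1, w2, b2):
    """part_boxes: one list of 4 box-regression values per part.
    disps: one (dx, dy) displacement per part, from the pooling layer.
    w1/b1 and w2/b2: hypothetical weights of the two fully connected layers
    (shapes: hidden x 2k^2 and 4 x hidden).
    """
    n = len(part_boxes)
    # ASSUMPTION: the "first values" are the per-part predictions averaged over parts.
    first = [sum(box[t] for box in part_boxes) / n for t in range(4)]
    # Flatten all displacements into a single feature vector.
    feat = [float(v) for d in disps for v in d]
    hidden = [max(0.0, h) for h in linear(feat, w1, b1)]  # ReLU is an assumption
    gate = linear(hidden, w2, b2)
    # Element-wise product: the part configuration rescales the box prediction.
    return [f * g for f, g in zip(first, gate)]
```

With the second layer's weights at zero and its biases at one, the gate is all ones and the output reduces to the plain averaged prediction, so the deformation branch acts as a learned correction on top of it.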
