Main Contributions:
- a sub-region attention map for each RoI
- an aspect ratio attention map for each RoI
RoI feature extractor
Adopt a 1 × 1 convolutional layer to reduce the channel number to Cs, then pool the compacted RoI features.
Generate an (Nsr · Cs)-d sub-region attention bank for the entire image with a group of specially designed shifted convolutional layers.
Classify RoIs of different aspect ratios into Nar categories (Nar = 3, as demonstrated in Figure 3), then generate an (Nar · Cs)-d aspect ratio attention bank.
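The aspect ratio classification step can be illustrated with a minimal sketch. The notes only state that Nar = 3; the width/height thresholds `low` and `high`, the category order, and the function name `aspect_ratio_category` are assumptions made here for illustration, not values from the paper.

```python
def aspect_ratio_category(x1, y1, x2, y2, low=0.67, high=1.5):
    """Map an RoI box to one of three aspect ratio categories.

    Returns 0 (tall), 1 (roughly square) or 2 (wide).
    The thresholds `low` and `high` are hypothetical.
    """
    # Guard against degenerate boxes with zero width or height.
    w = max(x2 - x1, 1e-6)
    h = max(y2 - y1, 1e-6)
    ratio = w / h
    if ratio < low:
        return 0
    if ratio > high:
        return 2
    return 1
```

Under this assumption, the category index would select which group of Cs channels in the aspect ratio attention bank applies to a given RoI.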
Sub-Region Attention Bank
Shifted Convolution
- special cases of deformable convolutions
- the 2D offsets of shifted convolutional layers are fixed to (1, 1), (1, 0), (1, −1), ..., (−1, −1), respectively.
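One possible reading of the shifted convolution is sketched below in NumPy: each layer applies a 1 × 1 kernel, but samples the input at a fixed offset from the output position. The 1 × 1 kernel size, the zero padding at the border, the full 3 × 3 enumeration of offsets (hence Nsr = 9), and the names `shift` and `shifted_conv_bank` are assumptions for illustration; the notes only give the first and last offsets.

```python
import numpy as np

# Assumed enumeration of the fixed 2D offsets, in the order the notes list them:
# (1, 1), (1, 0), (1, -1), ..., (-1, -1).
OFFSETS = [(dy, dx) for dy in (1, 0, -1) for dx in (1, 0, -1)]


def shift(feat, dy, dx):
    """Return s with s[:, y, x] = feat[:, y + dy, x + dx], zero outside the map."""
    _, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)))  # zero padding by default
    return padded[:, 1 + dy:1 + dy + h, 1 + dx:1 + dx + w]


def shifted_conv_bank(feat, weights):
    """Build a sub-region attention bank from a (C, H, W) feature map.

    weights has shape (len(OFFSETS), Cs, C): one 1x1 kernel per offset.
    Returns an array of shape (len(OFFSETS) * Cs, H, W).
    """
    outs = []
    for (dy, dx), wgt in zip(OFFSETS, weights):
        # A 1x1 convolution is a per-position matrix multiply over channels.
        outs.append(np.einsum("oc,chw->ohw", wgt, shift(feat, dy, dx)))
    return np.concatenate(outs, axis=0)
```

With identity kernels, the group of channels for offset (0, 0) reproduces the input, while the other groups are copies of the input displaced by one pixel, which is the sense in which each group looks at a different neighbouring sub-region.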
Aspect Ratio Attention Bank
- A 1 × 1 convolutional layer is placed on the convolutional feature map to get the aspect-ratio-aware components of each spatial position.
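As a rough sketch of this step, a 1 × 1 convolution can be written as a per-position matrix multiply over channels. The split of the Nar · Cs output channels into Nar consecutive groups of Cs channels, and the names `conv1x1` and `aspect_ratio_bank`, are assumptions for illustration rather than details given in these notes.

```python
import numpy as np


def conv1x1(feat, weight, bias=None):
    """Apply a 1x1 convolution to a (C, H, W) map with a (C_out, C) weight."""
    out = np.einsum("oc,chw->ohw", weight, feat)
    if bias is not None:
        out = out + bias[:, None, None]
    return out


def aspect_ratio_bank(feat, weight, n_ar, c_s):
    """Return the bank reshaped to (n_ar, c_s, H, W).

    Assumes the output channels are laid out as n_ar consecutive
    groups of c_s channels, one group per aspect ratio category.
    """
    bank = conv1x1(feat, weight)  # shape (n_ar * c_s, H, W)
    return bank.reshape(n_ar, c_s, *feat.shape[1:])
```

Indexing the first axis with an RoI's aspect ratio category would then pick out the Cs-d components relevant to that RoI at every spatial position.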
Attention Maps
Selective RoI Pooling