Detect Globally, Refine Locally: A Novel Approach to Saliency Detection
Research Background
- Existing saliency detection approaches usually focus on how to effectively combine hierarchical features to encode a rich semantic representation that captures distinctive objectness and detailed boundaries simultaneously. However, it is often overlooked that directly applying concatenation or element-wise operations to different feature maps is sub-optimal, because some maps are too cluttered and may introduce misleading information.
Motivation and Proposed Approach
- This work first proposes a Recurrent Localization Network, which consists of a contextual weighting module (CWM) and a recurrent module (RM). The CWM adaptively weights the feature maps at each position based on a predicted spatial response map. The recurrent module gradually refines the predicted saliency map over ‘time’.
- This work adopts a Boundary Refinement Network (BRN) to recover detailed boundary information. For each pixel, BRN predicts an $n \times n$ coefficient map that indicates the relationship between the center point and its $n \times n$ neighbors.
- In summary, the contextual weighting module is organized as an inception-like module with 3x3, 5x5 and 7x7 convolutional kernels, followed by a concatenation and a convolution operation. The CWM generates a response map indicating the importance of each spatial position.
- For the feature map of each block, the recurrent module simultaneously utilizes both the current feed-forward input and the previous state of the same block.
- The boundary refinement network takes the current image and its saliency map as input and learns the propagation coefficients with several convolutional layers. The propagation coefficients are then used to refine the saliency map.
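The contextual weighting idea can be illustrated with a small pure-Python sketch. It is not the paper's implementation: `box_mean` is a hypothetical stand-in for the learned 3x3, 5x5 and 7x7 convolutions, the channel average and `fuse_weights` stand in for the concatenation plus convolution, and normalizing the response map with a spatial softmax is an assumption made here so that the weights sum to 1.

```python
import math

def box_mean(fmap, k):
    """Mean over a k x k window with edge replication.
    Assumption: a fixed stand-in for a learned k x k convolution."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ny = min(max(y + dy, 0), h - 1)  # clamp to the border
                    nx = min(max(x + dx, 0), w - 1)
                    total += fmap[ny][nx]
            out[y][x] = total / (k * k)
    return out

def response_map(features, fuse_weights=(1.0, 1.0, 1.0)):
    """features: C x H x W nested lists. Returns an H x W map summing to 1."""
    c, h, w = len(features), len(features[0]), len(features[0][0])
    # Collapse channels by averaging (assumption, for brevity).
    mean = [[sum(features[ch][y][x] for ch in range(c)) / c
             for x in range(w)] for y in range(h)]
    # Three context branches, mimicking the inception-like layout.
    branches = [box_mean(mean, k) for k in (3, 5, 7)]
    # Weighted sum as a stand-in for concatenation followed by a convolution.
    logits = [[sum(wt * b[y][x] for wt, b in zip(fuse_weights, branches))
               for x in range(w)] for y in range(h)]
    # Spatial softmax (assumption): one weight per position.
    peak = max(max(row) for row in logits)
    exps = [[math.exp(v - peak) for v in row] for row in logits]
    z = sum(sum(row) for row in exps)
    return [[e / z for e in row] for row in exps]

def apply_weights(features, resp):
    """Scale every channel at position (y, x) by resp[y][x]."""
    return [[[ch_map[y][x] * resp[y][x] for x in range(len(resp[0]))]
             for y in range(len(resp))] for ch_map in features]

# Toy input: two 4 x 4 channels with a bright 2 x 2 patch in the top-left.
patch = [[4.0 if (y < 2 and x < 2) else 0.0 for x in range(4)] for y in range(4)]
features = [patch, patch]
resp = response_map(features)
weighted = apply_weights(features, resp)
```

In this toy input, positions inside the bright patch receive larger weights than positions far from it, which is the per-position reweighting the CWM bullet describes.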
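The propagation step of the boundary refinement can be sketched as follows: every pixel of the refined map is a weighted sum of its n x n neighbors in the coarse saliency map, using that pixel's own coefficients. This is a minimal illustration under stated assumptions, not the authors' code: `coeff_logits` is a hypothetical placeholder for the output of the BRN's convolutional layers, the softmax normalization of the coefficients is assumed, and borders are handled by edge replication.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def propagate(saliency, coeff_logits, n=3):
    """Refine a saliency map with per-pixel n x n propagation coefficients.

    saliency:     H x W nested list of floats (coarse saliency map)
    coeff_logits: H x W nested list; each entry holds n*n raw scores
                  (hypothetical network output, row-major over the window)
    """
    h, w = len(saliency), len(saliency[0])
    r = n // 2
    refined = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            weights = softmax(coeff_logits[y][x])  # assumed normalization
            acc = 0.0
            k = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ny = min(max(y + dy, 0), h - 1)  # replicate the border
                    nx = min(max(x + dx, 0), w - 1)
                    acc += weights[k] * saliency[ny][nx]
                    k += 1
            refined[y][x] = acc
    return refined

# Toy 3 x 3 saliency map with a vertical edge.
sal = [[0.0, 1.0, 1.0] for _ in range(3)]
# Coefficients that put almost all mass on the center leave the map unchanged.
keep_center = [[[0.0] * 4 + [50.0] + [0.0] * 4 for _ in range(3)] for _ in range(3)]
unchanged = propagate(sal, keep_center)
# Uniform coefficients reduce to a 3 x 3 box filter.
uniform = [[[0.0] * 9 for _ in range(3)] for _ in range(3)]
blurred = propagate(sal, uniform)
```

The two toy cases show the range of behaviors: center-heavy coefficients preserve the input, while uniform coefficients average over the neighborhood. Learned coefficients would sit somewhere in between and vary from pixel to pixel.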
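Open questions:
- Is BRN similar to the spatial propagation network?
- The globally-detect stage already contains a recurrent refinement module, yet BRN is also applied for refinement. What is the point of having both?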
- How do the current work, 念神's CVPR 2018 paper, and the ICCV 2017 embedding work manage to learn an attention map for every point?
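- The time step of the recurrent module is set to only 2. Does it help much? This approach doubles the computation, but it should improve the results.
- In the refinement stage, the input image has a higher resolution, which improves the final result.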