READING NOTE: SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

最新推荐文章于 2024-05-14 22:04:28 发布

Joshua_Li_

最新推荐文章于 2024-05-14 22:04:28 发布

阅读量1.6k

点赞数

分类专栏：计算机视觉

本文链接：https://blog.csdn.net/joshua_1988/article/details/51009744

版权

计算机视觉专栏收录该内容

72 篇文章 0 订阅

订阅专栏

TITLE: SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

AUTHER: Shengfeng He, Rynson W.H. Lau, Wenxi Liu, Zhe Huang, Qingxiong Yang Reid, Ian

FROM: IJCV2015

CONTRIBUTIONS

A novel superpixel-wise convolutional neural network approach is proposed.
Two kinds of sequence code are designed as input feature to CNN.

METHOD

Superpixels are extracted via some methods such as oversegmentation.
Extract Color Uniqueness Sequences (CU) for each superpixels to describe the color contrast between regions.
Extract Color Distribution Sequences (CD) for each superpixels to measure the color compactness of colors.
The two sequences are fed into a CNN to generate two saliency maps.
A regressor is used to merge the two predicted saliency maps

SOME DETAILS

Color Uniqueness Sequence is used to describe the color contrast of a Region. Given an image $I$ and the superpixels or regions $R=\lbrace r_{1},...,r_{x},...,r_{N} \rbrace$ , each region $r_{x}$ contains a color uniqueness sequence $Q_{x}^{C} = \lbrace q_{1}^{c},...,q_{j}^{c},...,q_{N}^{c} \rbrace$ . Each element, $q_{x}^{c}$ is defined as

q c x = t (r j) \cdot | C (r x) - C (r j) | \cdot w (P (r x), P (r j))

$q_{x}^{c} = t(r_{j})\cdot \vert C(r_{x})-C(r_{j}) \vert \cdot w(P(r_{x}),P(r_{j}))$

where $t(r_{j})$ counts the total number of pixels in region $r_{j}$ . $\vert C(r_{x})-C(r_{j}) \vert$ is a 3D vector storing the absolute differences of each color channel. $P(r_{x}$ is the mean position of region $r_{j}$ and $w(P(r_{x}),P(r_{j}))$ is defined as

w (P (r x), P (r j)) = e x p (- 1 2 σ 2 s ∥ P (r x) - P (r j) ∥ 2)

$w(P(r_{x}),P(r_{j}))=exp(- \frac{1}{2\sigma_{s}^{2}}\Vert P(r_{x})-P(r_{j}) \Vert^{2} )$

The sequence $Q_{x}^{C}$ is sorted by the spatial distance to region $r_{x}$ .

Color Distribution Sequence is a sequence $Q_{x}^{D} = \lbrace q_{1}^{d},...,q_{j}^{d},...,q_{N}^{d} \rbrace$ with the element $q_{j}^{d}$ defined as:

q d j = t (r j) \cdot | P (r x) - P (r j) | \cdot w (C (r x), C (r j))

$q_{j}^{d} = t(r_{j})\cdot \vert P(r_{x})-P(r_{j}) \vert \cdot w(C(r_{x}),C(r_{j}))$

where

w (C (r x), C (r j)) = e x p (- 1 2 σ 2 s ∥ C (r x) - C (r j) ∥ 2)

$w(C(r_{x}),C(r_{j}))=exp(- \frac{1}{2\sigma_{s}^{2}}\Vert C(r_{x})-C(r_{j}) \Vert^{2} )$

the sequence is also sorted by the spatial distance.

Network Structure is briefly illustrated as below:

Saliency Inference is first to get the $N$ predicted saliency scores of the $N$ regions. Because of the two kinds of sequences, two sets of scores $S_{1} and S_{2}$ are predicted. The final saliency map can be obtained by:

S (r x) = \prod u \in {1, 2} v u c \cdot S u (r x)

$S(r_{x})=\prod_{u\in \lbrace 1,2\rbrace}v_c^{u} \cdot S_{u}(r_{x})$

ADVANTAGES

It is fast when infering.
Large context are encoded in the sequences.

DISADVANTAGES

The CNN is of a very light-weight structure. Deeper network may provide better performance.
As the sequences are used to describe contrast information, which may lead to failure with the foreground and background having similar colors.

Joshua_Li_

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
READING NOTE: SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection
复制链接

扫一扫

专栏目录