Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion

Notes on the paper *Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion*.

1 Abstract

In this paper, the authors propose a new CNN, called Spindle Net, which consists of two parts: the Feature Extraction Net (FEN) and the Feature Fusion Net (FFN). The unique advantages of this network are: 1) it separately extracts semantic features from different body regions to capture both macro and micro body features; 2) it fuses these region features with a competitive scheme, so that discriminative features are well preserved.

2 Related work

Figure 1. The challenges of ReID. (a-b) show that we cannot judge two images by position information alone, which may lead to ambiguities. (c) shows the importance of detail information: these two images are similar but belong to different identities. (d) The occlusion problem.
The contributions of this paper include:

  1. It is the first time human body structure information is considered in a ReID pipeline, which allows describing many more features about different body regions and their detail information
  2. It designs a Spindle Net to handle the ReID task
  3. It proposes a new ReID evaluation dataset, i.e. SenseReID

3 Body Region Proposal Network

Here, the authors use a Region Proposal Network (RPN) together with a ReID pipeline to extract human body features. If you are not familiar with RPNs, you can refer to this link: RPN detail (Faster RCNN). Given an input image, the RPN generates seven rectangular region proposals representing seven sub-regions of the person's body: the head-shoulder region, the upper body region, the lower body region, two arm regions, and two leg regions.

  1. Body joint localization:
    A CNN outputs 14 response maps $F_i \in \mathbb{R}^{X \times Y}$, where $i \in [1, 14]$ and $X$, $Y$ denote the size of the feature map. The position of the largest value in each response map is taken as the final joint position:
    $$p_i = [x_i, y_i] = \arg\max_{x \in [1, X],\, y \in [1, Y]} F_i(x, y) \tag{1}$$
  2. Body region generation:
    The RPN produces 7 body sub-regions. Each sub-region corresponds to a body joint set $S \in \{S_1^A, S_2^A, S_3^A, S_1^B, S_2^B, S_3^B, S_4^B\}$ (14 joints in total) and a bounding box $\beta \in \{\beta_1^A, \beta_2^A, \beta_3^A, \beta_1^B, \beta_2^B, \beta_3^B, \beta_4^B\}$. Each bounding box $\beta$ is computed from its joint set as:
    $$\beta = [x_{min}, x_{max}, y_{min}, y_{max}] = \left[\min_{i \in S}(x_i), \max_{i \in S}(x_i), \min_{i \in S}(y_i), \max_{i \in S}(y_i)\right] \tag{2}$$
    Figure 2. Illustration of the Region Proposal Network. (a) One sample image and the fourteen body joints. (b) The fourteen body joints are assigned to seven sets. (c) The seven body sub-regions proposed by the RPN from the corresponding body joint sets.
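The two equations above can be sketched in a few lines of NumPy. This is a toy illustration, not the paper's implementation: the response maps are random, and the joint indices used for the example joint set are hypothetical.

```python
import numpy as np

def localize_joints(response_maps):
    """Eq. (1): take the argmax position of each response map
    as the predicted joint location p_i = [x_i, y_i]."""
    joints = []
    for F_i in response_maps:  # each F_i has shape (X, Y)
        x, y = np.unravel_index(np.argmax(F_i), F_i.shape)
        joints.append((x, y))
    return joints

def region_bbox(joints, joint_set):
    """Eq. (2): bounding box of one sub-region as the min/max of
    the x and y coordinates over its joint set S."""
    xs = [joints[i][0] for i in joint_set]
    ys = [joints[i][1] for i in joint_set]
    return min(xs), max(xs), min(ys), max(ys)

# toy example: 14 random response maps on a 24x24 grid
rng = np.random.default_rng(0)
maps = [rng.random((24, 24)) for _ in range(14)]
joints = localize_joints(maps)
# hypothetical joint set for one sub-region (indices are illustrative)
bbox = region_bbox(joints, [0, 1, 2])
```

In the real network the response maps come from the pose-estimation CNN rather than random noise, but the argmax and min/max logic is the same.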

4 Body Region Guided Spindle Net

Figure 3. The Spindle Net architecture. The left part is the Feature Extraction Net (FEN) and the right part is the Feature Fusion Net (FFN).

4.1 Feature Extraction Network (FEN)

As the left part of the Spindle Net pipeline above shows, the FEN is a multi-stage network composed of two ROI pooling stages and three CNN stages. Sub-region features are extracted at different stages: three macro region features after FEN-C1, and four micro region features after FEN-C2. Specifically, given an input image (96×96), FEN-C1 produces a full-body feature map $F_0^{c1}$; the three macro sub-regions proposed by the RPN are then pooled from $F_0^{c1}$ to obtain the corresponding feature maps. At the FEN-C2 stage, the same method generates the four micro sub-region feature maps. The FEN-C3 stage outputs eight 256-dimensional feature vectors.
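The ROI pooling step in this stage can be sketched as follows. This is a minimal stand-in, not the paper's operator: it crops a bounding box from a feature map and average-pools it into a fixed grid, and all shapes are toy assumptions.

```python
import numpy as np

def roi_pool(fmap, bbox, size=4):
    """Toy ROI pooling: crop bbox = (xmin, xmax, ymin, ymax) from a
    (C, H, W) feature map and average-pool it to (C, size, size)
    by uniform binning of the cropped region."""
    xmin, xmax, ymin, ymax = bbox
    region = fmap[:, ymin:ymax + 1, xmin:xmax + 1]
    c, h, w = region.shape
    ys = np.linspace(0, h, size + 1).astype(int)
    xs = np.linspace(0, w, size + 1).astype(int)
    out = np.zeros((c, size, size))
    for i in range(size):
        for j in range(size):
            cell = region[:, ys[i]:max(ys[i + 1], ys[i] + 1),
                             xs[j]:max(xs[j + 1], xs[j] + 1)]
            out[:, i, j] = cell.mean(axis=(1, 2))
    return out

fmap = np.random.rand(32, 24, 24)        # stand-in for a full-body map after FEN-C1
macro = roi_pool(fmap, (0, 11, 0, 23))   # e.g. one macro sub-region bounding box
```

Each sub-region feature map produced this way is then fed through the remaining shared CNN stages to yield its 256-dimensional feature vector.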
Figure 4. Two examples to demonstrate the effectiveness of the proposed sub-region features. (a) Two images of the same person. (b) Corresponding feature maps after FEN-C1. (c) Feature maps after FEN-P1. (d) Two similar persons. (e) Corresponding feature maps after FEN-C1. (f) Feature maps after FEN-P1. The average L2 distances between feature maps are also listed.

4.2 Feature Fusion Network (FFN)

FFN is shown in the right part of the Spindle Net architecture above. It is also a multi-stage network. A fusion unit has two steps: 1) it first merges the sub-region feature maps with an element-wise maximization operation; 2) it then applies a feature transformation through an inner-product (fully connected) layer. A tree-structured fusion strategy is used to fuse the different sub-region feature maps, which can be seen clearly in the network pipeline.
Figure 5. Illustration of feature fusion. Feature entries are sorted for better visualization. (a) Input image. (b-d) Three input feature vectors of the body fusion unit. The features of the head-shoulder region, the upper body region, and lower body region are marked in red, green and blue, respectively. (f) Result of the max operation. The head-shoulder features win 46.1% of the competition, much more than the other two region features in green and blue.
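The two steps of a fusion unit can be sketched as below. This is a minimal NumPy sketch under assumed shapes (256-d features, random weights), not the trained network; the "win" fraction it computes mirrors the competition statistic illustrated in Figure 5.

```python
import numpy as np

def fusion_unit(features, W, b):
    """One fusion unit as described above:
    1) element-wise max across the input sub-region feature vectors
       (the competitive scheme),
    2) feature transformation by an inner-product (fully connected) layer."""
    merged = np.maximum.reduce(features)  # element-wise max, shape (256,)
    return W @ merged + b                 # linear transformation

rng = np.random.default_rng(1)
head, upper, lower = (rng.standard_normal(256) for _ in range(3))
W, b = rng.standard_normal((256, 256)), np.zeros(256)
body = fusion_unit([head, upper, lower], W, b)

# fraction of entries "won" by the head-shoulder feature in the max competition
wins = np.mean(np.maximum.reduce([head, upper, lower]) == head)
```

Because the max operation keeps, for each entry, only the strongest response among the competing regions, discriminative entries from any one region survive the fusion, which is the property the paper relies on.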
