2018-ECCV-Mancs-A Multi-task Attentional Network with Curriculum Sampling

_Xiaobo

Categories: Paper Notes, Person Re-identification. Tags: person re-identification, re-id, attention, multi-task, triplet

Article link: https://blog.csdn.net/weixin_41427758/article/details/83024571

Paper link

Motivation

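Existing Re-ID methods all face the following problems:
- choosing the loss function
- the misalignment problem
- finding highly discriminative local features
- how to sample training data when optimizing a rank loss
Most current work tackles only one or two of these problems. Can a single unified framework address all of them?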

Contribution

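Proposes the Mancs framework to address the above problems in a unified way.
Proposes a fully attentional block with deep supervision and curriculum sampling, which improve the model's feature extraction and its training respectively (both can be borrowed by other work).
The proposed method achieves SOTA results on three public datasets.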

1 Introduction

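Definition, significance, and difficulties of Re-ID
Research directions:
- person feature representation
- distance metric learning: suffers from an imbalance between positive and negative samples, so it usually places high demands on the sampling method
Motivation and contributions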

2 Related Work

Attention Network
- MSCAN
- HA-CNN
- CAN
Metric Learning
- triplet loss ==> online hard example mining (OHEM)
- contrastive loss
Multi-task learning
- triplet loss + softmax
- this paper: triplet loss + focal loss

3 Method

3.1 Training Architecture

As illustrated in the paper's architecture figure, the network consists of three main parts:
- backbone network (ResNet50) ==> a multi-scale feature extractor
- attention module ==> attention mask
- loss function: attention loss + triplet loss + focal loss

3.2 Fully Attentional Block

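Borrows from the SE block and modifies its structure:
- Problem with the SE block: global average pooling (GAP) discards spatial structure ==> this paper removes the pooling layer and replaces the fully connected layers with 1x1 convolutions to preserve spatial information
Attention map:
$M = \mathrm{Sigmoid}(\mathrm{Conv}(\mathrm{ReLU}(\mathrm{Conv}(F_i))))$
Output feature map obtained from the attention map:
$F_o = F_i * M + F_i$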
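
The two formulas above can be traced on a tiny feature map. The following is a minimal pure-Python sketch, not the authors' implementation: the helper names (`conv1x1`, `fully_attentional_block`), the bias-free convolutions, and the SE-style channel reduction in the first convolution are assumptions made for illustration.

```python
import math

def conv1x1(x, weight):
    """1x1 convolution without bias: x is [C_in][H][W], weight is [C_out][C_in]."""
    height, width = len(x[0]), len(x[0][0])
    return [[[sum(w_c * x[c][i][j] for c, w_c in enumerate(row))
              for j in range(width)]
             for i in range(height)]
            for row in weight]

def apply_elementwise(fn, x):
    """Apply a scalar function to every entry of a [C][H][W] nested list."""
    return [[[fn(v) for v in line] for line in channel] for channel in x]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fully_attentional_block(f_in, w_reduce, w_expand):
    """Compute M = Sigmoid(Conv(ReLU(Conv(F_i)))) and F_o = F_i * M + F_i."""
    hidden = apply_elementwise(lambda v: max(v, 0.0), conv1x1(f_in, w_reduce))
    mask = apply_elementwise(sigmoid, conv1x1(hidden, w_expand))
    # The mask has the same [C][H][W] shape as the input, so spatial layout is kept.
    f_out = [[[v * m + v for v, m in zip(f_line, m_line)]
              for f_line, m_line in zip(f_ch, m_ch)]
             for f_ch, m_ch in zip(f_in, mask)]
    return f_out, mask
```

Because no pooling is involved, every spatial position gets its own mask value per channel, which is the difference from the SE block noted above.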

3.3 ReID Task #1: Triplet loss with curriculum sampling

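Compared with a classification loss, a ranking loss generalizes better when the amount of data is small.
Rank branch: shared backbone + a pooling layer + an FC layer
Sampling: OHEM always selects the hardest samples for each parameter update, which easily causes the model to collapse during training ==> curriculum sampling (from easy triplets to hard triplets)
- For an anchor $I_i^a$, first randomly select a positive $I_i^p$
- Sort the negatives by their distance to the anchor in ascending order (hard --> easy)
- Select a negative according to a Gaussian distribution $\mathcal{N}(\mu, \sigma)$ over the sorted positions, where $N_n$ is the number of negatives and $t$ is the current epoch

$\mu = \left[N_n - \frac{N_n}{t_0}t\right]_+$
$\sigma = a \times b^{\frac{t-t_0}{t_1 - t_0}}$

Selection probability of $I_i^n$: as $t$ increases, the probability of choosing hard samples increases.
$Pr(I^{n^*}_i=I_i^n|I^a_i) \propto \mathcal{N}(\mu, \sigma)$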
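
To make the schedule concrete, here is a minimal sketch of the sampling step in plain Python. It applies the two formulas for $\mu$ and $\sigma$ exactly as written in these notes; the function names, the use of integer positions 0..N_n-1 in the sorted list, and the default hyperparameters (taken from the Training Configurations in Section 4.3) are assumptions for illustration, and any clamping the authors may apply to $\sigma$ is not modeled.

```python
import math
import random

def curriculum_params(t, n_neg, t0=30, t1=60, a=15.0, b=0.001):
    """Mean and std of the Gaussian over sorted negative positions at epoch t."""
    mu = max(n_neg - n_neg / t0 * t, 0.0)
    sigma = a * b ** ((t - t0) / (t1 - t0))
    return mu, sigma

def negative_probs(t, n_neg, **kw):
    """Normalized selection probability for each sorted position (0 = hardest)."""
    mu, sigma = curriculum_params(t, n_neg, **kw)
    # Work with unnormalized log-densities and subtract the max for stability,
    # since sigma becomes very small late in training.
    logits = [-0.5 * ((k - mu) / sigma) ** 2 for k in range(n_neg)]
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def sample_negative(dist_to_anchor, t, rng=random):
    """Return the index of the negative chosen for this anchor at epoch t."""
    # Ascending distance: position 0 is the hardest negative.
    order = sorted(range(len(dist_to_anchor)), key=dist_to_anchor.__getitem__)
    probs = negative_probs(t, len(order))
    position = rng.choices(range(len(order)), weights=probs, k=1)[0]
    return order[position]
```

With these defaults the distribution is close to uniform at `t = 0` (very large $\sigma$) and concentrates on the hardest negative once `t` reaches `t1`.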

Final loss for the ranking branch:

$L_{rank} = \frac{1}{P(K-1)K} \sum\limits_{i=1}^{P(K-1)K}\left[m+D(f_{rank}(I^a_i),f_{rank}(I^p_i))-D(f_{rank}(I^a_i),f_{rank}(I^n_i))\right]_+$
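
As a small worked example, the sketch below averages the standard triplet hinge $[m + D(a, p) - D(a, n)]_+$ over a list of triplets. Euclidean distance for $D$ and the function names are assumptions; the margin default `0.5` is the value of $m$ listed in Section 4.3.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_rank_loss(triplets, margin=0.5):
    """Mean hinge loss over (anchor, positive, negative) feature triplets."""
    losses = [max(margin + euclidean(a, p) - euclidean(a, n), 0.0)
              for a, p, n in triplets]
    return sum(losses) / len(losses)

# First triplet is already well separated (loss 0); the second is violated.
example = [([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]),
           ([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])]
loss = triplet_rank_loss(example)
```

A triplet contributes nothing once the negative is farther from the anchor than the positive by at least the margin, which is why the choice of negatives matters so much for this loss.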

3.4 ReID Task #2: Person classification with focal loss

Since classification + ranking works better together, a classification branch is added. Because hard samples deserve more attention than easy ones, focal loss (an improved version of the softmax loss) is used to give hard samples more weight.
Focal loss for the classification branch:
$L_{cls} = -\frac{1}{PK}\sum \limits_{i=1}^{PK}(1-p_i)^\gamma \log(p_i)$
$p_i = \mathrm{Sigmoid}_{c_i}(FC(f_{cls}(I_i)))$
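
The effect of the $(1-p_i)^\gamma$ factor is easy to see numerically. This is a minimal sketch that takes the predicted probabilities of the true identities as input; the function name is hypothetical and the default `gamma=2.0` is the value listed in Section 4.3.

```python
import math

def focal_loss(true_class_probs, gamma=2.0):
    """Mean focal loss given p_i, the predicted probability of each sample's true class."""
    terms = [(1.0 - p) ** gamma * math.log(p) for p in true_class_probs]
    return -sum(terms) / len(terms)

easy = focal_loss([0.9])   # well-classified sample, strongly down-weighted
hard = focal_loss([0.1])   # misclassified sample, keeps most of its loss
plain = focal_loss([0.9], gamma=0.0)  # gamma = 0 gives ordinary cross-entropy
```

With `gamma=2`, the easy sample's loss is scaled by `0.01` while the hard sample's is scaled by `0.81`, so gradient updates are dominated by hard samples.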

3.5 ReID Task #3: Deep supervision for better attention

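The attended feature maps obtained at different scales (the feature maps after multiplication by the attention mask) are average-pooled and concatenated into an attention feature vector $f_{att}$, which is used for identity classification ==> accurate attention maps
Loss function for the attention branch:
$L_{att} = -\frac{1}{PKC}\sum \limits_{i = 1}^{PK}\sum \limits_{c=1}^C\left[y_i^c\log(q^c_i) + (1-y_i^c)\log(1-q^c_i)\right]$
$q^c_i = \mathrm{Sigmoid}_c(FC(f_{att}(I_i)))$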
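
The attention-branch loss is a per-class binary cross-entropy over sigmoid outputs, written here with a leading minus sign so that it is a quantity to minimize. The sketch below assumes one-hot identity labels and takes the FC outputs (logits) as input; the function name is hypothetical.

```python
import math

def attention_loss(logits, labels):
    """Mean binary cross-entropy over all samples and classes.

    logits: [PK][C] raw FC outputs; labels: [PK] identity indices (one-hot targets).
    """
    n_samples, n_classes = len(logits), len(logits[0])
    total = 0.0
    for row, identity in zip(logits, labels):
        for c, z in enumerate(row):
            q = 1.0 / (1.0 + math.exp(-z))
            # Only the term selected by the one-hot target y is non-zero.
            # Note: math.log fails if q saturates to exactly 0 or 1 (extreme logits).
            total += math.log(q) if c == identity else math.log(1.0 - q)
    return -total / (n_samples * n_classes)
```

With all-zero logits every $q^c_i$ is 0.5 and the loss equals $\log 2$; confident, correct logits drive it toward zero.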

3.6 Multi-task learning

The three tasks (rank + cls + att) share the backbone. The final loss function:
$\mathcal{L}= \lambda_{rank}L_{rank} + \lambda_{cls}L_{cls} + \lambda_{att}L_{att}$
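
As a quick arithmetic check of the weighted sum, the sketch below uses the weights listed in Section 4.3 as defaults; the function name and the example loss values are made up for illustration.

```python
def total_loss(l_rank, l_cls, l_att, w_rank=1.0, w_cls=1.0, w_att=0.2):
    """Weighted sum of the three task losses."""
    return w_rank * l_rank + w_cls * l_cls + w_att * l_att

# Hypothetical per-task losses: the attention loss contributes only 0.2 * 5.0 = 1.0.
combined = total_loss(1.0, 2.0, 5.0)
```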

3.7 Inference

Features from the rank branch generalize better, so at test time they are used to represent each person image.
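
At test time, retrieval then reduces to sorting gallery images by their distance to the query in the rank-branch feature space. A minimal sketch, assuming Euclidean distance and already-extracted feature vectors (the function name is hypothetical):

```python
import math

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices ordered from the closest to the farthest feature."""
    dists = [math.dist(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=dists.__getitem__)

ranking = rank_gallery([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
```

The classification and attention branches are only needed during training, so they add no cost at inference.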

4 Experiments

4.1 Datasets

Market1501、CUHK03、DukeMTMC-reID

4.2 Evaluation Protocol

mAP、CMC
Market1501: both single query and multi-query; CUHK03 and DukeMTMC-reID: single query
CUHK03 splits: 1367/100 and 767/700
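
For readers unfamiliar with the two metrics, the sketch below computes them for one query from a ranked list of match flags. It is a simplified illustration: the function names are hypothetical, and the dataset-specific filtering of same-camera and junk images used in the real protocols is not modeled.

```python
def average_precision(matches):
    """AP for one query; matches[i] is True if the i-th ranked gallery image is the same identity."""
    hits, precisions = 0, []
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def cmc_hit(matches, k):
    """1.0 if a correct match appears within the top-k results, else 0.0."""
    return 1.0 if any(matches[:k]) else 0.0

ranked = [False, True, False, True]
ap = average_precision(ranked)
```

mAP is the mean of `average_precision` over all queries, and the CMC rank-k score is the mean of `cmc_hit(..., k)` over all queries.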

4.3 Implementation Details

PyTorch
Pretrained ResNet-50, plus a 2048-dimensional FC layer before the classification layer

Data Augmentation

Resize images to 256 x 128 ==> randomly crop with scale in [0.64, 1.0] and aspect ratio in [2, 3] ==> resize back to 256 x 128 ==> randomly flip horizontally with probability 0.5 ==> random erasing ==> subtract the mean and divide by the standard deviation
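
The random-crop step can be sketched as sampling a crop box before resizing back to 256 x 128. This is an illustration under stated assumptions, not the authors' code: it assumes "scale" is the fraction of the image area, "aspect ratio" is height / width, and that invalid boxes are re-sampled with a fall-back to the full image (the retry strategy and the function name are hypothetical).

```python
import math
import random

def sample_crop(height=256, width=128, scale=(0.64, 1.0), ratio=(2.0, 3.0),
                rng=random, tries=10):
    """Sample (top, left, crop_h, crop_w) for a random crop inside the image."""
    area = height * width
    for _ in range(tries):
        target_area = rng.uniform(*scale) * area
        aspect = rng.uniform(*ratio)  # assumed to mean height / width
        crop_h = int(round(math.sqrt(target_area * aspect)))
        crop_w = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_h <= height and 0 < crop_w <= width:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Fall back to the whole image if no valid box was found.
    return 0, 0, height, width
```

Passing a seeded `random.Random` instance as `rng` makes the sampled boxes reproducible.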

Training Configurations

PK sampling strategy: P = K = 16 for Market1501 and DukeMTMC-reID; P = 32, K = 8 for CUHK03
160 epochs, $t_0=30,\ t_1=60,\ a=15,\ b=0.001$
$\lambda_{rank}=1,\lambda_{cls}=1,\lambda_{att}=0.2$
$m=0.5,\ \gamma=2$
Adam optimizer, lr = $3 \times 10^{-4}$
Gradient clipping to prevent model collapse
The ReLU after the last convolutional layer is replaced with PReLU ==> makes the final features more expressive

4.4 Comparisons with the state-of-the-art methods

Evaluation On Market-1501

Evaluation On CUHK03

Evaluation On DukeMTMC-reID

4.5 Ablation Study

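The effectiveness of Curriculum Sampling (CS), the Fully Attentional Block, Focal Loss, and Random Erasing is verified in the paper's ablation table.

The cls + rank baseline is already very strong, so each proposed component brings a relatively small improvement.
I did not fully understand the example in the accompanying figure; according to the paper, it shows that random erasing and cls bring a large improvement.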

5 Conclusions

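The proposed Mancs learns stable features and achieves SOTA performance on three widely used public datasets.
The proposed fully attentional block with deep supervision and curriculum sampling are shown to be effective (and can be borrowed for other related tasks).
Future work: combine data sampling and augmentation to further improve the generalization of Re-ID features.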
