【SB-ReID】《Bag of Tricks and A Strong Baseline for Deep Person Re-identification》

bryant_meng

Article link: https://blog.csdn.net/bryant_meng/article/details/112986502

CVPR-2019

1 Background and Motivation

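Much prior work on ReID is built on relatively weak baselines, and many reported improvements were mainly from training tricks rather than the methods themselves.

The authors collect and evaluate a set of effective training tricks for person ReID and propose a well-standardized baseline that reaches state-of-the-art performance.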

2 Advantages / Contributions

Uses only the global feature (rather than concatenating multi-branch features).

Achieves 94.5% rank-1 and 85.9% mAP on Market1501.

3 Standard Baseline

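A batch contains P identities with K images each. The backbone extracts a ReID feature for every image (e.g., 2048-dimensional in the paper's ResNet-50 setup), and an FC layer then computes ID prediction logits that classify who is in the image.

Triplet loss pulls features of the same person together and pushes features of different people apart.

ID loss teaches the network to predict who is in the image.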
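
The P x K batch construction described above can be sketched in a few lines. This is only an illustration under my own assumptions: `sample_pk_batch` and its arguments are hypothetical names, not taken from the paper's code, and identities with fewer than K images are simply skipped.

```python
import random
from collections import defaultdict


def sample_pk_batch(labels, P, K, rng=random):
    """Return indices of a batch with P identities and K images per identity.

    Hypothetical helper: `labels[i]` is the person ID of image i.
    """
    by_id = defaultdict(list)
    for idx, pid in enumerate(labels):
        by_id[pid].append(idx)
    # Simplification: only identities with at least K images are eligible.
    eligible = [pid for pid, idxs in by_id.items() if len(idxs) >= K]
    chosen = rng.sample(eligible, P)
    batch = []
    for pid in chosen:
        batch.extend(rng.sample(by_id[pid], K))
    return batch
```

With such a batch, every anchor has K - 1 positives and (P - 1) x K negatives available for the triplet loss, while the same images also feed the ID loss.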

4 Method

On top of the standard baseline, six tricks are added.

4.1 Warmup Learning Rate

The learning rate warms up over the first 10 epochs and is then gradually decayed.
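
A minimal sketch of such a schedule, assuming a linear ramp over 10 epochs followed by step decay. The base rate 3.5e-4 and the decay points at epochs 40 and 70 are what I recall from the paper, not something stated in this post, so treat them as assumptions; `warmup_lr` is a hypothetical name.

```python
def warmup_lr(epoch, base_lr=3.5e-4, warmup_epochs=10):
    """Learning rate for a 1-indexed epoch: linear warmup, then step decay.

    base_lr and the milestones (40, 70) are assumed values.
    """
    if epoch <= warmup_epochs:
        # Linear ramp: base_lr / 10 at epoch 1, base_lr at epoch 10.
        return base_lr * epoch / warmup_epochs
    if epoch <= 40:
        return base_lr
    if epoch <= 70:
        return base_lr * 0.1
    return base_lr * 0.01
```

Starting small avoids large, noisy updates while the randomly initialized classifier is still far from a reasonable solution.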

4.2 Random Erasing Augmentation

Aspect ratio of the erased rectangle: 0.3 < ratio < 3.33

Erased area as a fraction of the image area: 0.02 < area ratio < 0.4
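
To make the two ranges concrete, here is a rough pure-Python sketch of random erasing on a 2-D list-of-lists image. The erase probability `p=0.5` and the constant `fill` value are my assumptions (the post does not give them), and the function names are hypothetical.

```python
import math
import random


def sample_erase_box(height, width, area_range=(0.02, 0.4),
                     ratio_range=(0.3, 3.33), max_tries=100, rng=random):
    """Sample (top, left, h, w) of a rectangle to erase, or None if none fits."""
    for _ in range(max_tries):
        area = rng.uniform(*area_range) * height * width
        ratio = rng.uniform(*ratio_range)  # h / w of the erased box
        h = int(round(math.sqrt(area * ratio)))
        w = int(round(math.sqrt(area / ratio)))
        if 0 < h < height and 0 < w < width:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            return top, left, h, w
    return None


def random_erase(image, p=0.5, fill=0, rng=random):
    """Erase one random rectangle with probability p; the input is not modified."""
    if rng.random() >= p:
        return image
    box = sample_erase_box(len(image), len(image[0]), rng=rng)
    if box is None:
        return image
    top, left, h, w = box
    out = [row[:] for row in image]
    for r in range(top, top + h):
        out[r][left:left + w] = [fill] * w
    return out
```

In a real pipeline the same idea is applied per channel on tensors; the sampling of area and aspect ratio is the part that the two ranges above control.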

4.3 Label Smoothing

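Label smoothing replaces the one-hot target with

$$
q_i = \begin{cases} 1 - \frac{N-1}{N}\varepsilon & \text{if } i = y \\ \frac{\varepsilon}{N} & \text{otherwise} \end{cases}
$$

where $N$ is the number of identities, $y$ is the ground-truth ID, and $\varepsilon$ is set to 0.1.

For the underlying theory, see 【Inception-v3】《Rethinking the Inception Architecture for Computer Vision》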
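
A small sketch of label-smoothed cross-entropy, assuming the usual Inception-v3 formulation in which the true class gets $1 - \frac{N-1}{N}\varepsilon$ and every other class gets $\varepsilon / N$. The function names are my own.

```python
import math


def smoothed_targets(label, num_classes, eps=0.1):
    """q_i = 1 - (N-1)/N * eps for the true class, eps/N otherwise."""
    q = [eps / num_classes] * num_classes
    q[label] = 1.0 - (num_classes - 1) / num_classes * eps
    return q


def smoothed_cross_entropy(logits, label, eps=0.1):
    """Cross-entropy between softmax(logits) and the smoothed targets."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_z for z in logits]
    q = smoothed_targets(label, len(logits), eps)
    return -sum(qi * lp for qi, lp in zip(q, log_probs))
```

Because some target mass sits on the wrong classes, an over-confident prediction is penalized slightly, which discourages overfitting to the training identities.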

4.4 Last Stride

The stride of the backbone's last stage is changed to 1, which preserves the resolution of the final feature map.
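
The effect on resolution is simple arithmetic. The sketch below assumes a ResNet-50-style stride layout and a 256 x 128 input, which is the setup I recall from the paper; neither is stated in this post, and `feature_map_size` is a hypothetical helper.

```python
def feature_map_size(height, width, strides):
    """Spatial size after a sequence of stride factors (exact division assumed)."""
    total = 1
    for s in strides:
        total *= s
    return height // total, width // total


# Assumed ResNet-50-style strides: stem conv, max-pool, then four stages.
default_strides = [2, 2, 1, 2, 2, 2]    # last stage stride 2
last_stride_one = [2, 2, 1, 2, 2, 1]    # last stage stride 1

small = feature_map_size(256, 128, default_strides)   # (8, 4)
large = feature_map_size(256, 128, last_stride_one)   # (16, 8)
```

The larger map keeps four times as many spatial positions at the cost of more computation in the last stage only, with no extra parameters.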

4.5 BNNeck

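ID loss mainly optimizes cosine distance (it learns separating hyperplanes, the yellow dashed lines in Fig. 6(a) of the paper).

Triplet loss optimizes Euclidean distance (Fig. 6(b) of the paper: compact within a class, larger distances between classes).

If both are used to optimize the same feature, a possible phenomenon is that one loss is reduced while the other loss is oscillating or even increasing.

The authors' fix is to change the distribution of the features that feed the ID loss by inserting a batch normalization layer before the classifier FC, which makes the two losses easier to optimize together.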

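In the BNNeck structure the bias of the FC layer is removed, which guarantees that the hyperplanes of the ID loss pass through the origin of the coordinate axes.

The reasoning is the same as for a line: y = kx passes through the origin, while y = kx + b (b ≠ 0) does not.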
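
A rough pure-Python sketch of the BNNeck data flow, under simplifying assumptions: batch normalization is done in training mode without the learnable scale and shift, and `batch_norm` / `bnneck` are hypothetical names rather than the authors' implementation.

```python
import math


def batch_norm(features, eps=1e-5):
    """Normalize each feature dimension to zero mean and unit variance over the batch."""
    n, dim = len(features), len(features[0])
    out = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        col = [f[d] for f in features]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        for i in range(n):
            out[i][d] = (features[i][d] - mean) / math.sqrt(var + eps)
    return out


def bnneck(features, fc_weight):
    """Return (f_t, f_i, logits).

    f_t (before BN) would feed the triplet loss; logits (after BN and a
    bias-free FC) would feed the ID loss.
    """
    f_t = features
    f_i = batch_norm(f_t)
    # No bias term: logits = W f_i, so each decision hyperplane passes through the origin.
    logits = [[sum(w * x for w, x in zip(row, f)) for row in fc_weight] for f in f_i]
    return f_t, f_i, logits
```

The point of the split is that the two losses no longer constrain the same feature: the triplet loss sees the raw feature, and the ID loss sees the normalized one.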

4.6 Center Loss

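$$
L_{Tri} = [d_p - d_n + \alpha]_+
$$

In the triplet loss, $d_p$ and $d_n$ are the feature distances of the positive pair and the negative pair, $\alpha$ is the margin of the triplet loss, and $[x]_+$ is equivalent to $\max(0, x)$. For more detail, see the post "Triplet-Loss原理及其实现、应用" (Triplet loss: principle, implementation, and applications).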

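This form has a drawback: it only depends on the difference $d_p - d_n$, so $d_p = 0.3$, $d_n = 0.1$ gives exactly the same loss as $d_p = 1.3$, $d_n = 1.1$, regardless of the absolute distances.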
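
A quick numerical check of this drawback, using the form $\max(0, d_p - d_n + \alpha)$. The margin 0.3 is an assumed value for illustration.

```python
def triplet_loss(d_p, d_n, margin=0.3):
    """L_Tri = max(0, d_p - d_n + margin); margin=0.3 is an assumption."""
    return max(0.0, d_p - d_n + margin)


tight = triplet_loss(0.3, 0.1)   # small absolute distances
loose = triplet_loss(1.3, 1.1)   # same difference, much larger distances
```

Both calls give the same value (up to floating-point rounding), even though the second cluster is far more spread out. Nothing in the loss rewards compact clusters.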

Triplet loss is determined by two person IDs sampled randomly. It is difficult to ensure that $d_p$ < $d_n$ in the whole training dataset.

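The authors introduce center loss to make up for this weakness of triplet loss. Its form is

$$
L_C = \frac{1}{2} \sum_{j=1}^{B} \left\| f_{t_j} - c_{y_j} \right\|_2^2
$$

where $f_{t_j}$ is the feature of the $j$-th image in the mini-batch, $y_j$ is its label, $c_{y_j}$ denotes the $y_j$-th class center of deep features, and $B$ is the batch size. It pulls the features of the same person as close together as possible.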

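The overall loss after this change is

$$
L = L_{ID} + L_{Triplet} + \beta L_C
$$

where $L_{ID}$ is the cross-entropy loss and $\beta$ is the balancing weight of the center loss.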
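
A minimal sketch of the center loss and the combined objective, assuming the form $L = L_{ID} + L_{Triplet} + \beta L_C$ with $L_C = \frac{1}{2}\sum_j \|f_j - c_{y_j}\|_2^2$. The default `beta=0.0005` is the value I recall from the paper and is an assumption here; in practice the centers are learned parameters, while this sketch takes them as given.

```python
def center_loss(features, labels, centers):
    """L_C = 1/2 * sum over the batch of ||f_j - c_{y_j}||^2."""
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(f, centers[y]))
    return 0.5 * total


def total_loss(id_loss, tri_loss, c_loss, beta=0.0005):
    """L = L_ID + L_Triplet + beta * L_C (beta is an assumed default)."""
    return id_loss + tri_loss + beta * c_loss
```

Unlike the triplet loss, the center term depends on absolute distances to the class center, so spread-out clusters are penalized directly.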

5 Experiments

5.1 Datasets

Market1501 and DukeMTMC

5.2 Influences of Each Trick (Same domain)

All six tricks improve performance.

5.3 Influences of Each Trick (Cross domain)

REA hurts performance in the cross-domain setting. The authors' explanation:

We infer that REA masking the regions of training images lets the model learn more knowledge in the training domain.

5.4 Comparison of State-of-the-Arts

The baseline compares very favorably with state-of-the-art methods.

5.5 Analysis of BNNeck

Features trained with the ID loss work better with the cosine distance metric.

5.6 Influences of the Number of Batch Size

The effect is small.

We infer that large K helps to mine hard positive pairs while large P helps to mining hard negative pairs.
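
To see why K and P matter for mining, here is a sketch of batch-hard pair selection: for each anchor, take the farthest same-ID sample and the closest different-ID sample. That the baseline uses this style of mining is my assumption, not something stated in this post, and `batch_hard_pairs` is a hypothetical name. The batch must contain at least two images per ID and at least two IDs.

```python
import math


def batch_hard_pairs(features, labels):
    """For each anchor, return (hardest positive distance, hardest negative distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    pairs = []
    for i, (fa, ya) in enumerate(zip(features, labels)):
        pos = [dist(fa, fb) for j, (fb, yb) in enumerate(zip(features, labels))
               if yb == ya and j != i]
        neg = [dist(fa, fb) for fb, yb in zip(features, labels) if yb != ya]
        # Hardest positive is the farthest one; hardest negative is the closest one.
        pairs.append((max(pos), min(neg)))
    return pairs
```

A larger K gives each anchor more positives to take the maximum over, and a larger P gives more negatives to take the minimum over, which matches the authors' inference.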

5.7 Influences of Image Size

The conclusion is that image size has little effect.

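6 Conclusion (own)

BNNeck normalizes the feature distribution, which makes it easier to separate classes with hyperplanes (reconciling the ID loss and the triplet loss).
Random Erasing Augmentation is a fun, satisfying trick.
Triplet loss only looks at the difference between distances and ignores their absolute values, so center loss is combined with triplet loss to cluster features of the same class more tightly.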
