Notes on "Improving Person Re-identification by Attribute and Identity Learning"



Authors: Yutian Lin, Liang Zheng, Zhedong Zheng, Yu Wu, Zhilan Hu, Chenggang Yan, Yi Yang.
Original paper: link.

Motivation

Re-ID and attribute recognition share the goal of describing pedestrians; they differ in granularity: attribute recognition focuses on local aspects of a person, while re-ID usually extracts a global representation. Given this similarity and complementarity, the authors propose a multi-task method that integrates re-ID and attribute recognition. Concretely, they propose the Attribute-Person Recognition (APR) network, which combines two kinds of losses: an ID classification loss and attribute classification losses. In short, a CNN (ResNet50) is used with its last FC layer removed and several new FC layers attached: one FC-ID layer classifies the person identity, and the remaining FC layers classify the attributes.

Contributions

(1) An attribute-person recognition (APR) network that combines ID and attribute classification losses.
(2) Manually annotated attribute labels for the Market-1501 and DukeMTMC-reID datasets.

Method

[Figure: APR network architecture]
The figure above shows the APR architecture. The backbone is ResNet50, and the network can be viewed as two baselines: a re-ID baseline and an attribute baseline.
**Re-ID baseline:** remove the last FC layer of ResNet50 and add a new FC layer whose output dimension is K, the number of identities, with a dropout layer inserted before it. At test time, the 2048-dim pool5 feature is extracted, and the Euclidean distance between query and gallery features is used for matching.
**Attribute baseline:** M FC layers, each followed by a softmax, are added after pool5, where M is the number of attributes.

Loss computation:
Given n images of K identities, each annotated with M attributes, the training set is D = {x_i, d_i, l_i}, where x_i is the i-th image, d_i is its identity label, and l_i = {l_i^1, …, l_i^M} is the set of M attribute labels of image x_i.
**ID loss:** the 1×1×2048 pool5 feature is extracted and fed into the FC0 layer, whose output is z = [z_1, z_2, …, z_K] ∈ R^K. The predicted probability of each ID class k is

$$p(k \mid x) = \frac{\exp(z_k)}{\sum_{j=1}^{K}\exp(z_j)}$$

The cross-entropy loss for ID classification is

$$L_{ID} = -\sum_{k=1}^{K} q(k)\log p(k \mid x)$$

where q(k) = 1 if k is the ground-truth identity and q(k) = 0 otherwise.
**Attribute classification loss:** attribute prediction consists of M softmax losses. The probability that sample x belongs to attribute class j ∈ {1, …, m} is

$$p(j \mid x) = \frac{\exp(z_j)}{\sum_{i=1}^{m}\exp(z_i)}$$

The attribute classification loss for each sample x is then

$$L_{att} = -\sum_{j=1}^{m} q(j)\log p(j \mid x)$$

where q(j) = 1 for the ground-truth attribute class and q(j) = 0 otherwise.
Finally, the overall loss of APR is

$$L = \lambda L_{ID} + \frac{1}{M}\sum_{i=1}^{M} L_{att}^{(i)}$$

where λ balances the ID loss against the average of the M attribute losses.
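The combined objective described in this section, taking λ times the ID cross-entropy plus the mean of the M per-attribute cross-entropies, can be sketched as below. The λ default and the dummy head sizes are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F

def apr_loss(id_logits, id_labels, attr_logits, attr_labels, lam=1.0):
    """Combined APR loss: lam * L_ID + mean of the M attribute losses.

    id_logits:   (N, K) identity logits; id_labels: (N,) identity indices.
    attr_logits: list of M (N, m_i) logit tensors, one per attribute;
    attr_labels: list of M (N,) label tensors.
    """
    l_id = F.cross_entropy(id_logits, id_labels)           # softmax + CE over K identities
    l_att = torch.stack([F.cross_entropy(logits, labels)   # one softmax CE per attribute
                         for logits, labels in zip(attr_logits, attr_labels)]).mean()
    return lam * l_id + l_att
```

`F.cross_entropy` fuses the softmax and the negative log-likelihood, so it matches the $p(k \mid x)$ and $L_{ID}$ definitions above without an explicit softmax layer.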

Experiment

Parameter settings

With ResNet50 as the backbone: 55 training epochs, batch size 64, SGD optimizer; the learning rate drops from 0.001 to 0.0001 for the last 5 epochs.
With CaffeNet as the backbone: 110 training epochs, batch size 128, SGD optimizer; the learning rate drops from 0.1 to 0.01 for the last 10 epochs.
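The ResNet50 schedule above maps onto a standard step decay: a single 10× drop at epoch 50 of 55. A minimal sketch, assuming `MultiStepLR` for the drop; the momentum value and the stand-in parameter list are assumptions (the note does not specify them).

```python
import torch

# Sketch of the ResNet50 schedule: SGD at lr 0.001, dropped 10x at epoch 50
# so the last 5 of 55 epochs run at 0.0001.
params = [torch.nn.Parameter(torch.zeros(1))]   # stand-in for model.parameters()
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9)  # momentum assumed
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50], gamma=0.1)

for epoch in range(55):
    # ... one training epoch would go here ...
    optimizer.step()      # step the optimizer before the scheduler
    scheduler.step()      # lr: 0.001 for epochs 0-49, 0.0001 afterwards
```

The CaffeNet setting is the same pattern with `lr=0.1`, `milestones=[100]`, and 110 epochs.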

Person re-ID evaluation

[Tables: re-ID accuracy on Market-1501 and DukeMTMC-reID]
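The test-time matching used in these evaluations, ranking gallery images by Euclidean distance between 2048-dim pool5 features as described in the Method section, can be sketched as follows (the function name is mine, not the paper's):

```python
import torch

def rank_gallery(query_feats, gallery_feats):
    """Return gallery indices sorted by ascending Euclidean distance,
    one row per query image."""
    dists = torch.cdist(query_feats, gallery_feats, p=2)   # (num_query, num_gallery)
    return dists.argsort(dim=1)
```

The resulting ranking is what mAP and CMC scores are computed from.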

Attribute recognition evaluation

[Table: attribute recognition accuracy]

Effect of different attributes on re-ID

[Figure: impact of individual attributes on re-ID accuracy]
