Paper Skim: A Person Re-ID (Person Re-identification) Collection

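Person Re-identification Based on Fused Features (基于融合特征的行人再识别方法)

Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2017.3

Problem

  • Common person re-identification methods focus on describing a pedestrian's appearance features and on learning a distance metric between two images of the same person. Because of changes in image brightness, camera angle, and so on, it is hard to extract appearance features that stay invariant, so recognition rates on the various image databases are low

Method

  • Feature extraction based on fused features
    • Includes HSV color features, color histogram features, and histogram of oriented gradients (HOG) features.
    • Fusing the two color features (HSV color and color histogram) strengthens the discriminative power of the image's color information.
    • HOG features describe the relationships between local pixels of the image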
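
As a rough illustration of feature fusion, the sketch below concatenates a normalized HSV histogram with a simplified gradient-orientation histogram. It is a minimal sketch under stated assumptions, not the paper's descriptors: the function names and bin counts are hypothetical, and the orientation histogram covers the whole image in one cell, whereas real HOG uses cells and block normalization.

```python
import colorsys
import math


def hsv_histogram(pixels, bins=4):
    """Normalized per-channel H, S, V histogram of RGB pixels in [0, 255]."""
    hist = [0.0] * (3 * bins)
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            idx = min(int(value * bins), bins - 1)
            hist[channel * bins + idx] += 1.0
    total = float(len(pixels)) or 1.0
    return [h / total for h in hist]


def orientation_histogram(gray, bins=4):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    Simplification (assumption): one histogram for the whole image.
    """
    hist = [0.0] * bins
    rows, cols = len(gray), len(gray[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += magnitude
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def fused_feature(pixels, gray, bins=4):
    """Fuse color and gradient descriptors by simple concatenation."""
    return hsv_histogram(pixels, bins) + orientation_histogram(gray, bins)
```

Concatenation is only one possible fusion rule; weighting each descriptor before concatenating is an equally plausible choice.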

References

http://kns.cnki.net/KCMS/detail/detail.aspx?filename=mssb201703010&dbname=CJFD&dbcode=CJFQ


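Person Re-identification Algorithm with Multi-confidence Re-ranking (多置信度重排序的行人再识别算法)

Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2017.11

Problem

  • To address the poor recognition caused by similarity-measurement errors in person re-identification, the paper proposes a person re-identification algorithm with multi-confidence re-ranking

Method

  • Extract descriptive features with ResNet50
  • Produce an initial ranking from the similarity between the target sample and the test samples
  • Build a set of similar samples from that ranking, obtain each class's cluster center and the minimum, maximum, and mean distance from the samples to the cluster center, and set 3 confidence intervals with different confidence levels
  • Finally, re-rank the similarity between the target sample and the test samples using the Jaccard distance

Takeaways

  • The Jaccard distance can be used to measure the dissimilarity between 2 sets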
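
A minimal sketch of the Jaccard distance, defined as one minus the ratio of the intersection size to the union size. The function name and the convention that two empty sets have distance 0 are assumptions made for illustration; how the paper builds the sets being compared is not shown here.

```python
def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of hashable items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical example: neighbor ID sets of a query and a gallery image.
query_neighbors = {1, 2, 3}
gallery_neighbors = {2, 3, 4}
distance = jaccard_distance(query_neighbors, gallery_neighbors)
```

In a re-ranking setting, a smaller distance between two neighbor sets suggests the two images are more likely to show the same person.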

References

http://kns.cnki.net/KCMS/detail/detail.aspx?filename=mssb201711005&dbname=CJFD&dbcode=CJFQ


An Improved Deep Learning Architecture for Person Re-Identification

CVPR’15

Problem

  • A typical re-identification system takes as input two images, each of which usually contains a person’s full body, and outputs either a similarity score between the two images or a classification of the pair of images as same (if the two images depict the same person) or different (if the images are of different people)
  • In this paper, we follow this approach and use a novel deep learning network to assign similarity scores to pairs of images of human bodies

Method

  • See reference 1 for a detailed explanation

References

Person retrieval: “An Improved Deep Learning Architecture for Person Re-Identification”

https://www.cv-foundation.org/openaccess/content_cvpr_2015/app/2B_062.pdf


Deep Feature Learning with Relative Distance Comparison for Person Re-identification

Pattern Recognition 2015

Problem

  • Although the effectiveness of the distance function has been demonstrated, it heavily relies on the quality of the features selected, and such selection requires deep domain knowledge and expertise

Method

  • Proposes a triplet loss
  • we train the network through a set of triplets. Each triplet contains three images, i.e. a query image, one matched reference (an image of the same person as that in the query image) and one mismatched reference
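
The triplet idea can be sketched with the common hinge form of the loss: the query should be closer to the matched reference than to the mismatched one by at least a margin. This is an illustrative sketch; the paper's exact objective may differ, and the function names, the squared Euclidean distance, and the default margin are assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(query, matched, mismatched, margin=1.0):
    """Hinge-style triplet loss on one (query, matched, mismatched) triplet.

    The loss is zero once the mismatched reference is farther from the
    query than the matched reference by at least `margin`.
    """
    positive = squared_distance(query, matched)
    negative = squared_distance(query, mismatched)
    return max(0.0, positive - negative + margin)
```

In practice the three vectors would be network outputs for the three images, and the loss would be summed over many triplets.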

References

https://arxiv.org/abs/1512.03622


Deep Transfer Learning for Person Re-identification

arXiv:1611

Problem

  • Person re-identification (Re-ID) poses a unique challenge to deep learning: how to learn a deep model with millions of parameters on a small training set of few or no labels

Method

  • First, a deep network architecture is designed which differs from existing deep Re-ID models in that (a) it is more suitable for transferring representations learned from large image classification datasets, and (b) classification loss and verification loss are combined, each of which adopts a different dropout strategy
  • Second, a two-stepped fine-tuning strategy is developed to transfer knowledge from auxiliary datasets.
  • Third, given an unlabelled Re-ID dataset, a novel unsupervised deep transfer learning model is developed based on co-training.

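Takeaways

  • Representation learning has become a very important baseline in the ReID field; representation-learning methods are fairly robust, train stably, and give results that are easy to reproduce
  • Representation learning tends to overfit to the dataset's domain, and it loses effectiveness once the number of training IDs grows past a certain point

References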

https://arxiv.org/abs/1611.05244


A Discriminatively Learned CNN Embedding for Person Re-identification

TOMM 2017

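Problem

  • Person re-identification generally uses two kinds of models: verification and identification
  • The two models have their respective advantages and limitations due to different loss functions
  • The authors want to combine the strengths of the two models to improve recognition accuracy

Method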

  • identification loss + verification loss
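
A minimal sketch of combining the two losses for one image pair: a cross-entropy identification loss on each image plus a binary cross-entropy verification loss on the pair. The function names, the `weight` parameter, and the convention that verification class 1 means "same person" are assumptions for illustration, not the paper's implementation.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def joint_loss(id_logits_a, id_a, id_logits_b, id_b, verif_logits, weight=1.0):
    """Identification loss on both images plus a weighted verification loss.

    verif_logits has two entries: index 0 = different, index 1 = same
    (an assumed convention).
    """
    same = 1 if id_a == id_b else 0
    identification = cross_entropy(id_logits_a, id_a) + cross_entropy(id_logits_b, id_b)
    verification = cross_entropy(verif_logits, same)
    return identification + weight * verification
```

The verification term acts on the pair as a whole, which is what lets it regularize the per-image identification terms.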

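Takeaways

  • Classification with the identification loss overfits easily (for example, if one person carries a backpack, the model may decide that anyone with a backpack is that person), so a regularizer is needed, such as the verification loss

References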

https://arxiv.org/abs/1611.05666


Person Re-Identification Using CNN Features Learned from Combination of Attributes

ICPR’16

Problem

  1. However, large disparity among the pre-trained task, i.e., ImageNet classification, and the target task, i.e., person image matching, limits performances of the CNN features for person re-identification
  2. Therefore, the discriminative power of CNN features solely fine-tuned on pedestrian attributes is typically insufficient

Method

  • For problem 1
    • we conduct a fine-tuning of CNN features on a pedestrian attribute dataset to bridge the gap of ImageNet classification and person re-identification
  • For problem 2
    • we focus on combinations of attributes for grouping similar people.

Takeaways

  • Re-training a pre-trained CNN for another task is called fine-tuning, which transfers the knowledge of pre-training data and significantly improves the performance on another task

References

https://pdfs.semanticscholar.org/0e80/10baaa8dd93b7077719d6c43629c070da6bf.pdf


Gated Siamese Convolutional Neural Network Architecture for Human Re-Identification

ECCV’16

Problem

  • several end-to-end deep Siamese CNN architectures have been proposed for human re-identification with the objective of projecting the images of similar pairs (i.e. same identity) to be closer to each other and those of dissimilar pairs to be distant from each other. However, current networks extract fixed representations for each image regardless of other images which are paired with it and the comparison with other images is done only at the final level

Method

  • we propose a gating function to selectively emphasize such fine common local patterns by comparing the mid-level features across pairs of images
  • The fundamental CNN architecture is modeled in a siamese fashion optimized by the contrastive loss function
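
For reference, a minimal sketch of a widely used form of the contrastive loss for a Siamese pair: same-identity pairs are pulled together, and different-identity pairs are pushed apart until they are at least a margin away. The function name, the default margin, and the absence of a 1/2 scaling factor are assumptions; the paper's exact formulation may differ.

```python
import math


def contrastive_loss(feature_a, feature_b, same, margin=1.0):
    """Contrastive loss on one pair of feature vectors.

    same=True  -> penalize the squared distance.
    same=False -> penalize only pairs closer than `margin`.
    """
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(feature_a, feature_b)))
    if same:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

Different-identity pairs that are already farther apart than the margin contribute no gradient, so training focuses on the confusable pairs.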

References

https://arxiv.org/abs/1607.08378


MARS: A Video Benchmark for Large-Scale Person Re-identification

ECCV’16

Problem

  • a few video re-id datasets exist [4, 15, 28, 36]. They are limited in scale: typically several hundred identities are contained, and the number of image sequences doubles
  • image sequences in these video re-id datasets are generated by hand-drawn bboxes. This process is extremely expensive, requiring intensive human labor
  • But in reality, pedestrian detectors will lead to part occlusion or misalignment which may have a non-ignorable effect on re-id accuracy
  • As a result, in practice one identity will have multiple probes and multiple sequences as ground truths. It remains unsolved how to make use of these visual cues

Method

  • collecting and annotating a new person re-identification dataset, named “Motion Analysis and Re-identification Set” (MARS)
  • instead of hand-drawn bboxes, we use the DPM detector [11] and GMMCP tracker [7] for pedestrian detection and tracking, respectively
  • Third, MARS includes a number of distractor tracklets produced by false detection or tracking result
  • the multiple-query and multiple-ground truth mode will enable future research in fields such as query re-formulation and search re-ranking

References

https://pdfs.semanticscholar.org/c038/7e788a52f10bf35d4d50659cfa515d89fbec.pdf


Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro

ICCV 2017

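Problem

  • Person re-identification datasets are relatively small

Method

  • Train a DCGAN on the training data (unsupervised learning), use the generator to produce new data, then train a convolutional network on the generated and real training data together (semi-supervised learning)
  • The generated data has no labels; the authors use the LSRO method to give each generated sample a uniform label of 1/K over all classes (K is the total number of IDs)
    • Room for improvement……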
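
A minimal sketch of how a uniform 1/K label enters the loss: real samples use ordinary cross-entropy against their ID, while generated samples use cross-entropy against a uniform distribution over the K IDs. The function names and the use of `label=None` to mark a generated sample are assumptions for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def lsro_loss(logits, label=None):
    """Cross-entropy with a one-hot label (real) or a uniform label (generated).

    label=None marks a GAN-generated sample, whose target is 1/K per class.
    """
    log_probs = log_softmax(logits)
    if label is None:
        return -sum(log_probs) / len(log_probs)
    return -log_probs[label]
```

With a uniform target, a confident prediction on a generated image is penalized, which is how the generated data acts as a regularizer.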

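Takeaways

  • When data is scarce, consider filling the gap with a GAN; even if the generated images are of poor quality, they can help prevent overfitting to some extent

References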

https://arxiv.org/abs/1701.07717


Multi-pseudo Regularized Label for Generated Samples in Person Re-Identification

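arXiv 1801

Problem

  • In person re-identification each ID has little data, and data generated by a GAN has no labels
  • The LSRO method is unrealistic

Method

  • The authors propose the Multi-pseudo Regularized Label (MpRL) method
    • The label becomes $\alpha_k/K$ (K is the total number of IDs)
    • where α_k is the contribution from k-th pre-defined class in the dictionary α.

References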

https://arxiv.org/pdf/1801.06742.pdf


Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification

CVPR 2016

Problem

  • Learning generic and robust feature representations with data from multiple domains for the same problem is of great value, especially for the problems that have multiple datasets but none of them are large enough to provide abundant data variations
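  • In other words, for a single problem, learning generic, robust feature representations from multiple datasets is very valuable, especially when many different datasets exist but none of them has enough data.

Method

  • Train on multiple training sets
    and propose Domain Guided Dropout (DGD)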
  • Domain Guided Dropout — a simple yet effective method of muting non-related neurons for each domain.
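  • During feature learning, the method suppresses neurons that are inactive for the features being learned and promotes the neurons that are active for them, which to some extent reduces the number of parameters being trained and improves the model's performance

Takeaways

  • When training on multiple datasets, neurons are active to different degrees for different features; DGD can then be used as a regularizer

References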

https://arxiv.org/abs/1604.07528


Person Re-identification in the Wild

CVPR’17

Problem

  • Our baselines address three issues: the performance of various combination of detectors and recognizers, mechanisms for pedestrian detection to help improve overall re-identification (re-ID) accuracy and assessing the effectiveness of different detectors for re-ID.
  • Current datasets lack annotations for such combined evaluation of person detection and re-ID.
  • person re-ID datasets, such as VIPeR [16] or CUHK03 [21], usually provide just cropped bounding boxes without the complete video frames, especially at a large scale
  • As a consequence, a large-scale dataset that evaluates both detection and overall re-ID is needed

Method

  • Proposes a new dataset, PRW

Takeaways

  • Detectors matter a great deal for re-ID

References

http://openaccess.thecvf.com/content_cvpr_2017/papers/Zheng_Person_Re-Identification_in_CVPR_2017_paper.pdf


Joint Detection and Identification Feature Learning for Person Search

CVPR 2017

Problem

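  • Although numerous person re-id datasets and methods have been proposed, there is still a big gap between the problem setting itself and real-world applications. In most benchmarks, the gallery only contains manually cropped pedestrian images, while in real applications, the goal is to find a target person in a gallery of whole scene images
  • In other words, many methods use manually cropped images, whereas in practice pedestrians must first be detected against the image background
  • Traditional pairwise or triplet distance loss functions are too computationally expensive
  • With softmax loss, as the number of identities grows, training becomes slower and may even fail to converge

Method

  • Train a CNN made of two parts:
    • A pedestrian proposal net (Faster RCNN) that produces bounding boxes for candidate pedestrians
    • An identification net that extracts features for comparison against the query target
    • During joint optimization the two parts adapt to each other, each compensating for problems introduced by the other
  • Proposes the Online Instance Matching (OIM) loss function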
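
A minimal sketch of the idea behind the OIM loss, assuming the usual description of it: a lookup table holds one feature per labeled identity, a queue holds features of unlabeled people, and the loss is a softmax over similarities to both, with no classifier weights to learn. The function names and the default `temperature` and `momentum` values are assumptions for illustration, not the released implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def oim_loss(feature, target, lookup_table, unlabeled_queue, temperature=0.1):
    """Negative log-probability of matching the target's lookup-table entry."""
    x = l2_normalize(feature)
    scores = [dot(v, x) / temperature for v in lookup_table]
    scores += [dot(u, x) / temperature for u in unlabeled_queue]
    m = max(scores)
    log_partition = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_partition - scores[target]


def update_lookup_table(lookup_table, target, feature, momentum=0.5):
    """Blend the new feature into the target's entry, then re-normalize."""
    x = l2_normalize(feature)
    blended = [momentum * v + (1.0 - momentum) * f
               for v, f in zip(lookup_table[target], x)]
    lookup_table[target] = l2_normalize(blended)
```

Because the table is updated from features rather than learned by gradient descent, an identity that rarely appears in a mini-batch still keeps a usable entry.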

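Takeaways

  • Detection can be done first, followed by further processing
  • The OIM loss better handles the case where there are too many identity classes but the samples in one mini-batch are not diverse enough, which makes a classifier impossible to train

References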

https://github.com/ShuangLI59/person_search

http://openaccess.thecvf.com/content_cvpr_2017/papers/Xiao_Joint_Detection_and_CVPR_2017_paper.pdf


In Defense of the Triplet Loss for Person Re-Identification

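arXiv 1703

Problem

  • Classification loss: when the number of targets is large, the number of network parameters grows substantially, and many of those parameters are discarded after training.
  • Verification loss: it can only judge the similarity of two images pair by pair, so it is hard to apply to clustering and retrieval, because one-to-one comparison is too slow.
  • Triplet loss: without hard mining, training stalls and converges to a poor result, while choosing triplets that are too hard makes training unstable and convergence harder

Method

Proposes a triplet hard loss

  • Compares several triplet loss variants experimentally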
    • Large Margin Nearest Neighbor loss
    • FaceNet Triplet Loss
    • Batch All Triplet Loss
    • Batch Hard Triplet Loss
    • Lifted Embedding Loss
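
A minimal sketch of the batch hard variant: for each anchor in a batch, take its farthest same-ID sample and its closest different-ID sample, then apply a margin hinge. The function names, the default margin, and the choice to skip anchors that lack a positive or a negative are assumptions for illustration.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean hinge loss over anchors using the hardest positive and negative."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [euclidean(anchor, e) for j, e in enumerate(embeddings)
                     if labels[j] == label and j != i]
        negatives = [euclidean(anchor, e) for j, e in enumerate(embeddings)
                     if labels[j] != label]
        if not positives or not negatives:
            continue  # anchor cannot form a triplet in this batch
        hardest_positive = max(positives)
        hardest_negative = min(negatives)
        losses.append(max(0.0, margin + hardest_positive - hardest_negative))
    return sum(losses) / len(losses) if losses else 0.0
```

Mining inside the batch keeps the triplets moderately hard: they are the hardest within a small random sample rather than the hardest in the whole dataset.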

Takeaways

  • The triplet hard loss outperforms the other losses

References

Re-ID with Triplet Loss

https://arxiv.org/abs/1703.07737


Beyond triplet loss: a deep quadruplet network for person re-identification

CVPR’17

Problem

  • the triplet loss pays main attentions on obtaining correct orders on the training set
  • It still suffers from a weaker generalization capability from the training set to the testing set, thus resulting in inferior performance
  • In other words, the triplet loss generalizes poorly

Method

  • we design a quadruplet loss, which can lead to the model output with a larger inter-class variation and a smaller intra-class variation compared to the triplet loss
  • $L_{quad} = \sum_{i,j,k}^{N} [g(x_i, x_j)^2 - g(x_i, x_k)^2 + \alpha_1]_+$
    • $+ \sum_{i,j,k,l}^{N} [g(x_i, x_j)^2 - g(x_l, x_k)^2 + \alpha_2]_+$
    • $s_i = s_j,\ s_l \neq s_k,\ s_i \neq s_l,\ s_i \neq s_k$
  • The first term is the conventional triplet loss; the second term further reduces intra-class variation
  • Because the first term matters more, the authors set $\alpha_1 > \alpha_2$.
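
The two terms can be sketched for a single quadruplet, where x_i and x_j share an identity and x_k and x_l come from two other identities. This sketch assumes g is a plain Euclidean distance (in the paper g is a learned metric), and the function names and default margins are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance, standing in for g(., .)^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quadruplet_loss(x_i, x_j, x_k, x_l, alpha1=1.0, alpha2=0.5):
    """Quadruplet loss on one (x_i, x_j, x_k, x_l) tuple.

    Assumes s_i == s_j, s_l != s_k, s_i != s_l and s_i != s_k.
    """
    positive = squared_distance(x_i, x_j)
    # First term: the usual triplet constraint sharing the anchor x_i.
    triplet_term = max(0.0, positive - squared_distance(x_i, x_k) + alpha1)
    # Second term: the negative pair (x_l, x_k) does not involve x_i.
    extra_term = max(0.0, positive - squared_distance(x_l, x_k) + alpha2)
    return triplet_term + extra_term
```

The full loss sums this quantity over all valid quadruplets in the training set or batch.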

References

http://arxiv.org/abs/1704.01719


Improving Person Re-identification by Attribute and Identity Learning

arXiv 1703

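Problem

  • Attribute recognition focuses on a person's local characteristics
  • Person re-ID focuses on the whole person
  • The authors want to combine the two

Method

  • Trains a CNN that learns re-ID while also predicting pedestrian attributes
  • This multi-task method integrates an ID classification loss and a number of attribute classification losses, and back-propagates the weighted sum of the individual losses

Takeaways

  • Recognition can be constrained by attributes (though data may be an issue)
  • Worth trying in combination with LSTM and attention mechanisms

References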

https://arxiv.org/abs/1703.07220


SVDNet for Pedestrian Retrieval

ICCV 2017

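Problem

  • When training a deep convolutional neural network (CNN) to extract pedestrian features for re-ID, as in all other typical deep learning training, the learned weight vectors are usually "disorganized": weight vectors within the same layer are usually strongly correlated (note: correlated, not linearly dependent). This correlation can introduce unnecessary, even very harmful, redundancy into the feature representation

Method

  • The base network is ResNet-50

    Training has 3 steps, together called Restraint and Relaxation Iteration (RRI)

    1. Decorrelation: each time the model converges, apply singular value decomposition to the weight matrix W of the feature representation layer, W=USV’, then replace the original W with US. The weight vectors of W are now mutually orthogonal, and the new weight vectors are (scaled) eigenvectors of WW’ computed from the original weight matrix. After this decorrelation, the previously converged model moves away from its local optimum and its classification loss on the training set increases.
    2. Restraint training: keep the W from step 1 fixed and learn the other layers' parameters until the network converges again. Note that the network converges to a suboptimal solution here, because the W of one layer is constrained. So next we lift this constraint and continue training.
    3. Relaxation training: after step 2, remove the constraint that W is fixed. The network then finds a better solution for the objective of fitting the training samples; note, only for that objective. We found experimentally that when this model is used on the test set (which contains entirely new IDs), its generalization ability is relatively weak.

    After step 3, the weight vectors in W become correlated again. We therefore iterate these 3 steps, forming RRI, until final convergence.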
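
The decorrelation step can be checked with a short derivation. This is a sketch assuming the weight vectors are the columns of W, so that the layer maps a feature f to W'f; that convention and the symbols f_1, f_2 are assumptions not stated in the notes above.

```latex
\[
\begin{aligned}
W &= U S V^{\top}, \qquad U^{\top} U = I, \quad V^{\top} V = V V^{\top} = I \\
(US)^{\top} (US) &= S\, U^{\top} U\, S = S^{2}
  \quad \text{(diagonal, so the columns of } US \text{ are mutually orthogonal)} \\
(US)(US)^{\top} &= U S^{2} U^{\top} = U S V^{\top} V S U^{\top} = W W^{\top} \\
\lVert (US)^{\top} (f_1 - f_2) \rVert^{2}
  &= (f_1 - f_2)^{\top} W W^{\top} (f_1 - f_2)
   = \lVert W^{\top} (f_1 - f_2) \rVert^{2}
\end{aligned}
\]
```

Under these assumptions, replacing W with US makes the weight vectors orthogonal without changing Euclidean distances between output features, and the columns of U are eigenvectors of WW’ because WW’U = US².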

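Takeaways

  • An orthogonality constraint can be imposed during training either with a "hard" decorrelation method or with a "soft" loss or regularization constraint
  • Orthogonalization removes correlation
  • Alternate restrained and relaxed training

References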

https://zhuanlan.zhihu.com/p/29326061

https://arxiv.org/abs/1703.05693


Person Re-Identification by Deep Joint Learning of Multi-Loss Classification

IJCAI’17

Problem

  • Existing person re-identification (re-id) methods rely mostly on either localised or global feature representation alone. This ignores their joint benefit and mutual complementary effects

Method

  • Proposes Joint Learning Multi-Loss (JLML)
  • This JLML model consists of a two-branch CNN network:
    • One local branch of m streams of an identical structure with each stream learning the most discriminative local visual features for one of m local image regions of a person bounding box image;
    • Another global branch responsible for learning the most discriminative global level features from the entire person image
  • Sharing the low-level conv layer reduces the m