2018 ICME paper notes: real-time multi-person tracking with deeply learned candidate selection and person re-ID

REAL-TIME MULTIPLE PEOPLE TRACKING WITH DEEPLY LEARNED CANDIDATE SELECTION AND PERSON RE-IDENTIFICATION

(Long Chen, Tsinghua University)

Source code: https://github.com/longcw/MOTDT

Color key: pink = key algorithms; purple = unfamiliar terms; green = citations, formulas still to be added, and links.

[1] S.-H. Bae and K.-J. Yoon, "Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2017.

2018 IEEE International Conference on Multimedia and Expo (ICME)

 

ABSTRACT

Online multi-object tracking is a fundamental problem in time-critical video analysis applications.
A major challenge in the popular tracking-by-detection framework is how to associate unreliable detection results with existing tracks. In this paper, we propose to handle unreliable detection by collecting candidates from outputs of both detection and tracking.
The intuition behind generating redundant candidates is that detection and tracks can complement each other in different scenarios.


Detection results of high confidence prevent tracking drifts in the long term, and predictions of tracks can handle noisy detection caused by occlusion. 
In order to apply optimal selection from a considerable amount of candidates in real-time, we present a novel scoring function based on a fully convolutional neural network, that shares most computations on the entire image. 
Moreover, we adopt a deeply learned appearance representation, which is trained on large-scale person re-identification datasets, to improve the identification ability of our tracker.
 Extensive experiments show that our tracker achieves real-time and state-of-the-art performance on a widely used people tracking benchmark.


INTRODUCTION

Both intra-category occlusion and unreliable detection are tremendous challenges in such a tracking framework [1, 2].
Note: both intra-category occlusion and unreliable detection degrade tracking.

Multiple cues,including motion, shape and object appearances, are fused to mitigate this problem [3, 4].
Note: motion, shape, and appearance cues are fused to mitigate this problem.

Some studies proposed to handle unreliable detection in a batch mode [2, 5, 6].
Note: these target the unreliable-detection problem.

Yan et al. [4] proposed to treat the tracker and object detector as two independent identities, and keep results of them as candidates. 
They selected candidates based on hand-crafted features, e.g., color histogram, optical flow, and motion features.

This paper:
On the one hand, reliable predictions from the tracker can be used for short-term association in case of missing detection or inaccurate bounding boxes.
On the other hand, confident detection results are essential to prevent tracks from drifting to the background in the long term.
How to score outputs of both detection and tracks in a unified way is still an open question.

In this paper, we take full advantage of deep neural networks to tackle unreliable detection and intra-category occlusion.
Our contribution is threefold.
First, we handle unreliable detection in online tracking by combining both detection and tracking results as candidates, and selecting optimal candidates based on deep neural networks.
Second, we present a hierarchical data association strategy, which utilizes spatial information and deeply learned person re-identification (ReID) features.
Third, we demonstrate real-time and state-of-the-art performance of our tracker on a widely used people tracking benchmark.

RELATED WORK

Our framework leverages deeply learned ReID features in an online mode, to improve the identification ability when coping with the problem of intra-category occlusion.

PROPOSED METHOD

Framework Overview

In this work, we extend traditional tracking-by-detection by collecting candidates from outputs of both detection and tracks. 
Our framework consists of two sequential tasks, that is, candidate selection and data association.
We first measure all the candidates using a unified scoring function.
A discriminatively trained object classifier and a well-designed tracklet confidence are fused to formulate the scoring function, as described in Section 3.2 and Section 3.3.
Non-maximal suppression (NMS) is subsequently performed with the estimated scores. 
After obtaining candidates without redundancy, we use both appearance representations and spatial information to hierarchically associate existing tracks with the selected candidates. 
Our appearance representations are deeply learned from the person re-identification as described in Section 3.4. 
Hierarchical data association is detailed in Section 3.5.
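To make the candidate-selection step concrete, here is a minimal sketch that pools detection boxes and track predictions into one candidate set and prunes overlaps with greedy NMS. It assumes every candidate already carries a score from the unified scoring function; the helper names and the 0.5 IoU threshold are illustrative assumptions, not values taken from the paper or the MOTDT code.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximal suppression; returns indices of the kept boxes."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thr]
    return keep

def select_candidates(det_boxes, det_scores, trk_boxes, trk_scores):
    """Pool candidates from detections and track predictions, then prune with NMS."""
    boxes = np.concatenate([det_boxes, trk_boxes], axis=0)
    scores = np.concatenate([det_scores, trk_scores], axis=0)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]
```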
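The data association itself is only summarized in this overview (the paper details it in Section 3.5, which is not quoted in these notes). Purely as an illustration of a two-stage association, the sketch below first matches tracks to candidates by ReID feature distance and then matches the remaining ones by IoU, reusing the iou helper from the previous sketch and assuming non-empty inputs; the thresholds and function names are assumptions, not the paper's values.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cosine_distance(a, b):
    """Pairwise cosine distance between two sets of ReID embeddings."""
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-9)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-9)
    return 1.0 - a @ b.T

def match(cost, max_cost):
    """Hungarian matching, keeping only pairs whose cost is below max_cost."""
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]

def hierarchical_association(track_feats, track_boxes, cand_feats, cand_boxes):
    # Stage 1: appearance matching with deeply learned ReID features.
    pairs = match(cosine_distance(track_feats, cand_feats), max_cost=0.4)
    done_t = {t for t, _ in pairs}
    done_c = {c for _, c in pairs}
    # Stage 2: spatial (IoU) matching for tracks and candidates left unmatched.
    rest_t = [t for t in range(len(track_boxes)) if t not in done_t]
    rest_c = [c for c in range(len(cand_boxes)) if c not in done_c]
    if rest_t and rest_c:
        iou_cost = np.array([[1.0 - iou(track_boxes[t], cand_boxes[c]) for c in rest_c]
                             for t in rest_t])
        pairs += [(rest_t[r], rest_c[c]) for r, c in match(iou_cost, max_cost=0.7)]
    return pairs  # list of (track index, candidate index)
```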

3.3. Tracklet Confidence and Scoring Function

Given a new frame, we estimate the new location of each existing track using the Kalman filter.
These predictions are adopted to handle detection failures caused by varying visual properties of objects and occlusion in crowded scenes. 
But they are not suitable for long-term tracking. 
The accuracy of the Kalman filter could decrease if it is not updated by detection over a long time. 
Tracklet confidence is designed to measure the accuracy of the filter using temporal information.
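The paper relies on a Kalman filter for these per-track motion predictions. As a hedged sketch only (the MOTDT code may configure its filter differently), the snippet below builds a constant-velocity filter over the box center and size using the filterpy library; the noise scales are illustrative assumptions.

```python
import numpy as np
from filterpy.kalman import KalmanFilter

def make_box_filter(box):
    """Constant-velocity Kalman filter over (cx, cy, w, h) of a bounding box."""
    kf = KalmanFilter(dim_x=8, dim_z=4)  # state: position/size plus their velocities
    kf.F = np.eye(8)
    for i in range(4):
        kf.F[i, i + 4] = 1.0             # each of cx, cy, w, h moves by its velocity per frame
    kf.H = np.eye(4, 8)                  # only (cx, cy, w, h) is observed
    kf.R *= 10.0                         # measurement noise (illustrative)
    kf.P *= 100.0                        # initial state uncertainty (illustrative)
    kf.x[:4, 0] = box                    # initialize from the first associated detection
    return kf

# Per frame: kf.predict() yields the track's candidate box as kf.x[:4, 0];
# kf.update(detection_box) is called only when a detection is associated,
# which is why accuracy degrades when no detection arrives for a long time.
```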

A tracklet is generated through temporal association of candidates from consecutive frames.
We can split a track into a set of tracklets, since a track can be interrupted and retrieved several times during its lifetime.
Every time a track is retrieved from lost state, the Kalman filter will be reinitialized.
 Therefore, only the information of the last tracklet is utilized to formulate the confidence of the track.
Here we define L_det as the number of detection results associated with the tracklet, and
L_trk as the number of track predictions made after the last associated detection.
The tracklet confidence is defined as:
$s_{trk} = \max\left(1 - \log(1 + \alpha \cdot L_{trk}),\ 0\right) \cdot \mathbb{1}(L_{det} \geq 2)$

where $\mathbb{1}(\cdot)$ is the indicator function and $\alpha$ is a weight parameter, so the confidence is non-zero only after at least two detections have been associated with the tracklet, and it decays the longer the track relies on predictions alone.
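A minimal sketch of this confidence term, together with the bookkeeping of L_det and L_trk it relies on, is given below. The class name and the default alpha = 0.05 are assumptions made for illustration; the excerpt above does not state the value of alpha used in the paper.

```python
import math

class TrackletConfidence:
    """Maintains L_det and L_trk for the current tracklet of a track and evaluates
    s_trk = max(1 - log(1 + alpha * L_trk), 0) * 1(L_det >= 2)."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha
        self.n_det = 0  # L_det: detections associated since the tracklet started
        self.n_trk = 0  # L_trk: predictions used since the last associated detection

    def on_detection(self):
        """Call when a detection is associated with the track in the current frame."""
        self.n_det += 1
        self.n_trk = 0

    def on_prediction(self):
        """Call when only the Kalman prediction is used in the current frame."""
        self.n_trk += 1

    def reset(self):
        """Call when the track is retrieved from the lost state (a new tracklet starts)."""
        self.n_det = 0
        self.n_trk = 0

    def value(self):
        if self.n_det < 2:   # indicator term: at least two detections must be associated
            return 0.0
        return max(1.0 - math.log(1.0 + self.alpha * self.n_trk), 0.0)
```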
