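AB3DMOT GitHub link
AB3DMOT environment setup reference
Overview
This post walks through the code released with the AB3DMOT paper, focusing on the multi-object tracking pipeline. I am writing it down mainly so that I can come back to it later.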
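Code walkthrough
There is a lot of code, but for the tracking pipeline the key is the track function in model.py, shown below. For readability, most of the logging code has been removed.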
    def track(self, dets_all, frame, seq_name):
        dets, info = dets_all['dets'], dets_all['info']  # dets: N x 7, float numpy array
        if self.debug_id: print('\nframe is %s' % frame)
        self.frame_count += 1

        # recall the last frames of outputs for computing ID correspondences during affinity processing
        self.id_past_output = copy.copy(self.id_now_output)
        self.id_past = [trk.id for trk in self.trackers]

        # process detection format
        dets = self.process_dets(dets)

        # tracks propagation based on velocity
        trks = self.prediction()

        # ego motion compensation, adapt to the current frame of camera coordinate
        if (frame > 0) and (self.ego_com) and (self.oxts is not None):
            trks = self.ego_motion_compensation(frame, trks)

        # matching
        trk_innovation_matrix = None
        if self.metric == 'm_dis':
            trk_innovation_matrix = [trk.compute_innovation_matrix() for trk in self.trackers]
        matched, unmatched_dets, unmatched_trks, cost, affi = \
            data_association(dets, trks, self.metric, self.thres, self.algm, trk_innovation_matrix)

        # update matched trackers with their assigned detections
        self.update(matched, unmatched_trks, dets, info)

        # create and initialise new trackers for unmatched detections
        new_id_list = self.birth(dets, info, unmatched_dets)

        # output existing valid tracks
        results = self.output()
        if len(results) > 0: results = [np.concatenate(results)]  # h,w,l,x,y,z,theta, ID, other info, confidence
        else: results = [np.empty((0, 15))]
        self.id_now_output = results[0][:, 7].tolist()  # only the active tracks that are output

        # post-processing affinity to convert to the affinity between resulting tracklets
        if self.affi_process:
            affi = self.process_affi(affi, matched, unmatched_dets, new_id_list)

        return results, affi
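track is the main function: each call processes one frame of data. Its inputs are all the detections for that frame, dets_all, and the frame index, frame.
The function breaks down into the following steps:
- Save the IDs output for the previous frame and the IDs of the current trackers
- Convert the detections into the format the program expects
- Predict the tracked objects of the previous frame forward
- If the current frame is not the first one (and ego-motion compensation is enabled with oxts data available), apply ego-motion compensation to move the predictions into the current frame's coordinate system
- Associate detections with predictions
- Update each matched tracker with its detection to obtain the Kalman posterior
- Create trackers for the detections that were not matched
- Remove trackers that have gone too long without a matched detection
- Finally, output all stable trackers and save their IDs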
The code for each step is analysed below:
def prediction(self) (step 3; def process_dets(self, dets) is the format conversion of step 2)
    def prediction(self):
        # get predicted locations from existing tracks
        trks = []
        for t in range(len(self.trackers)):
            # propagate locations
            kf_tmp = self.trackers[t]
            kf_tmp.kf.predict()
            kf_tmp.kf.x[3] = self.within_range(kf_tmp.kf.x[3])

            # update statistics
            kf_tmp.time_since_update += 1
            trk_tmp = kf_tmp.kf.x.reshape((-1))[:7]
            trks.append(Box3D.array2bbox(trk_tmp))

        return trks
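This step propagates every existing track forward with a constant-velocity model.
The function returns a list with one predicted box per known track, each built from the first 7 state values (x, y, z, theta, l, w, h).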
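To make the constant-velocity prediction concrete, here is a minimal NumPy sketch. It assumes a 10-dimensional state [x, y, z, theta, l, w, h, vx, vy, vz]; the matrices F and Q, the numbers, and the helper names are illustrative choices of mine and are not copied from the repository.

```python
import numpy as np

def within_range(theta):
    """Wrap an angle into [-pi, pi). Hypothetical helper for this sketch."""
    return (theta + np.pi) % (2 * np.pi) - np.pi

def make_transition_matrix():
    """Constant-velocity model: position += velocity once per frame."""
    F = np.eye(10)
    F[0, 7] = F[1, 8] = F[2, 9] = 1.0  # x += vx, y += vy, z += vz
    return F

def predict(x, P, F, Q):
    """Standard Kalman prediction: propagate the mean and the covariance."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    x_pred[3] = within_range(x_pred[3])  # keep the heading angle wrapped
    return x_pred, P_pred

# Assumed state layout: [x, y, z, theta, l, w, h, vx, vy, vz]
x = np.array([1.0, 2.0, 3.0, 3.5, 4.0, 1.8, 1.5, 0.5, 0.0, -0.1])
P = np.eye(10)
F = make_transition_matrix()
Q = 0.01 * np.eye(10)  # made-up process noise

x_pred, P_pred = predict(x, P, F, Q)
box_pred = x_pred[:7]  # the 7 box values handed to data association
```

Only the position changes during prediction; the size and heading are carried over, and the heading is wrapped afterwards, which mirrors the within_range call in the function above.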
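def ego_motion_compensation(self, frame, trks) (step 4)
No need to dig into this one: it is ego-motion compensation. Presumably the LiDAR and IMU have calibrated extrinsics and the compensation is driven by the IMU data.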
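As a rough illustration of what ego-motion compensation does geometrically, the sketch below moves points from the previous ego frame into the current one using two ego-to-world poses. The function name and the 4x4 pose convention are assumptions made for this example; the repository's own implementation may differ in its details.

```python
import numpy as np

def compensate_ego_motion(points_prev, T_prev, T_cur):
    """Re-express points given in the previous ego frame in the current ego frame.

    points_prev: (N, 3) array of points in the previous ego frame.
    T_prev, T_cur: 4x4 ego-to-world poses (assumed convention for this sketch).
    """
    T_rel = np.linalg.inv(T_cur) @ T_prev  # previous ego frame -> current ego frame
    homo = np.hstack([points_prev, np.ones((len(points_prev), 1))])
    return (T_rel @ homo.T).T[:, :3]

# The ego vehicle moves 1 m along x between the two frames.
T_prev = np.eye(4)
T_cur = np.eye(4)
T_cur[0, 3] = 1.0

centers_prev = np.array([[5.0, 0.0, 0.0],
                         [0.0, 2.0, 0.0]])
centers_cur = compensate_ego_motion(centers_prev, T_prev, T_cur)
```

A static object that was 5 m ahead ends up 4 m ahead after the vehicle advances by 1 m, so the predicted boxes line up with detections expressed in the current frame.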
def data_association() (step 5, one of the key steps)
    import numpy as np
    from scipy.optimize import linear_sum_assignment
    # compute_affinity, greedy_matching and best_k_matching are defined elsewhere in the repository

    def data_association(dets, trks, metric, threshold, algm='greedy', \
        trk_innovation_matrix=None, hypothesis=1):

        # if there is no item in either row/col, skip the association and return all as unmatched
        aff_matrix = np.zeros((len(dets), len(trks)), dtype=np.float32)
        if len(trks) == 0:
            return np.empty((0, 2), dtype=int), np.arange(len(dets)), [], 0, aff_matrix
        if len(dets) == 0:
            return np.empty((0, 2), dtype=int), [], np.arange(len(trks)), 0, aff_matrix

        # prepare inverse innovation matrix for m_dis
        if metric == 'm_dis':
            assert trk_innovation_matrix is not None, 'error'
            trk_inv_inn_matrices = [np.linalg.inv(m) for m in trk_innovation_matrix]
        else:
            trk_inv_inn_matrices = None

        # compute affinity matrix
        aff_matrix = compute_affinity(dets, trks, metric, trk_inv_inn_matrices)

        # association based on the affinity matrix
        if hypothesis == 1:
            if algm == 'hungar':
                row_ind, col_ind = linear_sum_assignment(-aff_matrix)  # Hungarian algorithm
                matched_indices = np.stack((row_ind, col_ind), axis=1)
            elif algm == 'greedy':
                matched_indices = greedy_matching(-aff_matrix)  # greedy matching
            else: assert False, 'error'
        else:
            # note: matched_indices is not assigned on this path in the quoted code
            cost_list, hun_list = best_k_matching(-aff_matrix, hypothesis)

        # compute total cost
        cost = 0
        for row_index in range(matched_indices.shape[0]):
            cost -= aff_matrix[matched_indices[row_index, 0], matched_indices[row_index, 1]]

        # save for unmatched objects
        unmatched_dets = []
        for d, det in enumerate(dets):
            if (d not in matched_indices[:, 0]): unmatched_dets.append(d)
        unmatched_trks = []
        for t, trk in enumerate(trks):
            if (t not in matched_indices[:, 1]): unmatched_trks.append(t)

        # filter out matches with low affinity
        matches = []
        for m in matched_indices:
            if (aff_matrix[m[0], m[1]] < threshold):
                unmatched_dets.append(m[0])
                unmatched_trks.append(m[1])
            else: matches.append(m.reshape(1, 2))
        if len(matches) == 0:
            matches = np.empty((0, 2), dtype=int)
        else: matches = np.concatenate(matches, axis=0)

        return matches, np.array(unmatched_dets), np.array(unmatched_trks), cost, aff_matrix
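First, the inputs and outputs of the function.
Inputs:
- the array of detections;
- the array of tracks after Kalman prediction (the number of detections and the number of predictions need not be equal);
- the rest are configuration parameters (metric, threshold, matching algorithm, and so on).
Outputs:
- the matched pairs, an M x 2 array of (detection index, track index);
- the indices of unmatched detections;
- the indices of unmatched predictions;
- the total cost over all assigned pairs;
- the affinity matrix.
Then the flow of the function:
- If there are no detections or no predictions, return immediately with everything marked as unmatched
- Compute the affinity matrix with the configured metric (IoU, distance, and so on)
- Run the Hungarian algorithm or greedy matching for the association (note the minus sign: the affinity is negated because the Hungarian solver minimises cost, so minimising the negated affinity maximises the total affinity)
- Compute the total cost (one of the values the function returns)
- Collect the detections and predictions that were not assigned
- Filter out assigned pairs whose affinity is below the threshold, and add their detection and prediction back to the unmatched lists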
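The flow above can be sketched on a toy affinity matrix. This is a simplified stand-in that I wrote for illustration, not the repository function: associate and the numbers in aff are made up, and only the Hungarian branch is shown, using scipy.optimize.linear_sum_assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(aff_matrix, threshold):
    """Hungarian matching on an affinity matrix, then threshold filtering.

    Rows are detections, columns are tracks. Hypothetical helper for this sketch.
    """
    n_dets, n_trks = aff_matrix.shape
    if n_dets == 0 or n_trks == 0:
        return np.empty((0, 2), dtype=int), list(range(n_dets)), list(range(n_trks))

    # The solver minimises cost, so negate the affinity to maximise it.
    row_ind, col_ind = linear_sum_assignment(-aff_matrix)

    unmatched_dets = [d for d in range(n_dets) if d not in row_ind]
    unmatched_trks = [t for t in range(n_trks) if t not in col_ind]

    matches = []
    for d, t in zip(row_ind, col_ind):
        if aff_matrix[d, t] < threshold:
            # Assigned by the solver, but too weak to count as a match.
            unmatched_dets.append(int(d))
            unmatched_trks.append(int(t))
        else:
            matches.append((int(d), int(t)))

    matches = np.array(matches, dtype=int).reshape(-1, 2)
    return matches, sorted(unmatched_dets), sorted(unmatched_trks)

# 3 detections x 2 tracks, made-up affinities (for example IoU values)
aff = np.array([[0.90, 0.10],
                [0.20, 0.05],
                [0.00, 0.00]])
matches, unmatched_dets, unmatched_trks = associate(aff, threshold=0.25)
```

Here the solver pairs detection 0 with track 0 and detection 1 with track 1, but the second pair has affinity 0.05, below the threshold, so it is split again and both sides go back to the unmatched lists.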
def update(self, matched, unmatched_trks, dets, info) (step 6)
    def update(self, matched, unmatched_trks, dets, info):
        # update matched trackers with assigned detections
        dets = copy.copy(dets)
        for t, trk in enumerate(self.trackers):
            if t not in unmatched_trks:
                d = matched[np.where(matched[:, 1] == t)[0], 0]  # a list of index
                assert len(d) == 1, 'error'

                # update statistics
                trk.time_since_update = 0  # reset because just updated
                trk.hits += 1

                # update orientation in propagated tracks and detected boxes so that they are within 90 degree
                bbox3d = Box3D.bbox2array(dets[d[0]])
                trk.kf.x[3], bbox3d[3] = self.orientation_correction(trk.kf.x[3], bbox3d[3])

                # kalman filter update with observation
                trk.kf.update(bbox3d)
                trk.kf.x[3] = self.within_range(trk.kf.x[3])
                trk.info = info[d, :][0]
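The update is easy to follow: every tracker that has a matched detection is updated. Two things to note:
- The heading angles of the detection and the prediction can disagree. The paper discusses the case where they differ by more than 90 degrees and how it is handled.
- trk.time_since_update = 0 resets the tracker's counter. The counter is incremented on every prediction, and once the number of predictions without an update reaches the configured threshold the tracker is discarded.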
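The orientation handling mentioned above can be sketched as follows. This is my own simplified version written for illustration, with hypothetical helper names: when the predicted and observed headings differ by more than 90 degrees, the prediction is flipped by pi, and a remaining wrap-around difference is removed by shifting the prediction by 2*pi, so that the Kalman update sees a small angular residual.

```python
import numpy as np

def within_range(theta):
    """Wrap an angle into [-pi, pi). Hypothetical helper for this sketch."""
    return (theta + np.pi) % (2 * np.pi) - np.pi

def orientation_correction(theta_pre, theta_obs):
    """Bring the predicted heading within 90 degrees of the observed heading."""
    theta_pre = within_range(theta_pre)
    theta_obs = within_range(theta_obs)

    diff = abs(theta_obs - theta_pre)
    # Opposite directions: flip the prediction by 180 degrees.
    if np.pi / 2.0 < diff < np.pi * 3 / 2.0:
        theta_pre = within_range(theta_pre + np.pi)

    # Difference above 270 degrees means the angles are close but on opposite
    # sides of the wrap-around point; shift the prediction by a full turn.
    if abs(theta_obs - theta_pre) >= np.pi * 3 / 2.0:
        theta_pre += 2 * np.pi if theta_obs > 0 else -2 * np.pi

    return theta_pre, theta_obs

# Prediction points roughly forward, detection points roughly backward.
pre, obs = orientation_correction(0.1, 3.0)
```

In this example the prediction of 0.1 rad is flipped to 0.1 + pi, which lies about 0.24 rad from the observation instead of 2.9 rad.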
def birth(self, dets, info, unmatched_dets) (step 7)
    def birth(self, dets, info, unmatched_dets):
        # create and initialise new trackers for unmatched detections

        # dets = copy.copy(dets)
        new_id_list = list()  # new ID generated for unmatched detections
        for i in unmatched_dets:  # a scalar of index
            trk = KF(Box3D.bbox2array(dets[i]), info[i, :], self.ID_count[0])
            self.trackers.append(trk)
            new_id_list.append(trk.id)
            # print('track ID %s has been initialized due to new detection' % trk.id)

            self.ID_count[0] += 1

        return new_id_list
Straightforward: create a KF tracker for each unmatched detection, assign it an ID, and append it to self.trackers.
def output(self) (steps 8 and 9)
    def output(self):
        # output existing tracks that have been stably associated, i.e., >= min_hits
        # and also delete tracks that have not been updated for a long time, i.e., >= max_age

        num_trks = len(self.trackers)
        results = []
        for trk in reversed(self.trackers):
            # change format from [x,y,z,theta,l,w,h] to [h,w,l,x,y,z,theta]
            d = Box3D.array2bbox(trk.kf.x[:7].reshape((7, )))  # bbox location self
            d = Box3D.bbox2array_raw(d)

            if ((trk.time_since_update < self.max_age) and (trk.hits >= self.min_hits or self.frame_count <= self.min_hits)):
                results.append(np.concatenate((d, [trk.id], trk.info)).reshape(1, -1))
            num_trks -= 1

            # death, remove dead tracklet
            if (trk.time_since_update >= self.max_age):
                self.trackers.pop(num_trks)

        return results
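This function outputs the tracks that have been stably associated. Two things to note:
- A track counts as stable when time_since_update < max_age and (hits >= min_hits or frame_count <= min_hits). That is, the number of predictions since the last update is below the threshold, and either the tracker has been updated often enough or the whole sequence is still within its first min_hits frames. The second alternative handles the start of tracking, when no tracker can have enough hits yet.
- A track counts as lost when time_since_update >= max_age, in which case it is removed from self.trackers.
To sum up, self.trackers is a container of Kalman filters, each of which tracks one object. It holds newborn trackers as well as trackers that are about to be lost. The output function picks out the stable trackers and discards the lost ones. Newborn trackers are not shown at first; once they have accumulated enough updates they become stable trackers and appear in the output.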
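The lifecycle described above can be replayed with a toy model. The class and function below are hypothetical and keep only the two counters. I assume a new tracker starts with hits = 1 and time_since_update = 0, and I use made-up thresholds min_hits = 3 and max_age = 2.

```python
from dataclasses import dataclass

@dataclass
class ToyTrack:
    track_id: int
    hits: int = 1               # assumed: the birth detection counts as a hit
    time_since_update: int = 0  # predictions since the last matched detection

def step(tracks, matched_ids, frame_count, min_hits=3, max_age=2):
    """Advance one frame and return (reported ids, surviving tracks)."""
    reported, survivors = [], []
    for trk in tracks:
        trk.time_since_update += 1       # prediction always increments
        if trk.track_id in matched_ids:  # a matched detection resets it
            trk.time_since_update = 0
            trk.hits += 1
        stable = trk.hits >= min_hits or frame_count <= min_hits
        if trk.time_since_update < max_age and stable:
            reported.append(trk.track_id)
        if trk.time_since_update < max_age:  # otherwise the track dies
            survivors.append(trk)
    return reported, survivors

# One newborn track, well past the warm-up frames.
tracks = [ToyTrack(track_id=0)]
history = []
for frame_count, matched in [(10, {0}), (11, {0}), (12, set()), (13, set())]:
    reported, tracks = step(tracks, matched, frame_count)
    history.append(reported)
```

With these settings the track is hidden in the first frame (2 hits), reported once it reaches 3 hits, still reported for one frame in which it is only predicted, and then dropped after two frames without a match.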
def process_affi(self, affi, matched, unmatched_dets, new_id_list) (post-processes the affinity matrix; not covered here)
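Summary
The comments in the original code are already fairly detailed. I am just not very familiar with Python, since I mostly use C++, so in the end I relied on GPT to help organise the code and make it easier to read.
The paper itself is an excellent piece of open-source work: it implements a complete yet concise multi-object tracking system, and there is a lot to learn from it.
This walkthrough may contain some mistakes; corrections in the comments are welcome.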
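Open questions
Finally, a few questions of my own:
- m_dis presumably refers to the Mahalanobis distance metric, but I am not sure where the covariance matrix between each tracked object and each observation comes from during tracking. The code provides an implementation, but the underlying principle is not clear to me. The Deep SORT paper also uses the Mahalanobis distance, but does not explain it in much detail.
- The detection confidence does not seem to play a role in the tracking process; as far as I can tell it is not used in the Kalman update.
- No appearance or motion cues are used for association, so the problem described in the Deep SORT paper should show up here too: the total number of IDs during tracking tends to be large, and re-identification is probably weak. That said, this is not the focus of the paper.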
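On the first question in the list above, the standard Kalman-filter answer is the innovation covariance S = H P H^T + R, where P is the predicted state covariance of the track, H the measurement matrix and R the measurement noise. The Mahalanobis distance between a detection z and a track is then sqrt((z - Hx)^T S^-1 (z - Hx)). The name compute_innovation_matrix suggests the repository follows this scheme, but I have not verified that here. The sketch below uses a made-up one-dimensional track with a position-only measurement.

```python
import numpy as np

def innovation_covariance(H, P, R):
    """S = H P H^T + R: uncertainty of the predicted measurement."""
    return H @ P @ H.T + R

def mahalanobis_distance(z, x_pred, H, S):
    """Distance between detection z and the predicted measurement H x."""
    y = z - H @ x_pred  # innovation (residual)
    return float(np.sqrt(y @ np.linalg.inv(S) @ y))

# Toy track: state = [position, velocity], measurement = position only.
x_pred = np.array([10.0, 1.0])
P = np.array([[3.0, 0.0],
              [0.0, 1.0]])   # predicted state covariance (made up)
H = np.array([[1.0, 0.0]])   # measurement picks out the position
R = np.array([[1.0]])        # measurement noise (made up)

S = innovation_covariance(H, P, R)
d = mahalanobis_distance(np.array([12.0]), x_pred, H, S)
```

Because S depends on each track's own P, every track gets its own matrix, and a residual of the same size counts as "closer" for a track whose prediction is more uncertain.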