SORT源码来自于源码,这里使用的是更改后的源码,加入了删除id输出,将匹配的距离矩阵计算方法从IOU变为GIOU,卡尔曼滤波采用定义的类别进行计算更方便于理解,至于匈牙利匹配,本文只做了简单介绍,详细流程请看链接匈牙利算法&KM算法
卡尔曼滤波
卡尔曼滤波可以看作是一种运动模型,用来对目标的轨迹进行预测,即通过过去的状态向量 X i − 1 X_{i-1} Xi−1预测现在的状态向量 X i X_i Xi并且使用确信度较高的检测结果进行预测结果的修正。
状态向量 X = [ u v s r u ˙ v ˙ s ˙ ] X=\begin{bmatrix}u \\ v \\ s \\ r \\ \dot{u} \\ \dot{v} \\ \dot{s}\\ \end{bmatrix} X=⎣⎢⎢⎢⎢⎢⎢⎢⎢⎡uvsru˙v˙s˙⎦⎥⎥⎥⎥⎥⎥⎥⎥⎤包括4个状态和3个速度,检测框的中心坐标 ( u , v ) (u,v) (u,v),检测框的面积 s = w ∗ h s = w * h s=w∗h,检测框的宽高比 r = w h r = \frac{w}{h} r=hw,以及对应的速度 u ˙ , v ˙ , s ˙ \dot{u} , \dot{v} , \dot{s} u˙,v˙,s˙(初始速度为0)
- 可以将状态向量理解为均值
向量X的协方差矩阵 P = E ( ( X − E ( X ) ) ( X − E ( X ) T ) ) = [ C o v ( u , u ) C o v ( u , v ) C o v ( u , s ) C o v ( u , r ) C o v ( u , u ˙ ) C o v ( u , v ˙ ) C o v ( u , s ˙ ) C o v ( v , u ) C o v ( v , v ) C o v ( v , s ) C o v ( v , r ) C o v ( v , u ˙ ) C o v ( v , v ˙ ) C o v ( v , s ˙ ) C o v ( s , u ) C o v ( s , v ) C o v ( s , s ) C o v ( s , r ) C o v ( s , u ˙ ) C o v ( s , v ˙ ) C o v ( s , s ˙ ) C o v ( r , u ) C o v ( r , v ) C o v ( r , s ) C o v ( r , r ) C o v ( r , u ˙ ) C o v ( r , v ˙ ) C o v ( r , s ˙ ) C o v ( u ˙ , u ) C o v ( u ˙ , v ) C o v ( u ˙ , s ) C o v ( u ˙ , r ) C o v ( u ˙ , u ˙ ) C o v ( u ˙ , v ˙ ) C o v ( u ˙ , s ˙ ) C o v ( v ˙ , u ) C o v ( v ˙ , v ) C o v ( v ˙ , s ) C o v ( v ˙ , r ) C o v ( v ˙ , u ˙ ) C o v ( v ˙ , v ˙ ) C o v ( v ˙ , s ˙ ) C o v ( s ˙ , u ) C o v ( s ˙ , v ) C o v ( s ˙ , s ) C o v ( s ˙ , r ) C o v ( s ˙ , u ˙ ) C o v ( s ˙ , v ˙ ) C o v ( s ˙ , s ˙ ) ] P = E((X-E(X))(X-E(X)^T)) = \begin{bmatrix}Cov(u,u)&Cov(u,v)&Cov(u,s)&Cov(u,r)&Cov(u,\dot{u})&Cov(u,\dot{v})&Cov(u,\dot{s}) \\ Cov(v,u)&Cov(v,v)&Cov(v,s)&Cov(v,r)&Cov(v,\dot{u})&Cov(v,\dot{v})&Cov(v,\dot{s}) \\ Cov(s,u)&Cov(s,v)&Cov(s,s)&Cov(s,r)&Cov(s,\dot{u})&Cov(s,\dot{v})&Cov(s,\dot{s}) \\ Cov(r,u)&Cov(r,v)&Cov(r,s)&Cov(r,r)&Cov(r,\dot{u})&Cov(r,\dot{v})&Cov(r,\dot{s}) \\ Cov(\dot{u},u)&Cov(\dot{u},v)&Cov(\dot{u},s)&Cov(\dot{u},r)&Cov(\dot{u},\dot{u})&Cov(\dot{u},\dot{v})&Cov(\dot{u},\dot{s}) \\ Cov(\dot{v},u)&Cov(\dot{v},v)&Cov(\dot{v},s)&Cov(\dot{v},r)&Cov(\dot{v},\dot{u})&Cov(\dot{v},\dot{v})&Cov(\dot{v},\dot{s}) \\ Cov(\dot{s},u)&Cov(\dot{s},v)&Cov(\dot{s},s)&Cov(\dot{s},r)&Cov(\dot{s},\dot{u})&Cov(\dot{s},\dot{v})&Cov(\dot{s},\dot{s})\\ \end{bmatrix} P=E((X−E(X))(X−E(X)T))=⎣⎢⎢⎢⎢⎢⎢⎢⎢⎡Cov(u,u)Cov(v,u)Cov(s,u)Cov(r,u)Cov(u˙,u)Cov(v˙,u)Cov(s˙,u)Cov(u,v)Cov(v,v)Cov(s,v)Cov(r,v)Cov(u˙,v)Cov(v˙,v)Cov(s˙,v)Cov(u,s)Cov(v,s)Cov(s,s)Cov(r,s)Cov(u˙,s)Cov(v˙,s)Cov(s˙,s)Cov(u,r)Cov(v,r)Cov(s,r)Cov(r,r)Cov(u˙,r)Cov(v˙,r)Cov(s˙,r)Cov(u,u˙)Cov(v,u˙)Cov(s,u˙)Cov(r,u˙)Cov(u˙,u˙)Cov(v˙,u˙)Cov(s˙,u˙)Cov(u,v˙)Cov(v,v˙)Cov(s,v˙)Cov(r,v˙)Cov(u˙,v˙)Cov(v˙,v˙)Cov(s˙,v˙)Cov(u,s˙)Cov(v,s˙)Cov(s,s˙)Cov(r,s˙)Cov(u˙,s˙)Cov(v˙,s˙)Cov(s˙,s˙)⎦⎥⎥⎥⎥⎥⎥⎥⎥⎤
- 其中 C o v ( u , v ) = E ( ( u − E ( u ) ) ( v − E ( v ) ) ) = E ( u v ) − E ( u ) E ( v ) = 1 n ∑ i = 1 n ( u i − u ~ ) ( ( v i − v ~ ) ) Cov(u,v) = E((u-E(u))(v-E(v)))=E(uv)-E(u)E(v) = \frac{1}{n} \sum_{i=1}^n (u_i - \tilde{u})((v_i - \tilde{v})) Cov(u,v)=E((u−E(u))(v−E(v)))=E(uv)−E(u)E(v)=n1∑i=1n(ui−u~)((vi−v~))为变量 u u u与变量 v v v的协方差, C o v ( u , u ) Cov(u,u) Cov(u,u)为变量 u u u的方差
# 初始化状态向量和协方差矩阵
z = np.array([324,148,24336,1]).reshape(-1,1) # 首次检测到的状态向量
def initiate(z):
mean_pos = z
mean_vel = np.zeros((3,1))
mean = np.r_[mean_pos, mean_vel]
covariance = np.eye(4+3, 4+3)
covariance[4:,4:] *= 1000. # 对未观测到的初始速度给出高的不确定性
covariance *= 10. # 赋予初始状态不确定性
return mean, covariance
mean, covariance = initiate(z)
print("初始化状态:\n",mean)
print("初始化协方差矩阵:\n",covariance)
初始化状态:
[[3.2400e+02]
[1.4800e+02]
[2.4336e+04]
[1.0000e+00]
[0.0000e+00]
[0.0000e+00]
[0.0000e+00]]
初始化协方差矩阵:
[[ 10. 0. 0. 0. 0. 0. 0.]
[ 0. 10. 0. 0. 0. 0. 0.]
[ 0. 0. 10. 0. 0. 0. 0.]
[ 0. 0. 0. 10. 0. 0. 0.]
[ 0. 0. 0. 0. 10000. 0. 0.]
[ 0. 0. 0. 0. 0. 10000. 0.]
[ 0. 0. 0. 0. 0. 0. 10000.]]
- 默认目标进行匀速运动且目标的宽高比不会变化,所以卡尔曼滤波的预测状态方程为 X i ^ = F X i − 1 + 0 \hat{X_i} = F X_{i-1} + 0 Xi^=FXi−1+0, F F F为状态转移矩阵(如下所示),即 当前状态 = 上一次的状态 + 速度 * 时间(时间默认为1)
- 对应的协方差矩阵 P i ^ = F P i − 1 F T + Q \hat{P_i} = FP_{i-1}F_T + Q Pi^=FPi−1FT+Q
- Q Q Q代表的是均值为0,方差为Q的高斯噪声的方差的矩阵,即认为预测不准确,添加高斯噪声(预测的不准确性为定值,即 Q Q Q为定值)
# 状态转移矩阵定义
import numpy as np
F = np.eye(4+3, 4+3)
for i in range(3):
F[i, 4 + i] = 1
print("状态转移矩阵:\n",F)
状态转移矩阵:
[[1. 0. 0. 0. 1. 0. 0.]
[0. 1. 0. 0. 0. 1. 0.]
[0. 0. 1. 0. 0. 0. 1.]
[0. 0. 0. 1. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0.]
[0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 1.]]
# 预测噪声定义
import numpy as np
Q = np.eye(4+3, 4+3)
Q[-1,-1] *= 0.01 # 预测时面积变化速度的噪声较小
Q[4:,4:] *= 0.01 # 预测时4个状态变化的噪声较小
print("预测噪声协方差矩阵:\n",Q)
预测噪声协方差矩阵:
[[1.e+00 0.e+00 0.e+00 0.e+00 0.e+00 0.e+00 0.e+00]
[0.e+00 1.e+00 0.e+00 0.e+00 0.e+00 0.e+00 0.e+00]
[0.e+00 0.e+00 1.e+00 0.e+00 0.e+00 0.e+00 0.e+00]
[0.e+00 0.e+00 0.e+00 1.e+00 0.e+00 0.e+00 0.e+00]
[0.e+00 0.e+00 0.e+00 0.e+00 1.e-02 0.e+00 0.e+00]
[0.e+00 0.e+00 0.e+00 0.e+00 0.e+00 1.e-02 0.e+00]
[0.e+00 0.e+00 0.e+00 0.e+00 0.e+00 0.e+00 1.e-04]]
# 卡尔曼预测方程
def predict(mean, covariance):
mean = np.dot(F, mean)
covariance = np.linalg.multi_dot((
F, covariance, F.T)) + Q
return mean, covariance
mean, covariance = predict(mean, covariance)
print("预测状态:\n",mean)
print("预测协方差矩阵:\n",covariance)
预测状态:
[[3.2400e+02]
[1.4800e+02]
[2.4336e+04]
[1.0000e+00]
[0.0000e+00]
[0.0000e+00]
[0.0000e+00]]
预测协方差矩阵:
[[10011. 0. 0. 0. 10000. 0.
0. ]
[ 0. 10011. 0. 0. 0. 10000.
0. ]
[ 0. 0. 10011. 0. 0. 0.
10000. ]
[ 0. 0. 0. 11. 0. 0.
0. ]
[10000. 0. 0. 0. 10000.01 0.
0. ]
[ 0. 10000. 0. 0. 0. 10000.01
0. ]
[ 0. 0. 10000. 0. 0. 0.
10000.0001]]
预测得到的
X
i
^
\hat{X_i}
Xi^为当前的状态和速度,转换为观测值通过观测转移矩阵
H
H
H(使用np.eye(4, 4+3)
),即 观测向量
=
H
X
i
^
+
0
= H\hat{X_i}+0
=HXi^+0,但通过传感器检测(这里为检测模型)得到的状态为
z
i
=
[
u
v
s
r
]
z_i = \begin{bmatrix}u \\ v \\ s \\ r \\ \end{bmatrix}
zi=⎣⎢⎢⎡uvsr⎦⎥⎥⎤
这时需要一个系数来判断应该相信预测的
X
i
^
\hat{X_i}
Xi^还是传感器检测到的
z
i
z_i
zi,这个系数就是卡尔曼增益矩阵
K
=
P
i
^
H
T
(
H
P
i
^
H
T
+
R
)
−
1
K = \hat{P_i}H^T(H\hat{P_i}H^T+R)^{-1}
K=Pi^HT(HPi^HT+R)−1
- R R R代表的是均值为0,方差为R的高斯噪声的方差的矩阵,即认为传感器检测不准确,添加高斯噪声(传感器的不准确性为定值,即 R R R为定值),在状态观测 H X i ^ + 0 H\hat{X_i}+0 HXi^+0,协方差 H P i ^ H T + R H\hat{P_i}H^T+R HPi^HT+R
- ( H P i ^ H T + R ) (H\hat{P_i}H^T+R) (HPi^HT+R)为观测向量的协方差矩阵,协方差矩阵为正定矩阵, P T = P − 1 P^T=P^{-1} PT=P−1
求到 K K K后,可以使用卡尔曼增益矩阵对状态进行更新,即预测的状态和传感器得到的状态通过卡尔曼增益进行合并得到的新状态为目标实际的状态
更新状态方程为
X
i
=
X
i
^
+
K
(
z
i
−
H
X
i
^
)
X_i=\hat{X_i} + K(z_i - H\hat{X_i})
Xi=Xi^+K(zi−HXi^)
协方差更新
P
i
=
P
i
^
−
K
H
P
i
^
P_i=\hat{P_i}-KH\hat{P_i}
Pi=Pi^−KHPi^
# 观测转移矩阵
H = np.eye(4, 4+3)
print("观测转移矩阵:\n",H)
观测转移矩阵:
[[1. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0.]]
# 观测噪声定义
R = np.eye(4, 4)
R[2:,2:] *= 10. # s和r观测噪声*10
print("观测噪声协方差矩阵:\n",R)
观测噪声协方差矩阵:
[[ 1. 0. 0. 0.]
[ 0. 1. 0. 0.]
[ 0. 0. 10. 0.]
[ 0. 0. 0. 10.]]
# 卡尔曼状态更新
import scipy.linalg
def project( mean, covariance): # 观测转移
mean = np.dot(H, mean)
covariance = np.linalg.multi_dot((
H, covariance, H.T))
return mean, covariance + R # 状态向量4*1,协方差矩阵4*4
def update( mean, covariance, z):
projected_mean, projected_cov = project(mean, covariance) # 观测向量4*1,协方差矩阵4*4
chol_factor, lower = scipy.linalg.cho_factor(
projected_cov, lower=True, check_finite=False) # 分解协方差矩阵 P(正定矩阵)=L*L’
kalman_gain = scipy.linalg.cho_solve(
(chol_factor, lower), np.dot(covariance, H.T).T,
check_finite=False).T # 求卡尔曼增益矩阵
innovation = z - projected_mean
new_mean = mean + np.dot(kalman_gain,innovation) # 更新状态向量
new_covariance = covariance - np.linalg.multi_dot(( # 更新协方差矩阵
kalman_gain, projected_cov, kalman_gain.T))
return new_mean, new_covariance
z = np.array([404,198,42336,0.907407]).reshape(-1,1) # 第二次检测到的状态向量
new_mean, new_covariance = update( mean, covariance, z)
print("更新状态:\n",new_mean)
print("更新协方差矩阵:\n",new_covariance)
更新状态:
[[4.03992010e+02]
[1.97995006e+02]
[4.23180377e+04]
[9.51498905e-01]
[7.99041151e+01]
[4.99400719e+01]
[1.79622792e+04]]
更新协方差矩阵:
[[ 0.99990012 0. 0. 0. 0.99880144 0.
0. ]
[ 0. 0.99990012 0. 0. 0. 0.99880144
0. ]
[ 0. 0. 9.99002096 0. 0. 0.
9.97904401]
[ 0. 0. 0. 5.23809524 0. 0.
0. ]
[ 0.99880144 0. 0. 0. 11.99561726 0.
0. ]
[ 0. 0.99880144 0. 0. 0. 11.99561726
0. ]
[ 0. 0. 9.97904401 0. 0. 0.
20.95609242]]
匈牙利匹配
求解二分图的最大匹配问题,即集合U与集合V中匹配的问题,详细流程可看教程匈牙利算法&KM算法
使用函数_, x, y = lap.lapjv(cost_matrix, extend_cost=True)
或x, y = linear_sum_assignment(cost_matrix)
求出匹配关系
- cost_matrix为关系矩阵,即u中每个元素与v中每个元素的距离,距离越小越相似
linear_sum_assignment()
方法中x为列表表示匹配上的U中元素的索引,y为匹配上的V中的元素的索引
def linear_assignment(cost_matrix): # 匈牙利匹配法
try:
import lap
_, x, y = lap.lapjv(cost_matrix, extend_cost=True)
return np.array([[y[i],i] for i in x if i >= 0]) #
except ImportError:
from scipy.optimize import linear_sum_assignment
x, y = linear_sum_assignment(cost_matrix)
return np.array(list(zip(x, y)))
SORT跟踪流程
- 第一帧检测到的 m m m个检测框送入跟踪器,加入到跟踪目标中,此时跟踪器中有 m m m个目标
- 第二帧检测到 n n n个检测框,先对 m m m个目标依次进行卡尔曼状态预测,使用预测得到的 m m m个目标状态与检测得到的 n n n个检测框计算 1 − I O U 1-IOU 1−IOU(表示距离,距离越小越相似),得到 m ∗ n m* n m∗n的距离矩阵
- m ∗ n m* n m∗n的距离矩阵送入匈牙利算法进行匹配得到 a ∗ 2 ( a ≤ m , b ≤ n ) a* 2(a\leq m,b\leq n) a∗2(a≤m,b≤n)的匹配矩阵,为匹配上的目标中的索引和检测框中的索引
- 对成功匹配到的 a a a个目标使用其匹配到的检测框进行卡尔曼更新
- 没有匹配到的 n − a n-a n−a个检测框,初始化加入到跟踪目标中,此时跟踪器中有 m + n − a m+n-a m+n−a个目标
- 没有匹配到的 m − a m-a m−a个目标,判断其连续消失帧数是否超过阈值,超过阈值则删除
- 判断所有目标的连续跟踪帧数是否超过阈值,超过则输出状态向量 X X X,即这个目标确实存在
- 后面的帧处理方法同第二帧
定义卡尔曼计算过程
class KalmanFilter(object):
def __init__(self):
self.ndim,self.v_dim, dt = 4,3, 1.
# Create Kalman filter model matrices.
self.F = np.eye(self.ndim+self.v_dim, self.ndim+self.v_dim)
for i in range(self.v_dim):
self.F[i, self.ndim + i] = dt #F
self.H = np.eye(self.ndim, self.ndim+self.v_dim) #H
# Motion and observation uncertainty are chosen relative to the current
# state estimate. These weights control the amount of uncertainty in
# the model. This is a bit hacky.
self.R = np.eye(self.ndim, self.ndim)
self.Q = np.eye(self.ndim+self.v_dim, self.ndim+self.v_dim)
self.R[2:,2:] *= 10. # 除x,y,其他的观测噪声*10
self.Q[-1,-1] *= 0.01 # 预测噪声
self.Q[4:,4:] *= 0.01
def initiate(self, measurement): # 初始化状态和协方差矩阵
mean_pos = measurement
mean_vel = np.zeros((3,1))
mean = np.r_[mean_pos, mean_vel]
covariance = np.eye(self.ndim+self.v_dim, self.ndim+self.v_dim)
covariance[4:,4:] *= 1000. # 对未观测到的初始速度给出高的不确定性,k时刻的状态变量的协方差矩阵P(k) = F*P(k-1)*F’+ Q
covariance *= 10. # 默认定义的状态变量的协方差矩阵(误差相关矩阵P, 度量估计值的精确程度)是np.eye(dim_x),将P中的数值与10, 1000相乘,赋值不确定性
return mean, covariance
def predict(self, mean, covariance): # 状态转移预测
mean = np.dot(self.F, mean)
covariance = np.linalg.multi_dot((
self.F, covariance, self.F.T)) + self.Q
return mean, covariance
def project(self, mean, covariance): # 观测转移
mean = np.dot(self.H, mean)
covariance = np.linalg.multi_dot((
self.H, covariance, self.H.T))
return mean, covariance + self.R # 状态向量4*1,协方差矩阵4*4
def update(self, mean, covariance, measurement): # 更新目标
projected_mean, projected_cov = self.project(mean, covariance) # 状态向量4*1,协方差矩阵4*4
chol_factor, lower = scipy.linalg.cho_factor(
projected_cov, lower=True, check_finite=False) # 分解协方差矩阵,返回下三角矩阵和lower布尔值,a(正定矩阵)=L*L’
kalman_gain = scipy.linalg.cho_solve(
(chol_factor, lower), np.dot(covariance, self.H.T).T,
check_finite=False).T # 更新卡尔曼增益矩阵,给定协方差矩阵的分解矩阵,求协方差矩阵*x’ = b,即K = x = 协方差矩阵的逆矩阵*b,协方差矩阵的转置=协方差矩阵的逆
innovation = measurement - projected_mean
new_mean = mean + np.dot(kalman_gain,innovation) # 更新状态向量
new_covariance = covariance - np.linalg.multi_dot(( # 更新协方差矩阵
kalman_gain, projected_cov, kalman_gain.T))
return new_mean, new_covariance
定义跟踪目标
def convert_bbox_to_z(bbox): #将bbox由[x1,y1,x2,y2]形式转为 [框中心点x,框中心点y,框面积s,宽高比例r]^T
w = bbox[2] - bbox[0]
h = bbox[3] - bbox[1]
x = bbox[0] + w/2.
y = bbox[1] + h/2.
s = w * h #scale is just area
r = w / float(h)
return np.array([x, y, s, r]).reshape((4, 1)) #将数组转为4行一列形式,即[x,y,s,r]^T
def convert_x_to_bbox(x,score=None): #将[x,y,s,r]形式的bbox,转为[x1,y1,x2,y2]形式
w = np.sqrt(x[2]*x[3]) #w=sqrt(w*h * w/h)
h = x[2]/w #h=w*h/w
if(score==None): #如果检测框不带置信度
return np.array([x[0]-w/2.,x[1]-h/2.,x[0]+w/2.,x[1]+h/2.]).reshape((1,4)) #返回[x1,y1,x2,y2]
else: #如果加测框带置信度
return np.array([x[0]-w/2.,x[1]-h/2.,x[0]+w/2.,x[1]+h/2.,score]).reshape((1,5)) #返回[x1,y1,x2,y2,score]
class KalmanBoxTracker(object):
def __init__(self,bbox,id,kf):
#定义匀速模型
self.x,self.P = kf.initiate(convert_bbox_to_z(bbox)) #将bbox转为 [x,y,s,r]^T形式,赋给状态变量X的前4位
self.time_since_update = 0 # 距离上次更新次数
self.id = id
self.history = []
self.hits = 0
self.hit_streak = 0 # 连续更新次数
self.age = 0
self.score = bbox[4]
def update(self,bbox,kf): # 匹配到才更新
"""
Updates the state vector with observed bbox.
"""
self.time_since_update = 0
self.history = []
self.hits += 1
self.hit_streak += 1
self.score = bbox[4]
self.x, self.P = kf.update(self.x,self.P,convert_bbox_to_z(bbox))
def predict(self,kf): # 先预测
if((self.x[6]+self.x[2])<=0): #当面积预测<0时,将面积速度改为0,面积不变
self.x[6] *= 0.0
self.x, self.P = kf.predict(self.x, self.P)
self.age += 1
if(self.time_since_update>0):
self.hit_streak = 0
self.time_since_update += 1
self.history.append(convert_x_to_bbox(self.x))
return self.history[-1]
def get_state(self):
return convert_x_to_bbox(self.x,self.score)
定义匹配过程
def linear_assignment(cost_matrix): # 匈牙利匹配法,cost_matrix为左目标与右目标的每个权重(这里为iou)
try:
import lap
_, x, y = lap.lapjv(cost_matrix, extend_cost=True)
return np.array([[y[i],i] for i in x if i >= 0]) #
except ImportError:
from scipy.optimize import linear_sum_assignment
x, y = linear_sum_assignment(cost_matrix)
return np.array(list(zip(x, y)))
def iou_batch(bb_test, bb_gt): # 实际框与预测框进行iou计算
"""
From SORT: Computes IOU between two bboxes in the form [x1,y1,x2,y2]
"""
bb_gt = np.expand_dims(bb_gt, 0) # 1*n*4
bb_test = np.expand_dims(bb_test, 1) # m*1*4
C_x1 = np.minimum(bb_test[..., 0], bb_gt[..., 0])
C_y1 = np.minimum(bb_test[..., 1], bb_gt[..., 1])
C_x2 = np.maximum(bb_test[..., 2], bb_gt[..., 2])
C_y2 = np.maximum(bb_test[..., 3], bb_gt[..., 3])
C_w = np.maximum(0., C_x2 - C_x1)
C_h = np.maximum(0., C_y2 - C_y1)
g = C_w*C_h
xx1 = np.maximum(bb_test[..., 0], bb_gt[..., 0]) # 1*n m*1 自动扩展为m*n
yy1 = np.maximum(bb_test[..., 1], bb_gt[..., 1])
xx2 = np.minimum(bb_test[..., 2], bb_gt[..., 2])
yy2 = np.minimum(bb_test[..., 3], bb_gt[..., 3])
w = np.maximum(0., xx2 - xx1)
h = np.maximum(0., yy2 - yy1)
i = w * h
u = ((bb_test[..., 2] - bb_test[..., 0]) * (bb_test[..., 3] - bb_test[..., 1]) #IOU=(bb_test和bb_gt框相交部分面积)/(bb_test框面积+bb_gt框面积 - 两者相交面积)
+ (bb_gt[..., 2] - bb_gt[..., 0]) * (bb_gt[..., 3] - bb_gt[..., 1]) - i)
#iou = i/u
giou = i/u - (g-u)/g
return(1-giou)
def associate_detections_to_trackers(detections,trackers,iou_threshold = 0.3): #用于将检测与跟踪进行匹配
if(len(trackers)==0):
return np.empty((0,2),dtype=int), np.arange(len(detections)), np.empty((0,5),dtype=int)
iou_matrix = iou_batch(detections, trackers)# 实际框与预测框的iou
if min(iou_matrix.shape) > 0:
a = (iou_matrix < iou_threshold).astype(np.int32)
if a.sum(1).max() == 1 and a.sum(0).max() == 1: # 只有一个实际框与预测框的iou>阈值
matched_indices = np.stack(np.where(a), axis=1) # 匹配上的每个实际框detections的索引对应预测框trackers的索引
else:
matched_indices = linear_assignment(iou_matrix) # 匈牙利匹配
else:
matched_indices = np.empty(shape=(0,2))
unmatched_detections = [] # 没有匹配到的检测框与预测框
for d, det in enumerate(detections):
if(d not in matched_indices[:,0]):
unmatched_detections.append(d) #如果检测器中第d个检测结果不在匹配结果索引中,则d未匹配上
unmatched_trackers = []
for t, trk in enumerate(trackers):
if(t not in matched_indices[:,1]): # 如果跟踪器中第t个跟踪结果不在匹配结果索引中,则t未匹配上
unmatched_trackers.append(t)
#filter out matched with low IOU
matches = []
for m in matched_indices: # 匹配到的预测框与检测框必须>阈值
if(iou_matrix[m[0], m[1]]>iou_threshold):
unmatched_detections.append(m[0])
unmatched_trackers.append(m[1])
else:
matches.append(m.reshape(1,2))
if(len(matches)==0):
matches = np.empty((0,2),dtype=int)
else:
matches = np.concatenate(matches,axis=0)
return matches, np.array(unmatched_detections), np.array(unmatched_trackers)
定义跟踪器
class Sort(object): # 跟踪器
def __init__(self, max_age=1, min_hits=3, iou_threshold=0.8):
"""
Sets key parameters for SORT
"""
self.max_age = max_age # 最大遗失帧数
self.min_hits = min_hits # 确认目标存在的最小连续跟踪帧数
self.iou_threshold = iou_threshold # iou阈值
self.trackers = [] # 存放所有目标,每个目标都是KalmanBoxTracker对象
self.frame_count = 0 # 帧计数
self.next_id = 0 # id计数,下一个目标的id
self.kf = KalmanFilter() # 卡尔曼滤波器
def update(self, dets=np.empty((0, 5))): # 输入的是检测结果[x1,y1,x2,y2,score]形式
trackers_pop_id = []
self.frame_count += 1
# get predicted locations from existing trackers.
trks = np.zeros((len(self.trackers), 5)) # 根据当前所有卡尔曼跟踪器的个数创建二维零矩阵,维度为:卡尔曼跟踪器ID个数x 5 (这5列内容为bbox与ID)
to_del = [] # 存放待删除
ret = [] # 存放最后返回的结果
for t, trk in enumerate(trks): # 循环遍历卡尔曼跟踪器列表
pos = self.trackers[t].predict(self.kf)[0] # 用卡尔曼跟踪器t 预测 对应物体在当前帧中的bbox
trk[:] = [pos[0], pos[1], pos[2], pos[3], 0]
if np.any(np.isnan(pos)): # 预测位置为空直接删除目标
to_del.append(t)
trks = np.ma.compress_rows(np.ma.masked_invalid(trks)) # 将预测为空的卡尔曼跟踪器所在行删除,最后trks中存放的是上一帧中被跟踪的所有物体在当前帧中预测的非空bbox
for t in reversed(to_del):
trackers_pop_id.apend(self.trackers[t].id+1)
self.trackers.pop(t) # 如果预测的bbox为空,那么将第t个卡尔曼跟踪目标删除
matched, unmatched_dets, unmatched_trks = associate_detections_to_trackers(dets,trks, self.iou_threshold) #对传入的检测结果 与 上一帧跟踪物体在当前帧中预测的结果做关联,返回匹配的目标矩阵matched, 新增目标的矩阵unmatched_dets, 离开画面的目标矩阵unmatched_trks
for m in matched: # 匹配成功的目标进行卡尔曼更新
self.trackers[m[1]].update(dets[m[0], :],self.kf)
#对于新增的未匹配的检测结果,创建并初始化跟踪器
for i in unmatched_dets: # 新增目标
trk = KalmanBoxTracker(dets[i,:],self.next_id,self.kf) # 将新增的未匹配的检测结果dets[i,:]传入KalmanBoxTracker
self.trackers.append(trk) # 将新创建和初始化的跟踪器trk 传入trackers
self.next_id += 1
i = len(self.trackers)
for trk in reversed(self.trackers): # 对新的卡尔曼跟踪器集进行倒序遍历
d = trk.get_state()[0] # 获取trk跟踪器的状态 [x1,y1,x2,y2,score]
if (trk.time_since_update < 1) and (trk.hit_streak >= self.min_hits or self.frame_count <= self.min_hits):
ret.append(np.concatenate((d,[trk.id+1])).reshape(1,-1))
i -= 1
# remove dead tracklet
if(trk.time_since_update > self.max_age):
trackers_pop_id.append(self.trackers[i].id+1)
self.trackers.pop(i)
if(len(ret)>0):
return np.concatenate(ret),trackers_pop_id
ret.append(np.concatenate((d,[trk.id+1])).reshape(1,-1))
i -= 1
# remove dead tracklet
if(trk.time_since_update > self.max_age):
trackers_pop_id.append(self.trackers[i].id+1)
self.trackers.pop(i)
if(len(ret)>0):
return np.concatenate(ret),trackers_pop_id
return np.empty((0,6)),trackers_pop_id