Paper 1: Soft-NMS – Improving Object Detection With One Line of Code (ICCV 2017)
Paper 2: Softer-NMS – Bounding Box Regression with Uncertainty for Accurate Object Detection (CVPR 2019)
Problems Addressed
Traditional NMS suffers from several problems:
- When two objects of the same class overlap heavily, one of them is easily deleted by mistake, as in Figure 1.
- What if none of the predicted boxes around an object is good? In the situation of Figure 2 (a), neither box is a good choice.
- IoU and classification score are not strongly correlated: the highest-scoring box is not necessarily the best-localized one, as in Figure 2 (b).
Soft-NMS
Soft-NMS Algorithm Flow
NMS deletes boxes in a way that is too hard and easily removes correct detections. Soft-NMS improves on this: instead of immediately discarding a box as a duplicate once its IoU exceeds the threshold, it lowers the box's score, and only low-scoring boxes are removed at the very end. The rough flow is as follows.
Traditional NMS is blunt: any box whose IoU exceeds the preset threshold is deleted outright, which easily harms valid detections (the boxes of two different objects get treated as duplicates of one object):
$$s_i = \begin{cases} s_i, & iou(\mathcal{M}, b_i) < N_t \\ 0, & iou(\mathcal{M}, b_i) \geq N_t \end{cases}$$
Soft-NMS is gentler: it gives a highly overlapping box a chance to prove itself, lowering its score (the higher the IoU, the lower the new score) and sending it back into the queue. The most natural choice is a linear penalty that multiplies the score by $(1 - iou(\mathcal{M}, b_i))$:
$$s_i = \begin{cases} s_i, & iou(\mathcal{M}, b_i) < N_t \\ s_i\,(1 - iou(\mathcal{M}, b_i)), & iou(\mathcal{M}, b_i) \geq N_t \end{cases}$$
However, this penalty is discontinuous at the threshold, so in practice a Gaussian penalty function is used:
$$s_i = s_i\, e^{-\frac{iou(\mathcal{M},\, b_i)^2}{\sigma}}, \quad \forall b_i \notin \mathcal{D}$$
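For intuition: with $\sigma = 0.5$, a box with score $0.9$ and $iou(\mathcal{M}, b_i) = 0.8$ is rescored to $0.9 \cdot e^{-0.8^2/0.5} \approx 0.25$ by the Gaussian penalty, to $0.9 \cdot (1 - 0.8) = 0.18$ by the linear penalty, and is deleted outright by hard NMS (for any $N_t < 0.8$): the soft variants keep the box but push it far down the ranking.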
Of course, a suitable score threshold is still needed at the end to remove the remaining duplicate boxes.
Soft-NMS Implementation
```python
import numpy as np
cimport numpy as np

def cpu_soft_nms(np.ndarray[float, ndim=2] boxes, float sigma=0.5, float Nt=0.3,
                 float threshold=0.001, unsigned int method=0):
    # boxes: N x 5 array, one row per box: [x1, y1, x2, y2, score]
    # method: 0 = hard NMS, 1 = linear Soft-NMS, 2 = Gaussian Soft-NMS
    cdef unsigned int N = boxes.shape[0]
    cdef float iw, ih, box_area
    cdef float ua
    cdef int pos = 0
    cdef float maxscore = 0
    cdef int maxpos = 0
    cdef float x1, x2, y1, y2, tx1, tx2, ty1, ty2, ts, area, weight, ov

    for i in range(N):
        # find the box with the highest score at or after position i
        maxscore = boxes[i, 4]
        maxpos = i

        tx1 = boxes[i, 0]
        ty1 = boxes[i, 1]
        tx2 = boxes[i, 2]
        ty2 = boxes[i, 3]
        ts = boxes[i, 4]

        pos = i + 1
        while pos < N:
            if maxscore < boxes[pos, 4]:
                maxscore = boxes[pos, 4]
                maxpos = pos
            pos = pos + 1

        # add the max box as a detection: copy it into position i ...
        boxes[i, 0] = boxes[maxpos, 0]
        boxes[i, 1] = boxes[maxpos, 1]
        boxes[i, 2] = boxes[maxpos, 2]
        boxes[i, 3] = boxes[maxpos, 3]
        boxes[i, 4] = boxes[maxpos, 4]

        # ... and swap the i-th box into the max box's old position
        boxes[maxpos, 0] = tx1
        boxes[maxpos, 1] = ty1
        boxes[maxpos, 2] = tx2
        boxes[maxpos, 3] = ty2
        boxes[maxpos, 4] = ts

        tx1 = boxes[i, 0]
        ty1 = boxes[i, 1]
        tx2 = boxes[i, 2]
        ty2 = boxes[i, 3]
        ts = boxes[i, 4]

        # rescore every remaining box b_i against the current max box M
        pos = i + 1
        while pos < N:
            x1 = boxes[pos, 0]
            y1 = boxes[pos, 1]
            x2 = boxes[pos, 2]
            y2 = boxes[pos, 3]
            s = boxes[pos, 4]

            # compute the IoU between M and b_i
            area = (x2 - x1 + 1) * (y2 - y1 + 1)
            iw = (min(tx2, x2) - max(tx1, x1) + 1)  # intersection width; <= 0 means no overlap
            if iw > 0:
                ih = (min(ty2, y2) - max(ty1, y1) + 1)  # intersection height, likewise
                if ih > 0:
                    ua = float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih)  # union area
                    ov = iw * ih / ua  # IoU between the max box and the detection box

                    if method == 1:    # linear penalty
                        if ov > Nt:
                            weight = 1 - ov
                        else:
                            weight = 1
                    elif method == 2:  # Gaussian penalty
                        weight = np.exp(-(ov * ov) / sigma)
                    else:              # original (hard) NMS
                        if ov > Nt:
                            weight = 0
                        else:
                            weight = 1

                    boxes[pos, 4] = weight * boxes[pos, 4]

                    # if the box score falls below threshold, discard it by
                    # swapping in the last box and shrinking N
                    if boxes[pos, 4] < threshold:
                        boxes[pos, 0] = boxes[N - 1, 0]
                        boxes[pos, 1] = boxes[N - 1, 1]
                        boxes[pos, 2] = boxes[N - 1, 2]
                        boxes[pos, 3] = boxes[N - 1, 3]
                        boxes[pos, 4] = boxes[N - 1, 4]
                        N = N - 1
                        pos = pos - 1

            pos = pos + 1

    keep = [i for i in range(N)]
    return keep
```
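A minimal usage sketch (my own illustration, not from the paper's repository): the function above is Cython and must be compiled as an extension first. `boxes` is an N×5 float32 array of `[x1, y1, x2, y2, score]`, which is rescored and reordered in place, and the returned indices select the surviving detections.

```python
import numpy as np
# assumes the Cython function above has been compiled and imported as cpu_soft_nms

# two heavily overlapping detections plus one far-away detection (made-up values)
boxes = np.array([
    [100, 100, 210, 210, 0.90],
    [105, 105, 215, 215, 0.80],
    [400, 400, 500, 500, 0.70],
], dtype=np.float32)

keep = cpu_soft_nms(boxes, sigma=0.5, Nt=0.3, threshold=0.001, method=2)  # Gaussian Soft-NMS
print(boxes[keep])  # the overlapping box survives with a decayed score instead of being deleted
```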
Experiments
Softer-NMS
Soft-NMS only targets the mistaken-deletion problem and leaves the other two issues unaddressed. Before explaining Softer-NMS, we first need to introduce bounding box regression with a KL loss.
Bounding Box Regression with KL Loss
On top of the original classification and regression branches, a new regression branch is added for the box standard deviation (the spread between the ground-truth box and the corresponding prediction). The network thus estimates localization confidence while localizing, and that confidence guides the refinement of the predicted boxes.
Assume the offset between the predicted box and the ground-truth box follows a Gaussian distribution. With $x_e$ denoting the predicted box location, the one-dimensional Gaussian reads:
$$P_\Theta(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - x_e)^2}{2\sigma^2}}$$
The ground-truth box location is treated as a Dirac delta distribution (no positional uncertainty, only presence or absence):
$$P_D(x) = \delta(x - x_g)$$
Property of the Dirac delta: $\int_{-\infty}^{+\infty} P_D(x)\,dx = 1$.
The KL divergence (Kullback-Leibler divergence, also called relative entropy) measures the difference between two distributions. Let $p$ and $q$ be the true and hypothesized distributions, respectively; the KL divergence between them is:
$$\begin{aligned} D_{KL}(p\,\|\,q) &= E_p\Bigg[\log\frac{\overbrace{p(x)}^{\text{true distribution}}}{\underbrace{q(x)}_{\text{hypothesized distribution}}}\Bigg] = \sum_{x\in\chi} p(x)\log\frac{p(x)}{q(x)} \\ &= \sum_{x\in\chi}\big[p(x)\log p(x) - p(x)\log q(x)\big] \\ &= \sum_{x\in\chi} p(x)\log p(x) - \sum_{x\in\chi} p(x)\log q(x) \end{aligned}$$
The optimization objective is therefore to bring the predicted distribution close to the ground-truth distribution, i.e. to minimize their KL divergence:

$$\hat{\Theta} = \arg\min_{\Theta} D_{KL}\big(P_D(x)\,\|\,P_\Theta(x)\big)$$

Furthermore, since $P_D$ is a Dirac delta, its entropy term is a constant, and minimizing the KL divergence reduces to minimizing $-\log P_\Theta(x_g)$; dropping constants, the regression loss becomes:

$$L_{reg} = \frac{(x_g - x_e)^2}{2\sigma^2} + \frac{1}{2}\log(\sigma^2)$$
Taking partial derivatives with respect to $x_e$ and $\sigma$:

$$\frac{\partial L_{reg}}{\partial x_e} = \frac{x_e - x_g}{\sigma^2}, \qquad \frac{\partial L_{reg}}{\partial \sigma} = -\frac{(x_g - x_e)^2}{\sigma^3} + \frac{1}{\sigma}$$
Because $\sigma$ appears in the denominator, the gradient blows up as $\sigma \to 0$. To avoid this, substitute $\alpha = \log(\sigma^2)$ for $\sigma$:

$$L_{reg} = \frac{e^{-\alpha}}{2}(x_g - x_e)^2 + \frac{1}{2}\alpha$$
Following the form of the $\text{Smooth}\ L_1$ loss:
$$\text{Smooth}_{L_1}(x) = \begin{cases} 0.5x^2, & \text{if } |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$
When $|x_g - x_e| > 1$, $L_{reg}$ switches to the linear form, in analogy with $\text{Smooth}\ L_1$. The final loss is:
$$\text{Smooth}_{L_{reg}}(x) = \begin{cases} \dfrac{e^{-\alpha}}{2}(x_g - x_e)^2 + \dfrac{1}{2}\alpha, & \text{if } |x_g - x_e| < 1 \\[4pt] e^{-\alpha}\left(|x_g - x_e| - \dfrac{1}{2}\right) + \dfrac{1}{2}\alpha, & \text{otherwise} \end{cases}$$
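As a sketch of what this loss looks like in code (an illustrative implementation of the formula above, not the authors' released code; `xg`, `xe`, and `alpha` are assumed to be per-coordinate arrays, with `alpha` the log-variance $\log(\sigma^2)$ predicted by the extra branch):

```python
import numpy as np

def kl_smooth_l1_loss(xg, xe, alpha):
    """Smooth-L1-style KL regression loss.
    xg: ground-truth coordinates, xe: predicted coordinates,
    alpha: predicted log-variance log(sigma^2)."""
    diff = np.abs(xg - xe)
    quadratic = 0.5 * np.exp(-alpha) * diff ** 2  # branch for |xg - xe| < 1
    linear = np.exp(-alpha) * (diff - 0.5)        # branch otherwise
    return np.where(diff < 1, quadratic, linear) + 0.5 * alpha
```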
Softer-NMS Algorithm
Softer-NMS differs from standard NMS in that boxes above the IoU threshold are not simply discarded: they are merged into the final box by confidence-weighted averaging (sketched below).
The localization confidence depends on two factors:
- Variance: a large predicted variance means low confidence (the prediction is considered far from the target).
- IoU: a small IoU (with the highest-scoring box) means low confidence.
The classification score plays no role in the weights, because a box with a lower score may still have higher localization confidence.
Softer-NMS Algorithm Flow
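The key step is the variance-weighted merge. Below is a minimal sketch of that step, assuming each detection carries per-coordinate variances `sigma2` predicted by the KL-loss branch; the Gaussian IoU weighting $e^{-(1 - IoU)^2/\sigma_t}$ follows the paper's description, while the helper `iou_one_vs_all` and all names here are my own illustration:

```python
import numpy as np

def iou_one_vs_all(box, boxes):
    """IoU of one box [x1, y1, x2, y2] against an N x 4 array of boxes."""
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(ix2 - ix1, 0) * np.maximum(iy2 - iy1, 0)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def softer_nms_merge(boxes, sigma2, max_idx, Nt=0.5, sigma_t=0.02):
    """Merge all boxes overlapping the max-score box into one refined box.
    boxes: N x 4 coordinates, sigma2: N x 4 predicted variances."""
    ious = iou_one_vs_all(boxes[max_idx], boxes)
    mask = ious > Nt                              # boxes to merge (includes max_idx itself)
    p = np.exp(-(1 - ious[mask]) ** 2 / sigma_t)  # low IoU -> low confidence
    w = p[:, None] / sigma2[mask]                 # high variance -> low confidence
    # note: classification scores do not appear in the weights
    return (w * boxes[mask]).sum(axis=0) / w.sum(axis=0)
```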
Experiments