Paper: Improving Object Detection With One Line of Code
GitHub: https://github.com/bharatsingh430/soft-nms
ICCV 2017
Hard NMS vs. Soft NMS:
B: the list of candidate boxes
S: the candidate-box scores, in one-to-one correspondence with B
D: the final set of output boxes
Nt: the IoU threshold
M: the highest-scoring box in the current iteration
Hard NMS and soft NMS differ in exactly one step: the line highlighted in red versus the line highlighted in green in the paper's algorithm pseudocode.
Hard NMS simply removes any candidate box whose IoU with M exceeds the preset threshold. Soft NMS does not remove the box; instead it lowers the box's score via a function f, where f is a decay function of the IoU.
The bounding-box extraction pipeline in object detection:
Hard NMS:
In standard NMS, when the IoU between a box and M exceeds the threshold Nt, the lower-scoring box is removed, i.e., its score is set to 0. Boxes whose IoU with M is below the threshold Nt are left untouched and keep their original score si.
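Reconstructing the standard-NMS re-scoring rule that the description above corresponds to (the equation image is missing from these notes):

```latex
s_i =
\begin{cases}
  s_i, & \mathrm{iou}(M, b_i) < N_t \\
  0,   & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}
```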
Soft NMS:
The idea of soft NMS is to lower the score of any box whose IoU with M exceeds the threshold Nt, rather than removing it. This leads naturally to a linear re-scoring rule: when the IoU between boxes is below the threshold Nt, nothing is done; when it exceeds Nt, the score si is multiplied by (1 - iou). Since 1 - iou lies between 0 and 1, this lowers the scores of high-IoU boxes.
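Written out, the linear re-scoring rule described above is (reconstructed from the paper, since the equation image is missing):

```latex
s_i =
\begin{cases}
  s_i, & \mathrm{iou}(M, b_i) < N_t \\
  s_i \,\bigl(1 - \mathrm{iou}(M, b_i)\bigr), & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}
```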
But this introduces some problems:
- The linear soft-NMS function is piecewise rather than continuous: the penalty jumps abruptly at IoU = Nt.
- Only boxes above the IoU threshold are down-weighted, which easily produces a discontinuous gap in the final score distribution.
To address these problems, the paper proposes the following re-scoring rule instead; all subsequent implementations are based on it.
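Reconstructing that formula (the paper's Gaussian penalty, with decay parameter sigma):

```latex
s_i = s_i \, e^{-\frac{\mathrm{iou}(M,\, b_i)^2}{\sigma}}, \quad \forall b_i \notin D
```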
The formula above resembles a Gaussian penalty function.
Because the exponent is always at or below 0, the exponential factor lies in (0, 1], so the function shrinks every candidate box's score: boxes with a larger IoU with M are penalized heavily, while boxes with a very small IoU are barely penalized.
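A quick numeric sketch of that behavior (`gaussian_penalty` is a hypothetical helper name; sigma = 0.5 is the paper's default):

```python
import math

def gaussian_penalty(score, iou, sigma=0.5):
    # Soft-NMS Gaussian re-scoring: s_i <- s_i * exp(-iou^2 / sigma)
    return score * math.exp(-(iou ** 2) / sigma)

# The same 0.9-score box is penalized more the more it overlaps M.
for iou in (0.0, 0.3, 0.6, 0.9):
    print(f"iou={iou:.1f} -> score={gaussian_penalty(0.9, iou):.3f}")
```

With sigma = 0.5 the score falls from 0.9 at IoU 0 to roughly 0.18 at IoU 0.9: heavily overlapping boxes survive, but rank far lower.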
Code implementation:
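Both snippets below call an `iou_of` helper that the notes don't show. A minimal corner-form IoU sketch in the same style (the exact helper lives in the codebase these snippets come from; the `area_of` name and `eps` default here are assumptions):

```python
import torch

def area_of(left_top, right_bottom):
    # Clamp at 0 so non-overlapping boxes yield zero area.
    hw = torch.clamp(right_bottom - left_top, min=0.0)
    return hw[..., 0] * hw[..., 1]

def iou_of(boxes0, boxes1, eps=1e-5):
    # Boxes are corner-form (x1, y1, x2, y2); broadcasting gives elementwise IoU.
    overlap_lt = torch.max(boxes0[..., :2], boxes1[..., :2])
    overlap_rb = torch.min(boxes0[..., 2:], boxes1[..., 2:])
    overlap_area = area_of(overlap_lt, overlap_rb)
    area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
    area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
    return overlap_area / (area0 + area1 - overlap_area + eps)
```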
Hard NMS:
def hard_nms(box_scores, iou_threshold, top_k=-1, candidate_size=200):
    """
    Args:
        box_scores (N, 5): boxes in corner-form and probabilities.
        iou_threshold: intersection over union threshold.
        top_k: keep top_k results. If k <= 0, keep all the results.
        candidate_size: only consider the candidates with the highest scores.
    Returns:
        box_scores (K, 5): the rows of the kept boxes with their scores.
    """
    scores = box_scores[:, -1]
    boxes = box_scores[:, :-1]
    picked = []
    # Visit candidates in descending score order, keeping at most candidate_size.
    _, indexes = scores.sort(descending=True)
    indexes = indexes[:candidate_size]
    while len(indexes) > 0:
        current = indexes[0]
        picked.append(current.item())
        if 0 < top_k == len(picked) or len(indexes) == 1:
            break
        current_box = boxes[current, :]
        indexes = indexes[1:]
        rest_boxes = boxes[indexes, :]
        iou = iou_of(
            rest_boxes,
            current_box.unsqueeze(0),
        )
        # Hard NMS: discard every remaining box whose IoU with the
        # current (highest-scoring) box exceeds the threshold.
        indexes = indexes[iou <= iou_threshold]
    return box_scores[picked, :]
Soft NMS:
def soft_nms(box_scores, score_threshold, sigma=0.5, top_k=-1):
    """Soft NMS implementation.
    References:
        https://arxiv.org/abs/1704.04503
        https://github.com/facebookresearch/Detectron/blob/master/detectron/utils/cython_nms.pyx
    Args:
        box_scores (N, 5): boxes in corner-form and probabilities.
        score_threshold: boxes with scores less than value are not considered.
        sigma: the parameter in score re-computation.
            scores[i] = scores[i] * exp(-(iou_i)^2 / sigma)
        top_k: keep top_k results. If k <= 0, keep all the results.
    Returns:
        picked_box_scores (K, 5): results of NMS.
    """
    picked_box_scores = []
    while box_scores.size(0) > 0:
        # Pick the highest-scoring remaining box.
        max_score_index = torch.argmax(box_scores[:, 4])
        cur_box_prob = box_scores[max_score_index, :].clone()
        picked_box_scores.append(cur_box_prob)
        if len(picked_box_scores) == top_k > 0 or box_scores.size(0) == 1:
            break
        cur_box = cur_box_prob[:-1]
        # Remove the picked box by swapping in the last row, then truncating.
        box_scores[max_score_index, :] = box_scores[-1, :]
        box_scores = box_scores[:-1, :]
        ious = iou_of(cur_box.unsqueeze(0), box_scores[:, :-1])
        # Gaussian penalty: down-weight, rather than discard, overlapping boxes.
        box_scores[:, -1] = box_scores[:, -1] * torch.exp(-(ious * ious) / sigma)
        # Drop boxes whose decayed score fell below the threshold.
        box_scores = box_scores[box_scores[:, -1] > score_threshold, :]
    if len(picked_box_scores) > 0:
        return torch.stack(picked_box_scores)
    else:
        return torch.tensor([])
Experimental results:
Summary:
(1) In scenes with heavy occlusion or densely packed objects, prefer soft NMS; it improves the model's accuracy.
(2) On the PASCAL VOC 2007 dataset, soft NMS improves R-FCN by 1.7% and Faster R-CNN by 1.7%; on MS-COCO, R-FCN improves by 1.3% and Faster R-CNN by 1.1%.
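The occlusion benefit in point (1) can be illustrated with a tiny self-contained sketch (hypothetical boxes and scores; plain-Python IoU rather than the torch helper the snippets use):

```python
import math

def iou(a, b):
    # IoU of two corner-form boxes [x1, y1, x2, y2].
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

# Two heavily overlapping detections, e.g. one object occluding another.
box_a = [0.0, 0.0, 10.0, 10.0]   # score 0.9, picked as M
box_b = [2.0, 0.0, 12.0, 10.0]   # score 0.8, IoU with M = 2/3
score_b, Nt, sigma = 0.8, 0.5, 0.5

o = iou(box_a, box_b)
hard_score = 0.0 if o > Nt else score_b            # hard NMS removes box_b entirely
soft_score = score_b * math.exp(-(o * o) / sigma)  # soft NMS keeps it, down-weighted
```

Hard NMS discards the occluded object outright, while soft NMS keeps it at a score of about 0.33, so it can still be output as a second detection.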