NMS, Soft-NMS, Softer-NMS, and Weighted Boxes Fusion in Object Detection

Article link: https://blog.csdn.net/practical_sharp/article/details/114980578

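This article introduces the non-maximum suppression (NMS) algorithm and its use in object detection, and discusses the problems with NMS. It then explains the principle and implementation of Soft-NMS, followed by more advanced algorithms such as Softer-NMS and WBF.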


NMS

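Non-maximum suppression (NMS) has been around for at least 50 years.

In classic two-stage object detectors, the anchor stage generates a dense set of anchor boxes in order to raise recall on the targets.

As a result, many redundant boxes correspond to the same object at post-processing time.

NMS is therefore an indispensable post-processing step for removing these redundant boxes.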


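The NMS procedure in detail:
Inputs: boxes, scores, iou_threshold

step-1: Partition all detected output_bbox by class according to their cls score (PASCAL VOC, for example, has 20 classes, so the output_bbox are split into 21 sets: 20 object classes plus 1 bg class, and the background set is discarded immediately);
step-2: Within each class set, sort the bboxes by cls score in descending order to obtain a list list_k;
step-3: Process every list_k (with 20 classes, all 20 lists). Take the bbox_x with the top-1 cls score in list_k and compute its IoU with every other bbox_y in the list; if the IoU exceeds the threshold T, remove that bbox_y. Then move bbox_x out of list_k into output_k, which is returned as the result;
step-4: Select the new top-1 cls score in list_k and repeat step-3 until every bbox in list_k has been processed;
step-5: Repeat steps 3 and 4 for the list_k of every class set until all lists have been processed.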
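The five steps above can be sketched in plain Python. This is a minimal, dependency-free illustration and not the torchvision implementation; the names `iou` and `nms_per_class`, and the separate `labels` argument holding each box's class, are assumptions introduced for this example.

```python
from collections import defaultdict


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_per_class(boxes, scores, labels, iou_threshold):
    """Return {label: [kept box indices]} after greedy per-class NMS."""
    # step-1: split detections into one index list per class
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    output = {}
    # step-5: the outer loop repeats steps 2-4 for every class
    for label, idxs in by_class.items():
        # step-2: sort the class list by descending score
        list_k = sorted(idxs, key=lambda i: scores[i], reverse=True)
        output_k = []
        # step-3/4: keep the top box, drop everything that overlaps it, repeat
        while list_k:
            top = list_k.pop(0)
            output_k.append(top)
            list_k = [i for i in list_k if iou(boxes[top], boxes[i]) <= iou_threshold]
        output[label] = output_k
    return output


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.6]
labels = ["cat", "cat", "cat", "dog"]
print(nms_per_class(boxes, scores, labels, iou_threshold=0.5))
```

Boxes 0 and 1 overlap heavily within the same class, so only the higher-scoring one survives; box 3 has the same coordinates as box 0 but belongs to another class, so it is never compared against it.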


import torch
from torch import Tensor
from typing import Tuple
from ._box_convert import _box_cxcywh_to_xyxy, _box_xyxy_to_cxcywh, _box_xywh_to_xyxy, _box_xyxy_to_xywh
import torchvision
from torchvision.extension import _assert_has_ops


def nms(boxes: Tensor, scores: Tensor, iou_threshold: float) -> Tensor:
    """
    Performs non-maximum suppression (NMS) on the boxes according
    to their intersection-over-union (IoU).

    NMS iteratively removes lower scoring boxes which have an
    IoU greater than iou_threshold with another (higher scoring)
    box.

    If multiple boxes have the exact same score and satisfy the IoU
    criterion with respect to a reference box, the selected box is
    not guaranteed to be the same between CPU and GPU. This is similar
    to the behavior of argsort in PyTorch when repeated values are present.

    Args:
        boxes (Tensor[N, 4]): boxes to perform NMS on. They
            are expected to be in ``(x1, y1, x2, y2)`` format with ``0 <= x1 < x2`` and
            ``0 <= y1 < y2``.
        scores (Tensor[N]): scores for each one of the boxes
        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold

    Returns:
        keep (Tensor): int64 tensor with the indices
            of the elements that have been kept
            by NMS, sorted in decreasing order of scores
    """
    _assert_has_ops()
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)

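Some problems with NMS

Overlapping objects: NMS keeps the highest-scoring box and deletes other boxes that have slightly lower confidence but actually represent a different object, because their overlap with the top-scoring box is too large.
Inaccurate boxes: not every predicted box is precise. Sometimes every box around an object is off, so all candidates cover the object but none of them is accurate.
Traditional NMS ranks by classification score, and only the top-scoring box survives. In most cases, however, IoU and classification score are not strongly correlated: many boxes with high classification confidence are not well localized.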
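A small constructed example makes the first problem concrete. The boxes and scores below are made up for illustration, and `hard_nms` is a hypothetical single-class helper written for this sketch. Box B is a second object that overlaps A with IoU 0.60, while box C is a poorly localized duplicate of A with IoU 0.55, so no single threshold can keep B and drop C.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, iou_threshold):
    """Classic greedy NMS for a single class; returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        top = order.pop(0)
        keep.append(top)
        # Discard every remaining box whose IoU with the kept box is too high
        order = [i for i in order if iou(boxes[top], boxes[i]) <= iou_threshold]
    return keep


boxes = [
    (0.0, 0.0, 100.0, 100.0),   # A: best box for object 1
    (25.0, 0.0, 125.0, 100.0),  # B: a second, overlapping object
    (0.0, 0.0, 100.0, 55.0),    # C: badly localized duplicate of A
]
scores = [0.95, 0.80, 0.60]

for threshold in (0.5, 0.58, 0.65):
    print(threshold, hard_nms(boxes, scores, threshold))
```

With a threshold of 0.5 only A survives and the second object is lost; with 0.65 both B and the duplicate C survive; with 0.58 the duplicate is kept while the real second object is removed.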

Soft NMS


Improving Object Detection With One Line of Code

The paper was published at ICCV 2017.

http://arxiv.org/abs/1704.04503


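Shortcomings of traditional NMS:

The scores of neighboring detections should be lowered just enough that they are unlikely to increase the false-positive rate, while still ranking clearly above obvious false positives in the list of detections;
Removing neighboring detections outright with a low NMS threshold is sub-optimal, because it raises the miss rate when evaluation is done at high overlap thresholds;
With a high NMS threshold, the mAP measured over a range of overlap thresholds drops.

The core idea of Soft-NMS is to avoid removing redundant detections outright with an NMS threshold. Instead, a penalty function suppresses highly redundant detections by lowering their scores.

The larger the IoU with the top-scoring box, the more the score drops.

The paper describes two penalty functions, one linear and one Gaussian.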

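Linear function: in its simplest form the score is multiplied by f = 1 - iou. In the paper this is applied only when the overlap reaches the NMS threshold N_t:

$$ s_i = \begin{cases} s_i, & \mathrm{iou}(M, b_i) < N_t \\ s_i \, (1 - \mathrm{iou}(M, b_i)), & \mathrm{iou}(M, b_i) \ge N_t \end{cases} $$

Gaussian function:

$$ s_i = s_i \, e^{-\mathrm{iou}(M, b_i)^2 / \sigma} $$

Here M is the box with the highest score and b_i is one of the remaining boxes. The argument of the penalty function is the IoU between the redundant box and the top-scoring box.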
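The two penalty functions can be written as small Python helpers to see how quickly each one decays. This is only a sketch: the function names are made up for this example, and the default values `nms_threshold=0.3` and `sigma=0.5` are assumptions chosen for illustration.

```python
import math


def linear_penalty(iou, nms_threshold=0.3):
    """Multiplier 1 - iou once the overlap reaches the threshold, else 1."""
    return 1.0 - iou if iou >= nms_threshold else 1.0


def gaussian_penalty(iou, sigma=0.5):
    """Multiplier exp(-iou^2 / sigma); continuous, with no threshold."""
    return math.exp(-(iou * iou) / sigma)


# Print both multipliers for a few overlap values
for value in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(f"iou={value:.1f}  linear={linear_penalty(value):.3f}  "
          f"gaussian={gaussian_penalty(value):.3f}")
```

The linear form jumps at the threshold, whereas the Gaussian form decays smoothly from 1 at zero overlap.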

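Soft-NMS is also a greedy algorithm and does not find the globally optimal re-scoring of the detection boxes. Re-scoring is performed greedily, so detections with a high local score are not suppressed.

Soft-NMS still applies a threshold at the end: any detection in the list whose confidence falls below it is discarded.

It requires only a simple change to the traditional NMS algorithm and adds no extra parameters. Soft-NMS improves results on the standard datasets PASCAL VOC2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN). Soft-NMS has the same algorithmic complexity as traditional NMS and is efficient to use.
Soft-NMS needs no additional training and is easy to implement, so it can be integrated into any object detection pipeline with little effort.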


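Advantages:

1. Soft-NMS is easy to add to an object detection algorithm: the existing model does not need to be retrained, the code is simple to write, and the extra computation is negligible compared with the rest of the detection pipeline. It can be integrated into any current detector that uses NMS.

2. Training still uses traditional NMS; Soft-NMS is applied only in the inference code. The authors presumably ran a comparison and found that using Soft-NMS during training gave no significant improvement.

3. NMS is a special case of Soft-NMS: when the score-reset function is a binary (0/1) function, Soft-NMS and NMS are identical. Soft-NMS is therefore a more general non-maximum suppression algorithm.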
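The third point can be illustrated with a greedy loop that takes the score-reset function as a parameter. The sketch below is plain Python written for this explanation, not the paper's reference code; `rescoring_nms`, `binary_penalty`, and `gaussian_penalty` are hypothetical names, and the boxes and scores are invented.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def rescoring_nms(boxes, scores, penalty, score_threshold=0.001):
    """Greedy suppression with a pluggable penalty(iou) -> multiplier.

    Returns (index, final_score) pairs in the order the boxes were picked.
    """
    remaining = {i: s for i, s in enumerate(scores)}
    picked = []
    while remaining:
        top = max(remaining, key=remaining.get)  # current best box
        picked.append((top, remaining.pop(top)))
        for i in list(remaining):
            remaining[i] *= penalty(iou(boxes[top], boxes[i]))
            if remaining[i] <= score_threshold:  # final confidence cut-off
                del remaining[i]
    return picked


def binary_penalty(nms_threshold):
    # 0/1 multiplier: reproduces classic hard NMS
    return lambda v: 0.0 if v > nms_threshold else 1.0


def gaussian_penalty(sigma):
    # Smooth decay used by the Gaussian variant of Soft-NMS
    return lambda v: math.exp(-(v * v) / sigma)


boxes = [(0.0, 0.0, 100.0, 100.0), (25.0, 0.0, 125.0, 100.0), (200.0, 200.0, 300.0, 300.0)]
scores = [0.95, 0.80, 0.60]
print("hard:", rescoring_nms(boxes, scores, binary_penalty(0.5)))
print("soft:", rescoring_nms(boxes, scores, gaussian_penalty(0.5)))
```

With the binary multiplier the overlapping second box is removed, exactly as in hard NMS. With the Gaussian multiplier it stays in the result with a reduced score and is ranked below the non-overlapping third box.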


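Disadvantages:

Soft-NMS is also a greedy algorithm and cannot guarantee a globally optimal re-scoring of the detection boxes. Beyond the two score-reset functions above, other functions with more parameters, such as the Gompertz function, could be explored, but they introduce extra parameters into the re-scoring step.

PyTorch implementation of Soft-NMS

In the PyTorch version of Faster RCNN, NMS is performed in roi_head.py,

so the goal here is to reimplement a soft_nms function that takes the same arguments as the official PyTorch nms function.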

import torch


# Compute box areas
def area_of(left_top, right_bottom):
    """Compute the areas of rectangles given two corners.
    Args:
        left_top (N, 2): left top corner.
        right_bottom (N, 2): right bottom corner.
    Returns:
        area (N): torch.Tensor with the area of each rectangle.
    """
    hw = torch.clamp(right_bottom - left_top, min=0.0)
    return hw[..., 0] * hw[..., 1]


# Compute IoU
def iou_of(boxes0, boxes1, eps=1e-5):
    """Return intersection-over-union (Jaccard index) of boxes.
    Args:
        boxes0 (N, 4): ground truth boxes.
        boxes1 (N or 1, 4): predicted boxes.
        eps: a small number to avoid 0 as denominator.
    Returns:
        iou (N): IoU values.
    """
    overlap_left_top = torch.max(boxes0[..., :2], boxes1[..., :2])
    overlap_right_bottom = torch.min(boxes0[..., 2:], boxes1[..., 2:])

    overlap_area = area_of(overlap_left_top, overlap_right_bottom)
    area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
    area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
    # eps keeps the denominator non-zero for degenerate boxes
    return overlap_area / (area0 + area1 - overlap_area + eps)


# Custom soft_nms that takes the same box inputs as the official nms
def soft_nms(boxes, scores, score_threshold=0.001, sigma=0.5, top_k=-1):
    """Soft NMS implementation (Gaussian penalty).
    References:
        https://arxiv.org/abs/1704.04503
        https://github.com/facebookresearch/Detectron/blob/master/detectron/utils/cython_nms.pyx
    Args:
        boxes: Tensor[N, 4], boxes in corner form (x1, y1, x2, y2),
            as passed to the official PyTorch nms.
        scores: Tensor[N], where N is the number of detected boxes in the image.
        score_threshold: boxes with scores less than value are not considered.
        sigma: the parameter in score re-computation.
            scores[i] = scores[i] * exp(-(iou_i)^2 / sigma)
        top_k: keep top_k results. If k <= 0, keep all the results.
    Returns:
        keep: 1-D int64 tensor with the indices (into boxes) of the
        elements that have been kept, sorted in decreasing order of
        their decayed scores, e.g. [2, 1, 0]. The decayed scores
        themselves are not returned.
    """
    scores = scores.clone()  # scores are decayed below; leave the caller's tensor untouched
    # indices[j] is the position in the original input of the j-th remaining box
    indices = torch.arange(boxes.size(0), device=boxes.device)
    keep = []
    while indices.numel() > 0:
        max_pos = torch.argmax(scores)  # position of the highest-scoring remaining box
        cur_box = boxes[max_pos]        # its coordinates x1, y1, x2, y2
        keep.append(indices[max_pos])   # record its original index
        if len(keep) == top_k or indices.numel() == 1:  # top_k reached or nothing left
            break
        # Remove the selected box from the remaining set
        remaining = torch.ones_like(indices, dtype=torch.bool)
        remaining[max_pos] = False
        boxes, scores, indices = boxes[remaining], scores[remaining], indices[remaining]
        # Decay the remaining scores with the Gaussian penalty from the Soft-NMS paper
        ious = iou_of(cur_box.unsqueeze(0), boxes)
        scores = scores * torch.exp(-(ious * ious) / sigma)
        # Drop boxes whose decayed score fell to or below the threshold
        survivors = scores > score_threshold
        boxes, scores, indices = boxes[survivors], scores[survivors], indices[survivors]
    if len(keep) > 0:
        # Boxes are picked in decreasing order of decayed score, so keep is already sorted
        return torch.stack(keep)
    return torch.empty(0, dtype=torch.int64, device=boxes.device)