NMS详解及pytorch实现:hard-nms(diou\overlap\merge\batched),soft-nms

NMS详解及pytorch实现:hard-nms(diou\overlap\merge\batched),soft-nms

1 简介

非极大值抑制算法(Non-maximum suppression, NMS)是有anchor系列目标检测的标配,如今大部分的One-StageTwo-Stage算法在推断(Inference)阶段都使用了NMS作为网络的最后一层,例如YOLOv3、SSD、Faster-RCNN等。

当然NMS在目前最新的anchor-free(https://www.cnblogs.com/yumoye/p/11022800.html)目标检测算法中(CornerNet、CenterNet等)并不是必须的,对这种检测算法提升的精度也有限,但是这并不影响我们学习NMS。

NMS的本质是搜索局部极大值,抑制非极大值元素,在目标检测中,我们经常将其用于消除多余的检测框(从左到右消除了重复的检测框,只保留当前最大confidence的检测框):

《手撕非极大值抑制算法NMS与soft-NMS》

NMS有很多种变体,最为常见的Hard-NMS,我们通常所说的NMS就是指Hard-NMS,还有另外一种NMS叫做Soft-NMS,是Hard-NMS的变体,还有一些其他的一些变体(batched\diou\or\and\merge-nms)。

2 原理

最为常见的,也就是咱们提到的nms及为hard-nms,所以这里将以hard-nms入手,剖析内部操作原理。

  • 选取当前类别box中scores最大的那一个,记为current_box,并保留它(为什么保留它,因为它预测出当前位置有物体的概率最大啊,对于我们来说当前confidence越大说明当前box中包含物体的可能行就越大)
  • 计算current_box与其余的box的IOU
  • 如果其IOU大于我们设定的阈值,那么就舍弃这些boxes(由于可能这两个box表示同一目标,因此这两个box的IOU就比较大,会超过我们设定的阈值,所以就保留分数高的那一个)
  • 从最后剩余的boxes中,再找出最大scores的那一个(之前那个大的已经保存到输出的数组中,这个是从剩下的里面再挑一个最大的),如此循环往复

3 实现

3.1 伪代码

在这里插入图片描述

各种nms特点一句话总结:

Hard-nms–直接删除相邻的同类别目标,密集目标的输出不友好。

Soft-nms–改变其相邻同类别目标置信度(有关iou的函数),后期通过置信度阈值进行过滤,适用于目标密集的场景。

or-nms–hard-nms的非官方实现形式,只支持cpu。

vision-nms–hard-nms的官方实现形式(c函数库),可支持gpu(cuda),只支持单类别输入。

vision-batched-nms–hard-nms的官方实现形式(c函数库),可支持gpu(cuda),支持多类别输入。

and-nms–在hard-nms的逻辑基础上,增加是否为单独框的限制,删除没有重叠框的框(减少误检)。

merge-nms–在hard-nms的基础上,增加保留框位置平滑策略(重叠框位置信息求解平均值),使框的位置更加精确。

diou-nms–在hard-nms的基础上,用diou替换iou,里有参照diou的优势。

3.2 pytorch源码

nms实现函数:

def non_max_suppression(prediction, conf_thres=0.5, nms_thres=0.5, multi_cls=True, method='vision'):
    """
    Removes detections with lower object confidence score than 'conf_thres'
    Non-Maximum Suppression to further filter detections.
    Returns detections with shape:
        (x1, y1, x2, y2, object_conf, conf, class)
    """
    # NMS methods https://github.com/ultralytics/yolov3/issues/679 'or', 'and', 'merge', 'vision', 'vision_batch'

    # Box constraints
    min_wh, max_wh = 2, 4096  # (pixels) minimum and maximium box width and height

    output = [None] * len(prediction)
    for image_i, pred in enumerate(prediction):
        # Apply conf constraint
        pred = pred[pred[:, 4] > conf_thres]

        # Apply width-height constraint
        pred = pred[(pred[:, 2:4] > min_wh).all(1) & (pred[:, 2:4] < max_wh).all(1)]

        # If none remain process next image
        if len(pred) == 0:
            continue

        # Compute conf
        torch.sigmoid_(pred[..., 5:])
        pred[..., 5:] *= pred[..., 4:5]  # conf = obj_conf * cls_conf

        # Box (center x, center y, width, height) to (x1, y1, x2, y2)
        box = xywh2xyxy(pred[:, :4])

        # Detections matrix nx6 (xyxy, conf, cls)
        if multi_cls or conf_thres < 0.01:
            i, j = (pred[:, 5:] > conf_thres).nonzero().t()
            pred = torch.cat((box[i], pred[i, j + 5].unsqueeze(1), j.float().unsqueeze(1)), 1)
        else:  # best class only
            conf, j = pred[:, 5:].max(1)
            pred = torch.cat((box, conf.unsqueeze(1), j.float().unsqueeze(1)), 1)

        # Apply finite constraint
        pred = pred[torch.isfinite(pred).all(1)]

        # Get detections sorted by decreasing confidence scores
        pred = pred[pred[:, 4].argsort(descending=True)]

        # Batched NMS
        if method == 'vision_batch':
            output[image_i] = pred[torchvision.ops.boxes.batched_nms(pred[:, :4], pred[:, 4], pred[:, 5], nms_thres)]
            continue

        # All other NMS methods
        det_max = []
        cls = pred[:, -1]
        for c in cls.unique():
            dc = pred[cls == c]  # select class c
            n = len(dc)
            if n == 1:
                det_max.append(dc)  # No NMS required if only 1 prediction
                continue
            elif n > 500:
                dc = dc[:500]  # limit to first 500 boxes: https://github.com/ultralytics/yolov3/issues/117

            if method == 'vision':
                det_max.append(dc[torchvision.ops.boxes.nms(dc[:, :4], dc[:, 4], nms_thres)])

            elif method == 'or':  # default
                while dc.shape[0]:
                    det_max.append(dc[:1])  # save highest conf detection
                    if len(dc) == 1:  # Stop if we're at the last detection
                        break
                    iou = bbox_iou(dc[0], dc[1:])  # iou with other boxes
                    dc = dc[1:][iou < nms_thres]  # remove ious > threshold

            elif method == 'and':  # requires overlap, single boxes erased
                while len(dc) > 1:
                    iou = bbox_iou(dc[0], dc[1:])  # iou with other boxes
                    if iou.max() > 0.5:
                        det_max.append(dc[:1])
                    dc = dc[1:][iou < nms_thres]  # remove ious > threshold

            elif method == 'merge':  # weighted mixture box
                while len(dc):
                    if len(dc) == 1:
                        det_max.append(dc)
                        break
                    i = bbox_iou(dc[0], dc) > nms_thres  # iou with other boxes
                    weights = dc[i, 4:5]
                    dc[0, :4] = (weights * dc[i, :4]).sum(0) / weights.sum()
                    det_max.append(dc[:1])
                    dc = dc[i == 0]
			elif method == 'diounms':  # use diou 
                while dc.shape[0]:
                    det_max.append(dc[:1])  # save highest conf detection
                    if len(dc) == 1:  # Stop if we're at the last detection
                        break
                    diou = bbox_iou(dc[0], dc[1:],DIoU=True)  # diou with other boxes
                    dc = dc[1:][diou < nms_thres]  # remove dious > threshold
            
            elif method == 'soft':  # soft-NMS https://arxiv.org/abs/1704.04503
                sigma = 0.5  # soft-nms sigma parameter
                while len(dc):
                    if len(dc) == 1:
                        det_max.append(dc)
                        break
                    det_max.append(dc[:1])
                    iou = bbox_iou(dc[0], dc[1:])  # iou with other boxes
                    dc = dc[1:]
                    dc[:, 4] *= torch.exp(-iou ** 2 / sigma)  # decay confidences
                    dc = dc[dc[:, 4] > conf_thres]  # https://github.com/ultralytics/yolov3/issues/362

        if len(det_max):
            det_max = torch.cat(det_max)  # concatenate
            output[image_i] = det_max[(-det_max[:, 4]).argsort()]  # sort

    return output

iou计算函数:

def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False):
    # Returns the IoU of box1 to box2. box1 is 4, box2 is nx4
    box2 = box2.t()

    # Get the coordinates of bounding boxes
    if x1y1x2y2:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:  # x, y, w, h = box1
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union Area
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1
    union = (w1 * h1 + 1e-16) + w2 * h2 - inter

    iou = inter / union  # iou
    if GIoU or DIoU or CIoU:
        cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex (smallest enclosing box) width
        ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # convex height
        if GIoU:  # Generalized IoU https://arxiv.org/pdf/1902.09630.pdf
            c_area = cw * ch + 1e-16  # convex area
            return iou - (c_area - union) / c_area  # GIoU
        if DIoU or CIoU:  # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
            # convex diagonal squared
            c2 = cw ** 2 + ch ** 2 + 1e-16
            # centerpoint distance squared
            rho2 = ((b2_x1 + b2_x2) - (b1_x1 + b1_x2)) ** 2 / 4 + ((b2_y1 + b2_y2) - (b1_y1 + b1_y2)) ** 2 / 4
            if DIoU:
                return iou - rho2 / c2  # DIoU
            elif CIoU:  # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
                v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
                with torch.no_grad():
                    alpha = v / (1 - iou + v)
                return iou - (rho2 / c2 + v * alpha)  # CIoU

    return iou

pytorch外接c函数库:

#多类别nms时,需要分类别灌入。
#调用方式:torchvision.ops.boxes.nms()
def nms(boxes, scores, iou_threshold):
    """
    Performs non-maximum suppression (NMS) on the boxes according
    to their intersection-over-union (IoU).

    NMS iteratively removes lower scoring boxes which have an
    IoU greater than iou_threshold with another (higher scoring)
    box.

    Parameters
    ----------
    boxes : Tensor[N, 4])
        boxes to perform NMS on. They
        are expected to be in (x1, y1, x2, y2) format
    scores : Tensor[N]
        scores for each one of the boxes
    iou_threshold : float
        discards all overlapping
        boxes with IoU < iou_threshold

    Returns
    -------
    keep : Tensor
        int64 tensor with the indices
        of the elements that have been kept
        by NMS, sorted in decreasing order of scores
    """
    _C = _lazy_import()
    return _C.nms(boxes, scores, iou_threshold)
#多类别nms时,不需要分类别的灌入,内部已通过idxs区分实现。
#调用方式:torchvision.ops.boxes.batched_nms()
def batched_nms(boxes, scores, idxs, iou_threshold):
    """
    Performs non-maximum suppression in a batched fashion.

    Each index value correspond to a category, and NMS
    will not be applied between elements of different categories.

    Parameters
    ----------
    boxes : Tensor[N, 4]
        boxes where NMS will be performed. They
        are expected to be in (x1, y1, x2, y2) format
    scores : Tensor[N]
        scores for each one of the boxes
    idxs : Tensor[N]
        indices of the categories for each one of the boxes.
    iou_threshold : float
        discards all overlapping boxes
        with IoU < iou_threshold

    Returns
    -------
    keep : Tensor
        int64 tensor with the indices of
        the elements that have been kept by NMS, sorted
        in decreasing order of scores
    """
    if boxes.numel() == 0:
        return torch.empty((0,), dtype=torch.int64, device=boxes.device)
    # strategy: in order to perform NMS independently per class.
    # we add an offset to all the boxes. The offset is dependent
    # only on the class idx, and is large enough so that boxes
    # from different classes do not overlap
    max_coordinate = boxes.max()
    offsets = idxs.to(boxes) * (max_coordinate + 1)
    boxes_for_nms = boxes + offsets[:, None]
    keep = nms(boxes_for_nms, scores, iou_threshold)
    return keep

3.3 知识点

1.nms的gpu版本实现:

如有需求请参考:https://blog.csdn.net/qq_21368481/article/details/85722590

2.nms的应用范围:

只应用在前向推理的过程中,在训练中不进行此步。

3.以上几种nms的性能表现:

https://github.com/ultralytics/yolov3/issues/679

Speed mm:ssCOCO mAP @0.5…0.95COCO mAP @0.5
ultralytics 'OR'8:2039.760.3
ultralytics 'AND'7:3839.660.1
ultralytics 'SOFT'12:0039.158.7
ultralytics 'MERGE'11:2540.260.4
torchvision.ops.boxes.nms()5:0839.760.3
torchvision.ops.boxes.batched_nms()6:0039.760.3

参考资料

https://oldpan.me/archives/write-hard-nms-c

https://github.com/ultralytics/yolov3

  • 12
    点赞
  • 55
    收藏
    觉得还不错? 一键收藏
  • 3
    评论
### 回答1: soft-nms pytorch源代码是一个基于PyTorch框架实现的软非极大值抑制算法的源代码。该算法可以在目标检测中用于抑制重叠的检测框,从而提高检测的准确性和效率。该源代码提供了一个简单易用的接口,可以方便地集成到现有的目标检测系统中。 ### 回答2: Soft-NMS是一种针对目标检测中非极大值抑制(Non-Maximum Suppression, NMS)的改进算法。Soft-NMS的目的是尽可能保留更多的检测结果,其中,保留哪些检测结果以及去除哪些检测结果是通过计算每个检测框和其他检测框的IoU(Intersection over Union)来决定的。 PyTorch是一个流行的深度学习框架,在PyTorch实现Soft-NMS非常简单。下面是PyTorch官方提供的Soft-NMS源代码: ```python def soft_nms(dets, sigma=0.5, Nt=0.3, threshold=0.001, method=2): """ :param dets: [[x1, y1, x2, y2, score], ...] :param sigma: :param Nt: :param threshold: :param method: :return: indexes to keep """ N = dets.shape[0] indexes = np.arange(N) for i in range(N): pos = i + 1 if i != N - 1: maximum = np.argmax(dets[pos:, 4]) if dets[i, 4] < dets[maximum + pos, 4]: dets[[i, maximum + pos], :] = dets[[maximum + pos, i], :] indexes[[i, maximum + pos]] = indexes[[maximum + pos, i]] t_bbox, t_score = dets[i, :4], dets[i, 4] if method == 1: weight = np.exp(-(np.power((dets[i + 1:, :4] - t_bbox), 2).sum(1) / sigma)) else: weight = np.zeros((N - i - 1,)) _idx = np.where(dets[i + 1:, 4] >= Nt)[0] + i + 1 if len(_idx) > 0: ex_dets = dets[_idx, :] overlap = iou(t_bbox.reshape(1, -1), ex_dets[:, :4]) weight[_idx - (i + 1)] = overlap weight = np.exp(-(weight * weight) / sigma) dets[i + 1:, 4] *= weight score_mask = dets[i + 1:, 4] >= threshold indexes_mask = indexes[i + 1:][score_mask] dets[i + 1:, :] = dets[indexes_mask, :] indexes[i + 1:] = indexes[indexes_mask] if len(indexes_mask) == 0: break keep = np.zeros(N, dtype=np.intp) keep[indexes] = 1 return np.where(keep == 1)[0] ``` 这里,soft_nms函数接受一个包含检测结果的二维数组dets,其中每个元素包含四个坐标[x1, y1, x2, y2]和一个分数score。函数的其他参数包括sigma、Nt、threshold和method等。 对于每个检测框,软NMS算法首先检查其余检测框的IoU。如果另一个检测框的IoU高于给定的Nt阈值,则它的分数将被调整到小于原始分数因子*(1-IoU)。然后,score_mask掩码被用来删掉具有较低分数的检测框,直到达到需要的最终物体数或所有框都不再死亡。最后,保留下来的检测框的索引被返回给用户。 这个PyTorch源代码是一个很好的例子,可以帮助我们了解如何使用Python来自定义各种优化算法,包括Soft-NMS。 ### 回答3: Soft-NMS是一种基于非极大抑制(NMS)的目标检测算法。与传统的NMS算法不同,Soft-NMS通过对重叠框之间的分数进行加权来降低框的置信度,而不是直接将框丢弃。这样可以在一定程度上保留重叠框中的信息。 PyTorch是一种流行的深度学习框架,提供了供用户使用的许多预定义算法和函数。PyTorch实现Soft-NMS的源代码,这使得用户可以直接使用这种功能而不必手动实现。 在PyTorch中,Soft-NMS实现是在torchvision.ops.nms.soft_nms中完成的。它有以下四个参数: - boxes: 包含所有检测框的tensor。 - scores: 所有检测框的分数。 - iou_threshold: 重叠IOU阈值。如果两个框之间的IOU低于此阈值,则不会执行Soft-NMS。 - score_threshold: 分数阈值。如果框的分数低于此阈值,则不会执行Soft-NMS。 整个NMS算法的流程如下: 1. 将分数从高到低排序,并将相应的边界框记录下来。 2. 选择分数最高的框,并记录下该框的索引号。 3. 计算该框与所有其他框之间的IOU值,并使用Soft-NMS算法对其进行加权。 4. 删除其余重叠框,并在剩下的框中重复步骤2-4,直到所有框都被处理。 调用soft_nms函数后,将返回保留下来框的索引列表。然后可以使用这个索引列表来关联boxes和scores,并对检测到的目标进行可视化或其他后处理操作。 在使用Soft-NMS时,需要根据具体情况调整两个阈值。如果两个阈值设置得太低,可能会误保留一些低质量的检测框。如果设置得过高,可能会丢失一些真实的目标。因此,建议在数据集上进行测量和调整。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值