[Object Detection NMS Summary] How NMS works + code (runs as-is, or adapt it to your own case)

计算机Rookie

Published 2024-08-30 14:12:12

Tags: object detection, object tracking, artificial intelligence, algorithms, deep learning, python, computer vision

Article link: https://blog.csdn.net/syu_acm/article/details/141716450

How NMS works

Taking object detection as an example:

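Inference in object detection produces many detection boxes (A, B, C, D, E, F, and so on). Many of them cover the same object, but in the end each object needs only one box.
NMS picks the highest-scoring box (say C) and computes the IoU between C and every remaining box. Any box whose IoU with C exceeds the chosen threshold (commonly 0.5) is suppressed, which means its score is set to 0 (an implementation can equivalently drop it from the candidate list).
After this round, the highest-scoring box among those left is selected and every box whose IoU with it exceeds the threshold is suppressed in turn. This repeats until the boxes that remain barely overlap.
The result is roughly one detection box per object.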
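
Since every suppression decision above depends on the IoU value, here is a minimal sketch of how the IoU of two boxes can be computed. The function name `iou_xyxy` is hypothetical, and the `+ 1` terms assume boxes are given as inclusive pixel indices; for continuous coordinates the `+ 1` would normally be dropped.

```python
def iou_xyxy(box_a, box_b):
    """IoU of two [x1, y1, x2, y2] boxes (assumes the pixel-inclusive +1 convention)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped to 0 when the boxes are disjoint
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1) + 1)
    inter_h = max(0, min(ay2, by2) - max(ay1, by1) + 1)
    inter = inter_w * inter_h

    area_a = (ax2 - ax1 + 1) * (ay2 - ay1 + 1)
    area_b = (bx2 - bx1 + 1) * (by2 - by1 + 1)

    # Union = sum of both areas minus the part counted twice
    return inter / (area_a + area_b - inter)


# Two near-duplicate boxes and two boxes that do not touch
near_duplicate = iou_xyxy([245, 91, 785, 433], [245, 90, 785, 430])
far_apart = iou_xyxy([0, 0, 10, 10], [100, 100, 110, 110])
print(round(near_duplicate, 3), far_apart)
```

With a threshold of 0.5, the first pair counts as the same object (the lower-scoring box would be suppressed), while the second pair is left untouched.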

NMS code template

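Each prediction is given in the order x1 y1 x2 y2 conf label, with coordinates already restored to original-image pixel values (see the example below).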

import numpy as np

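def yolo_nms_multi_class(predictions, iou_threshold=0.45, conf_threshold=0.25):
    """
    Per-class greedy NMS.

    predictions: list of rows in the order x1, y1, x2, y2, conf, label,
        with coordinates in original-image pixels. Extra trailing fields are ignored.
    iou_threshold: boxes whose IoU with a kept box exceeds this are suppressed.
    conf_threshold: boxes at or below this confidence are discarded first.
    Returns the kept rows as a NumPy array, or [] if nothing survives.
    np.array may store mixed number/label rows as strings or objects,
    which is why the numeric columns are cast with astype(float).
    """
    predictions = np.array(predictions)

    # An empty input has no columns to index, so return early
    if predictions.ndim != 2 or len(predictions) == 0:
        return []

    # Drop boxes whose confidence does not exceed the threshold
    keep_indices = predictions[:, 4].astype(float) > conf_threshold
    predictions = predictions[keep_indices]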

    if len(predictions) == 0:
        return []

    # Collect the unique class labels (as strings)
    unique_classes = np.unique(predictions[:, 5])

    nms_results = []

    # Run NMS separately for each class
    for cls in unique_classes:
        cls_predictions = predictions[predictions[:, 5] == cls]
        boxes = cls_predictions[:, :4].astype(float)
        scores = cls_predictions[:, 4].astype(float)

        x1 = boxes[:, 0]
        y1 = boxes[:, 1]
        x2 = boxes[:, 2]
        y2 = boxes[:, 3]

        areas = (x2 - x1 + 1) * (y2 - y1 + 1)
        order = scores.argsort()[::-1]

        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)

            xx1 = np.maximum(x1[i], x1[order[1:]])
            yy1 = np.maximum(y1[i], y1[order[1:]])
            xx2 = np.minimum(x2[i], x2[order[1:]])
            yy2 = np.minimum(y2[i], y2[order[1:]])

            w = np.maximum(0, xx2 - xx1 + 1)
            h = np.maximum(0, yy2 - yy1 + 1)

            inter = w * h
            iou = inter / (areas[i] + areas[order[1:]] - inter)

            inds = np.where(iou <= iou_threshold)[0]
            order = order[inds + 1]

        nms_results.append(cls_predictions[keep])

    # Merge the NMS results of all classes
    if len(nms_results) > 0:
        nms_results = np.vstack(nms_results)
    else:
        nms_results = []

    return nms_results

# Usage example
predictions = [
    [244, 87, 776, 236, 0.151, 'jyz_pl'],
    [245, 91, 785, 433, 0.575, 'jyz_pl'],
    [245, 91, 784, 429, 0.179, 'jyz_pl'],
    [245, 90, 785, 430, 0.279, 'jyz_pl'],
    # Add more detection results here
]

nms_results = yolo_nms_multi_class(predictions)
print(nms_results)
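
Because each input row mixes numbers with a label string, the rows that come back may hold every field as a string. The sketch below shows one way to turn such rows back into typed values. The helper name `rows_to_typed` and the `sample` array are hypothetical stand-ins for illustration, not part of the template above.

```python
import numpy as np


def rows_to_typed(nms_rows):
    """Convert NMS output rows to (x1, y1, x2, y2, conf, label) tuples with numeric types."""
    typed = []
    for row in nms_rows:
        # The first four fields are box corners, the fifth is the confidence
        x1, y1, x2, y2 = (float(v) for v in row[:4])
        typed.append((x1, y1, x2, y2, float(row[4]), str(row[5])))
    return typed


# Hypothetical stand-in for a returned array whose fields are all strings
sample = np.array([['245', '91', '785', '433', '0.575', 'jyz_pl']])
detections = rows_to_typed(sample)
print(detections)
```

Only the first six fields are read, so rows carrying extra trailing fields are handled the same way.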

You can also swap in a larger set of results:

predictions = [[244, 87, 776, 236, 0.151, 'jyz_pl', None], [244, 87, 776, 236, 0.75, 'test', None], [244, 87, 776, 235, 0.72, 'test', None],[245, 91, 785, 433, 0.175, 'jyz_pl', None], [245, 91, 784, 429, 0.179, 'jyz_pl', None], [245, 90, 785, 430, 0.179, 'jyz_pl', None], [245, 91, 776, 315, 0.187, 'jyz_pl', None], [244, 91, 777, 314, 0.214, 'jyz_pl', None], [244, 91, 778, 313, 0.214, 'jyz_pl', None], [244, 88, 775, 233, 0.239, 'jyz_pl', None], [245, 88, 774, 234, 0.251, 'jyz_pl', None], [244, 90, 769, 241, 0.251, 'jyz_pl', None], [244, 90, 770, 241, 0.254, 'jyz_pl', None], [247, 90, 771, 260, 0.265, 'jyz_pl', None], [248, 90, 771, 252, 0.347, 'jyz_pl', None], [247, 91, 771, 251, 0.364, 'jyz_pl', None], [244, 92, 771, 299, 0.393, 'jyz_pl', None], [245, 92, 771, 299, 0.396, 'jyz_pl', None], [244, 91, 771, 288, 0.432, 'jyz_pl', None], [244, 91, 771, 289, 0.432, 'jyz_pl', None], [245, 90, 777, 304, 0.482, 'jyz_pl', None], [244, 90, 777, 304, 0.484, 'jyz_pl', None], [245, 89, 777, 304, 0.49, 'jyz_pl', None], [245, 90, 775, 288, 0.527, 'jyz_pl', None], [245, 90, 775, 284, 0.559, 'jyz_pl', None], [245, 90, 775, 284, 0.56, 'jyz_pl', None]]

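Restoring original-image coordinates:

If you need the boxes back in original-image coordinates, add the following code. The two helpers convert between corner format [x1, y1, x2, y2] and center format [x, y, w, h].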

def xyxy2xywh(x):
    # Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] where xy1=top-left, xy2=bottom-right
    y = np.copy(x)
    y[:, 0] = (x[:, 0] + x[:, 2]) / 2  # x center
    y[:, 1] = (x[:, 1] + x[:, 3]) / 2  # y center
    y[:, 2] = x[:, 2] - x[:, 0]  # width
    y[:, 3] = x[:, 3] - x[:, 1]  # height
    return y


def xywh2xyxy(x):
    # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    y = np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
    y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
    y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
    y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
    return y
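
As a rough illustration of the full trip back to original-image pixels, the sketch below assumes the boxes are in normalized center format [cx, cy, w, h] with values between 0 and 1, as in YOLO-style label files. That assumption, the function name `normalized_xywh_to_pixel_xyxy`, and the image size are all illustrative and not taken from the code above.

```python
import numpy as np


def normalized_xywh_to_pixel_xyxy(boxes, img_w, img_h):
    """Map normalized [cx, cy, w, h] boxes (values in 0..1) to pixel [x1, y1, x2, y2]."""
    # Working in float avoids the truncation an integer array would cause
    boxes = np.asarray(boxes, dtype=float)
    out = np.empty_like(boxes)
    out[:, 0] = (boxes[:, 0] - boxes[:, 2] / 2) * img_w  # top-left x
    out[:, 1] = (boxes[:, 1] - boxes[:, 3] / 2) * img_h  # top-left y
    out[:, 2] = (boxes[:, 0] + boxes[:, 2] / 2) * img_w  # bottom-right x
    out[:, 3] = (boxes[:, 1] + boxes[:, 3] / 2) * img_h  # bottom-right y
    return out


# One centered box covering half the width and a quarter of the height of a 1000x800 image
pixel_boxes = normalized_xywh_to_pixel_xyxy([[0.5, 0.5, 0.5, 0.25]], img_w=1000, img_h=800)
print(pixel_boxes)
```

The result is in the x1 y1 x2 y2 order, so after appending conf and label to each row it can be passed to the NMS template.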