Using the NMS algorithm to remove rectangular boxes that are contained in other boxes

Latest recommended article published 2024-06-07 20:42:48

绛洞花主敏明

Article link: https://blog.csdn.net/qq_39852676/article/details/111066575

I. The NMS algorithm

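NMS (non-maximum suppression) works by searching for local maxima: among a group of overlapping candidate boxes it keeps the one with the highest confidence and suppresses the rest. The procedure is:

1. Set a threshold, commonly around 0.5. Note that step 4 and the code below compare this threshold against IoU, not against the confidence score.
2. Sort the list of candidate boxes by confidence in descending order.
3. Take the box A with the highest confidence, append it to the output list, and remove it from the candidate list.
4. Compute the IoU between A and every remaining candidate, and delete the candidates whose IoU exceeds the threshold.
5. Repeat steps 3 and 4 until the candidate list is empty, then return the output list.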

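IoU (intersection over union) of two boxes A and B is computed as:

$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}$$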
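As a small worked example, IoU for a single pair of boxes can be sketched as below. The helper name `iou` is made up for this illustration; it assumes boxes given as (x1, y1, x2, y2) with inclusive pixel coordinates, which is why it uses the same `+ 1` convention as `py_cpu_nms`.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2), inclusive pixel coordinates."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect
    iw = max(0, min(ax2, bx2) - max(ax1, bx1) + 1)
    ih = max(0, min(ay2, by2) - max(ay1, by1) + 1)
    inter = iw * ih
    area_a = (ax2 - ax1 + 1) * (ay2 - ay1 + 1)
    area_b = (bx2 - bx1 + 1) * (by2 - by1 + 1)
    # Union = sum of both areas minus the part counted twice
    return inter / (area_a + area_b - inter)


a = (220, 220, 320, 330)
b = (220, 230, 315, 340)
print(round(iou(a, b), 4))                                   # 0.7966
print(iou((100, 100, 210, 210), (250, 250, 420, 420)))       # 0.0, no overlap
```

For the first pair the intersection is 96 × 101 = 9696 and the union is 11211 + 10656 − 9696 = 12171, which gives roughly 0.797.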
The NMS code is:

#coding:utf-8
import numpy as np
 
def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline."""
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]
 
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # Sort by score in descending order and keep the indices
    order = scores.argsort()[::-1]
    # keep holds the indices of the boxes that survive
    keep = []
    while order.size > 0:
        # order[0] is the highest-scoring box not yet suppressed, so it is always kept
        i = order[0]
        keep.append(i)
        # Intersection of box i with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
 
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        # IoU = intersection / union
        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        # inds: positions of the boxes whose IoU with box i is at most thresh;
        # all other boxes are suppressed by box i in this round
        inds = np.where(ovr <= thresh)[0]
        # ovr was computed over order[1:] (box i sits at position 0),
        # so shift by 1 to index back into order
        order = order[inds + 1]
 
    return keep

The input to NMS should look like this:

boxes=np.array([[100,100,210,210,0.72],
        [250,250,420,420,0.8],
        [220,220,320,330,0.92],
        [100,100,210,210,0.72],
        [230,240,325,330,0.81],
        [220,230,315,340,0.9]])

Each row holds the corner coordinates of a box followed by its confidence score:

[ x1, y1, x2, y2, score ] = [ xmin, ymin, xmax, ymax, confidence ]

The result of NMS:
[figure: image not recovered]
Among boxes that overlap each other, NMS filters out the ones with lower confidence.
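To make the effect concrete, the sketch below runs a compact restatement of the routine on the sample boxes. The IoU threshold of 0.5 is an assumption, since the article does not say which value produced its figure. The name `nms_indices` and the stable sort (used so that the two identical 0.72 boxes are ordered deterministically) are choices made for this example only.

```python
import numpy as np


def nms_indices(dets, thresh):
    """Greedy NMS; returns the kept row indices, highest score first."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # Descending by score; a stable sort keeps tied rows in input order
    order = np.argsort(-scores, kind="stable")
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        # Overlap of box i with each remaining box
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop every box that overlaps box i by more than thresh
        order = rest[iou <= thresh]
    return keep


boxes = np.array([[100, 100, 210, 210, 0.72],
                  [250, 250, 420, 420, 0.8],
                  [220, 220, 320, 330, 0.92],
                  [100, 100, 210, 210, 0.72],
                  [230, 240, 325, 330, 0.81],
                  [220, 230, 315, 340, 0.9]], dtype=float)

kept = nms_indices(boxes, thresh=0.5)
print(kept)         # [2, 1, 0]
print(boxes[kept])  # the three surviving rows
```

Rows 5 and 4 are suppressed by row 2 (IoU of about 0.80 and 0.71), and row 3 is suppressed by the identical row 0. Row 1 overlaps row 2 with an IoU of only about 0.17, so it survives.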

II. Using an NMS-style procedure to remove boxes nested inside other boxes

import numpy as np

def nms(bboxes):
    """NMS-style filtering that removes boxes lying entirely inside another box.
    :param bboxes: candidate boxes of one class, each as (center_x, center_y, w, h)
    :return: list of kept boxes, largest first
    """
    # 1. No candidates: return an empty list
    if len(bboxes) == 0:
        return []
    # Convert to an array
    bboxes = np.array(bboxes)

    # Convert (x, y, w, h) to the top-left and bottom-right corners
    center_x = bboxes[:, 0]
    center_y = bboxes[:, 1]
    w = bboxes[:, 2]
    h = bboxes[:, 3]

    # Corner coordinates of the n boxes, clamped at 0
    x1 = np.maximum(0.0, center_x - (w / 2))
    y1 = np.maximum(0.0, center_y - (h / 2))
    x2 = np.maximum(0.0, center_x + (w / 2))
    y2 = np.maximum(0.0, center_y + (h / 2))

    # 2. Filter the candidates
    # Boxes to return
    picked_boxes = []
    areas = (x2 - x1) * (y2 - y1)
    # There are no confidence scores here, so sort by area instead
    # (argsort is ascending, so the largest box sits at the end).
    # Handling the largest box first guarantees that a box already
    # picked can never lie inside a box handled later.
    order = np.argsort(areas)
    while order.size > 0:
        # Add the largest remaining box to the result
        index = order[-1]
        rest = order[:-1]
        picked_boxes.append(bboxes[index])

        # Intersection of the current box with every other remaining box
        x11 = np.maximum(x1[index], x1[rest])
        y11 = np.maximum(y1[index], y1[rest])
        x22 = np.minimum(x2[index], x2[rest])
        y22 = np.minimum(y2[index], y2[rest])

        iw = np.maximum(0.0, x22 - x11)
        ih = np.maximum(0.0, y22 - y11)
        intersection = iw * ih

        # IoU of the current box with each remaining box
        ratio = intersection / (areas[index] + areas[rest] - intersection)
        # Area of each remaining box relative to the current box
        rate1 = areas[rest] / areas[index]

        # ratio == rate1 means the other box lies inside the current box.
        # Compare with a tolerance, because exact float equality is fragile.
        contained = np.isclose(ratio, rate1, rtol=1e-9, atol=0.0)
        # Keep only the boxes that are not contained
        order = rest[~contained]
    return picked_boxes


if __name__ == '__main__':
    bounding = [(267.5509948730469, 291.084228515625, 11.240452766418457, 13.544210433959961), (150, 67, 15, 14), (246, 121, 11, 25), (189, 85, 14, 23)]
    picked_boxes = nms(bounding)
    print('Final bbox list:', picked_boxes)

Output:

Final bbox list: [array([189.,  85.,  14.,  23.]), array([246., 121.,  11.,  25.]), array([150.,  67.,  15.,  14.]), array([267.55099487, 291.08422852,  11.24045277,  13.54421043])]

The input format is:

[ x, y, w, h ] = [ center x, center y, width, height ]

The result:
[figure: image not recovered]
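The sample input above contains no nested boxes, so all four come back. The sketch below uses made-up boxes in the same (center x, center y, width, height) format to show a nested box being dropped. It is a different technique from the article's area-ratio test: it compares corner coordinates directly. The names `to_corners` and `drop_nested` are hypothetical helpers for this illustration.

```python
import numpy as np


def to_corners(boxes):
    """Convert (cx, cy, w, h) rows to (x1, y1, x2, y2) rows."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    cx, cy, w, h = boxes.T
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)


def drop_nested(boxes):
    """Return indices of boxes that do not lie inside any other box."""
    x1, y1, x2, y2 = to_corners(boxes).T
    areas = (x2 - x1) * (y2 - y1)
    keep = []
    # Visit boxes from largest to smallest, so any box that could contain
    # the current one has already been visited
    for i in np.argsort(-areas, kind="stable"):
        inside_kept = any(
            x1[k] <= x1[i] and y1[k] <= y1[i] and x2[k] >= x2[i] and y2[k] >= y2[i]
            for k in keep
        )
        if not inside_kept:
            keep.append(int(i))
    return keep


# Illustrative data: boxes 1 and 3 lie inside box 0, box 2 is separate
boxes = [(100, 100, 50, 50), (100, 100, 20, 20), (300, 300, 40, 40), (110, 95, 10, 10)]
print(drop_nested(boxes))  # [0, 2]
```

Checking only against boxes that were already kept is enough: if a box lies inside a dropped box, that dropped box lies inside a kept one, so the containment carries over.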
