Non-Maximum Suppression (NMS) and Soft-NMS

Latest recommended article published on 2024-06-25 20:48:55

Aced96

Original link: https://blog.csdn.net/weixin_44791964/article/details/106222846

What is non-maximum suppression?

Figure: detections before non-maximum suppression.
Figure: detections after non-maximum suppression.

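What non-maximum suppression does can be summed up in one sentence:
among the boxes in a region that belong to the same class, keep the one with the highest score.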

How NMS is implemented

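The multi-class NMS implemented here takes the input $\text{boxes} = [\text{batch\_size}, \text{anchors}, 5 + \text{num\_classes}]$
$\text{batch\_size}$: the number of images
$\text{anchors}$: all the predicted boxes of an image
$5 + \text{num\_classes}$: the prediction for each box, laid out as [x, y, w, h, confidence, num_classes class scores]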
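
As a small illustration of this layout, the sketch below builds a toy `boxes` array and slices it into its three parts. The sizes (2 images, 3 boxes, 2 classes) and the random values are made up for the example and do not come from a real detector.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
batch_size, anchors, num_classes = 2, 3, 2

rng = np.random.default_rng(0)
# Last axis: [x, y, w, h, confidence, class_0 score, ..., class_{n-1} score]
boxes = rng.random((batch_size, anchors, 5 + num_classes))

centers_and_sizes = boxes[..., :4]   # x, y, w, h
objectness = boxes[..., 4]           # confidence that the box contains an object
class_scores = boxes[..., 5:]        # one score per class

print(boxes.shape, centers_and_sizes.shape, objectness.shape, class_scores.shape)
```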

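NMS proceeds as follows:
a. Loop over the $\text{batch\_size}$ images.
b. Remove the boxes in the image whose confidence is below conf_thres. Filtering by score before filtering overlapping boxes greatly reduces the number of boxes.
c. Determine the class and score of each box kept in step b, then stack them with the box positions taken from the prediction. The last dimension changes from 5+num_classes to 4+1+2. 4: the box position; 1: whether the box contains an object; 2: the class confidence and the class index.
d. Loop over the classes. NMS keeps the highest-scoring box of a class within a region, so looping over the classes lets us run NMS on each class separately.
e. Sort the boxes of that class by score in descending order.
f. Repeatedly take the highest-scoring box and compute its overlap (IoU) with all the other boxes; remove those whose overlap is too large (IoU at or above nms_thres).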
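
Steps b and c are the least obvious part, so here is a minimal sketch of them on a single toy image. The three boxes and their scores are invented for the example, and the boxes are assumed to be in corner form already.

```python
import numpy as np

num_classes = 2
conf_thres = 0.5
# One image, three boxes: [x1, y1, x2, y2, confidence, score_class0, score_class1]
prediction = np.array([
    [10., 10., 50., 50., 0.9, 0.2, 0.8],
    [12., 12., 52., 52., 0.3, 0.7, 0.3],   # confidence below conf_thres, dropped in step b
    [60., 60., 90., 90., 0.6, 0.9, 0.1],
])

# Step b: keep only the boxes whose confidence is at least conf_thres
prediction = prediction[prediction[:, 4] >= conf_thres]

# Step c: reduce the per-class scores to (best class score, best class index)
class_scores = prediction[:, 5:5 + num_classes]
class_conf = class_scores.max(axis=1, keepdims=True)
class_pred = class_scores.argmax(axis=1)[:, None].astype(prediction.dtype)

# Last dimension is now 4 + 1 + 2 = 7
detections = np.concatenate((prediction[:, :5], class_conf, class_pred), axis=1)
print(detections)
```

After this step each row carries only the winning class, which is what allows the per-class loop in step d to select rows with a simple equality mask on the last column.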

Implementation:

import numpy as np


# boxes: [batch_size, all_anchors, 5 + num_classes]
# The first 5 values are [x, y, w, h, confidence]; the remaining num_classes values are the class scores.
def non_max_suppression(boxes,num_classes,conf_thres=0.5,nms_thres=0.4):
    bs=np.shape(boxes)[0]       # batch_size

    # Convert the boxes to (top-left corner, bottom-right corner) form
    shape_boxes=np.zeros_like(boxes[:,:,:4])    # shape_boxes.shape = [bs, anchors, 4]
    shape_boxes[:,:,0] = boxes[:,:,0] - boxes[:,:,2]/2       # x1 = x - w/2
    shape_boxes[:,:,1] = boxes[:,:,1] - boxes[:,:,3]/2       # y1 = y - h/2
    shape_boxes[:,:,2] = boxes[:,:,0] + boxes[:,:,2]/2       # x2 = x + w/2
    shape_boxes[:,:,3] = boxes[:,:,1] + boxes[:,:,3]/2       # y2 = y + h/2

    boxes[:,:,:4] = shape_boxes
    output=[]
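    # 1. Loop over all images
    for i in range(bs):
        prediction=boxes[i]                       # prediction.shape = [anchors, 5 + num_classes]
        # 2. Keep the boxes whose confidence reaches the threshold. Filtering by score before filtering overlaps greatly reduces the number of boxes.
        mask = prediction[:,4] >= conf_thres      # compare the confidence column with conf_thres
        prediction = prediction[mask]             # shape [filtered_anchors, 5 + num_classes]; boxes with confidence < conf_thres are gone
        if not np.shape(prediction)[0]:
            continue
        # 3. Determine the class and score of each box kept in step 2,
        # then stack them with the box positions taken from the prediction.
        # The last dimension changes from 5 + num_classes to 4 + 1 + 2.
        # 4: top-left and bottom-right corner coordinates; 1: whether the box contains an object; 2: class confidence and class index
        class_conf = np.expand_dims(np.max(prediction[:,5:5+num_classes],1),-1)   # best class score per box, shape [filtered_anchors, 1]
        class_prd  = np.expand_dims(np.argmax(prediction[:,5:5+num_classes],1),-1)   # index of that class, shape [filtered_anchors, 1]
        detections = np.concatenate((prediction[:,:5],class_conf,class_prd),1)    # shape [filtered_anchors, 7]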
        unique_class = np.unique(detections[:,-1])

        if len(unique_class) == 0 :
            continue

        best_box = []
        # 4. Loop over the classes.
        # NMS keeps the highest-scoring box of a class within a region,
        # so looping over the classes lets us run NMS on each class separately.
        for c in unique_class:
            cls_mask = detections[:,-1] == c

            detection = detections[cls_mask]
            scores=detection[:,4]
            # 5. Sort the boxes of this class by score in descending order
            arg_sort = np.argsort(scores)[::-1]
            detection = detection[arg_sort]
            while np.shape(detection)[0]>0:
                # 6. Take the highest-scoring box, compute its overlap (IoU) with all the remaining boxes, and drop those that overlap too much.
                best_box.append(detection[0])
                if len(detection) == 1:
                    break
                ious = iou(best_box[-1],detection[1:])
                detection = detection[1:][ious<nms_thres]
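        # Store one [num_kept, 7] array per image. The number of kept boxes differs between
        # images, so the result is returned as a list rather than one stacked array.
        output.append(np.array(best_box))
    return output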

def iou(b1,b2):
    b1_x1, b1_y1, b1_x2, b1_y2 = b1[0], b1[1], b1[2], b1[3]
    b2_x1, b2_y1, b2_x2, b2_y2 = b2[:, 0], b2[:, 1], b2[:, 2], b2[:, 3]

    inter_rect_x1 = np.maximum(b1_x1, b2_x1)
    inter_rect_y1 = np.maximum(b1_y1, b2_y1)
    inter_rect_x2 = np.minimum(b1_x2, b2_x2)
    inter_rect_y2 = np.minimum(b1_y2, b2_y2)

    inter_area = np.maximum(inter_rect_x2 - inter_rect_x1, 0) * \
                 np.maximum(inter_rect_y2 - inter_rect_y1, 0)

    area_b1 = (b1_x2 - b1_x1) * (b1_y2 - b1_y1)
    area_b2 = (b2_x2 - b2_x1) * (b2_y2 - b2_y1)

    iou = inter_area / np.maximum((area_b1 + area_b2 - inter_area), 1e-6)
    return iou
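
To see the greedy loop in isolation, the following is a minimal single-class sketch on three made-up boxes. The helper names `pairwise_iou` and `greedy_nms` are hypothetical and are not part of the listing above; the sketch works on indices instead of full detection rows, but follows the same take-best-then-filter idea.

```python
import numpy as np

def pairwise_iou(box, others):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.maximum(x2 - x1, 0) * np.maximum(y2 - y1, 0)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / np.maximum(area + areas - inter, 1e-6)

def greedy_nms(boxes, scores, nms_thres=0.4):
    """Return the indices of the boxes kept by greedy NMS, highest score first."""
    order = np.argsort(scores)[::-1]          # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        ious = pairwise_iou(boxes[best], boxes[rest])
        order = rest[ious < nms_thres]        # drop boxes that overlap the best box too much
    return keep

# Boxes 0 and 1 overlap heavily (IoU = 81/119, about 0.68); box 2 is far away.
toy_boxes = np.array([[0., 0., 10., 10.], [1., 1., 11., 11.], [20., 20., 30., 30.]])
toy_scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(toy_boxes, toy_scores)
print(kept)  # [0, 2]
```

Box 1 is suppressed because its IoU with box 0 exceeds nms_thres, while box 2 survives because it does not overlap box 0 at all.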

How Soft-NMS is implemented

Soft-NMS differs very little from ordinary NMS: only a few lines of code change.

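Soft-NMS holds that boxes should not be filtered on overlap alone. As the figure shows, the image clearly contains two horses, but their boxes overlap heavily; with ordinary NMS, the lower-scoring horse in the back would be removed outright.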

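Soft-NMS holds that suppression should take both the score and the overlap into account.
Figure: two horses whose boxes overlap heavily.
Let's look directly at how the NMS and Soft-NMS code differ.
NMS: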

while np.shape(detection)[0]>0:
    # 6. Take the highest-scoring box, compute its overlap (IoU) with all the remaining boxes, and drop those that overlap too much.
    best_box.append(detection[0])
    if len(detection) == 1:
        break
    ious = iou(best_box[-1],detection[1:])
    detection = detection[1:][ious<nms_thres]

Soft-NMS:

while np.shape(detection)[0]>0:
    best_box.append(detection[0])
    if len(detection) == 1:
        break
    ious = iou(best_box[-1],detection[1:])
    detection[1:,4] = np.exp(-(ious * ious) / sigma)*detection[1:,4]
    detection = detection[1:]
    scores = detection[:,4]
    arg_sort = np.argsort(scores)[::-1]
    detection = detection[arg_sort]

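As we can see, NMS directly removes every other prediction that overlaps heavily with the highest-scoring box. Soft-NMS instead applies a weight: it passes the IoU through a Gaussian, exp(-IoU*IoU/sigma), multiplies the original score by the result, re-sorts, and continues the loop. In this implementation no box is actually discarded; heavily overlapping boxes simply end up with much lower scores, and nms_thres is not used.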
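
To get a feel for how strongly this weight penalizes overlap, the short sketch below evaluates $w = e^{-\mathrm{IoU}^2/\sigma}$ for a few overlaps. The IoU values and scores are made up for illustration; sigma = 0.5 matches the default used in this article's code.

```python
import numpy as np

sigma = 0.5
ious = np.array([0.0, 0.3, 0.6, 0.9])      # made-up overlaps with the current best box
scores = np.array([0.8, 0.8, 0.8, 0.8])    # made-up original scores

weights = np.exp(-(ious * ious) / sigma)   # Gaussian weight, exactly 1.0 when there is no overlap
decayed = weights * scores

for i, w, s in zip(ious, weights, decayed):
    print(f"IoU={i:.1f}  weight={w:.3f}  decayed score={s:.3f}")
```

A box with no overlap keeps its score, and the penalty grows smoothly with the IoU instead of switching from "keep" to "drop" at a threshold.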
The full implementation is as follows:

import numpy as np
def non_max_suppression(boxes, num_classes, conf_thres=0.5, sigma=0.5, nms_thres=0.4):
    bs = np.shape(boxes)[0]
    # Convert the boxes to (top-left corner, bottom-right corner) form
    shape_boxes = np.zeros_like(boxes[:,:,:4])
    shape_boxes[:,:,0] = boxes[:,:,0] - boxes[:,:,2]/2
    shape_boxes[:,:,1] = boxes[:,:,1] - boxes[:,:,3]/2
    shape_boxes[:,:,2] = boxes[:,:,0] + boxes[:,:,2]/2
    shape_boxes[:,:,3] = boxes[:,:,1] + boxes[:,:,3]/2

    boxes[:,:,:4] = shape_boxes
    output = []
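    # 1. Loop over all images.
    for i in range(bs):
        prediction = boxes[i]
        # 2. Keep the boxes in this image whose confidence reaches the threshold. Filtering by score before filtering overlaps greatly reduces the number of boxes.
        mask = prediction[:,4] >= conf_thres
        prediction = prediction[mask]
        if not np.shape(prediction)[0]:
            continue

        # 3. Determine the class and score of each box kept in step 2,
        # then stack them with the box positions taken from the prediction.
        # The last dimension changes from 5 + num_classes to 4 + 1 + 2:
        # four values for the box position, one for whether the box contains an object, and two for the class confidence and the class index.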
        class_conf = np.expand_dims(np.max(prediction[:, 5:5 + num_classes], 1),-1)
        class_pred = np.expand_dims(np.argmax(prediction[:, 5:5 + num_classes], 1),-1)
        detections = np.concatenate((prediction[:, :5], class_conf, class_pred), 1)
        unique_class = np.unique(detections[:,-1])
        
        if len(unique_class) == 0:
            continue
        
        best_box = []
        # 4. Loop over the classes.
        # NMS keeps the highest-scoring box of a class within a region,
        # so looping over the classes lets us run NMS on each class separately.
        for c in unique_class:
            cls_mask = detections[:,-1] == c

            detection = detections[cls_mask]
            scores = detection[:,4]
            # 5. Sort the boxes of this class by score in descending order.
            arg_sort = np.argsort(scores)[::-1]
            detection = detection[arg_sort]
            while np.shape(detection)[0]>0:
                best_box.append(detection[0])
                if len(detection) == 1:
                    break
                ious = iou(best_box[-1],detection[1:])
                # Multiply each remaining score by a Gaussian of its IoU, then re-sort
                detection[1:,4] = np.exp(-(ious * ious) / sigma)*detection[1:,4]
                detection = detection[1:]
                scores = detection[:,4]
                arg_sort = np.argsort(scores)[::-1]
                detection = detection[arg_sort]
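        # Store one [num_kept, 7] array per image. The number of kept boxes differs between
        # images, so the result is returned as a list rather than one stacked array.
        output.append(np.array(best_box))
    return output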

def iou(b1,b2):
    b1_x1, b1_y1, b1_x2, b1_y2 = b1[0], b1[1], b1[2], b1[3]
    b2_x1, b2_y1, b2_x2, b2_y2 = b2[:, 0], b2[:, 1], b2[:, 2], b2[:, 3]

    inter_rect_x1 = np.maximum(b1_x1, b2_x1)
    inter_rect_y1 = np.maximum(b1_y1, b2_y1)
    inter_rect_x2 = np.minimum(b1_x2, b2_x2)
    inter_rect_y2 = np.minimum(b1_y2, b2_y2)
    
    inter_area = np.maximum(inter_rect_x2 - inter_rect_x1, 0) * \
                 np.maximum(inter_rect_y2 - inter_rect_y1, 0)
    
    area_b1 = (b1_x2-b1_x1)*(b1_y2-b1_y1)
    area_b2 = (b2_x2-b2_x1)*(b2_y2-b2_y1)
    
    iou = inter_area/np.maximum((area_b1+area_b2-inter_area),1e-6)
    return iou
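
The two-horse situation can be reproduced with a pair of made-up boxes. The sketch below is a single-class variant written for illustration; the names `iou_one_to_many` and `soft_nms` and the `score_thres` cut-off are assumptions and are not part of the listing above, which keeps every box. Dropping boxes whose decayed score falls below a small threshold is one common way to turn the rescoring into an actual filter.

```python
import numpy as np

def iou_one_to_many(box, others):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.maximum(x2 - x1, 0) * np.maximum(y2 - y1, 0)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / np.maximum(area + areas - inter, 1e-6)

def soft_nms(boxes, scores, sigma=0.5, score_thres=0.001):
    """Gaussian Soft-NMS for one class. Returns (kept indices, their final scores).

    score_thres is a hypothetical cut-off added for this sketch.
    """
    scores = scores.astype(float).copy()      # do not modify the caller's scores
    idx = np.arange(len(scores))              # indices of boxes still in play
    keep_idx, keep_scores = [], []
    while idx.size > 0:
        top = np.argmax(scores[idx])          # position of the current best box within idx
        best = idx[top]
        keep_idx.append(int(best))
        keep_scores.append(float(scores[best]))
        idx = np.delete(idx, top)
        if idx.size == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[idx])
        scores[idx] *= np.exp(-(ious * ious) / sigma)   # Gaussian decay of the remaining scores
        idx = idx[scores[idx] >= score_thres]           # drop boxes whose score decayed too far
    return keep_idx, keep_scores

# Two heavily overlapping boxes (IoU = 2/3), standing in for the two horses
horses = np.array([[0., 0., 10., 10.], [2., 0., 12., 10.]])
horse_scores = np.array([0.9, 0.8])

kept, final_scores = soft_nms(horses, horse_scores)
print(kept, final_scores)
```

With plain NMS and nms_thres=0.4 the second box would be removed, since its IoU with the first is 2/3. Here it is kept, with its score reduced from 0.8 to roughly 0.33, and whether it survives is decided later by whatever score threshold is applied to the output.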