soft-NMS算法理解

最新推荐文章于 2024-01-18 15:11:34 发布

luoganttcc

最新推荐文章于 2024-01-18 15:11:34 发布

阅读量994

点赞数

分类专栏：机器视觉 yolo

本文链接：https://blog.csdn.net/luoganttcc/article/details/117990132

版权

机器视觉同时被 2 个专栏收录

155 篇文章 15 订阅

订阅专栏

yolo

21 篇文章 0 订阅

订阅专栏

code

NMS算法的大致过程可以看原文这段话：

First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a pre-defined threshold) with M are suppressed. This process is recursively applied on the remaining boxes.
那么传统的NMS算法存在什么问题呢？可以看Figure1。在Fiugre1中，检测算法本来应该输出两个框，但是传统的NMS算法可能会把score较低的绿框过滤掉（如果绿框和红框的IOU大于设定的阈值就会被过滤掉），导致只检测出一个object（一个马），显然这样object的recall就比较低了。

在这里插入图片描述

可以看出NMS算法是略显粗暴（hard），因为NMS直接将和得分最大的box的IOU大于某个阈值的box的得分置零，那么有没有soft一点的方法呢？这就是本文提出Soft NMS。那么Soft-NMS算法到底是什么样呢？简单讲就是：An algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. 换句话说就是用稍低一点的分数来代替原有的分数，而不是直接置零。另外由于Soft NMS可以很方便地引入到object detection算法中，不需要重新训练原有的模型，因此这是该算法的一大优点。

在这里插入图片描述

Figure2是Soft NMS算法的伪代码。首先是关于三个输入B、S、Nt，在FIgure2中已经介绍很清楚了。D集合用来放最终的box，在boxes集合B非空的前提下，搜索score集合S中数值最大的数，假设其下标为m，那么bm（也是M）就是对应的box。然后将M和D集合合并，并从B集合中去除M。再循环集合B中的每个box，这个时候就有差别了，如果是传统的NMS操作，那么当B中的box bi和M的IOU值大于阈值Nt，那么就从B和S中去除该box；如果是Soft NMS，则对于B中的box bi也是先计算其和M的IOU，然后该IOU值作为函数f()的输入，最后和box bi的score si相乘作为最后该box bi的score。就是这么简单！

————————————————
版权声明：本文为CSDN博主「AI之路」的原创文章，遵循CC 4.0 BY-SA版权协议，转载请附上原文出处链接及本声明。
原文链接：https://blog.csdn.net/u014380165/article/details/79502197

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Jun 16 17:16:31 2021

@author: ledi
"""

# -*- coding:utf-8 -*-
# Author:Richard Fang
"""
This is a Python version used to implement the Soft NMS algorithm.
Original Paper：Improving Object Detection With One Line of Code
"""
import numpy as np
import tensorflow as tf
import tensorflow as  K
import time


def py_cpu_softnms(dets, sc, Nt=0.3, sigma=0.5, thresh=0.001, method=2):
    """
    py_cpu_softnms
    :param dets:   boexs 坐标矩阵 format [y1, x1, y2, x2]
    :param sc:     每个 boxes 对应的分数
    :param Nt:     iou 交叠门限
    :param sigma:  使用 gaussian 函数的方差
    :param thresh: 最后的分数门限
    :param method: 使用的方法
    :return:       留下的 boxes 的 index
    """

    # indexes concatenate boxes with the last column
    N = dets.shape[0]                                # 5
    indexes = np.array([np.arange(N)])               # array([[0, 1, 2, 3, 4, 5]])
    
    
    
    '''
    dets=array([[200., 200., 400., 400.,   0.],
               [220., 220., 420., 420.,   1.],
               [200., 240., 400., 440.,   2.],
               [240., 200., 440., 400.,   3.],
               [  1.,   1.,   2.,   2.,   4.]])
    '''
    
    dets = np.concatenate((dets, indexes.T), axis=1) #

    # the order of boxes coordinate is [y1,x1,y2,x2]
    y1 = dets[:, 0]
    x1 = dets[:, 1]
    y2 = dets[:, 2]
    x2 = dets[:, 3]
    scores = sc     # array([0.9, 0.5, 0. , 0. , 0. ], dtype=float32)
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    for i in range(N):
        # intermediate parameters for later parameters exchange
        
        
        #当前变量
        tBD = dets[i, :].copy()
        tscore = scores[i].copy()
        tarea = areas[i].copy()
        pos = i + 1

        #判断是不是最后一个box
        if i != N-1:
            
            #除去第一个box 的score 最大score 和其相应的index(位置)
            maxscore = np.max(scores[pos:], axis=0)
            maxpos = np.argmax(scores[pos:], axis=0)
        else:
            maxscore = scores[-1]
            maxpos = 0
            
        #比较当前的score 和余下的score的最大值
        #如果当前的score 比余下的score的最大值小，就交换对应的(x1,y1,x2,y2),score和area
        
        #总之只有一点就是把score 最大的那个box 放在第一个
        if tscore < maxscore:
            
            #将首位的box 换成，score最大的那个box
            #交换坐标　(x1,y1,x2,y2)
            dets[i, :] = dets[maxpos + i + 1, :]
            dets[maxpos + i + 1, :] = tBD
            tBD = dets[i, :]
            
            #交换score 
            scores[i] = scores[maxpos + i + 1]
            scores[maxpos + i + 1] = tscore
            tscore = scores[i]
            
            #交换面积
            areas[i] = areas[maxpos + i + 1]
            areas[maxpos + i + 1] = tarea
            tarea = areas[i]

        # IoU calculate
        #分别比较当前的box的某个坐标与其他坐标的最大与最小值，
        #为计算交并比做准备
        xx1 = np.maximum(dets[i, 1], dets[pos:, 1])
        yy1 = np.maximum(dets[i, 0], dets[pos:, 0])
        xx2 = np.minimum(dets[i, 3], dets[pos:, 3])
        yy2 = np.minimum(dets[i, 2], dets[pos:, 2])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        
        #交并比
        ovr = inter / (areas[i] + areas[pos:] - inter)
        
        # print('ovr=',ovr)
        
        #传统的NMS算法与soft-NMS算法的比较差别就是weight 的计算方式不同，
        #NMS算法比较粗暴，直接　kill 掉　交并比大于阈值的box
        #而soft-NMS则将大于阈值的box降低一定的权重

        # Three methods: 1.linear 2.gaussian 3.original NMS
        if method == 1:  # linear
            weight = np.ones(ovr.shape)
            weight[ovr > Nt] = weight[ovr > Nt] - ovr[ovr > Nt]
        elif method == 2:  # gaussian
            weight = np.exp(-(ovr * ovr) / sigma)
        else:  # original NMS
            weight = np.ones(ovr.shape)
            weight[ovr > Nt] = 0

        scores[pos:] = weight * scores[pos:]
        
        print('weight=',weight)
        # print('scores=',scores)
        

    # select the boxes and keep the corresponding indexes
    inds = dets[:, 4][scores > thresh]
    keep = inds.astype(int)

    return keep


def speed():
    boxes =1000* np.random.rand(1000, 100, 4)
    boxscores = np.random.rand(1000, 100)

    start = time.time()
    for i in range(1000):
        py_cpu_softnms(boxes[i], boxscores[i], method=2)
    end = time.time()
    print("Average run time: %f ms" % (end-start))


def test():
    # boxes and scores
    boxes = np.array([[200, 200, 400, 400], [220, 220, 420, 420], [200, 240, 400, 440], [240, 200, 440, 400], [1, 1, 2, 2]], dtype=np.float32)
    boxscores = np.array([0.9, 0.8, 0.7, 0.6, 0.5], dtype=np.float32)

    # tf.image.non_max_suppression 中 boxes 是 [y1,x1,y2,x2] 排序的。

    index = py_cpu_softnms(boxes, boxscores, method=2)
    selected_boxes = boxes[index]
    
    return selected_boxes


if __name__ == '__main__':
    test()
    # speed()