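Notes on the Soft-NMS paper: Improving Object Detection With One Line of Code

Paper: Improving Object Detection With One Line of Code
Paper link: https://arxiv.org/pdf/1704.04503.pdf

This is an ICCV 2017 paper that proposes an improvement to the NMS (non-maximum suppression) algorithm.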

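Original NMS

Suppose the network predicts six boxes A1~A6. Sort the boxes by confidence in ascending order; the sorted result is: A1, A2, A3, A4, A5, A6
(1) Start from the box with the highest confidence, A6, and check whether the IoU of each of A1, A2, A3, A4, A5 with A6 exceeds a preset threshold (say a = 0.7);
(2) Suppose the IoU of A2 and A3 with A6 exceeds the threshold. Then A2 and A3 are discarded and A6 is marked as kept; the remaining boxes are A1, A4, A5, A6;
(3) From the remaining unmarked boxes A1, A4, A5, pick the one with the highest confidence, A5, and check the IoU of A1 and A4 with A5; any box whose IoU exceeds the threshold is discarded. Suppose A1 is discarded; the remaining boxes are then A4, A5, A6;

(4) Repeating this until no unmarked box is left yields the kept boxes A4, A5, A6, whose mutual overlap (IoU) is low.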
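
The four steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not code from the paper: the helper names `iou_one_to_many` and `hard_nms` are made up here, boxes are assumed to be `[x1, y1, x2, y2]` with continuous coordinates (no "+1" pixel convention), and sorting is done in descending order so the best box comes first.

```python
import numpy as np

def iou_one_to_many(box, others):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    # Clip at zero so disjoint boxes get an empty intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)

def hard_nms(boxes, scores, iou_threshold=0.7):
    """Return the indices of the kept boxes, highest score first."""
    order = np.argsort(scores)[::-1]          # best box first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))                # steps (1)/(3): mark the current best box
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]   # step (2): drop heavy overlaps
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = hard_nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

In this toy input the second box overlaps the first with an IoU of about 0.68, so it is dropped at a threshold of 0.5 but would survive at the threshold of 0.7 used in the walkthrough.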

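Soft NMS

Suppose the network predicts six boxes A1~A6. Sort the boxes by confidence in ascending order; the sorted result is: BOXES=[A1,A2,A3,A4,A5,A6]

Let their confidences be (s1,s2,s3,s4,s5,s6)
(1) Start from the box with the highest confidence, A6, and check whether the IoU of each of A1, A2, A3, A4, A5 with A6 exceeds a preset threshold (say a = 0.3);
(2) Suppose the IoU of A2 and A3 with A6 is 0.4 and 0.45 respectively, both above the threshold. Then the confidences of A2 and A3 are modified:

s2 = s2 * (1 - iou(A2,A6)) = s2 * 0.6

s3 = s3 * (1 - iou(A3,A6)) = s3 * 0.55

If the updated s2 is below the score threshold (0.001), A2 is discarded; otherwise it is kept. Now suppose the updated s2 is below 0.001 but the updated s3 is not: A2 is discarded and A3 is kept, so BOXES=[A1,A3,A4,A5,A6]

(3) Sort BOXES by confidence in ascending order again; suppose the result is BOXES=[A3,A1,A4,A5,A6]. Note that because the confidence of A3 was lowered in step (2), its position may change after this re-sort

(4) The best unprocessed box is now A5. Check whether the IoU of each of A3, A1, A4 with A5 exceeds the threshold, and repeat step (2) (A2 has already been discarded, so it no longer takes part)

(5) Everything else is the same as in NMS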
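
The rescoring in step (2) can be checked with a small sketch. The function name `decay_weight` and the starting scores `s2 = 0.0015` and `s3 = 0.5` are hypothetical values chosen for illustration; the three rules (linear, Gaussian, hard) mirror the `method` switch in the implementations shown later in this post.

```python
import math

def decay_weight(iou, method="linear", nt=0.3, sigma=0.5):
    """Factor by which a box's score is multiplied, given its IoU with the current best box."""
    if method == "linear":
        return 1.0 - iou if iou > nt else 1.0
    if method == "gaussian":
        return math.exp(-(iou * iou) / sigma)
    # Original (hard) NMS: the score is zeroed once the IoU exceeds the threshold.
    return 0.0 if iou > nt else 1.0

score_threshold = 0.001
s2, s3 = 0.0015, 0.5                     # hypothetical starting confidences

new_s2 = s2 * decay_weight(0.4)          # s2 * 0.6
new_s3 = s3 * decay_weight(0.45)         # s3 * 0.55

print("A2 kept:", new_s2 >= score_threshold)
print("A3 kept:", new_s3 >= score_threshold)
print("A3 under hard NMS:", s3 * decay_weight(0.45, method="hard"))
```

With these numbers A2 falls just below the score threshold and is dropped, while A3 survives with a reduced score; under the hard rule A3's score would be zeroed outright.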

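Question:

1. Lowering the confidences of A2 and A3 only seems to change the sort order: it was A1,A2,A3,A4,A5,A6 and may become A3,A2,A1,A4,A5,A6 after the update. But after re-sorting we still compute the IoU of A3, A2, A1, A4, A5 with A6, and the IoU of A3 and A2 with A6 has not changed. Won't they be discarded anyway? How is this different from the original NMS?

Answer: The IoU of A3 and A2 with A6 is indeed unchanged, but Soft NMS never discards a box because of its IoU alone; the IoU only determines how much the confidence is lowered. If the lowered confidence is below the given score threshold, A3 and A2 are still discarded, but if it stays above the threshold, the box is kept. Compared with the original NMS, Soft NMS therefore keeps more boxes.

Better to just look at the code. This code comes from KL-LOSS-master/detectron/utils/cython_nms.pyx-->soft_nms(),

Code: https://github.com/yihui-he/KL-Loss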

def soft_nms(
    np.ndarray[float, ndim=2] boxes_in,
    float sigma=0.5,
    float Nt=0.3,
    float threshold=0.001,
    unsigned int method=0
):
    boxes = boxes_in.copy()
    cdef unsigned int N = boxes.shape[0]  # number of input boxes
    cdef float iw, ih, box_area  # intersection width and height; box area
    cdef float ua
    cdef int pos = 0
    cdef float maxscore = 0
    cdef int maxpos = 0
    cdef float x1, x2, y1, y2, tx1, tx2, ty1, ty2, ts, area, weight, ov
    inds = np.arange(N)

    for i in range(N):  # iterate over all boxes
        maxscore = boxes[i, 4]  # confidence of the current box
        maxpos = i  # until a higher score is found, assume box i has the highest confidence

        tx1 = boxes[i,0]
        ty1 = boxes[i,1]
        tx2 = boxes[i,2]
        ty2 = boxes[i,3]
        ts = boxes[i,4]
        ti = inds[i]

        pos = i + 1
        # get max box
        while pos < N:
            if maxscore < boxes[pos, 4]:  # a later box has a higher confidence than the current maximum
                maxscore = boxes[pos, 4]  # update the maximum confidence
                maxpos = pos  # remember which box has the highest confidence
            pos = pos + 1

        # add max box as a detection
        boxes[i,0] = boxes[maxpos,0]
        boxes[i,1] = boxes[maxpos,1]
        boxes[i,2] = boxes[maxpos,2]
        boxes[i,3] = boxes[maxpos,3]
        boxes[i,4] = boxes[maxpos,4]
        inds[i] = inds[maxpos]

        # swap ith box with position of max box
        boxes[maxpos,0] = tx1
        boxes[maxpos,1] = ty1
        boxes[maxpos,2] = tx2
        boxes[maxpos,3] = ty2
        boxes[maxpos,4] = ts
        inds[maxpos] = ti

        tx1 = boxes[i,0]
        ty1 = boxes[i,1]
        tx2 = boxes[i,2]
        ty2 = boxes[i,3]
        ts = boxes[i,4]

        pos = i + 1
        # NMS iterations, note that N changes if detection boxes fall below
        # threshold
        while pos < N:  # boxes[i] (call it b1) now has the highest confidence; compute its IoU with every remaining box
            x1 = boxes[pos, 0]  # take the next remaining box, call it b2
            y1 = boxes[pos, 1]
            x2 = boxes[pos, 2]
            y2 = boxes[pos, 3]
            s = boxes[pos, 4]

            area = (x2 - x1 + 1) * (y2 - y1 + 1)  # area of b2
            iw = (min(tx2, x2) - max(tx1, x1) + 1)  # width of the intersection
            if iw > 0:
                ih = (min(ty2, y2) - max(ty1, y1) + 1)  # height of the intersection
                if ih > 0:
                    ua = float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih)  # area of the union of the two boxes
                    ov = iw * ih / ua  # IoU between max box and detection box: intersection area / union area

                    if method == 1: # linear; this is the Soft NMS part
                        if ov > Nt:  # the IoU of the two boxes exceeds the threshold
                            weight = 1 - ov  # weight of b2 is 1 - IoU: the larger the IoU, the smaller the weight
                        else:  # the IoU does not exceed the threshold: b2 and b1 overlap, but only slightly
                            weight = 1
                    elif method == 2: # gaussian
                        weight = np.exp(-(ov * ov)/sigma)
                    else: # original NMS
                        if ov > Nt:
                            weight = 0
                        else:
                            weight = 1

                    boxes[pos, 4] = weight*boxes[pos, 4]  # score_new = score_old * weight: the larger the IoU of b2 with b1, the more b2's confidence is lowered

                    # if box score falls below threshold, discard the box by
                    # swapping with last box update N
                    if boxes[pos, 4] < threshold:  # the updated confidence of b2 fell below the score threshold (0.001)
                        boxes[pos,0] = boxes[N-1, 0]  # overwrite b2 with the last box bN
                        boxes[pos,1] = boxes[N-1, 1]
                        boxes[pos,2] = boxes[N-1, 2]
                        boxes[pos,3] = boxes[N-1, 3]
                        boxes[pos,4] = boxes[N-1, 4]
                        inds[pos] = inds[N-1]
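                        N = N - 1  # shrink boxes by one. E.g. with boxes=[b1,b2,b3,b4,b5,b6], N=6 and b1 the top box: if b2's updated confidence is below the threshold,
                        # b2 is overwritten by b6 and N becomes 5, i.e. boxes=[b1,b6,b3,b4,b5], so the original b2 is dropped. If b2 overlaps b1 heavily but its updated confidence is still large enough, b2 is kept,
                        # whereas the original NMS would delete b2 outright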
                        pos = pos - 1

            pos = pos + 1

    return boxes[:N], inds[:N]
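
The IoU computation inside the inner loop uses the pixel-inclusive "+1" convention, where a box from x1 to x2 is x2 - x1 + 1 pixels wide. As a sanity check it can be reproduced in plain Python; the helper name `iou_inclusive` is made up for this sketch and is not part of the repository.

```python
def iou_inclusive(a, b):
    """IoU of two boxes [x1, y1, x2, y2] whose corner pixels are inclusive."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1   # intersection width
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1   # intersection height
    if iw <= 0 or ih <= 0:
        return 0.0                               # the boxes do not overlap
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    inter = iw * ih
    return inter / (area_a + area_b - inter)     # intersection / union

ov = iou_inclusive([200, 200, 400, 400], [220, 220, 420, 420])
print(round(ov, 4))
```

For these two boxes the intersection is 181 x 181 pixels and each box covers 201 x 201 pixels, so the IoU is 32761 / 48041, roughly 0.68.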

There is also a pure Python version, which is more straightforward:

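import numpy as np
import tensorflow as tf  # the demo at the bottom assumes TensorFlow 1.x (tf.Session)
from keras import backend as K  # assumed source of the K.gather call used in the demo


def softnms(dets, sc, Nt=0.3, sigma=0.5, thresh=0.001, method=2):
    """
    py_cpu_softnms
    :param dets:   box coordinate matrix, format [y1, x1, y2, x2]
    :param sc:     score of each box
    :param Nt:     IoU overlap threshold
    :param sigma:  variance of the Gaussian function
    :param thresh: final score threshold
    :param method: method to use (1: linear, 2: gaussian, otherwise: original NMS)
    :return:       indexes of the kept boxes
    """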

    # indexes concatenate boxes with the last column
    N = dets.shape[0]  # total number of input boxes; e.g. for
    # box=[[200. 200. 400. 400.]
    #      [220. 220. 420. 420.]], N=2

    indexes = np.array([np.arange(N)])  # index of each box
    dets = np.concatenate((dets, indexes.T), axis=1)  # append the index to each box's coordinates, giving
    # [[200. 200. 400. 400.   0.]
    #  [220. 220. 420. 420.   1.]]
    # the order of boxes coordinate is [y1,x1,y2,x2]
    y1 = dets[:, 0]
    x1 = dets[:, 1]
    y2 = dets[:, 2]
    x2 = dets[:, 3]
    scores = sc.copy()  # work on a copy so the caller's score array is not modified in place
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    for i in range(N):
        # intermediate parameters for later parameters exchange
        tBD = dets[i, :].copy()
        tscore = scores[i].copy()
        tarea = areas[i].copy()
        pos = i + 1

        # find the highest score among the boxes after position i
        if i != N-1:
            maxscore = np.max(scores[pos:], axis=0)
            maxpos = np.argmax(scores[pos:], axis=0)
        else:
            maxscore = scores[-1]
            maxpos = 0
        if tscore < maxscore:
            dets[i, :] = dets[maxpos + i + 1, :]
            dets[maxpos + i + 1, :] = tBD
            tBD = dets[i, :]

            scores[i] = scores[maxpos + i + 1]
            scores[maxpos + i + 1] = tscore
            tscore = scores[i]

            areas[i] = areas[maxpos + i + 1]
            areas[maxpos + i + 1] = tarea
            tarea = areas[i]

        # IoU calculation
        xx1 = np.maximum(dets[i, 1], dets[pos:, 1])
        yy1 = np.maximum(dets[i, 0], dets[pos:, 0])
        xx2 = np.minimum(dets[i, 3], dets[pos:, 3])
        yy2 = np.minimum(dets[i, 2], dets[pos:, 2])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[pos:] - inter)

        # Three methods: 1.linear 2.gaussian 3.original NMS
        if method == 1:  # linear
            weight = np.ones(ovr.shape)
            weight[ovr > Nt] = weight[ovr > Nt] - ovr[ovr > Nt]
        elif method == 2:  # gaussian
            weight = np.exp(-(ovr * ovr) / sigma)
        else:  # original NMS
            weight = np.ones(ovr.shape)
            weight[ovr > Nt] = 0

        scores[pos:] = weight * scores[pos:]

    # select the boxes and keep the corresponding indexes
    inds = dets[:, 4][scores > thresh]
    keep = inds.astype(int)[:20]  # at most the first 20 indexes are returned

    return keep


if __name__ == '__main__':
    # boxes and scores
    boxes = np.array([[200, 200, 400, 400], [220, 220, 420, 420], [200, 240, 400, 440], [240, 200, 440, 400], [1, 1, 2, 2]], dtype=np.float32)
    boxscores = np.array([0.9, 0.8, 0.7, 0.6, 0.5], dtype=np.float32)

    # tf.image.non_max_suppression expects boxes ordered as [y1,x1,y2,x2].
    with tf.Session() as sess:

        index = sess.run(tf.image.non_max_suppression(boxes=boxes, scores=boxscores, iou_threshold=0.5, max_output_size=5))
        print(sess.run(K.gather(boxes, index)))  # hard NMS keeps only
        # [[200. 200. 400. 400.]
        #  [  1.   1.   2.   2.]]
        index2=softnms(boxes, boxscores, method=2)

        selected_boxes = sess.run(K.gather(boxes, index2))
        # With method=2 and thresh=0.001 the decayed scores of the three overlapping
        # boxes stay above thresh, so Soft NMS returns all five boxes here.
        print(selected_boxes)
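
Since `softnms` returns a plain integer index array, the kept boxes can also be selected with NumPy indexing, without a TensorFlow session. A minimal sketch, where the `keep` array is a hypothetical result rather than the output of an actual run:

```python
import numpy as np

boxes = np.array([[200, 200, 400, 400],
                  [220, 220, 420, 420],
                  [1, 1, 2, 2]], dtype=np.float32)

keep = np.array([0, 2])        # hypothetical indexes of the kept boxes
selected_boxes = boxes[keep]   # integer-array indexing picks rows in the given order
print(selected_boxes)
```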

 

 
