NMS, SoftNMS, ROIPooling and ROIAlign

Latest recommended article published on 2024-01-08 10:52:34

qq_41131535

Article link: https://blog.csdn.net/qq_41131535/article/details/89948218

1 NMS

Reference code (PyTorch version):

def NMS(boxs, scores, nms_threshold):
    # input: boxs (N, 4) as (x1, y1, x2, y2), scores (N,)
    # output: list of indices of the kept boxes, highest score first
    x1 = boxs[:, 0]
    y1 = boxs[:, 1]
    x2 = boxs[:, 2]
    y2 = boxs[:, 3]
    area = (x2 - x1) * (y2 - y1)
    _, order = scores.sort(descending=True)   # indices sorted by descending score
    keep = []
    while order.numel() > 0:
        i = order[0]                 # highest-scoring box among the remaining ones
        keep.append(i.item())
        if order.numel() == 1:
            break
        order = order[1:]            # candidates still to be compared with box i
        # intersection of box i with every remaining box
        xx1 = x1[order].clamp(min=x1[i])
        yy1 = y1[order].clamp(min=y1[i])
        xx2 = x2[order].clamp(max=x2[i])
        yy2 = y2[order].clamp(max=y2[i])
        inter = (xx2 - xx1).clamp(min=0) * (yy2 - yy1).clamp(min=0)
        iou = inter / (area[order] + area[i] - inter)
        # If the boxes are integer tensors, `/` may fail on torch >= 1.6; in that case use:
        # iou = torch.true_divide(inter, area[order] + area[i] - inter)
        # squeeze(1) keeps idx 1-D even when only a single box survives
        idx = (iou <= nms_threshold).nonzero().squeeze(1)
        order = order[idx]
    return keep

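Note: at first glance this code seems to ignore the case where one box lies entirely inside another and to compute an IoU of 0 for it. In fact the clamps make the intersection equal to the inner box, so the IoU is area(inner) / area(outer), which is nonzero. Using clamp to compute the IoU is a neat and practical idea.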
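To check how a clamp-style intersection behaves when one box is contained in another, here is a small pure-Python sketch. The helper `iou` and the example boxes are made up for illustration; it mirrors the max/min idea of the code above on plain tuples rather than tensors.

```python
def iou(box_a, box_b):
    """IoU of two (x1, y1, x2, y2) boxes, using max/min like the clamp calls."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # negative widths/heights mean "no overlap" and are clamped to 0
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


outer = (0, 0, 10, 10)      # area 100
inner = (2, 2, 6, 6)        # area 16, fully inside `outer`
far = (20, 20, 30, 30)      # does not touch `outer`

nested_iou = iou(outer, inner)    # intersection is the inner box: 16 / 100
disjoint_iou = iou(outer, far)    # clamped to zero overlap
```

For nested boxes the intersection is the inner box itself, so the IoU only reaches 0 when the boxes do not overlap at all.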

2 SoftNMS
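There are three hyperparameters: iou_threshold, softnms_threshold and sigma (nms_threshold, soft_threshold and sigma in the code below). Compared with NMS, a box whose IoU with the current top-scoring box exceeds iou_threshold is not discarded; its score is recomputed instead, usually with the Gaussian weighting shown in the figure below, i.e. score = score * exp(-IoU^2 / sigma). After rescoring, the top-scoring box is moved to the kept set, and the remaining boxes are retained not because their IoU is below iou_threshold, but because their score is still above softnms_threshold.
[Figure: Gaussian score re-weighting used by Soft-NMS]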
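As a rough illustration of the Gaussian re-weighting, the sketch below rescales a score for a few IoU values. The function name `rescore` and the sample numbers are assumptions made for this example; the rule "decay only when IoU >= threshold" follows the code in this post.

```python
import math


def rescore(score, iou, iou_threshold=0.3, sigma=0.5):
    """Gaussian Soft-NMS rescoring: decay the score only for high-IoU boxes."""
    if iou >= iou_threshold:
        return score * math.exp(-(iou * iou) / sigma)
    return score


# The higher the overlap with the top box, the stronger the decay.
for overlap in (0.1, 0.3, 0.5, 0.9):
    print(f"IoU={overlap:.1f}  score 0.90 -> {rescore(0.9, overlap):.4f}")
```

A box with low overlap keeps its score, while a heavily overlapping box is pushed towards 0 and is later removed by the score threshold rather than by the IoU test itself.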
Code (adapted from https://zhuanlan.zhihu.com/p/54709759):

import torch

# I find this version concise and easier to follow than the official soft-nms code
def box_soft_nms(bboxes, scores, labels, nms_threshold=0.3, soft_threshold=0.3, sigma=0.5, mode='union'):
    """
    Soft-NMS implementation following the Soft-NMS paper
    :param bboxes: all predicted boxes, shape (N, 4) as (x1, y1, x2, y2)
    :param scores: all predicted class scores, shape (N,)
    :param labels: predicted class label of each box; a score alone does not say which class it belongs to
    :param nms_threshold: original NMS threshold; boxes whose IoU with the top box reaches it get their score reduced
    :param soft_threshold: after the reduction, boxes whose score falls below this value are filtered out
    :param sigma: parameter of the Gaussian decay exp(-IoU^2 / sigma)
    :param mode: 'union' (standard IoU) or 'min' (intersection over the smaller area)
    :return: lists of kept boxes, labels and scores
    """
    unique_labels = labels.unique()    # class labels present, e.g. the 20 Pascal VOC classes

    box_keep = []
    labels_keep = []
    scores_keep = []
    for c in unique_labels:             # per-class processing, as in NMS (step 1)
        c_boxes = bboxes[labels == c]   # bboxes, scores and labels are aligned, so label == c selects class c
        c_scores = scores[labels == c]  # original scores of this class
        weights = c_scores.clone()      # decayed scores, used for ranking and filtering
        x1 = c_boxes[:, 0]
        y1 = c_boxes[:, 1]
        x2 = c_boxes[:, 2]
        y2 = c_boxes[:, 3]
        areas = (x2 - x1 + 1) * (y2 - y1 + 1)         # box areas
        _, order = weights.sort(0, descending=True)   # sort by descending score (NMS step 2)
        while order.numel() > 0:                      # NMS step 5
            i = order[0]                              # current top-1 box: keep it
            box_keep.append(c_boxes[i])               # keep the box
            labels_keep.append(c)                     # keep the class id
            scores_keep.append(c_scores[i])           # keep the original class score

            if order.numel() == 1:  # only one box left, move on to the next class
                break

            rest = order[1:]                          # every box except the top-1
            xx1 = x1[rest].clamp(min=x1[i])           # corners of the intersection with the top-1 box
            yy1 = y1[rest].clamp(min=y1[i])
            xx2 = x2[rest].clamp(max=x2[i])
            yy2 = y2[rest].clamp(max=y2[i])

            w = (xx2 - xx1 + 1).clamp(min=0)          # width, height and area of the intersection
            h = (yy2 - yy1 + 1).clamp(min=0)
            inter = w * h

            # two ways to compute the denominator of the overlap: union or min
            if mode == 'union':
                ovr = inter / (areas[i] + areas[rest] - inter)
            elif mode == 'min':
                ovr = inter / areas[rest].clamp(max=areas[i])
            else:
                raise TypeError('Unknown nms mode: %s.' % mode)

            # Indices (into rest) of the high-IoU boxes, like np.where(ovr >= nms_threshold)[0].
            # Original NMS removes these boxes; soft-NMS lowers their scores instead.
            ids_t = (ovr >= nms_threshold).nonzero().squeeze(1)

            # Gaussian decay of the score, exp(-IoU^2 / sigma); rest[ids_t] maps back to indices of weights
            weights[rest[ids_t]] *= torch.exp(-(ovr[ids_t] * ovr[ids_t]) / sigma)

            # soft_threshold filters on the decayed score: boxes whose score became too small are dropped
            ids = (weights[rest] >= soft_threshold).nonzero().squeeze(1)   # boxes that survive this round
            if ids.numel() == 0:   # everything was suppressed, move on to the next class
                break

            # Keep boxes, original scores and decayed scores aligned after filtering
            c_boxes = c_boxes[rest][ids]
            c_scores = c_scores[rest][ids]
            weights = weights[rest][ids]
            _, order = weights.sort(0, descending=True)

            x1 = c_boxes[:, 0]   # the boxes were filtered, so coordinates and areas are recomputed
            y1 = c_boxes[:, 1]
            x2 = c_boxes[:, 2]
            y2 = c_boxes[:, 3]
            areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    return box_keep, labels_keep, scores_keep    # scores_keep holds the original scores; decayed scores are only used inside soft-nms
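To see the practical difference between hard NMS and Soft-NMS, the following single-class, pure-Python sketch runs both on three hand-made boxes. It is a simplified illustration, not the implementation above: the helpers `iou`, `hard_nms` and `soft_nms` and the box values are hypothetical, and the "+1" pixel convention is omitted.

```python
import math


def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    w = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    h = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def hard_nms(boxes, scores, iou_threshold=0.3):
    """Classic NMS: drop every box whose IoU with the kept box exceeds the threshold."""
    remaining = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    keep = []
    while remaining:
        best = remaining.pop(0)
        keep.append(best)
        remaining = [k for k in remaining if iou(boxes[best], boxes[k]) <= iou_threshold]
    return keep


def soft_nms(boxes, scores, iou_threshold=0.3, score_threshold=0.3, sigma=0.5):
    """Soft-NMS: decay the scores of overlapping boxes, then filter by score."""
    scores = list(scores)                  # working copy that gets decayed
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda k: scores[k])
        keep.append(best)
        remaining.remove(best)
        survivors = []
        for k in remaining:
            overlap = iou(boxes[best], boxes[k])
            if overlap >= iou_threshold:
                scores[k] *= math.exp(-(overlap * overlap) / sigma)
            if scores[k] >= score_threshold:
                survivors.append(k)
        remaining = survivors
    return keep


boxes = [(0, 0, 10, 10), (3, 0, 13, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]

kept_hard = hard_nms(boxes, scores)   # box 1 overlaps box 0 too much and is dropped
kept_soft = soft_nms(boxes, scores)   # box 1 is only down-weighted and survives
```

With these values, boxes 0 and 1 have an IoU of about 0.54. Hard NMS removes box 1, while Soft-NMS lowers its score to roughly 0.45, which is still above the score threshold, so it is kept but ranked after box 2.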

3 ROIPooling and ROIAlign

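Reference for the principles of ROIPooling and ROIAlign: https://www.cnblogs.com/ranjiewen/articles/8869703.html
1 ROIPooling suffers from a misalignment problem because ROI coordinates are rounded to integer feature-map cells. This matters most for small objects: a small offset on the feature map becomes a fairly large offset once it is mapped back to the original image. ROIAlign instead computes the values at the mapped feature-map positions with bilinear interpolation, which reduces the error.
2 For backpropagation in conventional pooling layers: max pooling passes the whole gradient to the maximum unit, while average pooling spreads the gradient evenly over the whole bin.
3 ROIPooling likewise passes the gradient to the maximum unit of each bin. For ROIAlign, the four neighbouring points (top, bottom, left, right) that take part in each bilinear interpolation should all receive gradient, with coefficients that depend on their h and w distances to the sampled point; I am still studying how exactly this is implemented.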
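The misalignment and the bilinear fix can be illustrated with a few lines of plain Python. This is only a sketch under assumptions: the stride of 16, the ROI coordinate and the helper `bilinear_sample` are chosen for the example and do not come from any particular library.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D list `feature` at the fractional position (y, x)."""
    y0, x0 = math.floor(y), math.floor(x)
    y1 = min(y0 + 1, len(feature) - 1)       # stay inside the map at the border
    x1 = min(x0 + 1, len(feature[0]) - 1)
    dy, dx = y - y0, x - x0
    return (feature[y0][x0] * (1 - dy) * (1 - dx)
            + feature[y0][x1] * (1 - dy) * dx
            + feature[y1][x0] * dy * (1 - dx)
            + feature[y1][x1] * dy * dx)


stride = 16                          # assumed downsampling factor of the backbone
x_image = 200                        # hypothetical ROI edge, in image pixels
x_feat = x_image / stride            # 12.5 on the feature map
x_quantized = math.floor(x_feat)     # ROIPooling-style rounding gives 12
offset_in_image = (x_feat - x_quantized) * stride   # half a cell is 8 image pixels

feature = [[0.0, 1.0],
           [2.0, 3.0]]
center_value = bilinear_sample(feature, 0.5, 0.5)   # equal mix of the four cells
```

Half a feature-map cell of rounding error corresponds to 8 pixels in the image at stride 16, which is large relative to a small object. Sampling at the fractional position avoids the rounding altogether.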
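One way to picture the two backward passes is sketched below in plain Python. It is an illustration of the idea only, not how any specific framework implements it: the helpers `bilinear_weights`, `backward_sample` and `backward_max_pool` are hypothetical, and the sample point is assumed to lie in the interior of the feature map. Real ROIAlign layers typically pool several sample points per bin, so each sample would receive only its share of the bin's gradient.

```python
import math


def bilinear_weights(y, x):
    """Return [((row, col), weight), ...] for the four neighbours of (y, x)."""
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    return [((y0, x0), (1 - dy) * (1 - dx)),
            ((y0, x0 + 1), (1 - dy) * dx),
            ((y0 + 1, x0), dy * (1 - dx)),
            ((y0 + 1, x0 + 1), dy * dx)]


def backward_sample(grad_feature, y, x, upstream_grad):
    """ROIAlign-style: spread the gradient of one sample over its four neighbours."""
    for (row, col), weight in bilinear_weights(y, x):
        grad_feature[row][col] += weight * upstream_grad


def backward_max_pool(grad_feature, feature, bin_cells, upstream_grad):
    """ROIPooling-style: the whole gradient goes to the max cell of the bin."""
    row, col = max(bin_cells, key=lambda rc: feature[rc[0]][rc[1]])
    grad_feature[row][col] += upstream_grad


# ROIAlign-style backward for a sample at (y=0.25, x=0.75)
grad_align = [[0.0] * 3 for _ in range(3)]
backward_sample(grad_align, 0.25, 0.75, upstream_grad=1.0)

# ROIPooling-style backward for a 2x2 bin
feature = [[1.0, 5.0],
           [2.0, 3.0]]
grad_pool = [[0.0] * 2 for _ in range(2)]
backward_max_pool(grad_pool, feature, [(0, 0), (0, 1), (1, 0), (1, 1)], upstream_grad=1.0)
```

The closer a neighbour is to the sampled position, the larger its share of the gradient, and the four shares sum to the upstream gradient. In the max-pooling case only the cell holding the maximum (here the value 5.0) receives anything.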