NMS Workflow and Example Code

我是一个对称矩阵

Published 2024-08-26 13:33:04


Article link: https://blog.csdn.net/qq_40243750/article/details/141561147

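This post does not revisit why NMS matters in object detection; it summarizes the method and its workflow.
Take the 61440×6 output of a YOLO model as an example. The model emits 61440 bboxes in total (the image actually contains only 3 targets), each in the form [cx,cy,w,h,conf,cls_score]: the four bbox values, the confidence, and the class score. This task has a single class, so cls_score≈1. With multiple classes the bbox format becomes [cx,cy,w,h,conf,cls1_score,cls2_score,…], and the class scores sum to 1.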

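step 1. Filtering
Most of the 61440 bboxes sit on background and have very low confidence, so the first step is to discard them. With the data in ndarray form, the code below keeps the bboxes whose confidence is at least conf_thresh (commonly set to 0.5). In this example filtered_data ends up with only 18 bboxes.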

# Keep the rows whose conf is greater than or equal to conf_thresh
filtered_indices = output[:, 4] >= conf_thresh  # output[:, 4] is the conf column
filtered_data = output[filtered_indices, :]  # keep only the rows that passed the filter
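
To see the filter in isolation, the sketch below applies the same boolean mask to a small hand-made array. The five rows and their values are invented for illustration and are not taken from the model output described above.

```python
import numpy as np

# Hypothetical toy output: each row is [cx, cy, w, h, conf, cls_score].
output = np.array([
    [100.0, 100.0, 40.0, 40.0, 0.92, 0.99],
    [102.0, 101.0, 42.0, 38.0, 0.85, 0.98],
    [300.0, 200.0, 50.0, 60.0, 0.10, 0.97],  # background, low conf
    [305.0, 198.0, 48.0, 62.0, 0.50, 0.99],  # exactly at the threshold, kept by >=
    [500.0, 400.0, 30.0, 30.0, 0.03, 0.95],  # background, low conf
])
conf_thresh = 0.5

keep_mask = output[:, 4] >= conf_thresh  # boolean vector, one entry per row
filtered_data = output[keep_mask]        # rows where the mask is True

print(keep_mask)
print(filtered_data.shape)
```

Three of the five toy rows survive, and the result keeps all six columns, so the later steps can still read conf and cls_score from it.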

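step 2. Grouping by class
Bboxes of different classes at the same location may have a high IOU with each other, but they are not what NMS is meant to suppress. The bboxes therefore have to be grouped by class first, and the IOU filtering is then done within each class. For example, use a dictionary whose key is the class and whose value is the list of bboxes belonging to that class: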

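class_bbox_dict = {}
for bbox in filtered_data:
    class_id = int(round(bbox[5]))  # single-class case: cls_score≈1 rounds to key 1
    if class_id in class_bbox_dict:
        class_bbox_dict[class_id].append(bbox)  # key already exists: append the new bbox
    else:
        class_bbox_dict[class_id] = [bbox]  # no key yet: create a list holding the current bbox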

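The result is shown in the figure. Because this task has only class 1, all bboxes are collected under key=1:
[Figure: the resulting class_bbox_dict]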
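
For the multi-class layout mentioned earlier ([cx,cy,w,h,conf,cls1_score,cls2_score,…]), rounding column 5 no longer identifies the class. One possible way to group in that case, assuming the class is the index of the largest class score, is sketched below with `collections.defaultdict`. The three rows are invented for illustration.

```python
from collections import defaultdict

import numpy as np

# Hypothetical rows: [cx, cy, w, h, conf, cls0_score, cls1_score, cls2_score]
filtered_data = np.array([
    [100.0, 100.0, 40.0, 40.0, 0.90, 0.80, 0.15, 0.05],
    [101.0,  99.0, 40.0, 40.0, 0.80, 0.10, 0.85, 0.05],  # same place, different class
    [300.0, 200.0, 50.0, 60.0, 0.70, 0.70, 0.20, 0.10],
])

class_bbox_dict = defaultdict(list)  # a missing key starts as an empty list
for bbox in filtered_data:
    class_id = int(np.argmax(bbox[5:]))  # index of the largest class score
    class_bbox_dict[class_id].append(bbox)

for class_id, bboxes in sorted(class_bbox_dict.items()):
    print(class_id, len(bboxes))
```

The first two toy boxes overlap almost completely but land under different keys, so the per-class IOU filtering never compares them with each other.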

step 3. Intra-class IOU filtering

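The figure for this step illustrates intra-class filtering. Suppose a class has 6 bboxes, sorted by conf in descending order. The first round takes the first bbox as the reference and computes the IOU of every remaining bbox with it. Any bbox whose IOU exceeds the threshold is discarded (the red bboxes, marked with None in the Python code below). In this round bbox2 and bbox4 have a high IOU with bbox1 and are discarded.

The second round takes the next valid bbox (bbox3) as the reference and computes the IOU of the remaining valid bboxes with it, i.e. bbox3 against bbox5 and bbox6. The IOU of bbox6 and bbox3 exceeds the threshold, so bbox6 is discarded.

In the third round bbox5 is the only valid bbox left, so it becomes a reference with nothing to compare against. The reference bboxes are the green ones, and in the code they are appended to result. In the end 3 bboxes (bbox1, bbox3, bbox5) are added to result and the rest are discarded.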
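
The rounds can be replayed on made-up data. In the sketch below the six boxes, their coordinates and the helper `iou_cxcywh` are all hypothetical; they are chosen only so that bbox1, bbox3 and bbox5 end up in the result.

```python
def iou_cxcywh(a, b):
    """IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes, already sorted by conf: (name, (cx, cy, w, h), conf)
boxes = [
    ("bbox1", (10, 10, 10, 10), 0.95),
    ("bbox2", (11, 10, 10, 10), 0.90),  # overlaps bbox1
    ("bbox3", (50, 50, 10, 10), 0.85),
    ("bbox4", (10, 11, 10, 10), 0.80),  # overlaps bbox1
    ("bbox5", (90, 90, 10, 10), 0.75),
    ("bbox6", (51, 50, 10, 10), 0.70),  # overlaps bbox3
]
iou_thresh = 0.5

suppressed = [False] * len(boxes)
kept, dropped = [], []
for i, (name_i, box_i, _) in enumerate(boxes):
    if suppressed[i]:
        continue  # already discarded in an earlier round
    kept.append(name_i)  # the current reference bbox goes into the result
    for j in range(i + 1, len(boxes)):
        if not suppressed[j] and iou_cxcywh(box_i, boxes[j][1]) > iou_thresh:
            suppressed[j] = True
            dropped.append((name_i, boxes[j][0]))

print(kept)
print(dropped)
```

Each entry of `dropped` pairs a reference bbox with the bbox it suppressed, which makes it easy to check the rounds against the description.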

The Python implementation is as follows:

result = []
for classid, bboxes in class_bbox_dict.items():  # visit each class and run NMS on its bboxes
    bboxes = sorted(bboxes, key=lambda x: -x[4])  # sort by confidence, descending
    for i in range(len(bboxes)):
        if bboxes[i] is not None:
            result.append(bboxes[i])
            for j in range(i + 1, len(bboxes)):
                if bboxes[j] is not None:
                    if compute_iou(bboxes[i][:4], bboxes[j][:4]) > iou_thresh:
                        bboxes[j] = None  # high IOU with the reference bbox: discard it, marked with None

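Summary
Looked at in detail, NMS is not hard. The walkthrough above covers a single class. With multiple classes NMS differs slightly, but the workflow is the same. The difference is that conf is multiplied by cls_score.
For example, for a bbox with 4 class probabilities, the class is the position of the largest of the 4 probabilities. In NMS, conf=conf*max(cls_score), so the new conf takes into account both the box's conf and its classification probability. The rest of the NMS procedure is unchanged.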
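
A minimal NumPy sketch of this score fusion follows. The two 4-class rows are invented for illustration, and the column layout [cx, cy, w, h, conf, cls0..cls3] is assumed from the format described at the start of the post.

```python
import numpy as np

# Hypothetical rows: [cx, cy, w, h, conf, cls0, cls1, cls2, cls3]
output = np.array([
    [100.0, 100.0, 40.0, 40.0, 0.90, 0.05, 0.80, 0.10, 0.05],
    [300.0, 200.0, 50.0, 60.0, 0.60, 0.25, 0.25, 0.30, 0.20],
])

cls_scores = output[:, 5:]
class_ids = cls_scores.argmax(axis=1)               # class = position of the largest probability
fused_conf = output[:, 4] * cls_scores.max(axis=1)  # conf = conf * max(cls_score)

print(class_ids)
print(fused_conf)
```

The fused values are about 0.72 and 0.18. The second box has conf 0.60 on its own, so if the confidence threshold is applied to the fused score, an uncertain classification can push a box below 0.5.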

The complete Python code is as follows:

import numpy as np
import cv2


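def cxcywh2xyxy(bbox):
    """
    Convert a box from the YOLO output format (cx, cy, w, h) to xyxy format
    @param bbox: box as (cx, cy, w, h)
    @return: box as [x1, y1, x2, y2]
    """
    cx, cy, w, h = bbox
    x1, y1, x2, y2 = cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    return [x1, y1, x2, y2]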


def compute_iou(box1, box2):
    """
    Compute the IoU of two rectangular boxes
    :param box1: first box, as (cx, cy, w, h); converted to (x1, y1, x2, y2) internally
    :param box2: second box, as (cx, cy, w, h); converted to (x1, y1, x2, y2) internally
    :return: IoU of the two boxes
    """
    # Convert both boxes to corner format
    box1 = cxcywh2xyxy(box1)
    box2 = cxcywh2xyxy(box2)

    # Intersection
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)

    # Union
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area

    # IoU (0 when the union is empty, to avoid dividing by zero)
    iou = inter_area / union_area if union_area > 0 else 0.0
    return iou


def NMS(img, output, conf_thresh, iou_thresh):
    """
    Filter bboxes by confidence ---> group bboxes by class and sort each class by conf ---> compute the IoU between bboxes within each class
    @param img: image the detections belong to (not used by the NMS itself)
    @param output: model output, one row per bbox: [cx, cy, w, h, conf, cls_score]
    @param conf_thresh: minimum confidence for a bbox to be kept
    @param iou_thresh: IoU above which a lower-confidence bbox is discarded
    @return: list of the bboxes that survive NMS
    """
    # Keep the rows whose conf is greater than or equal to conf_thresh
    filtered_indices = output[:, 4] >= conf_thresh  # output[:, 4] is the conf column
    filtered_data = output[filtered_indices, :]  # keep only the rows that passed the filter

    class_bbox_dict = {}

    for bbox in filtered_data:
        class_id = int(round(bbox[5]))  # single-class case: cls_score≈1 rounds to key 1
        if class_id in class_bbox_dict:
            class_bbox_dict[class_id].append(bbox)
        else:
            class_bbox_dict[class_id] = [bbox]

    result = []
    for classid, bboxes in class_bbox_dict.items():
        bboxes = sorted(bboxes, key=lambda x: -x[4])  # sort by confidence, descending
        for i in range(len(bboxes)):
            if bboxes[i] is not None:
                result.append(bboxes[i])
                for j in range(i + 1, len(bboxes)):
                    if bboxes[j] is not None:
                        if compute_iou(bboxes[i][:4], bboxes[j][:4]) > iou_thresh:
                            print(f"del {j}")
                            bboxes[j] = None
    print(result)
    return result


if __name__ == "__main__":
    output = np.load("sky500_coarse_2024_08_15_00_22_31_34.npy")  # load the saved output of a YOLO model and run NMS on it

    img = cv2.imread("sky500_coarse_2024_08_15_00_22_31_34.jpg")

    conf_thresh = 0.5
    iou_thresh = 0.5

    NMS(img, output, conf_thresh, iou_thresh)
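
The nested Python loops above call compute_iou once per pair. As an alternative, the sketch below computes the IoU of the reference box against all remaining boxes in one NumPy step. It is not the author's code: the function name `nms_single_class` and the toy boxes are hypothetical, and it handles a single class only, so grouping by class would still happen beforehand.

```python
import numpy as np


def nms_single_class(boxes_cxcywh, scores, iou_thresh=0.5):
    """Greedy NMS for one class; returns indices of kept boxes, best score first."""
    boxes = np.asarray(boxes_cxcywh, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1 = boxes[:, 0] - boxes[:, 2] / 2
    y1 = boxes[:, 1] - boxes[:, 3] / 2
    x2 = boxes[:, 0] + boxes[:, 2] / 2
    y2 = boxes[:, 1] + boxes[:, 3] / 2
    areas = boxes[:, 2] * boxes[:, 3]

    order = np.argsort(-scores)  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]             # current reference box
        keep.append(int(i))
        rest = order[1:]
        # IoU of the reference box against every remaining box at once
        iw = np.maximum(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0.0)
        ih = np.maximum(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0.0)
        inter = iw * ih
        union = areas[i] + areas[rest] - inter
        iou = np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)
        order = rest[iou <= iou_thresh]  # drop boxes that overlap the reference too much
    return keep


# Hypothetical boxes as (cx, cy, w, h) with made-up scores
boxes = [(10, 10, 10, 10), (11, 10, 10, 10), (50, 50, 10, 10),
         (10, 11, 10, 10), (90, 90, 10, 10), (51, 50, 10, 10)]
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70]
keep = nms_single_class(boxes, scores, iou_thresh=0.5)
print(keep)
```

The function returns indices rather than rows, so the caller can select the surviving rows from the original array with `output[keep]`.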

A C++ version:

// Detection, kMaxNumOutputBbox, cmp and iou are not defined in this snippet; they come from the surrounding project.
void nmsDet(std::vector<Detection> &src, std::vector<Detection> &res, float nms_thresh)
{
  std::map<float, std::vector<Detection>> m;
  for (int i = 0; i < static_cast<int>(src.size()) && i < kMaxNumOutputBbox; i++)
  {
    Detection det = src[i];
    // Check how many entries the map has for this class; 0 means no container exists for the class yet, otherwise push_back directly
    if (m.count(det.class_id) == 0)   // count returns the number of elements matching the key
    {
      m.emplace(det.class_id, std::vector<Detection>());  // insert a new per-class container
    }
    m[det.class_id].push_back(det);   // add det to its per-class container
  }
  for (auto it = m.begin(); it != m.end(); it++)  // visit each per-class container
  {
    auto &dets = it->second;           // the vector of detections for this class, sorted by confidence with cmp
    std::sort(dets.begin(), dets.end(), cmp);
    // Make sure no two detections in the result have an IOU above nms_thresh
    for (size_t i = 0; i < dets.size(); ++i)    // visit each Detection
    {
      auto &item = dets[i];
      res.push_back(item);        // store the current highest-conf detection in the result
      for (size_t j = i + 1; j < dets.size(); ++j)  // compare the remaining detections with the current one; erase dets[j] if the IOU exceeds nms_thresh
      {
        if (iou(item.bbox, dets[j].bbox) > nms_thresh)
        {
          dets.erase(dets.begin() + j);  // erase the overlapping detection
          --j;                           // stay on the same index, which now holds the next element
        }
      }
    }
  }
}