YOLOv3 Source Code Reading, Part 5: nms_utils.py

1. Introduction to YOLO

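  YOLO (You Only Look Once) is an efficient object detection algorithm in the one-stage family. Two-stage detectors are generally slow, and YOLO answered this by introducing the one-stage approach, in which object classification and object localization are completed in a single step. YOLO regresses the bounding box positions and the class of each bounding box directly at the output layer, which is what makes it one-stage.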

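  After two iterations, the latest version of YOLO at the time of writing is YOLOv3. Building on the first two versions, YOLOv3 makes a number of fairly detailed changes that improve its results.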

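  This post annotates the source code to help my own study, and I am happy to share it so we can learn together. I am still a student, so if you spot a mistake, please point it out.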

  The source code referenced in this post: https://github.com/wizyoung/YOLOv3_TensorFlow

2. Code and Comments

  File path: YOUR_PATH\YOLOv3_TensorFlow-master\utils\nms_utils.py

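  This file implements non-maximum suppression (NMS). The principle is the same in every implementation, and the procedure is roughly as follows:

  • First, sort the candidate boxes by confidence in descending order.
  • Take out the box that currently has the highest confidence.
  • Compute the IoU between each remaining box and the box just taken out.
  • Check each IoU in turn. If it is above the threshold, the corresponding box is suppressed by the selected box, so only the boxes whose IoU does not exceed the threshold are kept.
  • Repeat steps 2 to 4 until every box has been processed.
  • Return all the boxes that were taken out; they are the NMS result.

  Note that NMS operates on boxes of a single class. If there are multiple classes, each class must be processed separately.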
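  The steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the repository's code: the names `iou_one_to_many` and `greedy_nms` and the toy boxes are assumptions made for this example, and boxes are assumed to be `[x1, y1, x2, y2]` with continuous coordinates.

```python
import numpy as np


def iou_one_to_many(box, others):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    # Width and height are clamped at 0 so disjoint boxes get zero overlap.
    inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Single-class greedy NMS; returns the indices of the kept boxes."""
    order = np.argsort(scores)[::-1]      # step 1: sort by confidence, descending
    keep = []
    while order.size > 0:
        best = order[0]                   # step 2: take the highest-scoring box
        keep.append(int(best))
        rest = order[1:]
        ious = iou_one_to_many(boxes[best], boxes[rest])  # step 3: IoU with the rest
        order = rest[ious <= iou_thresh]  # step 4: drop the suppressed boxes
    return keep


# Toy data (hypothetical): two heavily overlapping boxes and one separate box.
boxes = np.array([[10, 10, 50, 50],
                  [12, 12, 52, 52],
                  [100, 100, 140, 140]], dtype=np.float64)
scores = np.array([0.9, 0.8, 0.7])

kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

  In this toy case the second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap at all and survives. For several classes, the same routine would be run once per class.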

# coding: utf-8

from __future__ import division, print_function

import numpy as np
import tensorflow as tf


def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
    """
    Perform NMS on GPU using TensorFlow.

    params:
        boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image
        scores: tensor of shape [1, 10647, num_classes], score=conf*prob
        num_classes: total number of classes
        max_boxes: integer, maximum number of predicted boxes you'd like, default is 50
        score_thresh: if [ highest class probability score < score_threshold]
                        then get rid of the corresponding box
        nms_thresh: real value, "intersection over union" threshold used for NMS filtering
    """

    boxes_list, label_list, score_list = [], [], []
    max_boxes = tf.constant(max_boxes, dtype='int32')

    # since we do nms for single image, then reshape it
    boxes = tf.reshape(boxes, [-1, 4])  # '-1' means we don't know the exact number of boxes
    score = tf.reshape(scores, [-1, num_classes])

    # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
    mask = tf.greater_equal(score, tf.constant(score_thresh))
    # Step 2: Do non_max_suppression for each class
    for i in range(num_classes):
        # Step 3: Apply the mask to scores, boxes and pick them out
        filter_boxes = tf.boolean_mask(boxes, mask[:, i])
        filter_score = tf.boolean_mask(score[:, i], mask[:, i])
        nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
                                                   scores=filter_score,
                                                   max_output_size=max_boxes,
                                                   iou_threshold=nms_thresh, name='nms_indices')
        label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32') * i)
        boxes_list.append(tf.gather(filter_boxes, nms_indices))
        score_list.append(tf.gather(filter_score, nms_indices))

    boxes = tf.concat(boxes_list, axis=0)
    score = tf.concat(score_list, axis=0)
    label = tf.concat(label_list, axis=0)

    return boxes, score, label


def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
    """
    Pure Python NMS baseline.

    Arguments: boxes: shape of [-1, 4], where '-1' means that the exact number
                      of boxes is not known in advance
               scores: shape of [-1,]
               max_boxes: representing the maximum of boxes to be selected by non_max_suppression
               iou_thresh: representing iou_threshold for deciding to keep boxes
    """
    assert boxes.shape[1] == 4 and len(scores.shape) == 1

    # The next few lines compute the area of each box; the boxes are then sorted by score
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1) * (y2 - y1)
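    # Sort the boxes by score in descending order; argsort returns the box indices in sorted order.
    # In effect, order holds the indices of the boxes that still need to be processed.
    order = scores.argsort()[::-1]

    # keep stores the indices of the boxes that are retained
    keep = []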

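    # Loop while there are still unprocessed box indices
    while order.size > 0:
        # order was sorted beforehand, so its first entry is always the highest-scoring box
        i = order[0]
        # Save this index
        keep.append(i)

        # The code below computes the IoU between the selected box and all remaining boxes.
        # The first entry is the selected box itself, so order[1:] picks out every other box.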
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

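        # NOTE: the overlap uses the "+1" pixel convention while areas above does not,
        # so the two conventions are mixed and ovr can exceed 1 for near-identical boxes.
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        # IoU computation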
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

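        # Keep the indices of the boxes whose IoU with the selected box does not exceed the
        # threshold; boxes above the threshold have been suppressed by the selected box
        inds = np.where(ovr <= iou_thresh)[0]
        # Update order and start the next iteration; inds indexes order[1:], hence the +1.
        order = order[inds + 1]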

    # Finally, return at most max_boxes box indices
    return keep[:max_boxes]


def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
    """
    Perform NMS on CPU.
    Arguments:
        boxes: shape [1, 10647, 4]
        scores: shape [1, 10647, num_classes]
    """

    boxes = boxes.reshape(-1, 4)
    scores = scores.reshape(-1, num_classes)
    # Picked bounding boxes
    picked_boxes, picked_score, picked_label = [], [], []

    for i in range(num_classes):
        indices = np.where(scores[:, i] >= score_thresh)
        filter_boxes = boxes[indices]
        filter_scores = scores[:, i][indices]
        if len(filter_boxes) == 0:
            continue
        # do non_max_suppression on the cpu
        indices = py_nms(filter_boxes, filter_scores,
                         max_boxes=max_boxes, iou_thresh=iou_thresh)
        picked_boxes.append(filter_boxes[indices])
        picked_score.append(filter_scores[indices])
        picked_label.append(np.ones(len(indices), dtype='int32') * i)
    if len(picked_boxes) == 0:
        return None, None, None

    boxes = np.concatenate(picked_boxes, axis=0)
    score = np.concatenate(picked_score, axis=0)
    label = np.concatenate(picked_label, axis=0)

    return boxes, score, label
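
  One detail of py_nms is worth checking by hand: areas is computed as (x2 - x1) * (y2 - y1), while the overlap width and height add 1. The short sketch below reproduces that arithmetic for a single pair of boxes and compares it with a variant that drops the +1 everywhere. Both helper names (`iou_as_listed`, `iou_consistent`) are made up for this example, and the variant is only one possible way to make the two terms agree, not something taken from the repository.

```python
import numpy as np


def iou_as_listed(a, b):
    """Same arithmetic as py_nms above: areas without +1, overlap with +1."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = w * h
    return inter / (area_a + area_b - inter)


def iou_consistent(a, b):
    """Hypothetical variant: continuous coordinates, no +1 in either term."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    return inter / (area_a + area_b - inter)


# Compare a 10x10 box with itself; a well-formed IoU should be exactly 1.
box = np.array([0.0, 0.0, 10.0, 10.0])
mixed = iou_as_listed(box, box)
consistent = iou_consistent(box, box)
print(round(mixed, 3), consistent)  # 1.532 1.0
```

  With the mixed convention the overlap is 11 * 11 = 121 while each area is 100, so the ratio is 121 / 79, which is greater than 1. For large boxes the effect on the comparison with iou_thresh is small, but it grows as the boxes get smaller.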
