Exporting a YOLOv8 model to ONNX under PyTorch and cleaning up the detections with NMS (full code at the end)

焚詩作薪

Last modified 2024-07-15 15:27:47


Tags: object detection, YOLO, PyTorch, artificial intelligence, deep learning

First published 2024-07-15 15:21:21

Article link: https://blog.csdn.net/qq_64809150/article/details/140436333


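This post exports the trained model to ONNX and runs it in a PyTorch environment. (My actual reason for exporting to ONNX is deployment, so you can skip the export if you don't need it.)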

I. Exporting to ONNX

First, export the trained model. With the official training procedure, the resulting weights file is named best.pt by default:

from ultralytics import YOLO

# Load the trained weights and export them to ONNX
model = YOLO('runs/detect/train/weights/best.pt')
success = model.export(format="onnx", simplify=True)
assert success
print("Export succeeded")

This produces a model file named best.onnx. Next, we run that model and process its output.

II. Defining the helper functions

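We define three functions: one that standardizes the model output, one for non-maximum suppression (NMS), and one for intersection over union (IoU).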

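import numpy as np


def std_output(pred):
    # (1, 4 + num_classes, N) -> (N, 4 + num_classes): one row per box
    pred = np.squeeze(pred)
    pred = np.transpose(pred, (1, 0))
    # Class scores follow the 4 box values; keep the best score per box
    pred_class = pred[..., 4:]
    pred_conf = np.max(pred_class, axis=-1)
    # Insert the best score as column 4, right after the box values
    pred = np.insert(pred, 4, pred_conf, axis=-1)
    return pred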


def nms(detections, iou_threshold):
    if len(detections) == 0:
        return []

    # Sort by confidence, highest first
    detections = detections[detections[:, 4].argsort()[::-1]]

    keep = []
    while len(detections) > 0:
        # Take the box with the highest confidence
        highest = detections[0]
        keep.append(highest)

        if len(detections) == 1:
            break

        # IoU of every remaining box with the highest-confidence box
        rest_boxes = detections[1:]
        iou = compute_iou(highest, rest_boxes)

        # Keep only the boxes whose IoU is below the threshold
        detections = rest_boxes[iou < iou_threshold]

    return np.array(keep)


def compute_iou(box1, boxes):
    # Boxes are [cx, cy, w, h, ...]; convert to corners and intersect
    x1 = np.maximum(box1[0] - box1[2] / 2, boxes[:, 0] - boxes[:, 2] / 2)
    y1 = np.maximum(box1[1] - box1[3] / 2, boxes[:, 1] - boxes[:, 3] / 2)
    x2 = np.minimum(box1[0] + box1[2] / 2, boxes[:, 0] + boxes[:, 2] / 2)
    y2 = np.minimum(box1[1] + box1[3] / 2, boxes[:, 1] + boxes[:, 3] / 2)

    # Intersection area is zero when the boxes do not overlap
    inter_area = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
    box1_area = box1[2] * box1[3]
    boxes_area = boxes[:, 2] * boxes[:, 3]

    # IoU = intersection / union
    iou = inter_area / (box1_area + boxes_area - inter_area)
    return iou

1. Data structure

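The layout of the output array needs some explanation. In my case the model output has shape (1, 7, 8400); in general YOLOv8 outputs (1, 4 + number of classes, 8400) for a 640×640 input. Think of it as a single 7×8400 matrix, so only that matrix matters. 8400 is the number of candidate boxes (I loosely call them anchor boxes, although YOLOv8 is anchor-free). 7 is the 4 box values plus the 3 classes my model recognizes (see the config file in my previous post, which defines 3 classes). In each column, the first 4 values describe the box as center x, center y, width and height in pixels of the 640×640 input, which is the format the IoU code above assumes. The remaining values are the per-class confidence scores, which tell us what the box most likely contains: there is one score per label. The length of this dimension therefore depends on how many labels your model recognizes, but the first 4 values are always the box.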
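
To make the shape concrete, here is a small sketch of where the numbers come from. It assumes the standard YOLOv8 detection head with strides 8, 16 and 32 and a square 640×640 input. The helper name `expected_output_shape` is my own invention for illustration and is not part of ultralytics.

```python
def expected_output_shape(num_classes, img_size=640, strides=(8, 16, 32)):
    """Return the (batch, channels, boxes) shape of the raw YOLOv8 output.

    Hypothetical helper for illustration; assumes a square input and
    one prediction per grid cell at each stride.
    """
    # 80*80 + 40*40 + 20*20 grid cells for a 640x640 input
    num_boxes = sum((img_size // s) ** 2 for s in strides)
    # cx, cy, w, h followed by one score per class
    channels = 4 + num_classes
    return (1, channels, num_boxes)


print(expected_output_shape(3))   # (1, 7, 8400)
print(expected_output_shape(80))  # (1, 84, 8400)
```

So a 3-class model like mine gives 7 channels, while the 80-class COCO weights would give 84.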

2. How the functions work

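Now for the three functions, starting with std_output. As described above, we drop the first (batch) dimension and then transpose. Why transpose? Because an 8400×7 matrix is easy to slice: every row is one box, whereas the original layout is awkward to work with. After slicing off the 3 class scores that follow the 4 box values, we take the maximum score across classes for each box and insert it back into the array right after the 4 box values. The array returned by the function therefore has shape (8400, 8).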
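
The same steps can be tried on a tiny fake array to see the shapes change. This is only a sketch: the input is random numbers with 5 boxes instead of 8400, and the `class_id` line at the end is my addition to show how the winning class could be read back out.

```python
import numpy as np

# Fake raw output: batch of 1, 7 channels (4 box values + 3 class scores), 5 boxes.
rng = np.random.default_rng(0)
raw = rng.random((1, 7, 5)).astype(np.float32)

pred = np.squeeze(raw)              # (7, 5): batch dimension removed
pred = np.transpose(pred, (1, 0))   # (5, 7): one row per box
class_scores = pred[:, 4:]          # (5, 3): scores for the 3 classes
best_score = class_scores.max(axis=-1)
pred = np.insert(pred, 4, best_score, axis=-1)  # (5, 8)

# Row layout is now [cx, cy, w, h, best_score, score_0, score_1, score_2].
class_id = pred[:, 5:].argmax(axis=-1)  # index of the most likely class per box

print(pred.shape)  # (5, 8)
```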

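Next come NMS and IoU. First the idea. Suppose the image contains a puppy. A blue box covers the whole puppy (this is the top box, i.e. the one with the highest value in the fifth column, the maximum confidence we just inserted), while a red box covers only half of it. The code computes the intersection over union of the two boxes, i.e. the area of their intersection divided by the area of their union. If that IoU is below the threshold I set (0.2 in this example), the red box is treated as a different object, kept, and passed on to the next round; otherwise it is discarded as a duplicate of the blue box. The top box itself is appended to the keep array. Repeating this on the survivors reduces the 8400 boxes to the highest-confidence box of each group of overlapping boxes, so the algorithm is really just picking local maxima. The while loop in nms performs these comparisons against the IoU threshold, and compute_iou returns an array holding the IoU of every remaining box with the current top box, which nms uses for the comparison.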
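
Here is the puppy example worked through with numbers. The coordinates and confidences are made up for illustration, and `compute_iou` is repeated (in slightly condensed form) so the snippet runs on its own.

```python
import numpy as np


def compute_iou(box1, boxes):
    """IoU between one box and an array of boxes, all as [cx, cy, w, h, ...]."""
    x1 = np.maximum(box1[0] - box1[2] / 2, boxes[:, 0] - boxes[:, 2] / 2)
    y1 = np.maximum(box1[1] - box1[3] / 2, boxes[:, 1] - boxes[:, 3] / 2)
    x2 = np.minimum(box1[0] + box1[2] / 2, boxes[:, 0] + boxes[:, 2] / 2)
    y2 = np.minimum(box1[1] + box1[3] / 2, boxes[:, 1] + boxes[:, 3] / 2)
    inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
    union = box1[2] * box1[3] + boxes[:, 2] * boxes[:, 3] - inter
    return inter / union


# Columns: cx, cy, w, h, confidence (made-up numbers).
blue = np.array([50.0, 50.0, 100.0, 100.0, 0.9])   # covers the whole puppy
others = np.array([
    [75.0, 50.0, 50.0, 100.0, 0.6],    # red: right half of the blue box
    [300.0, 300.0, 40.0, 40.0, 0.5],   # a separate object far away
])

iou = compute_iou(blue, others)
survivors = others[iou < 0.2]  # boxes that move on to the next NMS round

print(iou)             # [0.5 0. ]
print(len(survivors))  # 1
```

The red box shares 5000 of the 10000 pixels in the union, so its IoU of 0.5 is above the 0.2 threshold and it is dropped. The far-away box has no overlap at all and survives to be considered in the next round.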

Finally, here is how the functions above are used:

import onnxruntime
from PIL import Image
import torchvision.transforms as transforms

# Load the ONNX model
onnx_model_path = 'runs/detect/train4/weights/best.onnx'
ort_session = onnxruntime.InferenceSession(onnx_model_path)

# Load and preprocess the image (force 3 channels, resize to the model input size)
image_path = 'ricedata/test/images/077.jpg'
image = Image.open(image_path).convert('RGB')
preprocess = transforms.Compose([
    transforms.Resize((640, 640)),
    transforms.ToTensor(),
])
input_data = preprocess(image).unsqueeze(0).numpy()  # add a batch dimension and convert to a NumPy array

# Run inference; 'images' is the input name the exported model expects
outputs = ort_session.run(None, {'images': input_data})

# Take the first output array
output_array = outputs[0]

# Number of candidate boxes
num_detections = output_array.shape[2]

# Confidence threshold
confidence_threshold = 0.1
new_array = std_output(output_array)

# Drop low-confidence boxes
filtered_detections = new_array[new_array[:, 4] > confidence_threshold]

# Apply non-maximum suppression
nms_detections = nms(filtered_detections, iou_threshold=0.2)
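
The rows returned by nms still use the [cx, cy, w, h, conf, class scores...] layout in 640×640 coordinates. Below is a sketch of how they could be turned into corner coordinates on the original image together with a class index. `decode_detections` is a hypothetical helper of mine, and it assumes the image was resized straight to 640×640 without letterboxing, as in the preprocessing above.

```python
import numpy as np


def decode_detections(detections, original_size, input_size=(640, 640)):
    """Convert rows of [cx, cy, w, h, conf, class scores...] into
    rows of [x0, y0, x1, y1, conf, class_id] in original-image pixels.

    Hypothetical helper; original_size is (width, height) as given by
    PIL's image.size, and a plain resize (no letterbox) is assumed.
    """
    detections = np.asarray(detections, dtype=np.float32)
    if detections.size == 0:
        # nms returns an empty list when nothing passes the threshold
        return np.empty((0, 6), dtype=np.float32)

    scale_x = original_size[0] / input_size[0]
    scale_y = original_size[1] / input_size[1]

    cx, cy = detections[:, 0], detections[:, 1]
    w, h = detections[:, 2], detections[:, 3]
    x0 = (cx - w / 2) * scale_x
    y0 = (cy - h / 2) * scale_y
    x1 = (cx + w / 2) * scale_x
    y1 = (cy + h / 2) * scale_y

    # Column 4 is the best score; the per-class scores start at column 5
    class_id = detections[:, 5:].argmax(axis=1)
    return np.column_stack([x0, y0, x1, y1, detections[:, 4], class_id])


# One made-up detection on a 1280x960 image
sample = np.array([[320.0, 320.0, 100.0, 200.0, 0.8, 0.1, 0.8, 0.3]])
print(decode_detections(sample, (1280, 960)))
```

For the sample row the box becomes (540, 330) to (740, 630) with class index 1, since the image is twice as wide and 1.5 times as tall as the model input.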

3. Result

All that's left is to draw the processed boxes on the image. The result is shown below (the box color came out a bit faint, haha):

III. Full code

# from ultralytics import YOLO
# model = YOLO('runs/detect/train3/weights/best.pt')
# success = model.export(format="onnx", simplify=True)
# assert success
# print("Export succeeded")

import onnxruntime
import numpy as np
from PIL import Image, ImageDraw
import torchvision.transforms as transforms


def std_output(pred):
    pred = np.squeeze(pred)
    pred = np.transpose(pred, (1, 0))
    pred_class = pred[..., 4:]
    pred_conf = np.max(pred_class, axis=-1)
    pred = np.insert(pred, 4, pred_conf, axis=-1)
    return pred


def nms(detections, iou_threshold):
    if len(detections) == 0:
        return []

    # Sort by confidence, highest first
    detections = detections[detections[:, 4].argsort()[::-1]]

    keep = []
    while len(detections) > 0:
        # Take the box with the highest confidence
        highest = detections[0]
        keep.append(highest)

        if len(detections) == 1:
            break

        # IoU of every remaining box with the highest-confidence box
        rest_boxes = detections[1:]
        iou = compute_iou(highest, rest_boxes)

        # Keep only the boxes whose IoU is below the threshold
        detections = rest_boxes[iou < iou_threshold]
    return np.array(keep)


def compute_iou(box1, boxes):
    x1 = np.maximum(box1[0] - box1[2] / 2, boxes[:, 0] - boxes[:, 2] / 2)
    y1 = np.maximum(box1[1] - box1[3] / 2, boxes[:, 1] - boxes[:, 3] / 2)
    x2 = np.minimum(box1[0] + box1[2] / 2, boxes[:, 0] + boxes[:, 2] / 2)
    y2 = np.minimum(box1[1] + box1[3] / 2, boxes[:, 1] + boxes[:, 3] / 2)

    inter_area = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
    box1_area = box1[2] * box1[3]
    boxes_area = boxes[:, 2] * boxes[:, 3]

    iou = inter_area / (box1_area + boxes_area - inter_area)
    return iou


# Load the ONNX model
onnx_model_path = 'runs/detect/train4/weights/best.onnx'
ort_session = onnxruntime.InferenceSession(onnx_model_path)

# Load and preprocess the image (force 3 channels, resize to the model input size)
image_path = 'ricedata/test/images/077.jpg'
image = Image.open(image_path).convert('RGB')
preprocess = transforms.Compose([
    transforms.Resize((640, 640)),
    transforms.ToTensor(),
])
input_data = preprocess(image).unsqueeze(0).numpy()  # add a batch dimension and convert to a NumPy array

# Run inference; 'images' is the input name the exported model expects
outputs = ort_session.run(None, {'images': input_data})

# Take the first output array
output_array = outputs[0]

# Number of candidate boxes
num_detections = output_array.shape[2]

# Confidence threshold
confidence_threshold = 0.1
new_array = std_output(output_array)

# Drop low-confidence boxes
filtered_detections = new_array[new_array[:, 4] > confidence_threshold]

# Apply non-maximum suppression
nms_detections = nms(filtered_detections, iou_threshold=0.2)

# Reopen the original image for drawing
image = Image.open(image_path).convert('RGB')

# Create the drawing object
draw = ImageDraw.Draw(image)

# Scale factors from the 640x640 model input back to the original image size
original_image_size = image.size
feature_map_size = (640, 640)
scale_x = original_image_size[0] / feature_map_size[0]
scale_y = original_image_size[1] / feature_map_size[1]

# Draw the boxes that survived NMS
for detection in nms_detections:
    bbox = detection[:4]
    x0 = (bbox[0] - bbox[2] / 2) * scale_x
    y0 = (bbox[1] - bbox[3] / 2) * scale_y
    x1 = (bbox[0] + bbox[2] / 2) * scale_x
    y1 = (bbox[1] + bbox[3] / 2) * scale_y

    # Draw the bounding box
    draw.rectangle([(x0, y0), (x1, y1)], outline="red")

# Show the image
image.show()

Previous post:

Object detection with a YOLOv8 model under PyTorch (the simplest and quickest on the web, clear at a glance) - CSDN blog: https://blog.csdn.net/qq_64809150/article/details/140435678?spm=1001.2014.3001.5501
