Continuing the task from the previous post, we again visualize the segmentation and the detection results on the same frame, but this time YOLO's plain object detection is replaced with SAHI sliced-inference results.
First, create the SAHI inference model:
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path=model_path,
    confidence_threshold=0.3,
    device='cuda:0',  # or 'cpu'
)
Next comes visualizing the SAHI results. The relevant imports are:
from sahi.predict import predict, get_sliced_prediction
from sahi.utils.cv import read_image_as_pil, visualize_object_predictions
The first function, get_sliced_prediction, is the sliced-inference call used in the previous post; the second function from sahi.utils.cv, visualize_object_predictions, handles the visualization:
annotated_frame = frame.copy()
results = get_sliced_prediction(
    annotated_frame,
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
fig = visualize_object_predictions(
    image=annotated_frame,
    object_prediction_list=results.object_prediction_list,
    output_dir="output_visuals",
    file_name="yolov5_prediction",
    rect_th=3,
    text_size=1.0,
    hide_conf=False,  # show confidence scores
)
This runs inference on the video frame and visualizes the result. The return value of visualize_object_predictions is nominally a SAHI visualization object, but it is actually a dict whose first key, 'image', holds the annotated frame as an ndarray. All that is left is to plot the YOLO segmentation results onto that image:
annotated_frame1 = results1[0].plot(
    img=fig['image'],
    conf=False,   # do not draw confidence scores
    boxes=True,
)
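To confirm the structure described above, a quick sanity check of the returned fig can be added (the exact keys may vary with the SAHI version, so treat the expected values in the comments as assumptions):

print(type(fig))            # expected: <class 'dict'>
print(fig.keys())           # expected to include 'image'
print(type(fig['image']))   # expected: <class 'numpy.ndarray'>
print(fig['image'].shape)   # (height, width, 3), same size as the input frame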
The complete code is below. Although SAHI is now in the pipeline, it turns out SAHI itself cannot do tracking, so later I will either try to add tracking on top of SAHI or drop it (a rough idea is sketched after the full code).
import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO
from sahi import AutoDetectionModel
from sahi.predict import predict, get_sliced_prediction
from sahi.utils.cv import read_image_as_pil, visualize_object_predictions
import numpy as np
import cv2
if __name__ == '__main__':
    model1 = YOLO('roadseg.pt')    # segmentation model
    model2 = YOLO('visdrone.pt')   # object detection model
    model_path = 'visdrone.pt'
    detection_model = AutoDetectionModel.from_pretrained(
        model_type="ultralytics",
        model_path=model_path,
        confidence_threshold=0.3,
        device='cuda:0',  # or 'cpu'
    )
    # The smaller the iou threshold, the more boxes are filtered out;
    # the larger the conf threshold, the more boxes are filtered out.
    video_path = "nx-fraction.mp4"
    cap = cv2.VideoCapture(video_path)
    success, test_frame = cap.read()
    video_height, video_width = test_frame.shape[:2]
    cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
    # Add a video writer during initialization
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter('seg-det.mp4', fourcc, 30.0, (video_width, video_height))
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        annotated_frame = frame.copy()
        results = get_sliced_prediction(
            annotated_frame,
            detection_model,
            slice_height=640,
            slice_width=640,
            overlap_height_ratio=0.2,
            overlap_width_ratio=0.2,
        )
        fig = visualize_object_predictions(
            image=annotated_frame,
            object_prediction_list=results.object_prediction_list,
            output_dir="output_visuals",
            file_name="yolov5_prediction",
            rect_th=3,
            text_size=1.0,
            hide_conf=False,  # show confidence scores
        )
        results1 = model1.track(
            source=annotated_frame,
            persist=True,
            imgsz=1280,
            project='runs/track',
            name='exp',
            save=False,
            show=False,
            iou=0,
            conf=0,
        )
        annotated_frame1 = results1[0].plot(
            img=fig['image'],
            conf=False,   # do not draw confidence scores
            boxes=True,
        )
        # Show the processed frame and write it to the output video
        cv2.imshow('Tracking', annotated_frame1)
        out.write(annotated_frame1)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    # Release resources
    cap.release()
    out.release()
    cv2.destroyAllWindows()
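As a rough starting point for the "add tracking on top of SAHI" idea mentioned above, the sketch below is one possible approach, not part of the code above: convert SAHI's ObjectPrediction list into plain arrays and hand them to an external tracker. It assumes the supervision library (and its ByteTrack wrapper) is installed, and it assumes SAHI's prediction objects expose bbox.minx/miny/maxx/maxy, score.value and category.id; check both against your installed versions.

import numpy as np
import supervision as sv  # assumed extra dependency, not used in the code above

tracker = sv.ByteTrack()  # create once, outside the frame loop

def track_sahi_predictions(object_prediction_list):
    # Convert SAHI ObjectPrediction objects into plain arrays
    boxes = np.array(
        [[p.bbox.minx, p.bbox.miny, p.bbox.maxx, p.bbox.maxy] for p in object_prediction_list],
        dtype=float,
    ).reshape(-1, 4)  # reshape keeps a valid (0, 4) shape when the list is empty
    scores = np.array([p.score.value for p in object_prediction_list], dtype=float)
    class_ids = np.array([p.category.id for p in object_prediction_list], dtype=int)
    detections = sv.Detections(xyxy=boxes, confidence=scores, class_id=class_ids)
    # update_with_detections matches boxes across frames and fills in detections.tracker_id
    return tracker.update_with_detections(detections)

Inside the frame loop it would be called right after get_sliced_prediction:

tracked = track_sahi_predictions(results.object_prediction_list)
# tracked.xyxy, tracked.class_id and tracked.tracker_id can then be drawn manually with cv2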