Supervision, a Toolkit for YOLO: Object Detection (Images and Video)

曹操ccm

Last modified: 2024-06-18 17:23:57

Tags: object tracking, artificial intelligence, computer vision

First published: 2024-06-18 17:23:03

Article link: https://blog.csdn.net/qq_36018871/article/details/139779168

Supervision Object Detection (Images and Video)

1. Installation

pip install supervision
pip install ultralytics   # the package that provides YOLOv8

2. Implementation

2.1 Object detection (images)

Detection is done with Supervision, which integrates with YOLOv8 models.

Import the required packages

import cv2
import supervision as sv

from ultralytics import YOLO
import matplotlib.pyplot as plt

Load the model and run it on an image to get the predictions

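# Smaller checkpoints such as yolov8s.pt or yolov8n.pt can be used instead
model = YOLO("yolov8x.pt")
image = cv2.imread("F:/ai/supervision/material/pic/微信图片_20240605122212.jpg")
if image is None:  # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("could not read the input image")
results = model(image, verbose=False)[0]
# Convert the YOLOv8 results into a Supervision Detections object
detections = sv.Detections.from_ultralytics(results)
# Inspect the result
detections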

Here, detections holds the model's predictions:

Detections(
    xyxy=array([[     1.1242,      230.52,      804.28,      741.87],
       [     668.53,      392.16,      809.58,      879.81],
       [     50.536,      398.19,      247.58,       900.9],
       [     221.96,      406.15,      343.99,      860.09],
       [    0.21836,      550.76,      78.427,      872.18],
       [    0.23819,      550.13,      79.402,      1064.2],
       [     666.25,      15.416,      748.06,      90.126]], dtype=float32), 
    mask=None, 
    confidence=array([    0.96796,     0.93112,     0.91831,     0.90192,     0.69789,     0.53844,     0.43446], dtype=float32), 
    class_id=array([5, 0, 0, 0, 0, 0, 1]), 
    tracker_id=None, 
    data={'class_name': array(['bus', 'person', 'person', 'person', 'person', 'person', 'bicycle'], dtype='<U7')}
)

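Output fields

xyxy: the bounding box of each detection, given as its corner coordinates (x_min, y_min, x_max, y_max);

confidence: the confidence score of each detection;

data: additional per-detection data, here the class names under class_name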
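
To make these fields concrete, the sketch below derives a few quantities from them in plain Python. The two boxes and the class names are copied from the output above; `box_geometry` is a hypothetical helper written for this illustration, not part of Supervision.

```python
from collections import Counter

# Two boxes and all class names copied from the Detections output above.
xyxy = [
    [1.1242, 230.52, 804.28, 741.87],   # bus
    [668.53, 392.16, 809.58, 879.81],   # person
]
class_names = ["bus", "person", "person", "person", "person", "person", "bicycle"]


def box_geometry(box):
    """Return (width, height, center_x, center_y) of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    return width, height, x_min + width / 2, y_min + height / 2


for box in xyxy:
    width, height, cx, cy = box_geometry(box)
    print(f"w={width:.1f} h={height:.1f} center=({cx:.1f}, {cy:.1f})")

# Number of detections per class.
class_counts = Counter(class_names)
print(class_counts)
```

The same arithmetic applies to the real `detections.xyxy` array, since each row follows the same (x_min, y_min, x_max, y_max) layout.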

Visualizing the results

Here I use Supervision's built-in annotators. You can also draw the results yourself (for example with OpenCV) from the information in detections.

# Visualize the detections

# Set up the annotators
bounding_box_annotator = sv.BoundingBoxAnnotator()  # draws the bounding boxes
label_annotator = sv.LabelAnnotator()  # draws the text labels

labels = [
    f"{class_name} {confidence:.2f}"
    for class_name, confidence
    in zip(detections['class_name'], detections.confidence)
]
# Draw the bounding boxes on a copy so the original image stays untouched
annotated_image = bounding_box_annotator.annotate(
    scene=image.copy(), detections=detections)
# Draw the labels
annotated_image = label_annotator.annotate(
    scene=annotated_image, detections=detections, labels=labels)
sv.plot_image(annotated_image)

Result

[Figure: annotated detection result; image not included]

2.2 Object detection (video)

Required packages

inference==0.9.17
supervision==0.19.0
tqdm
ultralytics
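
Because two of these packages are pinned to exact versions, it can help to check the environment before running the script. The following is a minimal sketch using only the standard library; `REQUIRED`, `installed_version`, and `check_requirements` are hypothetical names introduced for this example.

```python
from importlib import metadata

# Versions pinned in the list above; packages without a pin map to None.
REQUIRED = {
    "inference": "0.9.17",
    "supervision": "0.19.0",
    "tqdm": None,
    "ultralytics": None,
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(required):
    """Return a list of human-readable problems; empty when everything matches."""
    problems = []
    for package, pinned in required.items():
        found = installed_version(package)
        if found is None:
            problems.append(f"{package} is not installed")
        elif pinned is not None and found != pinned:
            problems.append(f"{package} {found} is installed, expected {pinned}")
    return problems


for problem in check_requirements(REQUIRED):
    print(problem)
```

If nothing is printed, the installed packages match the list.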

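Implementation

The script takes five arguments, two of them optional:

--source_weights_path: path to the YOLO model weights file
--source_video_path: path to the source video file
--target_video_path: path of the output file, where the annotated video is saved
--confidence_threshold (optional): confidence threshold for the model, default 0.3. A higher value makes the model more selective; a lower value lets through more detections, including less certain ones.
--iou_threshold (optional): IoU (intersection over union) threshold used by the model's non-maximum suppression, default 0.7. It helps tell separate objects apart, especially in crowded scenes.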
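
To show how the two thresholds interact, here is a minimal, class-agnostic sketch of confidence filtering followed by greedy non-maximum suppression. It is an illustration in plain Python, not the implementation Ultralytics uses; `iou` and `filter_detections` are hypothetical helper names, and the boxes and scores are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def filter_detections(boxes, scores, confidence_threshold=0.3, iou_threshold=0.7):
    """Greedy NMS: return the indices of the kept boxes, highest score first."""
    # Step 1: drop boxes below the confidence threshold, sort the rest by score.
    candidates = sorted(
        (i for i, score in enumerate(scores) if score >= confidence_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    # Step 2: keep a box only if it does not overlap an already kept box too much.
    kept = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [
    [0, 0, 100, 100],      # strong detection
    [5, 5, 105, 105],      # near-duplicate of the first box
    [200, 200, 300, 300],  # separate object
    [400, 400, 450, 450],  # low-confidence detection
]
scores = [0.9, 0.8, 0.7, 0.2]
print(filter_detections(boxes, scores))
```

With the defaults, the near-duplicate is suppressed and the low-confidence box is dropped. Raising `iou_threshold` keeps more overlapping boxes, which is why the value matters in crowded scenes.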

def process_video(
    source_weights_path: str,
    source_video_path: str,
    target_video_path: str,
    confidence_threshold: float = 0.3,
    iou_threshold: float = 0.7,
) -> None:
    model = YOLO(source_weights_path)
    # Tracker that follows each detected object across frames and assigns it an ID
    tracker = sv.ByteTrack()
    box_annotator = sv.BoundingBoxAnnotator()
    label_annotator = sv.LabelAnnotator()
    frame_generator = sv.get_video_frames_generator(source_path=source_video_path)
    video_info = sv.VideoInfo.from_video_path(video_path=source_video_path)

    with sv.VideoSink(target_path=target_video_path, video_info=video_info) as sink:
        # Iterate over every frame of the video
        for frame in tqdm(frame_generator, total=video_info.total_frames):
            # Run inference on the frame
            results = model(
                frame, verbose=False, conf=confidence_threshold, iou=iou_threshold
            )[0]
            # Convert the YOLOv8 results into a Supervision Detections object
            detections = sv.Detections.from_ultralytics(results)
            # Update the tracker with the detections of this frame
            detections = tracker.update_with_detections(detections)

            annotated_frame = box_annotator.annotate(
                scene=frame.copy(), detections=detections
            )

            annotated_labeled_frame = label_annotator.annotate(
                scene=annotated_frame, detections=detections
            )
            # Write the annotated frame to the output video
            sink.write_frame(frame=annotated_labeled_frame)
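
The function above calls the label annotator without custom labels. If you want the tracker ID to appear in each label, one option is to build the label strings yourself and pass them through the `labels` argument, as in the image example. The sketch below shows only the string formatting; `build_labels` is a hypothetical helper, and the plain lists stand in for `detections.tracker_id`, `detections['class_name']`, and `detections.confidence`.

```python
def build_labels(tracker_ids, class_names, confidences):
    """Format one label per detection, for example "#3 person 0.93"."""
    return [
        f"#{tracker_id} {class_name} {confidence:.2f}"
        for tracker_id, class_name, confidence
        in zip(tracker_ids, class_names, confidences)
    ]


# Plain lists standing in for the per-detection arrays of a Detections object.
labels = build_labels([1, 2], ["bus", "person"], [0.96796, 0.93112])
print(labels)
```

Inside the frame loop, the resulting list would be passed as `labels=labels` to `label_annotator.annotate`.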

Complete code

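import argparse
import os

# Workaround for the "duplicate OpenMP runtime" error on some setups.
# Set it before importing libraries that load OpenMP.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

from tqdm import tqdm
from ultralytics import YOLO

import supervision as sv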


def process_video(
    source_weights_path: str,
    source_video_path: str,
    target_video_path: str,
    confidence_threshold: float = 0.3,
    iou_threshold: float = 0.7,
) -> None:
    model = YOLO(source_weights_path)

    tracker = sv.ByteTrack()
    box_annotator = sv.BoundingBoxAnnotator()
    label_annotator = sv.LabelAnnotator()
    frame_generator = sv.get_video_frames_generator(source_path=source_video_path)
    video_info = sv.VideoInfo.from_video_path(video_path=source_video_path)

    with sv.VideoSink(target_path=target_video_path, video_info=video_info) as sink:
        for frame in tqdm(frame_generator, total=video_info.total_frames):
            results = model(
                frame, verbose=False, conf=confidence_threshold, iou=iou_threshold
            )[0]
            detections = sv.Detections.from_ultralytics(results)
            detections = tracker.update_with_detections(detections)

            annotated_frame = box_annotator.annotate(
                scene=frame.copy(), detections=detections
            )

            annotated_labeled_frame = label_annotator.annotate(
                scene=annotated_frame, detections=detections
            )

            sink.write_frame(frame=annotated_labeled_frame)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Video Processing with YOLO and ByteTrack"
    )
    parser.add_argument(
        "--source_weights_path",
        required=True,
        help="Path to the source weights file",
        type=str,
    )
    parser.add_argument(
        "--source_video_path",
        required=True,
        help="Path to the source video file",
        type=str,
    )
    parser.add_argument(
        "--target_video_path",
        required=True,
        help="Path to the target video file (output)",
        type=str,
    )
    parser.add_argument(
        "--confidence_threshold",
        default=0.3,
        help="Confidence threshold for the model",
        type=float,
    )
    parser.add_argument(
        "--iou_threshold", default=0.7, help="IOU threshold for the model", type=float
    )

    args = parser.parse_args()

    process_video(
        source_weights_path=args.source_weights_path,
        source_video_path=args.source_video_path,
        target_video_path=args.target_video_path,
        confidence_threshold=args.confidence_threshold,
        iou_threshold=args.iou_threshold,
    )

Run the script

python ultralytics_example.py --source_weights_path yolov8s.pt --source_video_path input.mp4 --target_video_path tracking_result.mp4
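
The command above leaves both thresholds at their defaults; they can be overridden with --confidence_threshold and --iou_threshold. Since both values are meant to lie between 0 and 1, one possible refinement is to validate them at parse time. This is a sketch under that assumption; `unit_interval` is a hypothetical helper that is not in the script above.

```python
import argparse


def unit_interval(value):
    """argparse type: parse a float and require 0 <= value <= 1."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError(f"{value} is not in the range [0, 1]")
    return number


parser = argparse.ArgumentParser(description="Threshold parsing sketch")
parser.add_argument("--confidence_threshold", default=0.3, type=unit_interval)
parser.add_argument("--iou_threshold", default=0.7, type=unit_interval)

# Passing an explicit list instead of reading sys.argv keeps the example self-contained.
args = parser.parse_args(["--confidence_threshold", "0.5"])
print(args.confidence_threshold, args.iou_threshold)
```

With this type in place, an out-of-range value is rejected by argparse with a usage message before the model is loaded.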

3. References

https://github.com/roboflow/supervision?tab=readme-ov-file

Follow-up posts will cover object segmentation, speed estimation, color recognition, and the imutils toolkit.
