yoloV8-Predict

yyes

Last modified 2024-08-06 09:54:26

Category: yolo. Tags: YOLO, artificial intelligence, machine learning

First published 2024-08-05 17:36:19

Article link: https://blog.csdn.net/qq_35633062/article/details/140932730

from ultralytics import YOLO

# Load a model
model = YOLO("yolov8n.pt")  # pretrained YOLOv8n model

# Run batched inference on a list of images
results = model(["im1.jpg", "im2.jpg"])  # return a list of Results objects

# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs
    result.show()  # display to screen
    result.save(filename="result.jpg")  # save to disk

Inference Sources

Source	Argument	Type	Notes
image	`'image.jpg'`	`str` or `Path`	Single image file.
URL	`'https://ultralytics.com/images/bus.jpg'`	`str`	URL to an image.
screenshot	`'screen'`	`str`	Capture a screenshot.
PIL	`Image.open('im.jpg')`	`PIL.Image`	HWC format with RGB channels.
OpenCV	`cv2.imread('im.jpg')`	`np.ndarray`	HWC format with BGR channels `uint8 (0-255)`.
numpy	`np.zeros((640,1280,3))`	`np.ndarray`	HWC format with BGR channels `uint8 (0-255)`.
torch	`torch.zeros(16,3,320,640)`	`torch.Tensor`	BCHW format with RGB channels `float32 (0.0-1.0)`.
CSV	`'sources.csv'`	`str` or `Path`	CSV file containing paths to images, videos, or directories.
video ✅	`'video.mp4'`	`str` or `Path`	Video file in formats like MP4, AVI, etc.
directory ✅	`'path/'`	`str` or `Path`	Path to a directory containing images or videos.
glob ✅	`'path/*.jpg'`	`str`	Glob pattern to match multiple files. Use the `*` character as a wildcard.
YouTube ✅	`'https://youtu.be/LNwODJXcvt4'`	`str`	URL to a YouTube video.
stream ✅	`'rtsp://example.com/media.mp4'`	`str`	URL for streaming protocols such as RTSP, RTMP, TCP, or an IP address.
multi-stream ✅	`'list.streams'`	`str` or `Path`	`.streams` text file with one stream URL per row, e.g. 8 streams will run at batch-size 8.
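
Before passing a glob pattern as a source, it can help to check which files the pattern actually matches. The sketch below uses only Python's standard `glob` module on a throwaway directory; the file names are made up for illustration, and it is not meant to show how the library expands patterns internally.

```python
import glob
import os
import tempfile


def match_images(pattern: str, recursive: bool = False) -> list[str]:
    """Return the sorted file paths matched by a glob pattern."""
    return sorted(glob.glob(pattern, recursive=recursive))


# Build a throwaway directory tree (hypothetical file names).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for rel in ("a.jpg", "b.jpg", "notes.txt", os.path.join("sub", "c.jpg")):
    open(os.path.join(root, rel), "w").close()

# '*' matches within one directory; '**' with recursive=True also descends
# into subdirectories.
flat = match_images(os.path.join(root, "*.jpg"))
deep = match_images(os.path.join(root, "**", "*.jpg"), recursive=True)

print([os.path.relpath(p, root) for p in flat])
print([os.path.relpath(p, root) for p in deep])
```

The non-recursive pattern picks up only the JPG files directly inside the directory, while the `**` form also reaches the file in `sub`.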

from ultralytics import YOLO
from PIL import Image  # needed for the Image.open() example below
import cv2
import numpy as np
import torch

# Load a pretrained YOLOv8n model
model = YOLO("yolov8n.pt")
# --------------------------------------------------------------------------------------------
# Define path to the image file
source = "path/to/image.jpg"

# Run inference on the source
results = model(source)  # list of Results objects

# --------------------------------------------------------------------------------------------
# Define current screenshot as source
source = "screen"

# --------------------------------------------------------------------------------------------
# Define remote image or video URL
source = "https://ultralytics.com/images/bus.jpg"

# --------------------------------------------------------------------------------------------
# Open an image using PIL
source = Image.open("path/to/image.jpg")

# --------------------------------------------------------------------------------------------
# Read an image using OpenCV
source = cv2.imread("path/to/image.jpg")

# --------------------------------------------------------------------------------------------
# Create a random numpy array of HWC shape (640, 640, 3) with values in range [0, 255] and type uint8
# (the upper bound of randint is exclusive, so high=256 is needed to include 255)
source = np.random.randint(low=0, high=256, size=(640, 640, 3), dtype="uint8")

# --------------------------------------------------------------------------------------------
# Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
source = torch.rand(1, 3, 640, 640, dtype=torch.float32)

# --------------------------------------------------------------------------------------------
# Define a path to a CSV file with images, URLs, videos and directories
source = "path/to/file.csv"

# --------------------------------------------------------------------------------------------
# Define path to video file
source = "path/to/video.mp4"

# Run inference on the source
results = model(source, stream=True)

# --------------------------------------------------------------------------------------------
# Define path to directory containing images and videos for inference
source = "path/to/dir"

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects

# --------------------------------------------------------------------------------------------
# Define a glob search for all JPG files in a directory
source = "path/to/dir/*.jpg"

# OR define a recursive glob search for all JPG files including subdirectories
source = "path/to/dir/**/*.jpg"

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects

# --------------------------------------------------------------------------------------------
# Define source as YouTube video URL
source = "https://youtu.be/LNwODJXcvt4"

# --------------------------------------------------------------------------------------------
# Single stream with batch-size 1 inference
source = "rtsp://example.com/media.mp4"  # RTSP, RTMP, TCP or IP streaming address

# Multiple streams with batched inference (i.e. batch-size 8 for 8 streams)
source = "path/to/list.streams"  # *.streams text file with one streaming address per row

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects
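
A `.streams` file is just plain text with one address per row, so it can be generated with a few lines of standard-library Python. The sketch below is only an illustration: the two RTSP addresses are placeholders, and the final model call is left as a comment because it needs reachable streams.

```python
from pathlib import Path
import tempfile

# Hypothetical stream addresses; replace with real camera or server URLs.
stream_urls = [
    "rtsp://example.com/media1.mp4",
    "rtsp://example.com/media2.mp4",
]

# One address per row, as the multi-stream source expects.
streams_file = Path(tempfile.mkdtemp()) / "list.streams"
streams_file.write_text("\n".join(stream_urls) + "\n")

# The number of non-empty rows is the batch size the streams would run at.
rows = [line for line in streams_file.read_text().splitlines() if line.strip()]
print(f"{len(rows)} streams -> batch size {len(rows)}")

# The file path would then be passed as the source, for example:
# results = model(str(streams_file), stream=True)
```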

Inference arguments:

Argument	Type	Default	Description
`source`	`str`	`'ultralytics/assets'`	Specifies the data source for inference. Can be an image path, video file, directory, URL, or device ID for live feeds. Supports a wide range of formats and sources, enabling flexible application across different types of input.
`conf`	`float`	`0.25`	Sets the minimum confidence threshold for detections. Objects detected with confidence below this threshold will be disregarded. Adjusting this value can help reduce false positives.
`iou`	`float`	`0.7`	Intersection Over Union (IoU) threshold for Non-Maximum Suppression (NMS). Lower values result in fewer detections by eliminating overlapping boxes, useful for reducing duplicates.
`imgsz`	`int or tuple`	`640`	Defines the image size for inference. Can be a single integer `640` for square resizing or a (height, width) tuple. Proper sizing can improve detection accuracy and processing speed.
`half`	`bool`	`False`	Enables half-precision (FP16) inference, which can speed up model inference on supported GPUs with minimal impact on accuracy.
`device`	`str`	`None`	Specifies the device for inference (e.g., `cpu`, `cuda:0` or `0`). Allows users to select between CPU, a specific GPU, or other compute devices for model execution.
`max_det`	`int`	`300`	Maximum number of detections allowed per image. Limits the total number of objects the model can detect in a single inference, preventing excessive outputs in dense scenes.
`vid_stride`	`int`	`1`	Frame stride for video inputs. Allows skipping frames in videos to speed up processing at the cost of temporal resolution. A value of 1 processes every frame, higher values skip frames.
`stream_buffer`	`bool`	`False`	Determines if all frames should be buffered when processing video streams (`True`), or if the model should return the most recent frame (`False`). Useful for real-time applications.
`visualize`	`bool`	`False`	Activates visualization of model features during inference, providing insights into what the model is "seeing". Useful for debugging and model interpretation.
`augment`	`bool`	`False`	Enables test-time augmentation (TTA) for predictions, potentially improving detection robustness at the cost of inference speed.
`agnostic_nms`	`bool`	`False`	Enables class-agnostic Non-Maximum Suppression (NMS), which merges overlapping boxes of different classes. Useful in multi-class detection scenarios where class overlap is common.
`classes`	`list[int]`	`None`	Filters predictions to a set of class IDs. Only detections belonging to the specified classes will be returned. Useful for focusing on relevant objects in multi-class detection tasks.
`retina_masks`	`bool`	`False`	Uses high-resolution segmentation masks if available in the model. This can enhance mask quality for segmentation tasks, providing finer detail.
`embed`	`list[int]`	`None`	Specifies the layers from which to extract feature vectors or embeddings. Useful for downstream tasks like clustering or similarity search.
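
To see how `conf`, `iou`, `max_det`, `classes` and `agnostic_nms` interact, here is a small pure-Python sketch of confidence filtering followed by greedy NMS. It only mirrors the meaning of the arguments in the table; the function names and the sample detections are made up, and it should not be read as the library's actual implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf=0.25, iou_thres=0.7, max_det=300,
                      classes=None, agnostic_nms=False):
    """Greedy NMS over (box, score, class_id) tuples (illustrative only)."""
    # 1. Drop low-confidence detections and unwanted classes.
    kept = [d for d in dets
            if d[1] >= conf and (classes is None or d[2] in classes)]
    # 2. Visit candidates from highest to lowest score.
    kept.sort(key=lambda d: d[1], reverse=True)
    selected = []
    for box, score, cls in kept:
        # A candidate is suppressed if it overlaps an already selected box
        # too much (same class only, unless agnostic_nms is set).
        suppressed = any(
            (agnostic_nms or cls == s_cls) and iou(box, s_box) > iou_thres
            for s_box, _, s_cls in selected
        )
        if not suppressed:
            selected.append((box, score, cls))
        # 3. Stop once the per-image detection cap is reached.
        if len(selected) == max_det:
            break
    return selected


# Hypothetical detections: two near-duplicate boxes of class 0,
# one box of class 1, and one low-confidence box.
dets = [
    ((0, 0, 100, 100), 0.90, 0),
    ((5, 5, 105, 105), 0.80, 0),
    ((200, 200, 300, 300), 0.60, 1),
    ((0, 0, 50, 50), 0.10, 1),
]
print(len(filter_detections(dets)))                 # default thresholds
print(len(filter_detections(dets, iou_thres=0.9)))  # looser overlap limit
```

The two near-duplicate boxes overlap with an IoU of roughly 0.82, so the default threshold of 0.7 keeps only the higher-scoring one, while raising the threshold to 0.9 keeps both. This is the "lower values result in fewer detections" behaviour described for `iou`.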

Visualization arguments:

Argument	Type	Default	Description
`show`	`bool`	`False`	If `True`, displays the annotated images or videos in a window. Useful for immediate visual feedback during development or testing.
`save`	`bool`	`False`	Enables saving of the annotated images or videos to file. Useful for documentation, further analysis, or sharing results.
`save_frames`	`bool`	`False`	When processing videos, saves individual frames as images. Useful for extracting specific frames or for detailed frame-by-frame analysis.
`save_txt`	`bool`	`False`	Saves detection results in a text file, following the format `[class] [x_center] [y_center] [width] [height] [confidence]`. Useful for integration with other analysis tools.
`save_conf`	`bool`	`False`	Includes confidence scores in the saved text files. Enhances the detail available for post-processing and analysis.
`save_crop`	`bool`	`False`	Saves cropped images of detections. Useful for dataset augmentation, analysis, or creating focused datasets for specific objects.
`show_labels`	`bool`	`True`	Displays labels for each detection in the visual output. Provides immediate understanding of detected objects.
`show_conf`	`bool`	`True`	Displays the confidence score for each detection alongside the label. Gives insight into the model's certainty for each detection.
`show_boxes`	`bool`	`True`	Draws bounding boxes around detected objects. Essential for visual identification and location of objects in images or video frames.
`line_width`	`None or int`	`None`	Specifies the line width of bounding boxes. If `None`, the line width is automatically adjusted based on the image size. Provides visual customization for clarity.
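
The `save_txt` format above is easy to read back for post-processing. The following sketch parses one such line into pixel corner coordinates. It assumes the coordinates are normalized to the 0-1 range (the usual YOLO label convention) and treats the confidence field as optional; the function name and the sample line are hypothetical.

```python
def parse_label_line(line, img_w, img_h):
    """Parse '[class] [x_center] [y_center] [width] [height] [confidence]'.

    Assumes coordinates are normalized to 0-1; the confidence field is
    treated as optional.
    """
    parts = line.split()
    cls = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    conf = float(parts[5]) if len(parts) > 5 else None
    # Convert normalized center/size to pixel corner coordinates.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return {"cls": cls, "xyxy": (x1, y1, x2, y2), "conf": conf}


# Hypothetical line for a 640x480 image.
parsed = parse_label_line("0 0.5 0.5 0.25 0.5 0.9", img_w=640, img_h=480)
print(parsed)
```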

Working with Results

Attribute	Type	Description
`orig_img`	`numpy.ndarray`	The original image as a numpy array.
`orig_shape`	`tuple`	The original image shape in (height, width) format.
`boxes`	`Boxes, optional`	A Boxes object containing the detection bounding boxes.
`masks`	`Masks, optional`	A Masks object containing the detection masks.
`probs`	`Probs, optional`	A Probs object containing probabilities of each class for classification task.
`keypoints`	`Keypoints, optional`	A Keypoints object containing detected keypoints for each object.
`obb`	`OBB, optional`	An OBB object containing oriented bounding boxes.
`speed`	`dict`	A dictionary of preprocess, inference, and postprocess speeds in milliseconds per image.
`names`	`dict`	A dictionary of class names.
`path`	`str`	The path to the image file.
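
As a rough illustration of how the `names` and `speed` dictionaries might be used together, the sketch below counts detections per class name and totals the per-image latency. It runs on stand-in values shaped like those attributes, not on real model output; the helper name, the class ids and the timings are all made up.

```python
from collections import Counter


def summarize(names, class_ids, speed):
    """Count detections per class name and total the per-image latency."""
    counts = Counter(names[int(c)] for c in class_ids)
    total_ms = sum(speed.values())
    return dict(counts), total_ms


# Stand-in values shaped like the `names` and `speed` attributes;
# they are invented for this example, not real model output.
names = {0: "person", 5: "bus"}
speed = {"preprocess": 2.0, "inference": 10.0, "postprocess": 1.5}
class_ids = [0, 0, 5, 0]  # hypothetical per-detection class ids

counts, total_ms = summarize(names, class_ids, speed)
print(counts, f"{total_ms:.1f} ms per image")
```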
