YOLOv8 Predict

from ultralytics import YOLO

# Load a model
model = YOLO("yolov8n.pt")  # pretrained YOLOv8n model

# Run batched inference on a list of images
results = model(["im1.jpg", "im2.jpg"])  # return a list of Results objects

# Process results list
for i, result in enumerate(results):
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation mask outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs
    result.show()  # display to screen
    result.save(filename=f"result_{i}.jpg")  # save to disk, one file per image so results are not overwritten

Inference Sources

Sources marked ✅ can be used in streaming mode with stream=True.

| Source | Argument | Type | Notes |
|---|---|---|---|
| image | 'image.jpg' | str or Path | Single image file. |
| URL | 'https://ultralytics.com/images/bus.jpg' | str | URL to an image. |
| screenshot | 'screen' | str | Capture a screenshot. |
| PIL | Image.open('im.jpg') | PIL.Image | HWC format with RGB channels. |
| OpenCV | cv2.imread('im.jpg') | np.ndarray | HWC format with BGR channels, uint8 (0-255). |
| numpy | np.zeros((640,1280,3)) | np.ndarray | HWC format with BGR channels, uint8 (0-255). |
| torch | torch.zeros(16,3,320,640) | torch.Tensor | BCHW format with RGB channels, float32 (0.0-1.0). |
| CSV | 'sources.csv' | str or Path | CSV file containing paths to images, videos, or directories. |
| video ✅ | 'video.mp4' | str or Path | Video file in formats like MP4, AVI, etc. |
| directory ✅ | 'path/' | str or Path | Path to a directory containing images or videos. |
| glob ✅ | 'path/*.jpg' | str | Glob pattern to match multiple files. Use the * character as a wildcard. |
| YouTube ✅ | 'https://youtu.be/LNwODJXcvt4' | str | URL to a YouTube video. |
| stream ✅ | 'rtsp://example.com/media.mp4' | str | URL for streaming protocols such as RTSP, RTMP, TCP, or an IP address. |
| multi-stream ✅ | 'list.streams' | str or Path | *.streams text file with one stream URL per row, e.g. 8 streams will run at batch-size 8. |
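The PIL, OpenCV, and torch rows differ in both memory layout and channel order, which is easy to get wrong when arrays are passed between libraries. As a minimal sketch using only NumPy, converting an OpenCV-style frame (HWC, BGR, uint8) into the layout listed in the torch row (BCHW, RGB, float32 in 0.0-1.0) might look like the following. The helper name bgr_hwc_to_rgb_bchw is hypothetical and not part of Ultralytics.

```python
import numpy as np


def bgr_hwc_to_rgb_bchw(image_bgr):
    """Convert one HWC BGR uint8 image to a BCHW RGB float32 batch of size 1.

    Hypothetical helper for illustration; not part of Ultralytics.
    """
    rgb = image_bgr[..., ::-1]                 # reverse the channel axis: BGR -> RGB
    chw = np.transpose(rgb, (2, 0, 1))         # HWC -> CHW
    scaled = chw.astype(np.float32) / 255.0    # uint8 0-255 -> float32 0.0-1.0
    return np.ascontiguousarray(scaled[None])  # add the batch axis: CHW -> BCHW


# A synthetic frame that is pure blue in BGR order (channel 0 is blue)
frame = np.zeros((640, 1280, 3), dtype=np.uint8)
frame[..., 0] = 255

batch = bgr_hwc_to_rgb_bchw(frame)
print(batch.shape, batch.dtype)
```

The resulting array could then be wrapped with torch.from_numpy(batch) before being passed to the model as a torch source.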

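from ultralytics import YOLO
from PIL import Image  # needed for the PIL source below
import cv2
import numpy as np
import torch

# Load a pretrained YOLOv8n model
model = YOLO("yolov8n.pt")

# Each section below is an independent alternative: pick one source, then run inference on it.
# --------------------------------------------------------------------------------------------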
# Define path to the image file
source = "path/to/image.jpg"

# Run inference on the source
results = model(source)  # list of Results objects

# --------------------------------------------------------------------------------------------
# Define current screenshot as source
source = "screen"

# --------------------------------------------------------------------------------------------
# Define remote image or video URL
source = "https://ultralytics.com/images/bus.jpg"

# --------------------------------------------------------------------------------------------
# Open an image using PIL
source = Image.open("path/to/image.jpg")

# --------------------------------------------------------------------------------------------
# Read an image using OpenCV
source = cv2.imread("path/to/image.jpg")

# --------------------------------------------------------------------------------------------
# Create a random numpy array of HWC shape (640, 640, 3) with values in range [0, 255] and type uint8
source = np.random.randint(low=0, high=256, size=(640, 640, 3), dtype="uint8")  # high is exclusive

# --------------------------------------------------------------------------------------------
# Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
source = torch.rand(1, 3, 640, 640, dtype=torch.float32)

# --------------------------------------------------------------------------------------------
# Define a path to a CSV file with images, URLs, videos and directories
source = "path/to/file.csv"

# --------------------------------------------------------------------------------------------
# Define path to video file
source = "path/to/video.mp4"

# Run inference on the source
results = model(source, stream=True)

# --------------------------------------------------------------------------------------------
# Define path to directory containing images and videos for inference
source = "path/to/dir"

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects

# --------------------------------------------------------------------------------------------
# Define a glob search for all JPG files in a directory
source = "path/to/dir/*.jpg"

# OR define a recursive glob search for all JPG files including subdirectories
source = "path/to/dir/**/*.jpg"

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects

# --------------------------------------------------------------------------------------------
# Define source as YouTube video URL
source = "https://youtu.be/LNwODJXcvt4"

# --------------------------------------------------------------------------------------------
# Single stream with batch-size 1 inference
source = "rtsp://example.com/media.mp4"  # RTSP, RTMP, TCP or IP streaming address

# Multiple streams with batched inference (i.e. batch-size 8 for 8 streams)
source = "path/to/list.streams"  # *.streams text file with one streaming address per row

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects


Inference arguments:

| Argument | Type | Default | Description |
|---|---|---|---|
| source | str | 'ultralytics/assets' | Specifies the data source for inference. Can be an image path, video file, directory, URL, or device ID for live feeds. Supports a wide range of formats and sources, enabling flexible application across different types of input. |
| conf | float | 0.25 | Sets the minimum confidence threshold for detections. Objects detected with confidence below this threshold will be disregarded. Adjusting this value can help reduce false positives. |
| iou | float | 0.7 | Intersection Over Union (IoU) threshold for Non-Maximum Suppression (NMS). Lower values result in fewer detections by eliminating overlapping boxes, useful for reducing duplicates. |
| imgsz | int or tuple | 640 | Defines the image size for inference. Can be a single integer 640 for square resizing or a (height, width) tuple. Proper sizing can improve detection accuracy and processing speed. |
| half | bool | False | Enables half-precision (FP16) inference, which can speed up model inference on supported GPUs with minimal impact on accuracy. |
| device | str | None | Specifies the device for inference (e.g., cpu, cuda:0 or 0). Allows users to select between CPU, a specific GPU, or other compute devices for model execution. |
| max_det | int | 300 | Maximum number of detections allowed per image. Limits the total number of objects the model can detect in a single inference, preventing excessive outputs in dense scenes. |
| vid_stride | int | 1 | Frame stride for video inputs. Allows skipping frames in videos to speed up processing at the cost of temporal resolution. A value of 1 processes every frame, higher values skip frames. |
| stream_buffer | bool | False | Determines if all frames should be buffered when processing video streams (True), or if the model should return the most recent frame (False). Useful for real-time applications. |
| visualize | bool | False | Activates visualization of model features during inference, providing insights into what the model is "seeing". Useful for debugging and model interpretation. |
| augment | bool | False | Enables test-time augmentation (TTA) for predictions, potentially improving detection robustness at the cost of inference speed. |
| agnostic_nms | bool | False | Enables class-agnostic Non-Maximum Suppression (NMS), which merges overlapping boxes of different classes. Useful in multi-class detection scenarios where class overlap is common. |
| classes | list[int] | None | Filters predictions to a set of class IDs. Only detections belonging to the specified classes will be returned. Useful for focusing on relevant objects in multi-class detection tasks. |
| retina_masks | bool | False | Uses high-resolution segmentation masks if available in the model. This can enhance mask quality for segmentation tasks, providing finer detail. |
| embed | list[int] | None | Specifies the layers from which to extract feature vectors or embeddings. Useful for downstream tasks like clustering or similarity search. |
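To make the interaction between conf, iou, max_det, classes, and agnostic_nms more concrete, here is a simplified pure-Python sketch of greedy NMS. It is written for illustration only and is not Ultralytics' implementation. The function names, the (box, score, class_id) tuple layout, and the sample detections are all assumptions.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf=0.25, iou=0.7, max_det=300,
                      classes=None, agnostic_nms=False):
    """Illustrative greedy NMS over a list of (box, score, class_id) tuples."""
    # 1. Drop low-confidence detections and, if requested, unwanted classes
    candidates = [d for d in dets
                  if d[1] >= conf and (classes is None or d[2] in classes)]
    # 2. Visit the remaining detections from most to least confident
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, cls in candidates:
        # 3. Suppress a box that overlaps an already kept box too much;
        #    without agnostic_nms only boxes of the same class compete
        suppressed = any(
            (agnostic_nms or cls == kept_cls) and box_iou(box, kept_box) > iou
            for kept_box, _, kept_cls in kept
        )
        if not suppressed:
            kept.append((box, score, cls))
        # 4. Stop once the per-image cap is reached
        if len(kept) == max_det:
            break
    return kept


# Hypothetical detections: (box, score, class_id)
dets = [
    ((0, 0, 10, 10), 0.9, 0),
    ((1, 1, 11, 11), 0.8, 0),    # overlaps the first box, same class
    ((0, 0, 10, 10), 0.6, 1),    # same box as the first, different class
    ((50, 50, 60, 60), 0.2, 0),  # below the default conf threshold
]

print(len(filter_detections(dets)))
print(len(filter_detections(dets, iou=0.5)))
print(len(filter_detections(dets, iou=0.5, agnostic_nms=True)))
```

In this sketch, lowering iou removes the overlapping same-class box, and turning on agnostic_nms additionally removes the duplicate box that differs only in class.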

Visualization arguments:

| Argument | Type | Default | Description |
|---|---|---|---|
| show | bool | False | If True, displays the annotated images or videos in a window. Useful for immediate visual feedback during development or testing. |
| save | bool | False | Enables saving of the annotated images or videos to file. Useful for documentation, further analysis, or sharing results. |
| save_frames | bool | False | When processing videos, saves individual frames as images. Useful for extracting specific frames or for detailed frame-by-frame analysis. |
| save_txt | bool | False | Saves detection results in a text file, following the format [class] [x_center] [y_center] [width] [height] [confidence]. Useful for integration with other analysis tools. |
| save_conf | bool | False | Includes confidence scores in the saved text files. Enhances the detail available for post-processing and analysis. |
| save_crop | bool | False | Saves cropped images of detections. Useful for dataset augmentation, analysis, or creating focused datasets for specific objects. |
| show_labels | bool | True | Displays labels for each detection in the visual output. Provides immediate understanding of detected objects. |
| show_conf | bool | True | Displays the confidence score for each detection alongside the label. Gives insight into the model's certainty for each detection. |
| show_boxes | bool | True | Draws bounding boxes around detected objects. Essential for visual identification and location of objects in images or video frames. |
| line_width | None or int | None | Specifies the line width of bounding boxes. If None, the line width is automatically adjusted based on the image size. Provides visual customization for clarity. |
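Text files written with save_txt can be read back by other tools. The sketch below parses one row in the [class] [x_center] [y_center] [width] [height] [confidence] format and converts it to pixel corner coordinates. It assumes the coordinates are normalized to the 0-1 range relative to the image size, as in the usual YOLO label convention, and that the confidence column is only present when save_conf is enabled. The function name and the sample row are hypothetical.

```python
def parse_label_line(line, img_w, img_h):
    """Parse one saved detection row into pixel (x1, y1, x2, y2) coordinates.

    Assumes normalized center/size values; hypothetical helper for illustration.
    """
    parts = line.split()
    cls = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # The confidence column is optional
    conf = float(parts[5]) if len(parts) > 5 else None
    # Convert the normalized center/size form to pixel corner coordinates
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return {"cls": cls, "xyxy": (x1, y1, x2, y2), "conf": conf}


# Hypothetical row for a 640x480 image
det = parse_label_line("0 0.5 0.5 0.25 0.5 0.9", img_w=640, img_h=480)
print(det)
```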

Working with Results

| Attribute | Type | Description |
|---|---|---|
| orig_img | numpy.ndarray | The original image as a numpy array. |
| orig_shape | tuple | The original image shape in (height, width) format. |
| boxes | Boxes, optional | A Boxes object containing the detection bounding boxes. |
| masks | Masks, optional | A Masks object containing the detection masks. |
| probs | Probs, optional | A Probs object containing probabilities of each class for classification task. |
| keypoints | Keypoints, optional | A Keypoints object containing detected keypoints for each object. |
| obb | OBB, optional | An OBB object containing oriented bounding boxes. |
| speed | dict | A dictionary of preprocess, inference, and postprocess speeds in milliseconds per image. |
| names | dict | A dictionary of class names. |
| path | str | The path to the image file. |
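As a rough sketch of how these attributes fit together, the helper below turns the boxes and names of a result into plain Python dictionaries. It relies only on the cls, conf, and xyxy fields of the boxes object and on each of them offering tolist(). So that the example runs without a model, it is exercised with a hand-built stand-in object; summarize_boxes, FakeTensor, and the sample values are hypothetical.

```python
from types import SimpleNamespace


def summarize_boxes(result):
    """Return one dict per detection with class name, confidence, and box."""
    if result.boxes is None:  # boxes is optional, e.g. for classification models
        return []
    b = result.boxes
    rows = []
    for cls_id, conf, xyxy in zip(b.cls.tolist(), b.conf.tolist(), b.xyxy.tolist()):
        rows.append({
            "name": result.names[int(cls_id)],  # names maps class ID -> class name
            "conf": conf,
            "xyxy": xyxy,
        })
    return rows


class FakeTensor(list):
    """Hypothetical stand-in for a tensor: a list that also offers tolist()."""

    def tolist(self):
        return list(self)


# Hand-built stand-in for a Results object, for demonstration only
fake_result = SimpleNamespace(
    names={0: "person", 5: "bus"},
    boxes=SimpleNamespace(
        cls=FakeTensor([5.0, 0.0]),
        conf=FakeTensor([0.91, 0.47]),
        xyxy=FakeTensor([[10.0, 20.0, 300.0, 400.0], [50.0, 60.0, 120.0, 380.0]]),
    ),
)

rows = summarize_boxes(fake_result)
for row in rows:
    print(row)
```

With a real model, the same helper would be called on each item returned by model(source).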
