from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n.pt") # pretrained YOLOv8n model
# Run batched inference on a list of images
results = model(["im1.jpg", "im2.jpg"]) # return a list of Results objects
# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation mask outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs
    result.show()  # display to screen
    result.save(filename="result.jpg")  # save to disk
Inference Sources
Source | Argument | Type | Notes |
image | "path/to/image.jpg" | str or Path | Single image file. |
URL | "https://ultralytics.com/images/bus.jpg" | str | URL to an image. |
screenshot | "screen" | str | Capture a screenshot. |
PIL | Image.open("path/to/image.jpg") | PIL.Image | HWC format with RGB channels. |
OpenCV | cv2.imread("path/to/image.jpg") | np.ndarray | HWC format with BGR channels. |
numpy | np.random.randint(0, 256, size=(640, 640, 3), dtype="uint8") | np.ndarray | HWC format with BGR channels. |
torch | torch.rand(1, 3, 640, 640) | torch.Tensor | BCHW format with RGB channels. |
CSV | "path/to/file.csv" | str or Path | CSV file containing paths to images, videos, or directories. |
video ✅ | "path/to/video.mp4" | str or Path | Video file in formats like MP4, AVI, etc. |
directory ✅ | "path/to/dir" | str or Path | Path to a directory containing images or videos. |
glob ✅ | "path/to/dir/*.jpg" | str | Glob pattern to match multiple files. Use the * character as a wildcard. |
YouTube ✅ | "https://youtu.be/LNwODJXcvt4" | str | URL to a YouTube video. |
stream ✅ | "rtsp://example.com/media.mp4" | str | URL for streaming protocols such as RTSP, RTMP, TCP, or an IP address. |
multi-stream ✅ | "path/to/list.streams" | str or Path | *.streams text file with one streaming address per row; the streams run as one batch (e.g. batch size 8 for 8 streams). |
from ultralytics import YOLO
from PIL import Image  # needed for the PIL example below
import cv2
import numpy as np
import torch
# Load a pretrained YOLOv8n model
model = YOLO("yolov8n.pt")
# --------------------------------------------------------------------------------------------
# Define path to the image file
source = "path/to/image.jpg"
# Run inference on the source
results = model(source) # list of Results objects
# --------------------------------------------------------------------------------------------
# Define current screenshot as source
source = "screen"
# --------------------------------------------------------------------------------------------
# Define remote image or video URL
source = "https://ultralytics.com/images/bus.jpg"
# --------------------------------------------------------------------------------------------
# Open an image using PIL
source = Image.open("path/to/image.jpg")
# --------------------------------------------------------------------------------------------
# Read an image using OpenCV
source = cv2.imread("path/to/image.jpg")
# --------------------------------------------------------------------------------------------
# Create a random numpy array of HWC shape (640, 640, 3) with values in range [0, 255] and type uint8
# (the upper bound of np.random.randint is exclusive, so high=256 is needed to include 255)
source = np.random.randint(low=0, high=256, size=(640, 640, 3), dtype="uint8")
# --------------------------------------------------------------------------------------------
# Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
source = torch.rand(1, 3, 640, 640, dtype=torch.float32)
# --------------------------------------------------------------------------------------------
# Define a path to a CSV file with images, URLs, videos and directories
source = "path/to/file.csv"
# --------------------------------------------------------------------------------------------
# Define path to video file
source = "path/to/video.mp4"
# Run inference on the source
results = model(source, stream=True)
# --------------------------------------------------------------------------------------------
# Define path to directory containing images and videos for inference
source = "path/to/dir"
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
# --------------------------------------------------------------------------------------------
# Define a glob search for all JPG files in a directory
source = "path/to/dir/*.jpg"
# OR define a recursive glob search for all JPG files including subdirectories
source = "path/to/dir/**/*.jpg"
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
# --------------------------------------------------------------------------------------------
# Define source as YouTube video URL
source = "https://youtu.be/LNwODJXcvt4"
# --------------------------------------------------------------------------------------------
# Single stream with batch-size 1 inference
source = "rtsp://example.com/media.mp4" # RTSP, RTMP, TCP or IP streaming address
# Multiple streams with batched inference (i.e. batch-size 8 for 8 streams)
source = "path/to/list.streams" # *.streams text file with one streaming address per row
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
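The multi-stream case above expects a *.streams text file with one streaming address per row. A minimal sketch of producing such a file with the standard library follows. The helper name `write_streams_file` and the example addresses are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path


def write_streams_file(urls, path):
    """Write one streaming address per row and return the resulting path."""
    # Drop empty entries and surrounding whitespace so every row is a usable address
    cleaned = [u.strip() for u in urls if u.strip()]
    path = Path(path)
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return path


# Hypothetical stream addresses, for illustration only
urls = ["rtsp://example.com/media1.mp4", "rtsp://example.com/media2.mp4"]
streams_path = write_streams_file(urls, Path(tempfile.gettempdir()) / "list.streams")

# Read the file back to confirm the one-address-per-row layout
lines = streams_path.read_text(encoding="utf-8").splitlines()
```

The resulting path can then be passed as `source` in the same way as "path/to/list.streams" above.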
Inference arguments:
Argument | Type | Default | Description |
source | str | "ultralytics/assets" | Specifies the data source for inference. Can be an image path, video file, directory, URL, or device ID for live feeds. A wide range of formats and sources is supported. |
conf | float | 0.25 | Sets the minimum confidence threshold for detections. Objects detected with confidence below this threshold are discarded. Raising this value can help reduce false positives. |
iou | float | 0.7 | Intersection over Union (IoU) threshold for Non-Maximum Suppression (NMS). Lower values result in fewer detections by eliminating overlapping boxes, which is useful for reducing duplicates. |
imgsz | int or tuple | 640 | Defines the image size for inference. Can be a single integer (e.g. 640) for square resizing or a (height, width) tuple. |
half | bool | False | Enables half-precision (FP16) inference, which can speed up model inference on supported GPUs with minimal impact on accuracy. |
device | str | None | Specifies the device for inference (e.g. "cpu" or "cuda:0"). |
max_det | int | 300 | Maximum number of detections allowed per image. Limits the total number of objects the model can detect in a single inference, preventing excessive outputs in dense scenes. |
vid_stride | int | 1 | Frame stride for video inputs. Allows skipping frames in videos to speed up processing at the cost of temporal resolution. A value of 1 processes every frame; higher values skip frames. |
stream_buffer | bool | False | Determines whether all frames are buffered when processing video streams (True) or only the most recent frame is returned (False). |
visualize | bool | False | Activates visualization of model features during inference, providing insight into what the model is "seeing". Useful for debugging and model interpretation. |
augment | bool | False | Enables test-time augmentation (TTA) for predictions, potentially improving detection robustness at the cost of inference speed. |
agnostic_nms | bool | False | Enables class-agnostic Non-Maximum Suppression (NMS), which merges overlapping boxes of different classes. Useful in multi-class detection scenarios where class overlap is common. |
classes | list[int] | None | Filters predictions to a set of class IDs. Only detections belonging to the specified classes are returned. Useful for focusing on relevant objects in multi-class detection tasks. |
retina_masks | bool | False | Uses high-resolution segmentation masks if available in the model. This can improve mask quality for segmentation tasks, providing finer detail. |
embed | list[int] | None | Specifies the layers from which to extract feature vectors or embeddings. Useful for downstream tasks like clustering or similarity search. |
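To make the interaction between the confidence threshold, the IoU threshold, class filtering, class-agnostic NMS, and the detection cap more concrete, here is a minimal pure-Python sketch of greedy NMS. It is an illustration under stated assumptions, not the Ultralytics implementation: the function names, the (x1, y1, x2, y2) box format, and the sample detections are all made up for this example.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf=0.25, iou_thres=0.7, max_det=300,
                      classes=None, agnostic=False):
    """Greedy NMS sketch over a list of (box, score, class_id) tuples."""
    # Step 1: drop low-confidence detections and, optionally, unwanted classes
    candidates = [d for d in dets
                  if d[1] >= conf and (classes is None or d[2] in classes)]
    # Step 2: visit candidates from highest to lowest score
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, cls in candidates:
        if len(kept) >= max_det:
            break
        # Step 3: keep a box only if it does not overlap a kept box too much;
        # without agnostic mode, boxes of different classes never suppress each other
        if all(box_iou(box, k[0]) <= iou_thres or (not agnostic and k[2] != cls)
               for k in kept):
            kept.append((box, score, cls))
    return kept


# Made-up detections: two overlapping class-0 boxes, one class-1 box, one weak box
dets = [
    ((0, 0, 10, 10), 0.9, 0),
    ((1, 1, 11, 11), 0.8, 0),
    ((1, 1, 11, 11), 0.7, 1),
    ((50, 50, 60, 60), 0.1, 0),
]
strict = filter_detections(dets, iou_thres=0.5)
agnostic = filter_detections(dets, iou_thres=0.5, agnostic=True)
loose = filter_detections(dets, iou_thres=0.7)
```

The two overlapping boxes have an IoU of 81/119 (about 0.68), so lowering the IoU threshold from 0.7 to 0.5 removes the duplicate, and enabling the class-agnostic mode additionally removes the overlapping box of the other class.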
Visualization arguments:
Argument | Type | Default | Description |
show | bool | False | If True, displays the annotated images or videos in a window. |
save | bool | False | Enables saving of the annotated images or videos to file. Useful for documentation, further analysis, or sharing results. |
save_frames | bool | False | When processing videos, saves individual frames as images. Useful for extracting specific frames or for detailed frame-by-frame analysis. |
save_txt | bool | False | Saves detection results in a text file. |
save_conf | bool | False | Includes confidence scores in the saved text files, giving more detail for post-processing and analysis. |
save_crop | bool | False | Saves cropped images of detections. Useful for dataset augmentation, analysis, or creating focused datasets for specific objects. |
show_labels | bool | True | Displays labels for each detection in the visual output. |
show_conf | bool | True | Displays the confidence score for each detection alongside the label. |
show_boxes | bool | True | Draws bounding boxes around detected objects. Essential for identifying and locating objects in images or video frames. |
line_width | None or int | None | Specifies the line width of bounding boxes. If None, the line width is adjusted automatically based on the image size. |
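The inference and visualization arguments described above are passed as keyword arguments at prediction time. The sketch below shows one way to keep a reusable set of them in a dictionary and override individual values per call. The helper names `build_kwargs` and `run_prediction` are hypothetical, and the chosen values are illustrative rather than recommended settings; the argument names themselves (`conf`, `iou`, `imgsz`, `max_det`, `classes`, `save`, `save_txt`, `save_conf`, `line_width`) belong to the Ultralytics predict API.

```python
# Illustrative values only; these are not recommended settings.
DEFAULT_KWARGS = {
    "conf": 0.5,         # minimum confidence threshold
    "iou": 0.5,          # IoU threshold for NMS
    "imgsz": 640,        # inference image size
    "max_det": 100,      # cap on detections per image
    "classes": [0, 2],   # keep only these class IDs
    "save": True,        # write annotated images or videos to disk
    "save_txt": True,    # write detections to text files
    "save_conf": True,   # include confidence scores in the text files
    "line_width": 2,     # bounding box line width
}


def build_kwargs(**overrides):
    """Return a copy of DEFAULT_KWARGS with per-call overrides applied."""
    return {**DEFAULT_KWARGS, **overrides}


def run_prediction(source, weights="yolov8n.pt", **overrides):
    """Hypothetical helper: load a model and predict with the merged arguments."""
    # Imported lazily so this module can be loaded without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.predict(source, **build_kwargs(**overrides))


# Override two values for a single call without touching the defaults
kwargs = build_kwargs(conf=0.6, save=False)
```

With the package installed, a call such as `run_prediction("path/to/image.jpg", conf=0.6)` would return a list of Results objects.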
Working with Results
Attribute | Type | Description |
orig_img | numpy.ndarray | The original image as a numpy array. |
orig_shape | tuple | The original image shape in (height, width) format. |
boxes | Boxes, optional | A Boxes object containing the detection bounding boxes. |
masks | Masks, optional | A Masks object containing the detection masks. |
probs | Probs, optional | A Probs object containing the probability of each class for classification tasks. |
keypoints | Keypoints, optional | A Keypoints object containing the detected keypoints for each object. |
obb | OBB, optional | An OBB object containing oriented bounding boxes. |
speed | dict | A dictionary of preprocess, inference, and postprocess speeds in milliseconds per image. |
names | dict | A dictionary of class names. |
path | str | The path to the image file. |
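As a rough illustration of how these attributes fit together, the sketch below summarizes a single result: where it came from, the original image size, which output types are populated, and the total time per image. It assumes the attribute names `path`, `orig_shape`, `speed`, `names`, `boxes`, `masks`, `probs`, `keypoints`, and `obb`, and that outputs a model does not produce are `None`. The helper `describe_result` is hypothetical, and a stand-in object is used so the example runs without a model.

```python
from types import SimpleNamespace

# Output attributes that may or may not be populated, depending on the task
OUTPUT_ATTRS = ("boxes", "masks", "probs", "keypoints", "obb")


def describe_result(result):
    """Summarize one result object using only its documented attributes."""
    height, width = result.orig_shape  # (height, width) of the original image
    speed = result.speed or {}
    return {
        "path": result.path,
        "height": height,
        "width": width,
        # Names of the outputs this result actually carries
        "outputs": [name for name in OUTPUT_ATTRS
                    if getattr(result, name, None) is not None],
        # Preprocess + inference + postprocess time, in milliseconds
        "total_ms": sum(v for v in speed.values() if v is not None),
        "num_classes": len(result.names),
    }


# Stand-in for a detection result; a real Results object would come from model(...)
fake_result = SimpleNamespace(
    path="im1.jpg",
    orig_shape=(480, 640),
    boxes=object(),
    masks=None,
    probs=None,
    keypoints=None,
    obb=None,
    speed={"preprocess": 1.5, "inference": 10.0, "postprocess": 0.5},
    names={0: "person", 1: "bicycle"},
)
summary = describe_result(fake_result)
```

In real use, the same function could be applied to each item yielded by `model(source, stream=True)`.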