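Preface
In practice, when we run inference with YOLO V8, we usually process one image at a time. To handle several images, we can loop over them and run inference image by image.
For single-image inference, note that the image height and width must be multiples of 32; otherwise inference may fail. The example below shows single-image inference with PyTorch and the Ultralytics library: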
import torch
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
source = torch.rand(1, 3, 640, 640, dtype=torch.float32)
# Run inference on the source
results = model(source) # list of Results objects
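The multiple-of-32 requirement can be checked or enforced before a tensor is built. The following is a minimal sketch in plain Python, assuming a stride of 32 as stated above; the helper names `round_up_to_stride` and `padded_shape` are hypothetical and not part of Ultralytics.

```python
def round_up_to_stride(size: int, stride: int = 32) -> int:
    """Return the smallest multiple of `stride` that is >= `size`."""
    if size < 1 or stride < 1:
        raise ValueError("size and stride must be positive integers")
    # Integer ceiling division avoids floating-point rounding issues
    return (size + stride - 1) // stride * stride


def padded_shape(height: int, width: int, stride: int = 32) -> tuple:
    """Return the (height, width) an image would need to be stride-aligned."""
    return round_up_to_stride(height, stride), round_up_to_stride(width, stride)


print(padded_shape(640, 640))  # already aligned: (640, 640)
print(padded_shape(500, 333))  # rounded up: (512, 352)
```

How the image is brought to that shape (resizing or padding) is a separate choice that this sketch does not make.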
For batch inference, the image size must likewise be a multiple of 32. The example below shows batch inference on several images with PyTorch and the Ultralytics library:
import torch
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Create a random torch tensor of BCHW shape (4, 3, 640, 640) with values in range [0, 1] and type float32
source = torch.rand(4, 3, 640, 640, dtype=torch.float32)
# Run inference on the source
results = model(source) # list of Results objects
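Note that in batch inference, although several images are passed in a single call, they are still processed through a loop internally. Later articles will introduce a more efficient way to run batch inference, for faster inference and better performance.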
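As a rough illustration of loop-based handling, the sketch below groups a list of image paths into fixed-size chunks in plain Python. The helper name `iter_batches` and the file names are hypothetical, and the model call is deliberately left out.

```python
from typing import Iterator, List, Sequence


def iter_batches(paths: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` paths."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(paths), batch_size):
        yield list(paths[start:start + batch_size])


# Hypothetical file names, used only for illustration
image_paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]

for batch in iter_batches(image_paths, batch_size=2):
    # Each chunk would be handed to the model in one call
    print(batch)
```

With `batch_size=1` this reduces to the image-by-image loop; larger values only pay off if the model really processes each chunk in a single forward pass.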
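Below, we show how to extract the detection inference code so that it can be run on its own.
1. YOLO V8-Detection Prediction
With the official API, two lines of code are enough to run object detection.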
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt')
# Run inference on a single image and save the annotated result
model.predict("./ultralytics/assets/bus.jpg", imgsz=640, save=True, device=0)
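Because save=True is set, the model saves the annotated result image.
Now that prediction works, we need to write the model loading, preprocessing, and postprocessing for YOLO V8-Detection ourselves, so that we can customize each step. We start with model loading.
2. YOLO V8-Detection Model Loading
Original loading method
Source file: ultralytics/engine/model.py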
def _load(self, weights: str, task=None):
    """
    Initializes a new model and infers the task type from the model head.

    Args:
        weights (str): model checkpoint to be loaded
        task (str | None): model task
    """
    suffix = Path(weights).suffix
    if suffix == ".pt":
        self.model, self.ckpt = attempt_load_one_weight(weights)
        self.task = self.model.args["task"]
        self.overrides = self.model.args = self._reset_ckpt_args(self.model.args)
        self.ckpt_path = self.model.pt_path
    else:
        weights = checks.check_file(weights)
        self.model, self.ckpt = weights, None
        self.task = task or guess_model_task(weights)
        self.ckpt_path = weights
    self.overrides["model"] = weights
    self.overrides["task"] = self.task
Source file: ultralytics/nn/tasks.py
def attempt_load_one_weight(weight, device=None, inplace=True, fuse=False):
    """Loads a single model weights."""
    ckpt, weight = torch_safe_load(weight)  # load ckpt
    args = {**DEFAULT_CFG_DICT, **(ckpt.get("train_args", {}))}  # combine model and default args, preferring model args
    model = (ckpt.get("ema") or ckpt["model"]).to(device).float()  # FP32 model

    # Model compatibility updates
    model.args = {k: v for k, v in args.items() if k in DEFAULT_CFG_KEYS}  # attach args to model
    model.pt_path = weight  # attach *.pt file path to model
    model.task = guess_model_task(model)
    if not hasattr(model, "stride"):
        model.stride = torch.tensor([32.0])

    model = model.fuse().eval() if fuse and hasattr(model, "fuse") else model.eval()  # model in eval mode

    # Module updates
    for m in model.modules():
        t = type(m)
        if t in (nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU, Detect, Segment, Pose, OBB):
            m.inplace = inplace
        elif t is nn.Upsample and not hasattr(m, "recompute_scale_factor"):
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility

    # Return model and ckpt
    return model, ckpt
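The two snippets above are the original way of loading the model. This path loads not only the model weights but also a set of related configuration, which is not what we want. We only want to load the model weights and nothing else, so we use the approach below instead.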
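The configuration handling in `attempt_load_one_weight` comes down to two dictionary operations: defaults are merged with the checkpoint's `train_args` (later keys win), and the result is filtered to known keys. The sketch below reproduces only that pattern in plain Python; the function `merge_args` and all values are hypothetical stand-ins, not the real Ultralytics defaults.

```python
def merge_args(defaults: dict, ckpt: dict) -> dict:
    """Merge defaults with checkpoint args, then keep only known keys."""
    # Later entries override earlier ones, so checkpoint args take priority
    merged = {**defaults, **ckpt.get("train_args", {})}
    # Drop any key that is not part of the known default configuration
    return {k: v for k, v in merged.items() if k in defaults}


# Hypothetical defaults and checkpoint contents, for illustration only
default_cfg = {"task": "detect", "imgsz": 640, "conf": 0.25}
checkpoint = {"train_args": {"imgsz": 1280, "custom_flag": True}}

print(merge_args(default_cfg, checkpoint))
# {'task': 'detect', 'imgsz': 1280, 'conf': 0.25}
```

The checkpoint value for `imgsz` overrides the default, while the unknown key `custom_flag` is discarded.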
Modified loading method
Source file: ultralytics/nn/autobackend.py
@torch.no_grad()
def __init__(
    self,
    weights="yolov8n.pt",
    device=torch.device("cpu"),
    dnn=False,
    data=None,
    fp16=False,
    fuse=True,
    verbose=True,
):
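Parameters
weights: Path to the model weights file. Defaults to "yolov8n.pt".
device (torch.device): Device to run the model on. Defaults to CPU.
dnn: Use the OpenCV DNN module for ONNX inference. Defaults to False.
data: Path to an additional data.yaml file containing class names. Optional.
fp16: Enable half-precision inference. Supported only on specific backends. Defaults to False.
fuse: Fuse Conv2D + BatchNorm layers for optimization. Defaults to True.
verbose: Enable verbose logging. Defaults to True.
Loading .pt weights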
elif pt:  # PyTorch
    from ultralytics.nn.tasks import attempt_load_weights

    model = attempt_load_weights(
        weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse
    )
    if hasattr(model, "kpt_shape"):
        kpt_shape = model.kpt_shape  # pose-only
    stride = max(int(model.stride.max()), 32)  # model stride
    names = model.module.names if hasattr(model, "module") else model.names  # get class names
    model.half() if fp16 else model.float()
    self.model = model  # explicitly assign for to(), cpu(), cuda(), half()
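Final code
import torch
from ultralytics.nn.autobackend import AutoBackend

weights = 'yolov8n.pt'
# Fall back to the CPU when no CUDA device is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = AutoBackend(weights, device=device)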
3. YOLO V8-Detection Preprocessing
Original implementation
Source file: ultralytics/engine/predictor.py
from ultralytics.data.augment import LetterBox

@smart_inference_mode()
def stream_inference(self, source=None, model=None, *args, **kwargs):
    """Streams real-time inference on camera feed and saves results to file."""
    .
    .
    .
    # Preprocess
    with profilers[0]:
        im = self.preprocess(im0s)
    .
    .
    .

def pre_transform(self, im):
    """
    Pre-transform input image before inference.

    Args:
        im (List(np.ndarray)): (N, 3, h, w) for tensor, [(h, w, 3) x N] for list.

    Returns:
        (list): A list of transformed images.
    """
    same_shapes = all(x.shape == im[0].shape for x in im)
    letterbox = LetterBox(self.imgsz, auto=same_shapes and self.model.pt, stride=self.model.stride)
    return [