YOLO in Practice

追光少年3322

Last modified: 2024-09-01 18:26:49

Tags: python, YOLO, artificial intelligence, computer vision, object detection, image processing

First published: 2024-09-01 11:19:25

Article link: https://blog.csdn.net/weixin_52019286/article/details/141716171

1. Environment setup

Reference video

PyTorch installation details

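Installing PyTorch: within a single environment, use pip whenever possible. Fall back to conda only when there is a real problem, for example when no prebuilt package exists for your system. Do not switch back and forth between the two.
Whether to install CUDA: for training and simple inference you do not need a separate CUDA installation, just install PyTorch (the GPU builds ship the CUDA runtime they need, although an NVIDIA driver is still required). If you need deployment, for example exporting a TensorRT model, install CUDA as well.
Note on installing PyTorch: use the install command from the official website; otherwise you may end up with the CPU-only build.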
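
A quick way to check which build ended up in the environment is to ask PyTorch itself. The sketch below is only an illustration: the helper name describe_torch_build is made up for this example, and it relies on torch.version.cuda being None when the build has no CUDA support.

```python
def describe_torch_build():
    """Return a one-line description of the PyTorch build in this environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"

    # torch.version.cuda is None when the build was compiled without CUDA
    cuda_build = torch.version.cuda
    if cuda_build is None:
        return f"PyTorch {torch.__version__} (no CUDA support in this build)"

    # A CUDA build can still report False here if no usable GPU/driver is present
    gpu_ok = torch.cuda.is_available()
    return f"PyTorch {torch.__version__} built with CUDA {cuda_build}, GPU available: {gpu_ok}"


print(describe_torch_build())
```

If the output mentions no CUDA support, reinstalling with the command from the official website is the likely fix.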

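Using the Windows terminal correctly:

Use cmd rather than PowerShell (where activating the environment often fails).
The same applies to the terminals inside other tools such as PyCharm and VS Code!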

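Configuring cmd as the VS Code terminal
[Figure: VS Code terminal settings] In VS Code the default terminal is controlled by the terminal.integrated.defaultProfile.windows setting, which can be set to "Command Prompt".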

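Possible problems

The Arial.ttf font file cannot be downloaded
- Download it manually and put it in the expected location; on Windows the directory is ~/AppData/Roaming/Ultralytics
"The paging file is too small for this operation to complete"
- Set workers to 1 in the training parameters
- Increase virtual memory: give the drive where the environment is installed a larger paging file
'Upsample' object has no attribute 'recompute_scale_factor'
- Caused by a PyTorch version that is too new; one option is to downgrade (1.8.2 is a version that does not raise this error)
- If you would rather not downgrade, you can edit the PyTorch source: open upsampling.py, the file named in the traceback, and
  remove the recompute_scale_factor argument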

Setting the computer's virtual memory:
[Figure: virtual memory settings]

2. Dataset preparation

Data format conversion

voc2yolo
yolo2voc
json to yolo
Hint: you can load the file with data_df = pd.read_json(json_path) and then use pandas syntax to extract bbox and class_id
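
The core of voc2yolo and yolo2voc is a box-coordinate conversion. As a minimal sketch, assuming VOC boxes are pixel [x_min, y_min, x_max, y_max] and YOLO boxes are normalized [x_center, y_center, width, height] (the function names below are illustrative, not taken from any particular script):

```python
def voc_to_yolo(box, img_w, img_h):
    """Pixel (x_min, y_min, x_max, y_max) -> normalized (x_center, y_center, w, h)."""
    x_min, y_min, x_max, y_max = box
    return (
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    )


def yolo_to_voc(box, img_w, img_h):
    """Normalized (x_center, y_center, w, h) -> pixel (x_min, y_min, x_max, y_max)."""
    xc, yc, w, h = box
    return (
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )


def yolo_label_line(class_id, box, img_w, img_h):
    """Format one line of a YOLO .txt label file from a VOC-style pixel box."""
    xc, yc, w, h = voc_to_yolo(box, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


print(yolo_label_line(0, (100, 200, 300, 400), 800, 800))
```

Reading the XML or JSON annotation files and mapping class names to class_id values is left out here, since that part depends on the annotation tool used.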

Splitting the dataset
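
One simple way to do the split is to shuffle the image names with a fixed seed and slice the list. This is a sketch under assumed defaults: the 8:1:1 ratio, the seed, and the split_dataset name are choices made for this example only.

```python
import random


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable

    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],  # whatever remains goes to test
    }


names = [f"img_{i:03d}.jpg" for i in range(10)]
splits = split_dataset(names)
print({name: len(files) for name, files in splits.items()})
```

After splitting, each image and its label file with the same stem would be copied into the matching train/val/test folders.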

3. Using the models

Loading a YOLOv5 model (local)

import torch

# Load custom weights using the local YOLOv5 repository
model = torch.hub.load('.', 'custom', path=r'best.pt', source='local')

. : the root directory of the local yolov5 repository
source='local' : load the model from the local repository instead of from GitHub

Loading a YOLOv8 model

from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO("yolov8n.pt")  # path to the weights file

If yolov8n.pt does not exist locally, it is downloaded from the internet.

4. Processing results

.xyxy (an attribute, not a method): box coordinates as [x_min, y_min, x_max, y_max]

YOLOv5 (reference blog)

import torch
model = torch.hub.load('.', 'yolov5s', source='local')
im = r'data\images\bus.jpg'  # file, Path, PIL.Image, OpenCV, nparray, list
results = model(im)  # inference
# results.crop()  # or .show(), .save(), .crop(), .pandas(), etc.

results is a Detections object; its main methods and attributes are:

# results.crop()  # or .show(), .save(), .crop(), .pandas(), etc.
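results.xyxy[0]   # detections for the first image, as a Tensor
results.pandas().xyxy[0]  # the same detections as a pandas DataFrame
results.print()  # print a summary to the console
frame = results.render()[0]  # draw the model output back onto the frame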

Also: to customize the output and restrict it to certain classes

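import cv2 as cv

# Assumption: draw on a copy of the input loaded with OpenCV (im is the image path used above)
image_temp = cv.imread(im)

# Convert the detections to a pandas DataFrame
detections_df = results.pandas().xyxy[0]

# Keep only the detections labelled 'person'
person_detections = detections_df[detections_df['name'] == 'person'].to_numpy()

# Iterate over the detected people
for detection in person_detections:
    label_name = detection[6]  # column 6 holds the label name
    bbox = [int(v) for v in detection[:4]]  # box coordinates as plain Python ints

    # Draw the bounding box on the image
    cv.rectangle(image_temp, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (0, 0, 255), 3)

    # Put the label name near the top-left corner of the box
    # (FONT_ITALIC is only a modifier flag, so a regular Hershey font is used here)
    cv.putText(image_temp, label_name, (bbox[0] - 10, bbox[1] - 10), cv.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)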

YOLOv8

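Recommendation: install ultralytics from source by running pip install -e . in the root of the ultralytics repository. The -e flag means a local editable install: instead of being copied into site-packages, the package is linked to the source root.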

from ultralytics import YOLO
model=YOLO("yolov8n-seg.pt")
results=model(r"images\bus.jpg")

Here results is a list, with one entry per input image.

for idx, result in enumerate(results):
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs

    # Skip images with no detections
    if len(boxes.cls) == 0:
        continue

    xyxy = boxes.xyxy.data.cpu().numpy().round()
    cls = boxes.cls.data.cpu().numpy().round()
    conf = boxes.conf.data.cpu().numpy()
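
Once xyxy, cls and conf are available as arrays, restricting the output to certain classes is a plain filtering step. The helper below is a hypothetical sketch (filter_detections is not part of ultralytics) that accepts either the NumPy arrays from the loop above or ordinary Python lists:

```python
def filter_detections(xyxy, cls, conf, keep_classes, min_conf=0.25):
    """Keep boxes whose class id is in keep_classes and whose score is >= min_conf."""
    kept = []
    for box, class_id, score in zip(xyxy, cls, conf):
        if int(class_id) in keep_classes and score >= min_conf:
            kept.append({
                "box": [int(v) for v in box],  # x_min, y_min, x_max, y_max
                "cls": int(class_id),
                "conf": float(score),
            })
    return kept


# Stand-in values; in practice pass the xyxy, cls and conf arrays from the loop above
example_xyxy = [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 50.0]]
example_cls = [0.0, 2.0]
example_conf = [0.9, 0.8]
print(filter_detections(example_xyxy, example_cls, example_conf, keep_classes={0}))
```

The resulting dictionaries can then be drawn with OpenCV in the same way as in the YOLOv5 example.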

Also: for customized output you can use the patched_yolo_infer library (I have only verified it with YOLOv8; I have not tested it with other YOLO versions).

import cv2
from ultralytics import YOLO
from patched_yolo_infer import visualize_results_usual_yolo_inference

# segment=True below needs a segmentation checkpoint, so the -seg weights are used here
model = YOLO("yolov8n-seg.pt")
# Load the image
img_path = 'images/bus.jpg'
img = cv2.imread(img_path)

img = visualize_results_usual_yolo_inference(
    img,
    model,
    conf=0.4,
    iou=0.7,
    show_classes_list=[0],  # only show these class ids
    segment=True,           # whether to perform instance segmentation (default False)
    thickness=5,
    show_boxes=False,
    fill_mask=False,
    alpha=0.7,              # transparency of filled masks (default 0.3)
    show_class=False,
    delta_colors=25,        # random seed offset for color variation (default 0)
    inference_extra_args={'retina_masks': True},  # increase the accuracy of the contours
    return_image_array=True
)

Appendix: Pyme component

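import cv2 as cv
from PIL import ImageTk, Image
# Fun and uiName are assumed to be provided by the surrounding Pyme project
class VideoPlayer():
    def __init__(self,elementName,video_source=0):
        super().__init__()
        self.video_source = video_source
        self.vid = cv.VideoCapture(self.video_source)
        self.label= Fun.GetElement(uiName,elementName)  # for plain tkinter, pass in the Label object directly instead
        self.delay = 15  # delay between frame updates, in milliseconds
        self.paused = False  # paused-state flag
        self.update()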
        
    def update(self):
        # Only read frames while the capture is open
        if self.vid.isOpened():
            if not self.paused:
                ret, frame = self.vid.read()
                if ret:
                    frame = self.process_frame(frame)
                    # Convert to a PIL Image, then to a Tk-compatible PhotoImage
                    self.photo = ImageTk.PhotoImage(image=Image.fromarray(frame))
                    self.label.config(image=self.photo)
                else:
                    # No frame was read: release the capture and stop scheduling updates
                    self.vid.release()
                    return
        self.label.after(self.delay, self.update)
        
    def process_frame(self, frame):
        #convert BGR to RGB
        frame = cv.cvtColor(frame, cv.COLOR_BGR2RGB)
        return frame
    
    def pause(self):
        self.paused = not self.paused
        return self.paused
    
    def stop(self):
        self.vid.release()