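Training YOLOv8 on a pedestrian detection dataset: building a YOLOv8-based system to detect pedestrians on the road (environment setup, data preparation, model training, and more)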

Latest recommended article published on 2025-05-16 11:33:43

QQ_1309399183

Tags: YOLO, object detection, artificial intelligence

Article link: https://blog.csdn.net/QQ_1309399183/article/details/145222607

Pedestrian dataset / crowd detection dataset / dense-crowd dataset, with YOLO and VOC annotations

Pedestrian detection dataset:

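Images: 3,687
XML annotation files (VOC format): 3,687
TXT annotation files (YOLO format), already split into training and validation sets:
Training set: 2,950
Validation set: 737
Classes: 1
names:
person
The dataset can be used directly for training with any YOLO version.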
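The 2,950 / 737 split corresponds to roughly 80% / 20% of the 3,687 images. The post does not say how the split was made, so the following is only a minimal sketch of how a comparable split could be produced. The function name `split_stems`, the seed, and the placeholder file stems are assumptions for illustration.

```python
import random

def split_stems(stems, train_ratio=0.8, seed=0):
    """Shuffle file stems deterministically and split them into train/val lists."""
    stems = sorted(stems)               # sort first so the result does not depend on input order
    random.Random(seed).shuffle(stems)  # seeded shuffle keeps the split reproducible
    n_train = round(len(stems) * train_ratio)
    return stems[:n_train], stems[n_train:]

# Hypothetical stems standing in for the 3,687 image names
train, val = split_stems([f"img_{i:04d}" for i in range(3687)])
print(len(train), len(val))  # 2950 737
```

Splitting by file stem (name without extension) keeps each image together with its label file, since both share the same stem.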

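Building a YOLOv8-based pedestrian detection system on this dataset involves several steps: environment setup, data preparation, model training, evaluation, and inference deployment. A detailed guide follows.

The code below is for reference only.

### 1. Environment setup

Make sure the required libraries and tools are installed in your development environment: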

pip install torch torchvision ultralytics pyqt5 opencv-python pandas

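### 2. Data preparation

#### 2.1 Dataset structure

The dataset contains 3,687 images with XML (VOC) annotation files. It is already split into training and validation sets, and each image has a YOLO-format .txt label file with the same name. Ultralytics finds labels by replacing `images` with `labels` in each image path, so the label folder should be named `labels` and mirror the `images` folder:

datasets/
└── pedestrian_detection/
    ├── images/
    │   ├── train/
    │   └── val/
    └── labels/
        ├── train/
        └── val/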
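Before training, it can help to confirm that every image has a matching label file. The sketch below is one way to do that, assuming labels sit in a `labels/` directory parallel to `images/` (where Ultralytics looks for them). The function name `find_unpaired`, the `label_dirname` parameter, and the list of image extensions are assumptions for illustration.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed image types; extend as needed

def find_unpaired(root, split, label_dirname="labels"):
    """Return (images without labels, labels without images) for one split."""
    root = Path(root)
    image_dir = root / "images" / split
    label_dir = root / label_dirname / split
    # Compare by stem, because an image and its label share the file name without extension
    images = {p.stem for p in image_dir.glob("*") if p.suffix.lower() in IMAGE_EXTS}
    labels = {p.stem for p in label_dir.glob("*.txt")}
    return sorted(images - labels), sorted(labels - images)
```

For example, `find_unpaired("datasets/pedestrian_detection", "train")` should return two empty lists when the training split is complete. Images without a label file are treated as background by the trainer, so a non-empty first list is worth checking rather than necessarily an error.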

#### 2.2 Class mapping

The dataset has a single class, person. This guide assumes the YOLO-format label files are already correct and use class ID 0 for it.
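That assumption is easy to check. Each line of a YOLO label file has five fields: a class ID followed by the normalized center x, center y, width, and height. The following sketch validates a single line; the function name `check_label_line` and the exact wording of the messages are assumptions for illustration.

```python
def check_label_line(line, num_classes=1):
    """Return a list of problems found in one YOLO label line (empty if valid)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        coords = [float(v) for v in parts[1:]]
    except ValueError:
        return ["non-numeric field"]
    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class ID {class_id} out of range")
    if any(not 0.0 <= v <= 1.0 for v in coords):
        problems.append("coordinate outside [0, 1]")
    return problems

print(check_label_line("0 0.5 0.5 0.2 0.4"))  # valid line for a one-class dataset
print(check_label_line("1 0.5 0.5 0.2 0.4"))  # class ID 1 does not exist when nc is 1
```

Running this over every line of every file under the label folders would catch leftover class IDs from a multi-class source dataset before they cause training errors.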

#### 2.3 Converting VOC to YOLO format

If your annotations are only available in VOC format, a Python script can convert them to YOLO format:

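import os
import xml.etree.ElementTree as ET

# Class names and their corresponding YOLO class IDs
class_names = ['person']
class_ids = {name: idx for idx, name in enumerate(class_names)}

def convert_voc_to_yolo(xml_file, output_dir):
    """Convert one VOC XML annotation file into a YOLO-format .txt label file."""
    root = ET.parse(xml_file).getroot()

    img_width = int(root.find('size/width').text)
    img_height = int(root.find('size/height').text)

    # Name the label after the XML file so it matches the image stem,
    # even if the <filename> tag is missing or the name contains extra dots.
    stem = os.path.splitext(os.path.basename(xml_file))[0]
    os.makedirs(output_dir, exist_ok=True)

    with open(os.path.join(output_dir, stem + '.txt'), 'w') as f:
        for obj in root.findall('object'):
            class_name = obj.find('name').text
            if class_name not in class_ids:
                continue  # Skip classes that are not being trained
            bbox = obj.find('bndbox')
            x_min = float(bbox.find('xmin').text)
            y_min = float(bbox.find('ymin').text)
            x_max = float(bbox.find('xmax').text)
            y_max = float(bbox.find('ymax').text)

            # YOLO stores the box center and size, normalized by the image dimensions
            x_center = ((x_min + x_max) / 2) / img_width
            y_center = ((y_min + y_max) / 2) / img_height
            bbox_width = (x_max - x_min) / img_width
            bbox_height = (y_max - y_min) / img_height

            f.write(f"{class_ids[class_name]} {x_center:.6f} {y_center:.6f} {bbox_width:.6f} {bbox_height:.6f}\n")

# Example usage: replace the placeholder paths with your own.
# Ultralytics expects labels in a `labels/` folder next to `images/`.
# This loop writes every label into the training folder; run it again
# with the validation annotations and `labels/val` for the validation set.
voc_dir = 'path/to/voc/annotations'
for filename in os.listdir(voc_dir):
    if filename.endswith('.xml'):
        convert_voc_to_yolo(os.path.join(voc_dir, filename), 'datasets/pedestrian_detection/labels/train')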
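To make the coordinate arithmetic concrete, here is a small worked example with the conversion and its inverse as standalone functions. The helper names `voc_box_to_yolo` and `yolo_box_to_voc` and the 640x480 image are assumptions for illustration, not part of the script above.

```python
def voc_box_to_yolo(x_min, y_min, x_max, y_max, img_width, img_height):
    """Convert pixel corner coordinates to normalized (x_center, y_center, width, height)."""
    x_center = (x_min + x_max) / 2 / img_width
    y_center = (y_min + y_max) / 2 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height
    return x_center, y_center, width, height

def yolo_box_to_voc(x_center, y_center, width, height, img_width, img_height):
    """Inverse conversion, useful for checking a label against its original box."""
    x_min = (x_center - width / 2) * img_width
    y_min = (y_center - height / 2) * img_height
    x_max = (x_center + width / 2) * img_width
    y_max = (y_center + height / 2) * img_height
    return x_min, y_min, x_max, y_max

# Hypothetical 640x480 image with a box from (100, 120) to (300, 360)
yolo_box = voc_box_to_yolo(100, 120, 300, 360, 640, 480)
print(yolo_box)                             # (0.3125, 0.5, 0.3125, 0.5)
print(yolo_box_to_voc(*yolo_box, 640, 480)) # (100.0, 120.0, 300.0, 360.0)
```

The round trip recovering the original corners is a quick sanity check that the width and height were divided by the right image dimension.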

### 3. File contents

#### 3.1 Config.py

The configuration file defines the dataset path, model path, and other settings.

# Config.py
DATASET_PATH = 'datasets/pedestrian_detection/'
MODEL_PATH = 'runs/detect/train/weights/best.pt'
IMG_SIZE = 640
BATCH_SIZE = 16
EPOCHS = 50
CONF_THRESHOLD = 0.5
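The scripts in this guide read their settings with `os.getenv`, while `Config.py` defines plain constants. One possible way to keep a single source of truth is to let the config values fall back to defaults when no environment variable is set. This is a sketch of that idea; the helper name `env_setting` is an assumption and not part of the original project.

```python
import os

def env_setting(name, default, cast=str):
    """Return environment variable `name` converted with `cast`, or `default` if it is unset."""
    value = os.getenv(name)
    return default if value is None else cast(value)

# Same names and defaults as Config.py, overridable from the environment
DATASET_PATH = env_setting('DATASET_PATH', 'datasets/pedestrian_detection/')
MODEL_PATH = env_setting('MODEL_PATH', 'runs/detect/train/weights/best.pt')
IMG_SIZE = env_setting('IMG_SIZE', 640, int)
BATCH_SIZE = env_setting('BATCH_SIZE', 16, int)
EPOCHS = env_setting('EPOCHS', 50, int)
CONF_THRESHOLD = env_setting('CONF_THRESHOLD', 0.5, float)
```

Other scripts could then use `from Config import EPOCHS, IMG_SIZE` instead of repeating the `os.getenv` calls and their defaults.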

#### 3.2 train.py

Script that trains the YOLOv8 model.

import os
from ultralytics import YOLO

def main():
    # Load a pretrained model; yolov8s.pt, yolov8m.pt, etc. also work
    model = YOLO('yolov8n.pt')

    # Use an absolute dataset root so the paths resolve the same way
    # regardless of the Ultralytics datasets directory setting
    dataset_root = os.path.abspath(os.getenv('DATASET_PATH', 'datasets/pedestrian_detection/'))

    # Dataset configuration: train/val are relative to `path`
    dataset_config = f"""
path: {dataset_root}
train: images/train
val: images/val
nc: 1
names: ['person']
"""

    # Save the dataset configuration to a YAML file
    with open('pedestrian.yaml', 'w') as f:
        f.write(dataset_config)

    # Train the model; settings can be overridden through environment variables
    model.train(
        data='pedestrian.yaml',
        epochs=int(os.getenv('EPOCHS', 50)),
        imgsz=int(os.getenv('IMG_SIZE', 640)),
        batch=int(os.getenv('BATCH_SIZE', 16)),
    )

# The guard lets setup.py expose main() as an entry point and is needed
# on Windows, where data-loader workers re-import this module
if __name__ == '__main__':
    main()

#### 3.3 detect_tools.py

Helper functions for detection.

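import cv2
from ultralytics import YOLO

def load_model(model_path):
    """Load YOLO weights from the given path."""
    return YOLO(model_path)

def detect_objects(frame, model, conf_threshold=0.5):
    """Run the model on one BGR frame and return a list of (box, label) pairs."""
    results = model(frame, conf=conf_threshold, verbose=False)
    detections = []
    for result in results:
        boxes = result.boxes.cpu().numpy()
        for box in boxes:
            r = box.xyxy[0].astype(int)  # [x1, y1, x2, y2] in pixels
            cls = int(box.cls[0])
            conf = float(box.conf[0])
            # Look up the class name from the model instead of hard-coding it
            label = f"{result.names[cls]} {conf:.2f}"
            detections.append((r, label))
    return detections

def draw_detections(frame, detections):
    """Draw each box and its label on the frame and return the frame."""
    for (r, label) in detections:
        # Convert NumPy integers to plain ints, which OpenCV accepts reliably
        x1, y1, x2, y2 = (int(v) for v in r)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Keep the label inside the image when the box touches the top edge
        text_y = max(y1 - 10, 20)
        cv2.putText(frame, label, (x1, text_y), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    return frame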

#### 3.4 UIProgram/MainProgram.py

The main program, which builds the graphical interface with PyQt5.

import os
import sys

import cv2
from PyQt5.QtCore import Qt, QTimer
from PyQt5.QtGui import QImage, QPixmap
from PyQt5.QtWidgets import QApplication, QLabel, QMainWindow, QPushButton, QVBoxLayout, QWidget

# detect_tools.py is assumed to live in the project root, one level above
# UIProgram/, so add that directory to the import path before importing it.
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from detect_tools import load_model, detect_objects, draw_detections


class VideoWindow(QMainWindow):
    def __init__(self):
        super().__init__()

        self.setWindowTitle("Pedestrian Detection")
        self.setGeometry(100, 100, 800, 600)

        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)

        self.layout = QVBoxLayout()
        self.central_widget.setLayout(self.layout)

        # Label that displays the annotated video frames
        self.label = QLabel()
        self.layout.addWidget(self.label)

        self.start_button = QPushButton("Start Detection")
        self.start_button.clicked.connect(self.start_detection)
        self.layout.addWidget(self.start_button)

        self.cap = None
        self.timer = QTimer()
        self.timer.timeout.connect(self.update_frame)

        self.model = load_model(os.getenv('MODEL_PATH', 'runs/detect/train/weights/best.pt'))
        self.conf_threshold = float(os.getenv('CONF_THRESHOLD', 0.5))

    def start_detection(self):
        if self.cap is None:
            self.cap = cv2.VideoCapture(0)  # Use the default webcam
        if not self.cap.isOpened():
            self.label.setText("Could not open the webcam.")
            self.cap = None
            return
        self.timer.start(30)  # Process a frame roughly every 30 ms

    def update_frame(self):
        ret, frame = self.cap.read()
        if not ret:
            return

        detections = detect_objects(frame, self.model, conf_threshold=self.conf_threshold)
        frame = draw_detections(frame, detections)

        # OpenCV delivers BGR frames; Qt expects RGB
        rgb_image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        h, w, ch = rgb_image.shape
        bytes_per_line = ch * w
        qt_image = QImage(rgb_image.data, w, h, bytes_per_line, QImage.Format_RGB888)
        pixmap = QPixmap.fromImage(qt_image)
        self.label.setPixmap(pixmap.scaled(800, 600, Qt.KeepAspectRatio))

    def closeEvent(self, event):
        # Stop the timer and release the camera when the window closes
        self.timer.stop()
        if self.cap is not None:
            self.cap.release()
        super().closeEvent(event)


def main():
    app = QApplication(sys.argv)
    window = VideoWindow()
    window.show()
    sys.exit(app.exec_())


if __name__ == "__main__":
    main()

#### 3.5 requirements.txt

Lists all dependencies.

torch
torchvision
ultralytics
pyqt5
opencv-python
pandas

#### 3.6 setup.py

Script for installing the project.

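from setuptools import setup, find_packages

setup(
    name='pedestrian_detection',
    version='0.1',
    # find_packages() only picks up directories that contain __init__.py,
    # so UIProgram/ needs one; top-level scripts are listed in py_modules.
    packages=find_packages(),
    py_modules=['train', 'detect_tools', 'Config'],
    install_requires=[
        'torch',
        'torchvision',
        'ultralytics',
        'pyqt5',
        'opencv-python',
        'pandas'
    ],
    # Each entry point assumes the target module defines a main() function.
    entry_points={
        'console_scripts': [
            'train=train:main',
            'detect=UIProgram.MainProgram:main'
        ]
    }
)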

#### 3.7 README.md

Project documentation.

# Pedestrian Detection System

This project uses YOLOv8 and PyQt5 to create a real-time detection system for pedestrians. The system detects persons in images or video streams.

## Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/yourusername/pedestrian-detection.git
   cd pedestrian-detection
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

3. Set up environment variables (optional):
   ```bash
   export DATASET_PATH=./datasets/pedestrian_detection/
   export MODEL_PATH=./runs/detect/train/weights/best.pt
   export IMG_SIZE=640
   export BATCH_SIZE=16
   export EPOCHS=50
   export CONF_THRESHOLD=0.5
   ```

## Training

To train the YOLOv8 model:

```bash
python train.py
```

## Running the GUI

To run the graphical user interface:

```bash
python UIProgram/MainProgram.py
```

## Usage Tutorial

See `使用教程.txt` (usage tutorial) for detailed usage instructions.


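### 4. Running the project

- **Check the dataset path**: place your dataset under the `datasets/pedestrian_detection` directory.
- **Install the required libraries**: make sure every dependency is installed.
- **Run the code**:
  - First run the training script to train the YOLOv8 model:
    ```bash
    python train.py
    ```
  - Then run the GUI script to start the detection system:
    ```bash
    python UIProgram/MainProgram.py
    ```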

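### 5. Model evaluation and optimization

After training, evaluate the model on the validation set and review mAP (mean average precision) along with the other metrics. Based on the results, adjust hyperparameters such as learning rate, batch size, and image size to improve performance.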
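As a minimal sketch of that evaluation step, the code below runs Ultralytics validation and collects the headline numbers. The function names `summarize_box_metrics` and `evaluate` are assumptions for illustration; the weights path and `pedestrian.yaml` follow the defaults used elsewhere in this guide and may differ in your setup.

```python
def summarize_box_metrics(box):
    """Collect headline detection metrics from an object exposing
    mp, mr, map50 and map attributes (as the Ultralytics box metrics do)."""
    return {
        "precision": float(box.mp),    # mean precision over classes
        "recall": float(box.mr),       # mean recall over classes
        "mAP50": float(box.map50),     # mAP at IoU 0.5
        "mAP50-95": float(box.map),    # mAP averaged over IoU 0.5 to 0.95
    }

def evaluate(weights="runs/detect/train/weights/best.pt", data="pedestrian.yaml"):
    """Validate trained weights on the validation split and return a metrics dict."""
    from ultralytics import YOLO  # imported here so the helper above works without it
    metrics = YOLO(weights).val(data=data)
    return summarize_box_metrics(metrics.box)
```

After training has finished, `evaluate()` returns a dictionary that can be printed or logged. Comparing these numbers across runs with different image sizes or batch sizes gives a simple basis for the hyperparameter adjustments described above.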

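### 6. Result analysis and visualization

Use built-in methods or custom scripts to analyze the results and visualize the predicted bounding boxes. This helps you understand how the model behaves and spot possible improvements.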
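One example of a custom analysis script is finding the epoch with the best validation score in the training log. The sketch below assumes the run wrote a `results.csv` file (typically under `runs/detect/train/`) with an `epoch` column and a `metrics/mAP50-95(B)` column. Those column names are an assumption that may vary between Ultralytics versions, so check the header of your own file. The function name `best_epoch` is hypothetical.

```python
import csv

def best_epoch(csv_path, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest value of `metric`."""
    best = None
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Some versions pad column names and values with spaces, so strip both
            row = {k.strip(): v.strip() for k, v in row.items() if k and v is not None}
            value = float(row[metric])
            if best is None or value > best[1]:
                best = (int(float(row["epoch"])), value)
    return best
```

A call such as `best_epoch("runs/detect/train/results.csv")` then shows whether the model peaked early, which can suggest training for fewer epochs or changing the learning rate.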

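### 7. User interface development

To build a user interface, you can create a RESTful service with a framework such as Flask or FastAPI, or use a rapid-prototyping tool such as Streamlit. The code above already includes a simple GUI example built with PyQt5.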

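### Notes

- **Class mapping**: make sure the class IDs in the YOLO-format label files match the class names defined in `train.py`.
- **Data augmentation**: pedestrian detection often involves complex backgrounds and changing lighting, so consider data augmentation to improve generalization.
- **Model choice**: pick a YOLO variant (such as YOLOv8n or YOLOv8s) that fits your hardware and requirements.
- **Preprocessing**: for very large datasets, consider preprocessing the data before training, for example by resizing or cropping.
- **Dataset split**: confirm that the dataset is split into training and validation sets in a sensible ratio (the dataset described here is already split).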
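For the data augmentation note, Ultralytics accepts augmentation settings as extra keyword arguments to `model.train()`. The following is a sketch only: the values are illustrative starting points rather than tuned results, and the names `AUGMENT_ARGS` and `train_with_augmentation` are assumptions for illustration.

```python
# Augmentation settings passed through to model.train(); values are illustrative
AUGMENT_ARGS = {
    "hsv_h": 0.015,    # hue jitter
    "hsv_s": 0.7,      # saturation jitter
    "hsv_v": 0.4,      # brightness jitter, relevant for lighting changes
    "fliplr": 0.5,     # probability of a horizontal flip
    "translate": 0.1,  # random translation as a fraction of image size
    "scale": 0.5,      # random scaling gain
    "mosaic": 1.0,     # probability of combining four images into one
}

def train_with_augmentation(data="pedestrian.yaml", epochs=50, imgsz=640, batch=16):
    """Train YOLOv8n with the augmentation settings above."""
    from ultralytics import YOLO  # imported here so the settings can be inspected without it
    model = YOLO("yolov8n.pt")
    return model.train(data=data, epochs=epochs, imgsz=imgsz, batch=batch, **AUGMENT_ARGS)
```

Changing one setting at a time and comparing validation mAP makes it easier to see which augmentation actually helps on this dataset.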

With these steps, you can build a YOLOv8-based pedestrian detection system. If you have any questions or need further help, feel free to ask!