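How to Train YOLOv8 on a Large-Scale Pine Wilt Disease Classification Dataset: Binary Classification, Training, Evaluation, and Prediction Visualization with Images Captured by Fixed-Wing UAVs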

QQ_1309399183

Last modified: 2024-12-18 08:20:56

Categories: Plant Disease Data, Datasets. Tags: YOLO, classification, UAV

First published: 2024-12-18 08:17:33

Article link: https://blog.csdn.net/QQ_1309399183/article/details/144548915

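Ultra-Large-Scale Pine Wilt Disease Classification Dataset
An ultra-large-scale pine wilt disease classification dataset was built, with 100,000 positive samples and 150,000 negative samples, collected from images taken by fixed-wing UAVs.
YOLOv8 Pine Wilt Disease Classification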

import os
import glob
import shutil

import numpy as np
import torch
from IPython.display import Image
from ultralytics import YOLO

# Set random seeds for reproducibility (NumPy drives the shuffle in split_dataset)
np.random.seed(42)
torch.manual_seed(42)

# Dataset paths
dataset_dir = 'path/to/dataset'
images_dir = os.path.join(dataset_dir, 'images')  # holds positive/ and negative/

# Dataset layout for YOLOv8 classification. No data.yaml is written:
# the classification task reads class names from the sub-folders of each split.
data_config = {
    'train': os.path.join(dataset_dir, 'train'),
    'val': os.path.join(dataset_dir, 'val'),
    'test': os.path.join(dataset_dir, 'test'),
    'nc': 2,  # number of classes
    'names': ['negative', 'positive']  # class names (also the folder names)
}

# Split each class into training (80%), validation (10%), and test (10%) sets
def split_dataset(images_dir, data_config):
    for class_name in data_config['names']:
        pattern = os.path.join(images_dir, class_name, '*.jpg')
        image_files = sorted(glob.glob(pattern))
        np.random.shuffle(image_files)

        n_train = int(len(image_files) * 0.8)
        n_val = int(len(image_files) * 0.9)
        splits = {
            'train': image_files[:n_train],
            'val': image_files[n_train:n_val],
            'test': image_files[n_val:],
        }

        for split_name, files in splits.items():
            # Target layout: <dataset_dir>/<split>/<class>/<image>.jpg
            class_folder = os.path.join(data_config[split_name], class_name)
            os.makedirs(class_folder, exist_ok=True)
            for src_img_path in files:
                shutil.copy(src_img_path, class_folder)

split_dataset(images_dir, data_config)

# Train the model
model = YOLO('yolov8n-cls.pt')  # load the pretrained YOLOv8n classification model

results = model.train(
    data=dataset_dir,  # root folder that contains train/, val/, and test/
    epochs=100,
    imgsz=640,
    batch=16,
    name='pine_wilt_classification',
    project='runs/train'
)

# Evaluate the model on the validation split
metrics = model.val()
print(metrics.top1)  # top-1 accuracy

# Visualize a prediction
source_image = 'path/to/dataset/test/positive/sample.jpg'  # replace with the image you want to test
results = model.predict(
    source=source_image,
    save=True,  # save the image annotated with the predicted class
    project='runs/classify',
    name='predict',
    exist_ok=True  # reuse the folder so the output path below stays fixed
)

# Display the prediction
Image(filename='runs/classify/predict/sample.jpg')
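
A classification model returns one probability per class for each image instead of bounding boxes. The sketch below shows one way to turn those probabilities into a label. `top1_label` is a hypothetical helper, not an Ultralytics function, and the commented usage assumes a classification model (such as `yolov8n-cls.pt`) produced `results`, in which case Ultralytics exposes the class names as `results[0].names` and the probabilities as `results[0].probs`.

```python
def top1_label(names, probs):
    """Return (class_name, confidence) for the highest-probability class.

    names: dict mapping class index -> class name
    probs: sequence of per-class probabilities, indexed by class
    """
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return names[best_index], float(probs[best_index])


# Assumed usage with an Ultralytics classification result:
#   r = results[0]
#   label, conf = top1_label(r.names, r.probs.data.tolist())

# Stand-alone illustration with made-up probabilities
label, conf = top1_label({0: "negative", 1: "positive"}, [0.12, 0.88])
print(f"{label} ({conf:.2f})")  # prints: positive (0.88)
```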

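Suppose you have an ultra-large-scale pine wilt disease classification dataset containing 100,000 positive samples and 150,000 negative samples, all captured by fixed-wing UAVs. You want to use this data for a binary classification task, and to train, evaluate, and visualize predictions with YOLOv8.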

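Project Overview

Data Preparation

Dataset: 250,000 images (100,000 positive samples and 150,000 negative samples) captured by fixed-wing UAVs.
Classes:
- positive: positive samples (pine wilt disease present)
- negative: negative samples (no pine wilt disease)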
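
Before training, it can help to confirm how many images each class folder actually holds, since the two classes are not balanced (100,000 versus 150,000). The following is a minimal sketch, assuming one sub-folder per class (for example `images/positive` and `images/negative`). `count_images_per_class` and `class_fractions` are hypothetical helper names and are not part of Ultralytics.

```python
from pathlib import Path


def count_images_per_class(images_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Count image files in each class sub-folder of images_dir."""
    counts = {}
    for class_dir in sorted(Path(images_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in extensions
            )
    return counts


def class_fractions(counts):
    """Convert raw counts into the fraction of the dataset held by each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# With the dataset sizes quoted above, positives make up 40% of the images
print(class_fractions({"positive": 100_000, "negative": 150_000}))
```

On a real dataset, `count_images_per_class('path/to/dataset/images')` would return the per-class counts to feed into `class_fractions`.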

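Model Selection

YOLOv8: Although YOLOv8 is best known for object detection, Ultralytics also provides classification variants (for example yolov8n-cls.pt), which fit this binary classification task.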

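Features

Data loading: automatically loads images from the specified directory. For classification, each image's label comes from the name of its folder rather than from annotation files.
Data conversion: ensures the data is arranged in the expected format.
Model training: trains with YOLOv8.
Model evaluation: evaluates model performance on the validation set.
Result saving: saves the training logs and the best model weights.
Prediction visualization: visualizes predictions so they can be checked by eye.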

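Implementation

Because YOLOv8 is primarily used for object detection, for binary classification we treat each image as a whole instead of predicting individual bounding boxes. We use YOLOv8's classification mode for this task.

First, make sure YOLOv8 and the other required dependencies are installed. You can install YOLOv8 with the following command:

pip install ultralytics

The script shown earlier organizes the dataset and trains the YOLOv8 model.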

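How to Use the Code

Prepare the data:

Make sure the dataset is laid out correctly, with one sub-folder per class inside the images folder.

An example directory structure: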

path/to/dataset/
├── images/
│   ├── positive/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   ├── negative/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...

Replace the data path:
- In the code, replace 'path/to/dataset' with the path to your dataset.
```
dataset_dir = 'your_dataset_directory'
```
Run the code:
- Copy the code above into a Python script and run it.
- Make sure the required libraries are installed:
```
pip install ultralytics
```
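
If you are unsure whether the environment is ready, a quick check can be run before starting a long training job. This is a small sketch using only the standard library. `missing_packages` is a hypothetical helper, and the package list is an assumption based on the imports used in this post.

```python
import importlib.util


def missing_packages(packages):
    """Return the names in `packages` that cannot be imported in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Top-level packages imported by the training script (assumed list)
required = ["ultralytics", "numpy", "torch"]
missing = missing_packages(required)
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```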

Example: Using a Custom Dataset

Suppose you have a new dataset, my_pine_wilt_classification_dataset, organized as follows:

my_pine_wilt_classification_dataset/
├── images/
│   ├── positive/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   ├── negative/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...

You can switch to it with the following steps:

Change the data path:

dataset_dir = 'my_pine_wilt_classification_dataset'

Run the complete code:
- Combine all the code into a single Python script and run it.
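
Instead of editing the path inside the script each time, the dataset directory could be passed on the command line. The sketch below is one possible way to do this with `argparse`. The flag names `--dataset-dir` and `--epochs` are illustrative assumptions, not options defined by the original script or by Ultralytics.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options for the training script (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description="Train a pine wilt disease classifier with YOLOv8"
    )
    parser.add_argument("--dataset-dir", default="path/to/dataset",
                        help="root folder of the dataset")
    parser.add_argument("--epochs", type=int, default=100,
                        help="number of training epochs")
    return parser.parse_args(argv)


# Passing an explicit list makes the example runnable without a real command line
args = parse_args(["--dataset-dir", "my_pine_wilt_classification_dataset"])
dataset_dir = args.dataset_dir
print(dataset_dir, args.epochs)
```

In a real script you would call `parse_args()` with no arguments so that the values come from the command line.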

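Notes on the Code

The code includes comments that explain what each part does. The key parts are:

Data preparation:
- data_config: defines the paths of the training, validation, and test sets, along with the class information.
Data splitting:
- split_dataset: splits the data into training, validation, and test sets using fixed ratios (80%, 10%, 10%).
Model training:
- model.train: trains the model with YOLOv8.
Model evaluation:
- model.val: evaluates model performance on the validation set.
Prediction visualization:
- model.predict: runs inference and displays the prediction.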
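
The splitting step can be sketched without touching the file system: shuffle each class separately with a fixed seed, then cut at 80% and 90%, so that every split keeps roughly the same class ratio. This is an illustrative sketch only. `stratified_split` is a hypothetical name, and the in-memory file lists stand in for real image paths.

```python
import random


def stratified_split(files_by_class, train_frac=0.8, val_frac=0.1, seed=42):
    """Split each class's file list into train/val/test with the same ratios."""
    rng = random.Random(seed)  # local RNG so the split is reproducible
    splits = {"train": {}, "val": {}, "test": {}}
    for class_name, files in files_by_class.items():
        shuffled = sorted(files)  # sort first so input order does not matter
        rng.shuffle(shuffled)
        n_train = int(len(shuffled) * train_frac)
        n_val = int(len(shuffled) * (train_frac + val_frac))
        splits["train"][class_name] = shuffled[:n_train]
        splits["val"][class_name] = shuffled[n_train:n_val]
        splits["test"][class_name] = shuffled[n_val:]
    return splits


# Toy file lists with the same 40:60 class ratio as the dataset
demo = {
    "positive": [f"pos_{i}.jpg" for i in range(40)],
    "negative": [f"neg_{i}.jpg" for i in range(60)],
}
splits = stratified_split(demo)
for split_name, per_class in splits.items():
    print(split_name, {c: len(f) for c, f in per_class.items()})
```

Splitting per class rather than over the pooled file list keeps the positive-to-negative ratio stable across the three sets, which matters when the classes are unbalanced.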

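Results

After running the code, you will get the following:

Console output:
- Log information for each epoch during training.
- Evaluation metrics on the validation set. For classification, Ultralytics reports top-1 and top-5 accuracy; precision and recall can be derived from the confusion matrix.
File output:
- runs/train/pine_wilt_classification/weights/best.pt: the best model weights.
- runs/val/exp/results.txt: validation results (the exact path depends on the Ultralytics version; check the save directory printed in the console).
Image output:
- runs/classify/predict/sample.jpg: the image annotated with the predicted class (classification runs save under runs/classify by default, whereas a detection run would save under runs/detect).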
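
To put the reported metrics in context, accuracy, precision, and recall can be computed from the four confusion-matrix counts. The sketch below is illustrative only: `binary_metrics` is a hypothetical helper, and the counts in the first call are made up rather than results from this dataset. Note that with 150,000 negatives among 250,000 images, a model that always predicts `negative` would already reach 60% accuracy, so precision and recall on the `positive` class are worth checking alongside accuracy.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall for the positive class."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up counts purely for illustration
print(binary_metrics(tp=90, fp=10, fn=20, tn=130))

# Baseline: always predicting "negative" on the full dataset
baseline = binary_metrics(tp=0, fp=0, fn=100_000, tn=150_000)
print(baseline["accuracy"], baseline["recall"])
```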