Fish Detection Dataset: 22 Fish Species, Over 26,000 Annotated Images, VOC and YOLO Formats

look me head

Published 2024-10-07 06:44:24

Article link: https://blog.csdn.net/2401_83580557/article/details/142734788

Class name: (number of images, number of annotation boxes)
aair: (1730, 1733)
foli: (550, 601)
boal: (1529, 1537)
deshi puti: (412, 412)
chapila: (423, 425)
ilish: (997, 999)
kal baush: (906, 922)
katla: (1684, 1686)
koi: (804, 811)
magur: (573, 574)
mrigel: (1782, 1784)
pabda: (1686, 1720)
pangas: (919, 934)
puti: (1558, 1569)
rui: (2626, 2628)
shol: (1392, 1392)
shor puti: (44, 44)
taki: (2130, 2185)
tara baim: (1219, 1351)
telapiya: (1985, 2017)
tengra: (1422, 1422)
Total: (26359, 26746)

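Introduction to the Fish Detection Dataset

Dataset Name

Fish Detection Dataset

Dataset Overview

This dataset is intended for training and evaluating fish recognition models. It contains 26,359 images, each annotated with bounding boxes, and the annotations are provided in both Pascal VOC and YOLO formats. The images are described as covering 22 fish species and are suited to deep-learning object detection. A model trained on this dataset can detect and classify the different species, which supports applications such as fishery management and ecological conservation.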

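Dataset Features

High-quality images: the images are described as high resolution, with enough detail for analyzing fish features.
Annotated: every image carries annotations giving the position and size of each fish.
Multiple annotation formats: annotations are provided in both VOC and YOLO formats, so they can be used with different frameworks.
Practical applications: suited to scenarios that require accurate fish detection, such as fishery management and ecological conservation systems.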
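
Because the annotations ship in both formats, it helps to see how the two relate. A VOC box stores absolute pixel corners (xmin, ymin, xmax, ymax), while a YOLO label stores the box center and size, each divided by the image width or height. The following is a minimal sketch of that conversion; the function name `voc_to_yolo` and the sample numbers are illustrative and not part of the dataset.

```python
def voc_to_yolo(box, img_width, img_height):
    """Convert a VOC box (xmin, ymin, xmax, ymax) in pixels to a
    YOLO box (x_center, y_center, width, height) normalized to [0, 1]."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_width
    y_center = (ymin + ymax) / 2.0 / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height
    return x_center, y_center, width, height


# A 200x100 pixel box inside a 640x480 image (made-up numbers)
print(voc_to_yolo((100, 120, 300, 220), 640, 480))
```

A YOLO label line is then the class index followed by these four values, separated by spaces.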

Dataset Structure

fish_detection_dataset/
├── images/                            # Image files
│   ├── 00001.jpg                      # Example image
│   ├── 00002.jpg
│   └── ...
├── annotations/                       # Annotation files
│   ├── VOC/                           # Pascal VOC annotations
│   │   ├── 00001.xml                  # Example VOC annotation file
│   │   ├── 00002.xml
│   │   └── ...
│   └── YOLO/                          # YOLO annotations
│       ├── 00001.txt                  # Example YOLO annotation file
│       ├── 00002.txt
│       └── ...
├── data.yaml                          # Class description file
└── README.md                          # Dataset documentation

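Dataset Contents

images/
- Purpose: stores the image files.
- Contents:
  - 00001.jpg: example image.
  - 00002.jpg: another image.
  - ...
annotations/
- Purpose: stores the annotation files.
- Contents:
  - VOC/: Pascal VOC annotation files.
    - 00001.xml: example VOC annotation file.
    - 00002.xml: VOC annotation file for another image.
    - ...
  - YOLO/: YOLO annotation files.
    - 00001.txt: example YOLO annotation file.
    - 00002.txt: YOLO annotation file for another image.
    - ...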
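
To make the YOLO files concrete: each line of a label file holds a class index followed by the normalized box center and size. The sketch below parses such text back into pixel coordinates. The helper name `parse_yolo_labels` and the sample line are hypothetical, and it assumes the class indices follow the order of the names list in data.yaml, which the dataset description does not state explicitly.

```python
def parse_yolo_labels(text, img_width, img_height):
    """Parse YOLO label text into (class_id, xmin, ymin, xmax, ymax) tuples in pixels."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        x_center, y_center, width, height = (float(v) for v in parts[1:])
        xmin = (x_center - width / 2) * img_width
        ymin = (y_center - height / 2) * img_height
        xmax = (x_center + width / 2) * img_width
        ymax = (y_center + height / 2) * img_height
        boxes.append((class_id, xmin, ymin, xmax, ymax))
    return boxes


# Hypothetical label line: class index 14, box centered in a 640x480 image
sample = "14 0.5 0.5 0.25 0.5\n"
print(parse_yolo_labels(sample, 640, 480))
```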

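data.yaml

Purpose: defines the dataset's classes and related information.

Contents:

train: fish_detection_dataset/images
val: fish_detection_dataset/images  # same directory as train; point this at a held-out split for a meaningful validation score
nc: 21  # must equal the length of names; the list below has 21 entries
names: ['aair', 'foli', 'boal', 'deshi puti', 'chapila', 'ilish', 'kal baush', 'katla', 'koi', 'magur', 'mrigel', 'pabda', 'pangas', 'puti', 'rui', 'shol', 'shor puti', 'taki', 'tara baim', 'telapiya', 'tengra']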
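
YOLO-style trainers commonly expect the class count to equal the number of class names, so it is worth checking a config before training. The sketch below runs a few such checks on an already-loaded config dictionary. The function `check_data_config` and the tiny example config are hypothetical, and loading the YAML file itself (for example with PyYAML) is left out.

```python
def check_data_config(config):
    """Return a list of human-readable problems found in a YOLO-style data config."""
    problems = []
    names = config.get("names", [])
    if config.get("nc") != len(names):
        problems.append(f"nc is {config.get('nc')} but names has {len(names)} entries")
    if len(set(names)) != len(names):
        problems.append("names contains duplicates")
    if config.get("train") == config.get("val"):
        problems.append("train and val point to the same location")
    return problems


# Deliberately inconsistent example config (not the dataset's real file)
example = {
    "train": "dataset/images",
    "val": "dataset/images",
    "nc": 4,
    "names": ["katla", "koi", "rui"],
}
for problem in check_data_config(example):
    print(problem)
```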

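README.md
- Purpose: detailed documentation for the dataset.
- Contents:
  - The source and intended use of the dataset.
  - The structure and contents of the dataset.
  - How to use the dataset for model training and evaluation.
  - Other notes and recommendations.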

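Dataset Statistics

Total images: 26,359
Total annotation boxes: 26,746
Classes: 22 stated; the per-class list below names 21
Average annotation boxes per image: about 1.01

Per-class statistics:

aair: (1730 images, 1733 annotations)
foli: (550 images, 601 annotations)
boal: (1529 images, 1537 annotations)
deshi puti: (412 images, 412 annotations)
chapila: (423 images, 425 annotations)
ilish: (997 images, 999 annotations)
kal baush: (906 images, 922 annotations)
katla: (1684 images, 1686 annotations)
koi: (804 images, 811 annotations)
magur: (573 images, 574 annotations)
mrigel: (1782 images, 1784 annotations)
pabda: (1686 images, 1720 annotations)
pangas: (919 images, 934 annotations)
puti: (1558 images, 1569 annotations)
rui: (2626 images, 2628 annotations)
shol: (1392 images, 1392 annotations)
shor puti: (44 images, 44 annotations)
taki: (2130 images, 2185 annotations)
tara baim: (1219 images, 1351 annotations)
telapiya: (1985 images, 2017 annotations)
tengra: (1422 images, 1422 annotations)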
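
The per-class counts can be checked and summarized with a few lines of arithmetic. The sketch below copies the counts from the list above, sums the annotation boxes, and measures how far apart the largest and smallest classes are. The inverse-frequency weights at the end are one common heuristic for class-weighted sampling or loss, shown as an assumption and not as something the dataset prescribes.

```python
# (images, annotation boxes) per class, copied from the per-class list
CLASS_COUNTS = {
    "aair": (1730, 1733), "foli": (550, 601), "boal": (1529, 1537),
    "deshi puti": (412, 412), "chapila": (423, 425), "ilish": (997, 999),
    "kal baush": (906, 922), "katla": (1684, 1686), "koi": (804, 811),
    "magur": (573, 574), "mrigel": (1782, 1784), "pabda": (1686, 1720),
    "pangas": (919, 934), "puti": (1558, 1569), "rui": (2626, 2628),
    "shol": (1392, 1392), "shor puti": (44, 44), "taki": (2130, 2185),
    "tara baim": (1219, 1351), "telapiya": (1985, 2017), "tengra": (1422, 1422),
}

total_boxes = sum(boxes for _, boxes in CLASS_COUNTS.values())
largest = max(CLASS_COUNTS, key=lambda name: CLASS_COUNTS[name][1])
smallest = min(CLASS_COUNTS, key=lambda name: CLASS_COUNTS[name][1])
ratio = CLASS_COUNTS[largest][1] / CLASS_COUNTS[smallest][1]

print(len(CLASS_COUNTS), "classes,", total_boxes, "annotation boxes")
print(f"largest: {largest}, smallest: {smallest}, ratio: {ratio:.1f}x")

# Inverse-frequency weights: a class with the average number of boxes gets weight 1,
# rarer classes get proportionally larger weights
mean_boxes = total_boxes / len(CLASS_COUNTS)
weights = {name: mean_boxes / boxes for name, (_, boxes) in CLASS_COUNTS.items()}
print(f"weight for {smallest}: {weights[smallest]:.2f}")
```

The annotation counts of the listed classes add up to 26,746, which equals the stated total number of boxes.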

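Usage Notes

Environment setup: install the usual deep learning libraries, such as torch, torchvision, and numpy.
Dataset path: extract the dataset into the project directory and make sure the paths in data.yaml and in your scripts point to it.
Model training: fine-tune a pretrained object detection model (such as Faster R-CNN or YOLOv5) on this dataset.
Data augmentation: random flips, rotations, and similar transforms increase data diversity and improve robustness. For detection, the bounding boxes must be transformed together with the image.
Hyperparameter tuning: adjust the learning rate, batch size, and other hyperparameters to suit your setup.
Hardware: a GPU is recommended for training and inference. Without sufficient local resources, consider a GPU instance from a cloud provider.
Class balance: the classes are noticeably imbalanced (for example, shor puti has 44 images while rui has 2,626), so check per-class performance and consider oversampling rare classes or undersampling frequent ones.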
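
Since data.yaml points train and val at the same image directory, a held-out validation set has to be created before evaluation results mean anything. The sketch below splits file stems reproducibly. The function `split_stems`, the 10% fraction, and the seed are assumptions for illustration, and the stems mimic the 00001-style names shown in the directory tree.

```python
import random


def split_stems(stems, val_fraction=0.1, seed=0):
    """Shuffle file stems reproducibly and split them into train and validation lists."""
    stems = sorted(stems)  # sort first so the result does not depend on listing order
    random.Random(seed).shuffle(stems)
    n_val = int(len(stems) * val_fraction)
    return stems[n_val:], stems[:n_val]


# Stand-in stems; in practice these would come from the files in images/
stems = [f"{i:05d}" for i in range(1, 101)]
train_stems, val_stems = split_stems(stems)
print(len(train_stems), "train,", len(val_stems), "val")
```

Splitting on stems keeps each image together with its .xml and .txt annotation files, because all three share the same stem.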

Key Training Code

The following example uses PyTorch and torchvision for fish detection. It takes a pretrained Faster R-CNN model and fine-tunes it on this dataset.

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.transforms import functional as F
from torch.utils.data import DataLoader, Dataset
from PIL import Image
import os
import xml.etree.ElementTree as ET

# Class names in label order; label id 0 is reserved for the background class
CLASS_NAMES = ['aair', 'foli', 'boal', 'deshi puti', 'chapila', 'ilish', 'kal baush', 'katla', 'koi', 'magur', 'mrigel', 'pabda', 'pangas', 'puti', 'rui', 'shol', 'shor puti', 'taki', 'tara baim', 'telapiya', 'tengra']

# Custom dataset class
class FishDetectionDataset(Dataset):
    def __init__(self, root, transforms=None):
        self.root = root
        self.transforms = transforms
        self.imgs = sorted(os.listdir(os.path.join(root, "images")))

    def __getitem__(self, idx):
        img_path = os.path.join(self.root, "images", self.imgs[idx])
        # Derive the annotation path from the image name so image and annotation always match
        stem = os.path.splitext(self.imgs[idx])[0]
        annotation_path = os.path.join(self.root, "annotations", "VOC", stem + ".xml")

        img = Image.open(img_path).convert("RGB")
        annotation_root = ET.parse(annotation_path).getroot()

        boxes = []
        labels = []
        for obj in annotation_root.findall('object'):
            bndbox = obj.find('bndbox')
            # Parse via float() because some annotation tools write non-integer coordinates
            xmin, ymin, xmax, ymax = [float(bndbox.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
            label = obj.find('name').text.strip()
            boxes.append([xmin, ymin, xmax, ymax])
            labels.append(CLASS_NAMES.index(label) + 1)  # shift by 1 to leave 0 for background

        # reshape keeps the (N, 4) shape even when an image has no objects
        boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
        labels = torch.as_tensor(labels, dtype=torch.int64)

        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["image_id"] = torch.tensor([idx])

        if self.transforms is not None:
            img, target = self.transforms(img, target)

        return F.to_tensor(img), target

    def __len__(self):
        return len(self.imgs)

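# Data preprocessing: a detection transform must update the boxes together with the image,
# so torchvision.transforms.RandomHorizontalFlip (image only) cannot be used here
class RandomHorizontalFlip:
    def __init__(self, prob=0.5):
        self.prob = prob

    def __call__(self, img, target):
        if torch.rand(1).item() < self.prob:
            width = img.width  # img is still a PIL image at this point
            img = F.hflip(img)
            boxes = target["boxes"].reshape(-1, 4).clone()
            # Mirror the x coordinates: new xmin = width - old xmax, new xmax = width - old xmin
            boxes[:, [0, 2]] = width - boxes[:, [2, 0]]
            target["boxes"] = boxes
        return img, target

def get_transform(train):
    # Return None for evaluation so the dataset skips the transform step
    return RandomHorizontalFlip(0.5) if train else None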

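# Load the dataset
def collate_fn(batch):
    # Targets differ in size per image, so keep each batch as tuples instead of stacking tensors
    return tuple(zip(*batch))

dataset = FishDetectionDataset(root='fish_detection_dataset', transforms=get_transform(train=True))
dataset_test = FishDetectionDataset(root='fish_detection_dataset', transforms=get_transform(train=False))

# Shuffle once and hold out the last 2600 samples (about 10%) for validation
indices = torch.randperm(len(dataset)).tolist()
dataset = torch.utils.data.Subset(dataset, indices[:-2600])
dataset_test = torch.utils.data.Subset(dataset_test, indices[-2600:])

# A named collate_fn (not a lambda) can be pickled by worker processes.
# On Windows and macOS, run this script under `if __name__ == "__main__":` when num_workers > 0.
data_loader = DataLoader(dataset, batch_size=2, shuffle=True, num_workers=4, collate_fn=collate_fn)
data_loader_test = DataLoader(dataset_test, batch_size=1, shuffle=False, num_workers=4, collate_fn=collate_fn)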

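# Define the model
# weights="DEFAULT" loads the pretrained weights (torchvision >= 0.13; older versions use pretrained=True)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
num_classes = 23  # 22 stated fish classes + background; must be at least the largest label id + 1
in_features = model.roi_heads.box_predictor.cls_score.in_features
# Replace the box predictor head so it outputs num_classes scores
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Select the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define the optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)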

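# Train the model
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0.0
    for images, targets in data_loader:
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        # In training mode the model returns a dict of loss terms
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()
        epoch_loss += losses.item()

    # Report the mean loss over the epoch instead of the last batch only
    print(f'Epoch {epoch+1}/{num_epochs}, Loss: {epoch_loss / len(data_loader):.4f}')

    # Validate the model
    model.eval()
    with torch.no_grad():
        for images, targets in data_loader_test:
            images = list(image.to(device) for image in images)
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            # In eval mode the model returns one dict per image with "boxes", "labels" and "scores"
            outputs = model(images)
            # The outputs are not scored here; compare them with targets (e.g. via mAP) to measure accuracy

# Save the model
torch.save(model.state_dict(), 'fish_detection_model.pth')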
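
The validation loop above produces predictions but does not score them. Detection metrics such as mAP are built on intersection over union (IoU), which measures how much a predicted box overlaps a ground-truth box. The following is a minimal pure-Python sketch of IoU for two boxes in (xmin, ymin, xmax, ymax) form; the function name `box_iou` here is illustrative, and for tensors torchvision provides `torchvision.ops.box_iou`.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap on half their width (made-up coordinates)
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

A prediction is commonly counted as correct when its class matches and its IoU with a ground-truth box is at least 0.5, though the threshold is a choice and not fixed by the dataset.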