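Grassland Cattle, Sheep, and Horse Object Detection Dataset
The dataset covers 3 classes across 2,400 images in total.
Annotations are provided in YOLO and VOC formats.
It is already split into training, validation, and test sets.
It can be used directly with YOLO.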
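This walkthrough uses the grassland cattle, sheep, and horse dataset to carry out an object detection task. Because the dataset is already split into training, validation, and test sets and is available in YOLO format, it can be trained with YOLOv5 directly.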
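Environment setup
Make sure the following software and libraries are installed:
- Python 3.8 or later
- PyTorch 1.9 or later
- torchvision 0.10 or later
- OpenCV
- numpy
- pandas
- matplotlib
- albumentations (for data augmentation)
The required Python libraries can be installed with the following command: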
pip install torch torchvision opencv-python numpy pandas matplotlib albumentations
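After installation, it can help to confirm that the installed versions meet the minimums listed above. The following is a minimal sketch, not part of the original workflow: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it only reads leading digits of each dotted component).

```python
import itertools
from importlib import metadata

# Minimum versions taken from the requirements list above.
# The other packages have no stated minimum, so they are not checked here.
MINIMUMS = {"torch": "1.9", "torchvision": "0.10"}


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into a tuple like (1, 13, 1)."""
    core = text.split("+")[0]  # drop local build suffixes such as '+cu117'
    parts = []
    for piece in core.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment(minimums=MINIMUMS):
    """Map each package name to 'ok', 'missing', or 'too old (<version>)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Any package reported as missing or too old can be reinstalled with pip before continuing.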
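Dataset preparation
This guide assumes the dataset is already organized in YOLO format and contains training, validation, and test sets. The expected structure is: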
datasets/
└── grassland_animals/
    ├── images/
    │   ├── train/
    │   ├── val/
    │   └── test/
    └── labels/
        ├── train/
        ├── val/
        └── test/
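Before training, the layout above can be sanity-checked by counting the images in each split and listing those without a matching label file. This is a sketch under stated assumptions: the helper names and the set of image extensions are hypothetical, and label files are assumed to share the image's base name with a `.txt` extension, as is usual for YOLO-format datasets.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}
SPLITS = ("train", "val", "test")


def check_split(root, split):
    """Return (image_count, names of images with no matching label file)."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "labels" / split
    images = sorted(p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    missing = [p.name for p in images if not (label_dir / f"{p.stem}.txt").exists()]
    return len(images), missing


def check_dataset(root):
    """Run check_split for every split and collect the results in a dict."""
    return {split: check_split(root, split) for split in SPLITS}


if __name__ == "__main__":
    root = Path("datasets/grassland_animals")
    if root.exists():
        for split, (count, missing) in check_dataset(root).items():
            print(f"{split}: {count} images, {len(missing)} without labels")
```

An image without a label file is not necessarily an error, since YOLO-style training treats it as a background image, but a large number of them usually points to a path or naming problem.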
The dataset also includes a classes.txt file that lists the class names, one per line.
Class file (classes.txt)
cow
sheep
horse
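Since the dataset ships in both YOLO and VOC formats, it may help to see how the two relate. In a YOLO label file each line holds a class index followed by the box center and size, all normalized to the image dimensions, while VOC annotations store pixel corner coordinates. The sketch below converts one YOLO line into a VOC-style box; the function name `yolo_to_voc` and the example image size are hypothetical, and the off-by-one pixel convention used by some VOC tools is ignored.

```python
# Class order is assumed to match classes.txt (index 0 = cow, 1 = sheep, 2 = horse).
CLASS_NAMES = ["cow", "sheep", "horse"]


def yolo_to_voc(line, img_w, img_h, names=CLASS_NAMES):
    """Convert 'class xc yc w h' (normalized) to (name, xmin, ymin, xmax, ymax) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    xmin = (xc - w / 2) * img_w
    ymin = (yc - h / 2) * img_h
    xmax = (xc + w / 2) * img_w
    ymax = (yc + h / 2) * img_h
    return names[int(class_id)], round(xmin), round(ymin), round(xmax), round(ymax)


if __name__ == "__main__":
    # A horse centered in a 640x480 image, a quarter of the width and half the height.
    print(yolo_to_voc("2 0.5 0.5 0.25 0.5", 640, 480))
```

The class index in each label line must stay within the range defined by classes.txt; an index of 3 or higher would raise an IndexError here and would also be rejected at training time.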
Model training
Training uses YOLOv5. First, clone the YOLOv5 repository and install its dependencies.
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt
Preparing the configuration files
Create a hyp.scratch.yaml file to define the hyperparameters:
# Hyperparameters for YOLOv5 training from scratch
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)
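# NOTE: the keys from here on are not among YOLOv5's standard hyperparameters. Most of them mirror train.py command-line options (for example --workers or --noautoanchor) and likely have no effect when set in this file.
paste_in: 0.0 # segment paste-in (probability)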
rect: 0 # rectangular training
resume: false # resume training from last checkpoint
nosave: false # only save final checkpoint
noval: false # only validate final epoch
noautoanchor: true # disable AutoAnchor
evolve: false # evolve hyperparameters
bucket: '' # gsutil bucket
cache_images: false # cache images for faster training
image_weights: false # use weighted image selection for training
single_cls: false # train multi-class data as single-class
optimizer: SGD # optimizer: SGD or AdamW
sync_bn: false # use SyncBatchNorm, only available in DDP mode
workers: 8 # dataloader workers (max is number of CPU cores)
freeze: 0 # freeze first n layers
v5_metric: true # assume maximum recall as 1.0 in AP calculation
multi_scale: true # vary input size between 320 and 640 pixels
rect_training: false # rectangular training
cos_lr: false # cosine LR scheduler
close_mosaic: 1000 # close mosaic border
scales: [0.5, 1.5] # image size scales
augment: true # augment data
verbose: false # verbose print
seed: 0 # random seed for reproducibility
local_rank: -1 # ddp device id (-1 for single gpu train)
entity: null # W&B entity
upload_dataset: False # upload dataset as W&B artifact table
bbox_interval: -1 # W&B bounding box logging interval
artifact_alias: latest # version of dataset artifact to use
project: runs/train # save results to project/name
exist_ok: false # existing project/name ok, do not increment
quad: false # quadrilateral anchors
linear_assignment: false # use linear assignment for NMS
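The interaction between `lr0`, `lrf`, and the number of epochs is easier to see with numbers. The sketch below assumes the linear decay that YOLOv5 applies when the cosine scheduler is off, where the learning rate falls from `lr0` to `lr0 * lrf` over training. Warmup is ignored, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(epoch, epochs, lr0=0.01, lrf=0.1):
    """Learning rate at a given epoch under a linear decay from lr0 to lr0 * lrf."""
    factor = (1 - epoch / epochs) * (1.0 - lrf) + lrf
    return lr0 * factor


if __name__ == "__main__":
    # With the values above and 50 epochs, print the rate at a few points.
    for epoch in (0, 10, 25, 49):
        print(f"epoch {epoch:2d}: lr = {linear_lr(epoch, 50):.5f}")
```

With `lr0: 0.01` and `lrf: 0.1`, the rate starts at 0.01 and approaches 0.001 by the end of training, so raising `lrf` keeps the final epochs more aggressive.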
Create a data.yaml file to define the dataset paths and classes:
train: ../datasets/grassland_animals/images/train/
val: ../datasets/grassland_animals/images/val/
test: ../datasets/grassland_animals/images/test/ # needed for val.py --task test
nc: 3 # number of classes
names: ['cow', 'sheep', 'horse'] # list of class names
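To keep data.yaml in sync with classes.txt, the file can be generated rather than typed by hand. The following is a minimal sketch: the function name `build_data_yaml` is hypothetical, the YAML is written with plain string formatting to avoid extra dependencies, and a `test` entry is included on the assumption that the test split will be evaluated later.

```python
from pathlib import Path


def build_data_yaml(classes_file, dataset_root):
    """Return data.yaml content built from a classes.txt file (one class name per line)."""
    text = Path(classes_file).read_text(encoding="utf-8")
    names = [line.strip() for line in text.splitlines() if line.strip()]
    root = str(dataset_root).rstrip("/")
    lines = [
        f"train: {root}/images/train/",
        f"val: {root}/images/val/",
        f"test: {root}/images/test/",
        f"nc: {len(names)}  # number of classes",
        f"names: {names!r}  # list of class names",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    classes = Path("datasets/grassland_animals/classes.txt")
    if classes.exists():
        Path("data.yaml").write_text(
            build_data_yaml(classes, "../datasets/grassland_animals"), encoding="utf-8"
        )
```

Generating the file this way guarantees that `nc` always equals the number of entries in `names`, which is a common source of errors when the class list changes.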
Training the model
Start training with the following command:
python train.py --img 640 --batch 16 --epochs 50 --data data.yaml --cfg yolov5s.yaml --weights yolov5s.pt --hyp hyp.scratch.yaml
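When trying several model sizes, it can be convenient to build the training command programmatically. The sketch below reproduces the command above from its parameters; the function name `build_train_command` is hypothetical, and it assumes the config and pretrained weight files follow the same base name (for example yolov5m.yaml with yolov5m.pt).

```python
import shlex


def build_train_command(model="yolov5s", img=640, batch=16, epochs=50,
                        data="data.yaml", hyp="hyp.scratch.yaml"):
    """Return the train.py invocation as an argument list."""
    return [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--cfg", f"{model}.yaml",
        "--weights", f"{model}.pt",
        "--hyp", hyp,
    ]


if __name__ == "__main__":
    # Print the default command, then a variant for the medium model.
    print(shlex.join(build_train_command()))
    print(shlex.join(build_train_command(model="yolov5m", batch=8)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the yolov5 directory. Larger models generally need a smaller batch size to fit in GPU memory.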
Evaluating the results
After training completes, evaluate the model with the following command:
python val.py --data data.yaml --weights runs/train/exp/weights/best.pt --task test
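The path runs/train/exp in the command above applies to the first run only. With the default project and run name, YOLOv5 saves later runs to exp2, exp3, and so on. The sketch below locates the most recently written best.pt so the evaluation command can point at the right run; the function name `latest_best_weights` is hypothetical.

```python
from pathlib import Path


def latest_best_weights(project="runs/train"):
    """Return the most recently modified exp*/weights/best.pt, or None if there is none."""
    candidates = list(Path(project).glob("exp*/weights/best.pt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    weights = latest_best_weights()
    if weights is None:
        print("No trained weights found under runs/train")
    else:
        print(f"python val.py --data data.yaml --weights {weights} --task test")
```

Selecting by modification time avoids having to parse the run number out of the directory name.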
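Usage notes
- Configure paths:
  - Make sure the datasets/grassland_animals/ directory structure is correct.
  - Make sure the paths and class names in data.yaml are correct.
- Run the scripts:
  - Run the training command and then the evaluation command in a terminal.
- Notes:
  - Adjust the hyperparameters and training settings as needed.
  - To choose a different YOLOv5 architecture (such as yolov5m.yaml, yolov5l.yaml, or yolov5x.yaml), change the --cfg argument of train.py, and typically the matching --weights file as well. This is a command-line option, not a data.yaml setting.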
Example
Suppose the dataset folder is structured as follows:
datasets/
└── grassland_animals/
    ├── annotations.json
    ├── images/
    │   ├── train/
    │   ├── val/
    │   └── test/
    └── labels/
        ├── train/
        ├── val/
        └── test/
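and annotations.json contains all the necessary annotation information. After running the commands above, you can inspect the training logs and the final model weight files under runs/train/.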
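Summary
The steps above build a complete object detection workflow covering dataset preparation, model training, and evaluation. The relevant configuration files are:
- YOLOv5 hyperparameter configuration file (hyp.scratch.yaml)
- YOLOv5 dataset configuration file (data.yaml)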