Training YOLOv5 on a Custom Dataset: Workflow and Results
Reference: https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/
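I. Installation and Configuration
- Clone YOLOv5 with git:
git clone git@github.com:ultralytics/yolov5.git
- Create a new Python virtual environment so that package versions do not interfere with each other. I created mine with the commands below; activate the environment and run pip from inside the cloned yolov5 folder, where requirements.txt lives:
conda create -n py39_yolov5 python=3.9
conda activate py39_yolov5
pip install -r requirements.txt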
II. Project Creation and Configuration
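- Create the file structure shown in the figure, where:
  - Annotations holds the VOC-format labels. If you create YOLO-format labels directly, this folder is not needed; put the labels straight into the labels folder.
  - image holds the images of the dataset.
  - ImageSets holds the images from image and the labels from labels, split into train, test, and val, with each split stored in its own folder.
  - labels holds the YOLO-format labels.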
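The folder layout described above can also be created from a script. The following is a minimal sketch, assuming the folder names given in the text and the `images`/`labels` sub-folders per split that `set_dir.py` uses later; the function name `create_dataset_layout` is hypothetical and not part of YOLOv5.

```python
from pathlib import Path

# Split names follow the train/test/val division described in the text.
SPLITS = ("train", "test", "val")


def create_dataset_layout(root):
    """Create the dataset folders under `root`.

    Returns the created folders as a sorted list of paths relative to `root`.
    """
    root = Path(root)
    folders = [root / "Annotations", root / "image", root / "labels"]
    for split in SPLITS:
        # Each split gets its own images and labels folder (assumed layout).
        folders.append(root / "ImageSets" / split / "images")
        folders.append(root / "ImageSets" / split / "labels")
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return sorted(str(folder.relative_to(root)) for folder in folders)
```

Calling `create_dataset_layout("datasets/my_dataset")` (a placeholder path) is safe to repeat, because existing folders are left untouched.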
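- Annotate the image dataset with labelimg
  - Edit /labelimg/data/predefined_classes.txt so that it lists the classes of your custom dataset.
  - Run labelimg to label the images: A and D switch to the previous and next image, and W creates a bounding box. Save the resulting VOC annotation files to a temporary project folder.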
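labelimg reads predefined_classes.txt as one class name per line. As a small sketch, the file can be generated from the class list used later in this post so that the two stay in sync; the helper name `write_predefined_classes` is hypothetical.

```python
from pathlib import Path

# The seven classes used in this post.
CLASSES = ["corn", "corn_plant", "cucumber", "cucumber_plant",
           "watermelon", "rice", "wheat"]


def write_predefined_classes(path, classes):
    """Write one class name per line, rejecting duplicate names."""
    if len(set(classes)) != len(classes):
        raise ValueError("class names must be unique")
    Path(path).write_text("\n".join(classes) + "\n", encoding="utf-8")
```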
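- Augment the dataset with a data-augmentation package
  - Store the annotation files and the images in /data/Annotations and /data/images, respectively, inside the DataAugForObjectDetection package.
  - Open DataAugForObjectDetection.py and, at the program entry point, set need_aug_num = 10; this is the number of augmented copies generated for each image.
  - Run DataAugForObjectDetection.py; the augmented labels and images appear under the Dataset folder.
  - Move the labels and images to the corresponding folders in the project.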
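Augmentation for object detection has to transform the bounding boxes together with the image, which is why the package needs both the images and the annotations. The sketch below illustrates this for a single transform, a horizontal flip. It is a generic illustration and not the package's own code; the function name `hflip_boxes` is hypothetical, and boxes are assumed to be VOC-style pixel tuples `(xmin, ymin, xmax, ymax)`.

```python
def hflip_boxes(boxes, image_width):
    """Mirror boxes about the vertical centre line of the image.

    After a horizontal flip, the old right edge becomes the new left edge,
    so xmin and xmax swap roles; the y coordinates are unchanged.
    """
    flipped = []
    for xmin, ymin, xmax, ymax in boxes:
        flipped.append((image_width - xmax, ymin, image_width - xmin, ymax))
    return flipped
```

Flipping twice returns the original boxes, which is a quick sanity check for any geometric augmentation.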
- Split into training, test, and validation sets
  - Run spit.py; adjusting its parameters changes the ratio between the three sets.
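The contents of spit.py are not shown in this post, so the following is only a minimal sketch of how such a split could work. It assumes that each .txt file lists one file stem per line (the form that set_dir.py reads later); the function names and the default ratios of 0.8 and 0.1 are hypothetical.

```python
import os
import random


def split_names(names, train_ratio=0.8, val_ratio=0.1, seed=0):
    """Shuffle file stems and split them into train, val, and test lists.

    Whatever remains after train and val goes to test.
    """
    if train_ratio < 0 or val_ratio < 0 or train_ratio + val_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    names = sorted(names)                # fixed starting order
    random.Random(seed).shuffle(names)   # reproducible shuffle
    n_train = int(len(names) * train_ratio)
    n_val = int(len(names) * val_ratio)
    return {
        "train": names[:n_train],
        "val": names[n_train:n_train + n_val],
        "test": names[n_train + n_val:],
    }


def write_split_files(splits, out_dir):
    """Write train.txt, val.txt, and test.txt with one stem per line."""
    for split, stems in splits.items():
        path = os.path.join(out_dir, f"{split}.txt")
        with open(path, "w", encoding="utf-8") as f:
            for stem in stems:
                f.write(stem + "\n")
```

Fixing the seed keeps the split stable between runs, so the validation images do not leak into training after a re-split.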
- Convert VOC labels to YOLO
  - Run voc_lavel.py; the converted labels are stored in the labels folder.
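The conversion script itself is not shown in this post. As a sketch of what the conversion involves: a VOC box is given in pixels as `(xmin, ymin, xmax, ymax)`, while a YOLO label line is `class x_center y_center width height` with all four values normalized by the image size. The function names below are hypothetical, and the class order is assumed to match the `names` list of the data file.

```python
import xml.etree.ElementTree as ET

# Assumed to match the order of `names` in the data .yaml file.
CLASSES = ("corn", "corn_plant", "cucumber", "cucumber_plant",
           "watermelon", "rice", "wheat")


def voc_box_to_yolo(box, image_size):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized
    (x_center, y_center, width, height)."""
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    x_center = (xmin + xmax) / 2.0 / width
    y_center = (ymin + ymax) / 2.0 / height
    return x_center, y_center, (xmax - xmin) / width, (ymax - ymin) / height


def voc_xml_to_yolo_lines(xml_text, classes=CLASSES):
    """Turn the text of one VOC annotation file into YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    image_size = (int(size.find("width").text), int(size.find("height").text))
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in classes:
            continue  # skip objects outside the class list
        bnd = obj.find("bndbox")
        box = tuple(float(bnd.find(tag).text)
                    for tag in ("xmin", "ymin", "xmax", "ymax"))
        x, y, w, h = voc_box_to_yolo(box, image_size)
        lines.append(f"{classes.index(name)} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each returned line would be written to a .txt file that has the same stem as the image, one file per image.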
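- Sort the files in images and labels into train, test, and val
  - Run set_dir.py after editing the paths in it. Based on the .txt files generated by spit.py, it automatically finds the matching files in images and labels and copies them to the designated locations under the ImageSets folder.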
# set_dir.py: read the entries in train.txt, test.txt, and val.txt, then copy
# each image and its label into the folder of the matching split
import os
import shutil


def set_dir(txt_path, source_image_folder, source_label_folder,
            destination_image_folder, destination_label_folder):
    # Create the destination folders if they do not exist yet
    os.makedirs(destination_image_folder, exist_ok=True)
    os.makedirs(destination_label_folder, exist_ok=True)
    with open(txt_path, 'r', encoding='utf-8') as txt_files:
        for txt_file in txt_files:
            name = txt_file.strip()
            if not name:
                continue  # skip blank lines
            txt_image_file = name + ".jpg"
            txt_label_file = name + ".txt"
            origin_image_path = os.path.join(source_image_folder, txt_image_file)
            destination_image_path = os.path.join(destination_image_folder, txt_image_file)
            origin_label_path = os.path.join(source_label_folder, txt_label_file)
            destination_label_path = os.path.join(destination_label_folder, txt_label_file)
            # Check the files themselves, not only the source folders
            if os.path.exists(origin_image_path) and os.path.exists(origin_label_path):
                shutil.copy(origin_image_path, destination_image_path)
                shutil.copy(origin_label_path, destination_label_path)
            else:
                print(f"ERROR! {origin_image_path} or {origin_label_path} is not found, cannot copy")


if __name__ == "__main__":
    # Source txt files and the image and label folders
    train_txt_path = '/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/train.txt'
    test_txt_path = '/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/test.txt'
    val_txt_path = '/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/val.txt'
    image_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/image"
    label_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/labels"
    # Destination folders for the train images and labels
    train_image_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/ImageSets/train/images"
    train_label_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/ImageSets/train/labels"
    # Destination folders for the test images and labels
    test_image_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/ImageSets/test/images"
    test_label_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/ImageSets/test/labels"
    # Destination folders for the val images and labels
    val_image_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/ImageSets/val/images"
    val_label_folder = "/home/ding/deeplearning_task/YOLOv5/yolov5/yolov5/datasets/winter_task5_YOLOv5/ImageSets/val/labels"
    set_dir(train_txt_path, image_folder, label_folder, train_image_folder, train_label_folder)
    set_dir(test_txt_path, image_folder, label_folder, test_image_folder, test_label_folder)
    set_dir(val_txt_path, image_folder, label_folder, val_image_folder, val_label_folder)
- Configuring the data file
  - Create a new .yaml file with the following contents:
train: datasets/winter_task5_YOLOv5/ImageSets/train
val: datasets/winter_task5_YOLOv5/ImageSets/val
test: datasets/winter_task5_YOLOv5/ImageSets/test  # paths to each split
nc: 7  # number of classes
names: ["corn", "corn_plant", "cucumber", "cucumber_plant", "watermelon", "rice", "wheat"]  # class names
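A common mistake in the data file is an `nc` value that no longer matches the number of entries in `names` after a class is added or removed. The sketch below generates the file text from a single class list so that the two cannot disagree. The function name `build_data_yaml` is hypothetical; the names are written with `json.dumps`, whose list output is also a valid YAML flow sequence.

```python
import json


def build_data_yaml(train, val, test, names):
    """Return the text of a YOLOv5 data file; nc is derived from names."""
    if len(set(names)) != len(names):
        raise ValueError("class names must be unique")
    lines = [
        f"train: {train}",
        f"val: {val}",
        f"test: {test}",
        f"nc: {len(names)}",
        f"names: {json.dumps(names)}",
    ]
    return "\n".join(lines) + "\n"
```

The returned string can be written to any .yaml path and then passed to train.py through `--data`.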
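- Configuring the cfg file
- Relevant commands
python train.py --data <path_of_data.yaml> --cfg <path_of_cfg.yaml> --weights <path_of_pretrained_model>  # --weights can be omitted if you have no pretrained model
Notes:
  - path_of_data.yaml is the path to the data file.
  - path_of_cfg.yaml is the path to the cfg file.
  - path_of_pretrained_model is the path to the pretrained weights file. When --weights is omitted, train.py falls back to its default, yolov5s.pt; pass --weights '' to train from scratch.
# Check the model
python detect.py --weights <model> --source <arg>
# python detect.py --source 0                               # webcam
#                           img.jpg                         # image
#                           vid.mp4                         # video
#                           screen                          # screenshot
#                           path/                           # directory
#                           'path/*.jpg'                    # glob
#                           'https://youtu.be/LNwODJXcvt4'  # YouTube
#                           'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream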
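When training is launched from a script instead of the shell, the same command can be assembled as an argument list. This is a small sketch that uses only the `--data`, `--cfg`, and `--weights` flags of train.py; the helper name `build_train_command` is hypothetical.

```python
import shlex


def build_train_command(data_yaml, cfg_yaml, weights=None):
    """Build the argument list for train.py.

    `weights=None` leaves --weights out so train.py uses its default;
    an empty string requests training from scratch.
    """
    cmd = ["python", "train.py", "--data", data_yaml, "--cfg", cfg_yaml]
    if weights is not None:
        cmd += ["--weights", weights]
    return cmd


def format_command(cmd):
    """Render the list as a shell-safe string for logging."""
    return shlex.join(cmd)
```

The list can be passed to `subprocess.run(cmd, check=True)` from inside the yolov5 folder; keeping it as a list avoids quoting problems with paths that contain spaces.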
IV. Using the wandb Visualization Tool
V. Model Training Results for This Task
- Classes
This training run used seven classes:
corn
corn_plant
cucumber
cucumber_plant
watermelon
rice
wheat
- Training output images
More images are available in my repository.
- Validation results