PaddleDetection Series [1]: Training PP-YOLOE+ on Your Own Dataset
1. Preparing the dataset
1.1. Converting a labelImg dataset into a COCO dataset that PaddleDetection can train on
A dataset annotated with labelImg consists of a JPEGImage directory holding the images and an Annotation directory holding the XML files.
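The conversion command later in this section passes a label_list.txt to x2coco.py, but the post does not show how that file is produced. Below is a minimal sketch, assuming standard labelImg (Pascal VOC) XML in which every `object` element has a `name` child, and assuming label_list.txt holds one class name per line. The helper names `collect_labels` and `write_label_list` are hypothetical, and sorting the names alphabetically is just one possible ordering.

```python
import os
import xml.etree.ElementTree as ET


def collect_labels(anno_dir):
    """Return the sorted class names found in all VOC XML files under anno_dir."""
    labels = set()
    for fname in os.listdir(anno_dir):
        if not fname.endswith('.xml'):
            continue
        root = ET.parse(os.path.join(anno_dir, fname)).getroot()
        # labelImg writes one <object><name>...</name></object> per box.
        for obj in root.iter('object'):
            name = obj.findtext('name')
            if name:
                labels.add(name.strip())
    return sorted(labels)


def write_label_list(anno_dir, out_path):
    """Write one class name per line (assumed label_list.txt layout)."""
    labels = collect_labels(anno_dir)
    with open(out_path, 'w') as f:
        f.writelines(label + '\n' for label in labels)
    return labels
```

With the paths used in this post, the call would be `write_label_list('/home/geekplusa/ai/datasets/Annotation', '/home/geekplusa/ai/datasets/label_list.txt')`. The order of the lines determines the category order in the generated JSON, so keep the file unchanged between conversions.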
- Use the script split_dataset.py to split the data into training, test, and validation sets and write them out as VOC-style list files.
# -*- coding: UTF-8 -*-
import os
import random

base_dir = '/home/geekplusa/ai/datasets/'


def split_dataset():
    xmlfilepath = os.path.join(base_dir, 'Annotation')
    saveBasePath = os.path.join(base_dir, 'ImageSets/Main0')
    os.makedirs(saveBasePath, exist_ok=True)

    trainval_percent = 0.9  # share of all samples used for train + val
    train_percent = 0.9     # share of train + val used for train; the rest is val

    # Keep only the XML annotation files.
    total_xml = [f for f in os.listdir(xmlfilepath) if f.endswith('.xml')]

    num = len(total_xml)
    indices = range(num)
    tv = int(num * trainval_percent)
    tr = int(tv * train_percent)
    trainval_list = random.sample(indices, tv)
    # Sets make the membership tests in the loop below fast.
    train = set(random.sample(trainval_list, tr))
    trainval = set(trainval_list)

    print("train and val size", tv)
    print("train size", tr)

    with open(os.path.join(saveBasePath, 'trainval.txt'), 'w') as ftrainval, \
            open(os.path.join(saveBasePath, 'test.txt'), 'w') as ftest, \
            open(os.path.join(saveBasePath, 'train.txt'), 'w') as ftrain, \
            open(os.path.join(saveBasePath, 'val.txt'), 'w') as fval:
        for i in indices:
            name = total_xml[i][:-4] + '\n'  # file name without ".xml"
            if i in trainval:
                ftrainval.write(name)
                if i in train:
                    ftrain.write(name)
                else:
                    fval.write(name)
            else:
                ftest.write(name)


if __name__ == '__main__':
    split_dataset()
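To make the arithmetic concrete: with 1000 XML files, the script picks int(1000 × 0.9) = 900 names for trainval, int(900 × 0.9) = 810 of those for train, leaves the remaining 90 for val, and sends the other 100 to test. The sketch below restates the same split as a pure function so it can be checked without touching the disk. The function name `split_names` and the `seed` parameter are additions for illustration and are not part of split_dataset.py.

```python
import random


def split_names(names, trainval_percent=0.9, train_percent=0.9, seed=None):
    """Split names into train/val/test lists with the ratios used by split_dataset.py."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    names = sorted(names)      # os.listdir order is arbitrary, so sort first
    tv = int(len(names) * trainval_percent)
    tr = int(tv * train_percent)

    trainval = rng.sample(names, tv)
    train = set(rng.sample(trainval, tr))
    trainval_set = set(trainval)

    return {
        'train': [n for n in names if n in train],
        'val': [n for n in names if n in trainval_set and n not in train],
        'test': [n for n in names if n not in trainval_set],
    }


splits = split_names(['img_%04d' % i for i in range(1000)], seed=0)
print({k: len(v) for k, v in splits.items()})
# {'train': 810, 'val': 90, 'test': 100}
```

Seeding the generator is a design choice for repeatability: rerunning an unseeded split reshuffles the images, so files that were in the test set can leak into training on the next run.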
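- Use the tools/x2coco.py script that ships with the PaddleDetection project to convert the data to COCO format and generate the corresponding JSON files. The example below generates train.json.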
python tools/x2coco.py --dataset_type voc \
    --image_input_dir '/home/geekplusa/ai/datasets/JPEGImage/' \
    --voc_anno_dir '/home/geekplusa/ai/datasets/Annotation/' \
    --voc_anno_list '/home/geekplusa/ai/datasets/ImageSets/Main0/train.txt' \
    --voc_label_list '/home/geekplusa/ai/datasets/label_list.txt' \
    --output_dir '/home/geekplusa/ai/datasets/ImageSets/coco' \
    --voc_out_name 'train.json'
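The same command has to be repeated for the other list files. The sketch below builds the three invocations in Python using only the flags from the command above. The output names val.json and test.json are assumptions that mirror train.json, and `build_x2coco_cmd` is a hypothetical helper.

```python
import os
import shlex

base_dir = '/home/geekplusa/ai/datasets'


def build_x2coco_cmd(split):
    """Return the x2coco.py argument list for one split ('train', 'val' or 'test')."""
    return [
        'python', 'tools/x2coco.py',
        '--dataset_type', 'voc',
        '--image_input_dir', os.path.join(base_dir, 'JPEGImage'),
        '--voc_anno_dir', os.path.join(base_dir, 'Annotation'),
        '--voc_anno_list', os.path.join(base_dir, 'ImageSets/Main0', split + '.txt'),
        '--voc_label_list', os.path.join(base_dir, 'label_list.txt'),
        '--output_dir', os.path.join(base_dir, 'ImageSets/coco'),
        '--voc_out_name', split + '.json',  # assumed naming for val and test
    ]


for split in ('train', 'val', 'test'):
    # Only print the commands here. To run them, pass the list to
    # subprocess.run(cmd, check=True) from the PaddleDetection root directory.
    print(shlex.join(build_x2coco_cmd(split)))
```

Building the command as a list rather than a single string avoids shell quoting mistakes in the paths.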
2. Training the model
2.1 Editing the configuration files
Edit coco_detection.yml so that it points to your own data.
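As an illustration, the edited dataset section might look like the fragment below. This is a sketch that assumes the layout of the stock configs/datasets/coco_detection.yml (fields `num_classes`, `image_dir`, `anno_path`, `dataset_dir`). The class count is a placeholder, val.json is an assumed file name, and field names can differ between PaddleDetection versions, so compare it against the file in your own checkout.

```yaml
metric: COCO
num_classes: 3  # placeholder: set to the number of lines in label_list.txt

TrainDataset:
  name: COCODataSet
  image_dir: JPEGImage
  anno_path: ImageSets/coco/train.json
  dataset_dir: /home/geekplusa/ai/datasets
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  name: COCODataSet
  image_dir: JPEGImage
  anno_path: ImageSets/coco/val.json  # assumed name, mirrors train.json
  dataset_dir: /home/geekplusa/ai/datasets
```

Here `image_dir` and `anno_path` are taken to be relative to `dataset_dir`, which is how the stock file is organized.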
2.2 Training scripts
- Single-GPU training
python3 tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco_my.yml --eval --amp
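- Multi-GPU training (runs in the background and writes its output to logs/log_gkpw.log; the logs directory must already exist)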
nohup python3 -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco_my.yml --eval >logs/log_gkpw.log 2>&1 &