Preface: to train YOLOv3 on your own dataset, you can also follow the official guide (linked here), which is short and easy to understand.
I. Download the code
git clone https://github.com/pjreddie/darknet
cd darknet
II. Compile
Edit the Makefile configuration; every line that needs changing is marked with a comment.
The modified Makefile:
GPU=1 # 1 to build with GPU support, 0 for CPU only
CUDNN=1 # 1 to build with cuDNN, otherwise 0
OPENCV=1 # 1 to build with OpenCV, otherwise 0
OPENMP=0 # 1 to build with OpenMP, otherwise 0
DEBUG=0 # 1 for a debug build, otherwise 0
ARCH= -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=[sm_50,compute_50] \
-gencode arch=compute_52,code=[sm_52,compute_52]
# -gencode arch=compute_20,code=[sm_20,sm_21] \ This one is deprecated?
# This is what I use, uncomment if you know your arch and want to specify
# ARCH= -gencode arch=compute_52,code=compute_52
VPATH=./src/:./examples
SLIB=libdarknet.so
ALIB=libdarknet.a
EXEC=darknet
OBJDIR=./obj/
##CC=gcc  # the biggest pitfall here: use the absolute path, otherwise the build fails
CC=/usr/bin/gcc
##CPP=g++  # same here
CPP=/usr/bin/g++
#NVCC=nvcc  # change to your own path, and mind your CUDA version number
NVCC=/usr/local/cuda-10.2/bin/nvcc
AR=ar
ARFLAGS=rcs
OPTS=-Ofast
LDFLAGS= -lm -pthread
COMMON= -Iinclude/ -Isrc/
CFLAGS=-Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC
ifeq ($(OPENMP), 1)
CFLAGS+= -fopenmp
endif
ifeq ($(DEBUG), 1)
OPTS=-O0 -g
endif
CFLAGS+=$(OPTS)
ifeq ($(OPENCV), 1)
COMMON+= -DOPENCV
CFLAGS+= -DOPENCV
LDFLAGS+= `pkg-config --libs opencv` -lstdc++
COMMON+= `pkg-config --cflags opencv`
endif
ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif
ifeq ($(CUDNN), 1)
COMMON+= -DCUDNN
CFLAGS+= -DCUDNN
LDFLAGS+= -lcudnn
endif
OBJ=gemm.o utils.o cuda.o deconvolutional_layer.o convolutional_layer.o list.o image.o activations.o im2col.o col2im.o blas.o crop_layer.o dropout_layer.o maxpool_layer.o softmax_layer.o data.o matrix.o network.o connected_layer.o cost_layer.o parser.o option_list.o detection_layer.o route_layer.o upsample_layer.o box.o normalization_layer.o avgpool_layer.o layer.o local_layer.o shortcut_layer.o logistic_layer.o activation_layer.o rnn_layer.o gru_layer.o crnn_layer.o demo.o batchnorm_layer.o region_layer.o reorg_layer.o tree.o lstm_layer.o l2norm_layer.o yolo_layer.o iseg_layer.o image_opencv.o
EXECOBJA=captcha.o lsd.o super.o art.o tag.o cifar.o go.o rnn.o segmenter.o regressor.o classifier.o coco.o yolo.o detector.o nightmare.o instance-segmenter.o darknet.o
ifeq ($(GPU), 1)
LDFLAGS+= -lstdc++
OBJ+=convolutional_kernels.o deconvolutional_kernels.o activation_kernels.o im2col_kernels.o col2im_kernels.o blas_kernels.o crop_layer_kernels.o dropout_layer_kernels.o maxpool_layer_kernels.o avgpool_layer_kernels.o
endif
EXECOBJ = $(addprefix $(OBJDIR), $(EXECOBJA))
OBJS = $(addprefix $(OBJDIR), $(OBJ))
DEPS = $(wildcard src/*.h) Makefile include/darknet.h
all: obj backup results $(SLIB) $(ALIB) $(EXEC)
#all: obj results $(SLIB) $(ALIB) $(EXEC)
$(EXEC): $(EXECOBJ) $(ALIB)
$(CC) $(COMMON) $(CFLAGS) $^ -o $@ $(LDFLAGS) $(ALIB)
$(ALIB): $(OBJS)
$(AR) $(ARFLAGS) $@ $^
$(SLIB): $(OBJS)
$(CC) $(CFLAGS) -shared $^ -o $@ $(LDFLAGS)
$(OBJDIR)%.o: %.cpp $(DEPS)
$(CPP) $(COMMON) $(CFLAGS) -c $< -o $@
$(OBJDIR)%.o: %.c $(DEPS)
$(CC) $(COMMON) $(CFLAGS) -c $< -o $@
$(OBJDIR)%.o: %.cu $(DEPS)
$(NVCC) $(ARCH) $(COMMON) --compiler-options "$(CFLAGS)" -c $< -o $@
obj:
mkdir -p obj
backup:
mkdir -p backup
results:
mkdir -p results
.PHONY: clean
clean:
rm -rf $(OBJS) $(SLIB) $(ALIB) $(EXEC) $(EXECOBJ) $(OBJDIR)/*
After the changes are done, compile:
make
III. Dataset preparation
1. Create a folder for your data
Create a myData folder under /darknet to hold your own data:
mkdir myData
2. Inside myData, create three folders:
JPEGImages: holds the images
Annotations: holds the corresponding xml annotation files
ImageSets: holds the dataset split lists (generated later)
cd myData
mkdir Annotations
mkdir JPEGImages
mkdir ImageSets
3. Under ImageSets, create a Main folder for the txt files that will be generated later:
cd ImageSets
mkdir Main
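For reference, steps 1–3 above can be collapsed into a single command run from the darknet root (equivalent to the mkdir sequence above):

```shell
mkdir -p myData/Annotations myData/JPEGImages myData/ImageSets/Main
```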
The resulting folder layout is shown in the figure below:
4. Put the jpg and xml files into their corresponding folders, then split the training and test sets with a Python script saved under /darknet/myData/:
import os
import random

# Fraction of the data sampled into the "trainval" pool (split further below
# into test and val); all remaining images go into train.
trainval_percent = 0.1
train_percent = 0.9

xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open('ImageSets/Main/trainval.txt', 'w')
ftest = open('ImageSets/Main/test.txt', 'w')
ftrain = open('ImageSets/Main/train.txt', 'w')
fval = open('ImageSets/Main/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftest.write(name)   # 90% of the sampled pool -> test
        else:
            fval.write(name)    # the rest of the pool -> val
    else:
        ftrain.write(name)      # everything outside the pool -> train

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
After the split, the four .txt files are stored in the Main folder created above, as shown in the figure:
IV. Data conversion
1. Convert the labels with voc_label.py
YOLOv3 labels differ from the VOC annotations we made earlier: each label line holds five numbers, <object-class> <x_center> <y_center> <width> <height>:
<object-class>: an integer class index from 0 to (num_classes - 1)
<x_center> <y_center> <width> <height>: the bounding box center x, center y, width, and height, normalized by the image size
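To make the conversion concrete, here is a small standalone sketch of the same center/size normalization that voc_label.py's convert() performs; the image size and box corners below are made-up example numbers:

```python
def convert(size, box):
    """Convert a VOC box (xmin, xmax, ymin, ymax) in pixels
    to YOLO's normalized (x_center, y_center, width, height)."""
    dw = 1.0 / size[0]                 # 1 / image width
    dh = 1.0 / size[1]                 # 1 / image height
    x = (box[0] + box[1]) / 2.0 - 1    # box center x (the -1 mirrors the darknet script)
    y = (box[2] + box[3]) / 2.0 - 1    # box center y
    w = box[1] - box[0]                # box width in pixels
    h = box[3] - box[2]                # box height in pixels
    return (x * dw, y * dh, w * dw, h * dh)

# A 640x480 image with a box spanning x: 100..300, y: 200..400
x, y, w, h = convert((640, 480), (100, 300, 200, 400))
print(x, y, w, h)  # all four values are normalized to [0, 1]
```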
So we convert the labels with voc_label.py:
cp ./scripts/voc_label.py ./my_voc_label.py
2. Modify my_voc_label.py
What needs changing are your own paths and the class list.
The complete modified my_voc_label.py:
#!/usr/bin/env python
# coding=utf-8
import xml.etree.ElementTree as ET
import os
from os import listdir, getcwd
from os.path import join

##sets=[('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
sets=[('myData', 'train'), ('myData', 'val'), ('myData', 'test')]

##classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
classes = ["car", "truck", "bus", "moto", "bike", "tricycle", "pedestrian", "plate", "driver", "codriver", "tissue", "mark", "decorate"]

def convert(size, box):
    dw = 1./(size[0])
    dh = 1./(size[1])
    x = (box[0] + box[1])/2.0 - 1
    y = (box[2] + box[3])/2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(year, image_id):
    #in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
    in_file = open('myData/Annotations/%s.xml'%(image_id))
    #out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w')
    out_file = open('myData/labels/%s.txt'%(image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult)==1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()
for year, image_set in sets:
    if not os.path.exists('myData/labels/'):
        os.makedirs('myData/labels/')
    image_ids = open('myData/ImageSets/Main/%s.txt'%(image_set)).read().strip().split()
    list_file = open('myData/%s_%s.txt'%(year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/myData/JPEGImages/%s.jpg\n'%(wd, image_id))
        convert_annotation(year, image_id)
    list_file.close()

##os.system("cat 2007_train.txt 2007_val.txt 2012_train.txt 2012_val.txt > train.txt")
##os.system("cat 2007_train.txt 2007_val.txt 2007_test.txt 2012_train.txt 2012_val.txt > train.all.txt")
os.system("cat ./myData/myData_train.txt ./myData/myData_val.txt > ./myData/train.txt")
os.system("cat ./myData/myData_train.txt ./myData/myData_val.txt ./myData/myData_test.txt > ./myData/train.all.txt")
3. Run my_voc_label.py
python my_voc_label.py
After running, the corresponding txt labels appear under /myData/labels/:
Opening one of them shows five numbers per line, as described above:
Five list files are also generated under /myData/:
Opening one of those shows the image paths already written out:
V. Modify the configuration files
1. For convenience and safety, copy every config file that needs changes into a new folder of your own, modify them there, and point the later training commands at these paths.
cd /darknet/myData/
mkdir my_cfg
cd ..
cp ./data/voc.names ./myData/my_cfg/my_voc.names
cp ./cfg/voc.data ./myData/my_cfg/my_voc.data
cp ./cfg/yolov3-voc.cfg ./myData/my_cfg/my_yolov3-voc.cfg
2. Modify my_voc.names: replace the class names with your own, one per line.
3. Modify my_voc.data: change the training data paths, the class count, and the backup/weights path to your own.
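For reference, a my_voc.data along these lines might look like the sketch below; the absolute paths are illustrative placeholders and must be replaced with your own:

```
classes = 13
train   = /home/user/darknet/myData/train.txt
valid   = /home/user/darknet/myData/myData_test.txt
names   = /home/user/darknet/myData/my_cfg/my_voc.names
backup  = /home/user/darknet/myData/weights/
```

The backup directory is where the .weights checkpoints land during training, so make sure it exists before you start.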
4. Modify my_yolov3-voc.cfg:
1) Comment out the Testing section and enable Training mode; set batch and subdivisions to fit your machine, and set the input width and height you want.
2) Change the class count in the three yolo layers.
Ctrl+F and search for "yolo"; there are 3 matches, and each needs 2 changes: classes and filters.
classes = 13 in this example (the number of classes in your list)
filters = (classes + 5) * 3 = (13 + 5) * 3 = 54, set in the [convolutional] layer immediately before each [yolo] layer
Also, random defaults to 1; if GPU memory is sufficient leave it at 1, otherwise set it to 0.
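Putting 1) and 2) together, each of the three matches ends up looking roughly like the sketch below (only the keys being changed are shown; mask, anchors, and the other keys differ per layer and should be left as shipped):

```
[convolutional]
filters=54        # (classes + 5) * 3 = (13 + 5) * 3

[yolo]
classes=13
random=0          # 0 if GPU memory is tight; 1 enables multi-scale training
```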
VI. Start training
1. Download the pretrained weights into the darknet directory:
wget https://pjreddie.com/media/files/darknet53.conv.74
2. Train on your own data:
./darknet detector train myData/my_cfg/my_voc.data myData/my_cfg/my_yolov3-voc.cfg darknet53.conv.74 -gpus 0,1
3. Training begins
Reading the training log:
Region xx: index of the yolo layer in the cfg file
Avg IOU: average IoU between predicted and ground-truth boxes in the current batch; higher is better, ideally approaching 1
Class: classification accuracy on the annotated objects; higher is better, ideally approaching 1
Obj: objectness confidence; higher is better, ideally approaching 1
No Obj: should stay small; lower is better
.5R: recall at an IoU threshold of 0.5; recall = detected positives / actual positives
.75R: recall at an IoU threshold of 0.75
count: number of positive samples
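If you want to track these numbers across iterations, a small parser can pull them out of the log. The sketch below assumes the line format darknet typically prints for these fields (the sample line is made up); adjust the regex if your build's output differs:

```python
import re

# One "Region" line per yolo layer per batch, in darknet's usual format.
# The field names mirror the legend above.
LINE_RE = re.compile(
    r"Region (?P<layer>\d+) Avg IOU: (?P<avg_iou>[\d.]+|nan), "
    r"Class: (?P<cls>[\d.]+|nan), Obj: (?P<obj>[\d.]+|nan), "
    r"No Obj: (?P<no_obj>[\d.]+|nan), .5R: (?P<r50>[\d.]+|nan), "
    r".75R: (?P<r75>[\d.]+|nan),\s+count: (?P<count>\d+)"
)

def parse_region_line(line):
    """Return a dict of the metrics in one Region log line, or None."""
    m = LINE_RE.search(line)
    if not m:
        return None
    out = m.groupdict()
    out["layer"] = int(out["layer"])
    out["count"] = int(out["count"])
    return out

sample = ("Region 82 Avg IOU: 0.581234, Class: 0.712345, Obj: 0.423456, "
          "No Obj: 0.004567, .5R: 0.750000, .75R: 0.250000,  count: 8")
print(parse_region_line(sample))
```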
4. Resuming from a checkpoint
The weights files produced by training are stored in /darknet/myData/weights/.
If training is interrupted, there is no need to start over; resume from where the last run left off:
./darknet detector train myData/my_cfg/my_voc.data myData/my_cfg/my_yolov3-voc.cfg myData/weights/my_yolov3-voc.backup -gpus 0,1
VII. Testing
Before testing, edit my_yolov3-voc.cfg again: comment out the Training section and enable Testing mode.
Run the test:
./darknet detector test myData/my_cfg/my_voc.data myData/my_cfg/my_yolov3-voc.cfg myData/weights/my_yolov3-voc_final.weights test_pic/test.jpg -gpus 0
Test result:
Postscript: the official test command handles only one image at a time; batch testing needs custom code, which will be added in a later update.
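Until then, a simple stopgap is to loop the single-image command over a folder from Python. This is only a sketch under assumptions: build_test_cmd() is a hypothetical helper, the paths mirror the command above, and each call reloads the network, so it is slow for large batches:

```python
import os
import subprocess

DATA = "myData/my_cfg/my_voc.data"
CFG = "myData/my_cfg/my_yolov3-voc.cfg"
WEIGHTS = "myData/weights/my_yolov3-voc_final.weights"

def build_test_cmd(image_path, data=DATA, cfg=CFG, weights=WEIGHTS):
    """Assemble the single-image darknet test command for one image."""
    return ["./darknet", "detector", "test", data, cfg, weights, image_path]

def batch_test(image_dir):
    """Run darknet on every jpg in image_dir, one process per image."""
    for name in sorted(os.listdir(image_dir)):
        if name.lower().endswith((".jpg", ".jpeg")):
            subprocess.run(build_test_cmd(os.path.join(image_dir, name)))

if __name__ == "__main__" and os.path.isdir("test_pic"):
    batch_test("test_pic")
```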
Training YOLOv4 will be covered in a later post.