YOLO Training Preparation

Preparing the data:

1. Before reading this document, make sure you have collected the images you want to train on; they should contain the targets you want to detect.
2. Prepare an annotation tool. labelImg is recommended; download it from https://github.com/tzutalin/labelImg
3. At this point you should have the Annotations and JPEGImages folders ready. Annotations holds the xml files generated during annotation, which contain the labeling information; JPEGImages holds the annotated images. ImageSets contains a Main folder with four txt files, of which train.txt and val.txt are the two mainly used for training. Finally, the labels folder stores the txt files produced when the VOC annotations are converted into the format YOLO needs; how to generate them is explained later. A sketch of the expected layout is shown below, and the next thing to cover is the create_main_txt.py script.
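
For reference, the layout assumed in this document looks roughly like this (the root folder name is just a placeholder):

VOC_dataset/            # placeholder name for your dataset root
├── Annotations/        # xml files produced by labelImg
├── JPEGImages/         # the annotated images
├── ImageSets/
│   └── Main/           # train.txt, val.txt, trainval.txt, test.txt
└── labels/             # YOLO-format txt labels generated later by voc_label.py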

4. The create_main_txt.py script
# coding:utf-8

import os

if __name__ == '__main__':
    path = os.getcwd()
    dataAnnotated = os.listdir(path + '/Annotations')  # could be improved: assumes every file here is a valid xml
    dataNum = len(dataAnnotated)  # number of annotated samples

    ftest = open('ImageSets/Main/test.txt', 'w')  # test set
    ftrain = open('ImageSets/Main/train.txt', 'w')  # training set
    ftrainval = open('ImageSets/Main/trainval.txt', 'w')  # train + val set
    fval = open('ImageSets/Main/val.txt', 'w')  # validation set
    testScale = 0.0  # fraction of the whole dataset used as the test set
    trainScale = 0.85  # fraction of the train+val set used for training

    i = 1
    testNum = int(dataNum * testScale)  # number of test samples
    trainNum = int((dataNum - testNum) * trainScale)  # number of training samples

    for name in dataAnnotated:
        (filename, extension) = os.path.splitext(name)
        if i <= testNum:
            ftest.write(str(filename) + "\n")
        elif i <= testNum + trainNum:
            ftrain.write(str(filename) + "\n")
            ftrainval.write(str(filename) + "\n")
        else:
            fval.write(str(filename) + "\n")
            ftrainval.write(str(filename) + "\n")
        i += 1

    ftrain.close()
    ftrainval.close()
    fval.close()
    ftest.close()

This script generates the four txt files under Main. Each file lists the image IDs (file names without extension) belonging to the corresponding split (training, test, train+val, validation). Note that the script is a bit naive: it assumes every xml in Annotations is valid. labelImg creates an xml as soon as an image is annotated, and if you later delete the labeled objects, the xml for that image still exists even though the image no longer contains anything we want to train on, so it would still be fed into training. Therefore, before using this script (run it from the dataset root so that Annotations and ImageSets/Main resolve correctly), make sure all xml files are valid.
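
Because the script above trusts everything it finds in Annotations, a minimal pre-check like the following sketch can flag empty or orphaned annotations first (it assumes the Annotations/JPEGImages layout described above and .jpg images; adjust the directory names to your own setup):

# Minimal sketch: keep only xml files that still contain at least one <object>
# and whose image actually exists; everything else is reported and skipped.
import os
import xml.etree.ElementTree as ET

ann_dir = 'Annotations'   # assumed locations, adjust as needed
img_dir = 'JPEGImages'

valid_ids = []
for name in os.listdir(ann_dir):
    if not name.endswith('.xml'):
        continue
    stem = os.path.splitext(name)[0]
    root = ET.parse(os.path.join(ann_dir, name)).getroot()
    has_objects = root.find('object') is not None
    has_image = os.path.exists(os.path.join(img_dir, stem + '.jpg'))
    if has_objects and has_image:
        valid_ids.append(stem)
    else:
        print('skipping invalid annotation:', name)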
5. At this point the data is ready in VOC format. Next we need to convert it into the format YOLO supports; again, there is a script for this: voc_label.py.
6. The voc_label.py script:
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

sets=[('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]


def convert(size, box):
    # Convert a VOC box (xmin, xmax, ymin, ymax) in pixels into the YOLO format:
    # normalized (x_center, y_center, width, height) relative to the image size.
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(year, image_id):
    # Read one VOC xml file and write the corresponding YOLO label file.
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
    out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w')
    tree=ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()

wd = getcwd()

for year, image_set in sets:
    if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):
        os.makedirs('VOCdevkit/VOC%s/labels/'%(year))
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt'%(year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'%(wd, year, image_id))
        convert_annotation(year, image_id)
    list_file.close()


As before, adjust the paths in the script to match your own directories (and, if you are training on your own classes, update the classes list as well).
7. voc_label.py generates two kinds of output:
  • the per-set image list files '%s_%s.txt'%(year, image_set), e.g. 2007_train.txt, each line holding the absolute path of one image
  • the YOLO label files 'VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), one per image
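
Each line in a label file has the form class_id x_center y_center width height, where the four box values are normalized to [0, 1] by the image width and height, for example (illustrative values only):

11 0.344 0.611 0.416 0.262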



Preparing the training code and configuration files

1. Download the training code: https://github.com/pjreddie/darknet
2. Prepare the configuration files. The files involved are:
  • the txt files listing the absolute image paths of the training and validation sets (two of the files generated by voc_label.py above, here train.txt and val.txt)
  • the .data file, which must contain the following entries:
    classes= 20
    train  = /home/pjreddie/data/voc/train.txt
    valid  = /home/pjreddie/data/voc/val.txt
    names = data/voc.names
    backup = backup
  • the .names file, which lists the annotated class names, one per line, in this format:
    aeroplane
    bicycle
    bird
    boat
    bottle
    bus
    car
    cat
    chair
    cow
    diningtable
    dog
    horse
    motorbike
    person
    pottedplant
    sheep
    sofa
    train
    tvmonitor
  • the backup entry specifies the directory where the weight files produced during training are stored; here it is the backup directory under the current directory
  • the .cfg file, here yolo.cfg:
[net]
batch=1
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 120000
policy=steps
steps=-1,100,80000,100000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky


#######

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[route]
layers=-9

[reorg]
stride=2

[route]
layers=-1,-3

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=425
activation=linear

[region]
anchors = 0.738768,0.874946,  2.42204,2.65704,  4.30971,7.04493,  10.246,4.59428,  12.6868,11.8741
bias_match=1
classes=80
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=0
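
Note that this yolo.cfg has classes=80 and filters=425 in the final 1x1 convolutional layer (the COCO configuration), while the .data and .names files above describe 20 classes. For your own data, set classes in the [region] section to your class count and set the filters of that last convolutional layer to num * (classes + 5). The snippet below is just a quick arithmetic check, not part of any script in this document:

# Quick check for the filter count of the last convolutional layer before [region]:
# filters = num * (classes + 5), where 5 = 4 box coordinates + 1 objectness score.
num_anchors = 5      # "num" in the [region] section
num_classes = 20     # e.g. the 20 classes listed in voc.names

filters = num_anchors * (num_classes + 5)
print(filters)       # 125 for 20 classes; with classes=80 this gives the 425 above

With 20 classes this gives 125, which is the value used by the VOC variants of the cfg files in the darknet repository.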


3. The training command
After the project compiles successfully, a darknet executable is produced. Run it with the paths to the prepared .data and .cfg files; you can also pass -gpus 0,1,2,3,4 to choose which GPUs to train on (multi-GPU training can sometimes make the Obj value in the log drop close to zero), along with other options.
./darknet detector  train cfg/voc.data cfg/yolo-voc.2.0.cfg -gpus 0,1,2,3  2>netlog.txt | tee train_log.txt
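
If training is interrupted, it can be resumed from the most recent weights darknet saves in the backup directory by passing that weights file as an additional argument; the .backup file name below is an assumption based on the cfg name used above:
./darknet detector train cfg/voc.data cfg/yolo-voc.2.0.cfg backup/yolo-voc.2.0.backup -gpus 0,1,2,3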

4. Start training.


