[August 2020 Pitfall Roundup] Training Pytorch-YOLOv4 on Your Own Data (VOC Format)

Preface

I spent a bit over a week running all of the currently high-starred PyTorch YOLOv4 implementations. This post is a write-up of Tianxiaomo's version: https://github.com/Tianxiaomo/pytorch-YOLOv4.
During training and evaluation, I ran into the following issues:

  • When evaluating COCO metrics, you have to write your own get_img_id function (solved)
  • A few runtime errors (solved)
  • Detections look good and the bbox confidences are high, but mAP is low (unsolved)

Dataset Preparation

The dataset format is the same as qqwweee's keras-yolo3:

# train.txt
image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
...
...
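For instance, a line for an image containing two boxes of class 0 might look like this (path and coordinates are illustrative):

VOCdevkit/VOC2007/JPEGImages/000001.jpg 48,240,195,371,0 8,12,352,498,0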

1. Split the train, val, and test sets

This is the same dataset setup as for qqwweee's keras-yolo3: put all XML files in VOCdevkit\VOC2007\Annotations and all images in VOCdevkit\VOC2007\JPEGImages. Then run test.py from inside VOCdevkit\VOC2007 (its paths are relative). test.py is as follows:

import os
import random

trainval_percent = 0.9   # fraction of all samples used for train+val (rest: test)
train_percent = 0.8      # fraction of trainval used for train (rest: val)
xmlfilepath = './Annotations/'
txtsavepath = './ImageSets/Main/'
total_xml = os.listdir(xmlfilepath)
os.makedirs(txtsavepath, exist_ok=True)

num = len(total_xml)
indices = range(num)     # renamed to avoid shadowing the built-in `list`
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))
trainval = set(trainval)  # sets make the membership tests below O(1)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the '.xml' extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

2. Generate the required annotation format

Create pytorch-YOLOv4-master\voc_annotation.py with the contents below, changing classes to your own classes.
Step 1's test.py has already generated train.txt, test.txt, val.txt, and trainval.txt in VOCdevkit\VOC2007\ImageSets\Main; running voc_annotation.py reads those splits and writes 2007_train.txt, 2007_val.txt, and 2007_test.txt to the project root.
Copy these to the data\ folder, renamed to train.txt, val.txt, and test.txt.

import xml.etree.ElementTree as ET
from os import getcwd

sets = [('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

#classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
classes = ['apple']  # change to your own classes

def convert_annotation(year, image_id, list_file):
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id))
    tree = ET.parse(in_file)
    root = tree.getroot()
    in_file.close()

    for obj in root.iter('object'):
        difficult = obj.find('difficult')
        difficult = difficult.text if difficult is not None else '0'  # some labeling tools omit <difficult>
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        # int(float(...)) tolerates coordinates saved as e.g. "123.0"
        b = (int(float(xmlbox.find('xmin').text)), int(float(xmlbox.find('ymin').text)),
             int(float(xmlbox.find('xmax').text)), int(float(xmlbox.find('ymax').text)))
        list_file.write(" " + ",".join(str(a) for a in b) + ',' + str(cls_id))

wd = getcwd()

for year, image_set in sets:
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt' % (year, image_set), 'w')
    for image_id in image_ids:
        # one image per line: absolute path followed by its boxes
        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg' % (wd, year, image_id))
        convert_annotation(year, image_id, list_file)
        list_file.write('\n')
    list_file.close()
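Note that the generated files use absolute paths (via getcwd()); a line of 2007_train.txt might look like this (path and box are illustrative):

/home/user/pytorch-YOLOv4-master/VOCdevkit/VOC2007/JPEGImages/000123.jpg 48,240,195,371,0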

Configuration

1. cfg.py

Modify the following items:

Cfg.batch = 2          # lower batch / raise subdivisions if GPU memory is tight
Cfg.subdivisions = 1

Cfg.TRAIN_EPOCHS = 2   # number of training epochs

Cfg.train_label = os.path.join(_BASE_DIR, 'data', 'train.txt')
Cfg.val_label = os.path.join(_BASE_DIR, 'data', 'val.txt')

2. dataset.py

This step lets you evaluate mAP on the current model during training. If you don't need mAP evaluation, comment out lines 415-440 of train.py and skip this step.

If you do want mAP evaluation, modify get_image_id in dataset.py as follows:

def get_image_id(filename:str) -> int:
    """
    Convert a string to an integer.
    Make sure that the images and the `image_id`s are in one-to-one correspondence.
    There are already `image_id`s in annotations of the COCO dataset,
    in which case this function is unnecessary.
    For creating one's own `get_image_id` function, one can refer to
    https://github.com/google/automl/blob/master/efficientdet/dataset/create_pascal_tfrecord.py#L86
    or refer to the following code (where the filenames are like 'level1_123.jpg')
    >>> lv, no = os.path.splitext(os.path.basename(filename))[0].split("_")
    >>> lv = lv.replace("level", "")
    >>> no = f"{int(no):04d}"
    >>> return int(lv+no)
    """
    # works only when image names are purely numeric, e.g. 000123.jpg
    imgname = os.path.splitext(os.path.basename(filename))[0]
    return int(imgname)

Note:

  • The approach above only works when image names are purely numeric; for names like IMG_123.jpg, see the original author's docstring.
  • get_image_id just has to assign each image a unique id; here the image name itself serves as the id.
  • You can also build a dictionary mapping each image to an id (see the sketch after this list).
  • The global-image-id counter used on GitHub is broken: the first epoch trains fine, but in the second epoch each image gets a different id than in the first, which causes an error.
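A minimal sketch of the dictionary approach (my own variant, not from the repo): sort the image filenames once and assign each a stable integer id, so ids stay identical across epochs:

import os

_id_map = None  # filename stem -> integer id, built once

def get_image_id(filename: str) -> int:
    global _id_map
    if _id_map is None:
        image_dir = 'VOCdevkit/VOC2007/JPEGImages'  # assumption: point this at your image folder
        stems = sorted(os.path.splitext(n)[0] for n in os.listdir(image_dir))
        _id_map = {stem: i for i, stem in enumerate(stems)}
    return _id_map[os.path.splitext(os.path.basename(filename))[0]]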

3. train.py

1) Download the original author's weights and put them in the weights\ folder.
2) Configure the parameters below; note in particular that the lr in train.py is the one that takes effect, not the one in cfg.py.
3) -dir is the path to the dataset images.

parser.add_argument('-l', '--learning-rate', metavar='LR', type=float, nargs='?', default=0.001,
                    help='Learning rate', dest='learning_rate')
parser.add_argument('-g', '--gpu', metavar='G', type=str, default='0',
                    help='GPU', dest='gpu')
parser.add_argument('-dir', '--data-dir', type=str, default='VOCdevkit/VOC2007/JPEGImages',
                    help='dataset dir', dest='dataset_dir')
parser.add_argument('-pretrained', type=str, default='weights/yolov4.pth', help='pretrained yolov4.conv.137')
parser.add_argument('-classes', type=int, default=1, help='dataset classes')
parser.add_argument('-train_label_path', dest='train_label', type=str, default='data/train.txt', help="train label path")
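With the defaults above, a plain python train.py is enough; an explicit invocation (assuming the paths used in this post) would look like:

python train.py -l 0.001 -g 0 -dir VOCdevkit/VOC2007/JPEGImages -pretrained weights/yolov4.pth -classes 1 -train_label_path data/train.txt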

Training

Run train.py and that's it. I did hit a few odd errors during earlier runs, some of which have since been fixed upstream:

num = operator.index(num) TypeError: 'numpy.float64' object cannot be interpreted as an integer

Downgrade numpy; 1.16 works (e.g. pip install numpy==1.16.6).

Testing Tianxiaomo's YOLOv4

Testing the official pretrained weights

Reference: this blog post. A clever point in the author's code: instead of dropping the images to be detected into a folder and iterating over the directory, it iterates over test.txt (which lists the test image names) and runs detection on each. Below is the code after my error fixes:

if __name__ == "__main__":
    import os
    import cv2
    import torch
    import pandas as pd
    from tqdm import tqdm

    from models import Yolov4
    from tool.utils import load_class_names, plot_boxes_cv2
    from tool.torch_utils import do_detect

    weightfile = 'weights/yolov4.pth'
    train_data_path = 'VOCdevkit/VOC2007/ImageSets/Main/test1.txt'
    root_dir = 'VOCdevkit/VOC2007/JPEGImages/'  # image folder path
    conv137weight = 'weights/yolov4.conv.137.pth'

    n_classes = 80
    width = 608
    height = 608

    all_box_dict = {}

    model = Yolov4(yolov4conv137weight=conv137weight, n_classes=n_classes, inference=True)

    pretrained_dict = torch.load(weightfile, map_location=torch.device('cuda'))
    model.load_state_dict(pretrained_dict)

    use_cuda = True
    if use_cuda:
        model.cuda()

    namesfile = 'data/coco.names'
    class_names = load_class_names(namesfile)

    # test1.txt holds one image name per line
    test_data_list = list(pd.read_csv(train_data_path, index_col=False, header=None, sep=' ')[0])

    for i in tqdm(test_data_list):
        img_path = os.path.join(root_dir, str(i)) + '.jpg'
        img = cv2.imread(img_path)

        # The inference size need not match the training size;
        # valid sizes are 320 + 96 * n for height and 320 + 96 * m for width
        # (320, 416, 512, 608, ...).
        sized = cv2.resize(img, (width, height))
        sized = cv2.cvtColor(sized, cv2.COLOR_BGR2RGB)

        for _ in range(2):
            # run twice as a speed check: the first iteration is usually
            # slower because of CUDA warm-up
            boxes = do_detect(model, sized, 0.4, 0.6, use_cuda)

        save_path = os.path.join('demo/', str(i)) + '.jpg'
        img = plot_boxes_cv2(img, boxes[0], save_path, class_names)
        all_box_dict[i] = boxes[0]

    # dictionary of detection boxes for all images
    print(all_box_dict)
  1. Adjust weightfile and the other three paths, plus namesfile, save_path, and the width/height values, to match your own files.
  2. The file pointed to by train_data_path (test1.txt) contains only image names, e.g.:

IMGxxxxx1
IMGxxxxx2
IMGxxxxx3

  3. The official weights must be paired with the 80-class coco.names. (I used my own dataset's coco.names by mistake and kept getting an out-of-index error during detection; it took me ages to track down. -_-||)

Results are shown in the figure (the label font is a bit large).
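If you want to keep all_box_dict around for later analysis instead of just printing it, a minimal sketch using pickle (the filename is my own choice):

import pickle

# save the detections so inference doesn't have to be re-run
with open('all_box_dict.pkl', 'wb') as f:
    pickle.dump(all_box_dict, f)

# reload later
with open('all_box_dict.pkl', 'rb') as f:
    all_box_dict = pickle.load(f)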

Testing your own trained weights

  1. Replace the contents of coco.names with your own classes and set n_classes = 1
  2. Change weightfile to your trained weights
The script is otherwise identical to the one above; only the weight file and the class count change:

weightfile = 'weights/Yolov4_epoch70.pth'
n_classes = 1


References:
手口一斤's blog
神码堂's blog
