YOLOv4 (AlexeyAB's darknet) and YOLOv5: training on a single class of the COCO dataset

Author: 愿望是当打工人

Please credit the source when reposting.

Article link: https://blog.csdn.net/weixin_40557160/article/details/116004086

Training a single class of the COCO dataset

Preface
Experiments

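Preface

I have recently been using YOLOv4 for large-scale detection, but I found very little material on the topic. There is not even a reasonably precise definition of what "large-scale detection" means. The task my advisor gave me is dense crowd detection: keeping detection accurate in scenes that contain many people.
Since I was already using YOLOv4, I decided to train it on just the person class from COCO, run the experiment, and see how it performs.
After that I will train YOLOv5 the same way.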

Experiments

Dataset preparation

First download the COCO dataset. I used COCO 2017; download links are easy to find online, and my blog has one as well.
For a tutorial on this part, see my GitHub: https://github.com/SpongeBab/COCO_only_person

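Extracting a single class (person as the example)

I did not use the common approach of picking one category out of the JSON annotation file and regenerating the IDs, because it is too much trouble. Fortunately, I found label files in TXT format through the YOLOv5 GitHub project. Each image has a label file with the same name: check whether that label file contains the ID of the class you need (0 for person), and if it does, use the label file name to find the corresponding image.
The code is as follows: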

import os
import shutil


def select_person(source_label_path, output_label_path, source_images_path, output_images_path):
    """Keep only the class-0 (person) label lines and copy the matching images."""
    os.makedirs(output_label_path, exist_ok=True)
    os.makedirs(output_images_path, exist_ok=True)
    for txt_file in os.listdir(source_label_path):
        txt_name, extension = os.path.splitext(txt_file)
        if extension != ".txt":
            continue
        with open(os.path.join(source_label_path, txt_file), "r") as f:
            # Each line is "class_id x_center y_center width height".
            # Do not strip the lines: the trailing newline must be kept so that
            # the boxes are written back one per line.
            person_lines = [line for line in f if line.split(" ")[0] == "0"]
        if not person_lines:
            continue  # this image contains no person
        # All person lines of one image are written in a single pass, so "w" is
        # safe here and re-running the script does not duplicate lines.
        with open(os.path.join(output_label_path, txt_file), "w") as outfile:
            outfile.writelines(person_lines)
        # Copy the image once per label file, not once per box.
        src = os.path.join(source_images_path, txt_name + ".jpg")
        dst = os.path.join(output_images_path, txt_name + ".jpg")
        shutil.copy(src, dst)
        print("Done:", dst)


if __name__ == '__main__':
    train_source_labels = "/home/xiaopeng/coco/labels/train2017"
    train_output_labels = "/home/xiaopeng/coco_only_person/labels/train2017"
    train_source_images = "/home/xiaopeng/coco/images/train2017"
    train_output_images = "/home/xiaopeng/coco_only_person/images/train2017"
    select_person(train_source_labels, train_output_labels, train_source_images, train_output_images)
    valid_source_labels = "/home/xiaopeng/coco/labels/val2017"
    valid_output_labels = "/home/xiaopeng/coco_only_person/labels/val2017"
    valid_source_images = "/home/xiaopeng/coco/images/val2017"
    valid_output_images = "/home/xiaopeng/coco_only_person/images/val2017"
    select_person(valid_source_labels, valid_output_labels, valid_source_images, valid_output_images)

Results:

For person, the training set contains 64115 images and the validation set contains 2693.
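To double-check counts like these, here is a minimal sketch that counts label files, images, and class-0 boxes in one split. The function name `count_dataset` is my own for illustration and is not part of the original script; it assumes the same directory layout as above, with `.txt` labels and `.jpg` images.

```python
import os


def count_dataset(labels_dir, images_dir, class_id="0"):
    """Return (number of label files, number of images, number of boxes of class_id)."""
    label_files = [f for f in os.listdir(labels_dir) if f.endswith(".txt")]
    image_files = [f for f in os.listdir(images_dir) if f.endswith(".jpg")]
    boxes = 0
    for name in label_files:
        with open(os.path.join(labels_dir, name)) as f:
            # The first field of every label line is the class ID.
            boxes += sum(1 for line in f if line.split(" ")[0] == class_id)
    return len(label_files), len(image_files), boxes
```

Called on the validation split produced above, the first two values should be equal, and the box count can be compared with the `unique_truth_count = 10777` that darknet reports in the logs further down.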

Training

Because my dataset and the darknet program live in separate directories, I use the following command:

./darknet detector train data/obj.data cfg/yolov4.cfg ../backup/yolov4_last.weights -map

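Oh, and both the cfg file and the data file need to be modified. This part is simple and covered in the official darknet documentation, so I will not go into it here.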
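For readers who have not done this before, here is a rough sketch of those changes for one class. It assumes the conventions from the AlexeyAB darknet README: the `.data` file holds the keys `classes`, `train`, `valid`, `names`, and `backup`, and in the cfg each `[yolo]` layer gets `classes=1` while the `[convolutional]` layer right before it gets `filters=(classes + 5) * 3`. The helper names and all paths passed to them are hypothetical, not taken from my setup.

```python
import os


def yolo_filters(num_classes, anchors_per_scale=3):
    """filters value for the [convolutional] layer before each [yolo] layer."""
    # Each anchor predicts 4 box coordinates + 1 objectness score + one score per class.
    return (num_classes + 5) * anchors_per_scale


def write_image_list(images_dir, list_path):
    """Write one image path per line (the format of darknet's train/valid lists)."""
    names = sorted(f for f in os.listdir(images_dir) if f.endswith(".jpg"))
    with open(list_path, "w") as f:
        for name in names:
            f.write(os.path.join(images_dir, name) + "\n")
    return len(names)


def write_obj_data(path, num_classes, train_list, valid_list, names_file, backup_dir):
    """Write a darknet .data file that points at the lists, names file, and backup dir."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train_list}",
        f"valid = {valid_list}",
        f"names = {names_file}",
        f"backup = {backup_dir}",
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

For a single class, `yolo_filters(1)` gives 18, compared with 255 for the 80 COCO classes. The names file would then contain the single line `person`.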

Start training


YOLOv4 results

My low-end GPU needs 1964 hours…

Update, 2021.5.10

AP at 10,000 batches

detections_count = 188270, unique_truth_count = 10777  
class_id = 0, name = person, ap = 24.21%   	 (TP = 6760, FP = 16472) 

 for conf_thresh = 0.25, precision = 0.29, recall = 0.63, F1-score = 0.40 
 for conf_thresh = 0.25, TP = 6760, FP = 16472, FN = 4017, average IoU = 22.16 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.242147, or 24.21 % 
Total Detection Time: 305 Seconds
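The precision, recall, and F1 values in this log follow directly from the TP, FP, and FN counts. Here is a small sketch of that arithmetic using the standard definitions; the function name is my own.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1-score from raw detection counts."""
    precision = tp / (tp + fp)   # fraction of detections that are correct
    recall = tp / (tp + fn)      # fraction of ground-truth boxes that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the 10,000-batch log above (conf_thresh = 0.25).
p, r, f1 = detection_metrics(tp=6760, fp=16472, fn=4017)
print(f"precision = {p:.2f}, recall = {r:.2f}, F1-score = {f1:.2f}")
# prints: precision = 0.29, recall = 0.63, F1-score = 0.40
```

Note that TP + FN = 6760 + 4017 = 10777, which is exactly the `unique_truth_count` in the log.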

AP at 20,000 batches

Loading weights from ../backup/yolov4_20000.weights...
 seen 64, trained: 1280 K-images (20 Kilo-batches_64) 
Done! Loaded 162 layers from weights-file 

 calculation mAP (mean average precision)...
 Detection layer: 139 - type = 28 
 Detection layer: 150 - type = 28 
 Detection layer: 161 - type = 28 
2696
 detections_count = 135166, unique_truth_count = 10777  
class_id = 0, name = person, ap = 68.28%   	 (TP = 7521, FP = 4854) 

 for conf_thresh = 0.25, precision = 0.61, recall = 0.70, F1-score = 0.65 
 for conf_thresh = 0.25, TP = 7521, FP = 4854, FN = 3256, average IoU = 48.22 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.682761, or 68.28 % 
Total Detection Time: 297 Seconds

AP at 30,000 batches

Loading weights from ../backup/yolov4_30000.weights...
 seen 64, trained: 1920 K-images (30 Kilo-batches_64) 
Done! Loaded 162 layers from weights-file 

 calculation mAP (mean average precision)...
 Detection layer: 139 - type = 28 
 Detection layer: 150 - type = 28 
 Detection layer: 161 - type = 28 
2696
 detections_count = 74317, unique_truth_count = 10777  
class_id = 0, name = person, ap = 72.13%   	 (TP = 7355, FP = 3165) 

 for conf_thresh = 0.25, precision = 0.70, recall = 0.68, F1-score = 0.69 
 for conf_thresh = 0.25, TP = 7355, FP = 3165, FN = 3422, average IoU = 56.45 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.721290, or 72.13 % 
Total Detection Time: 287 Seconds

AP at 40,000 batches

Loading weights from ../backup/yolov4_40000.weights...
 seen 64, trained: 2560 K-images (40 Kilo-batches_64) 
Done! Loaded 162 layers from weights-file 

 calculation mAP (mean average precision)...
 Detection layer: 139 - type = 28 
 Detection layer: 150 - type = 28 
 Detection layer: 161 - type = 28 
2696
 detections_count = 112421, unique_truth_count = 10777  
class_id = 0, name = person, ap = 69.88%   	 (TP = 8055, FP = 6296) 

 for conf_thresh = 0.25, precision = 0.56, recall = 0.75, F1-score = 0.64 
 for conf_thresh = 0.25, TP = 8055, FP = 6296, FN = 2722, average IoU = 45.02 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.698818, or 69.88 % 
Total Detection Time: 297 Seconds

AP at 50,000 batches

detections_count = 101082, unique_truth_count = 10777  
class_id = 0, name = person, ap = 73.55%   	 (TP = 7895, FP = 4391) 

 for conf_thresh = 0.25, precision = 0.64, recall = 0.73, F1-score = 0.68 
 for conf_thresh = 0.25, TP = 7895, FP = 4391, FN = 2882, average IoU = 52.26 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.735530, or 73.55 % 
Total Detection Time: 290 Seconds

At 50,000 batches the AP is 73.55%, and the detection results are still not as good as the original YOLOv4. More training is needed.
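To compare checkpoints without copying numbers by hand, the per-class line of the darknet mAP output can be parsed with a regular expression. This is only a sketch written against the log format shown above; the pattern and the function name are my own and may need adjusting for other darknet versions.

```python
import re

# Matches lines such as:
# class_id = 0, name = person, ap = 73.55%   (TP = 7895, FP = 4391)
AP_LINE = re.compile(
    r"class_id = (\d+), name = (.+?), ap = ([\d.]+)%\s+\(TP = (\d+), FP = (\d+)\)"
)


def parse_ap_line(line):
    """Return the fields of one per-class AP line as a dict, or None if it does not match."""
    m = AP_LINE.search(line)
    if m is None:
        return None
    class_id, name, ap, tp, fp = m.groups()
    return {"class_id": int(class_id), "name": name, "ap": float(ap),
            "tp": int(tp), "fp": int(fp)}


# AP values reported in the logs above, keyed by number of batches.
reported_ap = {10000: 24.21, 20000: 68.28, 30000: 72.13, 40000: 69.88, 50000: 73.55}
best = max(reported_ap, key=reported_ap.get)
print(f"best checkpoint: {best} batches, AP = {reported_ap[best]}%")
```

The table also makes it easy to see that AP does not rise monotonically: it dips at 40,000 batches before recovering at 50,000.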

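YOLOv5 results

To be updated; training in progress…

Update, 2021.5.11
YOLOv5 trains astonishingly fast. After a single epoch it already reaches a higher AP than YOLOv4 does after 50,000 batches. Wow, that is impressive.
One YOLOv5 epoch takes a little over 4 hours, while 50,000 batches of YOLOv4 take several hundred hours.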
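The two training budgets are counted in different units, so it helps to convert darknet batches into epochs. A minimal sketch, assuming the batch size of 64 shown in the darknet log ("Kilo-batches_64") and the 64115 training images counted earlier:

```python
def batches_to_epochs(num_batches, batch_size, num_train_images):
    """Number of passes over the training set after num_batches batches."""
    return num_batches * batch_size / num_train_images


images_seen = 50000 * 64
epochs = batches_to_epochs(50000, 64, 64115)
print(f"{images_seen} images seen, about {epochs:.1f} epochs")
# prints: 3200000 images seen, about 49.9 epochs
```

So the comparison above is roughly 1 YOLOv5 epoch against about 50 YOLOv4 epochs on the same training images.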

val: Scanning '../../coco_only_person/valid.cache' images and labels... 2693 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 2693/2693 [00:00<?, ?it/s]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100%|██████████| 1347/1347 [04:08<00:00,  5.43it/s]
                 all        2693       10777       0.827       0.746       0.831       0.577
Speed: 58.4/1.9/60.3 ms inference/NMS/total per 640x640 image at batch-size 2