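Preface
I have recently been using YOLOv4 for large-scale detection, but I found that there is very little material on the topic. There is not even a reasonably precise definition of what large-scale detection means. The task my advisor gave me is dense crowd detection: keeping detection accurate in scenes that contain many people.
Since I happen to be using YOLOv4 anyway, I decided to train it on only the person class from COCO, run an experiment, and see how it performs.
After that, I will train YOLOv5 as well.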
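Experiments
Dataset preparation
First download the COCO dataset. I used COCO 2017; download links are easy to find with a web search, and my blog has one as well.
For a tutorial on this part, see my GitHub: https://github.com/SpongeBab/COCO_only_person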
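Extracting a single class (person as the example)
I did not use the approach commonly found online of picking the class out of the JSON annotation files and then regenerating the ids; it is too much trouble. Fortunately, I found label files in TXT format in the YOLOv5 GitHub project. The script checks each image's label file for the class id you need and, if it is present, uses the label file name to find the corresponding image.
The code is as follows: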
import os
import shutil


def select_person(source_label_path, output_label_path, source_images_path, output_images_path):
    # Create the output directories if they do not exist yet.
    os.makedirs(output_label_path, exist_ok=True)
    os.makedirs(output_images_path, exist_ok=True)

    for txt_file in os.listdir(source_label_path):
        # The label file shares its base name with the image.
        txt_name, _ = os.path.splitext(txt_file)
        with open(os.path.join(source_label_path, txt_file), "r") as f:
            lines = f.readlines()

        # Keep only rows whose class id (first field) is 0, i.e. person.
        # Do not strip() the rows: that removes the trailing newline and
        # the output would collapse into a single line.
        person_lines = [line for line in lines if line.split()[:1] == ["0"]]
        if not person_lines:
            continue

        # Open with "w" once per file, so re-running the script does not
        # append duplicate rows to an existing output file.
        with open(os.path.join(output_label_path, txt_file), "w") as outfile:
            outfile.writelines(person_lines)

        # Copy the matching image once per label file.
        src = os.path.join(source_images_path, txt_name + ".jpg")
        dst = os.path.join(output_images_path, txt_name + ".jpg")
        shutil.copy(src, dst)
        print("Done:", dst)


if __name__ == '__main__':
    train_source_labels = "/home/xiaopeng/coco/labels/train2017"
    train_output_labels = "/home/xiaopeng/coco_only_person/labels/train2017"
    train_source_images = "/home/xiaopeng/coco/images/train2017"
    train_output_images = "/home/xiaopeng/coco_only_person/images/train2017"
    select_person(train_source_labels, train_output_labels, train_source_images, train_output_images)

    valid_source_labels = "/home/xiaopeng/coco/labels/val2017"
    valid_output_labels = "/home/xiaopeng/coco_only_person/labels/val2017"
    valid_source_images = "/home/xiaopeng/coco/images/val2017"
    valid_output_images = "/home/xiaopeng/coco_only_person/images/val2017"
    select_person(valid_source_labels, valid_output_labels, valid_source_images, valid_output_images)
Result:
For the person class, the training set contains 64115 images and the validation set contains 2693.
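To double-check counts like these after extraction, a small helper can tally the files and person boxes in one split. This is only a sketch: the function name `count_dataset` is hypothetical, and it assumes the layout used above (one `.txt` label file per `.jpg` image, class id 0 for person).

```python
import os


def count_dataset(labels_dir, images_dir):
    """Return (label_files, image_files, person_boxes) for one split."""
    label_files = [f for f in os.listdir(labels_dir) if f.endswith(".txt")]
    image_files = [f for f in os.listdir(images_dir) if f.endswith(".jpg")]

    person_boxes = 0
    for name in label_files:
        with open(os.path.join(labels_dir, name)) as f:
            # A row counts as a person box when its first field is "0".
            person_boxes += sum(1 for line in f if line.split()[:1] == ["0"])

    return len(label_files), len(image_files), person_boxes
```

Calling `count_dataset` on the `labels/val2017` and `images/val2017` output directories should give matching file counts, and the box count can be compared with the 10777 ground-truth boxes that the evaluation logs report for the validation set.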
Training
Because my dataset is stored separately from the darknet program directory, I use the following command:
./darknet detector train data/obj.data cfg/yolov4.cfg ../backup/yolov4_last.weights -map
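Oh, and both the cfg file and the data file need to be modified. This part is simple and is covered in the official documentation, so I will not go into it here.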
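For reference, here is a rough sketch of what those modifications amount to for a single class, based on the custom-training instructions in the darknet README as I understand them: set `classes=1` in each `[yolo]` layer and `filters=(classes + 5) * 3` in the `[convolutional]` layer right before each `[yolo]` layer. The helper names `yolo_filters` and `make_obj_data` and all paths below are hypothetical placeholders, not the files actually used in this experiment.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the conv layer feeding a [yolo] layer: (classes + 5) * anchors."""
    return (num_classes + 5) * anchors_per_scale


def make_obj_data(num_classes, train_list, valid_list, names_file, backup_dir):
    """Build the text of a darknet .data file from its five usual keys."""
    return (
        f"classes = {num_classes}\n"
        f"train = {train_list}\n"
        f"valid = {valid_list}\n"
        f"names = {names_file}\n"
        f"backup = {backup_dir}\n"
    )


# Placeholder paths for illustration only; adjust them to your own layout.
obj_data_text = make_obj_data(
    num_classes=1,
    train_list="path/to/train.txt",
    valid_list="path/to/valid.txt",
    names_file="path/to/obj.names",
    backup_dir="path/to/backup",
)
print(obj_data_text)
print("filters before each [yolo] layer:", yolo_filters(1))
```

For one class the filter count works out to (1 + 5) * 3 = 18, compared with 255 for the 80-class COCO configuration.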
Start training
YOLOv4 results
On my underpowered GPU, training would take 1964 hours…
Update, 2021.5.10
- AP at 10000 batches
detections_count = 188270, unique_truth_count = 10777
class_id = 0, name = person, ap = 24.21% (TP = 6760, FP = 16472)
for conf_thresh = 0.25, precision = 0.29, recall = 0.63, F1-score = 0.40
for conf_thresh = 0.25, TP = 6760, FP = 16472, FN = 4017, average IoU = 22.16 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.242147, or 24.21 %
Total Detection Time: 305 Seconds
- AP at 20000 batches
Loading weights from ../backup/yolov4_20000.weights...
seen 64, trained: 1280 K-images (20 Kilo-batches_64)
Done! Loaded 162 layers from weights-file
calculation mAP (mean average precision)...
Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
2696
detections_count = 135166, unique_truth_count = 10777
class_id = 0, name = person, ap = 68.28% (TP = 7521, FP = 4854)
for conf_thresh = 0.25, precision = 0.61, recall = 0.70, F1-score = 0.65
for conf_thresh = 0.25, TP = 7521, FP = 4854, FN = 3256, average IoU = 48.22 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.682761, or 68.28 %
Total Detection Time: 297 Seconds
- AP at 30000 batches
Loading weights from ../backup/yolov4_30000.weights...
seen 64, trained: 1920 K-images (30 Kilo-batches_64)
Done! Loaded 162 layers from weights-file
calculation mAP (mean average precision)...
Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
2696
detections_count = 74317, unique_truth_count = 10777
class_id = 0, name = person, ap = 72.13% (TP = 7355, FP = 3165)
for conf_thresh = 0.25, precision = 0.70, recall = 0.68, F1-score = 0.69
for conf_thresh = 0.25, TP = 7355, FP = 3165, FN = 3422, average IoU = 56.45 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.721290, or 72.13 %
Total Detection Time: 287 Seconds
- AP at 40000 batches
Loading weights from ../backup/yolov4_40000.weights...
seen 64, trained: 2560 K-images (40 Kilo-batches_64)
Done! Loaded 162 layers from weights-file
calculation mAP (mean average precision)...
Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
2696
detections_count = 112421, unique_truth_count = 10777
class_id = 0, name = person, ap = 69.88% (TP = 8055, FP = 6296)
for conf_thresh = 0.25, precision = 0.56, recall = 0.75, F1-score = 0.64
for conf_thresh = 0.25, TP = 8055, FP = 6296, FN = 2722, average IoU = 45.02 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.698818, or 69.88 %
Total Detection Time: 297 Seconds
- AP at 50000 batches
detections_count = 101082, unique_truth_count = 10777
class_id = 0, name = person, ap = 73.55% (TP = 7895, FP = 4391)
for conf_thresh = 0.25, precision = 0.64, recall = 0.73, F1-score = 0.68
for conf_thresh = 0.25, TP = 7895, FP = 4391, FN = 2882, average IoU = 52.26 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.735530, or 73.55 %
Total Detection Time: 290 Seconds
At 50000 batches the AP is 73.55%, and the detection results are still not as good as those of the original YOLOv4. More training is needed.
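The precision, recall, and F1 values in the logs above follow directly from the TP/FP/FN counts at conf_thresh = 0.25. A minimal sketch that recomputes them from the logged counts (the function name `detection_metrics` is my own, not part of darknet):

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from raw detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (TP, FP, FN) copied from the darknet logs for each checkpoint.
checkpoints = {
    10000: (6760, 16472, 4017),
    20000: (7521, 4854, 3256),
    30000: (7355, 3165, 3422),
    40000: (8055, 6296, 2722),
    50000: (7895, 4391, 2882),
}

for batches, (tp, fp, fn) in checkpoints.items():
    p, r, f1 = detection_metrics(tp, fp, fn)
    print(f"{batches} batches: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In every checkpoint TP + FN equals 10777, the unique_truth_count, so recall is simply TP divided by the number of ground-truth person boxes. Note that these three numbers depend on the 0.25 confidence threshold, whereas AP integrates over all thresholds, which is why F1 and AP do not always move together between checkpoints.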
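YOLOv5 results
To be updated; training is in progress…
Update, 2021.5.11
YOLOv5 trains remarkably fast. After a single epoch it already reached a higher AP than YOLOv4 did after 50000 batches. Very impressive.
One YOLOv5 epoch took a little over 4 hours, while 50000 batches of YOLOv4 took several hundred hours.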
val: Scanning '../../coco_only_person/valid.cache' images and labels... 2693 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 2693/2693 [00:00<?, ?it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 1347/1347 [04:08<00:00, 5.43it/s]
all 2693 10777 0.827 0.746 0.831 0.577
Speed: 58.4/1.9/60.3 ms inference/NMS/total per 640x640 image at batch-size 2
That is 83.1% AP@0.5 and 57.7% AP@0.5:0.95.
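To put the "one epoch versus 50000 batches" comparison on the same scale, the darknet batch count can be converted into approximate epochs. The sketch below assumes a batch size of 64, as the "Kilo-batches_64" lines in the darknet logs suggest, and the 64115 training images reported earlier; the function name `batches_to_epochs` is hypothetical.

```python
def batches_to_epochs(num_batches, batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return num_batches * batch_size / num_train_images


epochs = batches_to_epochs(50000, 64, 64115)
print(f"50000 darknet batches are roughly {epochs:.1f} epochs")
```

Under these assumptions, 50000 batches correspond to about 3.2 million images seen, or roughly 50 passes over the person-only training set, against a single pass for the YOLOv5 run above.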