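1. References
- Source code: https://github.com/ultralytics/yolov5 (Ultralytics YOLOv5, 2020-06-25)
- How should YOLOv5 be evaluated?
- Understanding YOLO V5 and YOLO V4 in one article
- YOLOv4 original paper (AlexeyAB): YOLOv4: Optimal Speed and Accuracy of Object Detection, 2020-04-23
- YOLOv5 explained in simple terms
- The Yolo series explained in simple terms: a complete walkthrough of Yolov5 core fundamentals
- Yolov3, Yolov4 and Yolov5 model weights and network structure diagrams: resource downloads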
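2. Testing the source code on the sample images
2.1 Download the source code
Download the repository as a ZIP archive.
2.2 Install the dependencies
pip install -r requirements.txt
Or install the missing packages yourself, one at a time:
pip install <package-name>
2.3 Download the model
Running detect.py directly downloads the yolov5s.pt model file automatically, but the download is very slow.
You can instead download it manually from:
https://github.com/ultralytics/yolov5/releases
(Place the file in the same directory as detect.py.)
2.4 Test results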
Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', img_size=640, iou_thres=0.45, output='inference/output', save_conf=False, save_txt=False, source='inference/images', update=False, view_img=False, weights='yolov5s.pt')
Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11011MB)
Fusing layers...
Model Summary: 140 layers, 7.45958e+06 parameters, 0 gradients
image 1/2 /home/hjz/PycharmProjects/pythonProject/yolov5-master/inference/images/bus.jpg: 640x480 4 persons, 1 buss, 1 skateboards, Done. (0.069s)
image 2/2 /home/hjz/PycharmProjects/pythonProject/yolov5-master/inference/images/zidane.jpg: 384x640 2 persons, 2 ties, Done. (0.054s)
Results saved to inference/output
Done. (0.168s)
Process finished with exit code 0
Video:
python detect.py --source=inference/int/01.mp4 --output=inference/out/01.mp4
coco_classes_names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush']
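3. Training on your own data
3.1 References
- Official guide: https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
- Training your own dataset with YOLOv5
- Training your own dataset with the PyTorch version of YOLOv5
3.2 Train on coco128
python train.py --epochs=20
The script downloads the weights and the coco128 dataset on its own; if that is too slow, download them manually:
coco128 dataset
YOLOv5 weights and related files (release v3.0)
1. Place the coco128 dataset next to the project directory, at the same level as the yolov5 folder.
3.2.1 Training results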
Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11011MB)
Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='', data='data/coco128.yaml', device='', epochs=20, evolve=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], local_rank=-1, logdir='runs/', multi_scale=False, name='', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batch_size=16, weights='yolov5s.pt', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
Hyperparameters {'lr0': 0.01, 'lrf': 0.2, 'momentum': 0.937, 'weight_decay': 0.0005, 'warmup_epochs': 3.0, 'warmup_momentum': 0.8, 'warmup_bias_lr': 0.1, 'box': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mosaic': 1.0, 'mixup': 0.0}
from n params module arguments
0 -1 1 3520 models.common.Focus [3, 32, 3]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 19904 models.common.BottleneckCSP [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 1 161152 models.common.BottleneckCSP [128, 128, 3]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 1 641792 models.common.BottleneckCSP [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 656896 models.common.SPP [512, 512, [5, 9, 13]]
9 -1 1 1248768 models.common.BottleneckCSP [512, 512, 1, False]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 378624 models.common.BottleneckCSP [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 95104 models.common.BottleneckCSP [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 313088 models.common.BottleneckCSP [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1248768 models.common.BottleneckCSP [512, 512, 1, False]
24 [17, 20, 23] 1 229245 models.yolo.Detect [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 191 layers, 7.46816e+06 parameters, 7.46816e+06 gradients
Transferred 370/370 items from yolov5s.pt
Optimizer groups: 62 .bias, 70 conv.weight, 59 other
Scanning images: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:00<00:00, 3224.53it/s]
Scanning labels ../coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 128it [00:00, 4429.52it/s]
Scanning labels ../coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 128it [00:00, 14134.51it/s]
Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs/exp1
Starting training for 20 epochs...
Epoch gpu_mem box obj cls total targets img_size
0/19 5.24G 0.04188 0.06183 0.01566 0.1194 171 640: 100%|██████████████████████████████████████████████| 8/8 [00:05<00:00, 1.46it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:04<00:00, 1.77it/s]
all 128 929 0.405 0.761 0.698 0.442
Epoch gpu_mem box obj cls total targets img_size
1/19 5.12G 0.04172 0.05666 0.01659 0.115 146 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.57it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.97it/s]
all 128 929 0.399 0.765 0.699 0.447
Epoch gpu_mem box obj cls total targets img_size
2/19 5.12G 0.0426 0.06244 0.01579 0.1208 196 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.65it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 8.07it/s]
all 128 929 0.404 0.773 0.702 0.453
Epoch gpu_mem box obj cls total targets img_size
3/19 5.12G 0.04476 0.06601 0.01603 0.1268 204 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.51it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.67it/s]
all 128 929 0.396 0.778 0.705 0.455
Epoch gpu_mem box obj cls total targets img_size
4/19 5.12G 0.04329 0.06541 0.01635 0.1251 252 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.58it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.62it/s]
all 128 929 0.39 0.781 0.706 0.458
Epoch gpu_mem box obj cls total targets img_size
5/19 5.12G 0.043 0.05926 0.01625 0.1185 146 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.51it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.25it/s]
all 128 929 0.39 0.785 0.713 0.463
Epoch gpu_mem box obj cls total targets img_size
6/19 5.12G 0.04202 0.06307 0.01541 0.1205 204 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.71it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.38it/s]
all 128 929 0.388 0.791 0.719 0.467
Epoch gpu_mem box obj cls total targets img_size
7/19 5.12G 0.04285 0.06677 0.0151 0.1247 204 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.55it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 8.91it/s]
all 128 929 0.388 0.794 0.723 0.474
Epoch gpu_mem box obj cls total targets img_size
8/19 5.12G 0.04252 0.05974 0.01529 0.1176 211 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.64it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.37it/s]
all 128 929 0.386 0.794 0.726 0.48
Epoch gpu_mem box obj cls total targets img_size
9/19 5.12G 0.04098 0.06076 0.01374 0.1155 227 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.52it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.55it/s]
all 128 929 0.395 0.799 0.73 0.477
Epoch gpu_mem box obj cls total targets img_size
10/19 5.12G 0.04312 0.06949 0.0154 0.128 185 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.53it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 8.93it/s]
all 128 929 0.393 0.798 0.74 0.483
Epoch gpu_mem box obj cls total targets img_size
11/19 5.12G 0.04207 0.05844 0.0155 0.116 190 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.64it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.47it/s]
all 128 929 0.4 0.802 0.744 0.489
Epoch gpu_mem box obj cls total targets img_size
12/19 5.12G 0.04147 0.06319 0.01335 0.118 234 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.57it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.47it/s]
all 128 929 0.404 0.801 0.747 0.493
Epoch gpu_mem box obj cls total targets img_size
13/19 5.12G 0.04178 0.0565 0.01371 0.112 225 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.52it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.83it/s]
all 128 929 0.419 0.808 0.751 0.498
Epoch gpu_mem box obj cls total targets img_size
14/19 5.12G 0.04076 0.05859 0.01472 0.1141 179 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.57it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.78it/s]
all 128 929 0.408 0.815 0.751 0.496
Epoch gpu_mem box obj cls total targets img_size
15/19 5.12G 0.04175 0.05848 0.01484 0.1151 181 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.41it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.17it/s]
all 128 929 0.413 0.813 0.754 0.502
Epoch gpu_mem box obj cls total targets img_size
16/19 5.12G 0.04283 0.05989 0.01417 0.1169 198 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.52it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 9.83it/s]
all 128 929 0.415 0.82 0.754 0.503
Epoch gpu_mem box obj cls total targets img_size
17/19 5.12G 0.04006 0.05161 0.01465 0.1063 156 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.54it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 8.18it/s]
all 128 929 0.421 0.827 0.76 0.505
Epoch gpu_mem box obj cls total targets img_size
18/19 5.12G 0.04003 0.06271 0.01228 0.115 196 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.48it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:00<00:00, 8.28it/s]
all 128 929 0.42 0.826 0.767 0.509
Epoch gpu_mem box obj cls total targets img_size
19/19 5.12G 0.04196 0.06346 0.01286 0.1183 221 640: 100%|██████████████████████████████████████████████| 8/8 [00:01<00:00, 6.54it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████████████████████████████| 8/8 [00:01<00:00, 4.44it/s]
all 128 929 0.411 0.822 0.77 0.515
Optimizer stripped from runs/exp1/weights/last.pt, 15.2MB
Optimizer stripped from runs/exp1/weights/best.pt, 15.2MB
20 epochs completed in 0.016 hours.
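Each epoch in the log above ends with a summary row such as `all 128 929 0.411 0.822 0.77 0.515`. The following is a minimal sketch for pulling those numbers out of a saved copy of the console output, assuming the columns follow the printed header (Class, Images, Targets, P, R, mAP@.5, mAP@.5:.95). The function name `parse_summary_rows` and the dictionary keys are hypothetical and not part of YOLOv5.

```python
import re

# One validation summary row, e.g. "all 128 929 0.411 0.822 0.77 0.515".
# Column order assumed from the header: Class Images Targets P R mAP@.5 mAP@.5:.95
SUMMARY_ROW = re.compile(
    r"^\s*all\s+(\d+)\s+([\d.eE+-]+)\s+([\d.eE+-]+)\s+([\d.eE+-]+)"
    r"\s+([\d.eE+-]+)\s+([\d.eE+-]+)\s*$"
)


def parse_summary_rows(log_text):
    """Return one dict per 'all ...' row found in the console log text."""
    rows = []
    for line in log_text.splitlines():
        match = SUMMARY_ROW.match(line)
        if match is None:
            continue  # progress bars, headers and other log lines
        images, targets, p, r, map50, map50_95 = match.groups()
        rows.append({
            "images": int(images),
            "targets": float(targets),  # may be printed as 2.98e+04
            "P": float(p),
            "R": float(r),
            "mAP50": float(map50),
            "mAP50_95": float(map50_95),
        })
    return rows


if __name__ == "__main__":
    sample = (
        "all 128 929 0.405 0.761 0.698 0.442\n"
        "Epoch gpu_mem box obj cls total targets img_size\n"
        "all 128 929 0.411 0.822 0.77 0.515\n"
    )
    rows = parse_summary_rows(sample)
    best = max(range(len(rows)), key=lambda i: rows[i]["mAP50_95"])
    print(f"best row index: {best}, mAP@.5:.95 = {rows[best]['mAP50_95']}")
```

Picking the row with the highest mAP@.5:.95 is only one possible criterion; the parsed rows can equally be plotted or compared between runs.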
3.2.2 View the training curves
(pytorch) tensorboard --logdir runs
TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.3.0 at http://localhost:6006/ (Press CTRL+C to quit)
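3.3 Safety helmet detection (Ubuntu 16.04)
3.3.1 Data and preprocessing
It includes 7581 images with 9044 human safety helmet wearing objects (positive) and 111514 normal head objects (not wearing or negative).
1. Labels: two classes, hat and person.
2. Difficulty: splitting the dataset into training and test sets and generating the labels.
3. In addition, a few images have an uppercase .JPG extension and must be renamed to lowercase .jpg, because file names are case-sensitive on Ubuntu 16.04.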
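The renaming in item 3 can be scripted. This is a minimal sketch that assumes the images sit directly inside one folder (for example the dataset's JPEGImages directory); the function name `lowercase_jpg_suffix` is hypothetical.

```python
import tempfile
from pathlib import Path


def lowercase_jpg_suffix(image_dir):
    """Rename every *.JPG file directly inside image_dir to *.jpg.

    Returns the sorted list of new file names. A file is skipped when a
    different file with the lowercase name already exists.
    """
    renamed = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file() or path.suffix != ".JPG":
            continue
        target = path.with_suffix(".jpg")
        if target.exists() and not target.samefile(path):
            continue  # do not overwrite another image
        path.rename(target)
        renamed.append(target.name)
    return sorted(renamed)


if __name__ == "__main__":
    # Demo in a throwaway directory; pass the real image folder for actual use.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "000001.JPG").write_bytes(b"")
        (Path(tmp) / "000002.jpg").write_bytes(b"")
        print(lowercase_jpg_suffix(tmp))
```

Running this before generating the labels keeps the file names consistent with scripts that build image paths by appending '.jpg'.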
A script that works well for generating the dataset split and the labels is gen_train_test_label.py:
"""
1. Adjust the folder paths below to your own setup.
2. Number the class labels yourself, starting from 0; the two classes here are 0 and 1.
3. The paths in this script are absolute paths on an Ubuntu 16.04 system.
"""
import os
from pathlib import Path
from shutil import copyfile
from PIL import Image, ImageDraw
from xml.dom.minidom import parse
import numpy as np

FILE_ROOT = "/home/hjz/PycharmProjects/pythonProject/"
IMAGE_SET_ROOT = FILE_ROOT + "VOC2028/ImageSets/Main"  # files listing the train/val/test split
IMAGE_PATH = FILE_ROOT + "VOC2028/JPEGImages"  # location of the images
ANNOTATIONS_PATH = FILE_ROOT + "VOC2028/Annotations"  # location of the XML annotation files
LABELS_ROOT = FILE_ROOT + "VOC2028/Labels"  # location of the normalized label files
DEST_IMAGES_PATH = "./custom_data/images"  # destination for the images, split into train/val/test
DEST_LABELS_PATH = "./custom_data/labels"  # destination for the labels, split into train/val/test
def cord_converter(size, box):
    """
    Convert a box from the XML annotation to darknet-style coordinates.
    :param size: image size [w, h]
    :param box: bounding box corners [top-left x, top-left y, bottom-right x, bottom-right y]
    :return: the normalized box [x_center, y_center, w, h]
    """
    x1 = int(box[0])
    y1 = int(box[1])
    x2 = int(box[2])
    y2 = int(box[3])
    dw = np.float32(1. / int(size[0]))
    dh = np.float32(1. / int(size[1]))
    w = x2 - x1
    h = y2 - y1
    x = x1 + (w / 2)
    y = y1 + (h / 2)
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return [x, y, w, h]
def save_file(img_jpg_file_name, size, img_box):
    os.makedirs(LABELS_ROOT, exist_ok=True)
    save_file_name = LABELS_ROOT + '/' + img_jpg_file_name + '.txt'
    print(save_file_name)
    # Open with "w" rather than "a+": rerunning the script then overwrites the
    # label file instead of appending duplicate rows to it.
    with open(save_file_name, "w") as label_file:
        for box in img_box:
            if box[0] == 'person':
                cls_num = 0
            else:
                cls_num = 1  # two classes: 0 = person, 1 = hat
            new_box = cord_converter(size, box[1:])
            label_file.write(f"{cls_num} {new_box[0]} {new_box[1]} {new_box[2]} {new_box[3]}\n")
def test_dataset_box_feature(file_name, point_array):
    """
    Draw the annotated boxes on a sample image to check the dataset visually.
    :param file_name: image file name
    :param point_array: all boxes, each given as [class, x1, y1, x2, y2]
    :return: None
    """
    # os.path.join instead of a hard-coded backslash, so the path also works on Ubuntu
    im = Image.open(os.path.join(IMAGE_PATH, file_name))
    imDraw = ImageDraw.Draw(im)
    for box in point_array:
        x1 = box[1]
        y1 = box[2]
        x2 = box[3]
        y2 = box[4]
        imDraw.rectangle((x1, y1, x2, y2), outline='red')
    im.show()
def get_xml_data(file_path, img_xml_file):
    img_path = file_path + '/' + img_xml_file + '.xml'
    print(img_path)
    dom = parse(img_path)
    root = dom.documentElement
    img_name = root.getElementsByTagName("filename")[0].childNodes[0].data
    img_size = root.getElementsByTagName("size")[0]
    objects = root.getElementsByTagName("object")
    img_w = img_size.getElementsByTagName("width")[0].childNodes[0].data
    img_h = img_size.getElementsByTagName("height")[0].childNodes[0].data
    img_c = img_size.getElementsByTagName("depth")[0].childNodes[0].data
    # print("img_name:", img_name)
    # print("image_info:(w,h,c)", img_w, img_h, img_c)
    img_box = []
    for box in objects:
        cls_name = box.getElementsByTagName("name")[0].childNodes[0].data
        x1 = int(box.getElementsByTagName("xmin")[0].childNodes[0].data)
        y1 = int(box.getElementsByTagName("ymin")[0].childNodes[0].data)
        x2 = int(box.getElementsByTagName("xmax")[0].childNodes[0].data)
        y2 = int(box.getElementsByTagName("ymax")[0].childNodes[0].data)
        # print("box:(c,xmin,ymin,xmax,ymax)", cls_name, x1, y1, x2, y2)
        img_jpg_file_name = img_xml_file + '.jpg'
        img_box.append([cls_name, x1, y1, x2, y2])
    # print(img_box)
    # test_dataset_box_feature(img_jpg_file_name, img_box)
    save_file(img_xml_file, [img_w, img_h], img_box)
def copy_data(img_set_source, img_labels_root, imgs_source, split):
    # `split` is "train", "val" or "test" (renamed from `type`, which shadows a builtin)
    file_name = img_set_source + '/' + split + ".txt"
    # Create the destination folders if they do not exist yet
    dest_img_dir = Path(FILE_ROOT + DEST_IMAGES_PATH + '/' + split)
    if not dest_img_dir.exists():
        print(f"Path {dest_img_dir} does not exist, creating it")
        os.makedirs(dest_img_dir)
    dest_label_dir = Path(FILE_ROOT + DEST_LABELS_PATH + '/' + split)
    if not dest_label_dir.exists():
        print(f"Path {dest_label_dir} does not exist, creating it")
        os.makedirs(dest_label_dir)
    # Walk through the image names listed in the split file
    with open(file_name) as split_file:
        for line in split_file:
            print(line)
            img_name = line.strip('\n')
            img_sor_file = imgs_source + '/' + img_name + '.jpg'
            label_sor_file = img_labels_root + '/' + img_name + '.txt'
            # print(img_sor_file)
            # print(label_sor_file)
            # im = Image.open(rf"{img_sor_file}")
            # im.show()
            # Copy the image
            img_dest_file = str(dest_img_dir) + '/' + img_name + '.jpg'
            copyfile(img_sor_file, img_dest_file)
            # Copy the label
            label_dest_file = str(dest_label_dir) + '/' + img_name + '.txt'
            copyfile(label_sor_file, label_dest_file)
if __name__ == '__main__':
    # Generate the label files
    root = ANNOTATIONS_PATH
    files = os.listdir(root)
    for file in files:
        print("file name: ", file)
        file_xml = file.split(".")
        get_xml_data(root, file_xml[0])
    # Split the files into train, val and test
    img_set_root = IMAGE_SET_ROOT
    imgs_root = IMAGE_PATH
    img_labels_root = LABELS_ROOT
    copy_data(img_set_root, img_labels_root, imgs_root, "train")
    copy_data(img_set_root, img_labels_root, imgs_root, "val")
    copy_data(img_set_root, img_labels_root, imgs_root, "test")
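To make the coordinate conversion in `cord_converter` concrete, here is a small worked sketch in plain Python without NumPy. The names `voc_to_yolo` and `yolo_to_voc` are hypothetical helpers for illustration; the arithmetic follows the script above (box center and size divided by the image width and height).

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert VOC pixel corners to normalized (x_center, y_center, w, h)."""
    box_w = xmax - xmin
    box_h = ymax - ymin
    x_center = xmin + box_w / 2
    y_center = ymin + box_h / 2
    return (x_center / img_w, y_center / img_h, box_w / img_w, box_h / img_h)


def yolo_to_voc(img_w, img_h, x_center, y_center, w, h):
    """Inverse conversion, handy for checking a label file by eye."""
    xmin = (x_center - w / 2) * img_w
    ymin = (y_center - h / 2) * img_h
    xmax = (x_center + w / 2) * img_w
    ymax = (y_center + h / 2) * img_h
    return (xmin, ymin, xmax, ymax)


if __name__ == "__main__":
    # A 320x240 box centered in a 640x480 image
    label = voc_to_yolo(640, 480, 160, 120, 480, 360)
    print(label)
    print(yolo_to_voc(640, 480, *label))
```

A box centered in the image and covering half of each dimension maps to 0.5 for all four values, which is a quick sanity check for a generated label line.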
3.3.2 Modify the configuration files
- hat.yaml:
# Custom data for safety helmet
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /home/hjz/PycharmProjects/pythonProject/custom_data/images/train
val: /home/hjz/PycharmProjects/pythonProject/custom_data/images/val
test: /home/hjz/PycharmProjects/pythonProject/custom_data/images/test
# number of classes
nc: 2
# class names
names: ['person', 'hat']
- hat_yolov5s.yaml
# parameters
nc: 2  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
# anchors; these can be adjusted later
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]
# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)
   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
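The `depth_multiple` and `width_multiple` values explain why the layer table printed by train.py in section 3.2.1 shows smaller numbers than this file (for example `Focus [3, 32, 3]` for a configured 64 channels, and 3 repeats where the file says 9). The sketch below reproduces that scaling under the assumption that repeat counts are rounded with a minimum of 1 and channel counts are rounded up to a multiple of 8; the helper names are hypothetical, and the exact rule should be checked against models/yolo.py in the repository version in use.

```python
import math


def scale_depth(number, depth_multiple):
    """Scaled repeat count of a module; single modules are left unchanged."""
    return max(round(number * depth_multiple), 1) if number > 1 else number


def scale_width(channels, width_multiple, divisor=8):
    """Scaled output channels, rounded up to a multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor


if __name__ == "__main__":
    depth_multiple, width_multiple = 0.33, 0.50  # values from hat_yolov5s.yaml
    for name, number, channels in [
        ("Focus", 1, 64),
        ("BottleneckCSP", 3, 128),
        ("BottleneckCSP", 9, 256),
        ("SPP", 1, 1024),
    ]:
        print(name, scale_depth(number, depth_multiple), scale_width(channels, width_multiple))
```

With the yolov5s multiples, 64 configured channels become 32 and 9 repeats become 3, which agrees with the layer table in the training log. A larger variant such as yolov5x differs from this file only in the two multiples.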
3.3.3 Trial training run
python train.py --data=data/hat.yaml --cfg=data/hat_yolov5s.yaml --batch-size=16 --epochs=10
Analyzing anchors... anchors/target = 4.25, Best Possible Recall (BPR) = 0.9999
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs/exp14
Starting training for 10 epochs...
Epoch gpu_mem box obj cls total targets img_size
0/9 4.51G 0.08594 0.07445 0.01321 0.1736 39 640: 100%|██████████████████████████████████████████| 342/342 [00:54<00:00, 6.28it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:09<00:00, 3.96it/s]
all 607 2.98e+04 0.221 0.288 0.21 0.0712
Epoch gpu_mem box obj cls total targets img_size
1/9 4.58G 0.0641 0.067 0.004142 0.1352 9 640: 100%|██████████████████████████████████████████| 342/342 [00:47<00:00, 7.17it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:04<00:00, 9.50it/s]
all 607 2.98e+04 0.365 0.3 0.251 0.106
Epoch gpu_mem box obj cls total targets img_size
2/9 4.58G 0.05703 0.06752 0.002748 0.1273 273 640: 100%|██████████████████████████████████████████| 342/342 [00:48<00:00, 6.98it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:06<00:00, 5.97it/s]
all 607 2.98e+04 0.406 0.311 0.273 0.144
Epoch gpu_mem box obj cls total targets img_size
3/9 4.58G 0.04976 0.06421 0.002333 0.1163 6 640: 100%|██████████████████████████████████████████| 342/342 [00:48<00:00, 6.98it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:05<00:00, 6.33it/s]
all 607 2.98e+04 0.616 0.307 0.304 0.16
Epoch gpu_mem box obj cls total targets img_size
4/9 4.58G 0.04688 0.06446 0.001753 0.1131 273 640: 100%|██████████████████████████████████████████| 342/342 [00:49<00:00, 6.98it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:05<00:00, 6.38it/s]
all 607 2.98e+04 0.645 0.309 0.306 0.177
Epoch gpu_mem box obj cls total targets img_size
5/9 4.58G 0.04377 0.06128 0.001416 0.1065 30 640: 100%|██████████████████████████████████████████| 342/342 [00:49<00:00, 6.96it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:05<00:00, 6.42it/s]
all 607 2.98e+04 0.627 0.312 0.307 0.178
Epoch gpu_mem box obj cls total targets img_size
6/9 4.58G 0.04228 0.0616 0.001187 0.1051 243 640: 100%|██████████████████████████████████████████| 342/342 [00:49<00:00, 6.91it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:05<00:00, 6.67it/s]
all 607 2.98e+04 0.679 0.312 0.309 0.185
Epoch gpu_mem box obj cls total targets img_size
7/9 4.58G 0.04071 0.05956 0.001062 0.1013 15 640: 100%|██████████████████████████████████████████| 342/342 [00:48<00:00, 7.01it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:05<00:00, 6.54it/s]
all 607 2.98e+04 0.675 0.312 0.309 0.188
Epoch gpu_mem box obj cls total targets img_size
8/9 4.58G 0.04015 0.0596 0.0008846 0.1006 48 640: 100%|██████████████████████████████████████████| 342/342 [00:48<00:00, 7.00it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:05<00:00, 6.96it/s]
all 607 2.98e+04 0.688 0.312 0.31 0.189
Epoch gpu_mem box obj cls total targets img_size
9/9 4.58G 0.03959 0.0595 0.0007798 0.09986 39 640: 100%|██████████████████████████████████████████| 342/342 [00:48<00:00, 7.00it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 38/38 [00:06<00:00, 5.79it/s]
all 607 2.98e+04 0.695 0.313 0.312 0.192
Optimizer stripped from runs/exp14/weights/last.pt, 14.8MB
Optimizer stripped from runs/exp14/weights/best.pt, 14.8MB
10 epochs completed in 0.156 hours.
3.3.4 Testing
python detect.py \
  --weights=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/runs/exp14/weights/best.pt \
  --source=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/inference/int/01.jpg
The figure shows the result after 10 epochs, roughly 15 minutes of training:
3.4 Training yolov5s for 100 epochs
3.4.1 Anchor box clustering
gen_anchors_box_kmeans.py
# -*- coding: utf-8 -*-
import numpy as np
import random
import argparse
import os

# Command-line arguments
parser = argparse.ArgumentParser(description='Use this script to generate YOLO-V3 anchor boxes\n')
parser.add_argument('--input_annotation_txt_dir', required=True, type=str, help='directory with the annotation txt files of the images (avoid Chinese characters in the path)')
parser.add_argument('--output_anchors_txt', required=True, type=str, help='output text file that stores the anchor boxes')
# default=6 is only effective without required=True, so the flag is optional here
parser.add_argument('--input_num_anchors', default=6, type=int, help='number of clusters to compute (number of anchor boxes)')
parser.add_argument('--input_cfg_width', required=True, type=int, help="width in the configuration file")
parser.add_argument('--input_cfg_height', required=True, type=int, help="height in the configuration file")
args = parser.parse_args()
'''
centroids: cluster centers, shape (num, 2), ndarray
annotation_array: the (w, h) of one annotated box
'''
def IOU(annotation_array, centroids):
    similarities = []
    # One annotated box
    w, h = annotation_array
    for centroid in centroids:
        c_w, c_h = centroid
        if c_w >= w and c_h >= h:  # case 1: the centroid encloses the box
            similarity = w * h / (c_w * c_h)
        elif c_w >= w and c_h <= h:  # case 2: the centroid is wider but not taller
            similarity = w * c_h / (w * h + (c_w - w) * c_h)
        elif c_w <= w and c_h >= h:  # case 3: the centroid is taller but not wider
            similarity = c_w * h / (w * h + (c_h - h) * c_w)
        else:  # case 4: the box encloses the centroid
            similarity = (c_w * c_h) / (w * h)
        similarities.append(similarity)
    # Convert the list to an ndarray
    return np.array(similarities, np.float32)  # 1-D array of shape (num,)
'''
k_means: k-means clustering
annotations_array: widths and heights of all N annotated boxes, shape (N, 2), ndarray
centroids: cluster centers, shape (num, 2), ndarray
The loop ends when two consecutive iterations give the same cluster assignments.
'''
def k_means(annotations_array, centroids, eps=0.00005, iterations=200000):
    N = annotations_array.shape[0]  # number of annotated boxes
    num = centroids.shape[0]  # number of clusters
    # Loss and assignments of the previous iteration
    distance_sum_pre = -1
    assignments_pre = -1 * np.ones(N, dtype=np.int64)
    iteration = 0
    while True:
        iteration += 1
        distances = []
        # Distance (1 - IOU) between each annotated box and every cluster center
        for i in range(N):
            distance = 1 - IOU(annotations_array[i], centroids)
            distances.append(distance)
        # Convert the list to an ndarray of shape (N, num)
        distances_array = np.array(distances, np.float32)
        # For each annotated box, the index of the nearest cluster center (row-wise argmin)
        assignments = np.argmin(distances_array, axis=1)
        # Sum of the distances, which plays the role of the k-means loss
        distances_sum = np.sum(distances_array)
        # Compute the new cluster centers
        centroid_sums = np.zeros(centroids.shape, np.float32)
        for i in range(N):
            centroid_sums[assignments[i]] += annotations_array[i]  # per-cluster sum
        for j in range(num):
            count = np.sum(assignments == j)
            if count > 0:  # keep an empty cluster unchanged instead of dividing by zero
                centroids[j] = centroid_sums[j] / count
        # Change of the distance sum between two iterations
        diff = abs(distances_sum - distance_sum_pre)
        # Print the progress
        print("iteration: {},distance: {}, diff: {}, avg_IOU: {}\n".format(iteration, distances_sum, diff,
                                                                           np.sum(1 - distances_array) / (N * num)))
        # Three ways to leave the loop: 1) unchanged assignments, 2) change below eps, 3) iteration limit reached
        if (assignments == assignments_pre).all():
            print("Stopped because the cluster assignments were the same in two consecutive iterations\n")
            break
        if diff < eps:
            print("Stopped because the change fell below eps\n")
            break
        if iteration > iterations:
            print("Stopped because the iteration limit was reached\n")
            break
        # Remember the results of this iteration
        distance_sum_pre = distances_sum
        assignments_pre = assignments.copy()
if __name__ == '__main__':
    # Number of clusters, i.e. the number of anchor boxes
    num_clusters = args.input_num_anchors
    # List the name of every annotation file (.txt) in the folder
    names = os.listdir(args.input_annotation_txt_dir)
    # Widths and heights of the annotated boxes
    annotations_w_h = []
    for name in names:
        txt_path = os.path.join(args.input_annotation_txt_dir, name)
        # Read every line of the txt file; each line is "class x y w h"
        with open(txt_path, 'r', encoding="utf-8") as f:
            for line in f:
                fields = line.split()
                if len(fields) < 5:
                    continue  # skip blank or malformed lines
                # float() converts the strings to numbers (safer than the original eval())
                annotations_w_h.append((float(fields[3]), float(fields[4])))
    # Convert the list to a numpy array of shape (N, 2), where N is the number of boxes
    annotations_array = np.array(annotations_w_h, dtype=np.float32)
    N = annotations_array.shape[0]
    # Pick random boxes as the initial cluster centers for k-means
    random_indices = [random.randrange(N) for i in range(num_clusters)]
    centroids = annotations_array[random_indices]
    # k-means clustering (updates centroids in place)
    k_means(annotations_array, centroids, 0.00005, 200000)
    # Sort the centroids by width
    widths = centroids[:, 0]
    sorted_indices = np.argsort(widths)
    anchors = centroids[sorted_indices]
    # Write the anchors, scaled to the configured input size, to the output file
    with open(args.output_anchors_txt, 'w') as f_anchors:
        for anchor in anchors:
            f_anchors.write('%d,%d' % (int(anchor[0] * args.input_cfg_width), int(anchor[1] * args.input_cfg_height)))
            f_anchors.write('\n')
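The four cases in the `IOU` function above all describe the same quantity: the IoU of two boxes that share a corner, so only widths and heights matter. A compact equivalent is sketched below in plain Python; `wh_iou` is a hypothetical name, and `four_case_iou` restates the script's branches for one centroid purely for comparison.

```python
def wh_iou(box, centroid):
    """IoU of two boxes aligned at a common corner, given as (w, h) pairs."""
    w, h = box
    c_w, c_h = centroid
    inter = min(w, c_w) * min(h, c_h)
    union = w * h + c_w * c_h - inter
    return inter / union


def four_case_iou(box, centroid):
    """The branch-by-branch formulation used in the clustering script."""
    w, h = box
    c_w, c_h = centroid
    if c_w >= w and c_h >= h:
        return w * h / (c_w * c_h)
    if c_w >= w and c_h <= h:
        return w * c_h / (w * h + (c_w - w) * c_h)
    if c_w <= w and c_h >= h:
        return c_w * h / (w * h + (c_h - h) * c_w)
    return (c_w * c_h) / (w * h)


if __name__ == "__main__":
    for box, centroid in [((2, 2), (4, 4)), ((2, 4), (4, 2)), ((4, 2), (2, 4)), ((4, 4), (2, 2))]:
        print(box, centroid, wh_iou(box, centroid), four_case_iou(box, centroid))
```

The clustering distance in the script is then simply `1 - wh_iou(box, centroid)`.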
python gen_anchors_box_kmeans.py --input_annotation_txt_dir=/home/hjz/PycharmProjects/pythonProject/VOC2028/Labels --output_anchors_txt=achors.txt --input_num_anchors=9 --input_cfg_width=640 --input_cfg_height=640
iteration: 189,distance: 2494381.0, diff: 2.75, avg_IOU: 0.23371242911610443
Stopped because the cluster assignments were the same in two consecutive iterations
8,18
12,26
19,36
30,52
45,77
68,114
96,175
153,250
287,399
Replace the anchors in our hat_yolov5s.yaml file with the clustered values:
#anchors:
# - [10,13, 16,30, 33,23] # P3/8
# - [30,61, 62,45, 59,119] # P4/16
# - [116,90, 156,198, 373,326] # P5/32
anchors:
- [8,18, 12,26, 19,36] # P3/8
- [30,52, 45,77, 68,114] # P4/16
- [96,175, 153,250, 287,399] # P5/32
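Copying nine `w,h` lines into three YAML rows by hand is error-prone, so a small formatter can help. This is a sketch only: `anchors_to_yaml` is a hypothetical helper, and it assumes the input lines are already sorted from small to large (as the clustering script writes them) and that three anchors go to each of the P3/8, P4/16 and P5/32 levels.

```python
def anchors_to_yaml(lines, per_level=3, strides=(8, 16, 32)):
    """Format 'w,h' text lines as the anchors section of a YOLOv5 model yaml."""
    pairs = [tuple(int(v) for v in line.split(",")) for line in lines if line.strip()]
    if len(pairs) != per_level * len(strides):
        raise ValueError(f"expected {per_level * len(strides)} anchors, got {len(pairs)}")
    rows = ["anchors:"]
    for level, stride in enumerate(strides):
        chunk = pairs[level * per_level:(level + 1) * per_level]
        body = ", ".join(f"{w},{h}" for w, h in chunk)
        rows.append(f"  - [{body}]  # P{level + 3}/{stride}")
    return "\n".join(rows)


if __name__ == "__main__":
    clustered = ["8,18", "12,26", "19,36", "30,52", "45,77",
                 "68,114", "96,175", "153,250", "287,399"]
    print(anchors_to_yaml(clustered))
```

The input could equally be read from the text file written by the clustering script, for example with `open(path).read().splitlines()`.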
3.4.2 train
python train.py --data data/hat.yaml --cfg data/hat_yolov5s.yaml --weights yolov5s.pt --batch-size 32 --epochs 100
Epoch gpu_mem box obj cls total targets img_size
98/99 5.95G 0.03468 0.0506 0.000218 0.08549 846 640: 100%|██████████████████████████████████████████| 171/171 [00:40<00:00, 4.24it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 19/19 [00:03<00:00, 5.64it/s]
all 607 2.98e+04 0.792 0.313 0.314 0.2
Epoch gpu_mem box obj cls total targets img_size
99/99 5.95G 0.03429 0.05084 0.0002206 0.08535 1512 640: 100%|██████████████████████████████████████████| 171/171 [00:39<00:00, 4.28it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 19/19 [00:03<00:00, 4.78it/s]
all 607 2.98e+04 0.791 0.313 0.313 0.199
Optimizer stripped from runs/exp15/weights/last.pt, 14.8MB
Optimizer stripped from runs/exp15/weights/best.pt, 14.8MB
100 epochs completed in 1.214 hours.
3.4.3 test.py
python test.py --weights=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/runs/exp15/weights/last.pt --data=data/hat.yaml
Fusing layers...
Model Summary: 140 layers, 7.24922e+06 parameters, 0 gradients
Scanning labels /home/hjz/PycharmProjects/pythonProject/custom_data/labels/val.cache (607 found, 0 missing, 0 empty, 607 duplicate, for 607 images): 607it [00:00, 11843.14it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 19/19 [00:04<00:00, 4.13it/s]
all 607 2.98e+04 0.768 0.313 0.315 0.199
Speed: 1.2/1.0/2.2 ms inference/NMS/total per 640x640 image at batch-size 32
3.5 Training yolov5s for 600 epochs
594/599 9.41G 0.02903 0.0438 0.0001633 0.07299 2100 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:02<00:00, 4.39it/s]
all 607 2.98e+04 0.805 0.312 0.311 0.195
Epoch gpu_mem box obj cls total targets img_size
595/599 9.41G 0.02927 0.04337 0.0001394 0.07278 2184 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.06it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:02<00:00, 4.38it/s]
all 607 2.98e+04 0.804 0.312 0.311 0.195
Epoch gpu_mem box obj cls total targets img_size
596/599 9.41G 0.02892 0.04282 0.000151 0.07189 1752 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:03<00:00, 4.30it/s]
all 607 2.98e+04 0.803 0.312 0.311 0.195
Epoch gpu_mem box obj cls total targets img_size
597/599 9.41G 0.0288 0.0426 0.0001617 0.07156 2142 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:02<00:00, 4.46it/s]
all 607 2.98e+04 0.803 0.311 0.311 0.195
Epoch gpu_mem box obj cls total targets img_size
598/599 9.41G 0.02896 0.04226 0.0001563 0.07137 1836 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.07it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:03<00:00, 4.31it/s]
all 607 2.98e+04 0.803 0.311 0.31 0.195
Epoch gpu_mem box obj cls total targets img_size
599/599 9.41G 0.02936 0.04317 0.0002012 0.07273 1827 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:03<00:00, 3.40it/s]
all 607 2.98e+04 0.803 0.311 0.311 0.195
Optimizer stripped from runs/exp17/weights/last.pt, 14.8MB
Optimizer stripped from runs/exp17/weights/best.pt, 14.8MB
600 epochs completed in 6.774 hours.
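The results are rather poor: ordinary white hats are mistaken for safety helmets, and there are many missed detections.
3.6 Hands-on test of yolov5x (Ubuntu 16.04)
3.6.1 train.py
As above, create a new hat_yolov5x.yaml file and select it when training.
python train.py --data data/hat.yaml --cfg data/hat_yolov5x.yaml --weights yolov5x.pt --batch-size 8 --epochs 300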
Epoch gpu_mem box obj cls total targets img_size
298/299 8.42G 0.02804 0.04383 0.0002171 0.07209 30 640: 100%|██████████████████████████████████████████| 683/683 [04:02<00:00, 2.82it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 76/76 [00:08<00:00, 8.74it/s]
all 607 2.98e+04 0.814 0.315 0.314 0.202
Epoch gpu_mem box obj cls total targets img_size
299/299 8.42G 0.02814 0.04331 0.0001779 0.07163 30 640: 100%|██████████████████████████████████████████| 683/683 [04:02<00:00, 2.82it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 76/76 [00:08<00:00, 8.54it/s]
all 607 2.98e+04 0.815 0.315 0.314 0.202
Optimizer stripped from runs/exp5/weights/last.pt, 177.5MB
Optimizer stripped from runs/exp5/weights/best.pt, 177.5MB
300 epochs completed in 21.455 hours.
3.6.2 test.py
python test.py --weights=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/runs/exp5/weights/last.pt --data=data/hat.yaml
Fusing layers...
Model Summary: 284 layers, 8.83973e+07 parameters, 0 gradients
Scanning labels /home/hjz/PycharmProjects/pythonProject/custom_data/labels/val.cache (607 found, 0 missing, 0 empty, 607 duplicate, for 607 images): 607it [00:00, 12405.50it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 19/19 [00:08<00:00, 2.26it/s]
all 607 2.98e+04 0.812 0.315 0.315 0.202
Speed: 7.9/0.9/8.9 ms inference/NMS/total per 640x640 image at batch-size 32
Precision is still high and recall is still low.
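One way to put a single number on this imbalance is the F1 score, the harmonic mean of precision and recall, which is pulled down by whichever of the two is lower. The sketch below is only an illustration of that formula; `f1_score` is a hypothetical helper, and the sample values are the P and R columns from the test output above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # P and R of the yolov5x test run above
    print(round(f1_score(0.812, 0.315), 3))
```

With recall near 0.31, the F1 score stays below 0.5 no matter how high precision climbs, which matches the many missed detections seen in practice.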
3.6.3 detect.py
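As the detections show, the model has high precision but low recall, so missed detections are likely to be more frequent.
3.7 Testing yolov5 again (Windows 10)
3.7.1 Preparation
The previous results were too poor, so I trained again following a similar project, this time with three classes: person, head and helmet.
- First, download the VOC2028 dataset; it can be placed in the project folder.
- Run detect.py on the dataset to generate the person labels (class 0).
Make sure the weights are placed in the expected location.
python detect.py --save-txt --source=E:\01_hjz\01_work\pythonProject\Smart_Construction-master\VOC2028\JPEGImages
- Run gen_head_helmet.py to generate the score folder with the train/val/test split.
- Create a new folder named Labels and run merge_data.py to merge the label=0 (person) annotations into the VOC2028 labels.
Then check that the label files under the score folder contain the classes 0, 1 and 2; if they do, the preparation is complete.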
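The class check can be automated by counting the first field of every label line. This is a minimal sketch assuming YOLO-format txt files ("class x y w h" per line) somewhere below one folder; the function name `count_class_ids` is hypothetical, and the real folder to pass in would be the labels directory under score.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_class_ids(label_dir):
    """Count how often each class id occurs in all *.txt files below label_dir."""
    counts = Counter()
    for txt in Path(label_dir).rglob("*.txt"):
        for line in txt.read_text().splitlines():
            fields = line.split()
            if fields:
                counts[int(fields[0])] += 1
    return counts


if __name__ == "__main__":
    # Demo with two tiny label files in a throwaway directory
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "a.txt").write_text("0 0.5 0.5 0.2 0.4\n2 0.3 0.3 0.1 0.1\n")
        (Path(tmp) / "b.txt").write_text("1 0.6 0.4 0.1 0.1\n")
        counts = count_class_ids(tmp)
        print(sorted(counts.items()))
        print("classes 0, 1, 2 present:", set(counts) == {0, 1, 2})
```

Besides confirming that all three ids occur, the counts also reveal how unbalanced the classes are.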
3.7.2 Training
1. custom_data.yaml
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../Smart_Construction-master/score/images/train
val: ../Smart_Construction-master/score/images/val
# number of classes
nc: 3
# class names
names: ['person', 'head', 'helmet']
2. Anchors
calculate_anchors.py
Anchors:[7.77, 15.87]
Anchors:[9.21, 20.2]
Anchors:[11.5, 23.23]
Anchors:[13.82, 28.93]
Anchors:[18.51, 35.12]
Anchors:[25.6, 44.74]
Anchors:[36.0, 61.16]
Anchors:[52.8, 89.0]
Anchors:[85.33, 147.99]
Train_Accuracy:82.27%
Ratios:[0.46, 0.48, 0.49, 0.49, 0.53, 0.57, 0.58, 0.59, 0.59]
******************** 1 ********************
3. Train
python train.py --img 640 --batch 32 --epochs 100 --data ./data/custom_data.yaml --cfg ./models/custom_yolov5.yaml --weights ./weights/yolov5s.pt
4. Test
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 19/19 [00:29<00:00, 1.55s/it]
all 607 1.29e+04 0.897 0.893 0.875 0.611
5. Test with the weights published by the original project's author
python test.py --weights=./weights/helmet_head_person_s.pt --data=./data/custom_data.yaml
██████████| 19/19 [00:30<00:00, 1.60s/it]
all 607 1.29e+04 0.862 0.894 0.874 0.589
Speed: 1.4/1.1/2.5 ms inference/NMS/total per 640x640 image at batch-size 32
The two sets of results are very close, which is an excellent outcome.
6. Prediction
python detect.py --weights=runs/exp15/weights/best.pt --source=E:\01_hjz\01_work\pythonProject\Smart_Construction-master\inference\int\video
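The results look great.
7. Webcam detection
Passing --source=0 is all that is needed.
If the window seems too small, add the line cv2.namedWindow(p, cv2.WINDOW_NORMAL) just before cv2.imshow(p, im0) at line 108 of detect.py.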
python detect.py --weights=runs/exp15/weights/last.pt --source=0