Traffic-Scene Object Detection with YOLOv5

天外来戊

Tags: YOLO, object detection, object tracking

Article link: https://blog.csdn.net/qq_52328895/article/details/131511908

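This article describes how to annotate virtual traffic scenes built in Unity3D and then train on them with the YOLOv5 algorithm. It first covers installing Anaconda and configuring the PyTorch environment, including creating a virtual environment and switching mirrors to speed up downloads. It then explains how to convert VOC-format annotation files to YOLO format and split the data into training and validation sets. Finally, it covers the parameter settings and choice of weight file for training the YOLOv5 model, and how to run test inference on images to show the training results.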

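In this experiment I annotated virtual traffic scenes from Unity3D myself, trained on them with the YOLOv5 algorithm, and finally ran detection on images of real scenes. The detection targets include dashed and solid lane lines, pedestrians, cars, buses, and so on.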

1. Software Installation and Environment Setup

1.1 Installing Anaconda

First, download the Anaconda installer that matches your computer's configuration from the official website.

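Then run the installer and click through the steps. The default install location is usually on the C drive; you may want to change it, because the environments Anaconda creates take up a fair amount of disk space.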

At this step of the installer, tick the first option so that Anaconda is added to the environment variables.

Once installation finishes, Anaconda appears in the Start menu.

1.2 Installing the PyTorch environment

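Open the Anaconda prompt we just installed and create a virtual environment with conda create -n, where -n gives the name of the new environment. The following command creates a virtual environment named pytorch with Python 3.8: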

conda create -n pytorch python=3.8

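Because I had already done this experiment and was repeating it for this write-up, conda warned that the environment already exists; entering y is enough here.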

Conda then asks whether to install some base packages while it sets up the environment; enter y again.

After the environment is created, conda env list shows the current environments.

The pytorch virtual environment we named now exists, and we can install Python deep learning packages into it. First activate the virtual environment with conda activate pytorch:

conda activate pytorch

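Next, install the GPU build of PyTorch. Because the official PyTorch servers are overseas, downloading the packages is slow, so we switch the environment to a mirror. Run the following commands inside the pytorch environment to switch to the Tsinghua mirror.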

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda config --set show_channel_urls yes

Then open the PyTorch website.

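Select the options that match your setup, then copy the generated command and run it. Because the default source is overseas, downloads are slow; if the command you copied is a pip command, it is best to append the following option to it: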

-i http://pypi.douban.com/simple --trusted-host pypi.douban.com

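This downloads the packages from the Douban mirror instead. Once the download finishes, the installation is complete.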
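A quick way to confirm the result is to ask Python whether PyTorch can be imported and whether it sees a GPU. The following is a minimal sketch, assuming it is run inside the pytorch environment; describe_torch_install is a helper name made up for this illustration.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the PyTorch installation, if there is one."""
    # find_spec avoids an ImportError when PyTorch is missing.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch
    device = "GPU (CUDA) available" if torch.cuda.is_available() else "CPU only"
    return f"PyTorch {torch.__version__}: {device}"


print(describe_torch_install())
```

If the summary reports CPU only, the usual suspects are a CPU-only build of PyTorch or a GPU driver problem.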

1.3 Installing the PaddlePaddle environment

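First create a virtual environment named paddle in the same way as before and activate it with conda activate paddle. Then open the PaddlePaddle website and select the version that matches your setup.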

Copy the generated command and run it to install.

1.4 Installing labelImg

Press Win+R, type cmd, and press Enter to open a command prompt. Then enter the following command:

 pip install labelimg -i https://pypi.tuna.tsinghua.edu.cn/simple

Wait for the download and installation to finish.

This completes the first stage of preparation.

2. Annotating Images and Training

2.1 Annotating the images

First open labelImg. This tool lets us annotate the objects in each image so that they can be used for training.

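The figure shows my annotations for vehicles, pedestrians, traffic lights, and other traffic targets. The buttons on the left are used to annotate the image, and the panel on the right lists information such as the classes you have already labeled.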

Here I annotated buses, cars, dashed and solid lane lines, pedestrians, and traffic lights.

When the annotation is finished, click Save on the left.

2.2 Converting VOC labels to YOLO format and splitting training and validation sets

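Object-detection datasets found online often come with labels in VOC format (xml), whereas YOLOv5 training requires labels in YOLO format (txt), so the xml label files have to be converted to txt files. Training your own YOLOv5 detector also requires the dataset to be split into a training set and a validation set. The script below converts xml annotation files to txt annotation files and splits them into training and validation sets by a given ratio. The code comes first, followed by notes on using it.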

import xml.etree.ElementTree as ET
import os
import random
from shutil import copyfile
 
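# Class names exactly as they appear in the xml <name> tags; replace with your own labels.
classes = ["hat", "person"]
#classes=["ball"]
 
# Percentage of images assigned to the training set; the rest go to validation.
TRAIN_RATIO = 80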
 
def clear_hidden_files(path):
    dir_list = os.listdir(path)
    for i in dir_list:
        abspath = os.path.join(os.path.abspath(path), i)
        if os.path.isfile(abspath):
            if i.startswith("._"):
                os.remove(abspath)
        else:
            clear_hidden_files(abspath)
 
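def convert(size, box):
    # size = (image width, image height); box = (xmin, xmax, ymin, ymax) in pixels.
    # Returns the YOLO box (x_center, y_center, width, height) normalized to [0, 1].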
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
 
def convert_annotation(image_id):
    # Read one VOC xml file and write the matching YOLO txt label file.
    tree = ET.parse('VOCdevkit/VOC2007/Annotations/%s.xml' % image_id)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    with open('VOCdevkit/VOC2007/YOLOLabels/%s.txt' % image_id, 'w') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult')
            cls = obj.find('name').text
            # Skip classes that are not in the list and objects flagged as difficult.
            if cls not in classes or (difficult is not None and int(difficult.text) == 1):
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            bb = convert((w,h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
wd = os.getcwd()
data_base_dir = os.path.join(wd, "VOCdevkit/")
work_space_dir = os.path.join(data_base_dir, "VOC2007/")
annotation_dir = os.path.join(work_space_dir, "Annotations/")
image_dir = os.path.join(work_space_dir, "JPEGImages/")
yolo_labels_dir = os.path.join(work_space_dir, "YOLOLabels/")
yolov5_images_dir = os.path.join(data_base_dir, "images/")
yolov5_labels_dir = os.path.join(data_base_dir, "labels/")
yolov5_images_train_dir = os.path.join(yolov5_images_dir, "train/")
yolov5_images_test_dir = os.path.join(yolov5_images_dir, "val/")
yolov5_labels_train_dir = os.path.join(yolov5_labels_dir, "train/")
yolov5_labels_test_dir = os.path.join(yolov5_labels_dir, "val/")

# Create every directory that is missing, then remove hidden "._" files from it.
for d in (annotation_dir, image_dir, yolo_labels_dir,
          yolov5_images_dir, yolov5_labels_dir,
          yolov5_images_train_dir, yolov5_images_test_dir,
          yolov5_labels_train_dir, yolov5_labels_test_dir):
    os.makedirs(d, exist_ok=True)
    clear_hidden_files(d)
 
# Start with empty list files that record which image went into which split.
train_file = open(os.path.join(wd, "yolov5_train.txt"), 'w')
test_file = open(os.path.join(wd, "yolov5_val.txt"), 'w')
list_imgs = os.listdir(image_dir) # list image files
for i in range(0,len(list_imgs)):
    path = os.path.join(image_dir,list_imgs[i])
    if not os.path.isfile(path):
        continue # skip sub-directories
    image_path = image_dir + list_imgs[i]
    voc_path = list_imgs[i]
    (nameWithoutExtention, extention) = os.path.splitext(os.path.basename(image_path))
    annotation_name = nameWithoutExtention + '.xml'
    annotation_path = os.path.join(annotation_dir, annotation_name)
    label_name = nameWithoutExtention + '.txt'
    label_path = os.path.join(yolo_labels_dir, label_name)
    if not os.path.exists(annotation_path):
        continue # skip images without an annotation file
    # Draw 1..100 per image; values up to TRAIN_RATIO go to the training set.
    prob = random.randint(1, 100)
    print("Probability: %d" % prob)
    if(prob <= TRAIN_RATIO): # train dataset
        train_file.write(image_path + '\n')
        convert_annotation(nameWithoutExtention) # convert label
        copyfile(image_path, yolov5_images_train_dir + voc_path)
        copyfile(label_path, yolov5_labels_train_dir + label_name)
    else: # validation dataset
        test_file.write(image_path + '\n')
        convert_annotation(nameWithoutExtention) # convert label
        copyfile(image_path, yolov5_images_test_dir + voc_path)
        copyfile(label_path, yolov5_labels_test_dir + label_name)
train_file.close()
test_file.close()
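To make the coordinate conversion concrete, here is a small worked example that runs on one annotation held in a string instead of files on disk. It is only a sketch: the sample xml, the class list, and the helper name voc_to_yolo_lines are invented for illustration, and the arithmetic mirrors the convert function in the script.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: a 640x480 image with a single "car" box.
SAMPLE_XML = """
<annotation>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>car</name>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_yolo_lines(xml_text, class_names):
    """Return one 'class_id x_center y_center width height' line per object."""
    root = ET.fromstring(xml_text.strip())
    img_w = int(root.find("size/width").text)
    img_h = int(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in class_names:
            continue  # ignore classes we do not train on
        box = obj.find("bndbox")
        xmin, xmax = float(box.find("xmin").text), float(box.find("xmax").text)
        ymin, ymax = float(box.find("ymin").text), float(box.find("ymax").text)
        # Box center and size, divided by the image size so they fall in [0, 1].
        x_center = (xmin + xmax) / 2.0 / img_w
        y_center = (ymin + ymax) / 2.0 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(f"{class_names.index(name)} {x_center} {y_center} {width} {height}")
    return lines


print(voc_to_yolo_lines(SAMPLE_XML, ["car", "person"]))
# ['0 0.3125 0.5 0.3125 0.5']
```

The box spans 100 to 300 px horizontally in a 640 px wide image, so its center is at 200/640 = 0.3125 and its width is also 200/640 = 0.3125.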

First, because the paths are hard-coded, the dataset folders must be named exactly as in the code.

Annotations holds the xml label files.

JPEGImages holds the image files.

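Put the script in the same directory as the data and run it. It creates images and labels folders under VOCdevkit, each with a train and a val subfolder, which hold the training images and txt labels and the validation images and txt labels respectively. The images and labels folders are the training and validation sets needed to train the YOLOv5 model. A YOLOLabels folder is also created under VOCdevkit/VOC2007, containing all of the txt label files.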
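Because the script draws a random number for every image, the resulting split is only approximately 80/20 and changes from run to run. If an exact and reproducible split is preferred, one alternative (not what the script above does) is to shuffle the file list once with a fixed seed and cut it at the desired ratio. The function name, seed, and file names below are assumptions for illustration.

```python
import random


def split_exact(names, train_ratio=0.8, seed=0):
    """Shuffle names reproducibly and cut them at an exact train/val boundary."""
    shuffled = sorted(names)               # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)  # private generator, fixed seed
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


images = [f"{i:05d}.jpg" for i in range(10)]  # hypothetical file names
train, val = split_exact(images)
print(len(train), len(val))  # 8 2
```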
Next, find the YOLOv5 source code online, download and unpack it, and open it in an IDE; I use PyCharm.

After opening the project, requirements.txt lists the dependencies and versions we need. Install them first: open the PyCharm terminal and run pip install -r requirements.txt.

pip install -r requirements.txt

2.3 Preparing the training set and pre-trained weights

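To shorten training time and reach better accuracy, we normally start training from pre-trained weights. Version 5.0 of YOLOv5 provides several sets of pre-trained weights, and we can choose among them according to our needs. As one would expect, the larger the pre-trained model, the higher the accuracy tends to be, but the slower the detection. For this dataset I used the pre-trained weights yolov5s.pt.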

Now training can begin. Training the detector requires changing parameters in two yaml files: one in the data directory and one in the models directory.

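Open hat.yaml and modify its parameters. First comment out the line marked by arrow 1 (I have already done so); if this line is not commented out, training fails with an error. At arrow 2, fill in the paths of the training and validation datasets (absolute paths are best, since the directory structure sometimes causes puzzling errors). At arrow 3, set the number of classes to detect.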

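Finally, fill in the names of the classes to recognize under the class names entry (they must be in English, otherwise they come out garbled and are not recognized).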
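For orientation, the data files shipped with YOLOv5 5.0 (for example data/coco128.yaml) are built from the keys train, val, nc, and names. The sketch below only assembles such a file as text so that the relationship between nc and names is visible; the paths and the six class names are hypothetical placeholders, not the ones used in this experiment.

```python
def build_data_yaml(train_dir, val_dir, names):
    """Return the text of a YOLOv5-style dataset yaml file."""
    quoted = ", ".join(f"'{n}'" for n in names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        f"nc: {len(names)}\n"      # nc must equal the number of class names
        f"names: [{quoted}]\n"
    )


# Hypothetical absolute paths and class names, for illustration only.
text = build_data_yaml(
    "D:/yolov5/VOCdevkit/images/train",
    "D:/yolov5/VOCdevkit/images/val",
    ["bus", "car", "dashed_line", "solid_line", "person", "traffic_light"],
)
print(text)
```

The order of names matters: it has to match the order of the classes list in the conversion script, because the txt labels store only the class index.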

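Because this project uses the yolov5s.pt pre-trained weights, the matching parameters in yolov5s.yaml under the models directory must be used (different pre-trained weights correspond to different network depths, so pairing the wrong weights with a config causes an error). As with the yaml file in the data directory, it is best to copy yolov5s.yaml and rename the copy; I renamed mine yolov5_hat.yaml.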

In yolov5_hat.yaml only the number shown in the figure needs to change. I annotated six classes, so I changed it to 6.
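The class count matters because it fixes the size of the detection output: for every anchor the network predicts four box values, one objectness score, and one score per class. Assuming the three anchors per detection scale that yolov5s.yaml defines, the number of output channels per scale can be worked out as in the following sketch (the function name is made up for illustration).

```python
def detect_head_channels(num_classes, anchors_per_scale=3):
    """Output channels of one detection layer: anchors * (4 box values + 1 objectness + classes)."""
    return anchors_per_scale * (num_classes + 5)


print(detect_head_channels(80))  # 255 for the 80 COCO classes
print(detect_head_channels(6))   # 33 for the six classes annotated here
```

This is also why the class count in the model yaml and the one in the data yaml have to agree.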

That completes the preparation. Now open the training script train.py and find the entry point of the main function.

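The main parameters are:

> if __name__ == '__main__':
> Main parameters of opt:
> --weights: path of the initial weights file
> --cfg: path of the model yaml file
> --data: path of the data yaml file
> --hyp: path of the hyperparameter file
> --epochs: number of training epochs
> --batch-size: number of images per batch (total across all GPUs)
> --img-size: input image size
> --rect: use rectangular training, default False
> --resume: resume the most recent interrupted training run
> --nosave: only save the final checkpoint, default False
> --notest: only test on the final epoch, default False
> --noautoanchor: disable the automatic anchor check, default False
> --evolve: evolve hyperparameters, default False
> --bucket: gsutil (Google Cloud Storage) bucket, rarely needed
> --cache-images: cache images in memory beforehand to speed up training, default False
> --image-weights: use weighted image selection for training
> --device: training device: cpu; 0 (a single GPU, cuda:0); 0,1,2,3 (multiple GPUs)
> --multi-scale: use multi-scale training, default False
> --single-cls: train multi-class data as a single class, default False
> --adam: use the Adam optimizer
> --sync-bn: use synchronized BatchNorm across GPUs, only in DDP mode
> --local_rank: DDP parameter, do not modify
> --workers: maximum number of dataloader workers
> --project: where trained models are saved
> --name: name of the directory the model is saved in
> --exist-ok: reuse an existing project/name directory instead of creating an incremented one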
Training can then begin. The arguments are defined as follows:

           parser = argparse.ArgumentParser()
           parser.add_argument('--weights', type=str, default='yolov5s.pt', help='initial weights path')
           parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
           parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
           parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
           parser.add_argument('--epochs', type=int, default=300)
           parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs')
           parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
           parser.add_argument('--rect', action='store_true', help='rectangular training')
           parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
           parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
           parser.add_argument('--notest', action='store_true', help='only test final epoch')
           parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
           parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
           parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
           parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
           parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
           parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
           parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
           parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
           parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
           parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
           parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
           parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')
           parser.add_argument('--project', default='runs/train', help='save to project/name')
           parser.add_argument('--entity', default=None, help='W&B entity')
           parser.add_argument('--name', default='exp', help='save to project/name')
           parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
           parser.add_argument('--quad', action='store_true', help='quad dataloader')
           parser.add_argument('--linear-lr', action='store_true', help='linear LR')
           parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
           parser.add_argument('--upload_dataset', action='store_true', help='Upload dataset as W&B artifact table')
           parser.add_argument('--bbox_interval', type=int, default=-1, help='Set bounding-box image logging interval for W&B')
           parser.add_argument('--save_period', type=int, default=-1, help='Log model after every "save_period" epoch')
           parser.add_argument('--artifact_alias', type=str, default="latest", help='version of dataset artifact to be used')
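           opt = parser.parse_args()
To train your own model, only a few of these parameters need to change. First set the path of the weights file in --weights, then set the path of the modified model yaml file (the copy of yolov5s.yaml) in --cfg, and finally set the path of the data file hat.yaml in --data. These are the parameters that must be changed.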
           parser.add_argument('--weights', type=str, default='weights/yolov5s.pt', help='initial weights path')
           parser.add_argument('--cfg', type=str, default='models/yolov5s_hat.yaml', help='model.yaml path')
           parser.add_argument('--data', type=str, default='data/hat.yaml', help='data.yaml path')   
***These are the relative paths of the three files prepared above!***
A few other parameters should be adjusted to your own needs:

 First, the number of training epochs; here it is set to 10000.
 parser.add_argument('--epochs', type=int, default=10000)
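Next, the batch size and the number of dataloader workers. These depend on the machine, so set them according to the performance of your own computer.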
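Editing the defaults in train.py is one option; because these are ordinary argparse arguments, the same values can also be passed on the command line, leaving the file untouched. The sketch below rebuilds only a handful of the arguments to show the mechanism. It is not train.py itself, and the values 100, 8, and 4 are arbitrary examples.

```python
import argparse


def build_parser():
    """A cut-down copy of a few train.py arguments, for illustration only."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='yolov5s.pt', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
    parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=16)
    parser.add_argument('--workers', type=int, default=8)
    return parser


# The same tokens that would follow "python train.py" in a terminal.
cli = ['--weights', 'weights/yolov5s.pt', '--cfg', 'models/yolov5s_hat.yaml',
       '--data', 'data/hat.yaml', '--epochs', '100', '--batch-size', '8', '--workers', '4']
opt = build_parser().parse_args(cli)

# argparse turns the dash in --batch-size into an underscore: opt.batch_size.
print(opt.epochs, opt.batch_size, opt.workers)  # 100 8 4
```

Any argument that is not given on the command line keeps its default, so only the values that differ from the defaults need to be typed.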

3. Test Results

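When training finishes, a runs folder appears in the project root. The directory runs/train/exp/weights contains two weight files: the weights from the last epoch and the best weights. We will use the best weights for inference shortly. Training also produces other files, such as images from validation.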

Find detect.py in the project root and open it.

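Then find the entry point of the main function, which holds the main parameters. They are explained below.
if __name__ == '__main__':
"""
--weights: path of the weights file
--source: test data; an image or video path, '0' (the computer's built-in camera), or a video stream such as rtsp
--output: where the predicted images/videos are saved
--img-size: network input image size
--conf-thres: confidence threshold
--iou-thres: IoU threshold for NMS
--device: whether to run inference on GPU or CPU
--view-img: display the predicted images/videos, default False
--save-txt: save the predicted box coordinates as txt files, default False
--classes: keep only certain classes, e.g. 0 or 0 2 3
--agnostic-nms: also suppress overlapping boxes across different classes during NMS, default False
--augment: use test-time augmentation (TTA) such as multi-scale and flipping during inference
--update: if True, run strip_optimizer on all models to remove optimizer state and similar data from the pt files, default False
--project: parent directory for inference results, runs/detect by default
--name: name of the folder the results are saved in
"""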
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)')
parser.add_argument('--source', type=str, default='data/images', help='source') # file/folder, 0 for webcam
parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='display results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default='runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
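The --iou-thres value is easier to choose with the underlying quantity in mind. IoU (intersection over union) measures how much two boxes overlap; during NMS, a box that overlaps a higher-scoring box by more than the threshold is discarded as a duplicate. The function below is a plain-Python sketch of the IoU calculation for boxes given as corner coordinates; it is not the implementation used inside YOLOv5.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes shifted by half their width overlap in a 1x2 strip: 2 / (4 + 4 - 2).
overlap = iou((0, 0, 2, 2), (1, 0, 3, 2))
print(round(overlap, 3))  # 0.333
print(overlap > 0.45)     # False: with the default threshold both boxes would be kept
```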
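Pass the best weights from the training run into the inference script; images and videos can then be run through the model.
parser.add_argument('--weights', nargs='+', type=str, default='runs/train/exp/weights/best.pt', help='model.pt path(s)')
To test on an image, change the following parameter to the image path and run detect.py.
parser.add_argument('--source', type=str, default='01407.jpg', help='source')
When inference finishes, a detect directory is created under runs, and the results are saved in its exp directory, as shown in the figure.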
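If --save-txt is enabled, the predicted boxes are also written to txt files. Assuming those files use the same normalized layout as the training labels (class index, x center, y center, width, height, optionally followed by a confidence when --save-conf is set), a line can be turned back into pixel coordinates as sketched below. The helper name and the sample line are made up for illustration.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Convert 'cls xc yc w h [conf]' with normalized values to pixel corner coordinates."""
    parts = line.split()
    cls_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])  # ignore a trailing confidence, if any
    xmin = (xc - w / 2) * img_w
    xmax = (xc + w / 2) * img_w
    ymin = (yc - h / 2) * img_h
    ymax = (yc + h / 2) * img_h
    return cls_id, (xmin, ymin, xmax, ymax)


print(yolo_line_to_pixels("0 0.3125 0.5 0.3125 0.5", 640, 480))
# (0, (100.0, 120.0, 300.0, 360.0))
```

This is the inverse of the conversion done when the labels were prepared, which makes it a convenient sanity check on both the labels and the predictions.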