YoloV9实战:从Labelme到训练、验证、测试、模块解析

本文详细介绍了如何使用YoloV9进行COCO数据集的训练,包括数据集的下载、COCO格式到Yolo格式的转换。同时,文章还展示了如何训练自定义数据集,提供了数据格式转换的代码,并给出了数据集配置文件的内容和训练脚本的修改说明。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

模型实战

训练COCO数据集

本次使用2017版本的COCO数据集作为例子,演示如何使用YoloV8训练和预测。

下载数据集

Images:

  • 2017 Train images [118K/18GB] :http://images.cocodataset.org/zips/train2017.zip
  • 2017 Val images [5K/1GB]:http://images.cocodataset.org/zips/val2017.zip
  • 2017 Test images [41K/6GB]:http://images.cocodataset.org/zips/unlabeled2017.zip

Annotations:

  • 2017 annotations_trainval2017 [241MB]:http://images.cocodataset.org/annotations/annotations_trainval2017.zip

COCO转yolo格式数据集(适用V4,V5,V6,V7,V8)

最初的研究论文中,COCO中有91个对象类别。然而,在2014年的第一次发布中,仅发布了80个标记和分割图像的对象类别。2014年发布之后,2017年发布了后续版本。详细的类别如下:

IDOBJECT (PAPER)OBJECT (2014 REL.)OBJECT (2017 REL.)SUPER CATEGORY
1personpersonpersonperson
2bicyclebicyclebicyclevehicle
3carcarcarvehicle
4motorcyclemotorcyclemotorcyclevehicle
5airplaneairplaneairplanevehicle
6busbusbusvehicle
7traintraintrainvehicle
8trucktrucktruckvehicle
9boatboatboatvehicle
10trafficlighttraffic lighttraffic lightoutdoor
11fire hydrantfire hydrantfire hydrantoutdoor
12streetsign--
13stop signstop signstop signoutdoor
14parking meterparking meterparking meteroutdoor
15benchbenchbenchoutdoor
16birdbirdbirdanimal
17catcatcatanimal
18dogdogdoganimal
19horsehorsehorseanimal
20sheepsheepsheepanimal
21cowcowcowanimal
22elephantelephantelephantanimal
23bearbearbearanimal
24zebrazebrazebraanimal
25giraffegiraffegiraffeanimal
26hat--accessory
27backpackbackpackbackpackaccessory
28umbrellaumbrellaumbrellaaccessory
29shoe--accessory
30eye glasses--accessory
31handbaghandbaghandbagaccessory
32tietietieaccessory
33suitcasesuitcasesuitcaseaccessory
34frisbeefrisbeefrisbeesports
35skisskisskissports
36snowboardsnowboardsnowboardsports
37sports ballsports ballsports ballsports
38kitekitekitesports
39baseball batbaseball batbaseball batsports
40baseball glovebaseball glovebaseball glovesports
41skateboardskateboardskateboardsports
42surfboardsurfboardsurfboardsports
43tennis rackettennis rackettennis racketsports
44bottlebottlebottlekitchen
45plate--kitchen
46wine glasswine glasswine glasskitchen
47cupcupcupkitchen
48forkforkforkkitchen
49knifeknifeknifekitchen
50spoonspoonspoonkitchen
51bowlbowlbowlkitchen
52bananabananabananafood
53appleappleapplefood
54sandwichsandwichsandwichfood
55orangeorangeorangefood
56broccolibroccolibroccolifood
57carrotcarrotcarrotfood
58hot doghot doghot dogfood
59pizzapizzapizzafood
60donutdonutdonutfood
61cakecakecakefood
62chairchairchairfurniture
63couchcouchcouchfurniture
64potted plantpotted plantpotted plantfurniture
65bedbedbedfurniture
66mirror--furniture
67dining tabledining tabledining tablefurniture
68window--furniture
69desk--furniture
70toilettoilettoiletfurniture
71door--furniture
72tvtvtvelectronic
73laptoplaptoplaptopelectronic
74mousemousemouseelectronic
75remoteremoteremoteelectronic
76keyboardkeyboardkeyboardelectronic
77cell phonecell phonecell phoneelectronic
78microwavemicrowavemicrowaveappliance
79ovenovenovenappliance
80toastertoastertoasterappliance
81sinksinksinkappliance
82refrigeratorrefrigeratorrefrigeratorappliance
83blender--appliance
84bookbookbookindoor
85clockclockclockindoor
86vasevasevaseindoor
87scissorsscissorsscissorsindoor
88teddy bearteddy bearteddy bearindoor
89hair drierhair drierhair drierindoor
90toothbrushtoothbrushtoothbrushindoor
91hair brush--indoor

可以看到,2014年和2017年发布的对象列表是相同的,它们是论文中最初91个对象类别中的80个对象。所以在转换的时候,要重新对类别做映射,映射函数如下:

def coco91_to_coco80_class():  # converts 80-index (val2014) to 91-index (paper)
    # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
    # a = np.loadtxt('data/coco.names', dtype='str', delimiter='\n')
    # b = np.loadtxt('data/coco_paper.names', dtype='str', delimiter='\n')
    # x1 = [list(a[i] == b).index(True) + 1 for i in range(80)]  # darknet to coco
    # x2 = [list(b[i] == a).index(True) if any(b[i] == a) else None for i in range(91)]  # coco to darknet
    x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
         None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
         51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
         None, 73, 74, 75, 76, 77, 78, 79, None]
    return x

接下来,开始格式转换,工程的目录如下:
在这里插入图片描述

  • coco:存放解压后的数据集。
    -out:保存输出结果。
    -coco2yolo.py:转换脚本。

转换代码如下:

import json
import glob
import os
import shutil
from pathlib import Path
import numpy as np
from tqdm import tqdm


def make_folders(path='../out/'):
    # Create folders

    if os.path.exists(path):
        shutil.rmtree(path)  # delete output folder
    os.makedirs(path)  # make new output folder
    os.makedirs(path + os.sep + 'labels')  # make new labels folder
    os.makedirs(path + os.sep + 'images')  # make new labels folder
    return path


def convert_coco_json(json_dir='./coco/annotations_trainval2017/annotations/'):
    jsons = glob.glob(json_dir + '*.json')
    coco80 = coco91_to_coco80_class()

    # Import json
    for json_file in sorted(jsons):
        fn = 'out/labels/%s/' % Path(json_file).stem.replace('instances_', '')  # folder name
        fn_images = 'out/images/%s/' % Path(json_file).stem.replace('instances_', '')  # folder name
        os.makedirs(fn,exist_ok=True)
        os.makedirs(fn_images,exist_ok=True)
        with open(json_file) as f:
            data = json.load(f)
        print(fn)
        # Create image dict
        images = {'%g' % x['id']: x for x in data['images']}

        # Write labels file
        for x in tqdm(data['annotations'], desc='Annotations %s' % json_file):
            if x['iscrowd']:
                continue

            img = images['%g' % x['image_id']]
            h, w, f = img['height'], img['width'], img['file_name']
            file_path='coco/'+fn.split('/')[-2]+"/"+f
            # The Labelbox bounding box format is [top left x, top left y, width, height]
            box = np.array(x['bbox'], dtype=np.float64)
            box[:2] += box[2:] / 2  # xy top-left corner to center
            box[[0, 2]] /= w  # normalize x
            box[[1, 3]] /= h  # normalize y

            if (box[2] > 0.) and (box[3] > 0.):  # if w > 0 and h > 0
                with open(fn + Path(f).stem + '.txt', 'a') as file:
                    file.write('%g %.6f %.6f %.6f %.6f\n' % (coco80[x['category_id'] - 1], *box))
            file_path_t=fn_images+f
            print(file_path,file_path_t)
            shutil.copy(file_path,file_path_t)


def coco91_to_coco80_class():  # converts 80-index (val2014) to 91-index (paper)
    # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
    # a = np.loadtxt('data/coco.names', dtype='str', delimiter='\n')
    # b = np.loadtxt('data/coco_paper.names', dtype='str', delimiter='\n')
    # x1 = [list(a[i] == b).index(True) + 1 for i in range(80)]  # darknet to coco
    # x2 = [list(b[i] == a).index(True) if any(b[i] == a) else None for i in range(91)]  # coco to darknet
    x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
         None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
         51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
         None, 73, 74, 75, 76, 77, 78, 79, None]
    return x

convert_coco_json()

开始运行:
在这里插入图片描述

转换完成后,验证转换的结果:

import cv2
import os

def draw_box_in_single_image(image_path, txt_path):
    # 读取图像
    image = cv2.imread(image_path)

    # 读取txt文件信息
    def read_list(txt_path):
        pos = []
        with open(txt_path, 'r') as file_to_read:
            while True:
                lines = file_to_read.readline()  # 整行读取数据
                if not lines:
                    break
                # 将整行数据分割处理,如果分割符是空格,括号里就不用传入参数,如果是逗号, 则传入‘,'字符。
                p_tmp = [float(i) for i in lines.split(' ')]
                pos.append(p_tmp)  # 添加新读取的数据
                # Efield.append(E_tmp)
                pass
        return pos


    # txt转换为box
    def convert(size, box):
        xmin = (box[1]-box[3]/2.)*size[1]
        xmax = (box[1]+box[3]/2.)*size[1]
        ymin = (box[2]-box[4]/2.)*size[0]
        ymax = (box[2]+box[4]/2.)*size[0]
        box = (int(xmin), int(ymin), int(xmax), int(ymax))
        return box

    pos = read_list(txt_path)
    print(pos)
    tl = int((image.shape[0]+image.shape[1])/2)
    lf = max(tl-1,1)
    for i in range(len(pos)):
        label = str(int(pos[i][0]))
        print('label is '+label)
        box = convert(image.shape, pos[i])
        image = cv2.rectangle(image,(box[0], box[1]),(box[2],box[3]),(0,0,255),2)
        cv2.putText(image,label,(box[0],box[1]-2), 0, 1, [0,0,255], thickness=2, lineType=cv2.LINE_AA)
        pass

    if pos:
        cv2.imwrite('./Data/see_images/{}.png'.format(image_path.split('\\')[-1][:-4]), image)
    else:
        print('None')



img_folder = "./out/images/val2017"
img_list = os.listdir(img_folder)
img_list.sort()

label_folder = "./out/labels/val2017"
label_list = os.listdir(label_folder)
label_list.sort()
if not os.path.exists('./Data/see_images'):
    os.makedirs('./Data/see_images')
for i in range(len(img_list)):
    image_path = img_folder + "\\" + img_list[i]
    txt_path = label_folder + "\\" + label_list[i]
    draw_box_in_single_image(image_path, txt_path)

结果展示:
在这里插入图片描述

训练自定义数据集

自定义数据,我们采用Labelme标注数据,将标注后的数据统一放大一个文件夹里面。
数据集下载链接:
https://download.csdn.net/download/hhhhhhhhhhwwwwwwwwww/63242994。
类别如下: [‘c17’, ‘c5’, ‘helicopter’, ‘c130’, ‘f16’, ‘b2’,
‘other’, ‘b52’, ‘kc10’, ‘command’, ‘f15’, ‘kc135’, ‘a10’,
‘b1’, ‘aew’, ‘f22’, ‘p3’, ‘p8’, ‘f35’, ‘f18’, ‘v22’, ‘f4’,
‘globalhawk’, ‘u2’, ‘su-27’, ‘il-38’, ‘tu-134’, ‘su-33’,
‘an-70’, ‘su-24’, ‘tu-22’, ‘il-76’]

格式转换

将Lableme数据集转为yolov8格式的数据集,转换代码如下:

import os
import shutil

import numpy as np
import json
from glob import glob
import cv2
from sklearn.model_selection import train_test_split
from os import getcwd


def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def change_2_yolo5(files, txt_Name):
    imag_name=[]
    for json_file_ in files:
        json_filename = labelme_path + json_file_ + ".json"
        out_file = open('%s/%s.txt' % (labelme_path, json_file_), 'w')
        json_file = json.load(open(json_filename, "r", encoding="utf-8"))
        # image_path = labelme_path + json_file['imagePath']
        imag_name.append(json_file_+'.jpg')
        height, width, channels = cv2.imread(labelme_path + json_file_ + ".jpg").shape
        for multi in json_file["shapes"]:
            points = np.array(multi["points"])
            xmin = min(points[:, 0]) if min(points[:, 0]) > 0 else 0
            xmax = max(points[:, 0]) if max(points[:, 0]) > 0 else 0
            ymin = min(points[:, 1]) if min(points[:, 1]) > 0 else 0
            ymax = max(points[:, 1]) if max(points[:, 1]) > 0 else 0
            label = multi["label"].lower()
            if xmax <= xmin:
                pass
            elif ymax <= ymin:
                pass
            else:
                cls_id = classes.index(label)
                b = (float(xmin), float(xmax), float(ymin), float(ymax))
                bb = convert((width, height), b)
                out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
                # print(json_filename, xmin, ymin, xmax, ymax, cls_id)
    return imag_name

def image_txt_copy(files,scr_path,dst_img_path,dst_txt_path):
    """
    :param files: 图片名字组成的list
    :param scr_path: 图片的路径
    :param dst_img_path: 图片复制到的路径
    :param dst_txt_path: 图片对应的txt复制到的路径
    :return:
    """
    for file in files:
        img_path=scr_path+file
        print(file)
        shutil.copy(img_path, dst_img_path+file)
        scr_txt_path=scr_path+file.split('.')[0]+'.txt'
        shutil.copy(scr_txt_path, dst_txt_path + file.split('.')[0]+'.txt')


if __name__ == '__main__':
    classes = ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
               'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
               'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
               'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
               'an-70', 'su-24', 'tu-22', 'il-76']

    # 1.标签路径
    labelme_path = "USA-Labelme/"
    isUseTest = True  # 是否创建test集
    # 3.获取待处理文件
    files = glob(labelme_path + "*.json")

    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    for i in files:
        print(i)
    trainval_files, test_files = train_test_split(files, test_size=0.1, random_state=55)
    # split
    train_files, val_files = train_test_split(trainval_files, test_size=0.1, random_state=55)
    train_name_list=change_2_yolo5(train_files, "train")
    print(train_name_list)
    val_name_list=change_2_yolo5(val_files, "val")
    test_name_list=change_2_yolo5(test_files, "test")
    #创建数据集文件夹。
    file_List = ["train", "val", "test"]
    for file in file_List:
        if not os.path.exists('./VOC/images/%s' % file):
            os.makedirs('./VOC/images/%s' % file)
        if not os.path.exists('./VOC/labels/%s' % file):
            os.makedirs('./VOC/labels/%s' % file)
    image_txt_copy(train_name_list,labelme_path,'./VOC/images/train/','./VOC/labels/train/')
    image_txt_copy(val_name_list, labelme_path, './VOC/images/val/', './VOC/labels/val/')
    image_txt_copy(test_name_list, labelme_path, './VOC/images/test/', './VOC/labels/test/')

运行完成后就得到了yolov9格式的数据集。
在这里插入图片描述
设置数据集的yaml文件,将其命名为VOC.yaml,放到data文件夹下面,内容如下:

在这里插入图片描述
代码:

train: ./VOC/images/train # train images
val: ./VOC/images/val # val images
test: ./VOC/images/test # test images (optional)

nc: 32
names: [ 'c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
         'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
         'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
         'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
         'an-70', 'su-24', 'tu-22', 'il-76' ]


将转化后的数据放到根目录下面,如下图:
在这里插入图片描述
修改train_dual.py脚本,代码如下:

    parser.add_argument('--weights', type=str, default='', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='models/detect/yolov9-c.yaml', help='model.yaml path')
    parser.add_argument('--data', type=str, default=ROOT / 'data/VOC.yaml', help='dataset.yaml path')
    parser.add_argument('--epochs', type=int, default=100, help='total training epochs')
    parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-high.yaml', help='hyperparameters path')
    parser.add_argument('--batch-size', type=int, default=4, help='total batch size for all GPUs, -1 for autobatch')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')

weights:设置预训练权重。
cfg:选择配置文件
data:数据集配置文件的路径。
epochs:轮数。
hyp:数据增强文件。
batch-size:BatchSize的大小。
workers:使用CPU的核数。

测试结果

yolov9 summary: 580 layers, 60567520 parameters, 0 gradients, 264.3 GFLOPs
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 15/15 00:02
                   all        230       1412      0.878      0.991      0.989      0.732
                   c17        230        131       0.92      0.992      0.994      0.797
                    c5        230         68      0.828          1      0.992      0.807
            helicopter        230         43      0.895      0.977      0.969      0.634
                  c130        230         85      0.955      0.999      0.994      0.684
                   f16        230         57      0.839      0.965      0.966      0.689
                    b2        230          2          1      0.978      0.995      0.647
                 other        230         86       0.91      0.942      0.957      0.525
                   b52        230         70      0.917      0.971      0.979      0.806
                  kc10        230         62      0.958      0.984      0.987      0.826
               command        230         40      0.964          1      0.995      0.815
                   f15        230        123      0.939      0.995      0.995      0.702
                 kc135        230         91      0.949      0.989      0.978      0.691
                   a10        230         27      0.863      0.963      0.982      0.458
                    b1        230         20      0.926          1      0.995      0.712
                   aew        230         25      0.929          1      0.993      0.812
                   f22        230         17      0.835          1      0.995      0.706
                    p3        230        105       0.97          1      0.995      0.804
                    p8        230          1      0.566          1      0.995      0.697
                   f35        230         32      0.908          1      0.995      0.547
                   f18        230        125      0.956      0.992      0.993      0.828
                   v22        230         41      0.921          1      0.995      0.682
                 su-27        230         31      0.925          1      0.994      0.832
                 il-38        230         27      0.899          1      0.995      0.816
                tu-134        230          1      0.346          1      0.995      0.895
                 su-33        230          2       0.96          1      0.995      0.747
                 an-70        230          2      0.718          1      0.995      0.796
                 tu-22        230         98      0.912          1      0.995      0.804
### YOLOv5 模块功能概述 YOLOv5 是一种高效的目标检测算法,具有多种预定义配置文件来适应不同的应用场景。这些配置文件包括 `yolov5l6.yaml`, `yolov5m6.yaml`, `yolov5n6.yaml`, `yolov5s6.yaml` 和 `yolov5x6.yaml`,分别对应大型 (Large)、中型 (Medium)、小型 (Nano)、超小型 (Small) 及特大 (XLarge) 版本[^1]。 #### 主要组件与特性 - **骨干网络 Backbone**: 提取图像中的特征信息。 - **颈部结构 Neck**: 对提取到的特征图进行进一步处理,增强多尺度特征融合能力。 - **头部 Head**: 负责最终预测边界框的位置以及类别概率分布。 ```python import torch from models.experimental import attempt_load # 加载预训练模型 model = attempt_load(&#39;path/to/model.pt&#39;, map_location=torch.device(&#39;cpu&#39;)) ``` #### 数据集准备 为了能够顺利地使用自定义数据集进行训练,在创建好相应的目录结构之后还需要编写 `.yaml` 文件描述该数据集的信息,比如类别的数量、图片路径等[^2]。 ```yaml train: ./data/images/train/ val: ./data/images/valid/ nc: 80 names: [&#39;person&#39;, &#39;bicycle&#39;, ... ] ``` #### 训练流程 通过命令行工具可以方便快捷地启动训练过程: ```bash !python train.py --img 640 --batch 16 --epochs 50 \ --data custom_data.yaml --weights yolov5s.pt ``` 此命令指定了输入尺寸 (`--img`)、批次大小(`--batch`)、迭代次数(`--epochs`)以及其他参数如使用的权重文件(`--weights`)和数据配置文件(`--data`)。 #### 推理部署 完成训练后即可利用保存下来的最优权值来进行推理操作。下面是一个简单的 Python 示例展示如何加载模型并执行单张图片上的对象识别任务。 ```python import cv2 from utils.general import non_max_suppression, scale_coords from utils.torch_utils import select_device, time_synchronized device = select_device(&#39;&#39;) half = device.type != &#39;cpu&#39; # half precision only supported on CUDA with torch.no_grad(): img = cv2.imread(&#39;image.jpg&#39;) results = model(img)[0] if len(results): boxes = [] confidences = [] for *xyxy, conf, cls in reversed(non_max_suppression(results)): label = f&#39;{classes[int(cls)]} {conf:.2f}&#39; plot_one_box(xyxy, im0, label=label, color=colors(c), line_thickness=3) cv2.imshow("Image", img) cv2.waitKey(0) ``` 上述代码片段展示了从读入测试样本直至显示带有标注框的结果图像整个过程中涉及的关键步骤。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

AI智韵

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值