YOLOv5 for Semi-Automatic Annotation: Say Goodbye to Repetitive Labeling Work

Author: 三叔家的猫

Tags: computer vision, deep learning

Article link: https://blog.csdn.net/qq_39056987/article/details/111030600

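Preface

During model training you will inevitably run into large annotation jobs. When a batch of unlabeled data arrives, the thought of spending days and nights labeling it is daunting. If your company has dedicated annotators, fine; if not, you have to do it yourself, which wastes a lot of time. Why not use an existing model to pre-annotate the data? That is certainly possible. This post records how I routinely use yolov5 for pre-annotation.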

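1. Pre-annotation workflow

Run inference with the official yolov5 weights to get the box coordinates. This works if the classes you need are among the 80 COCO classes; otherwise, use weights trained on a small amount of your own data that already perform reasonably well;
Extract the box coordinates and classes and write them into a new xml file;
Refine the result in labelimg.

GitHub code: yolov5-label-xml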
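
Before relying on a set of weights for pre-annotation, it can help to confirm that every class you need is actually covered. The sketch below is only an illustration: `missing_classes` is a hypothetical helper, and `model_names` stands for whatever list of class names your weights expose. The sample list is a short excerpt of COCO class names, not the full 80.

```python
def missing_classes(needed, model_names):
    """Return the needed class names that the weights cannot predict."""
    available = set(model_names)
    return [name for name in needed if name not in available]


# Sample data: a short excerpt of COCO class names, not the full list of 80.
coco_excerpt = ["person", "bicycle", "car", "dog", "cat"]

print(missing_classes(["person", "car"], coco_excerpt))      # nothing missing
print(missing_classes(["person", "helmet"], coco_excerpt))   # "helmet" is not covered
```

If the returned list is not empty, the official weights alone are not enough, and training on a small amount of your own data is the fallback described above.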

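2. Usage

First put the images you want to annotate into the inference/images folder. Make sure the pretrained weights cover the objects you need to annotate. If they do not, run a preliminary training on a small amount of data first, and pre-annotate once the results are reasonably good.

Put the weights into the models/weights folder, update the path settings in config.py, and then run demo.py directly: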

import cv2
from utils import Detect_api
from lxml.etree import Element, SubElement, tostring

def create_xml(list_xml,list_images,xml_path):
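    """
    Create an xml file, writing the required xml fields in order.
    :param list_xml:   detected boxes, each read as [xmin, ymin, xmax, ymax, label]
    :param list_images:   image info (height, width, channels, filename); the xml needs W, H, C
    :param xml_path:   directory in which the xml file is saved
    :return:
    """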
    node_root = Element('annotation')
    node_folder = SubElement(node_root, 'folder')
    node_folder.text = 'Images'
    node_filename = SubElement(node_root, 'filename')
    node_filename.text = str(list_images[3])
    node_size = SubElement(node_root, 'size')
    node_width = SubElement(node_size, 'width')
    node_width.text = str(list_images[1])
    node_height = SubElement(node_size, 'height')
    node_height.text = str(list_images[0])
    node_depth = SubElement(node_size, 'depth')
    node_depth.text = str(list_images[2])

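    if len(list_xml)>=1:        # write one <object> per detected box
        for list_ in list_xml:
            # Filter by label before creating any node; here only "person" is kept.
            # Creating <object> first would leave an empty element for every skipped class.
            # Remove this check to write every detected class.
            if str(list_[4]) != "person":
                continue
            node_object = SubElement(node_root, 'object')
            node_name = SubElement(node_object, 'name')
            node_name.text = str(list_[4])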
            node_difficult = SubElement(node_object, 'difficult')
            node_difficult.text = '0'
            node_bndbox = SubElement(node_object, 'bndbox')
            node_xmin = SubElement(node_bndbox, 'xmin')
            node_xmin.text = str(list_[0])
            node_ymin = SubElement(node_bndbox, 'ymin')
            node_ymin.text = str(list_[1])
            node_xmax = SubElement(node_bndbox, 'xmax')
            node_xmax.text = str(list_[2])
            node_ymax = SubElement(node_bndbox, 'ymax')
            node_ymax.text = str(list_[3])

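    xml = tostring(node_root, pretty_print=True)   # pretty-print with line breaks and indentation

    file_name = list_images[3].rsplit(".", 1)[0]   # strip only the last extension, so "a.b.jpg" -> "a.b"
    filename = xml_path+"/{}.xml".format(file_name)

    with open(filename, "wb") as f:   # tostring returns bytes, hence binary mode
        f.write(xml)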


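if __name__ == '__main__':

    import os

    path = r"inference/images"        # image directory
    xml_path = r"inference/xmls"      # output directory for the xml annotations
    os.makedirs(xml_path, exist_ok=True)   # make sure the output directory exists

    yolo= Detect_api.Yolo_inference()

    for name in os.listdir(path):
        print(name)
        image = cv2.imread(os.path.join(path,name))
        if image is None:            # cv2.imread returns None for files it cannot decode
            continue
        list_image = (image.shape[0],image.shape[1],image.shape[2],name)             # (height, width, channels, filename)

        img0, xyxy_list,img_crop= yolo.detect(image)       # img0: image after detection; img_crop: cropped image
        create_xml(xyxy_list,list_image,xml_path)          # write the xml annotation file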

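To annotate only one class, filter while writing and skip anything that does not match. When the program finishes, the pre-annotation files are in the inference/xmls folder. Since these are only pre-annotations, the results are not yet ideal and still need to be refined with the labelimg annotation tool.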
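
The hard-coded "person" check can be generalized to a set of labels. This is a minimal sketch, assuming each detection has the layout that `create_xml` reads (`[xmin, ymin, xmax, ymax, label]`); `filter_boxes` and `keep_labels` are hypothetical names and are not part of the repository.

```python
def filter_boxes(xyxy_list, keep_labels):
    """Keep only detections whose label (index 4) is in keep_labels."""
    keep = set(keep_labels)
    return [box for box in xyxy_list if str(box[4]) in keep]


# Sample detections in the assumed [xmin, ymin, xmax, ymax, label] layout.
detections = [
    [10, 20, 110, 220, "person"],
    [30, 40, 90, 80, "dog"],
    [200, 50, 260, 210, "person"],
]

people_only = filter_boxes(detections, {"person"})
print(len(people_only))
```

If the list is filtered this way before it is passed to `create_xml`, the label check inside `create_xml` would no longer be needed.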

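As shown in the figure below, first use Open Dir to load the images that were just pre-annotated, then use Change Save Dir to select the xmls path from above. Once everything is loaded, you can start refining. The example below was pre-annotated with the yolov5s weights; the results are quite good and need almost no adjustment. In practice, few real scenarios involve only classes covered by the official weights, so you can train reasonably good weights on your own data and then refine the output in the same way. What follows is only a demonstration.
[Figure: the pre-annotated images opened in labelimg]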
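
Before opening labelimg, it can be useful to check what was actually written. The following is a minimal sketch using only the Python standard library; `count_labels` is a hypothetical helper, and the sample xml is made up for the demo rather than taken from a real run.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import Counter


def count_labels(xml_dir):
    """Count <object>/<name> labels across all .xml files in xml_dir."""
    counts = Counter()
    for file_name in sorted(os.listdir(xml_dir)):
        if not file_name.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(xml_dir, file_name)).getroot()
        for obj in root.iter("object"):
            # A missing or empty <name> is counted under the empty string.
            counts[obj.findtext("name", default="")] += 1
    return counts


# Demo on a throwaway directory with one hand-written annotation file.
sample = (
    "<annotation><filename>a.jpg</filename>"
    "<object><name>person</name></object>"
    "<object><name>person</name></object>"
    "</annotation>"
)
with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, "a.xml"), "w", encoding="utf-8") as f:
        f.write(sample)
    demo_counts = count_labels(tmp_dir)

print(dict(demo_counts))
```

On real output, the call would be `count_labels("inference/xmls")`. A non-zero count under the empty string would point to `<object>` elements that were written without a label.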
