Converting XML label files to YOLO txt files

Latest recommended article published on 2024-09-02 16:16:11

skywalkerxhx

Article link: https://blog.csdn.net/skywalkerxhx/article/details/140632729

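Many people use YOLO for object detection, instance segmentation, classification and similar tasks. YOLO expects its label files to be txt files, but many datasets found online ship their annotations as JSON or XML files, which makes public datasets less convenient to use.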

For JSON to txt, see the following article:

Converting dataset JSON to txt for YOLO instance segmentation (CSDN blog)

XML to txt

First, here is how the XML file looks in PyCharm.

XML file

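As shown in the figure above, the XML file contains filename (the image file name), size (the image dimensions), name (the class of the annotated object), and points (the coordinates of the four corners of the annotation box).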
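As a rough illustration of that layout, the sketch below parses a small hand-written XML string. The nesting (`size`, `objects/object`, `possibleresult/name`, `points`) follows what the conversion code further down reads. The `point` tag name, the position of `filename`, the sample values, and the helper name `summarize_annotation` are assumptions made for this example, not taken from a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample annotation; real files may contain more fields
# and may nest <filename> differently.
SAMPLE_XML = """<annotation>
    <filename>sample.tif</filename>
    <size>
        <width>1000</width>
        <height>1000</height>
    </size>
    <objects>
        <object>
            <possibleresult>
                <name>Van</name>
            </possibleresult>
            <points>
                <point>270.0,411.0</point>
                <point>270.0,402.0</point>
                <point>246.0,403.0</point>
                <point>246.0,411.0</point>
            </points>
        </object>
    </objects>
</annotation>"""


def summarize_annotation(xml_text):
    """Return (filename, (width, height), [(class_name, corners), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)

    objects = []
    for obj in root.findall('objects/object'):
        name = obj.find('possibleresult/name').text.strip()
        # Each child of <points> holds one corner as "x,y" text
        corners = [tuple(float(v) for v in p.text.split(','))
                   for p in obj.find('points')]
        objects.append((name, corners))

    # './/filename' matches the tag wherever it is nested
    return root.findtext('.//filename'), (width, height), objects


filename, size, objects = summarize_annotation(SAMPLE_XML)
print(filename, size)
for name, corners in objects:
    print(name, corners)
```

Printing the parsed structure for one or two real files before converting a whole folder is a cheap way to confirm that the tag names match what the converter expects.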

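txt file

The first integer is the class index. For example, the 4 above refers to the fifth entry in classes_lst (indices start at 0), which is Van.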
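To make the zero-based indexing concrete, here is a minimal sketch using the class list from the code at the end of this article. The dictionary lookup is an addition for illustration (the article's code uses `list.index`), and `'Airplane'` is a hypothetical name that is not in the list.

```python
classes_lst = ['Small Car', 'Bus', 'Cargo Truck', 'Dump Truck', 'Van', 'Trailer',
               'Tractor', 'Excavator', 'Truck Tractor', 'other-vehicle']

# Indices start at 0, so the fifth entry ('Van') has index 4
class_to_id = {name: idx for idx, name in enumerate(classes_lst)}

print(class_to_id['Van'])           # 4
print(class_to_id.get('Airplane'))  # None: a class missing from the list
```

The order of the list matters: the same order has to be used in the YOLO dataset configuration, otherwise the indices in the txt files point at the wrong class names.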

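The first two decimals are the x and y coordinates of the box center, normalized by the image size (the pixel coordinate divided by the image width or height).

x = (246 + 270) ÷ 2 ÷ 1000, where 1000 is the image width in pixels, taken from size in the XML shown above

y = (403 + 411) ÷ 2 ÷ 1000, where 1000 is the image height in pixels

The last two decimals are the relative width and height of the box.

They are computed the same way, except that the two coordinates are subtracted instead of averaged before dividing by the image width or height.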
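The arithmetic above can be checked with a few lines of Python. This is only a sketch with the numbers quoted in this section; the function name `to_yolo_box` is made up for the example.

```python
def to_yolo_box(x_a, y_a, x_b, y_b, img_width, img_height):
    """Convert two opposite corners (in pixels) to normalized (cx, cy, w, h)."""
    center_x = (x_a + x_b) / 2 / img_width
    center_y = (y_a + y_b) / 2 / img_height
    # abs() makes the result independent of which corner comes first
    width_rel = abs(x_b - x_a) / img_width
    height_rel = abs(y_b - y_a) / img_height
    return center_x, center_y, width_rel, height_rel


cx, cy, w, h = to_yolo_box(246, 403, 270, 411, 1000, 1000)
print(f"{cx:.3f} {cy:.3f} {w:.3f} {h:.3f}")  # 0.258 0.407 0.024 0.008
```

All four values should fall between 0 and 1. A value outside that range usually means the wrong image size was used or the coordinates were read in the wrong order.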

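Attentive readers may have noticed that connecting the coordinates above does not produce a perfect rectangle; for example, 403 and 402 do not match. This is normal: depending on the annotation tool, some coordinates can differ by a few pixels, which generally does not affect training.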
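If the small mismatch is a concern, one option is to take the minimum and maximum over all corners rather than relying on two of them. The sketch below is an alternative, not the method used in the code of this article; the corner values are hypothetical, loosely based on the numbers quoted above, and `bounding_box` is a name chosen for the example.

```python
def bounding_box(corners):
    """Return (xmin, ymin, xmax, ymax) enclosing every (x, y) corner."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical corners that are slightly off from a perfect rectangle
corners = [(270.0, 411.0), (270.0, 402.0), (246.0, 403.0), (246.0, 411.0)]
xmin, ymin, xmax, ymax = bounding_box(corners)
print(xmin, ymin, xmax, ymax)  # 246.0 402.0 270.0 411.0
```

For nearly axis-aligned boxes the two approaches differ by at most a pixel or so. For clearly rotated boxes, the enclosing box is larger than the one spanned by two opposite corners, so it is worth checking which behavior suits the dataset.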

The complete code follows.

import os
import xml.etree.ElementTree as ET

def get_image_dimensions(xml_file):
    tree = ET.parse(xml_file)
    root = tree.getroot()

    size_elem = root.find('size')
    width = int(size_elem.find('width').text)
    height = int(size_elem.find('height').text)

    return width, height

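def xml_to_txt(xml_folder, output_folder):
    # Create the output folder if it is missing, so open() below cannot fail on it
    os.makedirs(output_folder, exist_ok=True)
    xml_files = [f for f in os.listdir(xml_folder) if f.endswith('.xml')]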

    for xml_file in xml_files:
        xml_path = os.path.join(xml_folder, xml_file)
        # Swap only the extension; str.replace could also alter '.xml' inside the name
        txt_path = os.path.join(output_folder, os.path.splitext(xml_file)[0] + '.txt')

        img_width, img_height = get_image_dimensions(xml_path)

        tree = ET.parse(xml_path)
        root = tree.getroot()

        with open(txt_path, 'w') as txt_file:
            for obj in root.findall('objects/object'):
                class_name = obj.find('possibleresult/name').text.strip()
                # classes_lst is the global list defined under __main__; other classes are skipped
                if class_name in classes_lst:
                    class_id = classes_lst.index(class_name)

                    points = obj.find('points')
                    # Points 2 and 0 are used as opposite corners; each is stored as "x,y" text
                    xmin, ymin = (float(v) for v in points[2].text.split(',')[:2])
                    xmax, ymax = (float(v) for v in points[0].text.split(',')[:2])

                    # Calculate center coordinates
                    center_x = (xmin + xmax) / 2 / img_width
                    center_y = (ymin + ymax) / 2 / img_height

                    # Calculate relative dimensions
                    width_rel = abs(xmax - xmin) / img_width
                    height_rel = abs(ymax - ymin) / img_height

                    # Write to txt file
                    txt_file.write(f"{class_id} {center_x} {center_y} {width_rel} {height_rel}\n")

if __name__ == "__main__":
    # Class names; the position in this list is the YOLO class index
    classes_lst = ['Small Car', 'Bus', 'Cargo Truck', 'Dump Truck', 'Van', 'Trailer',
                   'Tractor', 'Excavator', 'Truck Tractor', 'other-vehicle']
    xml_folder = r'D:\2\car_det_train\car_det_train\xml'  # folder containing the XML files
    output_folder = r'D:\2\car_det_train\car_det_train\txt'  # new folder where the TXT files are saved

    xml_to_txt(xml_folder, output_folder)
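
A quick way to sanity-check the generated files is to convert a label line back to pixel coordinates and compare it with the XML. The sketch below is an optional addition rather than part of the script above; the helper name `yolo_line_to_pixels` is made up, and the sample line is the result of the worked example earlier in the article.

```python
def yolo_line_to_pixels(line, img_width, img_height):
    """Turn 'class cx cy w h' (normalized) into (class_id, xmin, ymin, xmax, ymax) in pixels."""
    class_id, cx, cy, w, h = line.split()
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)
    xmin = (cx - w / 2) * img_width
    ymin = (cy - h / 2) * img_height
    xmax = (cx + w / 2) * img_width
    ymax = (cy + h / 2) * img_height
    return int(class_id), xmin, ymin, xmax, ymax


# Label line from the worked example, for a 1000 x 1000 image
print(yolo_line_to_pixels("4 0.258 0.407 0.024 0.008", 1000, 1000))
```

Up to floating-point rounding, this should give back class 4 and the corners (246, 403) and (270, 411).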