Batch-converting XML annotations in multiple folders to TXT for YOLOv3/5/8

Latest recommended article published on 2024-07-31 14:30:10

千玺女友

Article tags: python, deep learning, YOLO, xml

Article link: https://blog.csdn.net/XGTTTTT011/article/details/134020564

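Some datasets ship their annotations as XML files, which must be converted to TXT label files before they can be used for YOLO training.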
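
To make the conversion concrete, here is a minimal sketch of the arithmetic for a single box. It assumes the XML stores each box as pixel corners (xmin, ymin, xmax, ymax) and that the TXT line holds a class ID followed by the box center and size, each normalized by the image width or height. The function names and the 640x480 example image are hypothetical, chosen only for illustration.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_width, img_height):
    """Convert pixel corner coordinates to normalized (x_center, y_center, w, h)."""
    if img_width <= 0 or img_height <= 0:
        raise ValueError("image width and height must be positive")
    x_center = (xmin + xmax) / (2 * img_width)
    y_center = (ymin + ymax) / (2 * img_height)
    w = (xmax - xmin) / img_width
    h = (ymax - ymin) / img_height
    return x_center, y_center, w, h


def format_yolo_line(class_id, box):
    """Format one label line: class ID followed by four values with 6 decimals."""
    return "%d %.6f %.6f %.6f %.6f" % ((class_id,) + tuple(box))


# Hypothetical example: a 640x480 image with one box from (100, 120) to (300, 360)
box = voc_box_to_yolo(100, 120, 300, 360, 640, 480)
line = format_yolo_line(0, box)
print(line)  # 0 0.312500 0.500000 0.312500 0.500000
```

Raising an error for a non-positive image size is one possible design choice; a batch script may prefer to log the file and skip it instead.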

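When the annotations are spread over multiple folders, each holding many files, they need to be processed in batch, as shown in the figure below.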

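The code below converts every XML file to TXT and mirrors the source folder structure in the output directory, so each TXT file sits in the folder corresponding to its XML file.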
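
Before the full script, the folder-mirroring step on its own can be sketched as follows. This is an illustration rather than part of the original script: the helper name `mirrored_txt_path` and the subfolder and file names (`set00`, `V000`, `I00029.xml`) are hypothetical, while `os.path.relpath`, `os.path.splitext`, and `os.path.join` are standard-library calls.

```python
import os


def mirrored_txt_path(xml_file_path, root_dir, target_dir):
    """Return the TXT path under target_dir that mirrors xml_file_path under root_dir."""
    # Path of the XML file relative to the annotation root, e.g. set00/V000/I00029.xml
    relative_path = os.path.relpath(xml_file_path, root_dir)
    # Swap the .xml extension for .txt and re-root the path under target_dir
    stem = os.path.splitext(relative_path)[0]
    return os.path.join(target_dir, stem + ".txt")


# Hypothetical subfolder and file names, used only for illustration
root_dir = os.path.join("Caltech", "train_annotations")
xml_path = os.path.join(root_dir, "set00", "V000", "I00029.xml")
txt_path = mirrored_txt_path(xml_path, root_dir, "train_labels")
print(txt_path)
```

The helper only computes the path; the output folder still has to be created, for example with `os.makedirs(os.path.dirname(txt_path), exist_ok=True)`, before the file is written.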

# -*- coding: utf-8 -*-
import os
import xml.etree.ElementTree as ET

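# Root directory that holds the original XML files
root_dir = 'Caltech/train_annotations'

# Target directory for the generated TXT files
target_dir = 'train_labels'

# Dictionary that maps label names to YOLO class IDs
dict_info = {"person": 0}

# Walk through every folder under root_dir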
for dirpath, dirnames, filenames in os.walk(root_dir):
    for fp in filenames:
        if fp.endswith('.xml'):
            xml_file_path = os.path.join(dirpath, fp)

            # Create the target folder that corresponds to the XML file's location
            relative_path = os.path.relpath(xml_file_path, root_dir)
            txt_dir = os.path.join(target_dir, os.path.dirname(relative_path))
            os.makedirs(txt_dir, exist_ok=True)

            root = ET.parse(xml_file_path).getroot()
            # Read the image size by tag name rather than by child position
            sz = root.find('size')
            width = float(sz.find('width').text)
            height = float(sz.find('height').text)
            filename = root.find('filename').text

            txt_file_name = os.path.splitext(fp)[0] + '.txt'
            txt_file_path = os.path.join(txt_dir, txt_file_name)

            with open(txt_file_path, 'w') as f:
                for child in root.findall('object'):
                    sub = child.find('bndbox')
                    label = child.find('name').text
                    # Labels missing from dict_info fall back to class 0
                    label_ = dict_info.get(label, 0)
                    # Read the box corners by tag name rather than by child position
                    xmin = float(sub.find('xmin').text)
                    ymin = float(sub.find('ymin').text)
                    xmax = float(sub.find('xmax').text)
                    ymax = float(sub.find('ymax').text)
                    try:
                        # Normalize the box center and size by the image dimensions
                        x_center = (xmin + xmax) / (2 * width)
                        y_center = (ymin + ymax) / (2 * height)
                        w = (xmax - xmin) / width
                        h = (ymax - ymin) / height
                    except ZeroDivisionError:
                        # Skip the box instead of writing undefined or stale values
                        print(filename, 'has a zero width or height; skipping this box')
                        continue
                    f.write('%s %.6f %.6f %.6f %.6f\n' % (label_, x_center, y_center, w, h))

print('Conversion complete')
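
After a batch run it can be worth spot-checking the generated files. The sketch below is not part of the original script; the helper names `check_yolo_line` and `find_bad_label_files` are hypothetical. It assumes each label line should contain exactly five fields: a non-negative integer class ID followed by four values in the range [0, 1].

```python
import os


def check_yolo_line(line):
    """Return True if line looks like 'class x_center y_center w h' with values in [0, 1]."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        class_id = int(parts[0])
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    return class_id >= 0 and all(0.0 <= v <= 1.0 for v in values)


def find_bad_label_files(label_dir):
    """Walk label_dir and return the TXT files that contain at least one invalid line."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(label_dir):
        for name in sorted(filenames):
            if not name.endswith('.txt'):
                continue
            path = os.path.join(dirpath, name)
            with open(path) as f:
                # Blank lines are ignored; any other malformed line marks the file as bad
                if not all(check_yolo_line(l) for l in f if l.strip()):
                    bad.append(path)
    return bad


print(check_yolo_line("0 0.312500 0.500000 0.312500 0.500000"))  # True
print(check_yolo_line("0 1.250000 0.500000 0.312500 0.500000"))  # False
```

Calling `find_bad_label_files('train_labels')` after the conversion lists the files that need a closer look, for instance boxes that extend past the image border in the source XML.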

You need to modify three places in this code.

1. The name of the parent folder passed in as root_dir, which contains the XML files.