YOLO Format and txt File Conversion (Summary Notes)

Mrs.ye

Tags: deep learning

Article link: https://blog.csdn.net/m0_65491405/article/details/126756606

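1. YOLO training requires txt label files. We can annotate images with labelimg, which writes one txt file per image; each line of that file describes one bounding box with five values.

When the YOLO format is selected, labelimg also generates a classes.txt file. It lists the class names, one per line, and the position of a name in this file (counting from 0) is its numeric label.

The five values are, in order: the label (class index), the ratio of the box center's x-coordinate to the image width, the ratio of the box center's y-coordinate to the image height,

the ratio of the bbox width to the image width, and the ratio of the bbox height to the image height.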
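
To make the layout concrete, here is a minimal sketch that parses one line of a YOLO label file and looks up its class name. The helper name `parse_yolo_line` and the sample class list are hypothetical and only serve as illustration; they are not part of labelimg.

```python
def parse_yolo_line(line, class_names):
    """Split one YOLO label line into (class name, x_center, y_center, width, height)."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError("expected 5 fields, got %d" % len(parts))
    cls_id = int(parts[0])  # index into classes.txt, counting from 0
    x_center, y_center, width, height = (float(p) for p in parts[1:])
    return class_names[cls_id], x_center, y_center, width, height


# Hypothetical contents of classes.txt: one class name per line
class_names = ["cat", "dog"]
print(parse_yolo_line("1 0.5 0.5 0.25 0.5", class_names))
# prints ('dog', 0.5, 0.5, 0.25, 0.5)
```

All four coordinates are fractions of the image size, so the same label stays valid if the image is resized without cropping.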

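So how are these values computed?

Let the image height and width be h and w, the top-left corner of the bbox be (x1, y1), and the bottom-right corner be (x2, y2). The bbox center (x3, y3) is then: x3 = x1 + (x2 - x1)/2 = (x1 + x2)/2, y3 = y1 + (y2 - y1)/2 = (y1 + y2)/2

Writing the five values on a line as label, a, b, c, d, we get:

a = (x1 + x2)/(2w), b = (y1 + y2)/(2h), c = (x2 - x1)/w, d = (y2 - y1)/h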
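
The formulas can be checked with a small worked example. The sketch below implements them directly, together with the inverse mapping back to pixel corners. The function names and the sample box are assumptions chosen for illustration, and the inverse is not part of the original text.

```python
def corners_to_yolo(x1, y1, x2, y2, w, h):
    """Pixel corners (x1, y1), (x2, y2) -> normalized (a, b, c, d)."""
    a = (x1 + x2) / (2 * w)  # center x / image width
    b = (y1 + y2) / (2 * h)  # center y / image height
    c = (x2 - x1) / w        # bbox width / image width
    d = (y2 - y1) / h        # bbox height / image height
    return a, b, c, d


def yolo_to_corners(a, b, c, d, w, h):
    """Inverse mapping: normalized (a, b, c, d) -> pixel corners."""
    x1 = (a - c / 2) * w
    y1 = (b - d / 2) * h
    x2 = (a + c / 2) * w
    y2 = (b + d / 2) * h
    return x1, y1, x2, y2


# A 640x480 image with a box from (64, 48) to (192, 240)
print(corners_to_yolo(64, 48, 192, 240, w=640, h=480))
# prints (0.2, 0.3, 0.2, 0.4)
```

Converting the result back with `yolo_to_corners` should return the original corners up to floating-point rounding, which is a convenient way to verify a label by hand.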

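labelimg can save annotations in two formats: VOC xml files or YOLO txt files. So how do we convert the xml files to YOLO txt? Here is a script for that (personally tested and working).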

import os
import xml.etree.ElementTree as ET


xml_path = ''  # directory containing the VOC xml files
save_path = ''  # directory where the YOLO txt files are written
classes = []  # class names, in the same order as classes.txt (fill in before running)


class Voc_Yolo(object):
    def __init__(self, find_path):
        self.find_path = find_path

    def Work(self, count):
        for dirpath, dirs, files in os.walk(self.find_path):
            for file in files:
                # Skip anything that is not an annotation file
                if not file.lower().endswith('.xml'):
                    continue
                count += 1

                # Join with the directory being walked so nested folders also work
                input_file = os.path.join(dirpath, file)
                outfile = os.path.join(save_path, os.path.splitext(file)[0] + '.txt')

                tree = ET.parse(input_file)
                root = tree.getroot()
                size = root.find('size')
                w_image = float(size.find('width').text)
                h_image = float(size.find('height').text)

                with open(outfile, 'w') as out:
                    for obj in root.iter('object'):
                        classname = obj.find('name').text
                        # YOLO expects the integer class index, not the class name
                        cls_id = classes.index(classname)
                        xmlbox = obj.find('bndbox')
                        x_min = float(xmlbox.find('xmin').text)
                        x_max = float(xmlbox.find('xmax').text)
                        y_min = float(xmlbox.find('ymin').text)
                        y_max = float(xmlbox.find('ymax').text)

                        # Normalized center and size: a, b, c, d from the formulas above
                        x_center = (x_min + x_max) / 2 / w_image
                        y_center = (y_min + y_max) / 2 / h_image
                        w = (x_max - x_min) / w_image
                        h = (y_max - y_min) / h_image

                        out.write(f"{cls_id} {x_center} {y_center} {w} {h}\n")
        return count


if __name__ == "__main__":
    data = Voc_Yolo(xml_path)
    number = data.Work(0)
    print(number, 'xml files converted')
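
After a conversion it can be worth running a quick sanity check on the generated txt files, since a class name written in place of a numeric id or a coordinate outside [0, 1] will usually cause trouble at training time. The following is a minimal sketch of such a check; `check_label_lines` and the sample lines are hypothetical and not part of the original script.

```python
def check_label_lines(lines, num_classes):
    """Return (line_number, message) pairs for lines that are not valid YOLO labels."""
    problems = []
    for i, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if len(parts) != 5:
            problems.append((i, "expected 5 fields"))
            continue
        try:
            cls_id = int(parts[0])
            values = [float(p) for p in parts[1:]]
        except ValueError:
            problems.append((i, "non-numeric field"))
            continue
        if not 0 <= cls_id < num_classes:
            problems.append((i, "class id out of range"))
        if any(v < 0.0 or v > 1.0 for v in values):
            problems.append((i, "value outside [0, 1]"))
    return problems


# Made-up label lines: one valid, one with a class name, one out of range
sample = ["0 0.2 0.3 0.2 0.4", "dog 0.5 0.5 0.1 0.1", "1 0.5 1.2 0.1 0.1"]
print(check_label_lines(sample, num_classes=2))
# prints [(2, 'non-numeric field'), (3, 'value outside [0, 1]')]
```

To check a real file, read it with `open(path).read().splitlines()` and pass the resulting list together with the number of entries in classes.txt.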