Converting VOC dataset XML annotations to TXT files and splitting them into training, validation, and test sets

猛男技术控

Categories: Data Processing, Pitfall-Avoidance Series, Object Detection. Tags: python, machine learning, xml, voc, yolo

CSDN小白不白

Article link: https://blog.csdn.net/weixin_45755332/article/details/115664107

For how to extract a single class from VOC, see: https://xiaobaibubai.blog.csdn.net/article/details/115660715

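The following code converts the XML annotation files of a VOC dataset into TXT label files (one line per object: class id followed by the normalized center x, center y, width, and height):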

import os
import xml.etree.ElementTree as ET

# Class names; a class's position in this list is its label id.
# The original post never defines `classes`; ["bird"] is assumed from the paths below.
classes = ["bird"]

# Convert a VOC box (xmin, xmax, ymin, ymax) in pixels into
# (x_center, y_center, width, height), normalized by the image size
def convert(size, box):
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)

xml_path = "E:/data/voc/VOCdevkit/VOC2007/bird/bird_xml/"
txt_path = "E:/data/voc/VOCdevkit/VOC2007/bird/bird_label/"

def convert_annotation(xml_path, txt_path):
    if not os.path.exists(txt_path):
        os.mkdir(txt_path)
    for xml_name in os.listdir(xml_path):
        if not xml_name.endswith(".xml"):
            continue  # ignore anything that is not an annotation file
        root = ET.parse(xml_path + xml_name).getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)
        # "w" overwrites the label file, so rerunning does not duplicate lines
        with open(txt_path + xml_name[:-3] + "txt", "w") as txt_file:
            for obj in root.iter('object'):
                cls = obj.find('name').text
                if cls not in classes:
                    continue  # skip objects whose class is not being exported
                cls_id = classes.index(cls)
                xmlbox = obj.find('bndbox')
                b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                     float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
                bb = convert((w, h), b)
                txt_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

convert_annotation(xml_path, txt_path)
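
As a quick sanity check on the coordinate conversion, the sketch below applies the same arithmetic to one made-up box and then inverts it. The names `voc_to_yolo` and `yolo_to_voc`, the 500x375 image size, and the box values are all hypothetical and chosen only for illustration; they are not from the original post.

```python
def voc_to_yolo(size, box):
    """size = (width, height); box = (xmin, xmax, ymin, ymax), all in pixels."""
    img_w, img_h = size
    xmin, xmax, ymin, ymax = box
    return ((xmin + xmax) / 2.0 / img_w,   # normalized center x
            (ymin + ymax) / 2.0 / img_h,   # normalized center y
            (xmax - xmin) / img_w,         # normalized width
            (ymax - ymin) / img_h)         # normalized height

def yolo_to_voc(size, yolo_box):
    """Inverse of voc_to_yolo: back to (xmin, xmax, ymin, ymax) in pixels."""
    img_w, img_h = size
    x, y, w, h = yolo_box
    return ((x - w / 2.0) * img_w, (x + w / 2.0) * img_w,
            (y - h / 2.0) * img_h, (y + h / 2.0) * img_h)

size = (500, 375)                   # hypothetical image width and height
box = (100.0, 300.0, 50.0, 200.0)   # hypothetical xmin, xmax, ymin, ymax

yolo_box = voc_to_yolo(size, box)
restored = yolo_to_voc(size, yolo_box)
print(yolo_box)
print(restored)
```

For this box the normalized values come out at roughly (0.4, 0.333, 0.4, 0.4), and converting back recovers the original pixel coordinates up to floating-point rounding. Running the inverse over a few generated label files is a cheap way to catch a swapped axis or a wrong image size.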

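Details:

This is my directory layout: the images and the XML files are kept in two separate folders (bird_img and bird_xml in the paths used here) under the same parent directory.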
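
Because the images and annotations live in separate folders, it can help to confirm that every XML file has a matching image before going further. The sketch below is one possible check, not part of the original post: `find_unpaired` is a hypothetical helper, the `.jpg` extension is assumed, and the demo builds a throwaway directory with dummy file names instead of touching real data.

```python
import os
import tempfile

def find_unpaired(xml_dir, img_dir, img_ext=".jpg"):
    """Return (annotations without an image, images without an annotation)."""
    xml_stems = {os.path.splitext(f)[0] for f in os.listdir(xml_dir)
                 if f.endswith(".xml")}
    img_stems = {os.path.splitext(f)[0] for f in os.listdir(img_dir)
                 if f.endswith(img_ext)}
    return sorted(xml_stems - img_stems), sorted(img_stems - xml_stems)

# Demo on a temporary layout that mirrors the two-folder structure
with tempfile.TemporaryDirectory() as parent:
    xml_dir = os.path.join(parent, "bird_xml")
    img_dir = os.path.join(parent, "bird_img")
    os.mkdir(xml_dir)
    os.mkdir(img_dir)
    for name in ("000001.xml", "000002.xml"):   # dummy annotation files
        open(os.path.join(xml_dir, name), "w").close()
    for name in ("000001.jpg", "000003.jpg"):   # dummy image files
        open(os.path.join(img_dir, name), "w").close()
    missing_images, missing_xml = find_unpaired(xml_dir, img_dir)

print("annotations without an image:", missing_images)
print("images without an annotation:", missing_xml)
```

On real data, pass the actual annotation and image folders; any name reported in the first list would make the image copy in the split step fail.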

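The following code splits the data into training, validation, and test sets (60% / 20% / 20%) and copies each label file and its image into the matching folder: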

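import os
import shutil

root_path = "E:/data/voc/VOCdevkit/VOC2007/bird/"
img_path = "E:/data/voc/VOCdevkit/VOC2007/bird/bird_img/"
# Must be the folder that the conversion script wrote its TXT labels to
txt_path = "E:/data/voc/VOCdevkit/VOC2007/bird/bird_label/"

def train_test_move(txt_path, root_path, img_path):
    files = os.listdir(txt_path)
    l = len(files)
    sets = ['train', 'val', 'test']
    k = 0    # start of the current slice, as a fraction of the file list
    p = 0.6  # end of the current slice: 60% train, then 20% val, 20% test
    for i in sets:
        # Create <root>/<set>/images and <root>/<set>/labels if they are missing
        os.makedirs(root_path + i + "/images", exist_ok=True)
        os.makedirs(root_path + i + "/labels", exist_ok=True)
        for file in files[round(l * k):round(l * p)]:
            shutil.copy(txt_path + file, root_path + i + "/labels")
            # Assumes every label has a .jpg image with the same base name
            shutil.copy(img_path + file[:-3] + "jpg", root_path + i + "/images")
        k = p
        p += 0.2

train_test_move(txt_path, root_path, img_path)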
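
Note that the split above takes the files in whatever order `os.listdir` returns them, which is arbitrary and often close to name order. If the file names correlate with content, a shuffled split may be preferable. The sketch below is a variation under that assumption, not the author's method: `split_names`, its parameters, and the dummy file names are hypothetical, and it only computes the three name lists without copying anything.

```python
import random

def split_names(names, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle names reproducibly and cut them into train / val / test lists."""
    names = sorted(names)                  # fixed starting order
    random.Random(seed).shuffle(names)     # same seed gives the same split
    train_end = round(len(names) * train_frac)
    val_end = round(len(names) * (train_frac + val_frac))
    return {
        "train": names[:train_end],
        "val": names[train_end:val_end],
        "test": names[val_end:],           # whatever remains after train and val
    }

names = ["%06d.txt" % i for i in range(10)]   # dummy label file names
splits = split_names(names)
print({subset: len(members) for subset, members in splits.items()})
```

With ten names and the default fractions this gives 6, 2, and 2 files. The resulting lists could then be fed to the same `shutil.copy` calls used above in place of the slices of `files`.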
