Converting between the VOC and COCO dataset formats
Mutual conversion between the VOC dataset (xml format) and the COCO dataset (json format).
Directory structure of the VOC and COCO datasets:
Taking VOC2007 as an example, the download contains three folders:
The Annotations folder holds the xml file for each image; for instance "2007_000027.xml" stores the annotation information for the image 2007_000027.jpg. Opening it in a text editor shows that it is xml-formatted data.
The ImageSets folder holds the official train/val split txt files. We mainly use train.txt and val.txt under "ImageSets/Main/": train.txt lists the image names of the official training set, and val.txt lists those of the validation set.
The last folder to note is JPEGImages, which holds the original images matching those names.
<annotation>
<folder>folder name</folder>
<filename>image name.jpg</filename>
<path>path_to\at002eg001.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>550</width>
<height>518</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>Apple</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>292</xmin>
<ymin>218</ymin>
<xmax>410</xmax>
<ymax>331</ymax>
</bndbox>
</object>
<object>
...
</object>
</annotation>
As you can see, an xml file contains the following information:
folder: the folder
filename: the file name
path: the file path
source: the data source
size: the image size
segmented: used for image segmentation; this article only covers object detection (bounding boxes)
object: an xml file may contain several objects; each object describes one box, made up of:
name: the class the box belongs to, e.g. Apple
bndbox: the coordinates of the top-left and bottom-right corners
truncated: whether the object is truncated
difficult: whether the object is hard to detect
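These fields can be read with the standard library's xml.etree.ElementTree; a minimal sketch (the helper name parse_voc_xml is our own, and it assumes integer pixel coordinates as in the example above):

```python
import xml.etree.ElementTree as ET

def parse_voc_xml(xml_path):
    """Parse one VOC annotation file into a plain dict."""
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    info = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        info["objects"].append({
            "name": obj.findtext("name"),
            # VOC boxes are (xmin, ymin, xmax, ymax) in pixels
            "bbox": [int(bb.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")],
        })
    return info
```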
Unlike VOC, where each image has its own xml file, COCO writes all images and their box information into a single json file. A typical coco directory looks like this:
coco
|______annotations # annotation files
| |__train.json
| |__val.json
| |__test.json
|______trainset # training-set images
|______valset # validation-set images
|______testset # test-set images
A standard json file contains the following information:
{
"info" : info,
"licenses" : [license],
"images" : [image],
"annotations" : [annataton],
"categories" : [category]
}
The overall json structure above shows that the value of the info key is a dict, while licenses, images, annotations and categories each hold a list whose elements are again dicts.
Calling len() on the images, annotations and categories lists therefore gives:
(1) length of the images list = number of images in the training (or test) set;
(2) length of the annotations list = number of bounding boxes in the training (or test) set;
(3) length of the categories list = number of categories.
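Those three len() checks can be wrapped in a few lines; a quick sketch (the helper name coco_stats is ours, and it assumes the json has already been loaded into a dict, e.g. with json.load):

```python
import json

def coco_stats(coco):
    """Return (num_images, num_boxes, num_categories) for a COCO-style dict."""
    return len(coco["images"]), len(coco["annotations"]), len(coco["categories"])

# e.g. coco = json.load(open("train.json")); coco_stats(coco)
```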
Next, let's look at what each key contains:
(1)info
info{
"year" : int, # 年份
"version" : str, # 版本
"description" : str, # 详细描述信息
"contributor" : str, # 作者
"url" : str, # 协议链接
"date_created" : datetime, # 生成日期
}
(2)images
"images": [
{"id": 0, # int 图像id,可从0开始
"file_name": "0.jpg", # str 文件名
"width": 512, # int 图像的宽
"height": 512, # int 图像的高
"date_captured": "2020-04-14 01:45:07.508146", # datatime 获取日期
"license": 1, # int 遵循哪个协议
"coco_url": "", # str coco图片链接url
"flickr_url": "" # str flick图片链接url
}]
(3)licenses
"licenses": [
{
"id": 1, # int 协议id号 在images中遵循的license即1
"name": null, # str 协议名
"url": null # str 协议链接
}]
(4)annotations
"annotations": [
{
"id": 0, # int 图片中每个被标记物体的id编号
"image_id": 0, # int 该物体所在图片的编号
"category_id": 2, # int 被标记物体的类别id编号
"iscrowd": 0, # 0 or 1 目标是否被遮盖,默认为0
"area": 4095.9999999999986, # float 被检测物体的面积(64 * 64 = 4096)
"bbox": [200.0, 416.0, 64.0, 64.0], # [x, y, width, height] 目标检测框的坐标信息
"segmentation": [[200.0, 416.0, 264.0, 416.0, 264.0, 480.0, 200.0, 480.0]]
}]
"bbox"里[x, y, width, height]x, y代表的是物体的左上角的x, y的坐标值。
"segmentation"里[x1, y1, x2, y2, x3, y3, x4, y4]是以左上角坐标为起始,顺时针依次选取的另外三个坐标点。及[左上x, 左上y, 右上x,右上y,右下x,右下y,左下x,左下y]。
(5)categories
"categories":[
{
"id": 1, # int 类别id编号
"name": "rectangle", # str 类别名字
"supercategory": "None" # str 类别所属的大类,如卡车和轿车都属于机动车这个class
},
{
"id": 2,
"name": "circle",
"supercategory": "None"
}
]
1. Converting VOC xml annotations to a COCO json file
Before converting, first save the names of all the .xml files to be converted into xml_list.txt. If you built the voc dataset yourself, take care not to misspell the class name when entering labels.
# create_xml_list.py
import os
xml_list = os.listdir('C:/Users/user/Desktop/train')
# 'w' instead of 'a', so re-running the script does not append duplicate lines
with open('C:/Users/user/Desktop/xml_list.txt', 'w') as f:
for i in xml_list:
if i.endswith('.xml'):
f.write(i + '\n')
Run `python voc2coco.py <path to xml_list.txt> <directory of the .xml files> <path for the output .json>` to convert the xml files into a single .json file.
# voc2coco.py
# pip install lxml
import sys
import os
import json
import xml.etree.ElementTree as ET
START_BOUNDING_BOX_ID = 1
PRE_DEFINE_CATEGORIES = {}
# If necessary, pre-define category and its id
# PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4,
# "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9,
# "cow": 10, "diningtable": 11, "dog": 12, "horse": 13,
# "motorbike": 14, "person": 15, "pottedplant": 16,
# "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20}
def get(root, name):
vars = root.findall(name)
return vars
def get_and_check(root, name, length):
vars = root.findall(name)
if len(vars) == 0:
raise NotImplementedError('Can not find %s in %s.'%(name, root.tag))
if length > 0 and len(vars) != length:
raise NotImplementedError('The size of %s is supposed to be %d, but is %d.'%(name, length, len(vars)))
if length == 1:
vars = vars[0]
return vars
def get_filename_as_int(filename):
try:
filename = os.path.splitext(filename)[0]
return int(filename)
except:
raise NotImplementedError('Filename %s is supposed to be an integer.'%(filename))
def convert(xml_list, xml_dir, json_file):
list_fp = open(xml_list, 'r')
json_dict = {"images":[], "type": "instances", "annotations": [],
"categories": []}
categories = PRE_DEFINE_CATEGORIES
bnd_id = START_BOUNDING_BOX_ID
for line in list_fp:
line = line.strip()
print("Processing %s"%(line))
xml_f = os.path.join(xml_dir, line)
tree = ET.parse(xml_f)
root = tree.getroot()
path = get(root, 'path')
if len(path) == 1:
filename = os.path.basename(path[0].text)
elif len(path) == 0:
filename = get_and_check(root, 'filename', 1).text
else:
raise NotImplementedError('%d paths found in %s'%(len(path), line))
## The filename must be a number
image_id = get_filename_as_int(filename)
size = get_and_check(root, 'size', 1)
width = int(get_and_check(size, 'width', 1).text)
height = int(get_and_check(size, 'height', 1).text)
image = {'file_name': filename, 'height': height, 'width': width,
'id':image_id}
json_dict['images'].append(image)
## Currently we do not support segmentation
# segmented = get_and_check(root, 'segmented', 1).text
# assert segmented == '0'
for obj in get(root, 'object'):
category = get_and_check(obj, 'name', 1).text
if category not in categories:
new_id = len(categories)
categories[category] = new_id
category_id = categories[category]
bndbox = get_and_check(obj, 'bndbox', 1)
xmin = int(get_and_check(bndbox, 'xmin', 1).text) - 1
ymin = int(get_and_check(bndbox, 'ymin', 1).text) - 1
xmax = int(get_and_check(bndbox, 'xmax', 1).text)
ymax = int(get_and_check(bndbox, 'ymax', 1).text)
############################################################
# If you get "ValueError: invalid literal for int() with base 10: '99.2'",
# the coordinate values are floating-point strings, and int() can only parse
# integer strings. Convert with float() first, then truncate with int():
# xmin = int(float(get_and_check(bndbox, 'xmin', 1).text)) - 1
# ymin = int(float(get_and_check(bndbox, 'ymin', 1).text)) - 1
# xmax = int(float(get_and_check(bndbox, 'xmax', 1).text))
# ymax = int(float(get_and_check(bndbox, 'ymax', 1).text))
############################################################
assert(xmax > xmin)
assert(ymax > ymin)
o_width = abs(xmax - xmin)
o_height = abs(ymax - ymin)
ann = {'area': o_width*o_height, 'iscrowd': 0, 'image_id':
image_id, 'bbox':[xmin, ymin, o_width, o_height],
'category_id': category_id, 'id': bnd_id, 'ignore': 0,
'segmentation': []}
json_dict['annotations'].append(ann)
bnd_id = bnd_id + 1
for cate, cid in categories.items():
cat = {'supercategory': 'none', 'id': cid, 'name': cate}
json_dict['categories'].append(cat)
json_fp = open(json_file, 'w')
json_str = json.dumps(json_dict)
json_fp.write(json_str)
json_fp.close()
list_fp.close()
if __name__ == '__main__':
if len(sys.argv) < 4:
print('3 arguments are needed.')
print('Usage: %s XML_LIST.txt XML_DIR OUTPUT_JSON.json'%(sys.argv[0]))
exit(1)
convert(sys.argv[1], sys.argv[2], sys.argv[3])
Note that image_id here is the image name with .jpg stripped off, so the image names must be numbers. If they are not, first rename all images and labels to numbers, then convert to coco:
import os
img_dir='F:/Billboard/dataset/images/'
lab_dir='F:/Billboard/dataset/labels/'
name_list = os.listdir(img_dir)
for i,name in enumerate(name_list):
os.rename(img_dir+name,img_dir+str(i)+'.jpg')
os.rename(lab_dir+name[:-4]+'.txt',lab_dir+str(i)+'.txt')
The second method needs none of the tedious steps above; just change xml_dir and json_file.
import sys
import os
import json
import warnings
import numpy as np
import xml.etree.ElementTree as ET
import glob
START_BOUNDING_BOX_ID = 1
# Generate category_id according to the classes you specify
# COCO treats 0 as the background class by default
# CenterNet numbers classes from 0, otherwise generating the heatmap raises an error
PRE_DEFINE_CATEGORIES = {'ignored regions': 1, 'pedestrian': 2, 'people': 3,
'bicycle': 4, 'car': 5, 'van': 6, 'truck': 7,
'tricycle': 8, 'awning-tricycle': 9, 'bus': 10,
'motor': 11, 'others': 12}
START_IMAGE_ID = 0
# If necessary, pre-define category and its id
# PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4,
# "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9,
# "cow": 10, "diningtable": 11, "dog": 12, "horse": 13,
# "motorbike": 14, "person": 15, "pottedplant": 16,
# "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20}
def get(root, name):
vars = root.findall(name)
return vars
def get_and_check(root, name, length):
vars = root.findall(name)
if len(vars) == 0:
raise ValueError("Can not find %s in %s." % (name, root.tag))
if length > 0 and len(vars) != length:
raise ValueError(
"The size of %s is supposed to be %d, but is %d."
% (name, length, len(vars))
)
if length == 1:
vars = vars[0]
return vars
def get_filename_as_int(filename):
try:
filename = filename.replace("\\", "/")
filename = os.path.splitext(os.path.basename(filename))[0]
return int(filename)
except ValueError:
# For non-numeric names, derive a (mostly distinct) integer id from the
# characters instead of raising; the original `return 0` would give every
# such image the same id.
image_id = np.array([ord(char) % 10000 for char in filename], dtype=np.int32).sum()
return int(image_id)
def get_categories(xml_files):
"""Generate category name to id mapping from a list of xml files.
Arguments:
xml_files {list} -- A list of xml file paths.
Returns:
dict -- category name to id mapping.
"""
classes_names = []
for xml_file in xml_files:
tree = ET.parse(xml_file)
root = tree.getroot()
for member in root.findall("object"):
classes_names.append(member[0].text)
classes_names = list(set(classes_names))
classes_names.sort()
return {name: i for i, name in enumerate(classes_names)}
def convert(xml_files, json_file):
json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
if PRE_DEFINE_CATEGORIES is not None:
categories = PRE_DEFINE_CATEGORIES
else:
categories = get_categories(xml_files)
bnd_id = START_BOUNDING_BOX_ID
image_id = START_IMAGE_ID
for xml_file in xml_files:
tree = ET.parse(xml_file)
root = tree.getroot()
path = get(root, "path")
if len(path) == 1:
filename = os.path.basename(path[0].text)
elif len(path) == 0:
filename = get_and_check(root, "filename", 1).text
else:
raise ValueError("%d paths found in %s" % (len(path), xml_file))
## The filename must be a number
# image_id = get_filename_as_int(filename)
size = get_and_check(root, "size", 1)
width = int(get_and_check(size, "width", 1).text)
height = int(get_and_check(size, "height", 1).text)
if ".jpg" not in filename or ".png" not in filename:
filename = filename + ".jpg"
warnings.warn("filename's default suffix is jpg")
images = {
"file_name": filename, # image name
"height": height,
"width": width,
"id": image_id, # image id (unique per image)
}
json_dict["images"].append(images)
## Currently we do not support segmentation.
# segmented = get_and_check(root, 'segmented', 1).text
# assert segmented == '0'
for obj in get(root, "object"):
category = get_and_check(obj, "name", 1).text
if category not in categories:
new_id = len(categories)
categories[category] = new_id
category_id = categories[category]
bndbox = get_and_check(obj, "bndbox", 1)
xmin = int(get_and_check(bndbox, "xmin", 1).text) - 1
ymin = int(get_and_check(bndbox, "ymin", 1).text) - 1
xmax = int(get_and_check(bndbox, "xmax", 1).text)
ymax = int(get_and_check(bndbox, "ymax", 1).text)
assert xmax > xmin
assert ymax > ymin
o_width = abs(xmax - xmin)
o_height = abs(ymax - ymin)
ann = {
"area": o_width * o_height,
"iscrowd": 0,
"image_id": image_id, # id of the image (matches the id in images)
"bbox": [xmin, ymin, o_width, o_height],
"category_id": category_id,
"id": bnd_id, # one image may have several annotations
"ignore": 0,
"segmentation": [],
}
json_dict["annotations"].append(ann)
bnd_id = bnd_id + 1
image_id += 1
for cate, cid in categories.items():
cat = {"supercategory": "none", "id": cid, "name": cate}
json_dict["categories"].append(cat)
os.makedirs(os.path.dirname(json_file), exist_ok=True)
json.dump(json_dict, open(json_file, 'w'), indent=4)
if __name__ == "__main__":
# import argparse
# parser = argparse.ArgumentParser(
# description="Convert Pascal VOC annotation to COCO format."
# )
# parser.add_argument("xml_dir", help="Directory path to xml files.", type=str)
# parser.add_argument("json_file", help="Output COCO format json file.", type=str)
# args = parser.parse_args()
# args.xml_dir
# args.json_file
xml_dir = "./xml"
json_file = "./train.json" # output json
xml_files = glob.glob(os.path.join(xml_dir, "*.xml"))
# If you want to do train/test split, you can pass a subset of xml files to convert function.
print("Number of xml files: {}".format(len(xml_files)))
convert(xml_files, json_file)
print("Success: {}".format(json_file))
This version is very handy and can also split the data into a training set and a validation set.
Remember to change:
classes: your own object classes
xml_dir: the directory of the xml files
img_dir: the directory of the images
#coding:utf-8
# pip install lxml
import os
import glob
import json
import shutil
import numpy as np
import xml.etree.ElementTree as ET
path2 = "./coco/" # 输出文件夹
# classes = ['plane', 'baseball-diamond', 'bridge', 'ground-track-field',
# 'small-vehicle', 'large-vehicle', 'ship',
# 'tennis-court', 'basketball-court',
# 'storage-tank', 'soccer-ball-field',
# 'roundabout', 'harbor',
#            'swimming-pool', 'helicopter','container-crane',] # class list
classes=['plastic_bag','carton','plastic_bottle','hydrophyte','deciduous_aggregates','plastic_cup','cans']
xml_dir = "Annotations/" # xml文件
img_dir = "/media/wntlab/39e84b7d-5985-43ce-a0fa-a7f312f85897/HJK/dataset/data_voc_2021.11.1/" # 图片
train_ratio = 0.85 # 训练集的比例
START_BOUNDING_BOX_ID = 1
def get(root, name):
return root.findall(name)
def get_and_check(root, name, length):
vars = root.findall(name)
if len(vars) == 0:
raise NotImplementedError('Can not find %s in %s.'%(name, root.tag))
if length > 0 and len(vars) != length:
raise NotImplementedError('The size of %s is supposed to be %d, but is %d.'%(name, length, len(vars)))
if length == 1:
vars = vars[0]
return vars
def convert(xml_list, json_file):
json_dict = {"images": [], "type": "instances", "annotations": [], "categories": []}
categories = pre_define_categories.copy()
bnd_id = START_BOUNDING_BOX_ID
all_categories = {}
for index, line in enumerate(xml_list):
# print("Processing %s"%(line))
xml_f = line
tree = ET.parse(xml_f)
root = tree.getroot()
filename = os.path.basename(xml_f)[:-4] + ".JPG"
image_id = 20190000001 + index
size = get_and_check(root, 'size', 1)
width = int(get_and_check(size, 'width', 1).text)
height = int(get_and_check(size, 'height', 1).text)
image = {'file_name': filename, 'height': height, 'width': width, 'id':image_id}
json_dict['images'].append(image)
## Currently we do not support segmentation
# segmented = get_and_check(root, 'segmented', 1).text
# assert segmented == '0'
for obj in get(root, 'object'):
category = get_and_check(obj, 'name', 1).text
if category in all_categories:
all_categories[category] += 1
else:
all_categories[category] = 1
if category not in categories:
if only_care_pre_define_categories:
continue
new_id = len(categories) + 1
print("[warning] category '{}' not in 'pre_define_categories'({}), create new id: {} automatically".format(category, pre_define_categories, new_id))
categories[category] = new_id
category_id = categories[category]
bndbox = get_and_check(obj, 'bndbox', 1)
xmin = int(float(get_and_check(bndbox, 'xmin', 1).text))
ymin = int(float(get_and_check(bndbox, 'ymin', 1).text))
xmax = int(float(get_and_check(bndbox, 'xmax', 1).text))
ymax = int(float(get_and_check(bndbox, 'ymax', 1).text))
assert(xmax > xmin), "xmax <= xmin, {}".format(line)
assert(ymax > ymin), "ymax <= ymin, {}".format(line)
o_width = abs(xmax - xmin)
o_height = abs(ymax - ymin)
ann = {'area': o_width*o_height, 'iscrowd': 0, 'image_id':
image_id, 'bbox':[xmin, ymin, o_width, o_height],
'category_id': category_id, 'id': bnd_id, 'ignore': 0,
'segmentation': []}
json_dict['annotations'].append(ann)
bnd_id = bnd_id + 1
for cate, cid in categories.items():
cat = {'supercategory': 'none', 'id': cid, 'name': cate}
json_dict['categories'].append(cat)
json_fp = open(json_file, 'w')
json_str = json.dumps(json_dict)
json_fp.write(json_str)
json_fp.close()
print("------------create {} done--------------".format(json_file))
print("find {} categories: {} -->>> your pre_define_categories {}: {}".format(len(all_categories), all_categories.keys(), len(pre_define_categories), pre_define_categories.keys()))
print("category: id --> {}".format(categories))
print(categories.keys())
print(categories.values())
if __name__ == '__main__':
pre_define_categories = {}
for i, cls in enumerate(classes):
pre_define_categories[cls] = i + 1
# pre_define_categories = {'a1': 1, 'a3': 2, 'a6': 3, 'a9': 4, "a10": 5}
only_care_pre_define_categories = True
# only_care_pre_define_categories = False
if os.path.exists(path2 + "/annotations"):
shutil.rmtree(path2 + "/annotations")
os.makedirs(path2 + "/annotations")
if os.path.exists(path2 + "/train2017"):
shutil.rmtree(path2 + "/train2017")
os.makedirs(path2 + "/train2017")
if os.path.exists(path2 + "/val2017"):
shutil.rmtree(path2 +"/val2017")
os.makedirs(path2 + "/val2017")
save_json_train = path2 + 'annotations/instances_train2017.json'
save_json_val = path2 + 'annotations/instances_val2017.json'
xml_list = glob.glob(xml_dir + "/*.xml")
xml_list = np.sort(xml_list)
np.random.seed(100)
np.random.shuffle(xml_list)
train_num = int(len(xml_list)*train_ratio)
xml_list_train = xml_list[:train_num]
xml_list_val = xml_list[train_num:]
convert(xml_list_train, save_json_train)
convert(xml_list_val, save_json_val)
f1 = open(path2 + "train.txt", "w")
for xml in xml_list_train:
# os.path.basename instead of split("\\"), so the script also works on Linux
img = img_dir + os.path.basename(xml)[:-4] + ".JPG"
f1.write(os.path.basename(xml)[:-4] + "\n")
shutil.copyfile(img, path2 + "/train2017/" + os.path.basename(img))
f2 = open(path2 + "test.txt", "w")
for xml in xml_list_val:
img = img_dir + os.path.basename(xml)[:-4] + ".JPG"
f2.write(os.path.basename(xml)[:-4] + "\n")
shutil.copyfile(img, path2 + "/val2017/" + os.path.basename(img))
f1.close()
f2.close()
print("-------------------------------")
print("train number:", len(xml_list_train))
print("val number:", len(xml_list_val))
2. Converting a COCO-format json file to VOC-format xml files
To convert a COCO-format json file to VOC-format xml files, set anno to the json file path and xml_dir to the directory where the converted xml files should be saved, then run the code below.
# coco2voc.py
# pip install pycocotools
import os
import time
import json
import pandas as pd
from tqdm import tqdm
from pycocotools.coco import COCO
# json file path and the directory for the generated xml files
anno = 'C:/Users/user/Desktop/val/instances_val2017.json'
xml_dir = 'C:/Users/user/Desktop/val/xml/'
coco = COCO(anno) # load the annotation file
cats = coco.loadCats(coco.getCatIds()) # loadCats is a coco API call that returns the category list
# Create anno dir
dttm = time.strftime("%Y%m%d%H%M%S", time.localtime())
def trans_id(category_id):
names = []
namesid = []
for i in range(0, len(cats)):
names.append(cats[i]['name'])
namesid.append(cats[i]['id'])
index = namesid.index(category_id)
return index
def convert(anno,xml_dir):
with open(anno, 'r') as load_f:
f = json.load(load_f)
imgs = f['images'] # the images list maps img_id to image info; its length is the number of images
cat = f['categories']
df_cate = pd.DataFrame(f['categories']) # categories in the json
df_cate_sort = df_cate.sort_values(["id"], ascending=True) # sort by category id
categories = list(df_cate_sort['name']) # all category names
print('categories = ', categories)
df_anno = pd.DataFrame(f['annotations']) # annotations in the json
for i in tqdm(range(len(imgs))): # outer loop over all images; tqdm draws a progress bar for long loops
xml_content = []
file_name = imgs[i]['file_name'] # image info looked up via img_id
height = imgs[i]['height']
img_id = imgs[i]['id']
width = imgs[i]['width']
version =['"1.0"','"utf-8"']
# xml header
xml_content.append("<?xml version=" + version[0] +" "+ "encoding="+ version[1] + "?>")
xml_content.append("<annotation>")
xml_content.append(" <filename>" + file_name + "</filename>")
xml_content.append(" <size>")
xml_content.append(" <width>" + str(width) + "</width>")
xml_content.append(" <height>" + str(height) + "</height>")
xml_content.append(" <depth>"+ "3" + "</depth>")
xml_content.append(" </size>")
# find the annotations belonging to this img_id
annos = df_anno[df_anno["image_id"].isin([img_id])] # e.g. shape (2, 8) means this image has two boxes
for index, row in annos.iterrows(): # all annotation info of one image
bbox = row["bbox"]
category_id = row["category_id"]
cate_name = categories[trans_id(category_id)]
# add new object
xml_content.append(" <object>")
xml_content.append(" <name>" + cate_name + "</name>")
xml_content.append(" <truncated>0</truncated>")
xml_content.append(" <difficult>0</difficult>")
xml_content.append(" <bndbox>")
xml_content.append(" <xmin>" + str(int(bbox[0])) + "</xmin>")
xml_content.append(" <ymin>" + str(int(bbox[1])) + "</ymin>")
xml_content.append(" <xmax>" + str(int(bbox[0] + bbox[2])) + "</xmax>")
xml_content.append(" <ymax>" + str(int(bbox[1] + bbox[3])) + "</ymax>")
xml_content.append(" </bndbox>")
xml_content.append(" </object>")
xml_content.append("</annotation>")
x = xml_content
xml_content = [x[i] for i in range(0, len(x)) if x[i] != "\n"]
### write the list to a file
# file_name.split('j') breaks on names containing a 'j'; use splitext instead
xml_path = os.path.join(xml_dir, os.path.splitext(file_name)[0] + '.xml')
print(xml_path)
with open(xml_path, 'w+', encoding="utf8") as f:
f.write('\n'.join(xml_content))
xml_content[:] = []
if __name__ == '__main__':
convert(anno,xml_dir)
3. VOC to YOLO
import xml.etree.ElementTree as ET
import os
# box [xmin,ymin,xmax,ymax]
def convert(size, box):
x_center = (box[2] + box[0]) / 2.0
y_center = (box[3] + box[1]) / 2.0
# normalize the center
x = x_center / size[0]
y = y_center / size[1]
# compute width and height, normalized
w = (box[2] - box[0]) / size[0]
h = (box[3] - box[1]) / size[1]
return (x, y, w, h)
def convert_annotation(xml_paths, yolo_paths, classes):
xml_files = os.listdir(xml_paths)
# os.listdir returns the files in arbitrary order
print(f'xml_files:{xml_files}')
for file in xml_files:
xml_file_path = os.path.join(xml_paths, file)
yolo_txt_path = os.path.join(yolo_paths, file.split(".")[0]
+ ".txt")
tree = ET.parse(xml_file_path)
root = tree.getroot()
size = root.find("size")
# read the width and height values from the xml
w = int(size.find("width").text)
h = int(size.find("height").text)
# there may be several object tags, so iterate over them
with open(yolo_txt_path, 'w') as f:
for obj in root.iter("object"):
difficult = obj.find("difficult").text
# 种类类别
cls = obj.find("name").text
if cls not in classes or difficult == 1:
continue
# 转换成训练模式读取的标签
cls_id = classes.index(cls)
xml_box = obj.find("bndbox")
box = (float(xml_box.find("xmin").text), float(xml_box.find("ymin").text),
float(xml_box.find("xmax").text), float(xml_box.find("ymax").text))
boxex = convert((w, h), box)
# standard yolo format: class x_center y_center width height
f.write(str(cls_id) + " " + " ".join([str(s) for s in boxex]) + '\n')
if __name__ == "__main__":
# dataset classes
classes_train = ['ignored regions', 'pedestrian', 'people',
'bicycle','car', 'van', 'truck',
'tricycle','awning-tricycle','bus',
'motor', 'others']
# directory of the voc xml files
xml_dir = "./xml1/"
# directory for the yolo txt labels
yolo_txt_dir = "./Yolo_txt/"
# convert voc to yolo
convert_annotation(xml_paths=xml_dir, yolo_paths=yolo_txt_dir,
classes=classes_train)
Before converting, set classes_train (the classes of the training set), xml_dir (the path of the voc annotations) and yolo_txt_dir (the path where the yolo-format labels are stored).
4. YOLO to VOC
from xml.dom.minidom import Document
import os
import cv2
# def makexml(txtPath, xmlPath, picPath):
def makexml(picPath, txtPath, xmlPath): # image folder, txt folder, xml output folder
"""Convert yolo-format txt annotation files to voc-format xml annotation files."""
dic = {'0': "0", # dictionary that maps yolo class ids to class names;
'1': "1", # it must match your classes.txt, in the same order
}
files = os.listdir(txtPath)
print(files)
for i, name in enumerate(files):
xmlBuilder = Document()
annotation = xmlBuilder.createElement("annotation") # create the annotation root
xmlBuilder.appendChild(annotation)
txtFile = open(txtPath + name)
# print(txtFile)
txtList = txtFile.readlines()
# print(txtList)
img = cv2.imread(picPath + name[0:-4] + ".jpg")
print(name[0:-4])
Pheight, Pwidth, Pdepth = img.shape
folder = xmlBuilder.createElement("folder") # folder tag
foldercontent = xmlBuilder.createTextNode("driving_annotation_dataset")
folder.appendChild(foldercontent)
annotation.appendChild(folder) # end of folder tag
filename = xmlBuilder.createElement("filename") # filename tag
filenamecontent = xmlBuilder.createTextNode(name[0:-4] + ".jpg")
filename.appendChild(filenamecontent)
annotation.appendChild(filename) # end of filename tag
size = xmlBuilder.createElement("size") # size tag
width = xmlBuilder.createElement("width") # width child of size
widthcontent = xmlBuilder.createTextNode(str(Pwidth))
width.appendChild(widthcontent)
size.appendChild(width) # end of width
height = xmlBuilder.createElement("height") # height child of size
heightcontent = xmlBuilder.createTextNode(str(Pheight))
height.appendChild(heightcontent)
size.appendChild(height) # end of height
depth = xmlBuilder.createElement("depth") # depth child of size
depthcontent = xmlBuilder.createTextNode(str(Pdepth))
depth.appendChild(depthcontent)
size.appendChild(depth) # end of depth
annotation.appendChild(size) # end of size tag
for j in txtList:
oneline = j.strip().split(" ")
object = xmlBuilder.createElement("object") # object tag
picname = xmlBuilder.createElement("name") # name tag
namecontent = xmlBuilder.createTextNode(dic[oneline[0]])
# print(namecontent)
picname.appendChild(namecontent)
object.appendChild(picname) # end of name tag
pose = xmlBuilder.createElement("pose") # pose tag
posecontent = xmlBuilder.createTextNode("Unspecified")
pose.appendChild(posecontent)
object.appendChild(pose) # end of pose tag
truncated = xmlBuilder.createElement("truncated") # truncated tag
truncatedContent = xmlBuilder.createTextNode("0")
truncated.appendChild(truncatedContent)
object.appendChild(truncated) # end of truncated tag
difficult = xmlBuilder.createElement("difficult") # difficult tag
difficultcontent = xmlBuilder.createTextNode("0")
difficult.appendChild(difficultcontent)
object.appendChild(difficult) # end of difficult tag
bndbox = xmlBuilder.createElement("bndbox") # bndbox tag
xmin = xmlBuilder.createElement("xmin") # xmin tag
mathData = int(((float(oneline[1])) * Pwidth + 1) - (float(oneline[3])) * 0.5 * Pwidth)
xminContent = xmlBuilder.createTextNode(str(mathData))
xmin.appendChild(xminContent)
bndbox.appendChild(xmin) # end of xmin tag
ymin = xmlBuilder.createElement("ymin") # ymin tag
mathData = int(((float(oneline[2])) * Pheight + 1) - (float(oneline[4])) * 0.5 * Pheight)
yminContent = xmlBuilder.createTextNode(str(mathData))
ymin.appendChild(yminContent)
bndbox.appendChild(ymin) # end of ymin tag
xmax = xmlBuilder.createElement("xmax") # xmax tag
mathData = int(((float(oneline[1])) * Pwidth + 1) + (float(oneline[3])) * 0.5 * Pwidth)
xmaxContent = xmlBuilder.createTextNode(str(mathData))
xmax.appendChild(xmaxContent)
bndbox.appendChild(xmax) # end of xmax tag
ymax = xmlBuilder.createElement("ymax") # ymax tag
mathData = int(((float(oneline[2])) * Pheight + 1) + (float(oneline[4])) * 0.5 * Pheight)
ymaxContent = xmlBuilder.createTextNode(str(mathData))
ymax.appendChild(ymaxContent)
bndbox.appendChild(ymax) # end of ymax tag
object.appendChild(bndbox) # end of bndbox tag
annotation.appendChild(object) # end of object tag
f = open(xmlPath + name[0:-4] + ".xml", 'w')
xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')
f.close()
if __name__ == "__main__":
picPath = "model/YOLOX/datasets/VOC/VOCdevkit/VOC2007/JPEGImages/" # 图片所在文件夹路径,后面的/一定要带上
txtPath = "model/YOLOX/datasets/VOC/VOCdevkit/VOC2007/labels/lables/" # txt所在文件夹路径,后面的/一定要带上
xmlPath = "model/YOLOX/datasets/VOC/VOCdevkit/VOC2007/Annotations/" # xml文件保存路径,后面的/一定要带上
makexml(picPath, txtPath, xmlPath)
Just adapt dic, picPath, txtPath and xmlPath to your own setup and the code above will do the conversion. That covers yolo format to voc format.
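Ignoring the int() rounding and the +1 pixel offset used in makexml above, the voc-to-yolo and yolo-to-voc box arithmetic are exact inverses of each other; a sketch that makes the round trip explicit (both function names are ours):

```python
def voc_to_yolo(size, box):
    """(xmin, ymin, xmax, ymax) in pixels -> normalized (xc, yc, w, h)."""
    W, H = size
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2 / W, (ymin + ymax) / 2 / H,
            (xmax - xmin) / W, (ymax - ymin) / H)

def yolo_to_voc(size, box):
    """Normalized (xc, yc, w, h) -> (xmin, ymin, xmax, ymax) in pixels."""
    W, H = size
    xc, yc, w, h = box
    return ((xc - w / 2) * W, (yc - h / 2) * H,
            (xc + w / 2) * W, (yc + h / 2) * H)
```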
5. YOLO to COCO
"""
Convert a YOLO-format dataset into a COCO-format dataset.
--root_path  root path of the images and labels
"""
import os
import cv2
import json
from tqdm import tqdm
import argparse
import glob
parser = argparse.ArgumentParser("ROOT SETTING")
parser.add_argument('--root_path', type=str, default='coco', help="root path of images and labels")
arg = parser.parse_args()
# Default split ratio is 8:1:1: the first split point is at 8/10 (= 4/5), the second at 9/10.
VAL_SPLIT_POINT = 4 / 5
TEST_SPLIT_POINT = 9 / 10
root_path = arg.root_path
print(root_path)
# original label path
originLabelsDir = os.path.join(root_path, 'labels/*/*.txt')
# images matching the original labels
originImagesDir = os.path.join(root_path, 'images/*/*.jpg')
# these dicts collect the image info and annotations of each split
train_dataset = {'categories': [], 'annotations': [], 'images': []}
val_dataset = {'categories': [], 'annotations': [], 'images': []}
test_dataset = {'categories': [], 'annotations': [], 'images': []}
# load the class names
with open(os.path.join(root_path, 'classes.txt')) as f:
classes = f.read().strip().split()
# map each class name to a numeric id
for i, cls in enumerate(classes, 1):
train_dataset['categories'].append({'id': i, 'name': cls, 'supercategory': 'fish'})
val_dataset['categories'].append({'id': i, 'name': cls, 'supercategory': 'fish'})
test_dataset['categories'].append({'id': i, 'name': cls, 'supercategory': 'fish'})
# collect the image file names from the images folder
indexes = glob.glob(originImagesDir)
print(len(indexes))
# --------------- convert the data above into the format COCO needs ---------------
for k, index in enumerate(tqdm(indexes)):
txtFile = index.replace('images', 'labels').replace('jpg', 'txt')
# read the image with opencv to get its width and height
im = cv2.imread(index)
H, W, _ = im.shape
# switch which dataset dict is referenced, thereby splitting the data
if k + 1 > round(len(indexes) * VAL_SPLIT_POINT):
if k + 1 > round(len(indexes) * TEST_SPLIT_POINT):
dataset = test_dataset
else:
dataset = val_dataset
else:
dataset = train_dataset
# add the image info to the dataset
if (os.path.exists(txtFile)):
with open(txtFile, 'r') as fr:
dataset['images'].append({'file_name': index.replace("\\", "/"),
'id': k,
'width': W,
'height': H})
labelList = fr.readlines()
for label in labelList:
label = label.strip().split()
x = float(label[1])
y = float(label[2])
w = float(label[3])
h = float(label[4])
# convert normalized x,y,w,h to pixel x1,y1,x2,y2
# (the image was already read above; re-reading it here would be redundant)
x1 = (x - w / 2) * W
y1 = (y - h / 2) * H
x2 = (x + w / 2) * W
y2 = (y + h / 2) * H
x1 = int(x1)
y1 = int(y1)
x2 = int(x2)
y2 = int(y2)
# to match coco's labelling convention, category ids start from 1
cls_id = int(label[0]) + 1
width = max(0, x2 - x1)
height = max(0, y2 - y1)
dataset['annotations'].append({
'area': width * height,
'bbox': [x1, y1, width, height],
'category_id': int(cls_id),
'id': len(dataset['annotations']), # running annotation id; the original `i` was a leftover loop variable
'image_id': k,
'iscrowd': 0,
# mask: the rectangle's four corners, clockwise from the top-left
'segmentation': [[x1, y1, x2, y1, x2, y2, x1, y2]]
})
# print(dataset)
# break
else:
continue
# folder for the output annotation files
folder = os.path.join(root_path, 'annotations')
if not os.path.exists(folder):
os.makedirs(folder)
for phase in ['train', 'val', 'test']:
json_name = os.path.join(root_path, 'annotations/{}.json'.format(phase))
with open(json_name, 'w', encoding="utf-8") as f:
if phase == 'train':
json.dump(train_dataset, f, ensure_ascii=False, indent=1)
if phase == 'val':
json.dump(val_dataset, f, ensure_ascii=False, indent=1)
if phase == 'test':
json.dump(test_dataset, f, ensure_ascii=False, indent=1)
6. Converting a face dataset to YOLO
from xml.dom.minidom import Document
import os
import cv2
def convert(size, box):
x_center = (float(box[2]) + float(box[0])) / 2.0
y_center = (float(box[3]) + float(box[1])) / 2.0
# normalize the center
x = x_center / size[0]
y = y_center / size[1]
# compute width and height, normalized
w = (float(box[2]) - float(box[0])) / size[0]
h = (float(box[3]) - float(box[1])) / size[1]
return (x, y, w, h)
def makexml(picPath, facePath, txtPath):
dic = {'0': "0", # 创建字典用来对类型进行转换
}
files = os.listdir(facePath)
# print("1", files)
for i, name in enumerate(files):
txtFile = open(facePath + name)
txtList = txtFile.readlines()
print("name", name)
img = cv2.imread(picPath + name[0:-4] + ".png")
Pheight, Pwidth, Pdepth = img.shape
yolo_txt_path = os.path.join(txtPath, name.split(".")[0]
+ ".txt")
with open(yolo_txt_path, 'w') as f:
for j in txtList:
box = j.strip().split(" ")
# skip lines that do not contain a full box
if len(box) < 4:
continue
boxex = convert((Pwidth, Pheight), box)
# standard yolo format: class x_center y_center width height
f.write("0" + " " + " ".join([str(s) for s in boxex]) + '\n')
if __name__ == "__main__":
picPath = "model/datasets/DarkFace_Train_2021/image/"
facePath = "model/datasets/DarkFace_Train_2021/label/"
txtPath = "model/datasets/DarkFace_Train_2021/labels/"
makexml(picPath, facePath, txtPath)
7. Converting the LEVIE dataset to YOLO
import os
import cv2
def convert(size, box):
    # box = (cls, xmin, ymin, xmax, ymax) -> center point
    x_center = (int(box[3]) + int(box[1])) / 2.0
    y_center = (int(box[4]) + int(box[2])) / 2.0
    # Normalize the center by image width/height
    x = x_center / int(size[0])
    y = y_center / int(size[1])
    # Width and height, normalized
    w = (int(box[3]) - int(box[1])) / size[0]
    h = (int(box[4]) - int(box[2])) / size[1]
    return (int(box[0]), x, y, w, h)
def makexml(picPath, txtPath, yolo_paths):
    """Convert LEVIE txt labels (cls xmin ymin xmax ymax)
    into YOLO-format txt files."""
    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        yolo_txt_path = os.path.join(yolo_paths, name.split(".")[0] + ".txt")
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        Pheight, Pwidth, _ = img.shape
        with open(yolo_txt_path, 'w') as f:
            for j in txtList:
                oneline = j.strip().split(" ")
                # Clamp negative coordinates to 1 before converting
                # (the original assignments never wrote the clamped
                # values back into oneline, so they had no effect)
                for idx in range(1, 5):
                    if int(oneline[idx]) < 0:
                        oneline[idx] = "1"
                box = convert((Pwidth, Pheight), oneline)
                f.write(" ".join(str(v) for v in box) + '\n')
if __name__ == "__main__":
    picPath = "./out/"    # image folder; keep the trailing /
    txtPath = "./labels/" # source txt folder; keep the trailing /
    yolo = "./xml/"       # output folder for the YOLO txt files; keep the trailing /
    makexml(picPath, txtPath, yolo)
八、NWPU VHR-10 dataset to YOLO format
import os
import cv2
def convert(size, box):
    # box = (cls, xmin, ymin, xmax, ymax) -> center point
    x_center = (int(box[3]) + int(box[1])) / 2.0
    y_center = (int(box[4]) + int(box[2])) / 2.0
    # Normalize the center by image width/height
    x = x_center / int(size[0])
    y = y_center / int(size[1])
    # Width and height, normalized
    w = (int(box[3]) - int(box[1])) / size[0]
    h = (int(box[4]) - int(box[2])) / size[1]
    return (int(box[0]), x, y, w, h)
def makexml(picPath, txtPath, yolo_paths):
    """Convert NWPU VHR-10 txt labels of the form (x1,y1),(x2,y2),cls
    into YOLO-format txt files."""
    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        print(name)
        yolo_txt_path = os.path.join(yolo_paths, name.split(".")[0] + ".txt")
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        Pheight, Pwidth, _ = img.shape
        with open(yolo_txt_path, 'w') as f:
            for j in txtList:
                oneline = j.strip().split(",")
                # Strip the parentheses: '(x1', 'y1)', '(x2', 'y2)', 'cls'
                oneline = (int(oneline[4]), int(oneline[0][1:]), int(oneline[1][:-1]),
                           int(oneline[2][1:]), int(oneline[3][:-1]))
                box = convert((Pwidth, Pheight), oneline)
                f.write(" ".join(str(v) for v in box) + '\n')
if __name__ == "__main__":
    picPath = "./image/" # image folder; keep the trailing /
    txtPath = "./txt/"   # source txt folder; keep the trailing /
    yolo = "./xml/"      # output folder for the YOLO txt files; keep the trailing /
    makexml(picPath, txtPath, yolo)
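The slicing in the loop above (`[1:]` and `[:-1]`) strips the parentheses from NWPU VHR-10 label lines, which look like `(x1,y1),(x2,y2),class`. That parsing step can be sketched standalone:

```python
def parse_vhr10_line(line):
    """Parse '(x1,y1),(x2,y2),cls' into (cls, x1, y1, x2, y2)."""
    parts = line.strip().split(",")
    x1 = int(parts[0][1:])   # drop the leading '('
    y1 = int(parts[1][:-1])  # drop the trailing ')'
    x2 = int(parts[2][1:])
    y2 = int(parts[3][:-1])
    return (int(parts[4]), x1, y1, x2, y2)

print(parse_vhr10_line("(563,478),(630,573),1"))  # -> (1, 563, 478, 630, 573)
```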
九、UCAS_AOD dataset to YOLO format
import os
import cv2
import math
def convert(size, box):
    # box = (cls, x, y, w, h) in pixels, where (x, y) is the top-left corner
    x_center = box[1] + box[3] / 2.0
    y_center = box[2] + box[4] / 2.0
    # Normalize the center by image width/height
    x = x_center / int(size[0])
    y = y_center / int(size[1])
    # Normalize width and height
    w = box[3] / size[0]
    h = box[4] / size[1]
    return (int(box[0]), x, y, w, h)
def fun(str_num):
    # Parse a number written in scientific notation, e.g. '1.23e+02'
    before_e = float(str_num.split('e')[0])
    sign = str_num.split('e')[1][:1]
    after_e = int(str_num.split('e')[1][1:])
    if sign == '+':
        float_num = before_e * math.pow(10, after_e)
    elif sign == '-':
        float_num = before_e * math.pow(10, -after_e)
    else:
        float_num = None
        print('error: unknown sign')
    return float_num
def makexml(picPath, txtPath, yolo_paths):
    """Convert UCAS_AOD txt labels into YOLO-format txt files.
    Columns 9-12 of each line hold x, y, w, h of the horizontal box."""
    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        print(name)
        yolo_txt_path = os.path.join(yolo_paths, name.split(".")[0] + ".txt")
        txtFile = open(txtPath + name)
        with open(yolo_txt_path, 'w') as f:
            txtList = txtFile.readlines()
            img = cv2.imread(picPath + name[0:-4] + ".png")
            Pheight, Pwidth, _ = img.shape
            for j in txtList:
                oneline = j.strip().split("\t")
                # Columns 9-12 may be plain integers or scientific
                # notation; float() parses both directly
                a = float(oneline[9])
                b = float(oneline[10])
                c = float(oneline[11])
                d = float(oneline[12])
                oneline = (1, a, b, c, d)
                box = convert((Pwidth, Pheight), oneline)
                f.write(" ".join(str(v) for v in box) + '\n')
if __name__ == "__main__":
    picPath = "./CAR/"    # image folder; keep the trailing /
    txtPath = "./labels/" # source txt folder; keep the trailing /
    yolo = "./xml/"       # output folder for the YOLO txt files; keep the trailing /
    makexml(picPath, txtPath, yolo)
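A note on the fun() helper above: Python's built-in float() already parses scientific notation as well as plain integers, so a single float() call covers every value format found in these label files:

```python
# float() handles plain numbers and scientific notation alike
values = [float(s) for s in ("1.23e+02", "5e-03", "42")]
print(values)  # -> [123.0, 0.005, 42.0]
```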
Before running the code, place a classes.txt file in the dataset root directory; the conversion then only requires specifying --root_path.
The sections above draw on this blogger's post, with thanks: https://blog.csdn.net/qq_40502460/article/details/116564254