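Converting MAR20 Dataset OBB Labels to YOLO Format
Preface
MAR20 is currently the largest remote-sensing dataset for military aircraft recognition. It contains 3,842 images, 20 military aircraft types, and 22,341 instances, and every target instance is annotated with both a horizontal bounding box and an oriented bounding box (OBB).
I could not find existing code that converts the MAR20 OBB labels to YOLO format, so I wrote some in Python and am recording the process here.
1. OBB label structure in MAR20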
// 1.xml
<annotation>
<filename>1.jpg</filename>
<source><database>MAR20</database></source>
<size><width>859</width><height>831</height><depth>3</depth></size><segmented>0</segmented>
<object>
<type>robndbox</type>
<name>A2</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<robndbox>
<x_left_top>516.2807017543859</x_left_top>
<y_left_top>414.78947368421046</y_left_top>
<x_right_top>570.6666666666666</x_right_top>
<y_right_top>445.4912280701754</y_right_top>
<x_right_bottom>524.1754385964912</x_right_bottom>
<y_right_bottom>517.4210526315788</y_right_bottom>
<x_left_bottom>469.7894736842104</x_left_bottom>
<y_left_bottom>484.0877192982456</y_left_bottom>
</robndbox>
<angle>0</angle>
</object>
<object>
remaining objects omitted here...
</object>
</annotation>
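As the XML above shows, the useful information is the image size (width, height) and, for each target (object), the rotated-box data: the class name and the coordinates of the four corner points. The angle field is redundant, because the four corner points are already enough to draw the rotated box.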
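As a minimal sketch of reading this structure, the snippet below parses a trimmed copy of the sample above with the standard library's xml.etree.ElementTree and collects the image size and the four corner points of each object. The helper name read_objects and the inline XML_SAMPLE string are illustrative assumptions, not part of the dataset tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation shown above (one object only).
XML_SAMPLE = """<annotation>
<size><width>859</width><height>831</height><depth>3</depth></size>
<object>
<name>A2</name>
<robndbox>
<x_left_top>516.2807017543859</x_left_top>
<y_left_top>414.78947368421046</y_left_top>
<x_right_top>570.6666666666666</x_right_top>
<y_right_top>445.4912280701754</y_right_top>
<x_right_bottom>524.1754385964912</x_right_bottom>
<y_right_bottom>517.4210526315788</y_right_bottom>
<x_left_bottom>469.7894736842104</x_left_bottom>
<y_left_bottom>484.0877192982456</y_left_bottom>
</robndbox>
</object>
</annotation>"""

# Corner suffixes in clockwise order, matching the tag names in the file.
CORNERS = ("left_top", "right_top", "right_bottom", "left_bottom")


def read_objects(xml_text):
    """Return (width, height, [(name, [(x, y) * 4]), ...]) for one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    objects = []
    for obj in root.iter("object"):
        box = obj.find("robndbox")
        if box is None:  # skip objects without a rotated box
            continue
        points = [
            (float(box.find("x_" + c).text), float(box.find("y_" + c).text))
            for c in CORNERS
        ]
        objects.append((obj.find("name").text, points))
    return width, height, objects


width, height, objects = read_objects(XML_SAMPLE)
print(width, height, objects[0][0], objects[0][1])
```

Note that the angle tag is never read: the four points carry all the geometry.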
2. Python code
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import xml.etree.ElementTree as ET

import cv2
import numpy as np
from tqdm import tqdm

# classes = ["0", "1", "2", "3"]  # class names


def convert_annotation(xmls_path, txts_path, xml_name):
    """Convert one MAR20 OBB xml file into a YOLO-style txt label file."""
    xml_path = os.path.join(xmls_path, xml_name)
    txt_path = os.path.join(txts_path, os.path.splitext(xml_name)[0] + '.txt')

    tree = ET.parse(xml_path)
    root = tree.getroot()
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    if width == 0 or height == 0:
        # Normalization is impossible without a valid image size
        print("width or height is 0 in %s, skipping" % xml_name)
        return

    lines = []
    for obj in root.iter('object'):
        cls = obj.find('name').text
        # print(cls)
        # if cls not in classes:
        #     continue
        # cls_id = classes.index(cls)
        cls_id = 0  # single class: every aircraft is written as 0
        xmlbox = obj.find('robndbox')
        if xmlbox is None:
            continue
        b = ((float(xmlbox.find('x_left_top').text), float(xmlbox.find('y_left_top').text)),
             (float(xmlbox.find('x_right_top').text), float(xmlbox.find('y_right_top').text)),
             (float(xmlbox.find('x_left_bottom').text), float(xmlbox.find('y_left_bottom').text)),
             (float(xmlbox.find('x_right_bottom').text), float(xmlbox.find('y_right_bottom').text)))
        poly = np.array(b, dtype=np.float32)
        # Normalize the four corner points by the image size.
        # x and y use different factors, so on non-square images the box
        # is slightly sheared before the rectangle is fitted.
        poly[:, 0] = poly[:, 0] / width
        poly[:, 1] = poly[:, 1] / height
        # Minimum-area rectangle: (center (x, y), (width, height), rotation angle)
        rect = cv2.minAreaRect(poly)
        c_x, c_y = rect[0]
        w, h = rect[1]
        # Make w the long side; "% 180" keeps the truncated angle in [0, 180)
        if w < h:
            w, h = h, w
            the = (int(rect[-1]) + 90) % 180
        else:
            the = (int(rect[-1]) + 180) % 180
        lines.append('%s %s %s %s %s %s\n' % (cls_id, c_x, c_y, w, h, the))

    # Write the YOLO txt file; 'w' avoids duplicated lines when the script is rerun
    with open(txt_path, 'w', encoding="UTF-8") as out_file:
        out_file.writelines(lines)


xmls_path = r"E:\code-dmx\data\MAR20\Annotations\Oriented Bounding Boxes"
txts_path = r"E:\code-dmx\data\MAR20\Annotations\output\txt"
os.makedirs(txts_path, exist_ok=True)

# List of xml annotation files
img_xmls = [f for f in os.listdir(xmls_path) if f.endswith('.xml')]
for img_xml in tqdm(img_xmls):
    convert_annotation(xmls_path, txts_path, img_xml)
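The long-side step at the end of convert_annotation can be isolated as a small pure-Python helper. This is a sketch rather than the code used in the script: the name to_long_side is hypothetical, it keeps the angle as a float instead of truncating it with int, and it assumes the input angle is the rotation, in degrees, of the side reported as the width.

```python
def to_long_side(w, h, angle):
    """Return (long_side, short_side, theta) with theta in [0, 180).

    Assumption: `angle` is the rotation in degrees of the side given as `w`.
    """
    if w < h:
        # The long side is the other edge, which is perpendicular to `w`.
        w, h = h, w
        angle += 90
    # An undirected edge rotated by 180 degrees is the same edge.
    return w, h, angle % 180


print(to_long_side(2.0, 5.0, -30.0))  # (5.0, 2.0, 60.0)
print(to_long_side(5.0, 2.0, -30.0))  # (5.0, 2.0, 150.0)
```

Because the result is always taken modulo 180, the output range stays [0, 180) whichever angle range the rectangle fitter reports.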
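Because I do not need to recognize the specific aircraft type, every object is written with cls_id 0 rather than a per-type id. If you need the types, fill in classes and enable the commented-out lookup.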
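If the aircraft type is needed, one option is a name-to-index lookup. The sketch below assumes the class names follow the pattern A1 to A20, which fits the A2 in the sample and the 20 types mentioned in the preface; check this against the actual annotation files before relying on it. The names CLASSES, CLASS_TO_ID, and class_id are illustrative.

```python
# Assumption: the 20 MAR20 class names are "A1" ... "A20".
CLASSES = [f"A{i}" for i in range(1, 21)]
CLASS_TO_ID = {name: idx for idx, name in enumerate(CLASSES)}


def class_id(name):
    """Return the zero-based class index, or None for an unknown name."""
    return CLASS_TO_ID.get(name)


print(class_id("A2"))  # 1
```

Inside convert_annotation, `cls_id = 0` would then become `cls_id = class_id(cls)`, and objects for which it returns None can be skipped.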