DOTA dataset | Data pre- and post-processing scripts

GitHub: https://github.com/mary-0830/DOTA

1. Crop the DOTA dataset into 600*600 tiles and generate the matching xml files (works for both HBB and OBB)

Updated 2020.6.27
tarin_crop.py

import os
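try:
    # misc.imread only exists in old SciPy (< 1.2); guard the import so the
    # script still starts on newer releases, where scipy.misc is gone
    import scipy.misc as misc
except ImportError:
    misc = None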
from xml.dom.minidom import Document
import numpy as np
import copy, cv2

def save_to_xml(save_path, im_width, im_height, objects_axis, label_name, name, hbb=True):
    im_depth = 0
    object_num = len(objects_axis)
    doc = Document()

    annotation = doc.createElement('annotation')
    doc.appendChild(annotation)

    folder = doc.createElement('folder')
    folder_name = doc.createTextNode('VOC2007')
    folder.appendChild(folder_name)
    annotation.appendChild(folder)

    filename = doc.createElement('filename')
    filename_name = doc.createTextNode(name)
    filename.appendChild(filename_name)
    annotation.appendChild(filename)

    source = doc.createElement('source')
    annotation.appendChild(source)

    database = doc.createElement('database')
    database.appendChild(doc.createTextNode('The VOC2007 Database'))
    source.appendChild(database)

    annotation_s = doc.createElement('annotation')
    annotation_s.appendChild(doc.createTextNode('PASCAL VOC2007'))
    source.appendChild(annotation_s)

    image = doc.createElement('image')
    image.appendChild(doc.createTextNode('flickr'))
    source.appendChild(image)

    flickrid = doc.createElement('flickrid')
    flickrid.appendChild(doc.createTextNode('322409915'))
    source.appendChild(flickrid)

    owner = doc.createElement('owner')
    annotation.appendChild(owner)

    flickrid_o = doc.createElement('flickrid')
    flickrid_o.appendChild(doc.createTextNode('knautia'))
    owner.appendChild(flickrid_o)

    name_o = doc.createElement('name')
    name_o.appendChild(doc.createTextNode('yang'))
    owner.appendChild(name_o)


    size = doc.createElement('size')
    annotation.appendChild(size)
    width = doc.createElement('width')
    width.appendChild(doc.createTextNode(str(im_width)))
    height = doc.createElement('height')
    height.appendChild(doc.createTextNode(str(im_height)))
    depth = doc.createElement('depth')
    depth.appendChild(doc.createTextNode(str(im_depth)))
    size.appendChild(width)
    size.appendChild(height)
    size.appendChild(depth)
    segmented = doc.createElement('segmented')
    segmented.appendChild(doc.createTextNode('0'))
    annotation.appendChild(segmented)
    for i in range(object_num):
        objects = doc.createElement('object')
        annotation.appendChild(objects)
        object_name = doc.createElement('name')
        object_name.appendChild(doc.createTextNode(label_name[int(objects_axis[i][-1])]))
        objects.appendChild(object_name)
        pose = doc.createElement('pose')
        pose.appendChild(doc.createTextNode('Unspecified'))
        objects.appendChild(pose)
        truncated = doc.createElement('truncated')
        truncated.appendChild(doc.createTextNode('1'))
        objects.appendChild(truncated)
        difficult = doc.createElement('difficult')
        difficult.appendChild(doc.createTextNode('0'))
        objects.appendChild(difficult)
        bndbox = doc.createElement('bndbox')
        objects.appendChild(bndbox)
        if hbb:
            # Horizontal box: use the extent of all four corner points, so the
            # result is also correct when the source quadrilateral is rotated
            xs = objects_axis[i][0:8:2]
            ys = objects_axis[i][1:8:2]
            for tag, value in (('xmin', min(xs)), ('ymin', min(ys)),
                               ('xmax', max(xs)), ('ymax', max(ys))):
                node = doc.createElement(tag)
                node.appendChild(doc.createTextNode(str(value)))
                bndbox.appendChild(node)
        else:

            x0 = doc.createElement('x0')
            x0.appendChild(doc.createTextNode(str((objects_axis[i][0]))))
            bndbox.appendChild(x0)
            y0 = doc.createElement('y0')
            y0.appendChild(doc.createTextNode(str((objects_axis[i][1]))))
            bndbox.appendChild(y0)

            x1 = doc.createElement('x1')
            x1.appendChild(doc.createTextNode(str((objects_axis[i][2]))))
            bndbox.appendChild(x1)
            y1 = doc.createElement('y1')
            y1.appendChild(doc.createTextNode(str((objects_axis[i][3]))))
            bndbox.appendChild(y1)
            
            x2 = doc.createElement('x2')
            x2.appendChild(doc.createTextNode(str((objects_axis[i][4]))))
            bndbox.appendChild(x2)
            y2 = doc.createElement('y2')
            y2.appendChild(doc.createTextNode(str((objects_axis[i][5]))))
            bndbox.appendChild(y2)

            x3 = doc.createElement('x3')
            x3.appendChild(doc.createTextNode(str((objects_axis[i][6]))))
            bndbox.appendChild(x3)
            y3 = doc.createElement('y3')
            y3.appendChild(doc.createTextNode(str((objects_axis[i][7]))))
            bndbox.appendChild(y3)
        
    # The with block closes the file even if writing fails
    with open(save_path, 'w') as f:
        f.write(doc.toprettyxml(indent=''))

class_list = ['plane', 'baseball-diamond', 'bridge', 'ground-track-field', 
'small-vehicle', 'large-vehicle', 'ship', 
'tennis-court', 'basketball-court',  
'storage-tank', 'soccer-ball-field', 
'roundabout', 'harbor', 
'swimming-pool', 'helicopter']




def format_label(txt_list):
    # Each DOTA label line is "x0 y0 x1 y1 x2 y2 x3 y3 class difficulty".
    # The first two lines of the file are headers and are skipped.
    format_data = []
    for i in txt_list[2:]:
        fields = i.split(' ')
        # Check the label first: class_list.index() would otherwise raise
        # ValueError before the warning below is printed
        if fields[8] not in class_list:
            print('warning found a new label :', fields[8])
            exit()
        # int(float(...)) also accepts coordinates written as "2753.0"
        format_data.append(
            [int(float(xy)) for xy in fields[:8]] + [class_list.index(fields[8])]
        )
    return np.array(format_data)

def clip_image(file_idx, image, boxes_all, width, height):
    # print ('image shape', image.shape)
    if len(boxes_all) > 0:
        shape = image.shape
        for start_h in range(0, shape[0], 256):
            for start_w in range(0, shape[1], 256):
                boxes = copy.deepcopy(boxes_all)
                box = np.zeros_like(boxes_all)
                start_h_new = start_h
                start_w_new = start_w
                if start_h + height > shape[0]:
                  start_h_new = shape[0] - height
                if start_w + width > shape[1]:
                  start_w_new = shape[1] - width
                top_left_row = max(start_h_new, 0)
                top_left_col = max(start_w_new, 0)
                bottom_right_row = min(start_h + height, shape[0])
                bottom_right_col = min(start_w + width, shape[1])


                subImage = image[top_left_row:bottom_right_row, top_left_col: bottom_right_col]

                box[:, 0] = boxes[:, 0] - top_left_col
                box[:, 2] = boxes[:, 2] - top_left_col
                box[:, 4] = boxes[:, 4] - top_left_col
                box[:, 6] = boxes[:, 6] - top_left_col

                box[:, 1] = boxes[:, 1] - top_left_row
                box[:, 3] = boxes[:, 3] - top_left_row
                box[:, 5] = boxes[:, 5] - top_left_row
                box[:, 7] = boxes[:, 7] - top_left_row
                box[:, 8] = boxes[:, 8]
                center_y = 0.25*(box[:, 1] + box[:, 3] + box[:, 5] + box[:, 7])
                center_x = 0.25*(box[:, 0] + box[:, 2] + box[:, 4] + box[:, 6])
                # print('center_y', center_y)
                # print('center_x', center_x)
                # print ('boxes', boxes)
                # print ('boxes_all', boxes_all)
                # print ('top_left_col', top_left_col, 'top_left_row', top_left_row)

                cond1 = np.intersect1d(np.where(center_y[:]>=0 )[0], np.where(center_x[:]>=0 )[0])
                cond2 = np.intersect1d(np.where(center_y[:] <= (bottom_right_row - top_left_row))[0],
                                        np.where(center_x[:] <= (bottom_right_col - top_left_col))[0])
                idx = np.intersect1d(cond1, cond2)
                # idx = np.where(center_y[:]>=0 and center_x[:]>=0 and center_y[:] <= (bottom_right_row - top_left_row) and center_x[:] <= (bottom_right_col - top_left_col))[0]
                # save_path, im_width, im_height, objects_axis, label_name
                if len(idx) > 0:
                    name="%s_%04d_%04d.png" % (file_idx, top_left_row, top_left_col)
                    print(name)
                    xml = os.path.join(save_dir, 'labeltxt', "%s_%04d_%04d.xml" % (file_idx, top_left_row, top_left_col))
                    save_to_xml(xml, subImage.shape[1], subImage.shape[0], box[idx, :], class_list, str(name))
                    # print ('save xml : ', xml)
                    if subImage.shape[0] > 5 and subImage.shape[1] >5:
                        img = os.path.join(save_dir, 'images', "%s_%04d_%04d.png" % (file_idx, top_left_row, top_left_col))
                        cv2.imwrite(img, subImage)
        
    
    

print ('class_list', len(class_list))
raw_data = 'D:/datasets/DOTA/train/'
raw_images_dir = os.path.join(raw_data, 'images')
raw_label_dir = os.path.join(raw_data, 'labelTxt')

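save_dir = 'D:/datasets/DOTA_clip/train/'
# Create the output folders up front: cv2.imwrite does not raise when the
# target folder is missing, and open() fails for the xml files
os.makedirs(os.path.join(save_dir, 'images'), exist_ok=True)
os.makedirs(os.path.join(save_dir, 'labeltxt'), exist_ok=True)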

images = [i for i in os.listdir(raw_images_dir) if 'png' in i]
labels = [i for i in os.listdir(raw_label_dir) if 'txt' in i]

print ('find image', len(images))
print ('find label', len(labels))

min_length = 1e10
max_length = 1

for idx, img in enumerate(images):
# img = 'P1524.png'
    print (idx, 'read image', img)
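    # cv2.imread replaces misc.imread (removed from SciPy) and returns BGR,
    # which is the channel order cv2.imwrite expects when the tiles are saved
    img_data = cv2.imread(os.path.join(raw_images_dir, img))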

    # if len(img_data.shape) == 2:
        # img_data = img_data[:, :, np.newaxis]
        # print ('find gray image')

    with open(os.path.join(raw_label_dir, img.replace('png', 'txt')), 'r') as label_file:
        txt_data = label_file.readlines()
    # print (idx, len(format_label(txt_data)), img_data.shape)
    # if max(img_data.shape[:2]) > max_length:
        # max_length = max(img_data.shape[:2])
    # if min(img_data.shape[:2]) < min_length:
        # min_length = min(img_data.shape[:2])
    # if idx % 50 ==0:
        # print (idx, len(format_label(txt_data)), img_data.shape)
        # print (idx, 'min_length', min_length, 'max_length', max_length)
    box = format_label(txt_data)
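    # str.strip('.png') strips any of the characters '.', 'p', 'n', 'g' from
    # both ends rather than the extension, so use splitext instead
    clip_image(os.path.splitext(img)[0], img_data, box, 600, 600)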
        
    
#     rm train/images/*   &&   rm train/labeltxt/*
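
The tiling logic of clip_image can be looked at in isolation. The sketch below is not part of the original script: `tile_origins` and `centers_inside` are hypothetical helper names, and the code assumes the same 600-pixel tile and 256-pixel stride. It shifts the last tile back so that it ends at the image border, and keeps a box when the mean of its four corners lies inside the tile. Unlike the loop in the script, it skips repeated origins.

```python
def tile_origins(length, tile=600, stride=256):
    """Start offsets along one axis; the last tile is shifted back inside the image."""
    origins = []
    for start in range(0, length, stride):
        if start + tile > length:
            start = max(length - tile, 0)
        if start not in origins:  # skip repeats (the script above does not)
            origins.append(start)
    return origins


def centers_inside(boxes, top, left, tile=600):
    """Indices of boxes (x0 y0 x1 y1 x2 y2 x3 y3) whose center lies in the tile."""
    keep = []
    for idx, box in enumerate(boxes):
        cx = sum(box[0:8:2]) / 4 - left
        cy = sum(box[1:8:2]) / 4 - top
        if 0 <= cx <= tile and 0 <= cy <= tile:
            keep.append(idx)
    return keep


boxes = [
    [10, 10, 50, 10, 50, 40, 10, 40],          # center (30, 25)
    [700, 700, 760, 700, 760, 740, 700, 740],  # center (730, 720)
]
print(tile_origins(1000))           # [0, 256, 400]
print(centers_inside(boxes, 0, 0))  # [0]
```

For a 1000-pixel side, the windows starting at 512 and 768 would both run past the border, so both collapse onto the origin 400.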


2. Count the objects of each class in the cropped images

cls_object.py

# -*- coding: utf-8 -*-
# Count the object classes and their totals from the xml files
import os
import sys
import xml.etree.ElementTree as ET
import numpy as np
# threshold=np.nan raises ValueError in recent NumPy; sys.maxsize prints arrays in full
np.set_printoptions(suppress=True, threshold=sys.maxsize)
from PIL import Image
 
def parse_obj(xml_path, filename):
  tree=ET.parse(xml_path+filename)
  objects=[]
  for obj in tree.findall('object'):
    obj_struct={}
    obj_struct['name']=obj.find('name').text
    objects.append(obj_struct)
  return objects
 
 
def read_image(image_path, filename):
  im=Image.open(image_path+filename)
  W=im.size[0]
  H=im.size[1]
  area=W*H
  im_info=[W,H,area]
  return im_info
 
 
if __name__ == '__main__':
  xml_path='D:/datasets/DOTA_clip/val/labeltxt/'
  # Keep only the xml files and drop the extension
  filenames=[os.path.splitext(name)[0] for name in os.listdir(xml_path) if name.endswith('.xml')]
  recs={}
  classnames=[]
  num_objs={}
  for name in filenames:
    recs[name]=parse_obj(xml_path, name+'.xml')
  for name in filenames:
    # "obj" instead of "object", which would shadow the Python builtin
    for obj in recs[name]:
      if obj['name'] not in num_objs:
         num_objs[obj['name']]=1
      else:
         num_objs[obj['name']]+=1
      if obj['name'] not in classnames:
         classnames.append(obj['name'])
  for name in classnames:
    print('{}: {}'.format(name,num_objs[name]))
  print('Statistics complete.')
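
The same count can be written more compactly with `collections.Counter`. This is only a sketch: `count_classes` is a hypothetical helper, and it works on xml strings (a small made-up sample) instead of the files on disk so that it runs on its own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SAMPLE_XML = """<annotation>
<owner><name>yang</name></owner>
<object><name>plane</name></object>
<object><name>ship</name></object>
<object><name>plane</name></object>
</annotation>"""


def count_classes(xml_texts):
    """Count <object><name> entries over an iterable of xml strings."""
    counts = Counter()
    for text in xml_texts:
        root = ET.fromstring(text)
        # findall('object') only visits direct children of the root,
        # so the <name> inside <owner> is not counted
        counts.update(obj.find('name').text for obj in root.findall('object'))
    return counts


counts = count_classes([SAMPLE_XML, SAMPLE_XML])
for name, total in counts.most_common():
    print('{}: {}'.format(name, total))
```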


3. From the cropped images, collect the images and xml files of each class

cls_get.py

# -*- coding: utf-8 -*-
import os
import shutil

# Locations of the xml files and the images; change to your own paths
fileDir_ann = r'D:/datasets/DOTA_clip/val/labeltxt/'
fileDir_img = r'D:/datasets/DOTA_clip/val/images/'
# Output folder for the images that contain the wanted class
saveDir_img = r'D:/datasets/DOTA_clip/helicopter/val/images'
# Output folder for the xml files that belong to those images
saveDir_ann = r'D:/datasets/DOTA_clip/helicopter/val/annotations/'

# os.makedirs also creates missing parent folders (os.mkdir does not)
os.makedirs(saveDir_img, exist_ok=True)
os.makedirs(saveDir_ann, exist_ok=True)

# The class to keep; change it to the class you need and
# adjust the string to the format of your xml files
classes1 = '<name>large-vehicle</name>\n'

# Walk over every file in the annotation folder
for files in os.walk(fileDir_ann):
    for file in files[2]:
        print(file + "-->start!")

        with open(fileDir_ann + file) as fp:
            lines = fp.readlines()

        # Line indices of every "<object>\n" and "</object>\n";
        # adjust these strings if your xml files are indented
        ind_start = [n for n, line in enumerate(lines) if line == "<object>\n"]
        ind_end = [n for n, line in enumerate(lines) if line == "</object>\n"]
        if not ind_start:
            # No objects in this file, nothing to extract
            continue

        # One block of lines per object, kept in a per-file list so that
        # blocks from a previous file can never leak into this one
        blocks = [lines[s:e + 1] for s, e in zip(ind_start, ind_end)]
        kept = [block for block in blocks if classes1 in block]
        if not kept:
            continue

        # Header + kept object blocks + footer
        output = lines[0:ind_start[0]]
        for block in kept:
            output += block
        output += lines[ind_end[-1] + 1:]

        with open(saveDir_ann + file, 'w') as fp_w:
            fp_w.writelines(output)

        # .png or .jpg, depending on your own image format
        name_img = fileDir_img + os.path.splitext(file)[0] + ".png"
        shutil.copy(name_img, saveDir_img)
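
The script above matches whole text lines, so it depends on the exact layout of the xml file. As an alternative sketch (a swapped-in technique, not what cls_get.py does), the same filtering can be done on the parsed tree. `keep_class` is a hypothetical helper and the sample xml is made up for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """<annotation>
<filename>P0003_0000_0000.png</filename>
<object><name>large-vehicle</name></object>
<object><name>ship</name></object>
<object><name>large-vehicle</name></object>
</annotation>"""


def keep_class(xml_text, wanted):
    """Return (xml string with only the `wanted` objects, number of objects kept)."""
    root = ET.fromstring(xml_text)
    kept = 0
    # Iterate over a copy of the list because objects are removed on the way
    for obj in list(root.findall('object')):
        if obj.find('name').text == wanted:
            kept += 1
        else:
            root.remove(obj)
    return ET.tostring(root, encoding='unicode'), kept


filtered, kept = keep_class(SAMPLE_XML, 'large-vehicle')
print(kept)  # 2
```

When `kept` is 0 the file can simply be skipped, which corresponds to the `os.remove` branch of the script.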


Below is the result for large vehicles:
[image]

4. Find the images of one class among the prediction result images

find_same_name.py

The code below finds the bridge images.

# -*- coding: utf-8 -*-
# !/usr/bin/env python
import shutil
import os
import glob

# Absolute path of the folder the matched files are copied to
outDir = os.path.abspath('D:/datasets/output/bridge')
os.makedirs(outDir, exist_ok=True)

# First folder: cropped images of the wanted class
imageDir1 = os.path.abspath('D:/datasets/DOTA_clip/bridge/val/images')
# Second folder: the prediction result images
imageDir2 = os.path.abspath('D:/datasets/R2CNN_20180922_DOTA_v28/R2CNN_20180922_DOTA_v28')

# File names of all '.png' files in the first folder, without the extension
imgname1 = [os.path.splitext(os.path.basename(item))[0]
            for item in glob.glob(os.path.join(imageDir1, '*.png'))]

# Same for the '.jpg' files in the second folder; keeping the first 15
# characters (e.g. P0003_0000_0000) drops the _r / _h suffix
imgname2 = {os.path.splitext(os.path.basename(item))[0][0:15]
            for item in glob.glob(os.path.join(imageDir2, '*.jpg'))}

# Names that occur in both folders, each one once, in first-folder order
List = [item for item in imgname1 if item in imgname2]
print(List)

for i in List:
    # An f-string replaces {i} with the value of the variable i
    for suffix in ('_r', '_h'):
        old_path = os.path.join(imageDir2, f'{i}{suffix}.jpg')
        new_path = os.path.join(outDir, f'{i}{suffix}.jpg')
        shutil.copy2(old_path, new_path)
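
The name matching can be tried without touching the disk. The sketch below is an illustration only: `matching_results` is a hypothetical helper, the file names are made up, and it assumes (as the script does) that a result image is named after the first 15 characters of the tile plus an `_r` or `_h` suffix.

```python
import os


def matching_results(class_images, result_images, stem_len=15, suffixes=('_r', '_h')):
    """List result file names whose first `stem_len` characters match a class image."""
    wanted = {os.path.splitext(name)[0] for name in class_images}
    matches = []
    for name in sorted(result_images):
        stem, _ext = os.path.splitext(name)
        if stem[:stem_len] in wanted and stem[stem_len:] in suffixes:
            matches.append(name)
    return matches


class_images = ['P0003_0000_0000.png', 'P0003_0000_0256.png']
result_images = ['P0003_0000_0000_r.jpg', 'P0003_0000_0000_h.jpg', 'P0007_0256_0000_r.jpg']
print(matching_results(class_images, result_images))
# ['P0003_0000_0000_h.jpg', 'P0003_0000_0000_r.jpg']
```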
                      


5. Read a DOTA xml file, get the class and box coordinates of every object, and save them to tmp1.txt

get_cls_and_xy.py

Two ways of getting the object classes and coordinates are shown here: one uses the xml element tree, the other uses string splitting.

# -*- coding: utf-8 -*-
# Method 1: element tree
# Reads every box of one xml file (OBB format with x0..y3 tags)
import xml.etree.ElementTree as ET
import numpy as np

xml_path="D:/datasets/DOTA_clip/val/labeltxt/P0003_0000_0000.xml"
root = ET.parse(xml_path).getroot() # root node of the element tree
objects=[]
keys = ['x0', 'y0', 'x1', 'y1', 'x2', 'y2', 'x3', 'y3']
for ob in root.iter('object'):
    # Read the name inside the object loop: looping over every <name> in the
    # file beforehand keeps only the last one and also visits <owner><name>
    name = ob.find('name').text
    bndbox = ob.find('bndbox')
    coords = [bndbox.find(k).text for k in keys]
    line = " ".join([name] + coords)
    objects.append(line)
print(objects)

np.savetxt('D:/datasets/output/tmp1.txt', objects, fmt = '%s')


# --------------------------------------------------------------------------
# Method 2: string splitting
#import re
#
#xml_path="D:/datasets/DOTA_clip/val/labeltxt/P0003_0000_0000.xml"
#
#text = open(xml_path, 'r').read().split('\n')[20:-2]
#
#for i in range(0, len(text)-1, 16):
#	name = text[i+1].split('>')[1].split('<')[0]
#	x0 = text[i+6].split('>')[1].split('<')[0]
#	y0 = text[i+7].split('>')[1].split('<')[0]
#	x1 = text[i+8].split('>')[1].split('<')[0]
#	y1 = text[i+9].split('>')[1].split('<')[0]
#	x2 = text[i+10].split('>')[1].split('<')[0]
#	y2 = text[i+11].split('>')[1].split('<')[0]
#	x3 = text[i+12].split('>')[1].split('<')[0]
#	y3 = text[i+13].split('>')[1].split('<')[0]
#	output = f'{name} {x0} {y0} {x1} {y1} {x2} {y2} {x3} {y3}'
#	print(output)
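
Each line written to tmp1.txt has the form "name x0 y0 x1 y1 x2 y2 x3 y3". As a small sketch of how such a line can be used afterwards, the hypothetical helper `obb_line_to_hbb` below turns it into a horizontal box by taking the extent of the four corners. The sample line is made up.

```python
def obb_line_to_hbb(line):
    """Turn 'name x0 y0 x1 y1 x2 y2 x3 y3' into (name, xmin, ymin, xmax, ymax)."""
    fields = line.split()
    name = fields[0]
    coords = [int(float(v)) for v in fields[1:9]]
    xs, ys = coords[0::2], coords[1::2]
    return name, min(xs), min(ys), max(xs), max(ys)


print(obb_line_to_hbb('plane 120 40 180 60 160 120 100 100'))
# ('plane', 100, 40, 180, 120)
```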


I recommend this detailed guide to using the element tree; I think it explains things very well!

6. Read the pkl file produced by detection

read_pkl.py

# -*- coding: utf-8 -*-
    
import pickle, pprint


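# The with block closes the file automatically
with open(r'D:/datasets/R2CNN_20180922_DOTA_v28/R2CNN_20180922_DOTA_v28_detections_r.pkl', 'rb') as pkl_file:
    # encoding='bytes' loads 8-bit strings pickled by Python 2 as bytes objects
    data = pickle.load(pkl_file, encoding='bytes')

pprint.pprint(data)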
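
Printing a large detection file with pprint can produce a lot of output. The sketch below shows one way to look only at the nesting of the loaded object. `describe` is a hypothetical helper, and the pickled data is a made-up stand-in, since the real layout of the detections file is not described here.

```python
import pickle


def describe(obj, depth=0, max_depth=2):
    """Return short lines with the type and size of nested containers."""
    pad = '  ' * depth
    if isinstance(obj, dict):
        lines = ['{}dict with {} keys'.format(pad, len(obj))]
        children = list(obj.values())[:1]
    elif isinstance(obj, (list, tuple)):
        lines = ['{}{} of length {}'.format(pad, type(obj).__name__, len(obj))]
        children = list(obj[:1])
    else:
        return ['{}{}'.format(pad, type(obj).__name__)]
    # Only follow the first child, and only down to max_depth levels
    if depth < max_depth:
        for child in children:
            lines.extend(describe(child, depth + 1, max_depth))
    return lines


# Stand-in data, not the real structure of the detections file
blob = pickle.dumps([[(0.9, 10, 20, 50, 60)], []])
summary = describe(pickle.loads(blob))
print('\n'.join(summary))
```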

7. Visualize one class from the prediction results

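When to use:
Analysing classes whose AP differs a lot between two models:
(1) From the prediction file (e.g. results.txt), extract the data of that class and visualize it
(2) Inspect the results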

Extract_category.py

# -*- coding:utf-8 -*-
import json

with open(r"D:\code\check\detections_test2017__results.json", 'r', encoding='utf-8') as f:
    arr = json.load(f)

# Keep only the detections of one category (here id 5) and write one line per box:
# image_id category_id score x y w h
with open(r"D:\code\check\detections_test2017__results.txt", 'w') as file_:
    for i in arr:
        if i['category_id'] == 5:
            file_.write(
                "%08d %d %f %f %f %f %f\n" % (
                i['image_id'], i['category_id'], i['score'],
                i['bbox'][0], i['bbox'][1], i['bbox'][2], i['bbox'][3]))
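
The extraction step can be checked on a few records before running it on the full results file. The sketch below uses made-up detections with the same keys as above (`image_id`, `category_id`, `score`, `bbox`); `category_lines` and its `min_score` argument are hypothetical additions for illustration.

```python
import json

SAMPLE = json.dumps([
    {"image_id": 12, "category_id": 5, "score": 0.91, "bbox": [10.0, 20.0, 30.0, 40.0]},
    {"image_id": 12, "category_id": 2, "score": 0.80, "bbox": [5.0, 5.0, 10.0, 10.0]},
    {"image_id": 13, "category_id": 5, "score": 0.30, "bbox": [1.0, 2.0, 3.0, 4.0]},
])


def category_lines(detections, category_id, min_score=0.0):
    """Format the detections of one category as 'image_id category_id score x y w h'."""
    lines = []
    for det in detections:
        if det['category_id'] == category_id and det['score'] >= min_score:
            x, y, w, h = det['bbox']
            lines.append('%08d %d %f %f %f %f %f' % (
                det['image_id'], det['category_id'], det['score'], x, y, w, h))
    return lines


lines = category_lines(json.loads(SAMPLE), 5, min_score=0.5)
print(lines)
# ['00000012 5 0.910000 10.000000 20.000000 30.000000 40.000000']
```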

extract_pic.py

# -*- coding:utf-8 -*-
# 配合 Extract_category.py 一起用,先Extract_category.py再进行下面的代码操作

import cv2
import numpy as py
import os


def get_pic(txt_path, pic_path, save_path):  
    with open(txt_path, 'r') as fp:
        while(1):
            line = fp.readline()
            if not line:
                print("txt is over!!!")
                break
            str = line.split(" ")
            img_id = str[0]                    
            x = round(float(str[3]))
            y = round(float(str[4]))
            w = round(float(str[5])) + x
            h = round(float(str[6])) + y
            ap = round(float(str[2]), 5)
            if ap >= 0.5:
                if os.path.exists(save_path + img_id + ".png"):  
                    img = cv2.imread(save_path + img_id + ".png") 
                    if str[1] == '1':
                            # cv2.rectangle(img,(x,y-22),(x+50,y),(0,255,0), thickness = -1)
                            cv2.putText(img, "plane", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                            cv2.rectangle(img,(x,y),(w,h),(0,0,255),2,4,0)
                            cv2.circle(img, (x, y), 3, (0, 0, 255), -1)  # 画heatmap
                        
                    elif str[1] == '2':
                        # cv2.rectangle(img,(x,y-22),(x+100,y),(0,0,255), thickness = -1)
                        cv2.putText(img, "ship", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(0,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '3':
                        # cv2.rectangle(img,(x,y-22),(x+40,y),(172,172,0), thickness = -1)             
                        cv2.putText(img, "stroage-tank", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(172,172,0),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '4':
                        # cv2.rectangle(img,(x,y-22),(x+35,y),(172,0,172), thickness = -1)                  
                        cv2.putText(img, "baseball-diamond", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(172,0,172),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '5':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "tennis-court", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '6':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "basketball-court", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '7':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "ground-track-field", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '8':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "harbor", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '9':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "bridge", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '10':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "small-vehicle", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '11':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "large-vehicle", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '13':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "helicopter", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '14':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "roundabout", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '15':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "soccer-ball-field", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '16':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "swimming-pool", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    cv2.imwrite(save_path + img_id +".png", img)
                    print(str[0]+".png is save....OK~~~")
                elif os.path.exists(pic_path + img_id + ".png") :
                    img = cv2.imread(pic_path + img_id + ".png")    
                    if str[1] == '1':
                            # cv2.rectangle(img,(x,y-22),(x+50,y),(0,255,0), thickness = -1)
                            cv2.putText(img, "plane", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                            cv2.rectangle(img,(x,y),(w,h),(0,0,255),2,4,0)
                            cv2.circle(img, (x, y), 3, (0, 0, 255), -1)  # 画heatmap
                        
                    elif str[1] == '2':
                        # cv2.rectangle(img,(x,y-22),(x+100,y),(0,0,255), thickness = -1)
                        cv2.putText(img, "ship", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(0,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '3':
                        # cv2.rectangle(img,(x,y-22),(x+40,y),(172,172,0), thickness = -1)             
                        cv2.putText(img, "stroage-tank", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(172,172,0),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '4':
                        # cv2.rectangle(img,(x,y-22),(x+35,y),(172,0,172), thickness = -1)                  
                        cv2.putText(img, "baseball-diamond", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(172,0,172),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '5':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "tennis-court", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '6':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "basketball-court", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '7':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "ground-track-field", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '8':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "harbor", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '9':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "bridge", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '10':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "small-vehicle", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '11':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "large-vehicle", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '13':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "helicopter", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '14':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "roundabout", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    elif str[1] == '15':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "soccer-ball-field", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
                    
                    elif str[1] == '16':
                        # cv2.rectangle(img,(x,y-22),(x+90,y),(255,0,255), thickness = -1)               
                        cv2.putText(img, "swimming-pool", (x,y-5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0,0,0))
                        cv2.rectangle(img,(x,y),(w,h),(255,0,255),2,4,0)
                        cv2.circle(img, (x, y), 3, (0, 0, 255), -1)

                    cv2.imwrite(save_path + img_id +".png", img)
                    print(str[0] + ".png saved successfully")

if __name__ == '__main__':
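    # Directory containing the sample images
    pic_path = r"D:/code/check/test/"
    # Path to the txt file with the detection results
    txt_path = "D:/code/check/detections_test2017__results.txt"
    # Directory where the annotated images are saved
    save_path = "D:/code/check/out/"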
    get_pic(txt_path, pic_path, save_path)
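
The `elif` chain in the drawing function repeats the same three OpenCV calls for every category, so it could probably be collapsed into a lookup table. The following is only a minimal sketch under stated assumptions: `CATEGORY_NAMES` and `draw_detection` are hypothetical names that do not appear in the original script, and the table lists only the category IDs visible in this part of the listing. The remaining IDs would have to be filled in from the rest of the chain.

```python
# Hypothetical lookup table: category ID (as read from the txt file) -> label.
# Only the IDs shown in this part of the script are listed here.
CATEGORY_NAMES = {
    "8": "harbor",
    "9": "bridge",
    "10": "small-vehicle",
    "11": "large-vehicle",
    "13": "helicopter",
    "14": "roundabout",
    "15": "soccer-ball-field",
    "16": "swimming-pool",
}


def draw_detection(img, category_id, x, y, w, h):
    """Draw one labelled box on img; return False for an unknown category."""
    name = CATEGORY_NAMES.get(category_id)
    if name is None:
        return False
    import cv2  # imported lazily so the lookup itself works without OpenCV

    cv2.putText(img, name, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0, 0, 0))
    # As in the original script, (w, h) is passed as the second corner.
    cv2.rectangle(img, (x, y), (w, h), (255, 0, 255), 2, 4, 0)
    cv2.circle(img, (x, y), 3, (0, 0, 255), -1)
    return True
```

With a table like this, adding or renaming a category means editing one dictionary entry instead of copying another four-line branch.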

8. json -> txt

json2txt.py

# -*- coding: UTF-8 -*-
import json

# Convert a JSON results file to a txt file.
with open(r"C:\Users\Desktop\results.json", 'r', encoding='utf-8') as f:
    arr = json.loads(f.read(), strict=False)

    # Alternative for a file that holds one JSON object per line:
    # papers = []
    # for line in f.readlines():
    #     dic = json.loads(line)
    #     papers.append(dic)

# print(80*'=')
# print(arr)
# print(80*'=')

file_ = open(r"C:\Users\Desktop\yolov4_test.txt", 'w')

# Variant: write the box as corners x1, y1, x2, y2 (x2 = x + w, y2 = y + h).
# for i in arr:
#     file_.write(
#         "%08d %d %f %f %f %f %f\n" % (
#         i['image_id'], i['category_id'], i['score'],
#         i['bbox'][0], i['bbox'][1], i['bbox'][2] + i['bbox'][0], i['bbox'][3] + i['bbox'][1]))

# Variant: read image id, category and bbox from a ground-truth (gt) file.
# for i in arr['annotations']:
#     file_.write(
#         "%08d %d %f %f %f %f\n" % (
#         i['image_id'],  i['category_id'],
#         i['bbox'][0], i['bbox'][1], i['bbox'][2], i['bbox'][3]))

# Variant: read image id, category and bbox from a prediction file.
# for i in arr:
#     file_.write(
#         "%08d %d %f %f %f %f\n" % (
#         i['image_id'],  i['category_id'],
#         i['bbox'][0], i['bbox'][1], i['bbox'][2], i['bbox'][3]))

# Read image id, category, score and bbox from a prediction file.
for i in arr:
    file_.write(
        "%08d %d %f %f %f %f %f\n" % (
        i['image_id'], i['category_id'], i['score'],
        i['bbox'][0], i['bbox'][1], i['bbox'][2], i['bbox'][3]))

file_.close()
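
The script keeps several output variants as commented-out loops. One way to fold them together is a small formatting helper with a switch for the corner form. This is a sketch, assuming each detection is a dict with a COCO-style `bbox` of `[x, y, width, height]`; `format_detection` and its `corners` flag are hypothetical names, not part of the original script.

```python
def format_detection(det, corners=False):
    """Format one detection dict as a txt line.

    With corners=True the box is written as x1, y1, x2, y2;
    otherwise it is written as x, y, width, height.
    """
    x, y, w, h = det['bbox']
    if corners:
        box = (x, y, x + w, y + h)
    else:
        box = (x, y, w, h)
    return "%08d %d %f %f %f %f %f\n" % (
        (det['image_id'], det['category_id'], det['score']) + box)


det = {'image_id': 12, 'category_id': 3, 'score': 0.5,
       'bbox': [10.0, 20.0, 30.0, 40.0]}
print(format_detection(det), end='')
print(format_detection(det, corners=True), end='')
```

The main loop would then reduce to `file_.write(format_detection(i))` for each entry in `arr`.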

9. Modify the category IDs

modify_cls.py

# -*- coding: UTF-8 -*-
import json

# Convert a JSON prediction file to txt, shifting every category ID down by 1.
with open(r"C:\Users\Desktop\val2017_bbox_results.json", 'r', encoding='utf-8') as f:
    arr = json.loads(f.read(), strict=False)

file_ = open(r"C:\Users\Desktop\retinanet_pre.txt", 'w')

# Read image id, category, score and bbox from the prediction file.
for i in arr:
    if i['category_id'] <= 16:
        # Write category_id - 1 so the IDs in the txt file are shifted down by one.
        file_.write(
            "%08d %d %f %f %f %f %f\n" % (
            i['image_id'], i['category_id'] - 1, i['score'],
            i['bbox'][0], i['bbox'][1], i['bbox'][2], i['bbox'][3]))
    else:
        print("error! unexpected category_id:", i['category_id'])


    # Output without the ID shift:
    # file_.write(
    #     "%08d %d %f %f %f %f %f\n" % (
    #     i['image_id'], i['category_id'], i['score'],
    #     i['bbox'][0], i['bbox'][1], i['bbox'][2], i['bbox'][3]))

file_.close()
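
The ID shift can be pulled into its own function so the range check is explicit. A minimal sketch, assuming the input IDs are 1-based and run from 1 to 16: `remap_category` is a hypothetical helper, and the lower-bound check is an addition (the script above only tests the upper bound).

```python
def remap_category(category_id, num_classes=16):
    """Return category_id - 1 for IDs in 1..num_classes, else None."""
    if 1 <= category_id <= num_classes:
        return category_id - 1
    return None


for cid in (1, 16, 17):
    print(cid, "->", remap_category(cid))
```

A caller can skip or log any detection for which the function returns `None`, instead of writing a negative or out-of-range ID to the txt file.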

10. Rename files in folder order

rename.py

import os
path = "/home/jjliao/Visdrone_coco/VisDrone2019-DET-train/images_origin"
new_path = "/home/jjliao/Visdrone_coco/VisDrone2019-DET-train/images"
filelist = os.listdir(path)  # every entry in the folder (files and sub-folders)
# import pdb
# pdb.set_trace()
filelist.sort(key=lambda x: x.split('.')[0])  # sort by the name before the first dot
count = 100001
for file in filelist:
    print(file)
for file in filelist:   # walk through every entry
    Olddir = os.path.join(path, file)   # original file path
    # if os.path.isdir(Olddir):   # skip sub-folders
    #     continue
    filetype = os.path.splitext(file)[1]   # file extension
    # The new name is the running counter; it starts at 100001, so no zero-padding (zfill) is needed
    Newdir = os.path.join(new_path, str(count) + filetype)
    os.rename(Olddir, Newdir)  # move the file to new_path under its new name
    count+=1
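
Because `os.rename` cannot be undone easily, it may help to compute the old-name/new-name pairs first and inspect them before touching the disk. The sketch below keeps the same sort key and counter as the script above; `plan_renames`, `apply_renames`, and the `dry_run` flag are hypothetical names introduced for illustration.

```python
import os


def plan_renames(names, start=100001):
    """Pair each file name with its numbered replacement, in sorted order."""
    ordered = sorted(names, key=lambda n: n.split('.')[0])
    return [(old, str(start + i) + os.path.splitext(old)[1])
            for i, old in enumerate(ordered)]


def apply_renames(src_dir, dst_dir, dry_run=True):
    """Print the planned moves; rename only when dry_run is False."""
    plan = plan_renames(os.listdir(src_dir))
    for old, new in plan:
        print(old, "->", new)
        if not dry_run:
            os.rename(os.path.join(src_dir, old), os.path.join(dst_dir, new))
    return plan


print(plan_renames(["b.jpg", "a.png"]))
```

Calling `apply_renames(path, new_path)` first, and only then `apply_renames(path, new_path, dry_run=False)`, gives a chance to catch an unexpected ordering before any file is moved.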


11. txt -> xml

txt2xml.py

import os
from PIL import Image

# Change the paths below to your own
root_dir = "D:/DOTA/val/"
# annotations_dir = root_dir+"annotations/"
annotations_dir = root_dir+"labelTxt-v1.0/labelTxt/"
image_dir = root_dir + "images/part1/images/"
xml_dir = root_dir+"annotations/"
# Replace the class list below with the classes of your own dataset; the script can also be used to convert other datasets
class_name = ['ignored regions','pedestrian','people','bicycle','car','van','truck','tricycle','awning-tricycle','bus','motor','others']

for filename in os.listdir(annotations_dir):
    fin = open(annotations_dir+filename, 'r')
    image_name = filename.split('.')[0]
    img = Image.open(image_dir+image_name+".png") # if the images are JPEG, change ".png" to ".jpg"
    xml_name = xml_dir+image_name+'.xml'
    with open(xml_name, 'w') as fout:
        fout.write('<annotation>'+'\n')
        
        fout.write('\t'+'<folder>VOC2007</folder>'+'\n')
        fout.write('\t'+'<filename>'+image_name+'.png'+'</filename>'+'\n')
        
        fout.write('\t'+'<source>'+'\n')
        fout.write('\t\t'+'<database>'+'VisDrone2018 Database'+'</database>'+'\n')
        fout.write('\t\t'+'<annotation>'+'VisDrone2018'+'</annotation>'+'\n')
        fout.write('\t\t'+'<image>'+'flickr'+'</image>'+'\n')
        fout.write('\t\t'+'<flickrid>'+'Unspecified'+'</flickrid>'+'\n')
        fout.write('\t'+'</source>'+'\n')
        
        fout.write('\t'+'<owner>'+'\n')
        fout.write('\t\t'+'<flickrid>'+'Haipeng Zhang'+'</flickrid>'+'\n')
        fout.write('\t\t'+'<name>'+'Haipeng Zhang'+'</name>'+'\n')
        fout.write('\t'+'</owner>'+'\n')
        
        fout.write('\t'+'<size>'+'\n')
        fout.write('\t\t'+'<width>'+str(img.size[0])+'</width>'+'\n')
        fout.write('\t\t'+'<height>'+str(img.size[1])+'</height>'+'\n')
        fout.write('\t\t'+'<depth>'+'3'+'</depth>'+'\n')
        fout.write('\t'+'</size>'+'\n')
        
        fout.write('\t'+'<segmented>'+'0'+'</segmented>'+'\n')

        # Each line is expected to be comma-separated:
        # x, y, width, height, (unused), class index, truncated flag, difficult flag
        for line in fin.readlines():
            line = line.strip().split(',')
            fout.write('\t'+'<object>'+'\n')
            fout.write('\t\t'+'<name>'+class_name[int(line[5])]+'</name>'+'\n')
            fout.write('\t\t'+'<pose>'+'Unspecified'+'</pose>'+'\n')
            fout.write('\t\t'+'<truncated>'+line[6]+'</truncated>'+'\n')
            fout.write('\t\t'+'<difficult>'+str(int(line[7]))+'</difficult>'+'\n')
            fout.write('\t\t'+'<bndbox>'+'\n')
            fout.write('\t\t\t'+'<xmin>'+line[0]+'</xmin>'+'\n')
            fout.write('\t\t\t'+'<ymin>'+line[1]+'</ymin>'+'\n')
            # Note: xmax and ymax are inclusive pixel coordinates, hence the -1
            fout.write('\t\t\t'+'<xmax>'+str(int(line[0])+int(line[2])-1)+'</xmax>'+'\n')
            fout.write('\t\t\t'+'<ymax>'+str(int(line[1])+int(line[3])-1)+'</ymax>'+'\n')
            fout.write('\t\t'+'</bndbox>'+'\n')
            fout.write('\t'+'</object>'+'\n')
             
        fin.close()
        fout.write('</annotation>')
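
Writing XML by string concatenation works, but the standard library's `xml.etree.ElementTree` can build the same `<object>` node and takes care of escaping special characters. A minimal sketch, assuming the same comma-separated line layout that the script indexes (x, y, width, height, an unused field, class index, truncated flag, difficult flag); `object_element` is a hypothetical helper and the shortened class list is for illustration only.

```python
import xml.etree.ElementTree as ET


def object_element(line, class_names):
    """Build a VOC-style <object> element from one annotation line."""
    f = line.strip().split(',')
    x, y, w, h = (int(v) for v in f[:4])
    obj = ET.Element('object')
    ET.SubElement(obj, 'name').text = class_names[int(f[5])]
    ET.SubElement(obj, 'pose').text = 'Unspecified'
    ET.SubElement(obj, 'truncated').text = f[6]
    ET.SubElement(obj, 'difficult').text = str(int(f[7]))
    box = ET.SubElement(obj, 'bndbox')
    # xmax and ymax are inclusive, so subtract 1 as the script above does.
    for tag, value in (('xmin', x), ('ymin', y),
                       ('xmax', x + w - 1), ('ymax', y + h - 1)):
        ET.SubElement(box, tag).text = str(value)
    return obj


classes = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car']
obj = object_element("10,20,30,40,1,4,0,0", classes)
print(ET.tostring(obj, encoding='unicode'))
```

Each returned element can be appended to an `annotation` root element, and the whole tree written once with `ET.ElementTree(root).write(xml_name)`.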