coco数据类型与voc数据类型的相互转换以及这两种类型的详细介绍。

Louise^

已于 2022-05-24 21:03:32 修改

阅读量5.1k

点赞数 2

文章标签：目标检测深度学习计算机视觉 python

于 2022-05-20 20:49:14 首次发布

本文链接：https://blog.csdn.net/qq_62679853/article/details/124887371

版权

声明：本文为作者将多篇文章以及自己的理解缝合而出的文章，仅用于交流与学习，本文代码均为转载，如有不正确之处敬请指正，侵删。参考文章：将coco数据集转换为voc格式_迷若烟雨的博客-CSDN博客_coco转voc

Python将voc数据格式转化为coco数据格式_萤-火的博客-CSDN博客_voc转coco数据集

coco数据集格式介绍_ Clear butterfly的博客-CSDN博客_coco格式数据集

VOC数据集格式介绍_望天边星宿的博客-CSDN博客_voc数据集格式

XML——XML介绍和基本语法_KLeonard的博客-CSDN博客_xml

Json数据格式_Liumotor的博客-CSDN博客_json数组数据格式

xml 和 json各自的优缺点_张驰Terry的博客-CSDN博客_xml与json优缺点

在实现coco数据格式与voc数据集格式转换之前，首先我们要自己什么是coco数据集格式与voc数据集格式。

coco数据集格式（标签为json）

coco数据类型主要有5种标注类型，分别为object,detection,keypoint detection,suff segmentation,panoptic segmentation。一个json文件包括info，images,licenses,annotations四个基本类型。info包含了test,train,val所有实例的整体信息，如年份，贡献者。

image主要包含了图片的信息，如id，图片的长宽，图片名。licenses对应json文件的name和id。annotations主要包含了annotation的id，category id。每张图片上每个对象都有对应的一个annotation id，一张图片上有几个物体就会有几个annotation id，图片上有几个类别就会有几个category id。在做训练集时，了解json文件就已经足够了，json文件可用labelme生成。

JSON是一种轻量级、基于文本的、可读的数据传输格式，而不是一门语言，基本上所有的编程语言都支持JSON数据格式

Json的结构
对象：对象在js中表示为“{}”括起来的内容，数据结构为 {key：value,key：value,...}的键值对的结构，在面向对象的语言中，key为对象的属性，value为对应的属性值，所以很容易理解，取值方法为对象.key 获取属性值，这个属性值的类型可以是数字、字符串、数组、对象几种。
数组：数组在js中是中括号“[]”括起来的内容，数据结构为 ["java","javascript","vb",...]，取值方式和所有语言中一样，使用索引获取，字段值的类型可以是数字、字符串、数组、对象几种。

json格式的优点：

数据格式比较简单，易于读写，格式都是压缩的，占用带宽小，浏览器解析快2. 易于解析这种语言，客户端JavaScript可以简单的通过eval()进行JSON数据的读取3. 构造友好，支持多种语言

json格式的缺点：
没有XML格式这么推广的深入人心和使用广泛，没有XML那么通用性2. JSON格式目前在Web Service中推广还属于初级阶段

voc数据集格式（标签为xml）

voc数据集格式包含5种文件夹，JPEGImages,Annotations,ImageSets,SegmentationClass,SegmentationObject。

JPEGImages就是你想要训练的原始图片。Annotations是你通过labellmg生成的xml类别文件。ImageSets为txt文件类型，保存的是train.txt，val.txt，trainval.txt。SegmentationClass是Names文件，记录你想要分类的class。SegmentationObject保存物体分割后的数据。

#imagesets包含文件夹Action,Layout,Main,而txt文件实际上在Main文件夹中。

什么是XML？
XML是由万维网联盟（W3C）创建的标记语言，用于定义编码人类和机器可以读取的文档的语法。它通过使用定义文档结构的标签以及如何存储和传输文档来实现这一点.

XML的作用是什么？

XML的作用只是数据保存和数据交换 XML 元素指的是从（<标记名称）开始标签到（</标记名称>）结束标签的部分。

元素可包含其他元素、文本或者两者的混合物，另外元素也可以拥有属性。“<”表示一个标记的开始，“>”表示一个标记的结束。XML中只要有起始标记，就必须有结束标记，而且在使用嵌套结构时，标记之间不能交叉。在XML中不含任何内容的标记叫做空标记，比如<site></site>。空标记有一种特殊的写法，以<开始，然后是元素名称，然后以/>结束。比如<site />。

格式为：<标记名称属性名1="属性值1" 属性名1="属性值1" ……>内容</标记名称>

在了解voc格式和coco格式后就是来实现它们的转换了。

XML优点：
1.格式统一，符合标准2.容易与其他系统进行远程交互，数据共享比较方便3.调用将 XML 用作传输的现有服务。4.使用 XSLT 可以动态转换 XML。这是企业服务总线 (ESB) 方案中的理想功能。

XML缺点：
XML文件格式文件庞大，格式复杂，传输占用带宽2. 服务器端和客户端都需要花费大量代码来解析XML，不论服务器端和客户端代码变的异常复杂和不容易维护3. 客户端不同浏览器之间解析XML的方式不一致，需要重复编写很多代码4. 服务器端和客户端解析XML花费资源和时间。

接下来是json格式与xml格式的转换。

首先是制作出自己的数据集

先在网上收集你想要的图片，然后在labellmg上打开文件夹对图进行标注，下图为打开文件夹位置。

随后对自己需要的图片进行标注，如下图。

标注好的图片会生成xml类别的文件，xml类别的文件属于voc数据类型的文件。

xml转为json的代码，代码转自Python将voc数据格式转化为coco数据格式_萤-火的博客-CSDN博客_voc转coco数据集

import xml.etree.ElementTree as ET
import os
import json

coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []

category_set = dict()
image_set = set()

category_item_id = -1
image_id = 20180000000
annotation_id = 0


def addCatItem(name):
    global category_item_id
    category_item = dict()
    category_item['supercategory'] = 'none'
    category_item_id += 1
    category_item['id'] = category_item_id
    category_item['name'] = name
    coco['categories'].append(category_item)
    category_set[name] = category_item_id
    return category_item_id


def addImgItem(file_name, size):
    global image_id
    if file_name is None:
        raise Exception('Could not find filename tag in xml file.')
    if size['width'] is None:
        raise Exception('Could not find width tag in xml file.')
    if size['height'] is None:
        raise Exception('Could not find height tag in xml file.')
    image_id += 1
    image_item = dict()
    image_item['id'] = image_id
    image_item['file_name'] = file_name
    image_item['width'] = size['width']
    image_item['height'] = size['height']
    coco['images'].append(image_item)
    image_set.add(file_name)
    return image_id


def addAnnoItem(object_name, image_id, category_id, bbox):
    global annotation_id
    annotation_item = dict()
    annotation_item['segmentation'] = []
    seg = []
    # bbox[] is x,y,w,h
    # left_top
    seg.append(bbox[0])
    seg.append(bbox[1])
    # left_bottom
    seg.append(bbox[0])
    seg.append(bbox[1] + bbox[3])
    # right_bottom
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1] + bbox[3])
    # right_top
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1])

    annotation_item['segmentation'].append(seg)

    annotation_item['area'] = bbox[2] * bbox[3]
    annotation_item['iscrowd'] = 0
    annotation_item['ignore'] = 0
    annotation_item['image_id'] = image_id
    annotation_item['bbox'] = bbox
    annotation_item['category_id'] = category_id
    annotation_id += 1
    annotation_item['id'] = annotation_id
    coco['annotations'].append(annotation_item)


def parseXmlFiles(xml_path):
    for f in os.listdir(xml_path):
        if not f.endswith('.xml'):
            continue

        bndbox = dict()
        size = dict()
        current_image_id = None
        current_category_id = None
        file_name = None
        size['width'] = None
        size['height'] = None
        size['depth'] = None

        xml_file = os.path.join(xml_path, f)
        print(xml_file)

        tree = ET.parse(xml_file)
        root = tree.getroot()
        if root.tag != 'annotation':
            raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))

        # elem is <folder>, <filename>, <size>, <object>
        for elem in root:
            current_parent = elem.tag
            current_sub = None
            object_name = None

            if elem.tag == 'folder':
                continue

            if elem.tag == 'filename':
                file_name = elem.text
                if file_name in category_set:
                    raise Exception('file_name duplicated')

            # add img item only after parse <size> tag
            elif current_image_id is None and file_name is not None and size['width'] is not None:
                if file_name not in image_set:
                    current_image_id = addImgItem(file_name, size)
                    print('add image with {} and {}'.format(file_name, size))
                else:
                    raise Exception('duplicated image: {}'.format(file_name))
                    # subelem is <width>, <height>, <depth>, <name>, <bndbox>
            for subelem in elem:
                bndbox['xmin'] = None
                bndbox['xmax'] = None
                bndbox['ymin'] = None
                bndbox['ymax'] = None

                current_sub = subelem.tag
                if current_parent == 'object' and subelem.tag == 'name':
                    object_name = subelem.text
                    if object_name not in category_set:
                        current_category_id = addCatItem(object_name)
                    else:
                        current_category_id = category_set[object_name]

                elif current_parent == 'size':
                    if size[subelem.tag] is not None:
                        raise Exception('xml structure broken at size tag.')
                    size[subelem.tag] = int(subelem.text)

                # option is <xmin>, <ymin>, <xmax>, <ymax>, when subelem is <bndbox>
                for option in subelem:
                    if current_sub == 'bndbox':
                        if bndbox[option.tag] is not None:
                            raise Exception('xml structure corrupted at bndbox tag.')
                        bndbox[option.tag] = int(option.text)

                # only after parse the <object> tag
                if bndbox['xmin'] is not None:
                    if object_name is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_image_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_category_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    bbox = []
                    # x
                    bbox.append(bndbox['xmin'])
                    # y
                    bbox.append(bndbox['ymin'])
                    # w
                    bbox.append(bndbox['xmax'] - bndbox['xmin'])
                    # h
                    bbox.append(bndbox['ymax'] - bndbox['ymin'])
                    print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id,
                                                                   bbox))
                    addAnnoItem(object_name, current_image_id, current_category_id, bbox)


if __name__ == '__main__':
    xml_path = 'C:\\Users\\Louise\\Desktop\\cat.xml'    # 这是xml文件所在的地址
    json_file = 'C:\\Users\\Louise\\Desktop.json'                                     # 这是你要生成的json文件
    parseXmlFiles(xml_path)                                       # 只需要改动这两个参数就行了
    json.dump(coco, open(json_file, 'w'))

json转为xml的代码，转自将coco数据集转换为voc格式_迷若烟雨的博客-CSDN博客_coco转voc

from pycocotools.coco import COCO
import skimage.io as io
import matplotlib.pyplot as plt
import pylab,os,cv2,shutil
from lxml import etree, objectify
from tqdm import tqdm
import random
from PIL import Image
 
pylab.rcParams['figure.figsize'] = (8.0, 10.0)
 
dataDir='..'
CK5cats=['chair','dining table','person']
 
CKdir="../"
CKimg_dir=CKdir+"/"+"images"
CKanno_dir=CKdir+"/"+"Annotations"
 
def mkr(dir):
    if not os.path.exists(dir):
        os.makedirs(dir)
 
def showimg(coco,dataType,img,CK5Ids):
    global dataDir
    I = io.imread('%s/%s/%s' % (dataDir, dataType, img['file_name']))
    plt.imshow(I)
    plt.axis('off')
    annIds = coco.getAnnIds(imgIds=img['id'], catIds=CK5Ids, iscrowd=None)
    anns = coco.loadAnns(annIds)
    coco.showAnns(anns)
    plt.show()
 
def save_annotations(dataType,filename,objs):
    annopath=CKanno_dir+"/"+filename[:-3]+"xml"
    img_path=dataDir+"/"+dataType+"/"+filename
    dst_path=CKimg_dir+"/"+filename
    img=cv2.imread(img_path)
    im=Image.open(img_path)
    if im.mode!="RGB":
        print(filename+" not a RGB image")
        im.close()
        return
    im.close()
    shutil.copy(img_path, dst_path)
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.annotation(
        E.folder('1'),
        E.filename(filename),
        E.source(
            E.database('CKdemo'),
            E.annotation('VOC'),
            E.image('CK')
        ),
        E.size(
            E.width(img.shape[1]),
            E.height(img.shape[0]),
            E.depth(img.shape[2])
        ),
        E.segmented(0)
    )
    for obj in objs:
        E2 = objectify.ElementMaker(annotate=False)
        anno_tree2 = E2.object(
            E.name(obj[0]),
            E.pose(),
            E.truncated("0"),
            E.difficult(0),
            E.bndbox(
                E.xmin(obj[2]),
                E.ymin(obj[3]),
                E.xmax(obj[4]),
                E.ymax(obj[5])
            )
        )
        anno_tree.append(anno_tree2)
    etree.ElementTree(anno_tree).write(annopath, pretty_print=True)
 
def showbycv(coco,dataType,img,classes,CK5Ids):
    global dataDir
    filename= img['file_name']
    filepath='%s/%s/%s' % (dataDir, dataType,filename)
    I = cv2.imread(filepath)
    annIds = coco.getAnnIds(imgIds=img['id'], catIds=CK5Ids, iscrowd=None)
    anns = coco.loadAnns(annIds)
    objs=[]
    for ann in anns:
        name=classes[ann['category_id']]
        if name in CK5cats:
            if 'bbox' in ann:
                bbox = ann['bbox']
                xmin=(int)(bbox[0])
                ymin=(int)(bbox[1])
                xmax=(int)(bbox[2]+bbox[0])
                ymax=(int)(bbox[3]+bbox[1])
                obj=[name,1.0,xmin,ymin,xmax,ymax]
                objs.append(obj)
                cv2.rectangle(I, (xmin,ymin),(xmax,ymax),(255,0,0))
                cv2.putText(I,name,(xmin,ymin),3,1,(0,0,255))
    save_annotations(dataType,filename,objs)
    cv2.imshow("img",I)
    cv2.waitKey(1)
 
def catid2name(coco):
    classes=dict()
    for cat in coco.dataset['categories']:
        classes[cat['id']]=cat['name']
        #print(str(cat['id'])+":"+cat['name'])
    return classes
 
def get_CK5():
    mkr(CKimg_dir)
    mkr(CKanno_dir)
    dataTypes=['train2014','val2014']
    for dataType in dataTypes:
        annFile = '{}/annotations/instances_{}.json'.format(dataDir, dataType)
        coco = COCO(annFile)
        CK5Ids = coco.getCatIds(catNms=CK5cats)
        classes=catid2name(coco)
        for srccat in CK5cats:
            print(dataType + ":" + srccat)
            catIds = coco.getCatIds(catNms=[srccat])
            imgIds = coco.getImgIds(catIds=catIds)
            #imgIds=imgIds[0:100]
            for imgId in tqdm(imgIds):
                img=coco.loadImgs(imgId)[0]
                showbycv(coco,dataType,img,classes,CK5Ids)
                #showimg(coco,dataType,img,CK5Ids)
 
#split train and test for training
def split_traintest(trainratio=0.7,valratio=0.2,testratio=0.1):
    dataset_dir=CKdir
    files=os.listdir(CKimg_dir)
    trains=[]
    vals=[]
    trainvals=[]
    tests=[]
    random.shuffle(files)
    for i in range(len(files)):
        filepath=CKimg_dir+"/"+files[i][:-3]+"jpg"
        if(i<trainratio*len(files)):
            trains.append(files[i])
            trainvals.append(files[i])
        elif i<(trainratio+valratio)*len(files):
            vals.append(files[i])
            trainvals.append(files[i])
        else:
            tests.append(files[i])
    #write txt files for yolo
    with open(dataset_dir+"/trainval.txt","w")as f:
        for line in trainvals:
            line=CKimg_dir+"/"+line
            f.write(line+"\n")
    with open(dataset_dir+"/test.txt","w") as f:
        for line in tests:
            line=CKimg_dir+"/"+line
            f.write(line+"\n")
    #write files for voc
    maindir=dataset_dir+"/"+"ImageSets/Main"
    mkr(maindir)
    with open(maindir+"/train.txt","w") as f:
        for line in trains:
            line=line[:line.rfind(".")]
            f.write(line+"\n")
    with open(maindir+"/val.txt","w") as f:
        for line in vals:
            line=line[:line.rfind(".")]
            f.write(line+"\n")
    with open(maindir+"/trainval.txt","w") as f:
        for line in trainvals:
            line=line[:line.rfind(".")]
            f.write(line+"\n")
    with open(maindir+"/test.txt","w") as f:
        for line in tests:
            line=line[:line.rfind(".")]
            f.write(line+"\n")
    print("spliting done")
 
 
if __name__=="__main__":
    get_CK5()
    split_traintest()

Louise^

关注

2
点赞
踩
21

收藏

觉得还不错? 一键收藏
0
评论
coco数据类型与voc数据类型的相互转换以及这两种类型的详细介绍。

声明：本文为作者将多篇文章以及自己的理解缝合而出的文章，仅用于交流与学习，本文代码均为转载，侵删。参考文章：将 .json 格式转换成 .xml格式_轮子去哪儿了的博客-CSDN博客_json转xmlPython将voc数据格式转化为coco数据格式_萤-火的博客-CSDN博客_voc转coco数据集coco数据集格式介绍_ Clear butterfly的博客-CSDN博客_coco格式数据集VOC数据集格式介绍_望天边星宿的博客-CSDN博客_voc数据集格式在实现coco数据格式与v
复制链接

扫一扫