Converting the WPI Traffic Light Dataset to VOC2007 Format

Atarasin

Article link: https://blog.csdn.net/Azahaxia/article/details/108330265


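1. Overview

When I first started on traffic light detection, I kept struggling to find a suitable traffic light dataset. Even when I found one, its format often differed from what the code expected, so it could not be used directly. Most object detection code supports only the VOC or COCO dataset formats, so the dataset must first be converted into a supported format for the program to run. This was one of the more troublesome problems I ran into as a beginner.

This article shows how to convert the WPI dataset into the VOC2007 dataset format with Python scripts.

2. The WPI data format

First, the file layout of the WPI dataset: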

└─WPI
    ├─test
    │   ├─labels
    │   │     label1.mat
    │   │     label2.mat
    │   │     …
    │   │     label17.mat
    │   │     readme.txt
    │   ├─seq1
    │   ├─seq2
    │   ├─…
    │   └─seq17
    │
    └─trainval
        ├─labels
        │     readme.txt
        │     label01.mat
        │     …
        │     label07.mat
        ├─seq01
        ├─…
        └─seq07

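The WPI dataset consists of two folders, test and trainval, with the same internal layout. Each contains the annotation files and the corresponding traffic light images.

The annotation files live in the labels folder and store the annotations in mat format, which can be opened with MATLAB. The meaning of each value in the annotation data is explained in readme.txt. The images live in the seq folders; each label number corresponds one-to-one to the number of the seq folder holding its images.

3. Conversion steps

Some preparation comes first. Looking at the WPI folder, the annotation files and the image folders are not named consistently. To make them consistent, make the following changes: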

Rename the annotation files in labels to the matching seqX.mat;
In trainval, change the numbering from 0X to X.
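
The two renames above can also be scripted. The following is a minimal sketch, not part of the original workflow; `normalized_name` and `rename_split` are hypothetical helper names, and the sketch assumes the folder and file names follow exactly the `labelNN.mat` / `seqNN` patterns shown in the tree.

```python
import os
import re


def normalized_name(name):
    """Map 'label01.mat' -> 'seq1.mat' and 'seq07' -> 'seq7'.

    Names that do not match either pattern are returned unchanged.
    """
    match = re.fullmatch(r"(label|seq)0*(\d+)(\.mat)?", name)
    if match is None:
        return name
    return "seq%d%s" % (int(match.group(2)), match.group(3) or "")


def rename_split(split_dir):
    """Rename the label files in split_dir/labels and the seq folders in split_dir."""
    for folder in (os.path.join(split_dir, "labels"), split_dir):
        for name in os.listdir(folder):
            new_name = normalized_name(name)
            if new_name != name:
                os.rename(os.path.join(folder, name),
                          os.path.join(folder, new_name))
```

Calling `rename_split` once on the test folder and once on the trainval folder would produce the naming used in the rest of the article. Try it on a copy of the dataset first, since the renames happen in place.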

After the changes, the file tree should look like this:

└─WPI
    ├─test
    │   ├─labels
    │   │     seq1.mat
    │   │     seq2.mat
    │   │     …
    │   │     seq17.mat
    │   │     readme.txt
    │   ├─seq1
    │   ├─seq2
    │   ├─…
    │   └─seq17
    │
    └─trainval
        ├─labels
        │     readme.txt
        │     seq1.mat
        │     …
        │     seq7.mat
        ├─seq1
        ├─…
        └─seq7
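
Some images in the test folder have no corresponding annotations. The scripts require every image to be annotated, so the unannotated images must be deleted. How do we know which images lack annotations?

I'll use seq6 in the test folder as an example. In the accompanying screenshot, red circle 1 marks the annotation data for the images in seq6: $110\times7$ means the first 110 images of seq6 contain one kind of traffic light target, and $101\times7$ means the first 101 images contain another kind. Red circle 2 shows that the seq6 folder holds 156 images in total, so from the 111th image onward there are no annotations at all. We only need to delete the 111th through the last image, and then repeat the check for seq1 through seq17.

This step is admittedly tedious, but at the time I didn't feel capable of writing code for it, so I fell back on this clumsy manual approach. Of course, if you think you have enough data, you can skip the test data and use only the trainval folder, in which case none of the above is needed. The code then has to be modified slightly, though.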
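
The manual check could in principle be scripted. The sketch below is not the author's method. It assumes, as the Step 2 script does, that each annotation row carries the frame number in its third-from-last column, and, as the Step 1 script does, that image file names end in a 4-digit frame number. `group_by_frame` and `find_unannotated` are hypothetical names, and the input rows stand in for the per-object arrays loaded from a mat file.

```python
import os


def group_by_frame(per_object_rows):
    """Regroup [object][row] annotation rows into {frame_number: [rows]}."""
    frames = {}
    for rows in per_object_rows:
        for row in rows:
            # Assumption: the frame number is the third-from-last column.
            frame_no = int(row[-3])
            frames.setdefault(frame_no, []).append(tuple(row))
    return frames


def find_unannotated(image_names, annotated_frames):
    """Return the jpg names whose frame number is not in annotated_frames."""
    annotated = set(annotated_frames)
    missing = []
    for name in sorted(image_names):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".jpg":
            continue  # skip non-image files
        # Assumption: the file name ends in a 4-digit frame number.
        frame_no = int(stem[-4:])
        if frame_no not in annotated:
            missing.append(name)
    return missing
```

Passing the result of `os.listdir` on one seq folder together with the keys of `group_by_frame(...)` gives the list of images to delete for that sequence. Review the list before deleting anything.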

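With the preparation done, the dataset can be converted. The process has three main steps:

Rename the images in WPI and move them into the JPEGImages folder of VOC2007;
Convert the mat annotation files into XML files in the VOC annotation format and put them into the Annotations folder of VOC2007;
Split the dataset again into test and trainval, generate the corresponding txt files, and put them into the ImageSets\Main folder of VOC2007.

3.1. Step 1

The script is as follows: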
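
The three target folders need to exist before the scripts run. In particular, `shutil.move` renames the source file to the destination path when the destination is not an existing directory, so a missing JPEGImages folder would not raise an error but would silently turn the first image into a file named JPEGImages. A minimal sketch for creating the skeleton follows; `make_voc_skeleton` is a hypothetical helper, and `root` is whatever VOC2007 path you use.

```python
import os


def make_voc_skeleton(root):
    """Create the VOC2007 sub-folders used by the three steps and return their paths."""
    subdirs = ["JPEGImages", "Annotations", os.path.join("ImageSets", "Main")]
    paths = [os.path.join(root, sub) for sub in subdirs]
    for path in paths:
        # exist_ok=True makes the call safe to repeat
        os.makedirs(path, exist_ok=True)
    return paths
```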

import os
import shutil


trainval_seqs = ['seq1', 'seq2', 'seq3', 'seq4', 'seq5', 'seq6', 'seq7']
test_seqs = ['seq1', 'seq2', 'seq3', 'seq4', 'seq5', 'seq6', 'seq7', 'seq8', 'seq9', 'seq10',
             'seq11', 'seq12', 'seq13', 'seq14', 'seq15', 'seq16', 'seq17']
splits = {'test': test_seqs,
          'trainval': trainval_seqs
          }

# The two paths below must be adapted to your own file layout
newpath = r"C:\Users\zh99\Desktop\database\VOC2007\JPEGImages"  # path of the VOC JPEGImages folder
home = r"C:\Users\zh99\Desktop\database\WPI"  # path of the WPI dataset


def file_rename(path, file_name, new_name):
    """
    Rename a file in place.
    :param path: folder containing the file
    :param file_name: current file name
    :param new_name: new file name
    :return:
    """
    file = os.path.join(path, file_name)
    dirname, filename = os.path.split(file)  # split into directory and file name (extension included)
    new_file = os.path.join(dirname, new_name)
    os.rename(file, new_file)


def file_move(path, file_name, new_path):
    """
    Move a file into the given folder.
    :param path: folder containing the file
    :param file_name: file name
    :param new_path: destination folder (must already exist)
    :return:
    """
    file = os.path.join(path, file_name)
    shutil.move(file, new_path)


def all_rename_and_move():
    # test and trainval are handled identically, so one loop covers both
    for split, seqs in splits.items():
        for seq in seqs:
            path = os.path.join(home, split, seq)  # sequence folder
            # keep only the jpg files
            total_frame = [f for f in os.listdir(path) if f.endswith(".jpg")]
            for file in total_frame:
                # file[-8:] keeps the last 8 characters, i.e. a 4-digit frame number plus ".jpg"
                new_name = '%s_%s_%s' % (split, seq, file[-8:])
                file_rename(path, file, new_name)
                file_move(path, new_name, newpath)


if __name__ == '__main__':
    all_rename_and_move()
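
Because Step 1 moves the images out of the WPI folders, it can be worth previewing the new names before running it. The sketch below applies the same naming rule as the script above without touching the disk; `plan_renames` is a hypothetical helper, and the file names in the usage note are made up.

```python
def plan_renames(split, seq, file_names):
    """Return {old_name: new_name} for the jpg files of one sequence.

    Uses the same '<split>_<seq>_<last 8 characters>' rule as Step 1
    and performs no file operations.
    """
    plan = {}
    for name in sorted(file_names):
        if name.endswith('.jpg'):
            plan[name] = '%s_%s_%s' % (split, seq, name[-8:])
    return plan
```

For example, `plan_renames('test', 'seq6', ['0001.jpg', 'readme.txt'])` maps `0001.jpg` to `test_seq6_0001.jpg` and ignores the text file. The split and sequence prefix is what keeps frames with the same number in different sequences from colliding once they all sit in one JPEGImages folder.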

3.2. Step 2

The script is as follows:

"""
1. JPEGImages folder: rename the images
2. Annotations folder: xml files  ok!
   (1) mat => txt
   (2) txt => xml
3. ImageSets/Main folder: split the dataset into trainval and test
"""
"""
the second step!
"""
import os
from scipy import io
from to_JPEGImages import home
from PIL import Image
import glob  # file name pattern matching

TXTPath = home
txt_splits = ['test', 'trainval']
WPI_CLASSES = (  # always index 0
    'GAL', 'GAR', 'GAF', 'GC', 'RAL', 'RC')
# The two paths below must be adapted to your own file layout
ImagePath = r"C:\Users\zh99\Desktop\database\VOC2007\JPEGImages"   # path of the VOC JPEGImages folder
XMLPath = r"C:\Users\zh99\Desktop\database\VOC2007\Annotations"    # path of the VOC Annotations folder


def read_mat(path, split):
    """
    Read the annotation data from the mat files.
    :param path: WPI dataset root
    :param split: 'test' or 'trainval'
    :return: a dict of the form {mat file name: per-frame annotation list}
    """
    matpath = os.path.join(path, split, 'labels')
    matfile = [f for f in os.listdir(matpath) if f.endswith(".mat")]  # mat file names
    gt_dic = {}
    for mat in matfile:
        mat_dict = io.loadmat(os.path.join(matpath, mat))  # load the mat file; returns a dict
        gt_data = mat_dict['GroundTruth']
        gt_data = gt_data.squeeze()  # drop singleton dimensions, leaving an array of per-object arrays
        # [object_num, frame_num, 7] => [frame_num, object_num, 7]: merge the objects of each image
        object_num = len(gt_data)
        frame_num = 0
        for i in range(object_num):  # the largest frame number gives the image count
            temp_num = int(gt_data[i][-1][-3])  # int() so the NumPy dtype does not matter
            if temp_num > frame_num:
                frame_num = temp_num
        gt = [[] for _ in range(frame_num)]
        for obj in gt_data:
            for frame in obj:
                frame = list(frame)
                gt[int(frame[-3]) - 1].append(frame)
        gt_dic[mat] = gt
    return gt_dic


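def create_txt(gt_dic, split, imagepath):
    """
    Write the annotation dict to a txt file, one line per frame.
    :param gt_dic: data read from the mat files
    :param split: 'test' or 'trainval'
    :param imagepath: image folder (not used in this function)
    :return:
    """
    file_path = os.path.join(home, 'annotation.txt')
    # mode 'a': if the file already exists, append to it instead of overwriting
    with open(file_path, 'a') as gtdata_txt:
        for seq, data in gt_dic.items():
            for frame_id, objects in enumerate(data):
                # list index frame_id holds frame number frame_id + 1 (see read_mat);
                # unlike data[frame_id][0][-3], this also works for a frame without objects
                gtdata_txt.write('%s_%s_%s' % (split, seq[:-4], str(frame_id + 1).rjust(4, '0')))
                for x, y, w, h, _, _, label in objects:
                    coordinate = change_coordinate(x, y, w, h)
                    label = int(label) - 1  # shift to a 0-based index into WPI_CLASSES
                    gtdata_txt.write(" " + ",".join([str(a) for a in coordinate]) + ',' + str(label))
                gtdata_txt.write('\n')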


def creat_annotation():
    for split in txt_splits:
        gt_dic = read_mat(home, split)
        create_txt(gt_dic, split, ImagePath)


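def change_coordinate(x, y, w, h):
    # (x, y, w, h) => (xmin, ymin, xmax, ymax); cast to plain ints so the
    # text output does not depend on the NumPy dtype stored in the mat file
    xmin = int(x)
    ymin = int(y)
    xmax = int(x) + int(w)
    ymax = int(y) + int(h)
    return xmin, ymin, xmax, ymax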


def txt2xml(txtpath, xmlpath, imagepath):
    """
    txt => xml
    :param txtpath: folder containing annotation.txt
    :param xmlpath: folder where the xml files are saved
    :param imagepath: image folder
    :return:
    """
    # read annotation.txt
    with open(txtpath + '/' + 'annotation.txt') as f:
        lines = f.read().splitlines()
    gts = {}  # name: ['ob1', 'ob2', ...]
    for line in lines:
        gt = line.split(' ')
        key = gt[0]
        gt.pop(0)
        gts[key] = gt
    # collect the image names without extension
    ImageNames = [os.path.splitext(os.path.basename(p))[0]
                  for p in glob.glob(imagepath + '/*.jpg')]
    for name in ImageNames:
        with Image.open(imagepath + '/' + name + '.jpg') as img:
            width, height = img.size
        # raises KeyError for an image without an annotation line,
        # which is why the unannotated images were deleted beforehand
        gts_data = gts[name]
        # write the xml file
        with open(xmlpath + '/' + name + '.xml', 'w') as xml_file:
            xml_file.write('<annotation>\n')
            xml_file.write('    <folder>VOC2007</folder>\n')
            xml_file.write('    <filename>' + name + '.jpg' + '</filename>\n')
            xml_file.write('    <size>\n')
            xml_file.write('        <width>' + str(width) + '</width>\n')
            xml_file.write('        <height>' + str(height) + '</height>\n')
            xml_file.write('        <depth>3</depth>\n')
            xml_file.write('    </size>\n')

            for ob in gts_data:
                gt_data = ob.split(',')  # xmin, ymin, xmax, ymax, class index
                xml_file.write('    <object>\n')
                xml_file.write('        <name>' + WPI_CLASSES[int(gt_data[-1])] + '</name>\n')
                xml_file.write('        <pose>Unspecified</pose>\n')
                xml_file.write('        <truncated>0</truncated>\n')
                xml_file.write('        <difficult>0</difficult>\n')
                xml_file.write('        <bndbox>\n')
                xml_file.write('            <xmin>' + gt_data[0] + '</xmin>\n')
                xml_file.write('            <ymin>' + gt_data[1] + '</ymin>\n')
                xml_file.write('            <xmax>' + gt_data[2] + '</xmax>\n')
                xml_file.write('            <ymax>' + gt_data[3] + '</ymax>\n')
                xml_file.write('        </bndbox>\n')
                xml_file.write('    </object>\n')
            xml_file.write('</annotation>')


if __name__ == '__main__':
    creat_annotation()
    txt2xml(TXTPath, XMLPath, ImagePath)
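
Each line of the intermediate annotation.txt has the form `name xmin,ymin,xmax,ymax,label ...`, with one comma-separated group per object. As an alternative to writing the XML by string concatenation, the same structure can be built with the standard-library `xml.etree.ElementTree`, which takes care of tag nesting and escaping. This is only a sketch under the assumptions visible in the script above: `parse_annotation_line` and `build_voc_xml` are hypothetical names, and the image size is passed in by the caller rather than read with PIL.

```python
import xml.etree.ElementTree as ET

WPI_CLASSES = ('GAL', 'GAR', 'GAF', 'GC', 'RAL', 'RC')  # same order as in the script above


def parse_annotation_line(line):
    """Split 'name x1,y1,x2,y2,label ...' into (name, [(xmin, ymin, xmax, ymax, label), ...])."""
    fields = line.split()
    name = fields[0]
    boxes = [tuple(int(v) for v in field.split(',')) for field in fields[1:]]
    return name, boxes


def build_voc_xml(name, boxes, width, height):
    """Return a VOC-style <annotation> element for one image."""
    root = ET.Element('annotation')
    ET.SubElement(root, 'folder').text = 'VOC2007'
    ET.SubElement(root, 'filename').text = name + '.jpg'
    size = ET.SubElement(root, 'size')
    for tag, value in (('width', width), ('height', height), ('depth', 3)):
        ET.SubElement(size, tag).text = str(value)
    for xmin, ymin, xmax, ymax, label in boxes:
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = WPI_CLASSES[label]
        ET.SubElement(obj, 'pose').text = 'Unspecified'
        ET.SubElement(obj, 'truncated').text = '0'
        ET.SubElement(obj, 'difficult').text = '0'
        bndbox = ET.SubElement(obj, 'bndbox')
        for tag, value in (('xmin', xmin), ('ymin', ymin),
                           ('xmax', xmax), ('ymax', ymax)):
            ET.SubElement(bndbox, tag).text = str(value)
    return root
```

The element returned by `build_voc_xml` can be written out with `ET.ElementTree(root).write(path)`. The output is not indented the way the hand-written version is, but XML parsers read it the same way.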

3.3. Step 3

The script is as follows:

"""
1. JPEGImages folder: rename the images
2. Annotations folder: xml files
   (1) mat => txt
   (2) txt => xml
3. ImageSets/Main folder: split the dataset into train, val, test: 1752, 751, 626  ok!
"""
"""
the third step!
"""
import os
import random
from to_Annotations import WPI_CLASSES, TXTPath

sample_split = ['test', 'trainval', 'train', 'val']
# The path below must be adapted to your own file layout
root_dir = r"C:\Users\zh99\Desktop\database\VOC2007"  # VOC root path


def dataset_split(trainval_percent=0.8, train_percent=0.7):
    """
    Split the dataset.
    :param trainval_percent: fraction of all images that goes to trainval (the rest is test)
    :param train_percent: fraction of trainval that goes to train (the rest is val)
    :return:
    """
    # 0.8 trainval, 0.2 test
    xmlfilepath = root_dir + '/Annotations'
    txtsavepath = root_dir + '/ImageSets/Main'
    total_xml = os.listdir(xmlfilepath)

    num = len(total_xml)  # 3129
    indices = range(num)
    tv = int(num * trainval_percent)  # 2503
    tr = int(tv * train_percent)  # 2503*0.7=1752
    trainval = random.sample(indices, tv)
    train = set(random.sample(trainval, tr))
    trainval = set(trainval)  # sets keep the membership tests below fast

    with open(txtsavepath + '/trainval.txt', 'w') as ftrainval, \
            open(txtsavepath + '/test.txt', 'w') as ftest, \
            open(txtsavepath + '/train.txt', 'w') as ftrain, \
            open(txtsavepath + '/val.txt', 'w') as fval:
        for i in indices:
            name = total_xml[i][:-4] + '\n'  # strip the '.xml' extension
            if i in trainval:
                ftrainval.write(name)
                if i in train:
                    ftrain.write(name)
                else:
                    fval.write(name)
            else:
                ftest.write(name)


def sample_describe(split='trainval', cls_id=0):
    """
    Generate <class>_<split>.txt: one line per image, 1 if the class is present, -1 otherwise.
    :param split: 'test', 'trainval', 'train' or 'val'
    :param cls_id: class index into WPI_CLASSES
    :return:
    """
    with open(root_dir + '/ImageSets/Main/%s.txt' % split) as f:
        split_names = set(f.read().splitlines())
    with open(os.path.join(TXTPath, 'annotation.txt')) as f:
        annotation_lines = f.read().splitlines()
    split_gts = {}  # name: ['ob1', 'ob2', ...], restricted to the images of this split
    for line in annotation_lines:
        gt = line.split(' ')
        key = gt[0]
        if key in split_names:
            gt.pop(0)
            split_gts[key] = gt
    with open(root_dir + '/ImageSets/Main/%s_%s.txt' % (WPI_CLASSES[cls_id], split), 'w') as describe_txt:
        for name, bboxs in split_gts.items():
            sample = -1
            for bbox in bboxs:
                bbox_cls = bbox.split(',')[-1]  # the class index is the last comma-separated field
                if int(bbox_cls) == cls_id:
                    sample = 1
            describe_txt.write(name + ' ' + str(sample) + '\n')


def all_sample_describe():
    for split in sample_split:
        for cls_id, cls in enumerate(WPI_CLASSES):
            sample_describe(split=split, cls_id=cls_id)


if __name__ == '__main__':
    dataset_split()
    all_sample_describe()
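
The split sizes quoted in the script comments follow directly from the two fractions: with 3129 annotated images, `int(3129 * 0.8)` gives 2503 trainval images, `int(2503 * 0.7)` gives 1752 train images, leaving 751 for val and 626 for test. Since `dataset_split` draws a fresh random sample on every run, the membership of each set changes between runs. If a reproducible split is wanted, one option is a seeded generator, sketched below; `split_names` and its `seed` parameter are hypothetical and not part of the original script.

```python
import random


def split_names(names, trainval_percent=0.8, train_percent=0.7, seed=0):
    """Split names into train/val/trainval/test lists, reproducibly for a given seed."""
    rng = random.Random(seed)      # private generator, leaves the global one untouched
    shuffled = sorted(names)       # sort first so the input order does not matter
    rng.shuffle(shuffled)
    tv = int(len(shuffled) * trainval_percent)
    tr = int(tv * train_percent)
    trainval = shuffled[:tv]
    return {
        'trainval': trainval,
        'train': trainval[:tr],
        'val': trainval[tr:],
        'test': shuffled[tv:],
    }
```

Each of the four lists can then be written to its txt file under ImageSets/Main in the same one-name-per-line format that `dataset_split` produces.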