Converting the ICDAR2015 dataset to PASCAL VOC format with Python

Contents

1. ICDAR2015 dataset

2. Folder preparation

3. Python implementation

4. Results after conversion

5. References:


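1. ICDAR2015 dataset

(1) Download

I searched for the dataset for a long time and ended up downloading it from CSDN:

https://download.csdn.net/download/moonshapedpool/10645292

I had no CSDN points, so I paid 2 yuan on Taobao to have someone download it for me. Unpacking the archive gives three folders.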

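(2) Contents and format

Training images: ch4_training_images

Training annotations: ch4_training_localization_transcription_gt

Test images: ch4_test_images

ICDAR2015 does not ship annotations for the test set, but it provides a web interface for evaluating test results. Therefore only the training set is converted here.

Annotation format: x1,y1,x2,y2,x3,y3,x4,y4,text

Here x1,y1 is the top-left corner, x2,y2 the top-right, x3,y3 the bottom-right, and x4,y4 the bottom-left. A text value of '###' marks text that is hard to recognize.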
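The annotation format above can be parsed with a few lines of Python. The following is a minimal sketch, assuming Python 3; the function name parse_gt_line is hypothetical and not part of the script in section 3. It strips a leading byte-order mark and splits on the first eight commas only, so that commas inside the text field are preserved.

```python
def parse_gt_line(line):
    """Parse one ICDAR2015 ground-truth line into (points, text, difficult)."""
    # Drop a leading byte-order mark and the trailing newline, if present.
    line = line.lstrip('\ufeff').rstrip('\r\n')
    # Split on the first 8 commas only, so commas inside the text survive.
    fields = line.split(',', 8)
    if len(fields) < 9:
        raise ValueError('expected 8 coordinates and a text field: %r' % line)
    coords = [int(v) for v in fields[:8]]
    # Group into (x, y) pairs: top-left, top-right, bottom-right, bottom-left.
    points = list(zip(coords[0::2], coords[1::2]))
    text = fields[8]
    # '###' marks text that is hard to recognize.
    difficult = 1 if text == '###' else 0
    return points, text, difficult
```

Opening the annotation file with encoding='utf-8-sig' removes the byte-order mark at read time, in which case the lstrip call is simply a no-op.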

 

2. Folder preparation

Create a folder named VOC2007 and, inside it, the folders Annotations, ImageSets, and JPEGImages. Then create a Main folder inside ImageSets. The resulting layout is:

VOC2007

-VOC2007/Annotations

-VOC2007/ImageSets

-VOC2007/ImageSets/Main

-VOC2007/JPEGImages
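The same layout can also be created from Python instead of by hand. This is a small sketch; the helper name make_voc_dirs and its root argument are assumptions introduced for illustration.

```python
import os

def make_voc_dirs(root):
    """Create the VOC2007 folder layout under `root` and return its path."""
    base = os.path.join(root, 'VOC2007')
    for sub in ('Annotations', os.path.join('ImageSets', 'Main'), 'JPEGImages'):
        path = os.path.join(base, sub)
        # os.makedirs also creates missing parents such as ImageSets.
        if not os.path.isdir(path):
            os.makedirs(path)
    return base
```

Calling make_voc_dirs with the parent directory of your choice returns the path that the script in section 3 expects as base_dir. Running it a second time leaves existing folders untouched.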
 

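3. Python implementation

(1) Create a new project in PyCharm based on Python 2, add a new Python file, and install the dependencies through File->Settings (the script imports PIL, cv2, and numpy):

Note: I ran into a problem here. My machine runs 64-bit Windows 10, and installing PIL directly fails. The fix I found through a Baidu search is to install Pillow-PIL instead.

Reference blog post:

https://blog.csdn.net/weixin_39837709/article/details/79829428

(2) Add the following code to the new Python file: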

#!/usr/bin/python
# coding:utf-8

import os
import glob
from PIL import Image
import cv2
import numpy as np

# target dir
base_dir = "E:/RuiJie/py-faster-rcnn/VOC2007"

target_img_dir = base_dir + "/" + "JPEGImages/"
target_ann_dir = base_dir + "/" + "Annotations/"
target_set_dir = base_dir + "/" + "ImageSets/"

# source train dir
train_img_dir = "E:/RuiJie/py-faster-rcnn/ICDAR2015data/ch4_training_images/"
train_txt_dir = "E:/RuiJie/py-faster-rcnn/ICDAR2015data/ch4_training_localization_transcription_gt/"

test_img_dir = "E:/RuiJie/py-faster-rcnn/ICDAR2015data/ch4_test_images"

# rename and move img to target_img_dir
# train img

for file in os.listdir(train_img_dir):
    os.rename(os.path.join(train_img_dir, file),
              os.path.join(target_img_dir, "ICDAR2015_Train_" + os.path.basename(file)))

for file in os.listdir(test_img_dir):
    os.rename(os.path.join(test_img_dir, file),
              os.path.join(target_img_dir, "ICDAR2015_Test_" + os.path.basename(file)))

gt_list = []
img_list = []

for file_name in os.listdir(target_img_dir):
    img_list.append(file_name)

for idx in range(len(img_list)):
    img_name = target_img_dir + img_list[idx]
    # Map e.g. ICDAR2015_Train_img_12.jpg to gt_img_12.txt.
    # Caution: a renamed test image (ICDAR2015_Test_img_12.jpg) resolves to the
    # same training file, because the test set ships without annotations.
    gt_name = train_txt_dir + 'gt_img_' + img_list[idx].split('.')[0].split('_')[3] + '.txt'

    # Read the image size; the annotation file is parsed line by line below.
    im = Image.open(img_name)
    imgwidth, imgheight = im.size

    # write in xml file
    xml_file = open((target_ann_dir + img_list[idx].split('.')[0] + '.xml'), 'w')
    xml_file.write('<annotation>\n')
    xml_file.write('    <folder>VOC2007</folder>\n')
    xml_file.write('    <filename>' + img_list[idx] + '</filename>\n')
    xml_file.write('    <size>\n')
    xml_file.write('        <width>' + str(imgwidth) + '</width>\n')
    xml_file.write('        <height>' + str(imgheight) + '</height>\n')
    xml_file.write('        <depth>3</depth>\n')
    xml_file.write('    </size>\n')

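    difficult = 0
    for gt_line in open(gt_name):
        gt_ind = gt_line.split(',')
        # A valid line has 8 coordinates followed by the text field.
        if len(gt_ind) > 8:
            # Keep only the digits of the first field, which drops a leading
            # UTF-8 BOM. join() makes this work on both Python 2 and Python 3.
            gt_ind[0] = ''.join(filter(str.isdigit, gt_ind[0]))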
            pt1 = (int(gt_ind[0]), int(gt_ind[1]))
            pt2 = (int(gt_ind[2]), int(gt_ind[3]))
            pt3 = (int(gt_ind[4]), int(gt_ind[5]))
            pt4 = (int(gt_ind[6]), int(gt_ind[7]))
            dtxt = gt_ind[8]
            if "###" in dtxt:
                difficult = 1
            else:
                difficult = 0

            edge1 = np.sqrt((pt1[0] - pt2[0]) * (pt1[0] - pt2[0]) + (pt1[1] - pt2[1]) * (pt1[1] - pt2[1]))
            edge2 = np.sqrt((pt2[0] - pt3[0]) * (pt2[0] - pt3[0]) + (pt2[1] - pt3[1]) * (pt2[1] - pt3[1]))

            angle = 0

            if edge1 > edge2:

                width = edge1
                height = edge2
                if pt1[0] - pt2[0] != 0:
                    angle = -np.arctan(float(pt1[1] - pt2[1]) / float(pt1[0] - pt2[0])) / 3.1415926 * 180
                else:
                    angle = 90.0
            elif edge2 >= edge1:
                width = edge2
                height = edge1
                # print pt2[0], pt3[0]
                if pt2[0] - pt3[0] != 0:
                    angle = -np.arctan(float(pt2[1] - pt3[1]) / float(pt2[0] - pt3[0])) / 3.1415926 * 180
                else:
                    angle = 90.0
            if angle < -45.0:
                angle = angle + 180

            x_ctr = float(pt1[0] + pt3[0]) / 2  # pt1[0] + np.abs(float(pt1[0] - pt3[0])) / 2
            y_ctr = float(pt1[1] + pt3[1]) / 2  # pt1[1] + np.abs(float(pt1[1] - pt3[1])) / 2

            # write the region of text on xml file
            xml_file.write('    <object>\n')
            xml_file.write('        <name>text</name>\n')
            xml_file.write('        <pose>Unspecified</pose>\n')
            xml_file.write('        <truncated>0</truncated>\n')
            xml_file.write('        <difficult>' + str(difficult) + '</difficult>\n')
            xml_file.write('        <bndbox>\n')
            xml_file.write('            <x>' + str(x_ctr) + '</x>\n')
            xml_file.write('            <y>' + str(y_ctr) + '</y>\n')
            xml_file.write('            <w>' + str(width) + '</w>\n')
            xml_file.write('            <h>' + str(height) + '</h>\n')
            xml_file.write('            <theta>' + str(angle) + '</theta>\n')
            xml_file.write('        </bndbox>\n')
            xml_file.write('    </object>\n')

    xml_file.write('</annotation>')
    # Close the file so the XML is flushed to disk before the next image.
    xml_file.close()

# write info into target_set_dir
img_lists = glob.glob(target_ann_dir + '/*.xml')
img_names = []
for item in img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    img_names.append(temp1)

train_fd = open(os.path.join(target_set_dir, "Main", "trainval.txt"), 'w')
for item in img_names:
    train_fd.write(str(item) + '\n')
train_fd.close()

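Note: adjust the paths.

base_dir is the path on your machine to the VOC2007 folder created earlier.
train_img_dir, train_txt_dir, and test_img_dir are the paths on your machine to the three ICDAR2015 folders.
Because the script uses os.rename, the images are moved out of the source folders rather than copied.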
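The geometric core of the script, turning four corner points into a center, a width and height, and an angle, can be isolated into one function. The sketch below mirrors the script's logic under the assumption that the points arrive in the ICDAR2015 order; the name quad_to_rotated_box is hypothetical, and math.degrees replaces the hand-written 3.1415926 constant.

```python
import math

def quad_to_rotated_box(pt1, pt2, pt3, pt4):
    """Return (x_ctr, y_ctr, width, height, angle_deg) for a 4-point box.

    Points are in ICDAR2015 order: top-left, top-right, bottom-right,
    bottom-left. pt4 is accepted for symmetry but, as in the script, unused.
    """
    edge1 = math.hypot(pt1[0] - pt2[0], pt1[1] - pt2[1])
    edge2 = math.hypot(pt2[0] - pt3[0], pt2[1] - pt3[1])
    # The longer edge defines both the width and the orientation.
    if edge1 > edge2:
        width, height = edge1, edge2
        dx, dy = pt1[0] - pt2[0], pt1[1] - pt2[1]
    else:
        width, height = edge2, edge1
        dx, dy = pt2[0] - pt3[0], pt2[1] - pt3[1]
    # Image y grows downward, hence the sign flip; vertical edges map to 90.
    angle = -math.degrees(math.atan(float(dy) / dx)) if dx != 0 else 90.0
    if angle < -45.0:
        angle += 180.0
    # The center is the midpoint of the pt1-pt3 diagonal.
    x_ctr = (pt1[0] + pt3[0]) / 2.0
    y_ctr = (pt1[1] + pt3[1]) / 2.0
    return x_ctr, y_ctr, width, height, angle
```

For an axis-aligned box that is wider than it is tall, the angle comes out as 0; for one that is taller than it is wide, the long edge is vertical and the angle is 90.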

 

4. Results after conversion

VOC2007\Annotations contains 1500 xml files.

VOC2007\ImageSets\Main contains one trainval.txt file.

E:\RuiJie\py-faster-rcnn\VOC2007\JPEGImages contains 1500 images.
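One caveat about these counts: the script moves both training and test images into JPEGImages and looks up a ground-truth file by image number for every one of them, so a renamed test image such as ICDAR2015_Test_img_12.jpg is paired with gt_img_12.txt from the training set. If only genuinely annotated images should be converted and listed in trainval.txt, a filter along the following lines could be applied first. This is a sketch, and both helper names are hypothetical.

```python
import os

def gt_name_for(image_name):
    """Return the ground-truth file name for a renamed training image, else None."""
    stem = os.path.splitext(image_name)[0]
    prefix = 'ICDAR2015_Train_'
    if not stem.startswith(prefix):
        # Renamed test images (ICDAR2015_Test_...) have no annotations.
        return None
    # 'ICDAR2015_Train_img_12' -> 'gt_img_12.txt'
    return 'gt_' + stem[len(prefix):] + '.txt'

def training_images(image_names):
    """Keep only the image names that have a ground-truth file name."""
    return [n for n in image_names if gt_name_for(n) is not None]
```

Applying training_images to the list built from os.listdir(target_img_dir) would restrict the conversion to the training images, which also lowers the number of xml files accordingly.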

Format after conversion
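The structure of each generated file can be illustrated by rebuilding it with xml.etree.ElementTree from the standard library. Note that the script stores a rotated box (x, y, w, h, theta) inside bndbox, whereas standard PASCAL VOC annotations use xmin, ymin, xmax, ymax there. The sketch below assumes Python 3; the function name build_annotation and all sample values are illustrative and not taken from the dataset.

```python
import xml.etree.ElementTree as ET

def build_annotation(filename, width, height, boxes):
    """Build an <annotation> element; boxes are (x, y, w, h, theta, difficult)."""
    ann = ET.Element('annotation')
    ET.SubElement(ann, 'folder').text = 'VOC2007'
    ET.SubElement(ann, 'filename').text = filename
    size = ET.SubElement(ann, 'size')
    for tag, value in (('width', width), ('height', height), ('depth', 3)):
        ET.SubElement(size, tag).text = str(value)
    for x, y, w, h, theta, difficult in boxes:
        obj = ET.SubElement(ann, 'object')
        ET.SubElement(obj, 'name').text = 'text'
        ET.SubElement(obj, 'pose').text = 'Unspecified'
        ET.SubElement(obj, 'truncated').text = '0'
        ET.SubElement(obj, 'difficult').text = str(difficult)
        bndbox = ET.SubElement(obj, 'bndbox')
        for tag, value in (('x', x), ('y', y), ('w', w), ('h', h), ('theta', theta)):
            ET.SubElement(bndbox, tag).text = str(value)
    return ann

# Illustrative values only, not taken from the dataset.
ann = build_annotation('ICDAR2015_Train_img_1.jpg', 1280, 720,
                       [(421.0, 123.5, 86.0, 13.0, 0.0, 0)])
xml_text = ET.tostring(ann, encoding='unicode')
```

Building the tree this way, rather than concatenating strings, also escapes characters such as & or < automatically should they ever appear in a field.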

 

5. References:

https://blog.csdn.net/u013250416/article/details/78821877

 

 
