Coordinate conversion for the multi-oriented text detection dataset MSRA-TD500 and the multi-oriented text recognition dataset HUST-TR400

1. Dataset overview
Multi-oriented text detection dataset (MSRA-TD500)
C. Yao, X. Bai, W. Liu, Y. Ma, Z. Tu. Detecting texts of arbitrary orientations in natural images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR’12), Providence, Rhode Island, 2012. (MSRA-TD 500 Dataset)
Download: http://pages.ucsd.edu/~ztu/publication/MSRA-TD500.zip
Format of the ground truth files:
[Figure: format of the MSRA-TD500 ground truth files]
Multi-oriented text recognition dataset (HUST-TR400)
C. Yao, X. Bai, W. Liu. A Unified Framework for Multi-Oriented Text Detection and Recognition. IEEE Transactions on Image Processing (TIP), 23(11): 4737 - 4749, 2014.
Download: http://mclab.eic.hust.edu.cn/UpLoadFiles/dataset/HUST-TR400.zip
Format of the ground truth files:
[Figure: format of the HUST-TR400 ground truth files]
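2. Conversion
The ground truth files (gt files) above describe each text region as a rotated rectangle. The goal is to batch-convert these rotated-rectangle coordinates into txt files of polygon (four-corner) coordinates, in a form similar to ICDAR2017.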
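
The rotation behind this conversion can be written out explicitly. As a sketch, assume (x_0, y_0) is the top-left corner of the unrotated rectangle, w and h are its width and height, and θ is the rotation angle in radians (these symbols are introduced here only for illustration). Each corner (x, y) of the unrotated rectangle is then rotated about the rectangle center (c_x, c_y):

```latex
\begin{aligned}
c_x &= x_0 + \tfrac{w}{2}, \qquad c_y = y_0 + \tfrac{h}{2},\\
x'  &= c_x + (x - c_x)\cos\theta - (y - c_y)\sin\theta,\\
y'  &= c_y + (x - c_x)\sin\theta + (y - c_y)\cos\theta .
\end{aligned}
```

Applying this to the four corners (x_0, y_0), (x_0 + w, y_0), (x_0, y_0 + h) and (x_0 + w, y_0 + h) gives the four polygon points.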
The batch conversion code is as follows:

# coding: utf-8

import math
import os


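def rotate(angle, x, y):
    """
    Rotate a point about the origin.
    :param angle:   rotation angle in radians
    :param x:       x coordinate of the point
    :param y:       y coordinate of the point
    :return:        the rotated (x, y)
    """
    rotatex = math.cos(angle) * x - math.sin(angle) * y
    rotatey = math.cos(angle) * y + math.sin(angle) * x
    return rotatex, rotatey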


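def xy_rotate(theta, x, y, centerx, centery):
    """
    Rotate a point about a given center.
    :param theta:     rotation angle in radians
    :param x:         x coordinate of the point
    :param y:         y coordinate of the point
    :param centerx:   x coordinate of the rotation center
    :param centery:   y coordinate of the rotation center
    :return:          the rotated (x, y)
    """
    # Shift so the center is the origin, rotate, then shift back.
    r_x, r_y = rotate(theta, x - centerx, y - centery)
    return centerx + r_x, centery + r_y


def rec_rotate(x, y, width, height, theta):
    """
    Convert a rotated rectangle to QUAD format.
    :param x:        x of the top-left corner of the unrotated rectangle
    :param y:        y of the top-left corner of the unrotated rectangle
    :param width:    rectangle width
    :param height:   rectangle height
    :param theta:    rotation angle in radians
    :return:         eight values: x, y of the four rotated corners
    """
    centerx = x + width / 2
    centery = y + height / 2

    x1, y1 = xy_rotate(theta, x, y, centerx, centery)                    # top-left
    x2, y2 = xy_rotate(theta, x + width, y, centerx, centery)            # top-right
    x3, y3 = xy_rotate(theta, x, y + height, centerx, centery)           # bottom-left
    x4, y4 = xy_rotate(theta, x + width, y + height, centerx, centery)   # bottom-right

    # Output order: top-left, bottom-left, bottom-right, top-right.
    return x1, y1, x3, y3, x4, y4, x2, y2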


dst_dir = 'D:/迅雷下载/MSRA-TD500/MSRA-TD500/train_gt/'
save_dir = "D:/迅雷下载/MSRA-TD500/MSRA-TD500/train_txt/"
os.makedirs(save_dir, exist_ok=True)  # make sure the output directory exists
for fileName in os.listdir(dst_dir):
    if not fileName.endswith(".gt"):
        continue  # only gt files are converted
    fname = os.path.join(dst_dir, fileName)
    print(fname)
    savestr = ''
    with open(fname, 'r') as f:
        for line in f:
            line = line.split()
            if not line:
                continue  # skip blank lines
            line = list(map(float, line))          # MSRA-TD500 gt
            # line = list(map(float, line[0:6]))   # HUST-TR400 gt
            x, y = line[2], line[3]
            w, h = line[4], line[5]
            # The last field is the rotation angle in radians.
            pointsrotate = rec_rotate(x, y, w, h, line[-1])
            # Eight integer coordinates followed by the label "text".
            savestr += ','.join(str(int(p)) for p in pointsrotate) + ',text\n'
    savename = os.path.join(save_dir, os.path.splitext(fileName)[0] + '.txt')
    with open(savename, 'w') as savef:
        savef.write(savestr)
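
Before running the conversion over a whole directory, it can help to check it on a single line. The following is a minimal, self-contained sketch of the same steps. It assumes, as the batch script does, that fields 2 to 5 of a gt line are x, y, width, height and that the last field is the angle in radians. The helper names `rotate_about` and `gt_line_to_quad` are hypothetical, and the sample line is made up for illustration rather than taken from the dataset.

```python
import math


def rotate_about(theta, x, y, cx, cy):
    """Rotate point (x, y) by theta radians around the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    return (cx + math.cos(theta) * dx - math.sin(theta) * dy,
            cy + math.sin(theta) * dx + math.cos(theta) * dy)


def gt_line_to_quad(line):
    """Convert one gt line to eight integer coordinates (hypothetical helper).

    Assumes fields 2..5 are x, y, width, height and the last field is
    the rotation angle in radians, as the batch script reads them.
    """
    fields = list(map(float, line.split()))
    x, y, w, h = fields[2:6]
    theta = fields[-1]
    cx, cy = x + w / 2, y + h / 2
    # Same corner order as the batch script:
    # top-left, bottom-left, bottom-right, top-right.
    corners = [(x, y), (x, y + h), (x + w, y + h), (x + w, y)]
    quad = []
    for px, py in corners:
        rx, ry = rotate_about(theta, px, py, cx, cy)
        quad.extend([int(rx), int(ry)])
    return quad


sample = "0 0 100 200 40 20 0"  # made-up line with zero rotation
print(",".join(map(str, gt_line_to_quad(sample))) + ",text")
# prints 100,200,100,220,140,220,140,200,text
```

With a zero angle the corners come back unchanged, which makes the corner order easy to verify by eye before trying lines with a non-zero angle.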

The converted txt files have the following format:
[Figure: example of a converted txt file]
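
To use the converted files downstream, they have to be read back into polygons. A minimal sketch follows, assuming each line holds eight comma-separated integers followed by a label, which is what the conversion script writes. The function name `load_quads` and the demo line are hypothetical.

```python
import io


def load_quads(lines):
    """Parse lines of the form x1,y1,x2,y2,x3,y3,x4,y4,label."""
    quads = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(",")
        coords = list(map(int, parts[:8]))
        # Rejoin the rest in case the label itself contains commas.
        label = ",".join(parts[8:])
        points = list(zip(coords[0::2], coords[1::2]))
        quads.append((points, label))
    return quads


# An in-memory stand-in for an opened txt file (made-up content).
demo = io.StringIO("100,200,100,220,140,220,140,200,text\n")
print(load_quads(demo))
# prints [([(100, 200), (100, 220), (140, 220), (140, 200)], 'text')]
```

Any iterable of lines works here, so an opened file object can be passed in place of the `io.StringIO` stand-in.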
