Converting ICDAR2015 Annotations to the ICDAR2013 Format

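ICDAR2015 annotations describe each text box with eight coordinates (the x and y of its four corners). Training here needs the top-left and bottom-right corner format instead, so each quadrilateral is converted to its axis-aligned bounding box. Boxes whose transcription is ### are dropped, and because I am only training a detector, every remaining box is labeled with the single class text.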
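As a minimal sketch of the per-line conversion described above, the helper below turns one ICDAR2015 annotation line into a bounding box. The function name `quad_to_bbox` and the sample line are illustrative assumptions, not part of the original scripts. The coordinates are converted to `int` before taking the minimum and maximum, because comparing them as strings is lexicographic (for example, `min('98', '102')` gives `'102'`).

```python
def quad_to_bbox(line):
    """Convert one ICDAR2015 line to (x1, y1, x2, y2); return None for ### boxes.

    Hypothetical helper for illustration only.
    """
    # Split at most 8 times so commas inside the transcription are preserved.
    parts = line.strip().lstrip('\ufeff').split(',', 8)
    coords = [int(v) for v in parts[:8]]
    if '###' in parts[8]:
        return None  # "do not care" region, skipped during conversion
    xs = coords[0::2]  # x of the four corners
    ys = coords[1::2]  # y of the four corners
    return min(xs), min(ys), max(xs), max(ys)


# Sample line made up for this example.
print(quad_to_bbox('377,117,463,117,465,130,378,130,Genaxis Theatre'))
# prints (377, 117, 465, 130)
```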

Test set annotations

The Test_GT folder must be created before running the script.

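# The ICDAR2015 test set contains 500 images (gt_img_1 .. gt_img_500).
for i in range(1, 501):
    txts = 'Challenge4_Test_Task1_GT/gt_img_' + str(i) + '.txt'
    w_txts = 'Test_GT/gt_img_' + str(i) + '.txt'
    # utf-8-sig strips the BOM at the start of each ground-truth file
    with open(txts, encoding='utf-8-sig') as f:
        lines = f.readlines()

    with open(w_txts, 'w') as txt_file:
        for line in lines:
            # split at most 8 times so commas inside the transcription stay in numbers[8]
            numbers = line.strip().split(',', 8)
            if len(numbers) < 9:
                continue  # skip blank or malformed lines
            # convert to int: min/max on strings would compare lexicographically
            xs = [int(n) for n in numbers[0:8:2]]
            ys = [int(n) for n in numbers[1:8:2]]
            x1, x2 = min(xs), max(xs)
            y1, y2 = min(ys), max(ys)

            # drop "do not care" regions, whose transcription is ###
            if '###' not in numbers[8]:
                txt_file.write(f'{x1} {y1} {x2} {y2} text\n')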

Training set annotations

The Training_GT folder must be created before running the script.

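# The ICDAR2015 training set contains 1000 images (gt_img_1 .. gt_img_1000),
# so the loop must run to 1000; range(1, 501) would convert only half of them.
for i in range(1, 1001):
    txts = 'ch4_training_localization_transcription_gt/gt_img_' + str(i) + '.txt'
    w_txts = 'Training_GT/gt_img_' + str(i) + '.txt'
    # utf-8-sig strips the BOM at the start of each ground-truth file
    with open(txts, encoding='utf-8-sig') as f:
        lines = f.readlines()

    with open(w_txts, 'w') as txt_file:
        for line in lines:
            # split at most 8 times so commas inside the transcription stay in numbers[8]
            numbers = line.strip().split(',', 8)
            if len(numbers) < 9:
                continue  # skip blank or malformed lines
            # convert to int: min/max on strings would compare lexicographically
            xs = [int(n) for n in numbers[0:8:2]]
            ys = [int(n) for n in numbers[1:8:2]]
            x1, x2 = min(xs), max(xs)
            y1, y2 = min(ys), max(ys)

            # drop "do not care" regions, whose transcription is ###
            if '###' not in numbers[8]:
                txt_file.write(f'{x1} {y1} {x2} {y2} text\n')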
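The two scripts differ only in folder names and image count, so they could be folded into one function. The sketch below is one possible way to do that, not the original method: `convert_dir` and its `label` parameter are hypothetical names, and it assumes the ground-truth files all match `gt_img_*.txt`. It discovers the files with a glob instead of a hard-coded range and creates the output folder itself.

```python
from pathlib import Path


def convert_dir(src_dir, dst_dir, label='text'):
    """Convert every gt_img_*.txt in src_dir; return the number of files written.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # no need to create the folder by hand
    count = 0
    for gt_path in sorted(src.glob('gt_img_*.txt')):
        out_lines = []
        # utf-8-sig removes the BOM if the file starts with one
        for line in gt_path.read_text(encoding='utf-8-sig').splitlines():
            parts = line.strip().split(',', 8)
            if len(parts) < 9 or '###' in parts[8]:
                continue  # skip blank/malformed lines and ### boxes
            xs = [int(v) for v in parts[0:8:2]]
            ys = [int(v) for v in parts[1:8:2]]
            out_lines.append(f'{min(xs)} {min(ys)} {max(xs)} {max(ys)} {label}')
        out_text = ''.join(row + '\n' for row in out_lines)
        (dst / gt_path.name).write_text(out_text, encoding='utf-8')
        count += 1
    return count


# Example calls, using the folder names from the scripts above:
# convert_dir('Challenge4_Test_Task1_GT', 'Test_GT')
# convert_dir('ch4_training_localization_transcription_gt', 'Training_GT')
```

Checking the returned count against the expected number of images (500 for the test set, 1000 for the training set) is a quick way to confirm that no file was missed.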