vedastr: A scene text recognition toolbox based on PyTorch

Introduction

Vedastr is an open source scene text recognition toolbox based on PyTorch. It is designed to be flexible in order to support rapid implementation and evaluation of scene text recognition tasks.

Features

Modular design

We decompose the scene text recognition framework into different components, so one can easily construct a customized scene text recognition framework by combining different modules (see the illustrative config sketch at the end of this section).

Flexibility

Vedastr makes it easy to change the components within a module.

Module expansibility

It is easy to integrate a new module into the vedastr project.

Support of multiple frameworks

The toolbox supports several popular scene text recognition frameworks, e.g., CRNN, TPS-ResNet-BiLSTM-Attention, Transformer, etc.

Good performance

We re-implement the best model in deep-text-recognition-benchmark and obtain better average accuracy. In addition, we implement a simple baseline (ResNet-FC) whose performance is acceptable.
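To make the modular design concrete, here is an illustrative sketch of how a model could be assembled from interchangeable modules in a Python config. The field names and type strings below are assumptions for illustration only; the project's actual schema is defined in the files under configs/ (e.g., configs/tps_resnet_bilstm_attn.py).

# Illustrative sketch only: field names and type strings are assumptions,
# not vedastr's actual config schema; see configs/ for the real format.
model = dict(
    transformation=dict(type='TPS'),                # optional rectification module
    backbone=dict(type='ResNet'),                   # visual feature extractor
    sequence=dict(type='BiLSTM', hidden_size=256),  # sequence modeling
    head=dict(type='Attention', num_classes=37),    # prediction head
)
# Swapping a component, e.g. the prediction head, is then a one-line change:
model['head'] = dict(type='CTC', num_classes=37)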

License

This project is released under the Apache 2.0 license.

Benchmark and model zoo

Note:

We test our models on IIIT5K_3000, SVT, IC03_867, IC13_1015, IC15_2077, SVTP, and CUTE80. The training data we used is MJSynth (MJ) and SynthText (ST). You can find the datasets below.

| MODEL | CASE SENSITIVE | IIIT5k_3000 | SVT | IC03_867 | IC13_1015 | IC15_2077 | SVTP | CUTE80 | AVERAGE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TPS-ResNet-BiLSTM-Attention | False | 87.33 | 87.79 | 95.04 | 92.61 | 74.45 | 81.09 | 74.91 | 84.95 |
| ResNet-FC | False | 85.03 | 86.40 | 94.00 | 91.03 | 70.29 | 77.67 | 71.43 | 82.38 |
| Small-SATRN | False | 88.87 | 88.87 | 96.19 | 93.99 | 79.08 | 84.81 | 84.67 | 87.55 |

AVERAGE: Average accuracy over all test datasets.

TPS: Spatial transformer network.

Small-SATRN: On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention; its training phase is case sensitive while its testing phase is case insensitive.

CASE SENSITIVE: If true, the output is case sensitive and contains common characters. If false, the output is not case sensitive and contains only numbers and letters.
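For the case-insensitive setting, a common convention in scene text recognition benchmarks is a 36-symbol charset of digits plus lowercase letters. The snippet below only illustrates that convention; vedastr's actual character set is defined in its config files, so treat it as an assumption.

import string

# Hypothetical illustration of a case-insensitive charset (digits + lowercase
# letters, 36 symbols). vedastr's real charset lives in its configs.
charset = string.digits + string.ascii_lowercase
print(charset)       # 0123456789abcdefghijklmnopqrstuvwxyz
print(len(charset))  # 36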

Installation

Requirements

Linux

Python 3.6+

PyTorch 1.2.0 or higher

CUDA 9.0 or higher

We have tested the following versions of OS and software:

OS: Ubuntu 16.04.6 LTS

CUDA: 9.0

Python 3.6.9

Install vedastr

a. Create a conda virtual environment and activate it.

conda create -n vedastr python=3.6 -y

conda activate vedastr

b. Install PyTorch and torchvision following the official instructions,

e.g.,

conda install pytorch torchvision -c pytorch
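Optionally, you can run a quick sanity check of the install (a minimal sketch, not part of the official instructions):

# Verify that PyTorch and torchvision are importable and CUDA is visible.
import torch
import torchvision
print(torch.__version__, torchvision.__version__)  # torch should be 1.2.0 or higher
print(torch.cuda.is_available())                   # True if CUDA is usable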

c. Clone the vedastr repository.

git clone https://github.com/Media-Smart/vedastr.git

cd vedastr

vedastr_root=${PWD}

d. Install dependencies.

pip install -r requirements.txt

Prepare data

a. Download the LMDB data from deep-text-recognition-benchmark, which contains training data, validation data and evaluation data.

b. Create the data directory as follows:

cd ${vedastr_root}

mkdir ${vedastr_root}/data

c. Put the downloaded LMDB data into this data directory; the structure of the data directory will then look as follows:

data

└── data_lmdb_release

├── evaluation

├── training

│   ├── MJ

│   │   ├── MJ_test

│   │   ├── MJ_train

│   │   └── MJ_valid

│   └── ST

└── validation
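If you want to confirm the data is readable before training, the sketch below opens one of the LMDB files and prints a single sample. It assumes the key layout used by deep-text-recognition-benchmark ('num-samples', 'image-%09d', 'label-%09d') and requires the lmdb and Pillow packages; treat the path and key names as assumptions and adjust them if your data differs.

import io
import lmdb
from PIL import Image

# Open one of the downloaded LMDB files read-only and peek at a single sample.
# Key names follow the deep-text-recognition-benchmark convention (assumption).
env = lmdb.open('data/data_lmdb_release/training/MJ/MJ_train',
                readonly=True, lock=False, readahead=False, meminit=False)
with env.begin(write=False) as txn:
    num_samples = int(txn.get(b'num-samples'))
    label = txn.get(b'label-%09d' % 1).decode('utf-8')  # indices start at 1
    image = Image.open(io.BytesIO(txn.get(b'image-%09d' % 1))).convert('RGB')
print(num_samples, label, image.size)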

Train

a. Config

Modify the configuration as needed in the config file, e.g., configs/tps_resnet_bilstm_attn.py.

b. Run

python tools/trainval.py configs/tps_resnet_bilstm_attn.py

Snapshots and logs will be generated at vedastr/workdir.

Test

a. Config

Modify the configuration as needed in the config file, e.g., configs/tps_resnet_bilstm_attn.py.

b. Run

python tools/test.py configs/tps_resnet_bilstm_attn.py path_to_tps_resnet_bilstm_attn_weights

Demo

a. Run

python tools/demo.py config-path weight-path img-path

Contact

This repository is currently maintained by Jun Sun (@ChaseMonsterAway), Hongxiang Cai (@hxcai), and Yichao Xiong (@mileistone).

Credits
