MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition

MORAN is a network with a rectification mechanism for general scene text recognition. The paper (accepted to appear in Pattern Recognition, 2019) is on arXiv, and the final version is now available.

Recent Update

2019.03.21 Fixed a bug in Fractional Pickup.

Support Python 3.

Improvements of MORAN v2:

A more stable rectification network for one-stage training

The VGG backbone is replaced with ResNet

A bidirectional decoder is used (a trick borrowed from ASTER)

| Version | IIIT5K | SVT | IC03 | IC13 | SVT-P | CUTE80 | IC15 (1811) | IC15 (2077) |
|---|---|---|---|---|---|---|---|---|
| MORAN v1 (curriculum training)* | 91.2 | 88.3 | 95.0 | 92.4 | 76.1 | 77.4 | 74.7 | 68.8 |
| MORAN v2 (one-stage training) | 93.4 | 88.3 | 94.2 | 93.2 | 79.7 | 81.9 | 77.8 | 73.9 |

*The results of v1 were reported in our paper. If this project is helpful for your research, please cite our Pattern Recognition paper.
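To make the comparison between the two versions easier to read, the small sketch below computes the per-benchmark difference (v2 minus v1) from the numbers in the results table. The names `BENCHMARKS`, `V1`, `V2` and `score_deltas` are illustrative only and are not part of the repository.

```python
# Scores copied from the results table (v1: curriculum training, v2: one-stage training).
BENCHMARKS = ["IIIT5K", "SVT", "IC03", "IC13", "SVT-P", "CUTE80", "IC15 (1811)", "IC15 (2077)"]
V1 = [91.2, 88.3, 95.0, 92.4, 76.1, 77.4, 74.7, 68.8]
V2 = [93.4, 88.3, 94.2, 93.2, 79.7, 81.9, 77.8, 73.9]


def score_deltas(old, new):
    """Return new - old for each benchmark, rounded to one decimal place."""
    return {name: round(n - o, 1) for name, o, n in zip(BENCHMARKS, old, new)}


if __name__ == "__main__":
    for name, delta in score_deltas(V1, V2).items():
        print("{:<12} {:+.1f}".format(name, delta))
```

On these numbers v2 is higher on six of the eight benchmarks, unchanged on SVT, and 0.8 lower on IC03.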

Requirements

(Welcome to develop MORAN together.)

We recommend using Anaconda to manage your libraries.

Python 2.7 or Python 3.6 (Python 3 is faster than Python 2)

PyTorch 0.3.* (higher versions cause slow training; please refer to issue #8)

Alternatively, use pip to install the libraries. (The torch version installed this way may differ from the Anaconda one. Check carefully and fix any warnings during the training stage if necessary.)

pip install -r requirements.txt
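Because the installed versions matter here, a quick check after installation may save some debugging. The following is a minimal sketch, assuming only the version constraints listed above (Python 2.7 or 3.6, PyTorch 0.3.*). The helper names are hypothetical and not part of the repository.

```python
import sys


def parse_version(text):
    """Turn '0.3.1' or '0.3.1.post2' into a tuple of leading integers, e.g. (0, 3, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def python_ok(version_info=sys.version_info):
    """True if the interpreter is one of the versions listed in the requirements."""
    return tuple(version_info[:2]) in {(2, 7), (3, 6)}


def torch_ok(version_text):
    """True if the version string belongs to the PyTorch 0.3.* series."""
    return parse_version(version_text)[:2] == (0, 3)


if __name__ == "__main__":
    print("Python matches requirements: %s" % python_ok())
    try:
        import torch  # imported here so the helpers work without PyTorch installed
        print("PyTorch %s matches requirements: %s" % (torch.__version__, torch_ok(torch.__version__)))
    except ImportError:
        print("PyTorch is not installed")
```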

Data Preparation

Please convert your own dataset to LMDB format using the tool (run it in Python 2.7) provided by @Baoguang Shi.

You can also download the training (NIPS 2014, CVPR 2016) and testing datasets prepared by us.

The raw images of the testing datasets can be found here.
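For readers unfamiliar with LMDB, the sketch below shows one way image and label pairs could be laid out as key/value entries. The key names (`image-000000001`, `label-000000001`, `num-samples`) are an assumption modelled on the convention commonly used by CRNN-style tools, so verify them against the conversion tool before relying on them. `build_cache` and `write_cache` are hypothetical helpers, and writing requires the third-party `lmdb` package.

```python
def build_cache(samples):
    """Map a list of (image_bytes, label) pairs to LMDB key/value byte strings.

    The key names are an assumption; check them against the conversion tool.
    """
    cache = {}
    for index, (image_bytes, label) in enumerate(samples, start=1):
        cache["image-{:09d}".format(index).encode()] = image_bytes
        cache["label-{:09d}".format(index).encode()] = label.encode("utf-8")
    cache[b"num-samples"] = str(len(samples)).encode()
    return cache


def write_cache(output_path, cache, map_size=1 << 30):
    """Write the cache to an LMDB environment (needs `pip install lmdb`)."""
    import lmdb  # imported lazily so build_cache works without the package

    env = lmdb.open(output_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

A call such as `write_cache("my_dataset_lmdb", build_cache(samples))` would then produce a folder that can be passed as a dataset path, provided the key layout matches what the training code expects.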

Training and Testing

Modify the paths to the dataset folders in train_MORAN.sh:

--train_nips path_to_dataset \

--train_cvpr path_to_dataset \

--valroot path_to_dataset \

Then start training (manually decrease the learning rate for your task):

sh train_MORAN.sh
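A wrong dataset path is an easy mistake to make when editing the script. As a hedged illustration, the sketch below parses the three flags shown above and reports which folders do not exist. It is not the repository's own argument parsing, and `build_parser` and `missing_paths` are hypothetical names.

```python
import argparse
import os


def build_parser():
    # Only the three dataset flags named in train_MORAN.sh are modelled here.
    parser = argparse.ArgumentParser(description="Check dataset paths before training")
    parser.add_argument("--train_nips", required=True, help="folder of the first training set")
    parser.add_argument("--train_cvpr", required=True, help="folder of the second training set")
    parser.add_argument("--valroot", required=True, help="folder of the validation set")
    return parser


def missing_paths(args):
    """Return the flag names whose folders do not exist on disk."""
    return [name for name in ("train_nips", "train_cvpr", "valroot")
            if not os.path.isdir(getattr(args, name))]
```

For example, `missing_paths(build_parser().parse_args(["--train_nips", "a", "--train_cvpr", "b", "--valroot", "c"]))` returns the names of the flags that point to non-existent folders, or an empty list when all three exist.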

Training should take less than 20 s per 100 iterations on a 1080Ti.
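To compare your own setup against this figure, you could time a fixed number of iterations. The helper below is a generic sketch, not code from the repository: `step` stands for whatever performs one training iteration in your setup, and for GPU work it should include any synchronisation needed for the timing to be meaningful.

```python
import time


def seconds_per_100_iterations(step, iterations=100):
    """Call `step()` `iterations` times and scale the elapsed time to 100 iterations."""
    start = time.time()
    for _ in range(iterations):
        step()
    elapsed = time.time() - start
    return elapsed * 100.0 / iterations


if __name__ == "__main__":
    # Placeholder workload; replace the lambda with one real training iteration.
    duration = seconds_per_100_iterations(lambda: sum(range(1000)))
    print("%.3f s per 100 iterations" % duration)
```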

Demo

Download the model parameter file demo.pth.

Put it in the root folder, then run demo.py for more visualizations.

python demo.py
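If the demo fails at startup, a missing parameter file is a likely cause. The following minimal sketch checks that demo.pth is present in the given folder before running the demo. `find_checkpoint` is a hypothetical helper, not part of demo.py.

```python
import os


def find_checkpoint(root=".", filename="demo.pth"):
    """Return the absolute path of the checkpoint in `root`, or None if it is missing."""
    path = os.path.join(root, filename)
    return os.path.abspath(path) if os.path.isfile(path) else None


if __name__ == "__main__":
    checkpoint = find_checkpoint()
    if checkpoint is None:
        print("demo.pth not found in the root folder; download it before running demo.py")
    else:
        print("Found %s" % checkpoint)
```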

Citation

@article{cluo2019moran,

author = {Canjie Luo and Lianwen Jin and Zenghui Sun},

title = {MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition},

journal = {Pattern Recognition},

volume = {90},

pages = {109--118},

year = {2019},

publisher = {Elsevier}

}

Acknowledgment

The repo is developed based on @Jieru Mei's crnn.pytorch and @marvis' ocr_attention. Thanks for your contributions.

Attention

The project is free for academic research purposes only.
