Text Detection and Recognition Resources

PeaceInMind

Last modified: 2022-01-21 15:59:25

First published: 2016-05-12 20:38:45

Article link: https://blog.csdn.net/PeaceInMind/article/details/51387367

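This post was originally based mainly on [1, 2], with items I collected myself added later. Since everyone keeps updating their lists, the differences are small. Entries marked in blue are the most recent additions.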

Surveys

[2015-PAMI-Overview]Text Detection and Recognition in Imagery: A Survey[paper]

[2014-Front.Comput.Sci-Overview]Scene Text Detection and Recognition: Recent Advances and Future Trends[paper]

[2018-arXiv]Scene Text Detection and Recognition: The Deep Learning Era[paper][github]

Not Yet Read or Summarized

[2021-CVPR]Primitive Representation Learning for Scene Text Recognition

[2021-CVPR]Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter

[2021-CVPR]Sequence-to-Sequence Contrastive Learning for Text Recognition

[2021-CVPR]Self-attention based Text Knowledge Mining for Text Detection

[2021-PAMI-Detection & Recognition]ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting

[2021-CVPR-Recognition-Oral]Dictionary-guided Scene Text Recognition

[2020-ACMMM]TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection

[2020-CVPR-Recognition]On Vocabulary Reliance in Scene Text Recognition

[2020-CVPR-Recognition]SCATTER: Selective Context Attentional Scene Text Recognizer

[2020-CVPR-Recognition]Towards Accurate Scene Text Recognition With Semantic Reasoning Networks

[2020-CVPR-Recognition]SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

[2020-CVPR-Detection]Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection

[2020-CVPR-Detection]ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection

[2020-ECCV]Sequential Deformation for Accurate Scene Text Detection

[2020-ECCV]Adaptive Text Recognition through Visual Matching

[2020-ECCV]RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

[2020-ECCV]AutoSTR: Efficient Backbone Search for Scene Text Recognition

[2020-ECCV]AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

[2020-ECCV]Character Region Attention For Text Spotting

[2019-ICCV]Geometry Normalization Networks for Accurate Scene Text Detection

[2019-ICCV]Symmetry-Constrained Rectification Network for Scene Text Recognition

[2019-ICCV-Detection & Recognition]TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting

[2019-ICCV-Detection & Recognition]Convolutional Character Networks

[2019-ICCV-Detection & Recognition]Towards Unconstrained End-to-End Text Spotting

[2018-CVPR]An End-to-End TextSpotter with Explicit Alignment and Attention

[2018-CVPR]FOTS: Fast Oriented Text Spotting with a Unified Network

[2018-CVPR]Geometry-Aware Scene Text Detection with Instance Transformation Network

[2018-CVPR]Rotation-Sensitive Regression for Oriented Scene Text Detection

[2018-ECCV]TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Scene Text Detection

[2021-CVPR-Detection-Rectangular]MOST: A Multi-Oriented Scene Text Detector with Localization Refinement

[2021-CVPR-Detection-Arbitrary shape]Fourier Contour Embedding for Arbitrary-Shaped Text Detection

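This paper really deepened my understanding of the Fourier transform. The Zhihu post "MMOCR 更新！FCENet 了解一下？！" has GIFs that vividly illustrate the contour-fitting process, and it made quite an impression on me.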
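To make the contour-fitting idea concrete, here is a minimal NumPy sketch of approximating a closed contour with a few low-frequency Fourier coefficients. It is an illustration of the general technique only, not the FCENet implementation; the function names `fourier_fit` and `fourier_reconstruct` and the parameter `k` are hypothetical names chosen for this example.

```python
import numpy as np

def fourier_fit(contour_xy, k):
    """Return the 2k+1 lowest-frequency Fourier coefficients of a closed contour.

    contour_xy: (n, 2) array of points sampled in order along the contour.
    """
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]   # treat each point as x + iy
    n = len(z)
    coeffs = np.fft.fft(z) / n                     # normalized DFT coefficients
    freqs = np.fft.fftfreq(n, d=1.0 / n)           # integer frequencies 0, 1, ..., -1
    keep = np.abs(freqs) <= k                      # keep frequencies -k..k only
    return freqs[keep], coeffs[keep]

def fourier_reconstruct(freqs, coeffs, num_points):
    """Sample the truncated Fourier series at num_points positions on [0, 1)."""
    t = np.arange(num_points) / num_points
    # z(t) = sum_f c_f * exp(2*pi*i*f*t)
    z = (coeffs[None, :] * np.exp(2j * np.pi * t[:, None] * freqs[None, :])).sum(axis=1)
    return np.stack([z.real, z.imag], axis=1)
```

Increasing `k` adds higher-frequency terms, so the reconstructed curve follows corners and fine detail more closely, which is what the GIFs in the post visualize.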

[2020-ECCV-Detection & Recognition]Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting

[2020-CVPR] ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network[paper][code]

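The main idea is to replace the box in box regression with a Bezier curve, which is quite elegant, and the code is open source. I would still like to see the regression accuracy, though, because I personally suspect it is not very high.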
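As a rough illustration of what regressing a curve instead of a box means, the sketch below evaluates a Bezier curve from its control points using Bernstein polynomials. This is a generic textbook formulation, not code from the ABCNet repository; `bezier_points` and its parameters are hypothetical names for this example.

```python
from math import comb

import numpy as np

def bezier_points(control_points, num_samples=20):
    """Sample points on a Bezier curve defined by (n+1) 2-D control points."""
    ctrl = np.asarray(control_points, dtype=float)   # shape (n+1, 2)
    n = len(ctrl) - 1                                 # curve degree
    t = np.linspace(0.0, 1.0, num_samples)
    # Bernstein basis: B_i(t) = C(n, i) * t^i * (1 - t)^(n - i)
    basis = np.stack(
        [comb(n, i) * t**i * (1.0 - t) ** (n - i) for i in range(n + 1)],
        axis=1,
    )
    return basis @ ctrl                               # shape (num_samples, 2)

# Example: a cubic curve (4 control points) bending upward in the middle.
curve = bezier_points([(0, 0), (1, 2), (3, 2), (4, 0)], num_samples=5)
```

With this parameterization a network only has to predict a handful of control-point coordinates per text boundary, and the curve always starts at the first control point and ends at the last one.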

[2020-AAAI] Real-time Scene Text Detection with Differentiable Binarization[paper][code]

The problem this paper tackles is quite novel: how to learn the binarization threshold.
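The core trick can be sketched in a few lines: a hard threshold is a step function with no useful gradient, so it is replaced by a steep sigmoid of the difference between the probability map and a per-pixel threshold map. The sketch below uses NumPy purely to show the arithmetic; the function names are hypothetical, and the steepness `k=50` is the amplifying factor I recall from the paper, so treat it as an assumption.

```python
import numpy as np

def hard_binarize(prob_map, threshold=0.3):
    """Standard binarization: a step function, not differentiable at the threshold."""
    return (prob_map >= threshold).astype(float)

def differentiable_binarize(prob_map, thresh_map, k=50.0):
    """Approximate binarization: B = 1 / (1 + exp(-k * (P - T))).

    prob_map (P) and thresh_map (T) are arrays of the same shape. Because the
    expression is smooth in both P and T, the threshold map can be learned
    together with the probability map by gradient descent.
    """
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))

# Example: pixels well above the threshold go to ~1, pixels well below go to ~0.
P = np.array([0.9, 0.1, 0.3])
T = np.array([0.3, 0.3, 0.3])
B = differentiable_binarize(P, T)
```

A large `k` keeps the output close to a true binary map while still passing gradients near the decision boundary.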

[2019-IJCAI] Omnidirectional Scene Text Detection with Sequential-free Box Discretization[paper]

[201905-arxiv] Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation[paper]

[2019-ICCV] Geometry normalization networks for accurate scene text detection

[2019-CVPR] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes[paper]

[2019-CVPR]Character Region Awareness for Text Detection[paper]

[2019-CVPR]Learning Shape-Aware Embedding for Scene Text Detection [paper]

[201903-arxiv] Pyramid Mask Text Detector [paper]

[201903-arxiv] Curve Text Detection with Local Segmentation Network and Curve Connection [paper]

[2019-PR] Curved scene text detection via transverse and longitudinal sequence connection [paper]

[201812-arxiv]TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [paper]

[201812-ACCV] TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network [paper]

[2019-AAAI] Scene Text Detection with Supervised Pyramid Context Network [paper]

[201811-arxiv] Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [paper]

[201811-arxiv] A Novel Integrated Framework for Learning both Text Detection and Recognition[paper]

[2018-arxiv] Correlation Propagation Networks for Scene Text Detection[paper]

[2018-arxiv] Sliding Line Point Regression for Shape Robust Scene Text Detection [paper]

[2018-ECCV] Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping[paper]

[2018-CVPR] Learning Markov Clustering Networks for Scene Text Detection [paper]

[2018-CVPR] Geometry-Aware Scene Text Detection with Instance Transformation Network [paper]

[2018-ECCV]Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes[paper]

[2018-ECCV]TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes [paper]

[2018-arxiv]Shape Robust Text Detection with Progressive Scale Expansion Network[paper]

[2018-IJCAI] IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection [paper]

[2018-arxiv] An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches [paper]

[2018-CVPR]Rotation-Sensitive Regression for Oriented Scene Text Detection[paper]

[2018-CVPR] Single Shot Text Spotter with Explicit Alignment and Attention[paper]

[2018-CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [paper]

[2018-arxiv]TextBoxes++: A Single-Shot Oriented Scene Text Detector[paper][code]

[2018-arxiv]FOTS: Fast Oriented Text Spotting with a Unified Network[paper]

[2018-AAAI]PixelLink: Detecting Scene Text via Instance Segmentation[paper][code]

[2017-ICCV]Self-organized text detection with minimal post-processing [paper]

[2018-AAAI] Feature Enhancement Network: A Refined Scene Text Detector [paper]

[2017-CVPR]Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild [paper]

[2017-ICCV]Towards end-to-end text spotting with convolutional recurrent neural networks [paper]

[2017-arXiv]Fused Text Segmentation Networks for Multi-oriented Scene Text Detection[paper]

[2017-arXiv]WeText: Scene Text Detection under Weak Supervision[paper]

[2017-ICCV]Single Shot Text Detector with Regional Attention[paper]

[2017-ICCV]WordSup: Exploiting Word Annotations for Character based Text Detection[paper]

[2017-arXiv]R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection[paper]

[2017-CVPR]EAST: An Efficient and Accurate Scene Text Detector [paper][code]

[2017-arXiv]Cascaded Segmentation-Detection Networks for Word-Level Text Spotting[paper]

[2017-arXiv]Deep Direct Regression for Multi-Oriented Scene Text Detection[paper]

[2017-CVPR]Detecting oriented text in natural images by linking segments [paper][code]

[2017-CVPR]Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection[paper]

[2017-arXiv]Arbitrary-Oriented Scene Text Detection via Rotation Proposals [paper]

[2017-AAAI]TextBoxes: A Fast Text Detector with a Single Deep Neural Network[paper][code]

[2016-arXiv]Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [paper]

[2016-arXiv]DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [paper] [data]

[2017-PR]TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper] [code]

[2016-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [paper]

[2016-CVPR] Canny Text Detector: Fast and Robust Scene Text Localization Algorithm [paper]

[2016-CVPR]Synthetic Data for Text Localisation in Natural Images [paper] [data][code]

[2016-ECCV]Detecting Text in Natural Image with Connectionist Text Proposal Network[paper][demo][code]

[2016-TIP]Text-Attentional Convolutional Neural Networks for Scene Text Detection [paper]

[2016-IJDAR]TextCatcher: a method to detect curved and challenging text in natural scenes[paper]

[2016-CVPR]Multi-oriented text detection with fully convolutional networks [paper]

[2015-TPAMI]Real-time Lexicon-free Scene Text Localization and Recognition[paper]

[2015-CVPR]Symmetry-Based Text Line Detection in Natural Scenes[paper][code]

[2015-ICCV]FASText: Efficient unconstrained scene text detector[paper][code]

[2015-D.Phil Thesis] Deep Learning for Text Spotting [paper]

[2015-ICDAR]Object Proposals for Text Extraction in the Wild [paper] [code]

[2014-ECCV] Deep Features for Text Spotting [paper] [code] [model] [GitXiv]

[2014-TPAMI] Word Spotting and Recognition with Embedded Attributes [paper] [homepage] [code]

[2014-TPAMI]Robust Text Detection in Natural Scene Images[paper]

[2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [paper]

[2013-ICCV] Photo OCR: Reading Text in Uncontrolled Conditions[paper]

[2012-CVPR]Real-time scene text localization and recognition[paper][code]

[2010-CVPR]Detecting Text in Natural Scenes with Stroke Width Transform [paper] [code]

Scene Text Recognition

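[2021-CVPR-Oral-Recognition] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Motivated by how humans read, the authors make three assumptions: (1) the language model and the vision model are each self-contained and can work independently, although good interaction between the two improves recognition; (2) reasoning about text resembles a cloze test, so the context is bidirectional; (3) reading is iterative: in difficult scenes, people correct their recognition result over several passes.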

[201905-arxiv] 2D Attentional Irregular Scene Text Recognizer[paper]

[201812-arxiv] Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks [paper]

[2019-CVPR] ESIR: End-to-end Scene Text Recognition via Iterative Rectification [paper]

[2019-AAAI]Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition [paper]

Single Shot Scene Text Retrieval

[2018-arxiv]Scene Text Recognition from Two-Dimensional Perspective [paper]

[2018-CVPR] AON: Towards Arbitrarily-Oriented Text Recognition [paper]

[2018-PAMI] ASTER: An Attentional Scene Text Recognizer with Flexible Rectification [paper]

[2018-arXiv]Edit Probability for Scene Text Recognition [paper]

[2018-AAAI]SEE: Towards Semi-Supervised End-to-End Scene Text Recognition[paper][code]

[2017-IJCAI] Learning to Read Irregular Text with Attention Mechanisms[paper]

[2017-ICCV] Focusing Attention: Towards Accurate Text Recognition in Natural Images[paper]

[2017-arXiv]AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition [paper]

[2017-arXiv]STN-OCR: A single Neural Network for Text Detection and Text Recognition[paper][code]

[2017-arXiv]Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis[paper]

[2017-AAAI-Web images]Detection and Recognition of Text Embedded in Online Images via Neural Context Models[paper][project]

[2017-arXiv-Document recognition] Full-Page Text Recognition: Learning Where to Start and When to Stop[paper]

[2016-AAAI]Reading Scene Text in Deep Convolutional Sequences [paper]

[2016-IJCV]Reading Text in the Wild with Convolutional Neural Networks [paper] [demo] [homepage]

[2016-CVPR]Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [paper]

[2016-CVPR] Robust Scene Text Recognition with Automatic Rectification [paper]

[2016-NIPS] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data[paper]

[2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [paper] [code]

[2015-ICDAR]Automatic Script Identification in the Wild[paper]

[2015-ICLR] Deep structured output learning for unconstrained text recognition [paper]

[2014-NIPS]Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [paper homepage] [model]

[2014-TIP] A Unified Framework for Multi-Oriented Text Detection and Recognition [paper]

[2012-ICPR]End-to-End Text Recognition with Convolutional Neural Networks [paper] [code] [SVHN Dataset]

Datasets

Chinese Text in the Wild 2018

32,285 high resolution images, 1,018,402 character instances, 3,850 character categories, 6 kinds of attributes

Total-Text 2017

1,555 images, 11,459 text instances, includes curved text

COCO-Text (Computer Vision Group, Cornell) 2016

63,686 images, 173,589 text instances, 3 fine-grained text attributes.

Task: text localization and recognition

COCO-Text API

Synthetic Data for Text Localisation in Natural Image (VGG)2016

800k images

8 million synthetic word instances

download

Synthetic Word Dataset (Oxford, VGG) 2014

9 million images covering 90k English words

Task: text recognition, segmentation

download

IIIT 5K-Words 2012

5,000 images from scene texts and born-digital sources (2k training and 3k testing images)

Each image is a cropped word image of scene text with case-insensitive labels

Task: text recognition

download

StanfordSynth (Stanford, AI Group) 2012

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

download

MSRA Text Detection 500 Database (MSRA-TD500) 2012

500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)

Chinese, English, or a mixture of both

Task: text detection

Street View Text (SVT) 2010

350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)

Only word-level bounding boxes are provided, with case-insensitive labels

Task: text localization

KAIST Scene_Text Database 2010

3,000 images of indoor and outdoor scenes containing text

Korean, English (Number), and Mixed (Korean + English + Number)

Task: text localization, segmentation and recognition

Chars74k 2009

Over 74K images from natural images, as well as a set of synthetically generated characters

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

ICDAR Benchmark Datasets

Dataset	Description	Competition Paper
ICDAR 2015	1000 training images and 500 testing images	paper
ICDAR 2013	229 training images and 233 testing images	paper
ICDAR 2011	229 training images and 255 testing images	paper
ICDAR 2005	1001 training images and 489 testing images	paper
ICDAR 2003	181 training images and 251 testing images(word level and character level)	paper

Open-Source Libraries

Tesseract: C++-based tools for document analysis and OCR, supporting 60+ languages [code]

Ocropy: Python-based tools for document analysis and OCR [code]

CLSTM: A small C++ implementation of LSTM networks, focused on OCR [code]

Convolutional Recurrent Neural Network, Torch7-based [code]

Attention-OCR: Visual Attention based OCR [code]

Umaru: An OCR-system based on torch using the technique of LSTM/GRU-RNN, CTC and referred to the works of rnnlib and clstm [code]

Miscellaneous

DeepFont: Identify Your Font from An Image[paper]

Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks[paper]

End-to-End Interpretation of the French Street Name Signs Dataset [paper] [code]

Extracting text from an image using Ocropus [blog]

Handwriting Recognition

[2016-arXiv]Drawing and Recognizing Chinese Characters with Recurrent Neural Network [paper]

Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition [paper]

Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition [paper]

High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps [paper] [github]

DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel) [code]

How to recognize the handwritten digit dataset with a convolutional neural network (CNN)? [blog][blog1][blog2] [blog4] [blog5] [code6]

Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention [paper]

MLPaint: the Real-Time Handwritten Digit Recognizer [blog][code][demo]

caffe-ocr: OCR with caffe deep learning framework [code] (single-character classifier)

License Plate and Related Recognition

Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs [paper]

Numberplate recognition with Tensorflow [blog] [code]

end-to-end-for-plate-recognition[code]

Applying OCR Technology for Receipt Recognition[blog][mirror]

CAPTCHA Breaking

[2017-arXiv]Using Synthetic Data to Train Neural Networks is Model-Based Reasoning[paper]

Using deep learning to break a Captcha system [blog] [code]

Breaking reddit captcha with 96% accuracy [blog] [code]

I'm not a human: Breaking the Google reCAPTCHA [paper]

NeuralNet CAPTCHA Cracker [slides] [code] [demo]

Recurrent neural networks for decoding CAPTCHAS [blog] [code] [demo]

Reading irctc captchas with 95% accuracy using deep learning [code]

End-to-end OCR: a CNN-based implementation [blog]

I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs [paper]

References

[1]OCR - handong1587

[2]GitHub - chongyangtao/Awesome-Scene-Text-Recognition: A curated list of resources dedicated to scene text localization and recognition
