http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
Chars74K dataset
This dataset contains symbols used in both English and Kannada.
English uses Latin script (excluding accents) and Hindu-Arabic numerals; for simplicity we call this the "English" character set. The dataset consists of:
- 62 classes (0-9, A-Z, a-z)
- 7705 characters obtained from natural images
- 3410 hand drawn characters using a tablet PC
- 62992 synthesised characters from computer fonts
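As a quick illustration of this 62-class label space, here is a minimal sketch that builds an index-to-character mapping (digits first, then upper-case, then lower-case letters). The ordering follows the commonly used Sample001..Sample062 folder convention of the distribution; treat that ordering as an assumption, not something stated on the dataset page.

```python
import string

# Hypothetical mapping for the 62 "English" classes: 0-9, A-Z, a-z.
# Digits, then upper case, then lower case (assumed to match the
# Sample001..Sample062 folder layout of the distribution).
CHARS74K_CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase

def class_to_char(class_index: int) -> str:
    """Map a 1-based Chars74K class index to its character."""
    return CHARS74K_CLASSES[class_index - 1]

print(len(CHARS74K_CLASSES))   # 62
print(class_to_char(1))        # '0'
print(class_to_char(11))       # 'A'
print(class_to_char(37))       # 'a'
```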
http://openresearch.baidu.com/activitybulletin/618.jhtml
A text recognition code sample.
This page introduces Multi-Orientation Scene Text Detection and the USTB-SV1K dataset, and provides a multi-orientation, multi-view natural image text database.
USTB-SV1K
Text detection in natural scene images is an important prerequisite for many content-based image analysis tasks, while most current research efforts only focus on horizontal or near-horizontal scene text. In our paper, we first present a unified distance metric learning framework for adaptive hierarchical clustering, which can simultaneously learn similarity weights (to adaptively combine different feature similarities) and the clustering threshold (to automatically determine the number of clusters). Then, we propose an effective multi-orientation scene text detection system, which constructs text candidates by grouping characters based on this adaptive clustering. Our text candidate construction method consists of several sequential coarse-to-fine grouping steps: morphology-based grouping via single-link clustering, orientation-based grouping via divisive hierarchical clustering, and projection-based grouping also via divisive clustering. The effectiveness of our proposed system is evaluated on several public scene text databases, e.g., the ICDAR Robust Reading Competition datasets (2011 and 2013) and MSRA-TD500. Specifically, on the multi-orientation text dataset MSRA-TD500, the f-measure of our system is 70%, much better than the 60% of a recent state-of-the-art method.
We also construct and release a practical challenging multi-orientation scene text dataset (USTB-SV1K), which is available at http://prir.ustb.edu.cn/TexStar/MOMV-text-detection/.
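To make the coarse-to-fine grouping idea concrete, below is a minimal sketch of one such step: single-link hierarchical clustering of character candidates under a weighted combination of feature distances, cut at a threshold to form text candidates. The feature values, weights `w`, and threshold `t` are placeholders; in the paper both the weights and the threshold are learned by the distance metric learning framework, not fixed by hand.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical character candidates: each row holds simple per-character
# features (x-centre, y-centre, height, mean grey level).
chars = np.array([
    [10.0, 20.0, 18.0, 0.40],
    [30.0, 21.0, 17.0, 0.42],
    [52.0, 19.0, 19.0, 0.39],
    [200.0, 180.0, 40.0, 0.90],
    [240.0, 182.0, 41.0, 0.88],
])

# Assumed similarity weights and clustering threshold; in the paper these
# are learned, here they are fixed placeholders for illustration.
w = np.array([1.0, 1.0, 0.5, 2.0])
t = 60.0

# Weighted Euclidean distance as a stand-in for the learned combined metric.
dists = pdist(chars * w)

# Morphology-based grouping via single-link agglomerative clustering,
# cut at threshold t to form text-candidate groups.
Z = linkage(dists, method="single")
labels = fcluster(Z, t=t, criterion="distance")
print(labels)  # e.g. [1 1 1 2 2]: two text candidates
```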
Dataset description
Each image is annotated with a list of words labeled with bounding boxes, each specified by the coordinates of its top-left point, width, height, and inclination angle, together with the ground-truth word, similar to MSRA-TD500. We collect 1000 street view (patch) images (500 for training and 500 for testing) from 6 USA cities, i.e., New York, Boston, Los Angeles, Washington DC, San Francisco, and Seattle. The set from each city includes about 160 to 180 images, about half of which are for training and the rest for testing. There are three main challenges for detection and recognition on this dataset (se
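To illustrate the annotation scheme described above, the hypothetical parser below assumes one word per line with comma-separated fields in the order top-left x, top-left y, width, height, inclination angle, word. The actual ground-truth file layout of USTB-SV1K is not specified here, so both the field order and the separator are assumptions.

```python
from dataclasses import dataclass

@dataclass
class WordBox:
    x: float       # x of the top-left point
    y: float       # y of the top-left point
    width: float
    height: float
    angle: float   # inclination angle of the box
    word: str      # ground-truth transcription

def parse_annotation_line(line: str) -> WordBox:
    """Parse one annotated word, assuming comma-separated fields
    'x, y, width, height, angle, word' (hypothetical format)."""
    x, y, w, h, angle, word = line.strip().split(",", 5)
    return WordBox(float(x), float(y), float(w), float(h), float(angle), word.strip())

# Example with made-up values:
print(parse_annotation_line("120, 45, 200, 60, -0.15, STARBUCKS"))
```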