Tesseract与ios的集成

最近接触了一个关于图文识别的项目,项目组决定使用Tesseract,大概查了一下但是对于ios的支持貌似不是很好,以下是为Tesseract包装过的oc版本。

 

原文链接:https://github.com/ldiqual/tesseract-ios

 

Tesseract for iOS

tesseract-ios is not actively maintained anymore. I encourage you to use gali8's Tesseract-OCR-iOS instead.

About

Tesseract-ios is an Objective-C wrapper for Tesseract OCR.

This project couldn't exist without the Ângelo Suzuki's blog post. A lot of code came from his article.

Requirements

  • iOS SDK 6.0, iOS 5.0+ (there is no support for armv6)
  • Tesseract and Leptonica libraries from the tesseract-ios-lib repo.

Installation

  • Add tesseract-ios as a group, and tessdata by reference to your project:

  • Go to your project settings, and ensure that C++ Standard Library => libstdc++:

Usage

Here is the default workflow to extract text from an image:

  • Instantiate Tesseract with data path and language
  • Set variables (character set, …)
  • Set the image to analyze
  • Start recognition
  • Get recognized text
  • Clear

Code Sample

#import "Tesseract.h"

Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
[tesseract setImage:[UIImage imageNamed:@"image_sample.jpg"]];
[tesseract recognize];

NSLog(@"%@", [tesseract recognizedText]);
[tesseract clear];

Method reference

-initWithDataPath:language:

- (id)initWithDataPath:(NSString *)dataPath language:(NSString *)language

Initialize a new Tesseract instance.

  • dataPath: a relative path from the application bundle to the .traineddata files. You can find these files from the tesseract downloads section.
  • language: language used for recognition. Ex: eng. Tesseract will search for a eng.traineddatafile in the dataPath directory.

Returns nil if instanciation failed.

-setVariableValue:forKey:

- (void)setVariableValue:(NSString *)value forKey:(NSString *)key

Set Tesseract variable key to value. See http://www.sk-spell.sk.cx/tesseract-ocr-en-variables for a complete (but not up-to-date) list.

For instance, use tessedit_char_whitelist to restrict characters to a specific set.

-setImage:

- (void)setImage:(UIImage *)image

Set the image to recognize.

-setLanguage:

- (BOOL)setLanguage:(NSString *)language

Override the language defined with -initWithDataPath:language:.

-recognize

- (BOOL)recognize

Start text recognition. You might want to launch this process in background with NSObject's -performSelectorInBackground:withObject:.

-recognizedText

- (NSString *)recognizedText

Get the text extracted from the image.

-clear

- (void) clear

Clears Tesseract object after text has been recognized from image. Preventing memory leaks.

 

 

备忘:看过一篇博文提到:为了提高效率需要对图片进行预处理(二值化、灰度、倾斜校正和图片切割),倾斜校正和图片切割可以用openCV的库处理

博文链接:http://www.cocoachina.com/bbs/read.php?tid=123463 (该博文有demo这里就不多链了)

 

比较有用的链接:

转载于:https://www.cnblogs.com/lemontee/p/4311355.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值