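I used VS2010 to build an OCR program on Tesseract 3.02. After digging through a lot of material online, I finally got it working.
First I downloaded tesseract-3.02.02-win32-lib-include-dirs. It was originally hosted on Google, but that download is usually unreachable, so I ended up spending points to get it from this site. It contains the header and library files for Tesseract 3.02, but that alone is not enough: leptonica is also required.
I used version 1.68; I have uploaded its include and lib files as the resource leptonica.
With these in place, create a VS2010 console application with the following code: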
#include "stdafx.h"
#include <stdio.h>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
using namespace tesseract;
int _tmain(int argc, _TCHAR* argv[])
{
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    // Initialize tesseract-ocr with Simplified Chinese (chi_sim), without specifying tessdata path
    if (api->Init(NULL, "chi_sim")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        delete api;
        return 1;
    }
    // Open input image with leptonica library
    Pix *image = pixRead("1.jpg");
    if (image == NULL) {
        fprintf(stderr, "Could not read image 1.jpg.\n");
        api->End();
        delete api;
        return 1;
    }
    api->SetImage(image);
    // Get OCR result (UTF-8 text; the caller must free it with delete [])
    char *outText = api->GetUTF8Text();
    printf("OCR output:\n%s", outText ? outText : "");
    // Destroy used objects and release memory
    delete [] outText;
    api->End();
    delete api;
    pixDestroy(&image);
    return 0;
}
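Because Init is called with a NULL data path, Tesseract has to locate chi_sim.traineddata on its own (for example through the TESSDATA_PREFIX environment variable), and a missing language file is a common reason for the "Could not initialize tesseract" message. The sketch below checks for the file before calling Init. The helper names trainedDataPath and fileExists are made up for illustration, and the layout it assumes (a prefix directory that contains a tessdata folder) is how I understand Tesseract 3.02 to work, so treat it as a rough check rather than the library's own lookup logic.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the expected location of a language file,
// given the directory that CONTAINS the tessdata folder.
std::string trainedDataPath(const std::string& prefix, const std::string& lang)
{
    std::string path = prefix;
    // Append a separator only if the prefix does not already end with one.
    if (!path.empty()) {
        char last = path[path.size() - 1];
        if (last != '/' && last != '\\')
            path += '/';
    }
    return path + "tessdata/" + lang + ".traineddata";
}

// Hypothetical helper: returns true if the file can be opened for reading.
bool fileExists(const std::string& path)
{
    FILE* f = std::fopen(path.c_str(), "rb");
    if (f == NULL)
        return false;
    std::fclose(f);
    return true;
}
```

In the program above, one could call fileExists(trainedDataPath(prefix, "chi_sim")) first and print the missing path if it returns false, then pass the same prefix (ending in a slash) as the first argument of Init instead of NULL.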
The configuration is as follows (I placed leptonica and tesseract3.02 in directories at the same level as the project):
Include directories: ..\..\tesseract-3.02.02-win32-lib-include-dirs\include;..\..\leptonica\include
Library directories: ..\..\tesseract-3.02.02-win32-lib-include-dirs\lib;..\..\leptonica\lib
Additional dependencies:
libtesseract302.lib
liblept168d.lib
liblept168.lib
giflib416-static-mtdll-debug.lib
libjpeg8c-static-mtdll-debug.lib
liblept168-static-mtdll-debug.lib
libpng143-static-mtdll-debug.lib
libtiff394-static-mtdll-debug.lib
zlib125-static-mtdll-debug.lib
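As an alternative to typing libraries into the project settings, MSVC can also pick them up from the source file through #pragma comment(lib, ...). The fragment below is only a sketch for the two Leptonica import libraries in the list above: it assumes the "d" suffix in liblept168d.lib marks the debug build, and it selects one of the two per configuration instead of linking both. The library directories still have to be set in the project.

```cpp
// Place near the top of the .cpp file, after #include "stdafx.h".
// Assumption: liblept168d.lib is the debug build and liblept168.lib the
// release build; the names are taken from the dependency list above.
#ifdef _MSC_VER
#pragma comment(lib, "libtesseract302.lib")
#ifdef _DEBUG
#pragma comment(lib, "liblept168d.lib")   // Debug configuration
#else
#pragma comment(lib, "liblept168.lib")    // Release configuration
#endif
#endif
```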
Build and run, and it works! Project file link
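One thing to watch for when running: GetUTF8Text returns UTF-8, while a Chinese-language Windows console usually uses a different code page, so the Chinese text printed by printf may look garbled even when recognition succeeded. A simple way to inspect the result is to write it to a file instead. The helper below is a minimal sketch; the name saveUtf8Text is made up for illustration, and writing a byte order mark is my own choice so that editors such as Notepad detect the encoding.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes UTF-8 text to a file, preceded by the UTF-8
// byte order mark (EF BB BF). Returns false on a NULL text or an I/O error.
bool saveUtf8Text(const std::string& path, const char* utf8Text)
{
    if (utf8Text == NULL)
        return false;
    // Binary mode keeps the bytes exactly as Tesseract produced them.
    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out)
        return false;
    out.write("\xEF\xBB\xBF", 3);
    out << utf8Text;
    return out.good();
}
```

In the program above this would be called right after GetUTF8Text, for example saveUtf8Text("result.txt", outText), before outText is freed.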