Using Tesseract 4 in Qt Creator (with MSVC2015)


Tesseract is a popular OCR library, but how do you use it from Qt? And why configure it under Qt at all? You can just as well configure Tesseract in Visual Studio, but for a cross-platform software project Visual Studio is not a good choice.

Why not use deep learning for OCR? Mainly because most deep learning models are built on Python, carry heavy dependencies, and execution efficiency is a big problem.

If you want to call a Python deep learning model from C++, see my earlier post: https://blog.csdn.net/m0_37690102/article/details/106069057 (C++ calling a CRNN model implemented with Python 3.7.4 + Tensorflow 1.13.1). OCR with that CRNN model already works, but the C++/Python interaction takes so long that it hurts the user experience, so for now I have settled on C++ with Tesseract. The C++/Python deep learning interaction is still worth optimizing, and I will come back to it later.

First, create a new Qt UI project and take a look at my file structure.

Copy all the DLLs under 3rdparty\tesseract4\win64\bin\Release into the program's main directory (pick the folder that matches your build configuration).

Modify the .pro file. I also added the OpenCV libraries here; for configuring OpenCV with Qt, see my post: https://blog.csdn.net/m0_37690102/article/details/106200025

INCLUDEPATH += C:\opencv\build\include \
               C:\opencv\build\include\opencv \
               C:\opencv\build\include\opencv2 \
               3rdparty\tesseract4\qt_tesseract.pri

LIBS +=-LC:\opencv\build\x64\vc14\lib -lopencv_world310
INCLUDEPATH += $$PWD/../opencv/build/x64/vc14
DEPENDPATH += $$PWD/../opencv/build/x64/vc14

Next, add the headers needed for the calls:
#include "3rdparty/tesseract4/include/tesseract/baseapi.h"
#include "3rdparty/tesseract4/include/leptonica/allheaders.h"

I built with Qt 5.9.8 + OpenCV 3.1 + MSVC2015 in release mode. The program runs without errors, which shows that the initial configuration has succeeded.

The next step is to call the API to recognize characters.
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

Building the code below fails with the following link errors:

//========================================================================================================
//========================================================================================================
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
// Initialize tesseract-ocr with Simplified Chinese (chi_sim), using the relative "tessdata" directory
if (api->Init("tessdata", "chi_sim",tesseract::OEM_LSTM_ONLY))
{
	fprintf(stderr, "Could not initialize tesseract.\n");
	exit(1);
}
api->SetPageSegMode(tesseract::PSM_AUTO);    // set the page segmentation mode
// Open input image with leptonica library
Pix *image = pixRead("3.png");
api->SetImage(image);
//SetImage(const unsigned char* imagedata, int width, int height,int bytes_per_pixel, int bytes_per_line);
// Get OCR result; GetUTF8Text() returns a buffer that the caller must free with delete []
char *text = api->GetUTF8Text();
QString outText = QString::fromUtf8(text);
delete [] text;
// Destroy used object and release memory
api->End();
pixDestroy(&image);
QMessageBox::information(this, tr("Recognition result"),outText, QMessageBox::Yes, QMessageBox::Yes);
//========================================================================================================
//========================================================================================================
E:\QT5.9\Tools\QtCreator\bin\jom.exe -f Makefile.Release
	link /NOLOGO /DYNAMICBASE /NXCOMPAT /INCREMENTAL:NO /SUBSYSTEM:WINDOWS "/MANIFESTDEPENDENCY:type='win32' name='Microsoft.Windows.Common-Controls' version='6.0.0.0' publicKeyToken='6595b64144ccf1df' language='*' processorArchitecture='*'" /MANIFEST:embed /OUT:release\qt_tesseract.exe @C:\Users\ADMINI~1\AppData\Local\Temp\qt_tesseract.exe.14184.0.jom
mainwindow.obj : error LNK2019: unresolved external symbol "public: __cdecl tesseract::TessBaseAPI::TessBaseAPI(void)" (??0TessBaseAPI@tesseract@@QEAA@XZ) referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol "public: int __cdecl tesseract::TessBaseAPI::Init(char const *,char const *,enum tesseract::OcrEngineMode,char * *,int,class GenericVector<class STRING> const *,class GenericVector<class STRING> const *,bool)" (?Init@TessBaseAPI@tesseract@@QEAAHPEBD0W4OcrEngineMode@2@PEAPEADHPEBV?$GenericVector@VSTRING@@@@3_N@Z) referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::SetPageSegMode(enum tesseract::PageSegMode)" (?SetPageSegMode@TessBaseAPI@tesseract@@QEAAXW4PageSegMode@2@@Z) referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::SetImage(struct Pix *)" (?SetImage@TessBaseAPI@tesseract@@QEAAXPEAUPix@@@Z) referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol "public: char * __cdecl tesseract::TessBaseAPI::GetUTF8Text(void)" (?GetUTF8Text@TessBaseAPI@tesseract@@QEAAPEADXZ) referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::End(void)" (?End@TessBaseAPI@tesseract@@QEAAXXZ) referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol pixDestroy referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
mainwindow.obj : error LNK2019: unresolved external symbol pixRead referenced in function "public: __cdecl MainWindow::MainWindow(class QWidget *)" (??0MainWindow@@QEAA@PEAVQWidget@@@Z)
release\qt_tesseract.exe : fatal error LNK1120: 8 unresolved externals

Solving the problem: I spent more than an hour searching on this. There are many suggested fixes, but none of them really solved my case. Most answers online attribute the error to a required header file not being included, yet the compile step itself had no problems. (LNK2019 comes from the linker, not the compiler, so the headers were being found and the library definitions were not.) Then I realized that what I needed to add was not a header file (*.h) but the *.pri file. Importing it with include(3rdparty\tesseract4\qt_tesseract.pri) solved the problem. The modified .pro file is as follows:
INCLUDEPATH += C:\opencv\build\include \
               C:\opencv\build\include\opencv \
               C:\opencv\build\include\opencv2

LIBS +=-LC:\opencv\build\x64\vc14\lib -lopencv_world310
INCLUDEPATH += $$PWD/../opencv/build/x64/vc14
DEPENDPATH += $$PWD/../opencv/build/x64/vc14

include(3rdparty\tesseract4\qt_tesseract.pri)

That takes care of these odd errors. If you see the error below, the path to your chi_sim.traineddata is wrong.
Error opening data file tessdata//chi_sim.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'chi_sim'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Changing the call to the form below fixed it for me. As in my code, I switched to an absolute path.
if (api->Init("E://0-546//05-OCR//qt_tesseract//qt_tesseract//tessdata//", "chi_sim",tesseract::OEM_LSTM_ONLY))
{
	fprintf(stderr, "Could not initialize tesseract.\n");
	exit(1);
}
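
The error text above shows Tesseract looking for tessdata//chi_sim.traineddata, that is, the data directory plus the language name plus ".traineddata". A relative directory such as "tessdata" is resolved against the process working directory, which under Qt Creator is often the build directory rather than the source folder. As a rough sketch, the path could be checked before calling Init. The helper names traineddataPath and fileReadable are made up for this illustration and are not part of Tesseract:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: build "<dataDir>/<lang>.traineddata".
// Trailing slashes or backslashes on dataDir are tolerated.
std::string traineddataPath(std::string dataDir, const std::string& lang) {
    while (!dataDir.empty() && (dataDir.back() == '/' || dataDir.back() == '\\')) {
        dataDir.pop_back();
    }
    if (dataDir.empty()) {
        return lang + ".traineddata";  // look in the working directory
    }
    return dataDir + "/" + lang + ".traineddata";
}

// Hypothetical helper: true if the file can be opened for reading.
bool fileReadable(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return in.good();
}
```

Calling fileReadable(traineddataPath("tessdata", "chi_sim")) before api->Init(...) and printing the path when it returns false makes it obvious which location the program actually tried.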

If the build succeeds without errors but the program exits abnormally at run time, the libraries the program needs (usually DLLs) were not loaded. Find the required dynamic link libraries and put them in the right place. Adding their location to the system environment variables also works and saves some effort. I usually don't do that, though: for me the "right place" is simply the build directory generated for the program. My directory is build-qt_tesseract-Desktop_Qt_5_9_8_MSVC2015_64bit-Release, and the DLLs go into its release folder.

Testing whether it works:

Input image:

Test run:

Problems encountered:

For some images the recognition result is empty and Tesseract reports "Empty page!!". From what I read, this is caused by the image resolution, and the image needs some extra processing. I added OTSU binarization and Laplacian sharpening. You can refer to my code:
//========================================================================================================
cv::Mat gray;
cv::Mat testimage;
cv::cvtColor(im, gray, cv::COLOR_BGR2GRAY);
cv::threshold(gray, testimage, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);
//-------------------------------------------------------------
cv::namedWindow("OTSU", cv::WINDOW_NORMAL); // create a resizable window
cv::resizeWindow("OTSU", 500, 500);         // resize the window to 500*500
cv::imshow("OTSU", testimage);
//-------------------------------------------------------------
// Laplacian sharpening
//-------------------------------------------------------------
cv::Mat h1_kernel = (cv::Mat_<char>(3, 3) << -1, -1, -1, -1, 8, -1, -1, -1, -1);
cv::Mat h2_kernel = (cv::Mat_<char>(3, 3) << 0, -1, 0, -1, 5, -1, 0, -1, 0);
cv::Mat h1_result,h2_result;
cv::filter2D(testimage, h1_result, CV_32F, h1_kernel);
cv::filter2D(testimage, h2_result, CV_32F, h2_kernel);
cv::convertScaleAbs(h1_result, h1_result);
cv::convertScaleAbs(h2_result, h2_result);
//-------------------------------------------------------------
cv::namedWindow("h2_result", cv::WINDOW_NORMAL); // create a resizable window
cv::resizeWindow("h2_result", 500, 500);         // resize the window to 500*500
cv::imshow("h2_result", h2_result);
//-------------------------------------------------------------
//========================================================================================================
//========================================================================================================
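
To make the two preprocessing steps above more concrete, here is a dependency-free sketch of what they compute. It is not OpenCV's implementation: otsuThreshold follows the textbook form of Otsu's method (pick the threshold that maximizes the between-class variance), and applyKernel3x3 evaluates one interior pixel of a 3x3 filter, then takes the absolute value clamped to 0..255, similar in spirit to filter2D followed by convertScaleAbs. The function names and the kH1/kH2 constants are made up for this illustration; the kernel values are the ones from the code above.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Otsu's method on 8-bit pixels: returns the threshold t that maximizes the
// between-class variance of the classes {p <= t} and {p > t}.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    if (pixels.empty()) return 0;
    double hist[256] = {0.0};
    for (std::uint8_t p : pixels) hist[p] += 1.0;

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += i * hist[i];

    double weightLow = 0.0, sumLow = 0.0, bestVariance = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += hist[t];
        sumLow += t * hist[t];
        const double weightHigh = total - weightLow;
        if (weightLow == 0.0) continue;  // no pixels at or below t yet
        if (weightHigh == 0.0) break;    // no pixels above t any more
        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        const double variance = weightLow * weightHigh * diff * diff;
        if (variance > bestVariance) { bestVariance = variance; best = t; }
    }
    return best;
}

// Apply a 3x3 kernel at interior pixel (x, y) of a row-major 8-bit image and
// return the absolute response clamped to 0..255. Borders are not handled.
int applyKernel3x3(const std::vector<std::uint8_t>& img, int width,
                   int x, int y, const int kernel[9]) {
    int acc = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            acc += kernel[(dy + 1) * 3 + (dx + 1)] * img[(y + dy) * width + (x + dx)];
        }
    }
    return std::min(std::abs(acc), 255);
}

// The two kernels from the post, row by row.
const int kH1[9] = {-1, -1, -1, -1, 8, -1, -1, -1, -1};  // coefficients sum to 0
const int kH2[9] = { 0, -1,  0, -1, 5, -1,  0, -1,  0};  // coefficients sum to 1
```

Because the coefficients of kH1 sum to 0, a flat region maps to 0 and only edges survive, so it behaves as an edge detector. The coefficients of kH2 sum to 1, so a flat region keeps its value while edges are amplified, which is the actual sharpening kernel.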

The result:

However, some images still cannot be recognized, as shown below. This one is tricky.

Here I modified the source code and turned off the layout analysis line (it is commented out in the full listing):
api->SetPageSegMode(tesseract::PSM_AUTO);    // set the recognition mode // automatic layout analysis; psm=PSM.SINGLE_BLOCK

Testing again, there is now a recognition result, although its quality is poor. Still, getting some output at all is encouraging. The remaining work is to improve the recognition accuracy.

Refer to my code below. If you have any questions, please contact me. Thank you very much!

#include "mainwindow.h"
#include "ui_mainwindow.h"

#include <QMessageBox>
#include <QDebug>

#include "3rdparty/tesseract4/include/tesseract/baseapi.h"
#include "3rdparty/tesseract4/include/leptonica/allheaders.h"
//======================================================================
// code added 2020-1-6
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
//======================================================================
using namespace cv;

//#pragma comment(lib, "tesseract400.lib")

MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);
    //========================================================================================================
    //========================================================================================================
    // Load image
    cv::Mat im = cv::imread("E://0-546//05-OCR//qt_tesseract//qt_tesseract//img//1_1_1_1.jpg");
    if (im.empty())
    {
        std::cout << "Cannot open source image!" << std::endl;
        exit(1);
    }
    //========================================================================================================
    cv::Mat gray;
    cv::Mat testimage;
    cv::cvtColor(im, gray, cv::COLOR_BGR2GRAY);
    threshold(gray, testimage, 0, 255, THRESH_BINARY_INV | THRESH_OTSU);
    //-------------------------------------------------------------
    namedWindow("OTSU", WINDOW_NORMAL); // create a resizable window
    resizeWindow("OTSU", 500, 500);     // resize the window to 500*500
    imshow("OTSU", testimage);
    //-------------------------------------------------------------
    // Laplacian sharpening
    //-------------------------------------------------------------
    Mat h1_kernel = (Mat_<char>(3, 3) << -1, -1, -1, -1, 8, -1, -1, -1, -1);
    Mat h2_kernel = (Mat_<char>(3, 3) << 0, -1, 0, -1, 5, -1, 0, -1, 0);
    Mat h1_result,h2_result;
    filter2D(testimage, h1_result, CV_32F, h1_kernel);
    filter2D(testimage, h2_result, CV_32F, h2_kernel);
    convertScaleAbs(h1_result, h1_result);
    convertScaleAbs(h2_result, h2_result);
    //-------------------------------------------------------------
    namedWindow("h2_result", WINDOW_NORMAL); // create a resizable window
    resizeWindow("h2_result", 500, 500);     // resize the window to 500*500
    imshow("h2_result", h2_result);
    //-------------------------------------------------------------
    //========================================================================================================
    //========================================================================================================
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    // Initialize tesseract-ocr with Simplified Chinese (chi_sim) and an absolute tessdata path
    /*
         Initialize the OCR engine to use Simplified Chinese (chi_sim)
         and the LSTM OCR engine.

         There are four OCR Engine Mode (oem) available

         OEM_TESSERACT_ONLY             Legacy engine only.
         OEM_LSTM_ONLY                  Neural nets LSTM engine only.
         OEM_TESSERACT_LSTM_COMBINED    Legacy + LSTM engines.
         OEM_DEFAULT                    Default, based on what is available.
    */
    if (api->Init("E://0-546//05-OCR//qt_tesseract//qt_tesseract//tessdata//", "chi_sim",tesseract::OEM_LSTM_ONLY))//
    {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }
    //api->SetVariable("tessedit_char_whitelist", "0123456789");
    //----------------------------------------------------------------------------------------
    //Set Page segmentation mode to PSM_AUTO (3)
    //api->SetPageSegMode(tesseract::PSM_AUTO);    // set the recognition mode // automatic layout analysis; psm=PSM.SINGLE_BLOCK
    //api->SetPageSegMode(tesseract::PSM_AUTO_ONLY);    // set the recognition mode // automatic layout analysis; psm=PSM.SINGLE_BLOCK
    //----------------------------------------------------------------------------------------
    //----------------------------------------------------------------------------------------
    // Set image data
    //api->SetImage((uchar*)testimage.data, testimage.cols, testimage.rows, 1, testimage.cols);
    api->SetImage((uchar*)gray.data, gray.cols, gray.rows, 1, gray.cols);//config="--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789"
    //api->SetImage((uchar*)h2_result.data, h2_result.cols, h2_result.rows, 1, h2_result.cols);
    //api->SetImage((uchar*)h1_result.data, h1_result.cols, h1_result.rows, 1, h1_result.cols);
    //api->SetImage(im.data, im.cols, im.rows, 3, im.step);
    //----------------------------------------------------------------------------------------
    // Open input image with leptonica library
    //Pix *image = pixRead("E://0-546//05-OCR//qt_tesseract//qt_tesseract//img//55.png");
    //api->SetImage(image);
    //SetImage(const unsigned char* imagedata, int width, int height,int bytes_per_pixel, int bytes_per_line);

    //----------------------------------------------------------------------------------------
    // Run Tesseract OCR on image
    // Get OCR result
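    char *text = api->GetUTF8Text();   // the caller owns this buffer
    QString outText = QString::fromUtf8(text);
    delete [] text;                    // must be freed with delete []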
    qDebug()<< outText << endl;
    //----------------------------------------------------------------------------------------
    //----------------------------------------------------------------------------------------
    // Destroy used object and release memory
    api->End();
    delete api;
    //----------------------------------------------------------------------------------------
    //pixDestroy(&image);

    QMessageBox::information(this, tr("result"),outText, QMessageBox::Yes, QMessageBox::Yes);
    //========================================================================================================
    //========================================================================================================
}

MainWindow::~MainWindow()
{
    delete ui;
}
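
One detail in the listing above is worth spelling out: the last argument of SetImage is bytes_per_line, and the listing passes gray.cols for it. That is correct when every row occupies exactly width * bytes_per_pixel bytes. If rows are padded (OpenCV reports the real row size as Mat::step, as the commented-out call with im.step does), either pass that step or copy the pixels into a tightly packed buffer first. The following is a minimal sketch of such a copy; packRows is a hypothetical helper name, not an OpenCV or Tesseract function:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy an image whose rows are strideBytes apart into a
// buffer where each row is exactly width * bytesPerPixel bytes long.
std::vector<std::uint8_t> packRows(const std::uint8_t* data, int width, int height,
                                   int bytesPerPixel, std::size_t strideBytes) {
    std::vector<std::uint8_t> packed;
    if (data == nullptr || width <= 0 || height <= 0 || bytesPerPixel <= 0) {
        return packed;  // nothing to copy
    }
    const std::size_t rowBytes = static_cast<std::size_t>(width) * bytesPerPixel;
    packed.resize(rowBytes * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y) {
        // Skip the padding at the end of each source row.
        std::memcpy(packed.data() + y * rowBytes, data + y * strideBytes, rowBytes);
    }
    return packed;
}
```

With a packed single-channel buffer, the call would take the form api->SetImage(packed.data(), width, height, 1, width), matching the SetImage signature quoted in the listing.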

I hope this helps. If you have any questions, please comment on this post or send me a private message, and I will reply when I have time.
