Building an OCR software demo with Qt + PaddleOCR

一路前行，幸运相伴

Last modified: 2022-10-10 17:51:15

Category: Applications and Tools. Tags: deep learning, paddlepaddle, ocr, qt

First published: 2021-07-01 14:54:58

Article link: https://blog.csdn.net/ShareProgress/article/details/118385614

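Contents

Building an OCR software demo with Qt + PaddleOCR
Demo
1 Setting up the environment
2 Adding the screenshot feature to the project
- 2.1 Creating the screenshot widget and connecting its signal
- 2.2 Converting QImage to cv::Mat
3 Integrating PaddleOCR into the project
4 Notes on how the released source is organized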

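Demo

I prefer to work in Visual Studio, but a Qt Creator project is easier to distribute when publishing the source of a small program. I therefore created both a VS project and a Qt Creator project; the source code is identical and only the environment configuration differs.

Source code and release build download
Environment: Qt_5_13_2_MSVC2017_64bit-Release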

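1 Setting up the environment

1.1 VS project environment

Copy the cpp_infer folder into the OCR_Demo project folder and rename it PaddleOCR.
Copy the screenshot class into the project and name it ScreenWidget.
Create filters (and so on) and add these files to the project.
Collect the dependencies: create a 3rdparty folder in the project directory containing
1. the prebuilt Paddle inference library folder paddle_inference_install_dir
2. OpenCV
Configure the project:
[Properties] → [General] → [Additional Include Directories]: add the header directories of the dependencies above.

[Properties] → [General] → [Character Set] → [Use Multi-Byte Character Set]

Under C/C++ → Command Line, add "/utf-8".

Under C/C++ → Code Generation, set Runtime Library to Multi-threaded (/MT).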

1.2 Qt Creator project environment

1.2.1 Source code layout

│  main.cpp
│  mainwindow.cpp
│  mainwindow.h
│  mainwindow.ui
│  OCR_Demo.pro
│  
├─PaddleOCR
│  ├─include
│  │      clipper.h
│  │      config.h
│  │      ocr_cls.h
│  │      ocr_det.h
│  │      ocr_rec.h
│  │      postprocess_op.h
│  │      preprocess_op.h
│  │      utility.h
│  │      
│  └─src
│          clipper.cpp
│          config.cpp
│          ocr_cls.cpp
│          ocr_det.cpp
│          ocr_rec.cpp
│          postprocess_op.cpp
│          preprocess_op.cpp
│          utility.cpp
│          
└─ScreenWidget
        screen.cpp
        screen.h
        screenwidget.cpp
        screenwidget.h

1.2.2 Dependency layout

├─3rdparty
│  │  config.txt
│  │  ppocr_keys_v1.txt
│  │  
│  ├─opencv
│  │  │  opencv_world440.dll
│  │  │  opencv_world440.lib
│  │  │  
│  │  └─include
│  │      └─opencv2
│  │                      
│  └─paddle_inference_install_dir
│      │  CMakeCache.txt
│      │  version.txt
│      │  
│      ├─paddle
│      │  ├─include
│      │  └─lib
│      │          paddle_inference.dll
│      │          paddle_inference.lib
│      │          
│      └─third_party
│          ├─install
│          │  ├─cryptopp
│          │  │  ├─include
│          │  │  └─lib
│          │  │          cryptopp-static.lib
│          │  │          
│          │  ├─gflags
│          │  │  ├─include
│          │  │  │  └─gflags
│          │  │  │          
│          │  │  └─lib
│          │  │          gflags_static.lib
│          │  │          
│          │  ├─glog
│          │  │  ├─include
│          │  │  │  └─glog
│          │  │  │          
│          │  │  └─lib
│          │  │          glog.lib
│          │  │          
│          │  ├─mkldnn
│          │  │  ├─include
│          │  │  └─lib
│          │  │          mkldnn.dll
│          │  │          mkldnn.lib
│          │  │          
│          │  ├─mklml
│          │  │  ├─include
│          │  │  └─lib
│          │  │          libiomp5md.dll
│          │  │          libiomp5md.lib
│          │  │          mklml.dll
│          │  │          mklml.lib
│          │  │          
│          │  ├─protobuf
│          │  │  ├─include
│          │  │  │  └─google
│          │  │  │      └─protobuf
│          │  │  └─lib
│          │  │          libprotobuf.lib
│          │  │          
│          │  └─xxhash
│          │      ├─include
│          │      │      xxhash.h
│          │      │      
│          │      └─lib
│          │              xxhash.lib
│          │              
│          └─threadpool
│                  ThreadPool.h
│

1.2.3 Model files

├─Model
│  ├─ch_ppocr_mobile_v2.0_cls_infer
│  │      inference.pdiparams
│  │      inference.pdiparams.info
│  │      inference.pdmodel
│  │      
│  ├─ch_ppocr_server_v2.0_det_infer
│  │      inference.pdiparams
│  │      inference.pdiparams.info
│  │      inference.pdmodel
│  │      
│  └─ch_ppocr_server_v2.0_rec_infer
│          inference.pdiparams
│          inference.pdiparams.info
│          inference.pdmodel

1.2.4 Setting the runtime library mode for the dependencies in the .pro file

CONFIG(debug, debug|release) {
    QMAKE_CXXFLAGS_DEBUG += /MTd
}

CONFIG(release, debug|release) {
    QMAKE_CXXFLAGS_RELEASE += /MT
}

1.2.5 Adding the header paths and .lib files in the .pro file

INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\paddle\fluid\inference
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\paddle\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\protobuf\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\glog\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\gflags\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\xxhash\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\zlib\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\boost
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\eigen3
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\mklml\include
INCLUDEPATH += $$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\mkldnn\include
INCLUDEPATH += $$PWD\..\3rdparty\opencv\include


LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\paddle\lib -lpaddle_inference
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\mklml\lib -lmklml
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\mklml\lib -llibiomp5md
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\mkldnn\lib -lmkldnn
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\glog\lib -lglog
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\gflags\lib -lgflags_static
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\protobuf\lib -llibprotobuf
LIBS += -L$$PWD\..\3rdparty\paddle_inference_install_dir\third_party\install\xxhash\lib -lxxhash
LIBS += -L$$PWD\..\3rdparty\opencv -lopencv_world440

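2 Adding the screenshot feature to the project

The screenshot feature used in this project is described in the article "使用Qt实现截图功能" (Implementing a screenshot feature with Qt).

2.1 Creating the screenshot widget and connecting its signal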

ScreenWidget *sw = new ScreenWidget;
connect(sw, SIGNAL(sig_SelectImg(QImage)), this, SLOT(slt_SelectImg(QImage)), Qt::QueuedConnection);
sw->showFullScreen();

2.2 Converting QImage to cv::Mat

The QImage produced by the screenshot has the format Format_ARGB32_Premultiplied, so it should be wrapped in a cv::Mat of type CV_8UC4.

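// Convert a QImage to a cv::Mat (BGR or single channel) that owns its pixel data.
cv::Mat QImage2cvMat(QImage image)
{
    cv::Mat mat;
    switch (image.format())
    {
    case QImage::Format_ARGB32:
    case QImage::Format_RGB32:
    case QImage::Format_ARGB32_Premultiplied:
    {
        // On little-endian machines these formats are laid out as B, G, R, A in memory.
        cv::Mat mat_temp(image.height(), image.width(), CV_8UC4, (void*)image.constBits(), image.bytesPerLine());
        cv::cvtColor(mat_temp, mat, cv::COLOR_BGRA2BGR); // writes into a new buffer
        break;
    }
    case QImage::Format_RGB888:
    {
        // Qt stores R, G, B; OpenCV expects B, G, R.
        cv::Mat mat_temp(image.height(), image.width(), CV_8UC3, (void*)image.constBits(), image.bytesPerLine());
        cv::cvtColor(mat_temp, mat, cv::COLOR_RGB2BGR);
        break;
    }
    case QImage::Format_Indexed8:
        // clone() so the Mat does not point into the QImage buffer after `image` is destroyed.
        // The palette indices are treated as gray levels here.
        mat = cv::Mat(image.height(), image.width(), CV_8UC1, (void*)image.constBits(), image.bytesPerLine()).clone();
        break;
    default:
        break; // unsupported format: an empty Mat is returned
    }
    return mat;
}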
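
To make the 32-bit branch above more concrete, here is a minimal sketch in plain C++ of what dropping the alpha channel of a BGRA buffer involves. The function name bgraToBgr is hypothetical and belongs to neither Qt nor OpenCV. The point it illustrates is that rows must be walked using the stride (bytesPerLine) rather than width * 4, because QImage aligns scanlines to 32-bit boundaries; for 32-bit formats the two values normally coincide, but for 24-bit and 8-bit formats they may not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Qt or OpenCV): drop the alpha channel of a
// BGRA buffer whose rows are `bytesPerLine` bytes apart, producing a tightly
// packed BGR buffer of width * height * 3 bytes.
std::vector<std::uint8_t> bgraToBgr(const std::uint8_t *src, int width, int height,
                                    std::size_t bytesPerLine)
{
    assert(src != nullptr);
    assert(bytesPerLine >= static_cast<std::size_t>(width) * 4);

    std::vector<std::uint8_t> dst;
    dst.reserve(static_cast<std::size_t>(width) * height * 3);
    for (int y = 0; y < height; ++y) {
        // Step by the stride, not by width * 4: rows may carry padding bytes.
        const std::uint8_t *row = src + static_cast<std::size_t>(y) * bytesPerLine;
        for (int x = 0; x < width; ++x) {
            dst.push_back(row[4 * x + 0]); // B
            dst.push_back(row[4 * x + 1]); // G
            dst.push_back(row[4 * x + 2]); // R
            // row[4 * x + 3] is alpha and is discarded
        }
    }
    return dst;
}
```

For an opaque screenshot the alpha value is 255 everywhere, so the premultiplied and non-premultiplied formats hold the same color bytes and discarding alpha loses nothing.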

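3 Integrating PaddleOCR into the project

3.1 Initializing PaddleOCR

The main parameters live in config.txt. Watch the path when loading it: a relative path such as "config.txt" is resolved against the program's working directory.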

config = new OCRConfig("config.txt");
config->PrintConfigInfo();

det = new DBDetector(config->det_model_dir, config->use_gpu, config->gpu_id,
config->gpu_mem, config->cpu_math_library_num_threads,
config->use_mkldnn, config->max_side_len, config->det_db_thresh,
config->det_db_box_thresh, config->det_db_unclip_ratio,
config->visualize, config->use_tensorrt, config->use_fp16);

//Classifier *cls = nullptr;
if (config->use_angle_cls == true) {
cls = new Classifier(config->cls_model_dir, config->use_gpu, config->gpu_id,
config->gpu_mem, config->cpu_math_library_num_threads,
config->use_mkldnn, config->cls_thresh,
config->use_tensorrt, config->use_fp16);
}

rec = new CRNNRecognizer(config->rec_model_dir, config->use_gpu, config->gpu_id,
config->gpu_mem, config->cpu_math_library_num_threads,
config->use_mkldnn, config->char_list_file,
config->use_tensorrt, config->use_fp16);

The contents of config.txt are as follows:

# model load config
use_gpu 0
gpu_id  0
gpu_mem  4000
cpu_math_library_num_threads  10
use_mkldnn 1

# det config
max_side_len  960
det_db_thresh  0.3
det_db_box_thresh  0.5
det_db_unclip_ratio  1.6
det_model_dir  ./Model/ch_ppocr_server_v2.0_det_infer/

# cls config
use_angle_cls 0
cls_model_dir  ./Model/ch_ppocr_mobile_v2.0_cls_infer/
cls_thresh  0.9

# rec config
rec_model_dir  ./Model/ch_ppocr_server_v2.0_rec_infer/
char_list_file ./ppocr_keys_v1.txt

# show the detection results
visualize 1

# use_tensorrt
use_tensorrt 0
use_fp16   0
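
The file above is a simple "key value" format with '#' comment lines. The parsing code in config.cpp is not shown in this article, so the following is only a sketch of how such a file could be read in standard C++. The names parseConfig, parseConfigText and getInt are hypothetical and are not the OCRConfig API.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a config loader: reads "key value" lines,
// skipping blank lines and lines whose first token starts with '#'.
std::map<std::string, std::string> parseConfig(std::istream &in)
{
    std::map<std::string, std::string> cfg;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key, value;
        if (!(fields >> key) || key[0] == '#')
            continue;            // blank line or comment
        if (fields >> value)     // operator>> skips any run of spaces or tabs
            cfg[key] = value;
    }
    return cfg;
}

// Convenience wrapper so the parser can be tried on an in-memory string.
std::map<std::string, std::string> parseConfigText(const std::string &text)
{
    std::istringstream in(text);
    return parseConfig(in);
}

// Look up a key that must exist and convert its value to int.
int getInt(const std::map<std::string, std::string> &cfg, const std::string &key)
{
    std::map<std::string, std::string>::const_iterator it = cfg.find(key);
    assert(it != cfg.end() && "missing config key");
    return std::stoi(it->second);
}
```

Because values such as det_model_dir are stored as plain strings, a relative model path in the file is again interpreted relative to the working directory at the time the model is loaded.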

3.2 Modifying the Run function of the text recognizer to add an output parameter for the results

namespace PaddleOCR {

	void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes,
		cv::Mat &img, Classifier *cls, std::vector<std::string> &list_str)
		......
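
The body of the modified Run function is elided above. As an illustration of the output-parameter pattern only, and assuming the recognizer decodes each box into a sequence of character strings, the change could amount to joining those characters and appending the line to list_str instead of printing it. The helper names appendRecognizedLine and collectDemo are hypothetical and do not appear in PaddleOCR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the end of the per-box loop in Run(): instead of
// printing the decoded characters, join them and hand the line to the caller.
void appendRecognizedLine(const std::vector<std::string> &chars,
                          std::vector<std::string> &list_str)
{
    std::string line;
    for (const std::string &c : chars)
        line += c; // each element may be a multi-byte UTF-8 character
    list_str.push_back(line);
}

// Usage sketch: collects the lines for two fake boxes.
std::vector<std::string> collectDemo()
{
    std::vector<std::string> list_str;
    appendRecognizedLine({"O", "C", "R"}, list_str);
    appendRecognizedLine({"d", "e", "m", "o"}, list_str);
    assert(list_str.size() == 2); // one entry per box
    return list_str;
}
```

Passing the vector by reference keeps the existing void return type of Run, so callers only need to supply one extra argument.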

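3.3 Padding small screenshots

In practice, recognition is poor when the screenshot is a small crop containing only the text. Padding the image borders improves the result somewhat; this is related to the object detection (text detection) stage.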

copyMakeBorder(srcimg, srcimg, 100, 100, 100, 100, BORDER_CONSTANT, Scalar(255, 255, 255));
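
For readers unfamiliar with BORDER_CONSTANT, the following plain C++ sketch shows what constant-value padding does to a single-channel image. The GrayImage type and the padConstant function are hypothetical and exist only for this illustration; the project itself uses OpenCV's copyMakeBorder as shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel image type used only for this illustration.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels; // row-major, width * height bytes
};

// Surround `src` with a border of constant value, as BORDER_CONSTANT does
// for each channel.
GrayImage padConstant(const GrayImage &src, int top, int bottom, int left, int right,
                      std::uint8_t value)
{
    assert(top >= 0 && bottom >= 0 && left >= 0 && right >= 0);
    GrayImage dst;
    dst.width = src.width + left + right;
    dst.height = src.height + top + bottom;
    // Start from a canvas filled with the border value...
    dst.pixels.assign(static_cast<std::size_t>(dst.width) * dst.height, value);
    // ...then copy the source rows into the interior.
    for (int y = 0; y < src.height; ++y)
        for (int x = 0; x < src.width; ++x)
            dst.pixels[static_cast<std::size_t>(y + top) * dst.width + (x + left)] =
                src.pixels[static_cast<std::size_t>(y) * src.width + x];
    return dst;
}
```

With 100 pixels on every side and the value 255, a crop that tightly encloses one line of text ends up surrounded by a white margin on all four sides.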

3.4 Running OCR on the image

void MainWindow::slt_SelectImg(QImage img)
{
    QImage im = img.copy();
    // The sender is the ScreenWidget that emitted the signal; it is no longer needed.
    ScreenWidget *sw = qobject_cast<ScreenWidget*>(sender());
    if (sw)
        sw->deleteLater();

    cv::Mat srcimg = QImage2cvMat(im);
    // Pad with a white border so that small crops are detected more reliably (see 3.3).
    copyMakeBorder(srcimg, srcimg, 100, 100, 100, 100, BORDER_CONSTANT, Scalar(255, 255, 255));

    std::vector<std::vector<std::vector<int>>> boxes;
    det->Run(srcimg, boxes);

    // The original Run() only printed the recognized text internally and returned nothing,
    // so a variable is passed in here to receive the results (see 3.2).
    std::vector<std::string> str_res;
    rec->Run(boxes, srcimg, cls, str_res);

    // Show the number of recognized lines, then each line.
    ui->textBrowser->append(QString::number(str_res.size()));
    for (size_t i = 0; i < str_res.size(); i++)
    {
        ui->textBrowser->append(str2qstr(str_res[i]));
    }

    // Blank line to separate this capture from the next one.
    ui->textBrowser->append(QString());
}