PaddleOCR: a multithreaded version (Python)

qq_30811703

Last modified 2023-05-30 15:37:52

Tags: python, flask, programming languages

First published 2023-05-30 15:01:18

Article link: https://blog.csdn.net/qq_30811703/article/details/130948665

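This code example shows how to wrap the PaddleOCR library with Python's Flask framework to create an HTTP service that recognizes text in images. The service initializes five PaddleOCR instances for multithreaded processing and guards each one with a thread lock so that concurrent requests are safe. When a user uploads an image, the service saves it to disk, runs recognition with the selected OCR instance, and returns the result.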
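
Seen from the client side, the upload flow described above could be exercised roughly as follows. This is only a sketch using the standard library: the `/ocr` route and the `img` form field come from the service code, while the URL `http://127.0.0.1:5000` (Flask's default development address) and the helper names `build_multipart` and `call_ocr` are assumptions made for illustration.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, payload):
    """Encode a single file as a multipart/form-data body.

    Returns (body_bytes, content_type_header). Hypothetical helper; it does
    not escape quotes in filename, so it is only suitable for simple names.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="{field_name}"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return body, f"multipart/form-data; boundary={boundary}"


def call_ocr(image_path, url="http://127.0.0.1:5000/ocr"):
    """POST one image to the service and return the 'result' list.

    The URL is an assumption: it is where app.run() listens by default.
    """
    with open(image_path, "rb") as f:
        payload = f.read()
    body, content_type = build_multipart(
        "img", os.path.basename(image_path), payload)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))["result"]
```

With the service running, `call_ocr("some_image.jpg")` would return the list that the handler places under `result`. To test concurrency, call it from several threads at once.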

Wrapping it as an HTTP service

import threading
from flask import Flask, request, jsonify
from paddleocr import PaddleOCR

app = Flask(__name__)

# Initialize 5 PaddleOCR instances and locks
ocr_instances = [PaddleOCR(enable_mkldnn=True, use_angle_cls=True, lang='en') for _ in range(5)]
locks = [threading.Lock() for _ in range(5)]


@app.route('/ocr', methods=['POST'])
def ocr_handler():
    if 'img' not in request.files:
        # Report a client error instead of the default 200 status
        return jsonify({'error': 'No image file provided'}), 400

    image_file = request.files['img']
    # Build the path where the uploaded image is saved (the author's local directory)
    image_path = 'C:\\Users\\dukun\\Downloads\\ppocr_img\\ppocr_img\\imgs_en\\' + image_file.filename
    # Save the upload to disk; save() creates the file itself, so no placeholder write is needed
    image_file.save(image_path)
    # image_path = 'C:\\Users\\dukun\\AppData\\Local\\Temp\\16853451072921665941938794632449408042page_2.jpg'

    # Get the index of the OCR instance to use
    model_index = id(request._get_current_object()) % len(ocr_instances)
    ocr = ocr_instances[model_index]

    # Hold this instance's lock for the whole OCR call;
    # the with block releases it even if an exception is raised
    lock = locks[model_index]
    with lock:
        print("model_index", model_index)
        result = ocr.ocr(image_path, cls=True)

        # Flatten the per-image results into a single list of detected lines
        ocr_result = []
        for res in result:
            # Skip empty entries (some PaddleOCR versions return None when no text is detected)
            if res is None:
                continue
            for line in res:
                ocr_result.append(line)

        return jsonify({'result': ocr_result})


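if __name__ == '__main__':
    # threaded=True (the default since Flask 1.0) lets the dev server handle requests concurrently
    app.run(threaded=True)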
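
One caveat about the listing above: picking an instance with `id(...) % len(ocr_instances)` can send two concurrent requests to the same instance while the others sit idle. A possible alternative, sketched here and not part of the original code, is to keep idle instances in a `queue.Queue` and check one out per request. `InstancePool` and `FakeOCR` are hypothetical names, and `FakeOCR` is only a stand-in so the sketch runs without loading PaddleOCR models.

```python
import queue
import threading
import time
from contextlib import contextmanager


class InstancePool:
    """Hands out idle instances; callers block until one is free."""

    def __init__(self, instances):
        self._idle = queue.Queue()
        for inst in instances:
            self._idle.put(inst)

    @contextmanager
    def checkout(self):
        inst = self._idle.get()       # blocks while every instance is busy
        try:
            yield inst
        finally:
            self._idle.put(inst)      # always return the instance to the pool


class FakeOCR:
    """Stand-in for a PaddleOCR instance (assumption, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.busy = False

    def ocr(self, path):
        # A real engine should never be entered by two threads at once.
        assert not self.busy, "instance used concurrently"
        self.busy = True
        try:
            time.sleep(0.001)         # simulate a little work
            return [[path, self.name]]
        finally:
            self.busy = False


pool = InstancePool([FakeOCR(f"ocr-{i}") for i in range(5)])
results = []
results_lock = threading.Lock()


def worker(n):
    # Each "request" borrows whichever instance is idle right now.
    with pool.checkout() as engine:
        out = engine.ocr(f"img_{n}.jpg")
    with results_lock:
        results.append(out)


threads = [threading.Thread(target=worker, args=(n,)) for n in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In the Flask handler, the same idea would replace the index computation and the per-instance lock with a single `with pool.checkout() as ocr:` block around the `ocr.ocr(...)` call. The queue then serializes access to each instance by itself.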