How to Build Your Own Image OCR Feature

hit56笔记

Last modified: 2023-08-08 16:01:17

Tags: machine learning

First published: 2023-04-10 15:03:34

Article link: https://blog.csdn.net/zh515858237/article/details/130060599

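This article covers OCR in practice: Baidu's PaddleOCR library, the open-source chineseocr_lite project, Google's Tesseract OCR tool and how to use it from Python, and Facebook's Segment-Anything image segmentation model. It gives detailed installation steps and code examples, with configuration notes aimed specifically at Chinese text recognition.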

After repeated hands-on experimentation, I have deployed this on my website, www.hit56.com, where you can try the image OCR feature directly.

I. Baidu's PaddleOCR (built on PaddlePaddle)

https://github.com/PaddlePaddle/PaddleOCR

II. An open-source project: chineseocr_lite

https://github.com/DayBreak-u/chineseocr_lite

III. OCR in practice with Tesseract, the Google-backed engine

https://github.com/tesseract-ocr/tesseract

1. Install the Python packages

pip install opencv-python
pip install pytesseract
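
To confirm that both packages actually landed in the active Python environment, a quick check along these lines may help. This is only a sketch: the helper name `installed_versions` is made up for illustration, and it relies on the standard-library `importlib.metadata` module (Python 3.8 or newer).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment
            versions[name] = None
    return versions


if __name__ == '__main__':
    found = installed_versions(['opencv-python', 'pytesseract'])
    for name, version in found.items():
        print(name, version or 'NOT INSTALLED')
```

Note that pytesseract is only a wrapper: it still needs the tesseract program itself, which is installed in the next step.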

2. Install Tesseract and the language packs

# CentOS
yum install -y tesseract
yum install -y tesseract-langpack-chi_sim
yum install -y tesseract-langpack-chi_tra

# Ubuntu (the engine package is named tesseract-ocr)
apt-get install tesseract-ocr
apt-get install tesseract-ocr-chi-sim
apt-get install tesseract-ocr-chi-tra
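
After installing, it is worth checking that the Chinese language data is visible to tesseract before running any recognition. The sketch below calls `tesseract --list-langs` and parses its output; the helper names `parse_langs` and `available_langs` are illustrative, and the parsing assumes the usual output shape of one header line starting with "List of" followed by one language code per line.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the header ("List of available languages ...")
        if not line or line.startswith('List of'):
            continue
        langs.append(line)
    return langs


def available_langs():
    """Return the installed language codes, or [] if tesseract is not on PATH."""
    if shutil.which('tesseract') is None:
        return []
    proc = subprocess.run(['tesseract', '--list-langs'],
                          capture_output=True, text=True)
    # Some builds print the list to stderr, so both streams are parsed;
    # any warning lines would be picked up too, which is harmless for
    # the membership check below
    return parse_langs(proc.stdout + '\n' + proc.stderr)


if __name__ == '__main__':
    langs = available_langs()
    for needed in ('chi_sim', 'chi_tra'):
        print(needed, 'ok' if needed in langs else 'MISSING')
```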

3. Run the code

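import sys

import cv2
import pytesseract

if __name__ == '__main__':
  if len(sys.argv) < 2:
    print('Usage: python ocr_demo.py image.jpg')
    sys.exit(1)

  # Image path taken from the command line
  imPath = sys.argv[1]

  # -l    language to recognize; chi_sim is Simplified Chinese
  # --oem OCR engine mode, one of 0, 1, 2, 3; 1 selects the LSTM engine:
  #  0    Legacy engine only.
  #  1    Neural nets LSTM engine only.
  #  2    Legacy + LSTM engines.
  #  3    Default, based on what is available.
  # --psm page segmentation mode; 3 is fully automatic
  config = '-l chi_sim --oem 1 --psm 3'

  im = cv2.imread(imPath, cv2.IMREAD_COLOR)
  # cv2.imread returns None instead of raising when the file cannot be read
  if im is None:
    print('Could not read image: ' + imPath)
    sys.exit(1)

  # OpenCV loads pixels in BGR order, while pytesseract expects RGB
  im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)

  # Run recognition; under the hood this invokes the tesseract command-line tool
  text = pytesseract.image_to_string(im, config=config)

  # Print the result
  print(text)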
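
The config string above is easy to mistype, so one option is to build it from validated arguments. The sketch below is not part of the original script: `build_config` and `recognize` are hypothetical helper names, and the grayscale plus Otsu thresholding step is an added preprocessing assumption that often helps on scanned text but may not suit every image.

```python
def build_config(lang='chi_sim', oem=1, psm=3):
    """Build a tesseract config string, rejecting out-of-range modes."""
    if oem not in range(4):
        raise ValueError('oem must be between 0 and 3')
    if psm not in range(14):
        raise ValueError('psm must be between 0 and 13')
    return '-l {} --oem {} --psm {}'.format(lang, oem, psm)


def recognize(path, **options):
    """Binarize the image at `path` and return the recognized text."""
    # Imported here so build_config can be used without OpenCV installed
    import cv2
    import pytesseract

    im = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if im is None:
        raise FileNotFoundError(path)
    # Otsu's method chooses the binarization threshold automatically
    _, binary = cv2.threshold(im, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary, config=build_config(**options))
```

For mixed Chinese and English text, languages can be joined with `+`, for example `recognize('image.jpg', lang='chi_sim+eng', psm=6)`, where page segmentation mode 6 treats the image as a single uniform block of text.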