Converting Images to Text with a Python Script

cocosgirl

Published 2024-04-03 10:56:15

Tags: python, script

Article link: https://blog.csdn.net/cocos2dGirl/article/details/137335349

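This article describes how to install tesseract-ocr and the pytesseract library on Ubuntu, and how to recognize text in images containing Chinese and English with a Python script. It includes a basic code example.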

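Sometimes you need to copy the text in an image, and typing it out character by character is tedious. A Python script can convert the image into text that you can paste, which is very practical.

Install the environment dependencies, using Ubuntu as an example: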

1. Install the tesseract-ocr package

sudo apt-get install tesseract-ocr

2. Install pytesseract

pip install pytesseract
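
Before going further, it can help to confirm that both installs worked. The following is a minimal sketch that uses only the Python standard library; the function name `check_environment` is made up for illustration, and it assumes the `tesseract` executable is expected on your `PATH`.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether the tesseract binary and the pytesseract package are available."""
    return {
        # Full path of the tesseract executable, or None if it is not on PATH
        "tesseract_path": shutil.which("tesseract"),
        # True if Python can locate the pytesseract package
        "pytesseract_installed": importlib.util.find_spec("pytesseract") is not None,
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If `tesseract_path` is `None`, step 1 probably did not complete. If `pytesseract_installed` is `False`, `pip` may have installed the package into a different Python environment.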

3. Install the Simplified Chinese language data

sudo apt-get install tesseract-ocr-chi-sim
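
To check that the `chi_sim` language data is visible to Tesseract, one option is to read the output of `tesseract --list-langs`. The sketch below assumes that command prints a header line followed by one language code per line; the helper names `parse_langs` and `installed_langs` are hypothetical.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header
        if not line or line.startswith("List of available languages"):
            continue
        langs.append(line)
    return langs


def installed_langs():
    """Return the installed language codes, or an empty list if tesseract is missing."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Parse both streams in case the list is written to stderr
    return parse_langs(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    print("chi_sim installed:", "chi_sim" in installed_langs())
```

Parsing is kept separate from running the command so the parser can be checked against sample text without Tesseract installed.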

4. Code

# -*- coding: UTF-8 -*-

from PIL import Image
import pytesseract


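# Recognize text that contains Chinese (English letters and digits are recognized too)
text = pytesseract.image_to_string(Image.open('/sfile/img_tmp.png'), lang='chi_sim')
print(text)

print('==========================\n\n')
# Recognize English with the default language (Chinese is not recognized)
text = pytesseract.image_to_string(Image.open('/sfile/0403.png'))
print(text)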
print(text)
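
The two calls above can be folded into one small helper. This is only a sketch under some assumptions: the names `ocr_image` and `remove_cjk_spaces` are hypothetical, the combined language string `chi_sim+eng` assumes both language packs are installed, and the cleanup step assumes the Chinese output may contain stray spaces between characters, which you may or may not see with your images.

```python
import re

# A space or tab that sits between two CJK ideographs
_CJK_GAP = re.compile(r"(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])")


def remove_cjk_spaces(text):
    """Delete spaces between adjacent Chinese characters; other spacing is kept."""
    return _CJK_GAP.sub("", text)


def ocr_image(path, lang="chi_sim+eng"):
    """Run OCR on one image file and return the cleaned-up text."""
    # Imported here so remove_cjk_spaces can be used without the OCR packages
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        raw = pytesseract.image_to_string(img, lang=lang)
    return remove_cjk_spaces(raw).strip()
```

With the image from the example above, usage would look like `print(ocr_image('/sfile/img_tmp.png'))`. Spaces between English words are left alone because the pattern only matches gaps that have a Chinese character on both sides.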