Steps:
(1) Install the dependencies
pip install pytesseract
pip install pillow
(2) Install the OCR tool, Tesseract (click Next through every installer step)
https://digi.bib.uni-mannheim.de/tesseract/
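(3) Add environment variables
A: Add the OCR (Tesseract) installation directory to Path
e.g.:
B: Add the language data path (Tesseract reads it from the TESSDATA_PREFIX variable)
e.g.:
Verify: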
tesseract -v
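The same check can be run from Python before any OCR code is called. This is only a minimal sketch using the standard library; the helper name tesseract_version is hypothetical and not part of pytesseract.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line printed by `tesseract -v`, or None if it is not on Path."""
    path = shutil.which(executable)
    if path is None:
        # The installation directory has probably not been added to Path yet.
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True, check=False)
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    print(tesseract_version() or "tesseract was not found on Path")
```

If the executable cannot be added to Path, pytesseract also lets you set pytesseract.pytesseract.tesseract_cmd to the full path of the tesseract executable.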
(4) Download the language packs (clone with git)
git clone https://gitcode.com/mirrors/tesseract-ocr/tessdata.git
(5) Copy the language packs into the tessdata directory of the OCR tool
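A quick way to confirm that the copy worked is to look for the .traineddata file of each language you plan to use. The sketch below assumes you pass in your own tessdata directory; the helper name missing_languages is hypothetical.

```python
from pathlib import Path


def missing_languages(tessdata_dir, langs=("chi_sim",)):
    """Return the language codes that have no <lang>.traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    return [lang for lang in langs if not (folder / f"{lang}.traineddata").is_file()]
```

An empty list means every requested language pack is in place. pytesseract.get_languages() can also be used to list the languages that Tesseract itself reports as available.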
(6) Write the Python recognition code
import pytesseract
from PIL import Image, ImageOps


def main():
    demoTest()


def demoTest():
    image = Image.open('C:\\Users\\Style\\Desktop\\临时\\bak\\aaaa.png')
    # Convert the image to grayscale
    gray_image = ImageOps.grayscale(image)
    # Binarize the grayscale image to improve recognition accuracy
    threshold = 128
    binary_image = gray_image.point(lambda x: 0 if x < threshold else 255, '1')
    # Recognize the text in the image using the Simplified Chinese language pack
    text = pytesseract.image_to_string(binary_image, lang='chi_sim')
    print(text)
    # Display the binarized image
    binary_image.show()


if __name__ == '__main__':
    main()
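
The binarization step maps every gray value below the threshold to black and everything else to white. A minimal sketch of that mapping as an explicit lookup table is shown here; the helper name threshold_lut is hypothetical, and the best threshold for a given image is something to find by experiment rather than a fixed value.

```python
def threshold_lut(threshold=128, white=255):
    """Build a 256-entry table: gray values below threshold map to 0, the rest to white."""
    return [0 if value < threshold else white for value in range(256)]


lut = threshold_lut(128)
# With Pillow, the table can be passed in place of the lambda
# (assumes gray_image is a mode "L" image):
# binary_image = gray_image.point(lut, "1")
```

Trying a few thresholds (for example 100, 128, 160) and comparing the recognized text is a simple way to tune the result for light or dark scans.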