白大米 - Python Image Text Recognition (OCR)

Latest recommended article published 2024-07-08 09:38:15

白叔King

Categories: python, OCR

Article link: https://blog.csdn.net/weixin_37254196/article/details/117731594


No more chit-chat, let's get straight to it!
I recently ran into an image-processing task on a freelance job.

What you need

Download Tesseract-OCR
Download the Simplified Chinese language data (chi_sim)

Download link

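In the system environment variables, create a new variable named TESSDATA_PREFIX and set its value to the tessdata folder of the install directory: D:\Tesseract-OCR\tessdata
Then add the install directory to the Path variable: D:\Tesseract-OCR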
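A quick way to confirm that this configuration took effect is to check it from Python. The sketch below uses only the standard library. `check_tesseract_setup` is a hypothetical helper, not part of pytesseract, and it assumes the language file is named `chi_sim.traineddata` and sits directly inside the TESSDATA_PREFIX folder.

```python
import os
import shutil


def check_tesseract_setup(env=None, which=shutil.which):
    """Return a list of problems found in the Tesseract configuration.

    env and which are injectable so the check can be exercised without
    touching the real environment; by default the real ones are used.
    """
    env = os.environ if env is None else env
    problems = []

    # TESSDATA_PREFIX should point at the folder holding the language data.
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        problems.append("TESSDATA_PREFIX is not set")
    elif not os.path.isfile(os.path.join(prefix, "chi_sim.traineddata")):
        # Assumption: the Simplified Chinese data file has this name.
        problems.append("chi_sim.traineddata not found in " + prefix)

    # The tesseract executable should be reachable through Path.
    if which("tesseract") is None:
        problems.append("tesseract executable not found on Path")

    return problems


if __name__ == "__main__":
    issues = check_tesseract_setup()
    print("Setup looks fine" if not issues else "\n".join(issues))
```

Environment variable changes only reach programs started afterwards, so run the check from a newly opened terminal. If the executable cannot be added to Path, pytesseract also lets you set `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable before calling `image_to_string`.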


Libraries to install

pip install pytesseract

Pay special attention: the pytesseract and Pillow versions must be compatible with each other

 Install pytesseract: pip install pytesseract
 Install Pillow: pip install Pillow
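To see which versions actually ended up installed, something like the following may help. It is a minimal sketch assuming Python 3.8 or newer (for `importlib.metadata`); `installed_versions` is a hypothetical helper name. It reports the versions only and does not decide whether they are compatible.

```python
from importlib import metadata


def installed_versions(packages=("pytesseract", "Pillow")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in the same interpreter that will run the OCR script makes sure you are looking at the right environment.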

Straight to the code

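# -*- coding: UTF-8 -*-
import pytesseract
from PIL import Image


def douyin():
    # Open the image to run OCR on
    image = Image.open("202106060946243820.jpg")
    # Recognize the text using the Simplified Chinese language data
    text = pytesseract.image_to_string(image, lang='chi_sim')
    print(text)
    # Write as UTF-8 so the Chinese text is saved regardless of the system default encoding
    with open('douyin.txt', 'w', encoding='utf-8') as file:
        file.write(text)


if __name__ == '__main__':
    douyin()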
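The recognized text can be tidied before it is written to the file. The sketch below is an optional post-processing step, not something the original script does. It assumes the OCR result may contain stray spaces between Chinese characters and empty lines, which is common for Chinese OCR output but depends on the image. `clean_ocr_text` is a hypothetical helper that uses only the standard library.

```python
import re

# Basic CJK Unified Ideographs range; an assumption that covers common Chinese text.
_CJK = r"\u4e00-\u9fff"
# Spaces or tabs that sit directly between two Chinese characters.
_SPACE_BETWEEN_CJK = re.compile(rf"(?<=[{_CJK}])[ \t]+(?=[{_CJK}])")


def clean_ocr_text(text):
    """Drop empty lines and spaces between adjacent Chinese characters."""
    cleaned = []
    for line in text.splitlines():
        line = _SPACE_BETWEEN_CJK.sub("", line.strip())
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = "你 好 世 界\n\n\nhello world\n"
    print(clean_ocr_text(sample))
```

Spaces next to Latin letters or digits are left alone, so mixed text such as "抖音 app" keeps its word boundaries. In the script above it would be applied as `text = clean_ocr_text(text)` right after the `image_to_string` call.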