Image OCR in Python with pytesseract

Latest recommended article published on 2024-08-07 07:15:00

森尼嫩豆腐

Article link: https://blog.csdn.net/lavinia_chen007/article/details/116137606

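This post shows how to recognize text in an image with pytesseract in Python.

We use the image below as the example; the goal is to automatically recognize and extract the text in it.
[Image: the example input picture]
Image quality strongly affects OCR accuracy, so choosing suitable preprocessing helps extract the text in the picture correctly. Basic, effective preprocessing steps include: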

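Crop the text region out of the image;
Scale the text region to a suitable size;
Apply suitable thresholding to binarize the text and background regions;
Apply Gaussian filtering to remove noise.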

Other optional preprocessing steps include, but are not limited to:

Image sharpening;
Morphological image processing (opening/closing…);
…

Because the example image has very low resolution and is noisy after binarization, we preprocess it with the following steps.

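import cv2
import numpy as np
from skimage.transform import rescale

def image_preprocessing(img_path):
    img = cv2.imread(img_path)
    if img is None:  # cv2.imread returns None instead of raising when reading fails
        raise FileNotFoundError(img_path)
    # The example image contains only the text region, so there is no cropping step here
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR; convert to grayscale
    # Threshold first, using Otsu's method to pick the threshold automatically.
    # Pixels above the threshold become 255 and the rest 0; for the example image
    # this gives foreground text 255 and background 0.
    ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # The example image has very low resolution, so upscale it to 2x its original size.
    # rescale returns floats in 0~1, and interpolation makes the result non-binary.
    thresh = rescale(thresh, 2, anti_aliasing=True)
    # Apply a low-pass filter (cv2.blur is a box filter) to remove high-frequency noise
    thresh = cv2.blur(thresh, (3, 3))
    # Binarize again and cast to uint8 so the result is an ordinary 8-bit image
    thresh = np.where(thresh >= 0.5, 255, 0).astype(np.uint8)
    return thresh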
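
To make the automatic threshold selection less of a black box, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every threshold and keep the one that maximizes the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the sample pixel values are illustrative; OpenCV's implementation may differ in details such as tie-breaking.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    pixels: iterable of integer gray values in [0, levels).
    Values > t form one class, values <= t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1/total**2
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Made-up gray values: four dark pixels and four bright ones.
pixels = [20, 22, 25, 30, 200, 210, 220, 230]
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t, binary)
```

The method works best when the histogram has two well-separated peaks, which is one reason cropping to the text region first tends to help.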

These preprocessing steps give the result below. The OCR step that follows runs on the preprocessed image.
[Image: the preprocessed result]

import re
import pytesseract

def img_ocr(img):
    text = pytesseract.image_to_string(img)
    # Example output for the sample image:
    # text = 'MGI-01\n\nB\n\nFH4.5 G10\nFR23 DR116\nD16.0'
    # Below, do some simple processing to turn text into "parameter: value" pairs
    info = dict()

    for item in text.split('\n'):
        # split() without arguments also skips empty lines and repeated spaces
        for label in item.split():
            # r'\d*\D+' matches the leading name part of the token (e.g. 'FH' in 'FH4.5')
            # r'\d' checks whether the token contains any digit at all
            match = re.match(r'\d*\D+', label)
            if match and re.search(r'\d', label):
                name = match.group(0)
                # Slice off the name; str.lstrip would strip a character set, not a prefix
                info[name] = label[len(name):]
            else:
                # Tokens without a name/value split (e.g. 'B') map to themselves
                info[label] = label
    return info
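
One detail worth illustrating when splitting a token into a name and a value: `str.lstrip` removes any leading characters that belong to a set, not a literal prefix, so slicing by the prefix length is the safer way to drop the name. The sketch below shows the difference; the helper `split_label` and the token `'2A22'` are hypothetical and chosen only to trigger the pitfall.

```python
import re

def split_label(label):
    """Split a token such as 'FH4.5' into (name, value).

    Returns (label, label) when the token has no digits or no
    leading name, as a fallback for tokens like 'B'.
    """
    match = re.match(r'\d*\D+', label)
    if match is None or not re.search(r'\d', label):
        return label, label
    name = match.group(0)
    return name, label[len(name):]   # slice off the prefix

print(split_label('FH4.5'))    # name 'FH', value '4.5'
print(split_label('2A22'))     # name '2A', value '22'
print(repr('2A22'.lstrip('2A')))  # lstrip removes every leading '2' and 'A'
```

For the sample output in this post the two approaches agree, because every value starts with a character that does not occur in its name.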

Finally, write a main program that chains image preprocessing, OCR, and saving the recognition results to a txt file.

import cv2
import re
from skimage.transform import rescale
import numpy as np
import pytesseract

def main(img_path, txt_path):
    # image_preprocessing and img_ocr are the functions defined above
    processed_img = image_preprocessing(img_path)
    info = img_ocr(processed_img)

    # Write one "parameter:value" pair per line
    with open(txt_path, 'w') as file:
        for key, value in info.items():
            file.write(':'.join((key, value)) + '\n')
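
To use the saved results later, the txt file can be read back into a dictionary. The sketch below assumes the "key:value" one-pair-per-line format written above and that keys contain no colon; the helper name `load_results`, the file name and the sample dictionary are illustrative.

```python
import os
import tempfile

def load_results(txt_path):
    """Read 'key:value' lines back into a dict."""
    info = {}
    with open(txt_path, encoding='utf-8') as file:
        for line in file:
            line = line.rstrip('\n')
            if not line:
                continue  # skip blank lines
            # Split at the first colon only, so values may contain ':'
            key, _, value = line.partition(':')
            info[key] = value
    return info

# Round trip with made-up data in a temporary directory.
sample = {'MGI-': '01', 'B': 'B', 'FH': '4.5'}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'result.txt')
    with open(path, 'w', encoding='utf-8') as file:
        for key, value in sample.items():
            file.write(f'{key}:{value}\n')
    loaded = load_results(path)
print(loaded == sample)
```

If keys could ever contain a colon, a structured format such as JSON would be a more robust choice than this line-based one.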