Possible Reasons Why OCR Fails to Recognize the Commissioning Party, Address, and Similar Fields

Leon_Jinhai_Sun

Tags: ocr python opencv

Article link: https://blog.csdn.net/Leon_Jinhai_Sun/article/details/147265517

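There are several possible reasons why OCR fails to recognize fields such as the commissioning party (委托方) and the address (地址):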

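The image shows a packaging design draft, so the commissioning party and address may sit in a part of the artwork that this image does not include
The current OCR pipeline does no suitable image preprocessing
The PaddleOCR parameters may need tuning to raise the recognition rate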
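
To separate the first cause from the other two, it can help to check whether the expected field labels appear anywhere in the raw OCR output before any filtering is applied. The following is a minimal, dependency-free sketch of that check. The function name `find_field_labels` and the default label list are assumptions made for this illustration and are not part of PaddleOCR; the labels stay in Chinese because that is how they would be printed on the design draft.

```python
def find_field_labels(ocr_lines, labels=("委托方", "地址")):
    """Map each label to the recognized lines that contain it."""
    hits = {label: [] for label in labels}
    for text in ocr_lines:
        # OCR output may contain stray spaces inside a label, e.g. "地 址"
        compact = "".join(text.split())
        for label in labels:
            if label in compact:
                hits[label].append(text)
    return hits


if __name__ == "__main__":
    # Made-up lines standing in for unfiltered OCR output
    sample = ["NT水光瓶 2.5", "委托方：示例公司", "净含量 30ml"]
    for label, lines in find_field_labels(sample).items():
        print(label, "->", lines if lines else "not found in OCR output")
```

If a label is missing even from the unfiltered output, the field is probably not in this image at all, and tuning the OCR settings is unlikely to help.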

Below is the improved code, which adds image preprocessing and a more complete OCR configuration:

from paddleocr import PaddleOCR
import cv2
import numpy as np

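class ImageTextExtractor:
    def __init__(self, image_path):
        self.image_path = image_path

    def preprocess_image(self):
        """Preprocess the image to improve OCR accuracy."""
        try:
            # Read the image. Decoding from raw bytes avoids cv2.imread failing
            # on non-ASCII paths, which can happen on Windows.
            data = np.fromfile(self.image_path, dtype=np.uint8)
            img = cv2.imdecode(data, cv2.IMREAD_COLOR)
            if img is None:
                raise ValueError("Unable to read the image file")

            # Convert to grayscale
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

            # Adaptive-threshold binarization (block size 11, constant C = 2)
            thresh = cv2.adaptiveThreshold(gray, 255,
                                         cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                         cv2.THRESH_BINARY, 11, 2)

            # Optional: denoising. A 1x1 kernel leaves the image unchanged;
            # use a larger kernel such as (2, 2) to actually remove specks.
            kernel = np.ones((1, 1), np.uint8)
            processed_img = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

            return processed_img
        except Exception as e:
            print(f"Image preprocessing failed: {e}")
            return None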
            
    def extract_text_from_image(self):
        """Extract text from the image."""
        try:
            # Preprocess the image
            processed_img = self.preprocess_image()

            # Initialize PaddleOCR. These keyword arguments follow the
            # PaddleOCR 2.x interface; other releases may rename or reject some.
            ocr = PaddleOCR(
                use_angle_cls=True,
                lang="ch",
                det_model_dir=None,  # a custom model path can be given here
                rec_model_dir=None,
                cls_model_dir=None,
                use_gpu=False,       # adjust to your environment
                det_db_thresh=0.3,   # detection threshold
                det_db_box_thresh=0.5,
                det_db_unclip_ratio=1.6,
                use_dilation=True,   # apply dilation for dense text
                rec_char_dict_path=None,  # a custom dictionary can be given here
                show_log=False
            )

            # Use the preprocessed image, or fall back to the original file path
            img_to_ocr = processed_img if processed_img is not None else self.image_path

            # Run OCR
            result = ocr.ocr(img_to_ocr, cls=True)

            # Format the result
            formatted_text = self.format_ocr_result(result)
            return formatted_text
        except Exception as e:
            raise RuntimeError(f"Text extraction failed: {e}") from e
            
    def format_ocr_result(self, result):
        """Format the OCR result as newline-separated text."""
        lines = []
        if result is None:
            return ""

        for res in result:
            if res is None:
                continue

            for line in res:
                if line and len(line) >= 2:
                    line_text = line[1][0]   # recognized text
                    confidence = line[1][1]  # confidence score
                    # Filter by confidence: keep only results above 0.6
                    if confidence > 0.6:
                        lines.append(line_text)

        return "\n".join(lines)

# Usage example
if __name__ == "__main__":
    extractor = ImageTextExtractor("SWC22551_NT水光瓶2.5包装设计V1-250305_OL_545_1.0.png")
    try:
        text = extractor.extract_text_from_image()
        print("Extracted text:")
        print(text)
    except Exception as e:
        print(f"Error: {e}")
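
Whether binarization helps depends on the image, so it may be worth running OCR on both the original and the preprocessed version and comparing the results. The sketch below shows one simple way to do that. `pick_best_variant` and its `recognize` argument are hypothetical names introduced for this illustration; `recognize` stands for any callable that returns `(text, confidence)` pairs, for example a thin wrapper around the `ocr.ocr(...)` call in the code above. Counting confident lines is only a rough heuristic, not a measure of correctness.

```python
def pick_best_variant(variants, recognize, min_confidence=0.6):
    """Return (name, texts) for the variant that yields the most confident lines.

    variants:  dict mapping a label to an image (file path or array).
    recognize: hypothetical hook; takes an image, returns (text, confidence) pairs.
    """
    best_name, best_texts = None, []
    for name, image in variants.items():
        texts = [text for text, conf in recognize(image) if conf > min_confidence]
        if best_name is None or len(texts) > len(best_texts):
            best_name, best_texts = name, texts
    return best_name, best_texts


if __name__ == "__main__":
    # Hand-written stand-in results; not real OCR output.
    fake_results = {
        "original": [("line A", 0.93), ("line B", 0.81)],
        "binarized": [("line A", 0.95), ("line B", 0.42)],
    }
    name, texts = pick_best_variant(
        {"original": "original", "binarized": "binarized"},
        lambda image: fake_results[image],
    )
    print(name, texts)
```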

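Key improvements:

Image preprocessing: added grayscale conversion and adaptive-threshold binarization, which can improve text recognition
OCR parameter tuning:
- Adjusted the detection threshold (det_db_thresh)
- Enabled dilation (use_dilation) to improve recognition of dense text
- A custom dictionary (rec_char_dict_path) can be supplied for domain-specific terms
Result filtering:
- Low-quality results are filtered out by confidence
- More robust result handling
Error handling:
- More thorough exception handling
- Validation that the image was read successfully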
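
One caveat with confidence filtering is that a fixed cutoff can silently drop the very fields being looked for. A possible variation, sketched below, is to keep the rejected lines as well so they can be inspected. The function name `split_by_confidence` is an assumption for this illustration; the input layout (one list per image, each entry shaped like `[box, (text, confidence)]`) is the one the code above already relies on.

```python
def split_by_confidence(result, threshold=0.6):
    """Return (kept, rejected) lists of (text, confidence) pairs.

    Assumes the nested layout used by the code above: one list per image,
    each entry shaped like [box, (text, confidence)].
    """
    kept, rejected = [], []
    for page in result or []:
        for entry in page or []:
            if not entry or len(entry) < 2:
                continue
            text, confidence = entry[1][0], entry[1][1]
            target = kept if confidence > threshold else rejected
            target.append((text, confidence))
    return kept, rejected


if __name__ == "__main__":
    # Hand-written stand-in for an OCR result; not real output.
    fake = [[
        [[[0, 0], [10, 0], [10, 5], [0, 5]], ("委托方", 0.55)],
        [[[0, 6], [10, 6], [10, 11], [0, 11]], ("净含量", 0.98)],
    ]]
    kept, rejected = split_by_confidence(fake)
    print("kept:", kept)
    print("rejected:", rejected)
```

If the missing fields show up in the rejected list, lowering the threshold or improving the input image is a more promising direction than changing the detection parameters.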