PaddleOCR: Exporting a Trained Model and Testing It (Russian)

Author: 只会git clone的程序员

Category: OCR. Tags: PaddleOCR

Article link: https://blog.csdn.net/qq_37668436/article/details/109722924

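After training a model with PaddleOCR, you end up with three files:
[Image: the three files produced by training]
They are saved under PaddleOCR/output/rec_russia. Here rec_russia is the output directory for my own Russian recognition model; you can change it in the corresponding training YAML file.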

Step 1: Convert the trained model to an inference model

python3 tools/export_model.py -c ./configs/rec/multi_languages/rec_russia_lite_train.yml -o Global.checkpoints=./output/rec_russia/best_accuracy Global.save_inference_dir=./inference/rec_russia_crnn/

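-c: the configuration file used for training
-o: configuration overrides; here they set the trained checkpoint to export (Global.checkpoints) and the directory where the inference model is saved (Global.save_inference_dir)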
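
To make the -o syntax more concrete, here is a minimal sketch of how dotted Section.key=value overrides could be folded into a nested config dictionary. It is an illustration only, not PaddleOCR's own code: the function name apply_overrides and the epoch_num entry are hypothetical, and values are kept as plain strings, whereas the real tool may convert types.

```python
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply 'Section.key=value' strings to a nested dict (modified in place)."""
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        keys = dotted_key.strip().split(".")
        node = config
        # Walk (or create) the intermediate sections, e.g. "Global".
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        # Values stay strings in this sketch; no type conversion is attempted.
        node[keys[-1]] = value.strip()
    return config


if __name__ == "__main__":
    # Hypothetical config contents, standing in for the loaded YAML file.
    cfg = {"Global": {"epoch_num": 500}}
    apply_overrides(cfg, [
        "Global.checkpoints=./output/rec_russia/best_accuracy",
        "Global.save_inference_dir=./inference/rec_russia_crnn/",
    ])
    print(cfg["Global"])
```

Entries already present in the YAML file and not named in -o are left untouched, which is why only the two paths need to be passed on the command line.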

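The command above may fail with the following errors:
1.

FileNotFoundError: [Errno 2] No such file or directory: './ppocr/utils/russia_dict.txt'

This is the Russian dictionary, which you have to prepare yourself and place at that path.
2.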

ModuleNotFoundError: No module named 'yaml'

Installing pyyaml with pip fixes this (the command below also installs imgaug):

python3 -m pip install pyyaml imgaug -i https://mirror.baidu.com/pypi/simple

3.

File "/Users/XXX/PycharmProjects/pythonProject/PaddleOCR/ppocr/utils/character.py", line 61, in __init__
"Nonsupport type of the character: {}".format(self.character_str)
AssertionError: Nonsupport type of the character: None

The official PaddleOCR code does not currently support Russian, so russia has to be added as a recognized character_type in the code.
File to modify: PaddleOCR/ppocr/utils/character.py
[Image: the modified code in character.py]
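
Since the screenshot of the change is not reproduced here, the idea behind it can be pictured with a simplified stand-in. This is not the PaddleOCR source: the names DICT_BACKED_TYPES and load_characters are hypothetical, and the real check in character.py may be structured differently. The sketch only shows why an unlisted character_type is rejected and why adding russia to the accepted types lets the dictionary file be loaded.

```python
import os
import tempfile
from typing import List

# Hypothetical stand-in for the dictionary-backed types; the real check
# lives in ppocr/utils/character.py and may look different.
DICT_BACKED_TYPES = ["ch", "russia"]  # "russia" is the newly added entry


def load_characters(character_type: str, dict_path: str) -> List[str]:
    """Return the character list for a dictionary-backed character_type."""
    if character_type not in DICT_BACKED_TYPES:
        # Before the fix, "russia" would have ended up in a branch like this.
        raise ValueError(f"unsupported character_type: {character_type}")
    with open(dict_path, encoding="utf-8") as f:
        # One character per line; empty lines are skipped.
        return [line.rstrip("\r\n") for line in f if line.rstrip("\r\n")]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "russia_dict.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("а\nб\nв\n")
        print(load_characters("russia", path))
```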

Once the problems above are fixed, the export has succeeded if the output looks like this:
[Image: output of a successful export]
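
The dictionary that the export step looks for (error 1 above) is a plain UTF-8 text file with one character per line. If you still need to produce one, the following is a minimal sketch of collecting the characters from your training labels. It assumes each label line has the form image path, a tab, then the text; the function names are hypothetical. The space character is left out on purpose, on the assumption that it is handled separately by the use_space_char option.

```python
from typing import Iterable, List


def build_char_dict(label_lines: Iterable[str]) -> List[str]:
    """Collect the sorted set of non-space characters used in the labels."""
    chars = set()
    for line in label_lines:
        line = line.rstrip("\r\n")
        if "\t" not in line:
            continue  # skip lines that do not match "path<TAB>text"
        _, text = line.split("\t", 1)
        chars.update(ch for ch in text if not ch.isspace())
    return sorted(chars)


def write_char_dict(chars: List[str], path: str) -> None:
    """Write one character per line, UTF-8 encoded."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(chars) + "\n")


if __name__ == "__main__":
    # Hypothetical label lines, for illustration only.
    labels = ["img_001.jpg\tпривет мир", "img_002.jpg\tпока"]
    print(build_char_dict(labels))
```

Calling write_char_dict(build_char_dict(lines), "russia_dict.txt") would then produce a file to copy to the path named in the error message.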

Step 2: Run prediction with predict_system.py

python3 tools/infer/predict_system.py --image_dir="./russia_img_test/" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/"  --rec_model_dir="./inference/rec_russia_crnn/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False --rec_char_dict_path="./ppocr/utils/dict/russia_dict.txt"

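One thing to watch out for: the command given in the official GitHub repository does not include the dictionary path. Without it, the default Chinese dictionary is used, which is why my Russian predictions initially all came out as Chinese text.
image_dir: path to the images to run on
det_model_dir: detection model path
rec_model_dir: recognition model path
cls_model_dir: angle classification model path
rec_char_dict_path: dictionary path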
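
Because a missing dictionary argument does not raise an error and simply yields Chinese output, it can help to assemble the command in a small wrapper that refuses to run without a dictionary file. The sketch below is one way to do that; the helper build_predict_command and its check_dict parameter are hypothetical, while the flags are the ones used in the command above.

```python
import os
from typing import List


def build_predict_command(image_dir: str, det_model_dir: str, rec_model_dir: str,
                          cls_model_dir: str, rec_char_dict_path: str,
                          check_dict: bool = True) -> List[str]:
    """Build the predict_system.py argument list, always including the dictionary."""
    if check_dict and not os.path.isfile(rec_char_dict_path):
        # Fail early instead of silently predicting with the wrong dictionary.
        raise FileNotFoundError(f"dictionary not found: {rec_char_dict_path}")
    return [
        "python3", "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
        f"--cls_model_dir={cls_model_dir}",
        "--use_angle_cls=True",
        "--use_space_char=True",
        "--use_gpu=False",
        f"--rec_char_dict_path={rec_char_dict_path}",
    ]


if __name__ == "__main__":
    cmd = build_predict_command(
        "./russia_img_test/",
        "./inference/ch_ppocr_mobile_v1.1_det_infer/",
        "./inference/rec_russia_crnn/",
        "./inference/ch_ppocr_mobile_v1.1_cls_infer/",
        "./ppocr/utils/dict/russia_dict.txt",
        check_dict=False,  # only printing here, so skip the file check
    )
    print(" ".join(cmd))
```

From the PaddleOCR root directory, the resulting list can be passed to subprocess.run(cmd, check=True) with check_dict left at its default.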

The results are written to inference_results:
[Image: example prediction results]
...The results are not great yet and still need tuning.