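0 Project background
In this project series, we try to build an OCR solution for a vertical scenario on top of the Paddle toolkits. The raw dataset is a collection of photos of electricity meters of many different types. The task is to recognize each meter's reading and, for meters that carry a serial number, the serial number as well.
1 Dataset overview
Note: for confidentiality and licensing reasons the dataset has not been released yet; this will be updated later.
First, a quick look at the dataset. Overall, this scenario poses several major challenges:
- Many meter types. By comparison, the current amount of data (500 images) may not be enough.
- Heavily tilted shooting angles. This is easy to understand: some meters cannot be photographed head-on, so many images were taken from below, or even from the lower left toward the upper right.
- Severe glare. Reflections can affect both the localization of the target box and the recognition of the digits.
- Serial numbers are dot-matrix digits, which are hard to recognize. This surfaced during annotation: in some cases the four-point detection box that PPOCRLabel produced automatically was already very accurate, yet the digits recognized inside it were far off.
- Very high accuracy is required of the detection box. The area around the displayed reading is usually not blank: there are often units, characters, or decimal-place digits. If the box is not tight, it pulls in other recognizable items, and if those are digits too, even post-processing cannot remove them.
The typical images below give a first impression of what the dataset looks like.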
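2 Development approach
Given the problems above, development in this scenario ran into dilemmas almost from the annotation stage. Should we annotate once (PPOCRLabel) or twice (Labelimg for the detection boxes plus PPOCRLabel for recognition finetuning)? Should we use PPOCR end to end, or PPDET + PPOCR?
It turned out that annotation can be done in a single pass: annotate with PPOCRLabel, then convert the OCR annotations into object detection annotations with a set of rules. The reason is that annotating the dataset separately for object detection amounts to annotating twice (the target boxes, then the content inside them), so annotating it as an OCR dataset with PPOCRLabel is clearly the more cost-effective choice.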
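The conversion rules themselves are not given in this article. As a rough sketch of what such a rule could look like, the snippet below assumes the usual PPOCRLabel line layout (image path, a tab, then a JSON list whose items carry `transcription` and `points`) and turns each four-point box into its axis-aligned bounding rectangle. The function name `ocr_label_to_boxes` and the sample path `M2021/demo.jpg` are made up for illustration.

```python
import json


def ocr_label_to_boxes(label_line):
    """Convert one OCR label line into axis-aligned detection boxes.

    Assumed line layout: '<image path>\t<JSON list of annotations>'.
    """
    image_path, annotations = label_line.rstrip("\n").split("\t", 1)
    boxes = []
    for item in json.loads(annotations):
        xs = [point[0] for point in item["points"]]
        ys = [point[1] for point in item["points"]]
        # Rule assumed here: take the smallest rectangle enclosing the polygon.
        boxes.append({
            "text": item["transcription"],
            "bbox": [min(xs), min(ys), max(xs), max(ys)],  # x_min, y_min, x_max, y_max
        })
    return image_path, boxes


# Hypothetical label line; the file name is illustrative only.
sample = (
    'M2021/demo.jpg\t[{"transcription": "3127", '
    '"points": [[1065, 2131], [2008, 2131], [2008, 2366], [1065, 2366]], '
    '"difficult": false}]'
)
image_path, boxes = ocr_label_to_boxes(sample)
print(image_path, boxes)
```

The resulting `bbox` values can then be written out in whatever format the detection framework expects; that step depends on the chosen dataset format and is left out here.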
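As for the development route, the initial plan was to try both. In early attempts, finetuning the PPOCR text detection model kept producing poor results, which raised doubts about whether an OCR detection model could localize the target boxes in a single step. Given the characteristics of the dataset, we also considered bringing in a rotated object detection model from PaddleDetection.
So at the outset, the overall exploration plan was as follows: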
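2.1 Exploration based on PaddleDetection
In the project "PPOCR+PPDET电表读数和编号识别" (PPOCR+PPDET meter reading and serial number recognition), we got the first route working: meter recognition based on object detection with annotated rectangular boxes. The goal in that part was to get something working at all before worrying about quality.
Its predictions look like this:
These predictions show that detecting with plain rectangular boxes has its own problem: because the input images are skewed, a rectangle may enclose extra text, which in turn hurts text recognition.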
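To get a feel for how much extra area a rectangle takes in when the text region is tilted, the sketch below compares the area of a four-point box with the area of its enclosing axis-aligned rectangle. The quadrilateral is the serial-number box printed in the recognition output later in this article; the helper names are illustrative and not part of any Paddle API.

```python
def quad_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bounding_rect_area(points):
    """Area of the smallest axis-aligned rectangle enclosing the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Slightly tilted four-point box taken from the OCR output shown later.
quad = [[1359.0, 1911.0], [2065.0, 1890.0], [2068.0, 1991.0], [1362.0, 2012.0]]
extra = bounding_rect_area(quad) / quad_area(quad) - 1.0
print(f"rectangle covers {extra:.0%} more area than the quadrilateral")
```

Even for this mild tilt the rectangle is noticeably larger than the text region, and the gap grows quickly with steeper angles, which is where neighbouring characters start to fall inside the box.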
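2.2 An end-to-end pipeline based on PaddleOCR
In this project, we complete the meter recognition task entirely with PaddleOCR.
The prerequisite for this pipeline is that, through repeated tuning, we greatly improved the accuracy of the PaddleOCR text detection model at boxing the meter fields, to the point where it matches or even exceeds the PaddleDetection baseline.
3 The PaddleOCR text detection model
PaddleOCR ships a rich set of text detection, text recognition, and end-to-end algorithms. The PaddleOCR overview diagram shows the text detection algorithms it supports.
Starting from the annotated data and finetuning a general-purpose text detection algorithm, we can train a meter text detection model that automatically drops the superfluous text boxes and keeps only the targets: the meter reading and the serial number.
With the goal clear, we move on to the next step.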
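3.1 Training the DB model
To save training time, a reasonably good pretrained model and its config file are provided here. Readers can either finetune from the pretrained model or train from scratch.
When training on AIStudio, pay close attention to the following points!
- Use the premium (至尊版) environment! The original images have a very high resolution and the target boxes are relatively small, so training works poorly when the model input size is too small, and a larger input size naturally needs more GPU memory.
- Set use_shared_memory to False.
- Do not set batch_size_per_card too large, because the input size is large.
If the last two tricks are not followed, training will crash.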
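A back-of-the-envelope calculation shows why the batch size has to stay small at this input size. The sketch below only counts the float32 input tensor of one batch; the intermediate feature maps take far more memory and also grow with the spatial size, so treat the number as a lower bound. The 960×960 comparison size is just an assumed smaller crop, and the function name is illustrative.

```python
def input_batch_megabytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Size in MB of one float32 input batch of shape [N, C, H, W]."""
    return batch_size * channels * height * width * bytes_per_value / 1e6


large = input_batch_megabytes(4, 1600, 1600)  # crop size used in this article
small = input_batch_megabytes(4, 960, 960)    # assumed smaller crop, for comparison
print(f"1600x1600: {large:.2f} MB, 960x960: {small:.2f} MB, ratio {large / small:.2f}x")
```

Memory scales with the square of the side length, so going from 960 to 1600 costs roughly 2.8 times as much per image before any increase in batch size.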
In this article, we directly replace the contents of configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_distill.yml with the following.
The config file is as follows:
Global:
  debug: false
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/det_dianbiao_v3
  save_epoch_step: 1200
  eval_batch_step:
  - 0
  - 100
  cal_metric_during_train: false
  pretrained_model: my_exps/student.pdparams
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: M2021/台安站公寓楼段值班房间.jpg
  save_res_path: ./output/det_db/predicts_db.txt
Architecture:
  model_type: det
  algorithm: DB
  Transform: null
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
    disable_se: true
  Neck:
    name: DBFPN
    out_channels: 96
  Head:
    name: DBHead
    k: 50
Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0
PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5
Metric:
  name: DetMetric
  main_indicator: hmean
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./
    label_file_list:
    - M2021/M2021_label_train.txt
    ratio_list:
    - 1.0
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - CopyPaste: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 1600
        - 1600
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 4 # Important!
    num_workers: 4
    use_shared_memory: False # Important!
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./
    label_file_list:
    - M2021/M2021_label_eval.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest:
        limit_side_len: 1280
        limit_type: min
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 2
    use_shared_memory: False # Important!
profiler_options: null
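!git clone https://gitee.com/paddlepaddle/PaddleOCR.git
# Unzip the dataset
!unzip -O GB2312 data/data117381/M2021.zip
# Enter the repository first: the cp and pip commands below use paths relative to it
%cd PaddleOCR/
/home/aistudio/PaddleOCR
!cp -r ../M2021 ./M2021
# Install ppocr
!pip install fasttext==0.8.3
!pip install paddleocr --no-deps -r requirements.txt
# Provided pretrained model and config files (for reference only; using them as-is without changing the two settings highlighted above will make training fail)
!tar -xvf ../my_exps.tar -C ./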
my_exps/
my_exps/student.pdparams
my_exps/det_dianbiao_size1600_copypaste/
my_exps/det_dianbiao_size1600_copypaste/best_accuracy.pdopt
my_exps/det_dianbiao_size1600_copypaste/config.yml
my_exps/det_dianbiao_size1600_copypaste/train.log
my_exps/det_dianbiao_size1600_copypaste/best_accuracy.pdparams
my_exps/det_dianbiao_size1600_copypaste/best_accuracy.states
my_exps/det_dianbiao_size1600/
my_exps/det_dianbiao_size1600/best_accuracy.pdopt
my_exps/det_dianbiao_size1600/config.yml
my_exps/det_dianbiao_size1600/latest.pdopt
my_exps/det_dianbiao_size1600/train.log
my_exps/det_dianbiao_size1600/latest.pdparams
my_exps/det_dianbiao_size1600/best_accuracy.pdparams
my_exps/det_dianbiao_size1600/latest.states
my_exps/det_dianbiao_size1600/best_accuracy.states
# Start training (note that the config above loads my_exps/student.pdparams as pretrained weights)
!python tools/train.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_distill.yml
3.2 Validating the model
# You can also check how the provided trained model performs
!python tools/eval.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_distill.yml -o Global.checkpoints="my_exps/det_dianbiao_size1600_copypaste/best_accuracy"
[2022/01/20 01:40:18] root INFO: Architecture :
[2022/01/20 01:40:18] root INFO: Backbone :
[2022/01/20 01:40:18] root INFO: disable_se : True
[2022/01/20 01:40:18] root INFO: model_name : large
[2022/01/20 01:40:18] root INFO: name : MobileNetV3
[2022/01/20 01:40:18] root INFO: scale : 0.5
[2022/01/20 01:40:18] root INFO: Head :
[2022/01/20 01:40:18] root INFO: k : 50
[2022/01/20 01:40:18] root INFO: name : DBHead
[2022/01/20 01:40:18] root INFO: Neck :
[2022/01/20 01:40:18] root INFO: name : DBFPN
[2022/01/20 01:40:18] root INFO: out_channels : 96
[2022/01/20 01:40:18] root INFO: Transform : None
[2022/01/20 01:40:18] root INFO: algorithm : DB
[2022/01/20 01:40:18] root INFO: model_type : det
[2022/01/20 01:40:18] root INFO: Eval :
[2022/01/20 01:40:18] root INFO: dataset :
[2022/01/20 01:40:18] root INFO: data_dir : ./
[2022/01/20 01:40:18] root INFO: label_file_list : ['M2021/M2021_label_eval.txt']
[2022/01/20 01:40:18] root INFO: name : SimpleDataSet
[2022/01/20 01:40:18] root INFO: transforms :
[2022/01/20 01:40:18] root INFO: DecodeImage :
[2022/01/20 01:40:18] root INFO: channel_first : False
[2022/01/20 01:40:18] root INFO: img_mode : BGR
[2022/01/20 01:40:18] root INFO: DetLabelEncode : None
[2022/01/20 01:40:18] root INFO: DetResizeForTest :
[2022/01/20 01:40:18] root INFO: limit_side_len : 1280
[2022/01/20 01:40:18] root INFO: limit_type : min
[2022/01/20 01:40:18] root INFO: NormalizeImage :
[2022/01/20 01:40:18] root INFO: mean : [0.485, 0.456, 0.406]
[2022/01/20 01:40:18] root INFO: order : hwc
[2022/01/20 01:40:18] root INFO: scale : 1./255.
[2022/01/20 01:40:18] root INFO: std : [0.229, 0.224, 0.225]
[2022/01/20 01:40:18] root INFO: ToCHWImage : None
[2022/01/20 01:40:18] root INFO: KeepKeys :
[2022/01/20 01:40:18] root INFO: keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2022/01/20 01:40:18] root INFO: loader :
[2022/01/20 01:40:18] root INFO: batch_size_per_card : 1
[2022/01/20 01:40:18] root INFO: drop_last : False
[2022/01/20 01:40:18] root INFO: num_workers : 2
[2022/01/20 01:40:18] root INFO: shuffle : False
[2022/01/20 01:40:18] root INFO: use_shared_memory : False
[2022/01/20 01:40:18] root INFO: Global :
[2022/01/20 01:40:18] root INFO: cal_metric_during_train : False
[2022/01/20 01:40:18] root INFO: checkpoints : my_exps/det_dianbiao_size1600_copypaste/best_accuracy
[2022/01/20 01:40:18] root INFO: debug : False
[2022/01/20 01:40:18] root INFO: distributed : False
[2022/01/20 01:40:18] root INFO: epoch_num : 1200
[2022/01/20 01:40:18] root INFO: eval_batch_step : [0, 100]
[2022/01/20 01:40:18] root INFO: infer_img : M2021/台安站公寓楼段值班房间.jpg
[2022/01/20 01:40:18] root INFO: log_smooth_window : 20
[2022/01/20 01:40:18] root INFO: pretrained_model : my_exps/student.pdparams
[2022/01/20 01:40:18] root INFO: print_batch_step : 10
[2022/01/20 01:40:18] root INFO: save_epoch_step : 1200
[2022/01/20 01:40:18] root INFO: save_inference_dir : None
[2022/01/20 01:40:18] root INFO: save_model_dir : ./output/det_dianbiao_v3
[2022/01/20 01:40:18] root INFO: save_res_path : ./output/det_db/predicts_db.txt
[2022/01/20 01:40:18] root INFO: use_gpu : True
[2022/01/20 01:40:18] root INFO: use_visualdl : False
[2022/01/20 01:40:18] root INFO: Loss :
[2022/01/20 01:40:18] root INFO: alpha : 5
[2022/01/20 01:40:18] root INFO: balance_loss : True
[2022/01/20 01:40:18] root INFO: beta : 10
[2022/01/20 01:40:18] root INFO: main_loss_type : DiceLoss
[2022/01/20 01:40:18] root INFO: name : DBLoss
[2022/01/20 01:40:18] root INFO: ohem_ratio : 3
[2022/01/20 01:40:18] root INFO: Metric :
[2022/01/20 01:40:18] root INFO: main_indicator : hmean
[2022/01/20 01:40:18] root INFO: name : DetMetric
[2022/01/20 01:40:18] root INFO: Optimizer :
[2022/01/20 01:40:18] root INFO: beta1 : 0.9
[2022/01/20 01:40:18] root INFO: beta2 : 0.999
[2022/01/20 01:40:18] root INFO: lr :
[2022/01/20 01:40:18] root INFO: learning_rate : 0.0001
[2022/01/20 01:40:18] root INFO: name : Cosine
[2022/01/20 01:40:18] root INFO: warmup_epoch : 2
[2022/01/20 01:40:18] root INFO: name : Adam
[2022/01/20 01:40:18] root INFO: regularizer :
[2022/01/20 01:40:18] root INFO: factor : 0
[2022/01/20 01:40:18] root INFO: name : L2
[2022/01/20 01:40:18] root INFO: PostProcess :
[2022/01/20 01:40:18] root INFO: box_thresh : 0.6
[2022/01/20 01:40:18] root INFO: max_candidates : 1000
[2022/01/20 01:40:18] root INFO: name : DBPostProcess
[2022/01/20 01:40:18] root INFO: thresh : 0.3
[2022/01/20 01:40:18] root INFO: unclip_ratio : 1.5
[2022/01/20 01:40:18] root INFO: Train :
[2022/01/20 01:40:18] root INFO: dataset :
[2022/01/20 01:40:18] root INFO: data_dir : ./
[2022/01/20 01:40:18] root INFO: label_file_list : ['M2021/M2021_label_train.txt']
[2022/01/20 01:40:18] root INFO: name : SimpleDataSet
[2022/01/20 01:40:18] root INFO: ratio_list : [1.0]
[2022/01/20 01:40:18] root INFO: transforms :
[2022/01/20 01:40:18] root INFO: DecodeImage :
[2022/01/20 01:40:18] root INFO: channel_first : False
[2022/01/20 01:40:18] root INFO: img_mode : BGR
[2022/01/20 01:40:18] root INFO: DetLabelEncode : None
[2022/01/20 01:40:18] root INFO: CopyPaste : None
[2022/01/20 01:40:18] root INFO: IaaAugment :
[2022/01/20 01:40:18] root INFO: augmenter_args :
[2022/01/20 01:40:18] root INFO: args :
[2022/01/20 01:40:18] root INFO: p : 0.5
[2022/01/20 01:40:18] root INFO: type : Fliplr
[2022/01/20 01:40:18] root INFO: args :
[2022/01/20 01:40:18] root INFO: rotate : [-10, 10]
[2022/01/20 01:40:18] root INFO: type : Affine
[2022/01/20 01:40:18] root INFO: args :
[2022/01/20 01:40:18] root INFO: size : [0.5, 3]
[2022/01/20 01:40:18] root INFO: type : Resize
[2022/01/20 01:40:18] root INFO: EastRandomCropData :
[2022/01/20 01:40:18] root INFO: keep_ratio : True
[2022/01/20 01:40:18] root INFO: max_tries : 50
[2022/01/20 01:40:18] root INFO: size : [1600, 1600]
[2022/01/20 01:40:18] root INFO: MakeBorderMap :
[2022/01/20 01:40:18] root INFO: shrink_ratio : 0.4
[2022/01/20 01:40:18] root INFO: thresh_max : 0.7
[2022/01/20 01:40:18] root INFO: thresh_min : 0.3
[2022/01/20 01:40:18] root INFO: MakeShrinkMap :
[2022/01/20 01:40:18] root INFO: min_text_size : 8
[2022/01/20 01:40:18] root INFO: shrink_ratio : 0.4
[2022/01/20 01:40:18] root INFO: NormalizeImage :
[2022/01/20 01:40:18] root INFO: mean : [0.485, 0.456, 0.406]
[2022/01/20 01:40:18] root INFO: order : hwc
[2022/01/20 01:40:18] root INFO: scale : 1./255.
[2022/01/20 01:40:18] root INFO: std : [0.229, 0.224, 0.225]
[2022/01/20 01:40:18] root INFO: ToCHWImage : None
[2022/01/20 01:40:18] root INFO: KeepKeys :
[2022/01/20 01:40:18] root INFO: keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2022/01/20 01:40:18] root INFO: loader :
[2022/01/20 01:40:18] root INFO: batch_size_per_card : 4
[2022/01/20 01:40:18] root INFO: drop_last : False
[2022/01/20 01:40:18] root INFO: num_workers : 4
[2022/01/20 01:40:18] root INFO: shuffle : True
[2022/01/20 01:40:18] root INFO: use_shared_memory : False
[2022/01/20 01:40:18] root INFO: profiler_options : None
[2022/01/20 01:40:18] root INFO: train with paddle 2.1.2 and device CUDAPlace(0)
[2022/01/20 01:40:18] root INFO: Initialize indexs of datasets:['M2021/M2021_label_eval.txt']
W0120 01:40:18.252579 10189 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0120 01:40:18.257786 10189 device_context.cc:422] device: 0, cuDNN Version: 7.6.
[2022/01/20 01:40:23] root INFO: resume from my_exps/det_dianbiao_size1600_copypaste/best_accuracy
[2022/01/20 01:40:23] root INFO: metric in ckpt ***************
[2022/01/20 01:40:23] root INFO: hmean:0.8543046357615895
[2022/01/20 01:40:23] root INFO: precision:0.7914110429447853
[2022/01/20 01:40:23] root INFO: recall:0.9280575539568345
[2022/01/20 01:40:23] root INFO: fps:3.2759908010222296
[2022/01/20 01:40:23] root INFO: best_epoch:138
[2022/01/20 01:40:23] root INFO: start_epoch:139
eval model:: 100%|██████████████████████████████| 70/70 [01:12<00:00, 1.04s/it]
[2022/01/20 01:41:36] root INFO: metric eval ***************
[2022/01/20 01:41:36] root INFO: precision:0.7081081081081081
[2022/01/20 01:41:36] root INFO: recall:0.9225352112676056
[2022/01/20 01:41:36] root INFO: hmean:0.801223241590214
[2022/01/20 01:41:36] root INFO: fps:2.3143317331617412
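The hmean value in the log is the harmonic mean of precision and recall (the F1 score). As a quick sanity check, the short sketch below recomputes it from the two numbers printed above; the function name is chosen here for illustration.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the evaluation log above.
precision, recall = 0.7081081081081081, 0.9225352112676056
score = hmean(precision, recall)
print(round(score, 4))  # 0.8012
```

Recall is high while precision is lower, which means the model rarely misses a target field but still produces some extra boxes. That matches the spurious detection seen in the recognition output further down.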
!python tools/infer_det.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_distill.yml -o Global.infer_img="./M2021/台安站公寓楼段值班房间.jpg" -o Global.checkpoints="my_exps/det_dianbiao_size1600_copypaste/best_accuracy"
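The detection result looks great! The next step is to chain the detection model with the recognition model.
3.3 Model export and chaining
Here we take a shortcut: export the model first, then directly swap the newly trained model in for the detection model that the whl package uses for prediction. That way the detection results after finetuning are visible right away!
# Export the model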
!python tools/export_model.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_distill.yml -o Global.pretrained_model=./my_exps/det_dianbiao_size1600_copypaste/best_accuracy Global.save_inference_dir=./inference/det_db
from paddleocr import PaddleOCR, draw_ocr
# The model directory must contain the model and params files
ocr = PaddleOCR(det_model_dir='./inference/det_db',
                use_angle_cls=True)
img_path = './M2021/台安站公寓楼段值班房间.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)
[2022/01/20 02:12:49] root WARNING: version 2.1 not support cls models, use version 2.0 instead
Namespace(benchmark=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/aistudio/.paddleocr/2.2.1/ocr/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, cpu_threads=10, det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max', det_model_dir='./inference/det_db', det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_polygon=True, e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, output='./output/table', precision='fp32', process_id=0, rec=True, rec_algorithm='CRNN', rec_batch_num=6, rec_char_dict_path='/home/aistudio/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='/home/aistudio/.paddleocr/2.2.1/ocr/rec/ch/ch_PP-OCRv2_rec_infer', save_log_path='./log_output/', show_log=True, table_char_dict_path=None, table_char_type='en', table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=True, use_dilation=False, use_gpu=True, use_mp=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, version='2.1', vis_font_path='./doc/fonts/simfang.ttf', warmup=True)
[2022/01/20 02:12:52] root DEBUG: dt_boxes num : 4, elapse : 0.048544883728027344
[2022/01/20 02:12:52] root DEBUG: cls num : 4, elapse : 0.0068705081939697266
[2022/01/20 02:12:52] root DEBUG: rec_res num : 4, elapse : 0.019389867782592773
[[[1359.0, 1911.0], [2065.0, 1890.0], [2068.0, 1991.0], [1362.0, 2012.0]], ('2013207034088', 0.91313225)]
[[[1576.0, 2011.0], [1689.0, 2011.0], [1689.0, 2069.0], [1576.0, 2069.0]], ('WW', 0.5822574)]
[[[1065.0, 2131.0], [2008.0, 2131.0], [2008.0, 2366.0], [1065.0, 2366.0]], ('3127', 0.98005736)]
# Visualize the results
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
There are two ways to get rid of the extra detections:
- Adjust the thresholds
- Filter out non-numeric content in post-processing
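Both ideas can be combined in a small post-processing step. The sketch below is only one possible rule, not the author's implementation: it keeps a result when its recognition score reaches a threshold and its text is purely numeric (an optional decimal part is allowed). The function name `keep_numeric_results` and the `min_score` value are assumptions; the sample data is the three-line output printed above.

```python
import re

# Digits with an optional decimal part, e.g. "3127" or "31.27".
NUMERIC_PATTERN = re.compile(r"^\d+(\.\d+)?$")


def keep_numeric_results(results, min_score=0.6):
    """Keep only confident, purely numeric OCR results.

    Each result is expected as [box, (text, score)], as printed above.
    """
    kept = []
    for box, (text, score) in results:
        if score >= min_score and NUMERIC_PATTERN.match(text):
            kept.append((box, text, score))
    return kept


# Results copied from the output shown earlier in this section.
sample_results = [
    [[[1359.0, 1911.0], [2065.0, 1890.0], [2068.0, 1991.0], [1362.0, 2012.0]], ('2013207034088', 0.91313225)],
    [[[1576.0, 2011.0], [1689.0, 2011.0], [1689.0, 2069.0], [1576.0, 2069.0]], ('WW', 0.5822574)],
    [[[1065.0, 2131.0], [2008.0, 2131.0], [2008.0, 2366.0], [1065.0, 2366.0]], ('3127', 0.98005736)],
]
filtered = keep_numeric_results(sample_results)
print([text for _, text, _ in filtered])
```

On this sample the 'WW' box is rejected by both rules, leaving the serial number and the reading. A threshold alone would not be enough if a stray box contained digits with a high score, which is why a tight detection box still matters.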
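4 Summary
The route based on finetuning the PaddleOCR text detection model now works as well. Next, building on newly added data, we will look into rotated object detection and finetuning the recognition model. Stay tuned~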