【Hackathon】Power-Frequency Field Strength Meter Reading Recognition Based on PaddleOCR -- A Full-Pipeline Walkthrough


v1.2 2023-05-18
  • Hid part of the logs
  • Added notes on the model configs
  • Other wording improvements
v1.1 2023-05-15
  • Only part of the training data is published, due to the file-count limit
  • Only part of the models are published, due to the file-size limit

【Team name】: megemini

【GitHub id】: megemini

【Test set metrics】: not available yet

【Accuracy of the key stages】:

Model        Metric    Score
PaddleSeg    mIoU      0.9805
PaddleClas   top1      1.0
PaddleOCR    acc       0.882352422145634

【Technical approach】: use PaddleSeg/PaddleClas/PaddleOCR to build a full pipeline for reading recognition of a power-frequency field strength meter.

This article is a full-pipeline walkthrough of the following task from the 4th PaddlePaddle Hackathon (飞桨黑客马拉松第四期):

【Hackathon 4th】No.236: Power-frequency field strength meter reading recognition based on PaddleOCR

import glob
import cv2
import json
import numpy as np

import matplotlib.pyplot as plt

%matplotlib inline

Project Background

The task is:

Recognize the power-frequency electromagnetic field value and its unit in the power-frequency field image, together with the X/Y/Z values below it, and output a structured result: [ {"Info_Probe":""}, {"Freq_Set":""}, {"Freq_Main":""}, {"Val_Total":""}, {"Val_X":""}, {"Val_Y":""}, {"Val_Z":""}, {"Unit":""}, {"Field":""} ]

An illustration of the expected output:

The basic idea is to use image text recognition to extract the key information from the meter shown in the image.

The task provides 100 raw images without any labels. It is recommended to build training data with annotation tools such as PPOCRLabel and fine-tune the models; data synthesis methods can also be used to generate recognition data in bulk.

The images provided by the task have several characteristics:

  • High resolution: every image is 3840×2160
  • Despite the high resolution, the meter display occupies only a small portion of the frame
  • The key information on the screen occupies an even smaller portion
  • Two different meter models appear in the images
  • Some key fields are empty
  • The total amount of data is small
plt.imshow(cv2.imread('./work/data/train/WIN_20230220_14_56_27_Pro.jpg')[..., ::-1])
<matplotlib.image.AxesImage at 0x7f07779da850>

(Figure: the sample training image plotted above; original file main_files/main_4_1.png)

The images are large, yet the key information takes up only a tiny fraction of them, so:

  • training and recognition directly on the original images would be very slow
  • shrinking the images for training and recognition would inevitably hurt recognition accuracy

Given these data characteristics, a four-stage model is designed here:

  • Stage 1: segment the meter display screen
  • Stage 2: classify the meter display screen
  • Stage 3: generate masks for the key regions
  • Stage 4: fine-tune the recognition model

The main tools involved include, but are not limited to:

  • PaddleOCR, for the recognition model
  • PaddleClas, for classifying the meter type
  • PaddleSeg, for segmenting the display screen out of the image
  • OpenCV, for image processing
  • Label Studio, for annotating the display screen and the key information

The overall workflow consists of two parts, model training and image recognition; a rough sketch of the recognition-side flow follows.
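
This is a minimal sketch only; the callables it takes (segment_screen, crop_screen, classify_screen, key_rois, ocr) are hypothetical placeholders for the models and helpers built later in this notebook, not real APIs:

import cv2

def recognize(image_path, segment_screen, crop_screen, classify_screen, key_rois, ocr):
    # All five callables are placeholders standing in for the models trained below.
    img = cv2.imread(image_path)             # original 3840x2160 photo
    small = cv2.resize(img, (512, 512))      # stage 1 works on a downscaled copy
    mask = segment_screen(small)             # stage 1: PaddleSeg screen segmentation
    screen = crop_screen(img, mask)          # warp-crop the screen from the full-resolution photo
    meter_type = classify_screen(screen)     # stage 2: PaddleClas meter type (t0 / t1)
    fields = {}
    for name, roi in key_rois(screen, meter_type).items():  # stage 3: key-region masks per meter type
        fields[name] = ocr(roi)              # stage 4: fine-tuned PaddleOCR recognition
    return fields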

Environment Setup

First, prepare the environment by cloning and installing PaddleOCR/PaddleClas/PaddleSeg.

A few notes:

  • %%capture suppresses the cell output; if this is your first install, you can comment it out or remove it.
  • Because this notebook was not debugged in a single pass, ignore its execution-order numbers; when running it yourself, simply execute the cells from top to bottom.
  • The notebook implements the full training and recognition pipeline, but a few steps require manual work (such as annotation); pay special attention to them, otherwise the pipeline may not run through.
  • Since the amount of annotation varies from person to person, and random factors are involved, the final results may differ slightly.
%%capture

!git clone https://github.com/PaddlePaddle/PaddleOCR.git
!git clone https://github.com/PaddlePaddle/PaddleClas.git
!git clone https://github.com/PaddlePaddle/PaddleSeg.git

!pip install -r PaddleOCR/requirements.txt
!pip install -e PaddleOCR/

!pip install -r PaddleClas/requirements.txt
!pip install -e PaddleClas/

!pip install -r PaddleSeg/requirements.txt
!pip install -e PaddleSeg/

Model Training

1. Stage 1: Meter Display Screen Segmentation

1.1 Generate 512×512 Thumbnails

Since the original images are large, we first shrink them for the subsequent screen segmentation.

Although the display screen occupies only a small area of the image, its features are distinctive, so downscaling does not hurt segmentation accuracy.

!mkdir ./work/data/step_1_512_img

for filename in glob.glob('./work/data/train/*.jpg'):
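    # Read each original photo and write a 512x512 downscaled copy for the segmentation stage.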
    img = cv2.imread(filename)
    cv2.imwrite('./work/data/step_1_512_img/'+filename.split('/')[-1], cv2.resize(img, (512, 512)))

1.2 Annotate the Meter Display Region

Download the step_1_512_img images above as an archive and annotate them with Label Studio.

After annotating, upload the JSON annotation file and convert its format.

Here the Label Studio export is placed at ./work/data/step_1_512_img_anno_seg_box.json and then converted to the Labelme format.

Alternatively, you could convert directly to the PaddleSeg format, or annotate with PaddleSeg's EISeg tool instead.

1.3 Convert to the PaddleSeg Format

with open('./work/data/step_1_512_img_anno_seg_box.json') as f:
    labels = json.load(f)

file_count = 0
for data in labels:
    _data = {
        "imagePath": None,
        "shapes": [],
    }
        
    _image = data['data']['image'].split('-')[-1]
    _data['imagePath'] = _image
    for _anno in data['annotations']:
        _shape = {
            "points": [],
            "group_id": None,
            "description": "",
            "shape_type": "polygon",
            "flags": {}
        }
        for _result in _anno['result']:
            _value = _result['value']
            _shape['points'] = [
                [int(float(p[0])*512/100), int(float(p[1])*512/100)] # note: Label Studio stores points as percentages, so convert them to pixel coordinates
                for p in _value['points']]
            _shape['label'] = _value['polygonlabels'][0]

        _data['shapes'].append(_shape)
    
    _filename = _image.split('.')[0]+'.json'
    with open('./work/data/step_1_512_img/'+_filename, 'w') as f:
        json.dump(_data, f)
        file_count += 1

print('Done with {} anno files...'.format(file_count))
Done with 99 anno files...

Use PaddleSeg's built-in tool to convert the Labelme-format data above into the annotation format used by PaddleSeg.

!python ./PaddleSeg/tools/data/labelme2seg.py ./work/data/step_1_512_img

1.4 Split the PaddleSeg Training and Validation Sets

Before splitting, copy the previously generated annotation files into a new folder for easier management.

!mkdir ./work/data/step_1_seg
!mkdir ./work/data/step_1_seg/images
!mkdir ./work/data/step_1_seg/labels

!cp ./work/data/step_1_512_img/*.jpg ./work/data/step_1_seg/images/
!cp ./work/data/step_1_512_img/annotations/* ./work/data/step_1_seg/labels/

!python ./PaddleSeg/tools/data/split_dataset_list.py \
    ./work/data/step_1_seg \
    images \
    labels \
    --split 0.7 0.3 0 --format jpg png
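
To sanity-check the split (assuming train.txt and val.txt were written under ./work/data/step_1_seg), you can print the size of each list and peek at the first entry:

for split in ('train', 'val'):
    with open('./work/data/step_1_seg/{}.txt'.format(split)) as f:
        lines = f.read().splitlines()
    print(split, len(lines), lines[0] if lines else '(empty)')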

1.5 Train the PaddleSeg Model

PP-LiteSeg is used for training. The key parts of the config file are:

batch_size: 4
iters: 1000

train_dataset:
  type: Dataset
  dataset_root: /home/aistudio/work/data/step_1_seg # dataset root directory
  train_path: /home/aistudio/work/data/step_1_seg/train.txt # generated training list
  num_classes: 2 # two classes (background / screen)
  mode: train
...

val_dataset:
  type: Dataset
  dataset_root: /home/aistudio/work/data/step_1_seg # dataset root directory
  val_path: /home/aistudio/work/data/step_1_seg/val.txt # generated validation list
  num_classes: 2
...

model: # the model to use
  type: PPLiteSeg
  backbone:
    type: STDC2
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz

!python ./PaddleSeg/tools/train.py \
       --config ./work/configs/pp_liteseg_optic_disc_512x512_1k.yml \
       --do_eval \
       --use_vdl \
       --save_interval 1000 \
       --save_dir output

/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/sklearn/utils/multiclass.py:14: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
  from scipy.sparse.base import spmatrix
2023-04-25 22:32:55 [WARNING]	Add the `num_classes` in train_dataset and val_dataset config to model config. We suggest you manually set `num_classes` in model config.
2023-04-25 22:32:55 [INFO]	
------------Environment Information-------------
platform: Linux-4.15.0-140-generic-x86_64-with-glibc2.23
Python: 3.9.16 (main, Jan 11 2023, 16:05:54) [GCC 11.2.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
cudnn: 8.2
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: Tesla V100-SXM2-16GB']
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
PaddleSeg: 2.8.0
PaddlePaddle: 2.4.1
OpenCV: 4.5.5
------------------------------------------------
2023-04-25 22:32:55 [INFO]	
---------------Config Information---------------
batch_size: 4
iters: 1000
train_dataset:
  dataset_root: /home/aistudio/work/data/step_1_seg
  mode: train
  num_classes: 2
  train_path: /home/aistudio/work/data/step_1_seg/train.txt
  transforms:
  - max_scale_factor: 2.0
    min_scale_factor: 0.5
    scale_step_size: 0.25
    type: ResizeStepScaling
  - crop_size:
    - 512
    - 512
    type: RandomPaddingCrop
  - type: RandomHorizontalFlip
  - brightness_range: 0.5
    contrast_range: 0.5
    saturation_range: 0.5
    type: RandomDistort
  - type: Normalize
  type: Dataset
val_dataset:
  dataset_root: /home/aistudio/work/data/step_1_seg
  mode: val
  num_classes: 2
  transforms:
  - type: Normalize
  type: Dataset
  val_path: /home/aistudio/work/data/step_1_seg/val.txt
optimizer:
  momentum: 0.9
  type: SGD
  weight_decay: 4.0e-05
lr_scheduler:
  end_lr: 0
  learning_rate: 0.01
  power: 0.9
  type: PolynomialDecay
loss:
  coef:
  - 1
  - 1
  - 1
  types:
  - type: CrossEntropyLoss
  - type: CrossEntropyLoss
  - type: CrossEntropyLoss
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
------------------------------------------------

2023-04-25 22:32:55 [INFO]	Set device: gpu
2023-04-25 22:32:55 [INFO]	Use the following config to build model
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
W0425 22:32:55.275354  9607 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0425 22:32:55.275403  9607 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
2023-04-25 22:32:58 [INFO]	Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
Connecting to https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
Downloading PP_STDCNet2.tar.gz
[==================================================] 100.00%
Uncompress PP_STDCNet2.tar.gz
[==================================================] 100.00%
2023-04-25 22:33:31 [INFO]	There are 265/265 variables loaded into STDCNet.
2023-04-25 22:33:32 [INFO]	Use the following config to build train_dataset
train_dataset:
  dataset_root: /home/aistudio/work/data/step_1_seg
  mode: train
  num_classes: 2
  train_path: /home/aistudio/work/data/step_1_seg/train.txt
  transforms:
  - max_scale_factor: 2.0
    min_scale_factor: 0.5
    scale_step_size: 0.25
    type: ResizeStepScaling
  - crop_size:
    - 512
    - 512
    type: RandomPaddingCrop
  - type: RandomHorizontalFlip
  - brightness_range: 0.5
    contrast_range: 0.5
    saturation_range: 0.5
    type: RandomDistort
  - type: Normalize
  type: Dataset
2023-04-25 22:33:32 [INFO]	Use the following config to build val_dataset
val_dataset:
  dataset_root: /home/aistudio/work/data/step_1_seg
  mode: val
  num_classes: 2
  transforms:
  - type: Normalize
  type: Dataset
  val_path: /home/aistudio/work/data/step_1_seg/val.txt
2023-04-25 22:33:32 [INFO]	If the type is SGD and momentum in optimizer config, the type is changed to Momentum.
2023-04-25 22:33:32 [INFO]	Use the following config to build optimizer
optimizer:
  momentum: 0.9
  type: Momentum
  weight_decay: 4.0e-05
2023-04-25 22:33:32 [INFO]	Use the following config to build loss
loss:
  coef:
  - 1
  - 1
  - 1
  types:
  - type: CrossEntropyLoss
  - type: CrossEntropyLoss
  - type: CrossEntropyLoss
/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/paddle/nn/layer/norm.py:711: UserWarning: When training, we now always track global mean and variance.
  warnings.warn(
/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/paddle/fluid/dygraph/math_op_patch.py:275: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.int64, the right dtype will convert to paddle.float32
  warnings.warn(
2023-04-25 22:33:35 [INFO]	[TRAIN] epoch: 1, iter: 10/1000, loss: 1.6793, lr: 0.009919, batch_cost: 0.2874, reader_cost: 0.03260, ips: 13.9177 samples/sec | ETA 00:04:44
2023-04-25 22:33:36 [INFO]	[TRAIN] epoch: 2, iter: 20/1000, loss: 0.7097, lr: 0.009829, batch_cost: 0.1290, reader_cost: 0.04338, ips: 31.0177 samples/sec | ETA 00:02:06
2023-04-25 22:33:38 [INFO]	[TRAIN] epoch: 2, iter: 30/1000, loss: 0.5740, lr: 0.009739, batch_cost: 0.1360, reader_cost: 0.05363, ips: 29.4115 samples/sec | ETA 00:02:11
2023-04-25 22:33:39 [INFO]	[TRAIN] epoch: 3, iter: 40/1000, loss: 0.4438, lr: 0.009648, batch_cost: 0.1449, reader_cost: 0.05465, ips: 27.6070 samples/sec | ETA 00:02:19
2023-04-25 22:33:40 [INFO]	[TRAIN] epoch: 3, iter: 50/1000, loss: 0.4522, lr: 0.009558, batch_cost: 0.1367, reader_cost: 0.05543, ips: 29.2611 samples/sec | ETA 00:02:09
2023-04-25 22:33:42 [INFO]	[TRAIN] epoch: 4, iter: 60/1000, loss: 0.5030, lr: 0.009467, batch_cost: 0.1581, reader_cost: 0.07444, ips: 25.2939 samples/sec | ETA 00:02:28
2023-04-25 22:33:43 [INFO]	[TRAIN] epoch: 5, iter: 70/1000, loss: 0.2645, lr: 0.009377, batch_cost: 0.1393, reader_cost: 0.05388, ips: 28.7212 samples/sec | ETA 00:02:09
2023-04-25 22:33:45 [INFO]	[TRAIN] epoch: 5, iter: 80/1000, loss: 0.2792, lr: 0.009286, batch_cost: 0.1433, reader_cost: 0.05779, ips: 27.9046 samples/sec | ETA 00:02:11
2023-04-25 22:33:46 [INFO]	[TRAIN] epoch: 6, iter: 90/1000, loss: 0.3271, lr: 0.009195, batch_cost: 0.1394, reader_cost: 0.05749, ips: 28.6847 samples/sec | ETA 00:02:06
2023-04-25 22:33:48 [INFO]	[TRAIN] epoch: 6, iter: 100/1000, loss: 0.5220, lr: 0.009104, batch_cost: 0.1659, reader_cost: 0.05767, ips: 24.1043 samples/sec | ETA 00:02:29
2023-04-25 22:33:50 [INFO]	[TRAIN] epoch: 7, iter: 110/1000, loss: 0.2639, lr: 0.009013, batch_cost: 0.1711, reader_cost: 0.04379, ips: 23.3731 samples/sec | ETA 00:02:32
2023-04-25 22:33:51 [INFO]	[TRAIN] epoch: 8, iter: 120/1000, loss: 0.2142, lr: 0.008922, batch_cost: 0.1599, reader_cost: 0.07016, ips: 25.0096 samples/sec | ETA 00:02:20
2023-04-25 22:33:52 [INFO]	[TRAIN] epoch: 8, iter: 130/1000, loss: 0.2008, lr: 0.008831, batch_cost: 0.1262, reader_cost: 0.04760, ips: 31.7059 samples/sec | ETA 00:01:49
2023-04-25 22:33:54 [INFO]	[TRAIN] epoch: 9, iter: 140/1000, loss: 0.2218, lr: 0.008740, batch_cost: 0.1441, reader_cost: 0.05551, ips: 27.7641 samples/sec | ETA 00:02:03
2023-04-25 22:33:55 [INFO]	[TRAIN] epoch: 9, iter: 150/1000, loss: 0.1409, lr: 0.008648, batch_cost: 0.1264, reader_cost: 0.03781, ips: 31.6497 samples/sec | ETA 00:01:47
2023-04-25 22:33:57 [INFO]	[TRAIN] epoch: 10, iter: 160/1000, loss: 0.2484, lr: 0.008557, batch_cost: 0.1444, reader_cost: 0.05949, ips: 27.6954 samples/sec | ETA 00:02:01
2023-04-25 22:33:58 [INFO]	[TRAIN] epoch: 10, iter: 170/1000, loss: 0.1401, lr: 0.008465, batch_cost: 0.1413, reader_cost: 0.05913, ips: 28.2990 samples/sec | ETA 00:01:57
2023-04-25 22:33:59 [INFO]	[TRAIN] epoch: 11, iter: 180/1000, loss: 0.1379, lr: 0.008374, batch_cost: 0.1394, reader_cost: 0.05544, ips: 28.6957 samples/sec | ETA 00:01:54
2023-04-25 22:34:01 [INFO]	[TRAIN] epoch: 12, iter: 190/1000, loss: 0.1136, lr: 0.008282, batch_cost: 0.1512, reader_cost: 0.06950, ips: 26.4473 samples/sec | ETA 00:02:02
2023-04-25 22:34:02 [INFO]	[TRAIN] epoch: 12, iter: 200/1000, loss: 0.2589, lr: 0.008190, batch_cost: 0.1285, reader_cost: 0.05290, ips: 31.1383 samples/sec | ETA 00:01:42
2023-04-25 22:34:04 [INFO]	[TRAIN] epoch: 13, iter: 210/1000, loss: 0.1388, lr: 0.008098, batch_cost: 0.1358, reader_cost: 0.05745, ips: 29.4509 samples/sec | ETA 00:01:47
2023-04-25 22:34:05 [INFO]	[TRAIN] epoch: 13, iter: 220/1000, loss: 0.1498, lr: 0.008005, batch_cost: 0.1343, reader_cost: 0.05499, ips: 29.7885 samples/sec | ETA 00:01:44
2023-04-25 22:34:06 [INFO]	[TRAIN] epoch: 14, iter: 230/1000, loss: 0.1544, lr: 0.007913, batch_cost: 0.1584, reader_cost: 0.08448, ips: 25.2480 samples/sec | ETA 00:02:01
2023-04-25 22:34:08 [INFO]	[TRAIN] epoch: 15, iter: 240/1000, loss: 0.1084, lr: 0.007821, batch_cost: 0.1395, reader_cost: 0.05866, ips: 28.6759 samples/sec | ETA 00:01:46
2023-04-25 22:34:09 [INFO]	[TRAIN] epoch: 15, iter: 250/1000, loss: 0.0994, lr: 0.007728, batch_cost: 0.1367, reader_cost: 0.06072, ips: 29.2551 samples/sec | ETA 00:01:42
2023-04-25 22:34:11 [INFO]	[TRAIN] epoch: 16, iter: 260/1000, loss: 0.1247, lr: 0.007635, batch_cost: 0.1304, reader_cost: 0.05398, ips: 30.6764 samples/sec | ETA 00:01:36
2023-04-25 22:34:12 [INFO]	[TRAIN] epoch: 16, iter: 270/1000, loss: 0.0935, lr: 0.007543, batch_cost: 0.1235, reader_cost: 0.04477, ips: 32.3877 samples/sec | ETA 00:01:30
2023-04-25 22:34:13 [INFO]	[TRAIN] epoch: 17, iter: 280/1000, loss: 0.1019, lr: 0.007450, batch_cost: 0.1497, reader_cost: 0.07192, ips: 26.7180 samples/sec | ETA 00:01:47
2023-04-25 22:34:15 [INFO]	[TRAIN] epoch: 18, iter: 290/1000, loss: 0.2307, lr: 0.007357, batch_cost: 0.1551, reader_cost: 0.07459, ips: 25.7940 samples/sec | ETA 00:01:50
2023-04-25 22:34:16 [INFO]	[TRAIN] epoch: 18, iter: 300/1000, loss: 0.2091, lr: 0.007264, batch_cost: 0.1499, reader_cost: 0.06607, ips: 26.6801 samples/sec | ETA 00:01:44
2023-04-25 22:34:18 [INFO]	[TRAIN] epoch: 19, iter: 310/1000, loss: 0.1146, lr: 0.007170, batch_cost: 0.1504, reader_cost: 0.06960, ips: 26.5922 samples/sec | ETA 00:01:43
2023-04-25 22:34:19 [INFO]	[TRAIN] epoch: 19, iter: 320/1000, loss: 0.1001, lr: 0.007077, batch_cost: 0.1340, reader_cost: 0.04819, ips: 29.8484 samples/sec | ETA 00:01:31
2023-04-25 22:34:21 [INFO]	[TRAIN] epoch: 20, iter: 330/1000, loss: 0.1024, lr: 0.006983, batch_cost: 0.1634, reader_cost: 0.05647, ips: 24.4808 samples/sec | ETA 00:01:49
2023-04-25 22:34:23 [INFO]	[TRAIN] epoch: 20, iter: 340/1000, loss: 0.1425, lr: 0.006889, batch_cost: 0.1876, reader_cost: 0.06904, ips: 21.3171 samples/sec | ETA 00:02:03
2023-04-25 22:34:24 [INFO]	[TRAIN] epoch: 21, iter: 350/1000, loss: 0.1032, lr: 0.006796, batch_cost: 0.1342, reader_cost: 0.03127, ips: 29.7997 samples/sec | ETA 00:01:27
2023-04-25 22:34:25 [INFO]	[TRAIN] epoch: 22, iter: 360/1000, loss: 0.0990, lr: 0.006702, batch_cost: 0.1439, reader_cost: 0.06966, ips: 27.7995 samples/sec | ETA 00:01:32
2023-04-25 22:34:27 [INFO]	[TRAIN] epoch: 22, iter: 370/1000, loss: 0.0917, lr: 0.006607, batch_cost: 0.1444, reader_cost: 0.06935, ips: 27.7023 samples/sec | ETA 00:01:30
2023-04-25 22:34:28 [INFO]	[TRAIN] epoch: 23, iter: 380/1000, loss: 0.0958, lr: 0.006513, batch_cost: 0.1443, reader_cost: 0.06084, ips: 27.7266 samples/sec | ETA 00:01:29
2023-04-25 22:34:30 [INFO]	[TRAIN] epoch: 23, iter: 390/1000, loss: 0.1198, lr: 0.006419, batch_cost: 0.1447, reader_cost: 0.06019, ips: 27.6463 samples/sec | ETA 00:01:28
2023-04-25 22:34:31 [INFO]	[TRAIN] epoch: 24, iter: 400/1000, loss: 0.0893, lr: 0.006324, batch_cost: 0.1454, reader_cost: 0.05991, ips: 27.5036 samples/sec | ETA 00:01:27
2023-04-25 22:34:33 [INFO]	[TRAIN] epoch: 25, iter: 410/1000, loss: 0.0834, lr: 0.006229, batch_cost: 0.1511, reader_cost: 0.06680, ips: 26.4769 samples/sec | ETA 00:01:29
2023-04-25 22:34:34 [INFO]	[TRAIN] epoch: 25, iter: 420/1000, loss: 0.0823, lr: 0.006134, batch_cost: 0.1356, reader_cost: 0.04668, ips: 29.5051 samples/sec | ETA 00:01:18
2023-04-25 22:34:36 [INFO]	[TRAIN] epoch: 26, iter: 430/1000, loss: 0.1229, lr: 0.006039, batch_cost: 0.1578, reader_cost: 0.05383, ips: 25.3446 samples/sec | ETA 00:01:29
2023-04-25 22:34:37 [INFO]	[TRAIN] epoch: 26, iter: 440/1000, loss: 0.0793, lr: 0.005944, batch_cost: 0.1369, reader_cost: 0.04918, ips: 29.2133 samples/sec | ETA 00:01:16
2023-04-25 22:34:38 [INFO]	[TRAIN] epoch: 27, iter: 450/1000, loss: 0.1107, lr: 0.005848, batch_cost: 0.1338, reader_cost: 0.04761, ips: 29.8921 samples/sec | ETA 00:01:13
2023-04-25 22:34:40 [INFO]	[TRAIN] epoch: 28, iter: 460/1000, loss: 0.0778, lr: 0.005753, batch_cost: 0.1340, reader_cost: 0.04080, ips: 29.8578 samples/sec | ETA 00:01:12
2023-04-25 22:34:41 [INFO]	[TRAIN] epoch: 28, iter: 470/1000, loss: 0.0998, lr: 0.005657, batch_cost: 0.1426, reader_cost: 0.05288, ips: 28.0472 samples/sec | ETA 00:01:15
2023-04-25 22:34:43 [INFO]	[TRAIN] epoch: 29, iter: 480/1000, loss: 0.0810, lr: 0.005561, batch_cost: 0.1376, reader_cost: 0.06053, ips: 29.0793 samples/sec | ETA 00:01:11
2023-04-25 22:34:44 [INFO]	[TRAIN] epoch: 29, iter: 490/1000, loss: 0.0922, lr: 0.005465, batch_cost: 0.1290, reader_cost: 0.04236, ips: 31.0198 samples/sec | ETA 00:01:05
2023-04-25 22:34:45 [INFO]	[TRAIN] epoch: 30, iter: 500/1000, loss: 0.1185, lr: 0.005369, batch_cost: 0.1579, reader_cost: 0.06826, ips: 25.3382 samples/sec | ETA 00:01:18
2023-04-25 22:34:47 [INFO]	[TRAIN] epoch: 30, iter: 510/1000, loss: 0.0777, lr: 0.005272, batch_cost: 0.1365, reader_cost: 0.05412, ips: 29.3026 samples/sec | ETA 00:01:06
2023-04-25 22:34:48 [INFO]	[TRAIN] epoch: 31, iter: 520/1000, loss: 0.0773, lr: 0.005175, batch_cost: 0.1479, reader_cost: 0.06829, ips: 27.0383 samples/sec | ETA 00:01:11
2023-04-25 22:34:50 [INFO]	[TRAIN] epoch: 32, iter: 530/1000, loss: 0.0850, lr: 0.005078, batch_cost: 0.1400, reader_cost: 0.05743, ips: 28.5705 samples/sec | ETA 00:01:05
2023-04-25 22:34:51 [INFO]	[TRAIN] epoch: 32, iter: 540/1000, loss: 0.0776, lr: 0.004981, batch_cost: 0.1385, reader_cost: 0.04902, ips: 28.8886 samples/sec | ETA 00:01:03
2023-04-25 22:34:53 [INFO]	[TRAIN] epoch: 33, iter: 550/1000, loss: 0.0841, lr: 0.004884, batch_cost: 0.1425, reader_cost: 0.05928, ips: 28.0780 samples/sec | ETA 00:01:04
2023-04-25 22:34:54 [INFO]	[TRAIN] epoch: 33, iter: 560/1000, loss: 0.0730, lr: 0.004786, batch_cost: 0.1721, reader_cost: 0.06035, ips: 23.2459 samples/sec | ETA 00:01:15
2023-04-25 22:34:56 [INFO]	[TRAIN] epoch: 34, iter: 570/1000, loss: 0.0646, lr: 0.004688, batch_cost: 0.1715, reader_cost: 0.04614, ips: 23.3203 samples/sec | ETA 00:01:13
2023-04-25 22:34:57 [INFO]	[TRAIN] epoch: 35, iter: 580/1000, loss: 0.0851, lr: 0.004590, batch_cost: 0.1541, reader_cost: 0.06807, ips: 25.9596 samples/sec | ETA 00:01:04
2023-04-25 22:34:59 [INFO]	[TRAIN] epoch: 35, iter: 590/1000, loss: 0.0865, lr: 0.004492, batch_cost: 0.1393, reader_cost: 0.05610, ips: 28.7122 samples/sec | ETA 00:00:57
2023-04-25 22:35:00 [INFO]	[TRAIN] epoch: 36, iter: 600/1000, loss: 0.3419, lr: 0.004394, batch_cost: 0.1492, reader_cost: 0.06092, ips: 26.8164 samples/sec | ETA 00:00:59
2023-04-25 22:35:02 [INFO]	[TRAIN] epoch: 36, iter: 610/1000, loss: 0.1953, lr: 0.004295, batch_cost: 0.1304, reader_cost: 0.04788, ips: 30.6673 samples/sec | ETA 00:00:50
2023-04-25 22:35:03 [INFO]	[TRAIN] epoch: 37, iter: 620/1000, loss: 0.0932, lr: 0.004196, batch_cost: 0.1429, reader_cost: 0.05847, ips: 28.0001 samples/sec | ETA 00:00:54
2023-04-25 22:35:05 [INFO]	[TRAIN] epoch: 38, iter: 630/1000, loss: 0.0840, lr: 0.004097, batch_cost: 0.1399, reader_cost: 0.05385, ips: 28.5991 samples/sec | ETA 00:00:51
2023-04-25 22:35:06 [INFO]	[TRAIN] epoch: 38, iter: 640/1000, loss: 0.0852, lr: 0.003997, batch_cost: 0.1383, reader_cost: 0.05582, ips: 28.9321 samples/sec | ETA 00:00:49
2023-04-25 22:35:07 [INFO]	[TRAIN] epoch: 39, iter: 650/1000, loss: 0.1017, lr: 0.003897, batch_cost: 0.1523, reader_cost: 0.07285, ips: 26.2554 samples/sec | ETA 00:00:53
2023-04-25 22:35:09 [INFO]	[TRAIN] epoch: 39, iter: 660/1000, loss: 0.0920, lr: 0.003797, batch_cost: 0.1436, reader_cost: 0.05406, ips: 27.8561 samples/sec | ETA 00:00:48
2023-04-25 22:35:10 [INFO]	[TRAIN] epoch: 40, iter: 670/1000, loss: 0.0827, lr: 0.003697, batch_cost: 0.1483, reader_cost: 0.06632, ips: 26.9804 samples/sec | ETA 00:00:48
2023-04-25 22:35:12 [INFO]	[TRAIN] epoch: 40, iter: 680/1000, loss: 0.0793, lr: 0.003596, batch_cost: 0.1311, reader_cost: 0.04346, ips: 30.5098 samples/sec | ETA 00:00:41
2023-04-25 22:35:13 [INFO]	[TRAIN] epoch: 41, iter: 690/1000, loss: 0.1037, lr: 0.003495, batch_cost: 0.1391, reader_cost: 0.05895, ips: 28.7545 samples/sec | ETA 00:00:43
2023-04-25 22:35:14 [INFO]	[TRAIN] epoch: 42, iter: 700/1000, loss: 0.0843, lr: 0.003394, batch_cost: 0.1390, reader_cost: 0.03824, ips: 28.7751 samples/sec | ETA 00:00:41
2023-04-25 22:35:16 [INFO]	[TRAIN] epoch: 42, iter: 710/1000, loss: 0.1010, lr: 0.003292, batch_cost: 0.1383, reader_cost: 0.04830, ips: 28.9223 samples/sec | ETA 00:00:40
2023-04-25 22:35:17 [INFO]	[TRAIN] epoch: 43, iter: 720/1000, loss: 0.1004, lr: 0.003190, batch_cost: 0.1369, reader_cost: 0.04818, ips: 29.2279 samples/sec | ETA 00:00:38
2023-04-25 22:35:19 [INFO]	[TRAIN] epoch: 43, iter: 730/1000, loss: 0.0815, lr: 0.003088, batch_cost: 0.1356, reader_cost: 0.05024, ips: 29.4953 samples/sec | ETA 00:00:36
2023-04-25 22:35:20 [INFO]	[TRAIN] epoch: 44, iter: 740/1000, loss: 0.0849, lr: 0.002985, batch_cost: 0.1439, reader_cost: 0.05842, ips: 27.7970 samples/sec | ETA 00:00:37
2023-04-25 22:35:21 [INFO]	[TRAIN] epoch: 45, iter: 750/1000, loss: 0.0837, lr: 0.002882, batch_cost: 0.1466, reader_cost: 0.06697, ips: 27.2888 samples/sec | ETA 00:00:36
2023-04-25 22:35:23 [INFO]	[TRAIN] epoch: 45, iter: 760/1000, loss: 0.0739, lr: 0.002779, batch_cost: 0.1267, reader_cost: 0.04291, ips: 31.5647 samples/sec | ETA 00:00:30
2023-04-25 22:35:24 [INFO]	[TRAIN] epoch: 46, iter: 770/1000, loss: 0.0783, lr: 0.002675, batch_cost: 0.1519, reader_cost: 0.07287, ips: 26.3350 samples/sec | ETA 00:00:34
2023-04-25 22:35:26 [INFO]	[TRAIN] epoch: 46, iter: 780/1000, loss: 0.0715, lr: 0.002570, batch_cost: 0.1392, reader_cost: 0.05228, ips: 28.7448 samples/sec | ETA 00:00:30
2023-04-25 22:35:28 [INFO]	[TRAIN] epoch: 47, iter: 790/1000, loss: 0.0793, lr: 0.002465, batch_cost: 0.1912, reader_cost: 0.05838, ips: 20.9251 samples/sec | ETA 00:00:40
2023-04-25 22:35:29 [INFO]	[TRAIN] epoch: 48, iter: 800/1000, loss: 0.0768, lr: 0.002360, batch_cost: 0.1843, reader_cost: 0.07804, ips: 21.7002 samples/sec | ETA 00:00:36
2023-04-25 22:35:31 [INFO]	[TRAIN] epoch: 48, iter: 810/1000, loss: 0.0774, lr: 0.002254, batch_cost: 0.1381, reader_cost: 0.05504, ips: 28.9590 samples/sec | ETA 00:00:26
2023-04-25 22:35:32 [INFO]	[TRAIN] epoch: 49, iter: 820/1000, loss: 0.0726, lr: 0.002147, batch_cost: 0.1343, reader_cost: 0.05536, ips: 29.7925 samples/sec | ETA 00:00:24
2023-04-25 22:35:33 [INFO]	[TRAIN] epoch: 49, iter: 830/1000, loss: 0.0667, lr: 0.002040, batch_cost: 0.1383, reader_cost: 0.04428, ips: 28.9250 samples/sec | ETA 00:00:23
2023-04-25 22:35:35 [INFO]	[TRAIN] epoch: 50, iter: 840/1000, loss: 0.0660, lr: 0.001933, batch_cost: 0.1445, reader_cost: 0.06286, ips: 27.6774 samples/sec | ETA 00:00:23
2023-04-25 22:35:36 [INFO]	[TRAIN] epoch: 50, iter: 850/1000, loss: 0.0689, lr: 0.001824, batch_cost: 0.1337, reader_cost: 0.04738, ips: 29.9092 samples/sec | ETA 00:00:20
2023-04-25 22:35:38 [INFO]	[TRAIN] epoch: 51, iter: 860/1000, loss: 0.0762, lr: 0.001715, batch_cost: 0.1514, reader_cost: 0.06967, ips: 26.4201 samples/sec | ETA 00:00:21
2023-04-25 22:35:39 [INFO]	[TRAIN] epoch: 52, iter: 870/1000, loss: 0.0877, lr: 0.001605, batch_cost: 0.1389, reader_cost: 0.04366, ips: 28.7909 samples/sec | ETA 00:00:18
2023-04-25 22:35:41 [INFO]	[TRAIN] epoch: 52, iter: 880/1000, loss: 0.0854, lr: 0.001495, batch_cost: 0.1606, reader_cost: 0.06186, ips: 24.9000 samples/sec | ETA 00:00:19
2023-04-25 22:35:42 [INFO]	[TRAIN] epoch: 53, iter: 890/1000, loss: 0.0675, lr: 0.001383, batch_cost: 0.1587, reader_cost: 0.07349, ips: 25.1991 samples/sec | ETA 00:00:17
2023-04-25 22:35:44 [INFO]	[TRAIN] epoch: 53, iter: 900/1000, loss: 0.0893, lr: 0.001270, batch_cost: 0.1424, reader_cost: 0.05182, ips: 28.0810 samples/sec | ETA 00:00:14
2023-04-25 22:35:45 [INFO]	[TRAIN] epoch: 54, iter: 910/1000, loss: 0.0715, lr: 0.001156, batch_cost: 0.1525, reader_cost: 0.07255, ips: 26.2345 samples/sec | ETA 00:00:13
2023-04-25 22:35:47 [INFO]	[TRAIN] epoch: 55, iter: 920/1000, loss: 0.0730, lr: 0.001041, batch_cost: 0.1356, reader_cost: 0.05494, ips: 29.4905 samples/sec | ETA 00:00:10
2023-04-25 22:35:48 [INFO]	[TRAIN] epoch: 55, iter: 930/1000, loss: 0.0956, lr: 0.000925, batch_cost: 0.1367, reader_cost: 0.05587, ips: 29.2579 samples/sec | ETA 00:00:09
2023-04-25 22:35:50 [INFO]	[TRAIN] epoch: 56, iter: 940/1000, loss: 0.0712, lr: 0.000807, batch_cost: 0.1732, reader_cost: 0.08836, ips: 23.1004 samples/sec | ETA 00:00:10
2023-04-25 22:35:51 [INFO]	[TRAIN] epoch: 56, iter: 950/1000, loss: 0.0799, lr: 0.000687, batch_cost: 0.1522, reader_cost: 0.07004, ips: 26.2861 samples/sec | ETA 00:00:07
2023-04-25 22:35:53 [INFO]	[TRAIN] epoch: 57, iter: 960/1000, loss: 0.0679, lr: 0.000564, batch_cost: 0.1349, reader_cost: 0.05452, ips: 29.6436 samples/sec | ETA 00:00:05
2023-04-25 22:35:54 [INFO]	[TRAIN] epoch: 58, iter: 970/1000, loss: 0.0657, lr: 0.000439, batch_cost: 0.1495, reader_cost: 0.07090, ips: 26.7582 samples/sec | ETA 00:00:04
2023-04-25 22:35:56 [INFO]	[TRAIN] epoch: 58, iter: 980/1000, loss: 0.0801, lr: 0.000309, batch_cost: 0.1426, reader_cost: 0.06263, ips: 28.0562 samples/sec | ETA 00:00:02
2023-04-25 22:35:57 [INFO]	[TRAIN] epoch: 59, iter: 990/1000, loss: 0.0775, lr: 0.000173, batch_cost: 0.1274, reader_cost: 0.04774, ips: 31.3977 samples/sec | ETA 00:00:01
2023-04-25 22:35:58 [INFO]	[TRAIN] epoch: 59, iter: 1000/1000, loss: 0.0695, lr: 0.000020, batch_cost: 0.1441, reader_cost: 0.05858, ips: 27.7678 samples/sec | ETA 00:00:00
2023-04-25 22:35:58 [INFO]	Start evaluating (total_samples: 30, total_iters: 30)...
30/30 [==============================] - 1s 35ms/step - batch_cost: 0.0350 - reader cost: 9.6032e-04
2023-04-25 22:35:59 [INFO]	[EVAL] #Images: 30 mIoU: 0.9805 Acc: 0.9947 Kappa: 0.9803 Dice: 0.9901
2023-04-25 22:35:59 [INFO]	[EVAL] Class IoU: 
[0.9937 0.9674]
2023-04-25 22:35:59 [INFO]	[EVAL] Class Precision: 
[0.9974 0.9807]
2023-04-25 22:35:59 [INFO]	[EVAL] Class Recall: 
[0.9963 0.9862]
2023-04-25 22:36:00 [INFO]	[EVAL] The model with the best validation mIoU (0.9805) was saved at iter 1000.
<class 'paddle.nn.layer.conv.Conv2D'>'s flops has been counted
<class 'paddle.nn.layer.norm.BatchNorm2D'>'s flops has been counted
<class 'paddle.nn.layer.activation.ReLU'>'s flops has been counted
<class 'paddle.nn.layer.pooling.AvgPool2D'>'s flops has been counted
<class 'paddle.nn.layer.pooling.AdaptiveAvgPool2D'>'s flops has been counted
Total Flops: 9643807616     Total Params: 12251410

With an mIoU of 0.9805, the segmentation model separates the meter display screen very well.

1.6 Generate Screen Crops

First, run prediction on all the images to obtain the candidate screen regions.

!python ./PaddleSeg/tools/predict.py \
       --config ./work/configs/pp_liteseg_optic_disc_512x512_1k.yml \
       --model_path ./output/best_model/model.pdparams \
       --image_path ./work/data/step_1_seg/images \
       --save_dir ./work/data/

/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/sklearn/utils/multiclass.py:14: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
  from scipy.sparse.base import spmatrix
2023-04-25 22:43:02 [WARNING]	Add the `num_classes` in train_dataset and val_dataset config to model config. We suggest you manually set `num_classes` in model config.
2023-04-25 22:43:02 [INFO]	
------------Environment Information-------------
platform: Linux-4.15.0-140-generic-x86_64-with-glibc2.23
Python: 3.9.16 (main, Jan 11 2023, 16:05:54) [GCC 11.2.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
cudnn: 8.2
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: Tesla V100-SXM2-16GB']
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
PaddleSeg: 2.8.0
PaddlePaddle: 2.4.1
OpenCV: 4.5.5
------------------------------------------------
2023-04-25 22:43:02 [INFO]	
---------------Config Information---------------
batch_size: 4
iters: 1000
train_dataset:
  dataset_root: /home/aistudio/work/data/step_1_seg
  mode: train
  num_classes: 2
  train_path: /home/aistudio/work/data/step_1_seg/train.txt
  transforms:
  - max_scale_factor: 2.0
    min_scale_factor: 0.5
    scale_step_size: 0.25
    type: ResizeStepScaling
  - crop_size:
    - 512
    - 512
    type: RandomPaddingCrop
  - type: RandomHorizontalFlip
  - brightness_range: 0.5
    contrast_range: 0.5
    saturation_range: 0.5
    type: RandomDistort
  - type: Normalize
  type: Dataset
val_dataset:
  dataset_root: /home/aistudio/work/data/step_1_seg
  mode: val
  num_classes: 2
  transforms:
  - type: Normalize
  type: Dataset
  val_path: /home/aistudio/work/data/step_1_seg/val.txt
optimizer:
  momentum: 0.9
  type: SGD
  weight_decay: 4.0e-05
lr_scheduler:
  end_lr: 0
  learning_rate: 0.01
  power: 0.9
  type: PolynomialDecay
loss:
  coef:
  - 1
  - 1
  - 1
  types:
  - type: CrossEntropyLoss
  - type: CrossEntropyLoss
  - type: CrossEntropyLoss
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
------------------------------------------------

2023-04-25 22:43:02 [INFO]	Set device: gpu
2023-04-25 22:43:02 [INFO]	Use the following config to build model
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
W0425 22:43:02.236009 10899 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0425 22:43:02.236068 10899 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
2023-04-25 22:43:04 [INFO]	Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
2023-04-25 22:43:04 [INFO]	There are 265/265 variables loaded into STDCNet.
2023-04-25 22:43:04 [INFO]	The number of images: 99
2023-04-25 22:43:04 [INFO]	Loading pretrained model from ./output/best_model/model.pdparams
2023-04-25 22:43:04 [INFO]	There are 370/370 variables loaded into PPLiteSeg.
2023-04-25 22:43:04 [INFO]	Start to predict...
99/99 [==============================] - 7s 69ms/step
2023-04-25 22:43:11 [INFO]	Predicted images are saved in ./work/data/added_prediction and ./work/data/pseudo_color_prediction .

Use OpenCV to find the bounding quadrilateral, map it back to the scale of the original image, crop the region out of the original image, and finally resize it to 512×512.

def get_contours(img):
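    # Dilate the predicted mask, then keep the largest contour that can be approximated by 4 points (the screen).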
    img = cv2.dilate(img,np.ones((9,9)),iterations=1)

    biggest = np.array([])
    maxArea = 0
    contours,hierarchy = cv2.findContours(img[..., 1], cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

    for cnt in contours:
        area = cv2.contourArea(cnt)
        if area>1000:
            peri = cv2.arcLength(cnt,True)
            approx = cv2.approxPolyDP(cnt,0.1*peri,True)
            if area > maxArea and len(approx) == 4:
                biggest = approx
                maxArea = area
    return biggest

def reorder(myPoints):
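    # Reorder the 4 corners to a fixed order: top-left, top-right, bottom-left, bottom-right.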
    myPoints = myPoints.reshape((4,2))
    myPointsNew = np.zeros((4,1,2),np.int32)
    add = myPoints.sum(1)
    myPointsNew[0] = myPoints[np.argmin(add)]
    myPointsNew[3] = myPoints[np.argmax(add)]
    diff = np.diff(myPoints,axis=1)
    myPointsNew[1]= myPoints[np.argmin(diff)]
    myPointsNew[2] = myPoints[np.argmax(diff)]
    return myPointsNew

def getWarp(img, biggest, width=512, height=512):
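    # Perspective-warp the quadrilateral to a rectangle, trim a 20-pixel border, and resize.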
    widthImg = img.shape[0]
    heightImg = img.shape[1]

    biggest = reorder(biggest)
    pts1 = np.float32(biggest)
    pts2 = np.float32([[0, 0], [widthImg, 0], [0, heightImg], [widthImg, heightImg]])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    imgOutput = cv2.warpPerspective(img, matrix, (widthImg, heightImg))

    imgCropped = imgOutput[20:imgOutput.shape[0]-20,20:imgOutput.shape[1]-20]
    imgCropped = cv2.resize(imgCropped,(width,height))

    return imgCropped

def get_det_img(img, img_raw, biggest):
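    # Scale the contour found on the 512x512 prediction back to the original resolution, then warp-crop the screen.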
    h, w, _ = img.shape
    _h, _w, _ = img_raw.shape

    biggest = biggest.astype(float)

    biggest[..., 0] = biggest[..., 0]/w
    biggest[..., 1] = biggest[..., 1]/h

    biggest[..., 0] = (biggest[..., 0] * _w)
    biggest[..., 1] = (biggest[..., 1] * _h)

    biggest = biggest.astype(int)

    return getWarp(img_raw, biggest)

_img = np.random.choice(glob.glob('./work/data/pseudo_color_prediction/*.png'))
filename = _img.split('/')[-1].split('.')[0]+'.jpg'
print(_img, filename)

img = cv2.imread(_img)
img_raw = cv2.imread('./work/data/train/'+filename)

biggest = get_contours(img)
img_det = get_det_img(img, img_raw, biggest)

plt.figure(figsize=(16, 6))
plt.subplot(131)
plt.imshow(img[..., ::-1])

plt.subplot(132)
plt.imshow(img_raw[..., ::-1])

plt.subplot(133)
plt.imshow(img_det[..., ::-1])   

./work/data/pseudo_color_prediction/WIN_20230220_15_54_52_Pro.png WIN_20230220_15_54_52_Pro.jpg





<matplotlib.image.AxesImage at 0x7fc9c3851ee0>

(Figure: predicted mask, original image, and warp-cropped screen, left to right)

As you can see, the screen segmentation here is quite accurate.

Next, save these crops to disk.

!mkdir ./work/data/step_1_screen

count_screen = 0
for _img in glob.glob('./work/data/pseudo_color_prediction/*.png'):
    filename = _img.split('/')[-1].split('.')[0]+'.jpg'

    img = cv2.imread(_img)
    img_raw = cv2.imread('./work/data/train/'+filename)

    biggest = get_contours(img)
    img_det = get_det_img(img, img_raw, biggest)

    cv2.imwrite('./work/data/step_1_screen/'+filename, img_det)
    count_screen += 1

print('Done with {} screen files...'.format(count_screen))
Done with 99 screen files...

Looking at all the cropped screens, the results are fairly good; a quick way to view them as a grid is shown below.
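
This grid plot is just a convenience for visual inspection; it assumes the crops were saved under ./work/data/step_1_screen by the cell above:

screens = sorted(glob.glob('./work/data/step_1_screen/*.jpg'))
cols = 10
rows = (len(screens) + cols - 1) // cols
plt.figure(figsize=(20, 2 * rows))
for i, path in enumerate(screens):
    plt.subplot(rows, cols, i + 1)
    plt.imshow(cv2.imread(path)[..., ::-1])
    plt.axis('off')
plt.show()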

2. Stage 2: Meter Display Screen Classification

2.1 Split the PaddleClas Training and Validation Sets

The screen crops of the two meter models need to be sorted manually and split into training and validation sets; the final directory layout is as follows:

./work/data/step_2_clas
├── train
│   ├── t0
│   │   ├── WIN_20230220_14_56_27_Pro.jpg
│   │   ├── ...
│   └── t1
│       ├── WIN_20230220_15_17_36_Pro.jpg
│       ├── ...
└── val
    ├── t0
    │   ├── WIN_20230220_14_47_59_Pro.jpg
    │   ├── ...
    └── t1
        ├── WIN_20230220_15_15_40_Pro.jpg
        ├── ...

6 directories, 99 files

Next, generate the corresponding label list files; each line contains the relative image path followed by its class id.

with open('./work/data/step_2_clas/train/train_list.txt', 'w') as f:
    for filename in glob.glob('./work/data/step_2_clas/train/t0/*.jpg'):
        f.write('t0/' + filename.split('/')[-1] + ' 0')
        f.write('\n')
    for filename in glob.glob('./work/data/step_2_clas/train/t1/*.jpg'):
        f.write('t1/' + filename.split('/')[-1] + ' 1')
        f.write('\n')

with open('./work/data/step_2_clas/val/val_list.txt', 'w') as f:
    for filename in glob.glob('./work/data/step_2_clas/val/t0/*.jpg'):
        f.write('t0/' + filename.split('/')[-1] + ' 0')
        f.write('\n')
    for filename in glob.glob('./work/data/step_2_clas/val/t1/*.jpg'):
        f.write('t1/' + filename.split('/')[-1] + ' 1')
        f.write('\n')

2.2 Train the PaddleClas Classification Model

ShuffleNetV2 is used for training. The key parts of the config file are:

# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 100
  eval_during_train: True
  eval_interval: 1
  epochs: 20
  print_batch_step: 1
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 512, 512]
  save_inference_dir: ./inference

# model architecture
Arch:
  name: ShuffleNetV2_x0_25
  class_num: 2
 
...

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: /home/aistudio/work/data/step_2_clas/train/ # dataset root
      cls_label_path: /home/aistudio/work/data/step_2_clas/train/train_list.txt # training list
...

  Eval:
    dataset: 
      name: ImageNetDataset
      image_root: /home/aistudio/work/data/step_2_clas/val/ # dataset root
      cls_label_path: /home/aistudio/work/data/step_2_clas/val/val_list.txt # validation list
...

!python ./PaddleClas/tools/train.py \
    -c ./work/configs/ShuffleNetV2_x0_25.yaml  \
    -o Arch.pretrained=True

/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/sklearn/utils/multiclass.py:14: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
  from scipy.sparse.base import spmatrix
A new field (pretrained) detected!
[2023/04/26 13:44:33] ppcls INFO: 
===========================================================
==        PaddleClas is powered by PaddlePaddle !        ==
===========================================================
==                                                       ==
==   For more info please go to the following website.   ==
==                                                       ==
==       https://github.com/PaddlePaddle/PaddleClas      ==
===========================================================

[2023/04/26 13:44:33] ppcls INFO: Arch : 
[2023/04/26 13:44:33] ppcls INFO:     class_num : 2
[2023/04/26 13:44:33] ppcls INFO:     name : ShuffleNetV2_x0_25
[2023/04/26 13:44:33] ppcls INFO:     pretrained : True
[2023/04/26 13:44:33] ppcls INFO: DataLoader : 
[2023/04/26 13:44:33] ppcls INFO:     Eval : 
[2023/04/26 13:44:33] ppcls INFO:         dataset : 
[2023/04/26 13:44:33] ppcls INFO:             cls_label_path : /home/aistudio/work/data/step_2_clas/val/val_list.txt
[2023/04/26 13:44:33] ppcls INFO:             image_root : /home/aistudio/work/data/step_2_clas/val/
[2023/04/26 13:44:33] ppcls INFO:             name : ImageNetDataset
[2023/04/26 13:44:33] ppcls INFO:             transform_ops : 
[2023/04/26 13:44:33] ppcls INFO:                 DecodeImage : 
[2023/04/26 13:44:33] ppcls INFO:                     channel_first : False
[2023/04/26 13:44:33] ppcls INFO:                     to_rgb : True
[2023/04/26 13:44:33] ppcls INFO:                 ResizeImage : 
[2023/04/26 13:44:33] ppcls INFO:                     resize_short : 256
[2023/04/26 13:44:33] ppcls INFO:                 CropImage : 
[2023/04/26 13:44:33] ppcls INFO:                     size : 512
[2023/04/26 13:44:33] ppcls INFO:                 NormalizeImage : 
[2023/04/26 13:44:33] ppcls INFO:                     mean : [0.485, 0.456, 0.406]
[2023/04/26 13:44:33] ppcls INFO:                     order : 
[2023/04/26 13:44:33] ppcls INFO:                     scale : 1.0/255.0
[2023/04/26 13:44:33] ppcls INFO:                     std : [0.229, 0.224, 0.225]
[2023/04/26 13:44:33] ppcls INFO:         loader : 
[2023/04/26 13:44:33] ppcls INFO:             num_workers : 4
[2023/04/26 13:44:33] ppcls INFO:             use_shared_memory : False
[2023/04/26 13:44:33] ppcls INFO:         sampler : 
[2023/04/26 13:44:33] ppcls INFO:             batch_size : 64
[2023/04/26 13:44:33] ppcls INFO:             drop_last : False
[2023/04/26 13:44:33] ppcls INFO:             name : DistributedBatchSampler
[2023/04/26 13:44:33] ppcls INFO:             shuffle : False
[2023/04/26 13:44:33] ppcls INFO:     Train : 
[2023/04/26 13:44:33] ppcls INFO:         dataset : 
[2023/04/26 13:44:33] ppcls INFO:             cls_label_path : /home/aistudio/work/data/step_2_clas/train/train_list.txt
[2023/04/26 13:44:33] ppcls INFO:             image_root : /home/aistudio/work/data/step_2_clas/train/
[2023/04/26 13:44:33] ppcls INFO:             name : ImageNetDataset
[2023/04/26 13:44:33] ppcls INFO:             transform_ops : 
[2023/04/26 13:44:33] ppcls INFO:                 DecodeImage : 
[2023/04/26 13:44:33] ppcls INFO:                     channel_first : False
[2023/04/26 13:44:33] ppcls INFO:                     to_rgb : True
[2023/04/26 13:44:33] ppcls INFO:                 RandCropImage : 
[2023/04/26 13:44:33] ppcls INFO:                     size : 512
[2023/04/26 13:44:33] ppcls INFO:                 RandFlipImage : 
[2023/04/26 13:44:33] ppcls INFO:                     flip_code : 1
[2023/04/26 13:44:33] ppcls INFO:                 NormalizeImage : 
[2023/04/26 13:44:33] ppcls INFO:                     mean : [0.485, 0.456, 0.406]
[2023/04/26 13:44:33] ppcls INFO:                     order : 
[2023/04/26 13:44:33] ppcls INFO:                     scale : 1.0/255.0
[2023/04/26 13:44:33] ppcls INFO:                     std : [0.229, 0.224, 0.225]
[2023/04/26 13:44:33] ppcls INFO:         loader : 
[2023/04/26 13:44:33] ppcls INFO:             num_workers : 4
[2023/04/26 13:44:33] ppcls INFO:             use_shared_memory : False
[2023/04/26 13:44:33] ppcls INFO:         sampler : 
[2023/04/26 13:44:33] ppcls INFO:             batch_size : 256
[2023/04/26 13:44:33] ppcls INFO:             drop_last : False
[2023/04/26 13:44:33] ppcls INFO:             name : DistributedBatchSampler
[2023/04/26 13:44:33] ppcls INFO:             shuffle : True
[2023/04/26 13:44:33] ppcls INFO: Global : 
[2023/04/26 13:44:33] ppcls INFO:     checkpoints : None
[2023/04/26 13:44:33] ppcls INFO:     device : gpu
[2023/04/26 13:44:33] ppcls INFO:     epochs : 20
[2023/04/26 13:44:33] ppcls INFO:     eval_during_train : True
[2023/04/26 13:44:33] ppcls INFO:     eval_interval : 1
[2023/04/26 13:44:33] ppcls INFO:     image_shape : [3, 512, 512]
[2023/04/26 13:44:33] ppcls INFO:     output_dir : ./output/
[2023/04/26 13:44:33] ppcls INFO:     pretrained_model : None
[2023/04/26 13:44:33] ppcls INFO:     print_batch_step : 1
[2023/04/26 13:44:33] ppcls INFO:     save_inference_dir : ./inference
[2023/04/26 13:44:33] ppcls INFO:     save_interval : 100
[2023/04/26 13:44:33] ppcls INFO:     use_visualdl : False
[2023/04/26 13:44:33] ppcls INFO: Infer : 
[2023/04/26 13:44:33] ppcls INFO:     PostProcess : 
[2023/04/26 13:44:33] ppcls INFO:         class_id_map_file : None
[2023/04/26 13:44:33] ppcls INFO:         name : Topk
[2023/04/26 13:44:33] ppcls INFO:         topk : 1
[2023/04/26 13:44:33] ppcls INFO:     batch_size : 10
[2023/04/26 13:44:33] ppcls INFO:     infer_imgs : /home/aistudio/work/data/step_2_clas/val/t1/WIN_20230220_15_15_40_Pro.jpg
[2023/04/26 13:44:33] ppcls INFO:     transforms : 
[2023/04/26 13:44:33] ppcls INFO:         DecodeImage : 
[2023/04/26 13:44:33] ppcls INFO:             channel_first : False
[2023/04/26 13:44:33] ppcls INFO:             to_rgb : True
[2023/04/26 13:44:33] ppcls INFO:         ResizeImage : 
[2023/04/26 13:44:33] ppcls INFO:             resize_short : 256
[2023/04/26 13:44:33] ppcls INFO:         CropImage : 
[2023/04/26 13:44:33] ppcls INFO:             size : 512
[2023/04/26 13:44:33] ppcls INFO:         NormalizeImage : 
[2023/04/26 13:44:33] ppcls INFO:             mean : [0.485, 0.456, 0.406]
[2023/04/26 13:44:33] ppcls INFO:             order : 
[2023/04/26 13:44:33] ppcls INFO:             scale : 1.0/255.0
[2023/04/26 13:44:33] ppcls INFO:             std : [0.229, 0.224, 0.225]
[2023/04/26 13:44:33] ppcls INFO:         ToCHWImage : None
[2023/04/26 13:44:33] ppcls INFO: Loss : 
[2023/04/26 13:44:33] ppcls INFO:     Eval : 
[2023/04/26 13:44:33] ppcls INFO:         CELoss : 
[2023/04/26 13:44:33] ppcls INFO:             weight : 1.0
[2023/04/26 13:44:33] ppcls INFO:     Train : 
[2023/04/26 13:44:33] ppcls INFO:         CELoss : 
[2023/04/26 13:44:33] ppcls INFO:             weight : 1.0
[2023/04/26 13:44:33] ppcls INFO: Metric : 
[2023/04/26 13:44:33] ppcls INFO:     Eval : 
[2023/04/26 13:44:33] ppcls INFO:         TopkAcc : 
[2023/04/26 13:44:33] ppcls INFO:             topk : [1, 5]
[2023/04/26 13:44:33] ppcls INFO:     Train : 
[2023/04/26 13:44:33] ppcls INFO:         TopkAcc : 
[2023/04/26 13:44:33] ppcls INFO:             topk : [1, 5]
[2023/04/26 13:44:33] ppcls INFO: Optimizer : 
[2023/04/26 13:44:33] ppcls INFO:     lr : 
[2023/04/26 13:44:33] ppcls INFO:         learning_rate : 0.0125
[2023/04/26 13:44:33] ppcls INFO:         name : Cosine
[2023/04/26 13:44:33] ppcls INFO:         warmup_epoch : 5
[2023/04/26 13:44:33] ppcls INFO:     momentum : 0.9
[2023/04/26 13:44:33] ppcls INFO:     name : Momentum
[2023/04/26 13:44:33] ppcls INFO:     regularizer : 
[2023/04/26 13:44:33] ppcls INFO:         coeff : 1e-05
[2023/04/26 13:44:33] ppcls INFO:         name : L2
[2023/04/26 13:44:33] ppcls INFO: profiler_options : None
[2023/04/26 13:44:33] ppcls INFO: train with paddle 2.4.1 and device Place(gpu:0)
W0426 13:44:33.714360 19712 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0426 13:44:33.718773 19712 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2023/04/26 13:44:35] ppcls INFO: Found /home/aistudio/.paddleclas/weights/ShuffleNetV2_x0_25_pretrained.pdparams
[2023/04/26 13:44:35] ppcls WARNING: The training strategy provided by PaddleClas is based on 4 gpus. But the number of gpu is 1 in current training. Please modify the stategy (learning rate, batch size and so on) if use this config to train.
[2023/04/26 13:44:41] ppcls WARNING: The output dims(2) is less than k(5), and the argument 5 of Topk has been removed.
[2023/04/26 13:44:41] ppcls INFO: [Train][Epoch 1/20][Iter: 0/1]lr(LinearWarmup): 0.00250000, top1: 0.51685, CELoss: 0.86679, loss: 0.86679, batch_cost: 6.42097s, reader_cost: 4.21631, ips: 13.86083 samples/s, eta: 0:02:08
[2023/04/26 13:44:42] ppcls INFO: [Train][Epoch 1/20][Avg]top1: 0.51685, CELoss: 0.86679, loss: 0.86679
[2023/04/26 13:44:42] ppcls WARNING: The output dims(2) is less than k(5), and the argument 5 of Topk has been removed.
[2023/04/26 13:44:42] ppcls INFO: [Eval][Epoch 1][Iter: 0/1]CELoss: 1.30089, loss: 1.30089, top1: 0.50000, batch_cost: 0.90628s, reader_cost: 0.88015, ips: 11.03416 images/sec
[2023/04/26 13:44:43] ppcls INFO: [Eval][Epoch 1][Avg]CELoss: 1.30089, loss: 1.30089, top1: 0.50000
[2023/04/26 13:44:43] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/best_model
[2023/04/26 13:44:43] ppcls INFO: [Eval][Epoch 1][best metric: 0.5]
[2023/04/26 13:44:43] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:44:48] ppcls INFO: [Train][Epoch 2/20][Iter: 0/1]lr(LinearWarmup): 0.00500000, top1: 0.52809, CELoss: 0.81671, loss: 0.81671, batch_cost: 5.88639s, reader_cost: 4.68845, ips: 15.11963 samples/s, eta: 0:01:51
[2023/04/26 13:44:48] ppcls INFO: [Train][Epoch 2/20][Avg]top1: 0.52809, CELoss: 0.81671, loss: 0.81671
[2023/04/26 13:44:49] ppcls INFO: [Eval][Epoch 2][Iter: 0/1]CELoss: 1.06328, loss: 1.06328, top1: 0.50000, batch_cost: 0.69499s, reader_cost: 0.67162, ips: 14.38863 images/sec
[2023/04/26 13:44:49] ppcls INFO: [Eval][Epoch 2][Avg]CELoss: 1.06328, loss: 1.06328, top1: 0.50000
[2023/04/26 13:44:49] ppcls INFO: [Eval][Epoch 2][best metric: 0.5]
[2023/04/26 13:44:49] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:44:54] ppcls INFO: [Train][Epoch 3/20][Iter: 0/1]lr(LinearWarmup): 0.00750000, top1: 0.73034, CELoss: 0.57893, loss: 0.57893, batch_cost: 5.54559s, reader_cost: 4.68409, ips: 16.04879 samples/s, eta: 0:01:39
[2023/04/26 13:44:55] ppcls INFO: [Train][Epoch 3/20][Avg]top1: 0.73034, CELoss: 0.57893, loss: 0.57893
[2023/04/26 13:44:55] ppcls INFO: [Eval][Epoch 3][Iter: 0/1]CELoss: 0.63160, loss: 0.63160, top1: 0.60000, batch_cost: 0.73978s, reader_cost: 0.71610, ips: 13.51758 images/sec
[2023/04/26 13:44:56] ppcls INFO: [Eval][Epoch 3][Avg]CELoss: 0.63160, loss: 0.63160, top1: 0.60000
[2023/04/26 13:44:56] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/best_model
[2023/04/26 13:44:56] ppcls INFO: [Eval][Epoch 3][best metric: 0.6000000238418579]
[2023/04/26 13:44:56] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:00] ppcls INFO: [Train][Epoch 4/20][Iter: 0/1]lr(LinearWarmup): 0.01000000, top1: 0.91011, CELoss: 0.28251, loss: 0.28251, batch_cost: 5.31366s, reader_cost: 4.62113, ips: 16.74929 samples/s, eta: 0:01:30
[2023/04/26 13:45:00] ppcls INFO: [Train][Epoch 4/20][Avg]top1: 0.91011, CELoss: 0.28251, loss: 0.28251
[2023/04/26 13:45:01] ppcls INFO: [Eval][Epoch 4][Iter: 0/1]CELoss: 0.30897, loss: 0.30897, top1: 0.90000, batch_cost: 0.69318s, reader_cost: 0.66975, ips: 14.42626 images/sec
[2023/04/26 13:45:01] ppcls INFO: [Eval][Epoch 4][Avg]CELoss: 0.30897, loss: 0.30897, top1: 0.90000
[2023/04/26 13:45:01] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/best_model
[2023/04/26 13:45:01] ppcls INFO: [Eval][Epoch 4][best metric: 0.9000000357627869]
[2023/04/26 13:45:01] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:06] ppcls INFO: [Train][Epoch 5/20][Iter: 0/1]lr(LinearWarmup): 0.01250000, top1: 1.00000, CELoss: 0.09491, loss: 0.09491, batch_cost: 5.17621s, reader_cost: 4.58503, ips: 17.19403 samples/s, eta: 0:01:22
[2023/04/26 13:45:06] ppcls INFO: [Train][Epoch 5/20][Avg]top1: 1.00000, CELoss: 0.09491, loss: 0.09491
[2023/04/26 13:45:07] ppcls INFO: [Eval][Epoch 5][Iter: 0/1]CELoss: 0.12295, loss: 0.12295, top1: 1.00000, batch_cost: 0.73344s, reader_cost: 0.70926, ips: 13.63444 images/sec
[2023/04/26 13:45:07] ppcls INFO: [Eval][Epoch 5][Avg]CELoss: 0.12295, loss: 0.12295, top1: 1.00000
[2023/04/26 13:45:07] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/best_model
[2023/04/26 13:45:07] ppcls INFO: [Eval][Epoch 5][best metric: 1.0]
[2023/04/26 13:45:07] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:13] ppcls INFO: [Train][Epoch 6/20][Iter: 0/1]lr(LinearWarmup): 0.01236342, top1: 1.00000, CELoss: 0.02744, loss: 0.02744, batch_cost: 5.17934s, reader_cost: 4.65572, ips: 17.18367 samples/s, eta: 0:01:17
[2023/04/26 13:45:13] ppcls INFO: [Train][Epoch 6/20][Avg]top1: 1.00000, CELoss: 0.02744, loss: 0.02744
[2023/04/26 13:45:14] ppcls INFO: [Eval][Epoch 6][Iter: 0/1]CELoss: 0.06511, loss: 0.06511, top1: 1.00000, batch_cost: 0.79378s, reader_cost: 0.77070, ips: 12.59788 images/sec
[2023/04/26 13:45:14] ppcls INFO: [Eval][Epoch 6][Avg]CELoss: 0.06511, loss: 0.06511, top1: 1.00000
[2023/04/26 13:45:14] ppcls INFO: [Eval][Epoch 6][best metric: 1.0]
[2023/04/26 13:45:14] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:19] ppcls INFO: [Train][Epoch 7/20][Iter: 0/1]lr(LinearWarmup): 0.01195966, top1: 1.00000, CELoss: 0.01281, loss: 0.01281, batch_cost: 5.13178s, reader_cost: 4.65710, ips: 17.34289 samples/s, eta: 0:01:11
[2023/04/26 13:45:19] ppcls INFO: [Train][Epoch 7/20][Avg]top1: 1.00000, CELoss: 0.01281, loss: 0.01281
[2023/04/26 13:45:20] ppcls INFO: [Eval][Epoch 7][Iter: 0/1]CELoss: 0.04996, loss: 0.04996, top1: 1.00000, batch_cost: 0.72190s, reader_cost: 0.69832, ips: 13.85231 images/sec
[2023/04/26 13:45:20] ppcls INFO: [Eval][Epoch 7][Avg]CELoss: 0.04996, loss: 0.04996, top1: 1.00000
[2023/04/26 13:45:20] ppcls INFO: [Eval][Epoch 7][best metric: 1.0]
[2023/04/26 13:45:20] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:25] ppcls INFO: [Train][Epoch 8/20][Iter: 0/1]lr(LinearWarmup): 0.01130636, top1: 0.98876, CELoss: 0.04224, loss: 0.04224, batch_cost: 5.07132s, reader_cost: 4.63285, ips: 17.54966 samples/s, eta: 0:01:05
[2023/04/26 13:45:25] ppcls INFO: [Train][Epoch 8/20][Avg]top1: 0.98876, CELoss: 0.04224, loss: 0.04224
[2023/04/26 13:45:26] ppcls INFO: [Eval][Epoch 8][Iter: 0/1]CELoss: 0.04721, loss: 0.04721, top1: 1.00000, batch_cost: 0.69961s, reader_cost: 0.67377, ips: 14.29371 images/sec
[2023/04/26 13:45:26] ppcls INFO: [Eval][Epoch 8][Avg]CELoss: 0.04721, loss: 0.04721, top1: 1.00000
[2023/04/26 13:45:26] ppcls INFO: [Eval][Epoch 8][best metric: 1.0]
[2023/04/26 13:45:26] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:31] ppcls INFO: [Train][Epoch 9/20][Iter: 0/1]lr(LinearWarmup): 0.01043207, top1: 0.98876, CELoss: 0.01925, loss: 0.01925, batch_cost: 5.02064s, reader_cost: 4.61041, ips: 17.72683 samples/s, eta: 0:01:00
[2023/04/26 13:45:31] ppcls INFO: [Train][Epoch 9/20][Avg]top1: 0.98876, CELoss: 0.01925, loss: 0.01925
[2023/04/26 13:45:32] ppcls INFO: [Eval][Epoch 9][Iter: 0/1]CELoss: 0.04853, loss: 0.04853, top1: 1.00000, batch_cost: 0.72998s, reader_cost: 0.70338, ips: 13.69907 images/sec
[2023/04/26 13:45:32] ppcls INFO: [Eval][Epoch 9][Avg]CELoss: 0.04853, loss: 0.04853, top1: 1.00000
[2023/04/26 13:45:32] ppcls INFO: [Eval][Epoch 9][best metric: 1.0]
[2023/04/26 13:45:32] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:37] ppcls INFO: [Train][Epoch 10/20][Iter: 0/1]lr(LinearWarmup): 0.00937500, top1: 0.98876, CELoss: 0.02443, loss: 0.02443, batch_cost: 5.04136s, reader_cost: 4.65268, ips: 17.65395 samples/s, eta: 0:00:55
[2023/04/26 13:45:37] ppcls INFO: [Train][Epoch 10/20][Avg]top1: 0.98876, CELoss: 0.02443, loss: 0.02443
[2023/04/26 13:45:38] ppcls INFO: [Eval][Epoch 10][Iter: 0/1]CELoss: 0.05472, loss: 0.05472, top1: 1.00000, batch_cost: 0.70865s, reader_cost: 0.68417, ips: 14.11127 images/sec
[2023/04/26 13:45:38] ppcls INFO: [Eval][Epoch 10][Avg]CELoss: 0.05472, loss: 0.05472, top1: 1.00000
[2023/04/26 13:45:38] ppcls INFO: [Eval][Epoch 10][best metric: 1.0]
[2023/04/26 13:45:38] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:43] ppcls INFO: [Train][Epoch 11/20][Iter: 0/1]lr(LinearWarmup): 0.00818136, top1: 1.00000, CELoss: 0.00081, loss: 0.00081, batch_cost: 5.01281s, reader_cost: 4.64164, ips: 17.75452 samples/s, eta: 0:00:50
[2023/04/26 13:45:43] ppcls INFO: [Train][Epoch 11/20][Avg]top1: 1.00000, CELoss: 0.00081, loss: 0.00081
[2023/04/26 13:45:44] ppcls INFO: [Eval][Epoch 11][Iter: 0/1]CELoss: 0.06038, loss: 0.06038, top1: 1.00000, batch_cost: 0.75041s, reader_cost: 0.72504, ips: 13.32609 images/sec
[2023/04/26 13:45:44] ppcls INFO: [Eval][Epoch 11][Avg]CELoss: 0.06038, loss: 0.06038, top1: 1.00000
[2023/04/26 13:45:44] ppcls INFO: [Eval][Epoch 11][best metric: 1.0]
[2023/04/26 13:45:44] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:50] ppcls INFO: [Train][Epoch 12/20][Iter: 0/1]lr(LinearWarmup): 0.00690330, top1: 0.97753, CELoss: 0.03649, loss: 0.03649, batch_cost: 5.02185s, reader_cost: 4.66552, ips: 17.72257 samples/s, eta: 0:00:45
[2023/04/26 13:45:50] ppcls INFO: [Train][Epoch 12/20][Avg]top1: 0.97753, CELoss: 0.03649, loss: 0.03649
[2023/04/26 13:45:51] ppcls INFO: [Eval][Epoch 12][Iter: 0/1]CELoss: 0.06962, loss: 0.06962, top1: 1.00000, batch_cost: 0.71640s, reader_cost: 0.69255, ips: 13.95866 images/sec
[2023/04/26 13:45:51] ppcls INFO: [Eval][Epoch 12][Avg]CELoss: 0.06962, loss: 0.06962, top1: 1.00000
[2023/04/26 13:45:51] ppcls INFO: [Eval][Epoch 12][best metric: 1.0]
[2023/04/26 13:45:51] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:45:55] ppcls INFO: [Train][Epoch 13/20][Iter: 0/1]lr(LinearWarmup): 0.00559670, top1: 1.00000, CELoss: 0.00179, loss: 0.00179, batch_cost: 4.98877s, reader_cost: 4.64551, ips: 17.84007 samples/s, eta: 0:00:39
[2023/04/26 13:45:56] ppcls INFO: [Train][Epoch 13/20][Avg]top1: 1.00000, CELoss: 0.00179, loss: 0.00179
[2023/04/26 13:45:56] ppcls INFO: [Eval][Epoch 13][Iter: 0/1]CELoss: 0.07032, loss: 0.07032, top1: 1.00000, batch_cost: 0.71098s, reader_cost: 0.68341, ips: 14.06512 images/sec
[2023/04/26 13:45:57] ppcls INFO: [Eval][Epoch 13][Avg]CELoss: 0.07032, loss: 0.07032, top1: 1.00000
[2023/04/26 13:45:57] ppcls INFO: [Eval][Epoch 13][best metric: 1.0]
[2023/04/26 13:45:57] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:01] ppcls INFO: [Train][Epoch 14/20][Iter: 0/1]lr(LinearWarmup): 0.00431864, top1: 1.00000, CELoss: 0.00400, loss: 0.00400, batch_cost: 4.96516s, reader_cost: 4.63336, ips: 17.92492 samples/s, eta: 0:00:34
[2023/04/26 13:46:02] ppcls INFO: [Train][Epoch 14/20][Avg]top1: 1.00000, CELoss: 0.00400, loss: 0.00400
[2023/04/26 13:46:02] ppcls INFO: [Eval][Epoch 14][Iter: 0/1]CELoss: 0.06446, loss: 0.06446, top1: 1.00000, batch_cost: 0.70895s, reader_cost: 0.68529, ips: 14.10539 images/sec
[2023/04/26 13:46:02] ppcls INFO: [Eval][Epoch 14][Avg]CELoss: 0.06446, loss: 0.06446, top1: 1.00000
[2023/04/26 13:46:02] ppcls INFO: [Eval][Epoch 14][best metric: 1.0]
[2023/04/26 13:46:03] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:07] ppcls INFO: [Train][Epoch 15/20][Iter: 0/1]lr(LinearWarmup): 0.00312500, top1: 1.00000, CELoss: 0.00080, loss: 0.00080, batch_cost: 4.95019s, reader_cost: 4.62862, ips: 17.97910 samples/s, eta: 0:00:29
[2023/04/26 13:46:08] ppcls INFO: [Train][Epoch 15/20][Avg]top1: 1.00000, CELoss: 0.00080, loss: 0.00080
[2023/04/26 13:46:08] ppcls INFO: [Eval][Epoch 15][Iter: 0/1]CELoss: 0.06150, loss: 0.06150, top1: 1.00000, batch_cost: 0.69698s, reader_cost: 0.67237, ips: 14.34760 images/sec
[2023/04/26 13:46:09] ppcls INFO: [Eval][Epoch 15][Avg]CELoss: 0.06150, loss: 0.06150, top1: 1.00000
[2023/04/26 13:46:09] ppcls INFO: [Eval][Epoch 15][best metric: 1.0]
[2023/04/26 13:46:09] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:14] ppcls INFO: [Train][Epoch 16/20][Iter: 0/1]lr(LinearWarmup): 0.00206793, top1: 1.00000, CELoss: 0.00236, loss: 0.00236, batch_cost: 4.97162s, reader_cost: 4.65866, ips: 17.90162 samples/s, eta: 0:00:24
[2023/04/26 13:46:14] ppcls INFO: [Train][Epoch 16/20][Avg]top1: 1.00000, CELoss: 0.00236, loss: 0.00236
[2023/04/26 13:46:15] ppcls INFO: [Eval][Epoch 16][Iter: 0/1]CELoss: 0.06222, loss: 0.06222, top1: 1.00000, batch_cost: 0.80950s, reader_cost: 0.78571, ips: 12.35334 images/sec
[2023/04/26 13:46:15] ppcls INFO: [Eval][Epoch 16][Avg]CELoss: 0.06222, loss: 0.06222, top1: 1.00000
[2023/04/26 13:46:15] ppcls INFO: [Eval][Epoch 16][best metric: 1.0]
[2023/04/26 13:46:15] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:20] ppcls INFO: [Train][Epoch 17/20][Iter: 0/1]lr(LinearWarmup): 0.00119364, top1: 1.00000, CELoss: 0.00076, loss: 0.00076, batch_cost: 4.99404s, reader_cost: 4.68679, ips: 17.82123 samples/s, eta: 0:00:19
[2023/04/26 13:46:21] ppcls INFO: [Train][Epoch 17/20][Avg]top1: 1.00000, CELoss: 0.00076, loss: 0.00076
[2023/04/26 13:46:21] ppcls INFO: [Eval][Epoch 17][Iter: 0/1]CELoss: 0.05307, loss: 0.05307, top1: 1.00000, batch_cost: 0.73522s, reader_cost: 0.70971, ips: 13.60143 images/sec
[2023/04/26 13:46:22] ppcls INFO: [Eval][Epoch 17][Avg]CELoss: 0.05307, loss: 0.05307, top1: 1.00000
[2023/04/26 13:46:22] ppcls INFO: [Eval][Epoch 17][best metric: 1.0]
[2023/04/26 13:46:22] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:28] ppcls INFO: [Train][Epoch 18/20][Iter: 0/1]lr(LinearWarmup): 0.00054034, top1: 1.00000, CELoss: 0.00058, loss: 0.00058, batch_cost: 5.05621s, reader_cost: 4.75497, ips: 17.60210 samples/s, eta: 0:00:15
[2023/04/26 13:46:28] ppcls INFO: [Train][Epoch 18/20][Avg]top1: 1.00000, CELoss: 0.00058, loss: 0.00058
[2023/04/26 13:46:29] ppcls INFO: [Eval][Epoch 18][Iter: 0/1]CELoss: 0.04409, loss: 0.04409, top1: 1.00000, batch_cost: 0.73768s, reader_cost: 0.71280, ips: 13.55594 images/sec
[2023/04/26 13:46:29] ppcls INFO: [Eval][Epoch 18][Avg]CELoss: 0.04409, loss: 0.04409, top1: 1.00000
[2023/04/26 13:46:29] ppcls INFO: [Eval][Epoch 18][best metric: 1.0]
[2023/04/26 13:46:29] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:34] ppcls INFO: [Train][Epoch 19/20][Iter: 0/1]lr(LinearWarmup): 0.00013658, top1: 1.00000, CELoss: 0.00070, loss: 0.00070, batch_cost: 5.04889s, reader_cost: 4.75389, ips: 17.62763 samples/s, eta: 0:00:10
[2023/04/26 13:46:34] ppcls INFO: [Train][Epoch 19/20][Avg]top1: 1.00000, CELoss: 0.00070, loss: 0.00070
[2023/04/26 13:46:35] ppcls INFO: [Eval][Epoch 19][Iter: 0/1]CELoss: 0.03790, loss: 0.03790, top1: 1.00000, batch_cost: 0.74916s, reader_cost: 0.72452, ips: 13.34823 images/sec
[2023/04/26 13:46:35] ppcls INFO: [Eval][Epoch 19][Avg]CELoss: 0.03790, loss: 0.03790, top1: 1.00000
[2023/04/26 13:46:35] ppcls INFO: [Eval][Epoch 19][best metric: 1.0]
[2023/04/26 13:46:35] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
[2023/04/26 13:46:40] ppcls INFO: [Train][Epoch 20/20][Iter: 0/1]lr(LinearWarmup): 0.00000000, top1: 1.00000, CELoss: 0.00175, loss: 0.00175, batch_cost: 5.04019s, reader_cost: 4.75059, ips: 17.65807 samples/s, eta: 0:00:05
[2023/04/26 13:46:40] ppcls INFO: [Train][Epoch 20/20][Avg]top1: 1.00000, CELoss: 0.00175, loss: 0.00175
[2023/04/26 13:46:41] ppcls INFO: [Eval][Epoch 20][Iter: 0/1]CELoss: 0.02890, loss: 0.02890, top1: 1.00000, batch_cost: 0.76344s, reader_cost: 0.73916, ips: 13.09865 images/sec
[2023/04/26 13:46:41] ppcls INFO: [Eval][Epoch 20][Avg]CELoss: 0.02890, loss: 0.02890, top1: 1.00000
[2023/04/26 13:46:41] ppcls INFO: [Eval][Epoch 20][best metric: 1.0]
[2023/04/26 13:46:41] ppcls INFO: Already save model in ./output/ShuffleNetV2_x0_25/latest
!python ./PaddleClas/tools/eval.py \
    -c ./work/configs/ShuffleNetV2_x0_25.yaml  \
    -o Global.pretrained_model=./output/ShuffleNetV2_x0_25/best_model

/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/sklearn/utils/multiclass.py:14: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
  from scipy.sparse.base import spmatrix
[2023/04/26 13:46:47] ppcls INFO: 
===========================================================
==        PaddleClas is powered by PaddlePaddle !        ==
===========================================================
==                                                       ==
==   For more info please go to the following website.   ==
==                                                       ==
==       https://github.com/PaddlePaddle/PaddleClas      ==
===========================================================

[2023/04/26 13:46:47] ppcls INFO: Arch : 
[2023/04/26 13:46:47] ppcls INFO:     class_num : 2
[2023/04/26 13:46:47] ppcls INFO:     name : ShuffleNetV2_x0_25
[2023/04/26 13:46:47] ppcls INFO: DataLoader : 
[2023/04/26 13:46:47] ppcls INFO:     Eval : 
[2023/04/26 13:46:47] ppcls INFO:         dataset : 
[2023/04/26 13:46:47] ppcls INFO:             cls_label_path : /home/aistudio/work/data/step_2_clas/val/val_list.txt
[2023/04/26 13:46:47] ppcls INFO:             image_root : /home/aistudio/work/data/step_2_clas/val/
[2023/04/26 13:46:47] ppcls INFO:             name : ImageNetDataset
[2023/04/26 13:46:47] ppcls INFO:             transform_ops : 
[2023/04/26 13:46:47] ppcls INFO:                 DecodeImage : 
[2023/04/26 13:46:47] ppcls INFO:                     channel_first : False
[2023/04/26 13:46:47] ppcls INFO:                     to_rgb : True
[2023/04/26 13:46:47] ppcls INFO:                 ResizeImage : 
[2023/04/26 13:46:47] ppcls INFO:                     resize_short : 256
[2023/04/26 13:46:47] ppcls INFO:                 CropImage : 
[2023/04/26 13:46:47] ppcls INFO:                     size : 512
[2023/04/26 13:46:47] ppcls INFO:                 NormalizeImage : 
[2023/04/26 13:46:47] ppcls INFO:                     mean : [0.485, 0.456, 0.406]
[2023/04/26 13:46:47] ppcls INFO:                     order : 
[2023/04/26 13:46:47] ppcls INFO:                     scale : 1.0/255.0
[2023/04/26 13:46:47] ppcls INFO:                     std : [0.229, 0.224, 0.225]
[2023/04/26 13:46:47] ppcls INFO:         loader : 
[2023/04/26 13:46:47] ppcls INFO:             num_workers : 4
[2023/04/26 13:46:47] ppcls INFO:             use_shared_memory : False
[2023/04/26 13:46:47] ppcls INFO:         sampler : 
[2023/04/26 13:46:47] ppcls INFO:             batch_size : 64
[2023/04/26 13:46:47] ppcls INFO:             drop_last : False
[2023/04/26 13:46:47] ppcls INFO:             name : DistributedBatchSampler
[2023/04/26 13:46:47] ppcls INFO:             shuffle : False
[2023/04/26 13:46:47] ppcls INFO:     Train : 
[2023/04/26 13:46:47] ppcls INFO:         dataset : 
[2023/04/26 13:46:47] ppcls INFO:             cls_label_path : /home/aistudio/work/data/step_2_clas/train/train_list.txt
[2023/04/26 13:46:47] ppcls INFO:             image_root : /home/aistudio/work/data/step_2_clas/train/
[2023/04/26 13:46:47] ppcls INFO:             name : ImageNetDataset
[2023/04/26 13:46:47] ppcls INFO:             transform_ops : 
[2023/04/26 13:46:47] ppcls INFO:                 DecodeImage : 
[2023/04/26 13:46:47] ppcls INFO:                     channel_first : False
[2023/04/26 13:46:47] ppcls INFO:                     to_rgb : True
[2023/04/26 13:46:47] ppcls INFO:                 RandCropImage : 
[2023/04/26 13:46:47] ppcls INFO:                     size : 512
[2023/04/26 13:46:47] ppcls INFO:                 RandFlipImage : 
[2023/04/26 13:46:47] ppcls INFO:                     flip_code : 1
[2023/04/26 13:46:47] ppcls INFO:                 NormalizeImage : 
[2023/04/26 13:46:47] ppcls INFO:                     mean : [0.485, 0.456, 0.406]
[2023/04/26 13:46:47] ppcls INFO:                     order : 
[2023/04/26 13:46:47] ppcls INFO:                     scale : 1.0/255.0
[2023/04/26 13:46:47] ppcls INFO:                     std : [0.229, 0.224, 0.225]
[2023/04/26 13:46:47] ppcls INFO:         loader : 
[2023/04/26 13:46:47] ppcls INFO:             num_workers : 4
[2023/04/26 13:46:47] ppcls INFO:             use_shared_memory : False
[2023/04/26 13:46:47] ppcls INFO:         sampler : 
[2023/04/26 13:46:47] ppcls INFO:             batch_size : 256
[2023/04/26 13:46:47] ppcls INFO:             drop_last : False
[2023/04/26 13:46:47] ppcls INFO:             name : DistributedBatchSampler
[2023/04/26 13:46:47] ppcls INFO:             shuffle : True
[2023/04/26 13:46:47] ppcls INFO: Global : 
[2023/04/26 13:46:47] ppcls INFO:     checkpoints : None
[2023/04/26 13:46:47] ppcls INFO:     device : gpu
[2023/04/26 13:46:47] ppcls INFO:     epochs : 20
[2023/04/26 13:46:47] ppcls INFO:     eval_during_train : True
[2023/04/26 13:46:47] ppcls INFO:     eval_interval : 1
[2023/04/26 13:46:47] ppcls INFO:     image_shape : [3, 512, 512]
[2023/04/26 13:46:47] ppcls INFO:     output_dir : ./output/
[2023/04/26 13:46:47] ppcls INFO:     pretrained_model : ./output/ShuffleNetV2_x0_25/best_model
[2023/04/26 13:46:47] ppcls INFO:     print_batch_step : 1
[2023/04/26 13:46:47] ppcls INFO:     save_inference_dir : ./inference
[2023/04/26 13:46:47] ppcls INFO:     save_interval : 100
[2023/04/26 13:46:47] ppcls INFO:     use_visualdl : False
[2023/04/26 13:46:47] ppcls INFO: Infer : 
[2023/04/26 13:46:47] ppcls INFO:     PostProcess : 
[2023/04/26 13:46:47] ppcls INFO:         class_id_map_file : None
[2023/04/26 13:46:47] ppcls INFO:         name : Topk
[2023/04/26 13:46:47] ppcls INFO:         topk : 1
[2023/04/26 13:46:47] ppcls INFO:     batch_size : 10
[2023/04/26 13:46:47] ppcls INFO:     infer_imgs : /home/aistudio/work/data/step_2_clas/val/t1/WIN_20230220_15_15_40_Pro.jpg
[2023/04/26 13:46:47] ppcls INFO:     transforms : 
[2023/04/26 13:46:47] ppcls INFO:         DecodeImage : 
[2023/04/26 13:46:47] ppcls INFO:             channel_first : False
[2023/04/26 13:46:47] ppcls INFO:             to_rgb : True
[2023/04/26 13:46:47] ppcls INFO:         ResizeImage : 
[2023/04/26 13:46:47] ppcls INFO:             resize_short : 256
[2023/04/26 13:46:47] ppcls INFO:         CropImage : 
[2023/04/26 13:46:47] ppcls INFO:             size : 512
[2023/04/26 13:46:47] ppcls INFO:         NormalizeImage : 
[2023/04/26 13:46:47] ppcls INFO:             mean : [0.485, 0.456, 0.406]
[2023/04/26 13:46:47] ppcls INFO:             order : 
[2023/04/26 13:46:47] ppcls INFO:             scale : 1.0/255.0
[2023/04/26 13:46:47] ppcls INFO:             std : [0.229, 0.224, 0.225]
[2023/04/26 13:46:47] ppcls INFO:         ToCHWImage : None
[2023/04/26 13:46:47] ppcls INFO: Loss : 
[2023/04/26 13:46:47] ppcls INFO:     Eval : 
[2023/04/26 13:46:47] ppcls INFO:         CELoss : 
[2023/04/26 13:46:47] ppcls INFO:             weight : 1.0
[2023/04/26 13:46:47] ppcls INFO:     Train : 
[2023/04/26 13:46:47] ppcls INFO:         CELoss : 
[2023/04/26 13:46:47] ppcls INFO:             weight : 1.0
[2023/04/26 13:46:47] ppcls INFO: Metric : 
[2023/04/26 13:46:47] ppcls INFO:     Eval : 
[2023/04/26 13:46:47] ppcls INFO:         TopkAcc : 
[2023/04/26 13:46:47] ppcls INFO:             topk : [1, 5]
[2023/04/26 13:46:47] ppcls INFO:     Train : 
[2023/04/26 13:46:47] ppcls INFO:         TopkAcc : 
[2023/04/26 13:46:47] ppcls INFO:             topk : [1, 5]
[2023/04/26 13:46:47] ppcls INFO: Optimizer : 
[2023/04/26 13:46:47] ppcls INFO:     lr : 
[2023/04/26 13:46:47] ppcls INFO:         learning_rate : 0.0125
[2023/04/26 13:46:47] ppcls INFO:         name : Cosine
[2023/04/26 13:46:47] ppcls INFO:         warmup_epoch : 5
[2023/04/26 13:46:47] ppcls INFO:     momentum : 0.9
[2023/04/26 13:46:47] ppcls INFO:     name : Momentum
[2023/04/26 13:46:47] ppcls INFO:     regularizer : 
[2023/04/26 13:46:47] ppcls INFO:         coeff : 1e-05
[2023/04/26 13:46:47] ppcls INFO:         name : L2
[2023/04/26 13:46:47] ppcls INFO: train with paddle 2.4.1 and device Place(gpu:0)
W0426 13:46:47.766211 20649 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0426 13:46:47.770717 20649 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2023/04/26 13:46:51] ppcls WARNING: The output dims(2) is less than k(5), and the argument 5 of Topk has been removed.
[2023/04/26 13:46:51] ppcls INFO: [Eval][Epoch 0][Iter: 0/1]CELoss: 0.12295, loss: 0.12295, top1: 1.00000, batch_cost: 2.40572s, reader_cost: 0.52183, ips: 4.15676 images/sec
[2023/04/26 13:46:51] ppcls INFO: [Eval][Epoch 0][Avg]CELoss: 0.12295, loss: 0.12295, top1: 1.00000

As the logs show, the classification accuracy here is very high: top-1 accuracy on the validation set reaches 1.0.

Next, let's look at the concrete per-image scores on the validation set.

!python ./PaddleClas/tools/infer.py \
    -c ./work/configs/ShuffleNetV2_x0_25.yaml  \
    -o Infer.infer_imgs=./work/data/step_2_clas/val/t1/ \
    -o Global.pretrained_model=./output/ShuffleNetV2_x0_25/best_model

The prediction results for the t1 validation images are:

[{'class_ids': [1], 'scores': [0.79564], 'file_name': './work/data/step_2_clas/val/t1/WIN_20230220_15_15_40_Pro.jpg', 'label_names': []}, 
{'class_ids': [1], 'scores': [0.8899], 'file_name': './work/data/step_2_clas/val/t1/WIN_20230220_15_16_00_Pro.jpg', 'label_names': []}, 
{'class_ids': [1], 'scores': [0.55245], 'file_name': './work/data/step_2_clas/val/t1/WIN_20230220_15_16_13_Pro.jpg', 'label_names': []}, 
{'class_ids': [1], 'scores': [0.94936], 'file_name': './work/data/step_2_clas/val/t1/WIN_20230220_15_17_17_Pro.jpg', 'label_names': []}, 
{'class_ids': [1], 'scores': [0.9375], 'file_name': './work/data/step_2_clas/val/t1/WIN_20230220_15_17_22_Pro.jpg', 'label_names': []}]

The model assigns the correct instrument type to all of these validation images, although the confidence varies (from about 0.55 to 0.95).
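
Note that label_names comes back empty because class_id_map_file is None in the config. Below is a minimal sketch (not part of the original flow) of mapping the numeric class ids to instrument types; it assumes label index 0 corresponds to the t0 folder and 1 to t1, which is consistent with the t1 validation images above all receiving class id 1 while eval top-1 is 1.0.

id_to_type = {0: 't0', 1: 't1'}  # assumption: label ids follow the t0/t1 folder order used for train_list.txt

results = [
    {'class_ids': [1], 'scores': [0.79564], 'file_name': './work/data/step_2_clas/val/t1/WIN_20230220_15_15_40_Pro.jpg'},
    # ... remaining entries from the output above
]
for r in results:
    print(r['file_name'].split('/')[-1], '->', id_to_type[r['class_ids'][0]], r['scores'][0])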

3. Stage 3: generate masks for the key fields

3.1 Annotate the key fields on the screen images

Download the step_1_screen images generated above as a package, then annotate them with Label Studio.

3.2 Convert to the PaddleOCR format

After annotation, upload the JSON annotation file, place it at work/data/step_3_screen_anno_kie.json, and convert the format.

Alternatively, the annotation can be done directly with PaddleOCR's PPOCRLabel tool.

!mkdir ./work/data/step_3_mask

with open('./work/data/step_3_screen_anno_kie.json') as f:
    step_3_screen_anno_kie = json.load(f)

kie_labels = []
for data in step_3_screen_anno_kie:
    _filename = data['file_upload'].split('-')[1]
    _result = {}
    for _anno in data['annotations']:
        for anno in _anno['result']:
            _id = anno['id']
            if _id not in _result:
                _result[_id] = {
                    "transcription": None, 
                    "label": None, 
                    "points": [], 
                    "id": _id, 
                    "linking": []
                }
            _result[_id]['transcription'] = _result[_id]['transcription'] if 'text' not in anno['value'] else anno['value']['text'][0].strip()
            _result[_id]['label'] = _result[_id]['label'] if 'labels' not in anno['value'] else anno['value']['labels'][0].strip()
            # Label Studio stores box coordinates as percentages; convert to 512x512 pixels
            _x = int(512 * anno['value']['x'] / 100)
            _y = int(512 * anno['value']['y'] / 100)
            _width = int(512 * anno['value']['width'] / 100)
            _height = int(512 * anno['value']['height'] / 100)
            _result[_id]['points'] = [
                [_x, _y],
                [_x+_width, _y],
                [_x+_width, _y+_height],
                [_x, _y+_height]
            ]
            
    kie_labels.append(
        (_filename, list(_result.values()))
    )

with open('./work/data/step_3_mask/step_3_screen_mask.json', 'w') as f:
    for line in kie_labels:
        f.write(line[0])
        f.write('\t')
        f.write(json.dumps(line[1], ensure_ascii=False))
        f.write('\n')

# generate the class label list
class_names = [
    'other', 'Info_Probe', 'Freq_Set', 'Freq_Main', 'Val_Total',
    'Val_X', 'Val_Y', 'Val_Z', 'Unit', 'Field',
]
with open('./work/data/step_3_mask/class_list.txt', 'w') as f:
    for name in class_names:
        f.write(name + '\n')

mkdir: 无法创建目录"./work/data/step_3_mask": 文件已存在
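
As a quick sanity check (not part of the original flow), each line of the generated label file should contain an image name, a tab, and the JSON list of that image's annotations:

# read back the first line of the generated KIE label file
with open('./work/data/step_3_mask/step_3_screen_mask.json') as f:
    name, anno = f.readline().rstrip('\n').split('\t', 1)

print(name)
print(json.dumps(json.loads(anno)[:1], ensure_ascii=False, indent=2))  # first annotation of the first image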

3.3 Generate the key-field masks

Here all annotated boxes are stacked, which gives a mask probability map for every key field of each instrument type; taking the union of these masks and adding some margin yields the final mask for each key field.

all_kie = []
all_filename = []
with open('./work/data/step_3_mask/step_3_screen_mask.json') as f:
    for line in f.readlines():
        all_filename.append(line.split('\t')[0])
        all_kie.append(json.loads(line.split('\t')[1]))

t0_keys = []
t1_keys = []
for filename in glob.glob('./work/data/step_2_clas/train/t0/*.jpg'):
    if filename.split('/')[-1] in all_filename:
        t0_keys.append(filename.split('/')[-1])
for filename in glob.glob('./work/data/step_2_clas/train/t1/*.jpg'):
    if filename.split('/')[-1] in all_filename:
        t1_keys.append(filename.split('/')[-1])
    
for filename in glob.glob('./work/data/step_2_clas/val/t0/*.jpg'):
    if filename.split('/')[-1] in all_filename:
        t0_keys.append(filename.split('/')[-1])
for filename in glob.glob('./work/data/step_2_clas/val/t1/*.jpg'):
    if filename.split('/')[-1] in all_filename:
        t1_keys.append(filename.split('/')[-1])

from collections import defaultdict
def _mask():
    return np.zeros((512, 512), dtype=int)

t0_mask = defaultdict(_mask)
t1_mask = defaultdict(_mask)

for i, filename in enumerate(all_filename):
    if filename in t0_keys:
        for anno in all_kie[i]:
            points = anno['points']
            t0_mask[anno['label']][points[1][1]:points[2][1], points[0][0]:points[1][0]] += 1
    if filename in t1_keys:
        for anno in all_kie[i]:
            points = anno['points']
            t1_mask[anno['label']][points[1][1]:points[2][1], points[0][0]:points[1][0]] += 1

t0_mask.keys(), t1_mask.keys()
(dict_keys(['Info_Probe', 'Freq_Set', 'Val_Total', 'Val_X', 'Val_Y', 'Val_Z', 'Unit', 'Field']),
 dict_keys(['Info_Probe', 'Freq_Set', 'Val_Total', 'Val_X', 'Val_Y', 'Val_Z', 'Freq_Main', 'Unit', 'Field']))
plt.figure(figsize=(16, 6))
plt.subplot(121)
plt.imshow(t0_mask['Info_Probe'])
 
plt.subplot(122)
plt.imshow(t1_mask['Info_Probe'])  

<matplotlib.image.AxesImage at 0x7f078111cf10>

(figure: accumulated Info_Probe masks for instrument types t0 and t1)

plt.imshow(cv2.imread('work/data/step_1_screen/WIN_20230220_14_56_27_Pro.jpg'))
plt.imshow(t0_mask['Info_Probe'], alpha=0.5)

<matplotlib.image.AxesImage at 0x7f0780ec4af0>

(figure: Info_Probe mask overlaid on a screen image)

def get_mask_box(mask, threshold=0, margin=0):
    # Bounding box of all pixels whose accumulated count exceeds `threshold`,
    # enlarged by `margin` pixels and clipped to the 512x512 crop.
    def _get_range(profile):
        r = np.argwhere(profile > 0)
        return np.min(r), np.max(r)

    col_profile = np.sum(mask > threshold, axis=0)  # per-column counts -> horizontal (x) extent
    row_profile = np.sum(mask > threshold, axis=1)  # per-row counts -> vertical (y) extent

    x_0, x_1 = _get_range(col_profile)
    y_0, y_1 = _get_range(row_profile)

    x_0 = max(0, x_0 - margin)
    x_1 = min(512, x_1 + margin)
    y_0 = max(0, y_0 - margin)
    y_1 = min(512, y_1 + margin)

    # corner order: (x0, y0), (x1, y0), (x0, y1), (x1, y1)
    return np.array([[x_0, y_0], [x_1, y_0], [x_0, y_1], [x_1, y_1]], dtype=int)

_key = 'Val_Total'

_t0_box = get_mask_box(t0_mask[_key], threshold=0, margin=0)
_t1_box = get_mask_box(t1_mask[_key], threshold=0, margin=0)

img = cv2.imread(np.random.choice(glob.glob('./work/data/step_2_clas/train/t1/*.jpg')))

plt.figure(figsize=(16, 6))
plt.subplot(121)
plt.imshow(img[..., ::-1])
 
plt.subplot(122)
plt.imshow(img[_t1_box[1][1]:_t1_box[2][1], _t1_box[0][0]:_t1_box[1][0], :][..., ::-1])   

<matplotlib.image.AxesImage at 0x7f0779d9f8e0>

(figure: a t1 screen image and its Val_Total region cropped with the mask box)

As shown above, this approach extracts the positions of the key fields quite well.

In addition, since industrial settings are relatively fixed and the instrument manual usually documents where each reading is shown, the masks can also be laid out by hand from the instrument's own screen layout, for example as sketched below.
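
A minimal sketch of that idea; the coordinates are purely illustrative placeholders, not measured from the real instruments. Since the screen layout of each instrument type is fixed, the key-field boxes can simply be hard-coded as pixel ranges on the 512x512 screen crop instead of being accumulated from annotations.

# hypothetical hand-picked layout for one instrument type; values are placeholders
MANUAL_BOXES_T1 = {
    'Val_Total': ((60, 140), (40, 470)),   # ((y0, y1), (x0, x1)) on the 512x512 crop
    'Unit':      ((150, 190), (300, 470)),
}

def crop_manual(img, key, boxes=MANUAL_BOXES_T1):
    (y0, y1), (x0, x1) = boxes[key]
    return img[y0:y1, x0:x1]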

4. Stage 4: finetune the recognition model

4.1 Split the PaddleOCR training and validation sets

Using the key-field annotations from before, all key regions are cropped out of the screen images and used as PaddleOCR training data.

The masks from the previous stage could be used for the cropping instead (a sketch follows below); the crops would just come out somewhat larger.

Since the dataset is small, only one image per instrument type is held out as the validation set.
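
A minimal sketch of the mask-based alternative mentioned above (the margin value is an arbitrary example): instead of each image's own annotation box, the per-class mask box from stage 3 is used to crop every image of that instrument type.

def crop_with_mask_box(img, mask, margin=5):
    # box corners are (x0, y0), (x1, y0), (x0, y1), (x1, y1) -- see get_mask_box above
    box = get_mask_box(mask, threshold=0, margin=margin)
    return img[box[0][1]:box[2][1], box[0][0]:box[1][0]]

# e.g. crop the Val_Total region from one t1 screen image
# img = cv2.imread('./work/data/step_1_screen/WIN_20230220_15_17_47_Pro.jpg')
# crop = crop_with_mask_box(img, t1_mask['Val_Total'])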

!mkdir ./work/data/step_4_ocr
!mkdir ./work/data/step_4_ocr/train
!mkdir ./work/data/step_4_ocr/val

# hold out one image per instrument type as the validation set
val_imgs = {'WIN_20230220_14_56_07_Pro.jpg', 'WIN_20230220_15_17_47_Pro.jpg'}

count_train = 0
count_val = 0
train_labels = []
val_labels = []
for i, _img in enumerate(all_filename):
    img = cv2.imread('./work/data/step_1_screen/' + _img)
    for anno in all_kie[i]:
        points = anno['points']
        label = anno['label']
        img_rec = img[points[1][1]:points[2][1], points[0][0]:points[1][0]]
        if _img in val_imgs:
            _img_name = label+'_'+_img
            cv2.imwrite('./work/data/step_4_ocr/val/'+_img_name, img_rec)
            val_labels.append((_img_name, anno['transcription']))
            count_val += 1
        else:
            _img_name = label+'_'+_img
            cv2.imwrite('./work/data/step_4_ocr/train/'+_img_name, img_rec)
            train_labels.append((_img_name, anno['transcription']))
            count_train += 1

with open('./work/data/step_4_ocr/train_labels.txt', 'w') as f:
    for _img, transcription in train_labels:
        f.write('train/'+_img+'\t'+transcription)
        f.write('\n')

with open('./work/data/step_4_ocr/val_labels.txt', 'w') as f:
    for _img, transcription in val_labels:
        f.write('val/'+_img+'\t'+transcription)
        f.write('\n')

print('Done with {} train files...'.format(count_train))
print('Done with {} val files...'.format(count_val))
mkdir: 无法创建目录"./work/data/step_4_ocr": 文件已存在
mkdir: 无法创建目录"./work/data/step_4_ocr/train": 文件已存在
mkdir: 无法创建目录"./work/data/step_4_ocr/val": 文件已存在
Done with 208 train files...
Done with 17 val files...

After the split, the data files look like this:

./work/data/step_4_ocr
├── train
│   ├── Field_WIN_20230220_14_47_59_Pro.jpg
│   ├── ...
├── train_labels.txt
├── val
│   ├── Field_WIN_20230220_14_56_07_Pro.jpg
│   ├── ..
└── val_labels.txt

2 directories, 227 files
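
As a quick check (not part of the original flow), the label files follow the "relative path + tab + transcription" layout expected by PaddleOCR's SimpleDataSet:

# print the first few entries of the training label file
with open('./work/data/step_4_ocr/train_labels.txt') as f:
    for _ in range(3):
        print(f.readline().rstrip('\n'))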

4.2 Train the PaddleOCR recognition model

First download the pretrained model manually, then start the training.

# download the pretrained model
!wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar
!tar -xf ch_PP-OCRv3_rec_train.tar
--2023-04-26 13:20:59--  https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar
正在解析主机 paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)... 182.61.200.195, 182.61.200.229, 2409:8c04:1001:1002:0:ff:b001:368a
正在连接 paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)|182.61.200.195|:443... 已连接。
已发出 HTTP 请求,正在等待回应... 200 OK
长度: 287467520 (274M) [application/x-tar]
正在保存至: “ch_PP-OCRv3_rec_train.tar”

ch_PP-OCRv3_rec_tra 100%[===================>] 274.15M  3.12MB/s    in 2m 19s  

2023-04-26 13:23:19 (1.97 MB/s) - 已保存 “ch_PP-OCRv3_rec_train.tar” [287467520/287467520])

PP-OCRv3 is used here for recognition training; the key fields in the config file are:

Global:
  debug: false
  use_gpu: true
  epoch_num: 20
  log_smooth_window: 20
  print_batch_step: 5
  save_model_dir: ./output/rec_ppocr_v3_distillation # directory where the trained model is saved
  save_epoch_step: 300
  eval_batch_step: [0, 10]
  cal_metric_during_train: true
  pretrained_model: /home/aistudio/ch_PP-OCRv3_rec_train/best_accuracy # the pretrained model downloaded above
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: /home/aistudio/work/data/step_4_ocr/val/Field_WIN_20230220_14_56_07_Pro.jpg
  character_dict_path: /home/aistudio/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt # path to the character dictionary
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: false
  save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt

...

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/work/data/step_4_ocr/ # data directory
    ext_op_transform_idx: 1
    label_file_list:
    - /home/aistudio/work/data/step_4_ocr//train_labels.txt # training label file
...

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/work/data/step_4_ocr/ # data directory
    label_file_list:
    - /home/aistudio/work/data/step_4_ocr/val_labels.txt # validation label file
...
!python3 ./PaddleOCR/tools/train.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True

[2023/04/26 14:00:41] ppocr INFO: Architecture : 
[2023/04/26 14:00:41] ppocr INFO:     Models : 
[2023/04/26 14:00:41] ppocr INFO:         Student : 
[2023/04/26 14:00:41] ppocr INFO:             Backbone : 
[2023/04/26 14:00:41] ppocr INFO:                 last_conv_stride : [1, 2]
[2023/04/26 14:00:41] ppocr INFO:                 last_pool_type : avg
[2023/04/26 14:00:41] ppocr INFO:                 name : MobileNetV1Enhance
[2023/04/26 14:00:41] ppocr INFO:                 scale : 0.5
[2023/04/26 14:00:41] ppocr INFO:             Head : 
[2023/04/26 14:00:41] ppocr INFO:                 head_list : 
[2023/04/26 14:00:41] ppocr INFO:                     CTCHead : 
[2023/04/26 14:00:41] ppocr INFO:                         Head : 
[2023/04/26 14:00:41] ppocr INFO:                             fc_decay : 1e-05
[2023/04/26 14:00:41] ppocr INFO:                         Neck : 
[2023/04/26 14:00:41] ppocr INFO:                             depth : 2
[2023/04/26 14:00:41] ppocr INFO:                             dims : 64
[2023/04/26 14:00:41] ppocr INFO:                             hidden_dims : 120
[2023/04/26 14:00:41] ppocr INFO:                             name : svtr
[2023/04/26 14:00:41] ppocr INFO:                             use_guide : True
[2023/04/26 14:00:41] ppocr INFO:                     SARHead : 
[2023/04/26 14:00:41] ppocr INFO:                         enc_dim : 512
[2023/04/26 14:00:41] ppocr INFO:                         max_text_length : 25
[2023/04/26 14:00:41] ppocr INFO:                 name : MultiHead
[2023/04/26 14:00:41] ppocr INFO:             Transform : None
[2023/04/26 14:00:41] ppocr INFO:             algorithm : SVTR
[2023/04/26 14:00:41] ppocr INFO:             freeze_params : False
[2023/04/26 14:00:41] ppocr INFO:             model_type : rec
[2023/04/26 14:00:41] ppocr INFO:             pretrained : None
[2023/04/26 14:00:41] ppocr INFO:             return_all_feats : True
[2023/04/26 14:00:41] ppocr INFO:         Teacher : 
[2023/04/26 14:00:41] ppocr INFO:             Backbone : 
[2023/04/26 14:00:41] ppocr INFO:                 last_conv_stride : [1, 2]
[2023/04/26 14:00:41] ppocr INFO:                 last_pool_type : avg
[2023/04/26 14:00:41] ppocr INFO:                 name : MobileNetV1Enhance
[2023/04/26 14:00:41] ppocr INFO:                 scale : 0.5
[2023/04/26 14:00:41] ppocr INFO:             Head : 
[2023/04/26 14:00:41] ppocr INFO:                 head_list : 
[2023/04/26 14:00:41] ppocr INFO:                     CTCHead : 
[2023/04/26 14:00:41] ppocr INFO:                         Head : 
[2023/04/26 14:00:41] ppocr INFO:                             fc_decay : 1e-05
[2023/04/26 14:00:41] ppocr INFO:                         Neck : 
[2023/04/26 14:00:41] ppocr INFO:                             depth : 2
[2023/04/26 14:00:41] ppocr INFO:                             dims : 64
[2023/04/26 14:00:41] ppocr INFO:                             hidden_dims : 120
[2023/04/26 14:00:41] ppocr INFO:                             name : svtr
[2023/04/26 14:00:41] ppocr INFO:                             use_guide : True
[2023/04/26 14:00:41] ppocr INFO:                     SARHead : 
[2023/04/26 14:00:41] ppocr INFO:                         enc_dim : 512
[2023/04/26 14:00:41] ppocr INFO:                         max_text_length : 25
[2023/04/26 14:00:41] ppocr INFO:                 name : MultiHead
[2023/04/26 14:00:41] ppocr INFO:             Transform : None
[2023/04/26 14:00:41] ppocr INFO:             algorithm : SVTR
[2023/04/26 14:00:41] ppocr INFO:             freeze_params : False
[2023/04/26 14:00:41] ppocr INFO:             model_type : rec
[2023/04/26 14:00:41] ppocr INFO:             pretrained : None
[2023/04/26 14:00:41] ppocr INFO:             return_all_feats : True
[2023/04/26 14:00:41] ppocr INFO:     algorithm : Distillation
[2023/04/26 14:00:41] ppocr INFO:     model_type : rec
[2023/04/26 14:00:41] ppocr INFO:     name : DistillationModel
[2023/04/26 14:00:41] ppocr INFO: Eval : 
[2023/04/26 14:00:41] ppocr INFO:     dataset : 
[2023/04/26 14:00:41] ppocr INFO:         data_dir : /home/aistudio/work/data/step_4_ocr/
[2023/04/26 14:00:41] ppocr INFO:         label_file_list : ['/home/aistudio/work/data/step_4_ocr/val_labels.txt']
[2023/04/26 14:00:41] ppocr INFO:         name : SimpleDataSet
[2023/04/26 14:00:41] ppocr INFO:         transforms : 
[2023/04/26 14:00:41] ppocr INFO:             DecodeImage : 
[2023/04/26 14:00:41] ppocr INFO:                 channel_first : False
[2023/04/26 14:00:41] ppocr INFO:                 img_mode : BGR
[2023/04/26 14:00:41] ppocr INFO:             MultiLabelEncode : None
[2023/04/26 14:00:41] ppocr INFO:             RecResizeImg : 
[2023/04/26 14:00:41] ppocr INFO:                 image_shape : [3, 48, 320]
[2023/04/26 14:00:41] ppocr INFO:             KeepKeys : 
[2023/04/26 14:00:41] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2023/04/26 14:00:41] ppocr INFO:     loader : 
[2023/04/26 14:00:41] ppocr INFO:         batch_size_per_card : 32
[2023/04/26 14:00:41] ppocr INFO:         drop_last : False
[2023/04/26 14:00:41] ppocr INFO:         num_workers : 4
[2023/04/26 14:00:41] ppocr INFO:         shuffle : False
[2023/04/26 14:00:41] ppocr INFO: Global : 
[2023/04/26 14:00:41] ppocr INFO:     cal_metric_during_train : True
[2023/04/26 14:00:41] ppocr INFO:     character_dict_path : /home/aistudio/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt
[2023/04/26 14:00:41] ppocr INFO:     checkpoints : None
[2023/04/26 14:00:41] ppocr INFO:     debug : False
[2023/04/26 14:00:41] ppocr INFO:     distributed : False
[2023/04/26 14:00:41] ppocr INFO:     epoch_num : 20
[2023/04/26 14:00:41] ppocr INFO:     eval_batch_step : [0, 10]
[2023/04/26 14:00:41] ppocr INFO:     infer_img : /home/aistudio/work/data/step_4_ocr/val/Field_WIN_20230220_14_56_07_Pro.jpg
[2023/04/26 14:00:41] ppocr INFO:     infer_mode : False
[2023/04/26 14:00:41] ppocr INFO:     log_smooth_window : 20
[2023/04/26 14:00:41] ppocr INFO:     max_text_length : 25
[2023/04/26 14:00:41] ppocr INFO:     pretrained_model : /home/aistudio/ch_PP-OCRv3_rec_train/best_accuracy
[2023/04/26 14:00:41] ppocr INFO:     print_batch_step : 5
[2023/04/26 14:00:41] ppocr INFO:     save_epoch_step : 300
[2023/04/26 14:00:41] ppocr INFO:     save_inference_dir : None
[2023/04/26 14:00:41] ppocr INFO:     save_model_dir : ./output/rec_ppocr_v3_distillation
[2023/04/26 14:00:41] ppocr INFO:     save_res_path : ./output/rec/predicts_ppocrv3_distillation.txt
[2023/04/26 14:00:41] ppocr INFO:     scale_loss : 1024.0
[2023/04/26 14:00:41] ppocr INFO:     use_amp : True
[2023/04/26 14:00:41] ppocr INFO:     use_dynamic_loss_scaling : True
[2023/04/26 14:00:41] ppocr INFO:     use_gpu : True
[2023/04/26 14:00:41] ppocr INFO:     use_space_char : True
[2023/04/26 14:00:41] ppocr INFO:     use_visualdl : False
[2023/04/26 14:00:41] ppocr INFO: Loss : 
[2023/04/26 14:00:41] ppocr INFO:     loss_config_list : 
[2023/04/26 14:00:41] ppocr INFO:         DistillationDMLLoss : 
[2023/04/26 14:00:41] ppocr INFO:             act : softmax
[2023/04/26 14:00:41] ppocr INFO:             dis_head : ctc
[2023/04/26 14:00:41] ppocr INFO:             key : head_out
[2023/04/26 14:00:41] ppocr INFO:             model_name_pairs : [['Student', 'Teacher']]
[2023/04/26 14:00:41] ppocr INFO:             multi_head : True
[2023/04/26 14:00:41] ppocr INFO:             name : dml_ctc
[2023/04/26 14:00:41] ppocr INFO:             use_log : True
[2023/04/26 14:00:41] ppocr INFO:             weight : 1.0
[2023/04/26 14:00:41] ppocr INFO:         DistillationDMLLoss : 
[2023/04/26 14:00:41] ppocr INFO:             act : softmax
[2023/04/26 14:00:41] ppocr INFO:             dis_head : sar
[2023/04/26 14:00:41] ppocr INFO:             key : head_out
[2023/04/26 14:00:41] ppocr INFO:             model_name_pairs : [['Student', 'Teacher']]
[2023/04/26 14:00:41] ppocr INFO:             multi_head : True
[2023/04/26 14:00:41] ppocr INFO:             name : dml_sar
[2023/04/26 14:00:41] ppocr INFO:             use_log : True
[2023/04/26 14:00:41] ppocr INFO:             weight : 0.5
[2023/04/26 14:00:41] ppocr INFO:         DistillationDistanceLoss : 
[2023/04/26 14:00:41] ppocr INFO:             key : backbone_out
[2023/04/26 14:00:41] ppocr INFO:             mode : l2
[2023/04/26 14:00:41] ppocr INFO:             model_name_pairs : [['Student', 'Teacher']]
[2023/04/26 14:00:41] ppocr INFO:             weight : 1.0
[2023/04/26 14:00:41] ppocr INFO:         DistillationCTCLoss : 
[2023/04/26 14:00:41] ppocr INFO:             key : head_out
[2023/04/26 14:00:41] ppocr INFO:             model_name_list : ['Student', 'Teacher']
[2023/04/26 14:00:41] ppocr INFO:             multi_head : True
[2023/04/26 14:00:41] ppocr INFO:             weight : 1.0
[2023/04/26 14:00:41] ppocr INFO:         DistillationSARLoss : 
[2023/04/26 14:00:41] ppocr INFO:             key : head_out
[2023/04/26 14:00:41] ppocr INFO:             model_name_list : ['Student', 'Teacher']
[2023/04/26 14:00:41] ppocr INFO:             multi_head : True
[2023/04/26 14:00:41] ppocr INFO:             weight : 1.0
[2023/04/26 14:00:41] ppocr INFO:     name : CombinedLoss
[2023/04/26 14:00:41] ppocr INFO: Metric : 
[2023/04/26 14:00:41] ppocr INFO:     base_metric_name : RecMetric
[2023/04/26 14:00:41] ppocr INFO:     ignore_space : False
[2023/04/26 14:00:41] ppocr INFO:     key : Student
[2023/04/26 14:00:41] ppocr INFO:     main_indicator : acc
[2023/04/26 14:00:41] ppocr INFO:     name : DistillationMetric
[2023/04/26 14:00:41] ppocr INFO: Optimizer : 
[2023/04/26 14:00:41] ppocr INFO:     beta1 : 0.9
[2023/04/26 14:00:41] ppocr INFO:     beta2 : 0.999
[2023/04/26 14:00:41] ppocr INFO:     lr : 
[2023/04/26 14:00:41] ppocr INFO:         decay_epochs : [700]
[2023/04/26 14:00:41] ppocr INFO:         name : Piecewise
[2023/04/26 14:00:41] ppocr INFO:         values : [0.0005, 5e-05]
[2023/04/26 14:00:41] ppocr INFO:         warmup_epoch : 5
[2023/04/26 14:00:41] ppocr INFO:     name : Adam
[2023/04/26 14:00:41] ppocr INFO:     regularizer : 
[2023/04/26 14:00:41] ppocr INFO:         factor : 3e-05
[2023/04/26 14:00:41] ppocr INFO:         name : L2
[2023/04/26 14:00:41] ppocr INFO: PostProcess : 
[2023/04/26 14:00:41] ppocr INFO:     key : head_out
[2023/04/26 14:00:41] ppocr INFO:     model_name : ['Student', 'Teacher']
[2023/04/26 14:00:41] ppocr INFO:     multi_head : True
[2023/04/26 14:00:41] ppocr INFO:     name : DistillationCTCLabelDecode
[2023/04/26 14:00:41] ppocr INFO: Train : 
[2023/04/26 14:00:41] ppocr INFO:     dataset : 
[2023/04/26 14:00:41] ppocr INFO:         data_dir : /home/aistudio/work/data/step_4_ocr/
[2023/04/26 14:00:41] ppocr INFO:         ext_op_transform_idx : 1
[2023/04/26 14:00:41] ppocr INFO:         label_file_list : ['/home/aistudio/work/data/step_4_ocr//train_labels.txt']
[2023/04/26 14:00:41] ppocr INFO:         name : SimpleDataSet
[2023/04/26 14:00:41] ppocr INFO:         transforms : 
[2023/04/26 14:00:41] ppocr INFO:             DecodeImage : 
[2023/04/26 14:00:41] ppocr INFO:                 channel_first : False
[2023/04/26 14:00:41] ppocr INFO:                 img_mode : BGR
[2023/04/26 14:00:41] ppocr INFO:             RecConAug : 
[2023/04/26 14:00:41] ppocr INFO:                 ext_data_num : 2
[2023/04/26 14:00:41] ppocr INFO:                 image_shape : [48, 320, 3]
[2023/04/26 14:00:41] ppocr INFO:                 max_text_length : 25
[2023/04/26 14:00:41] ppocr INFO:                 prob : 0.5
[2023/04/26 14:00:41] ppocr INFO:             RecAug : None
[2023/04/26 14:00:41] ppocr INFO:             MultiLabelEncode : None
[2023/04/26 14:00:41] ppocr INFO:             RecResizeImg : 
[2023/04/26 14:00:41] ppocr INFO:                 image_shape : [3, 48, 320]
[2023/04/26 14:00:41] ppocr INFO:             KeepKeys : 
[2023/04/26 14:00:41] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2023/04/26 14:00:41] ppocr INFO:     loader : 
[2023/04/26 14:00:41] ppocr INFO:         batch_size_per_card : 32
[2023/04/26 14:00:41] ppocr INFO:         drop_last : True
[2023/04/26 14:00:41] ppocr INFO:         num_workers : 4
[2023/04/26 14:00:41] ppocr INFO:         shuffle : True
[2023/04/26 14:00:41] ppocr INFO: profiler_options : None
[2023/04/26 14:00:41] ppocr INFO: train with paddle 2.4.1 and device Place(gpu:0)
[2023/04/26 14:00:41] ppocr INFO: Initialize indexs of datasets:['/home/aistudio/work/data/step_4_ocr//train_labels.txt']
[2023/04/26 14:00:41] ppocr INFO: Initialize indexs of datasets:['/home/aistudio/work/data/step_4_ocr/val_labels.txt']
W0426 14:00:41.313374 23242 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0426 14:00:41.317873 23242 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2023/04/26 14:00:43] ppocr INFO: train dataloader has 6 iters
[2023/04/26 14:00:43] ppocr INFO: valid dataloader has 1 iters
[2023/04/26 14:00:43] ppocr INFO: load pretrain successful from /home/aistudio/ch_PP-OCRv3_rec_train/best_accuracy
[2023/04/26 14:00:43] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 10 iterations
W0426 14:00:47.215592 23242 rnn_kernel.cu.cc:237] If the memory space of the Input WeightList is not continuous, less efficient calculation will be called. Please call flatten_parameters() to make the input memory continuous.
W0426 14:00:47.523643 23242 rnn_kernel.cu.cc:237] If the memory space of the Input WeightList is not continuous, less efficient calculation will be called. Please call flatten_parameters() to make the input memory continuous.
[2023/04/26 14:00:49] ppocr INFO: epoch: [1/20], global_step: 5, lr: 0.000033, acc: 0.218750, norm_edit_dis: 0.618291, Teacher_acc: 0.218750, Teacher_norm_edit_dis: 0.615951, dml_ctc_0: 2.265834, loss: 67.465820, dml_sar_0: 4.138578, loss_distance_l2_Student_Teacher_0: 0.016533, loss_ctc_Student_0: 27.465954, loss_ctc_Teacher_1: 28.539000, loss_sar_Student_0: 2.618140, loss_sar_Teacher_1: 2.604934, avg_reader_cost: 0.26246 s, avg_batch_cost: 1.21846 s, avg_samples: 32.0, ips: 26.26263 samples/s, eta: 0:02:20
[2023/04/26 14:00:50] ppocr INFO: epoch: [1/20], global_step: 6, lr: 0.000042, acc: 0.218750, norm_edit_dis: 0.624227, Teacher_acc: 0.218750, Teacher_norm_edit_dis: 0.620674, dml_ctc_0: 2.323206, loss: 56.998512, dml_sar_0: 3.915677, loss_distance_l2_Student_Teacher_0: 0.016338, loss_ctc_Student_0: 22.409044, loss_ctc_Teacher_1: 23.497292, loss_sar_Student_0: 2.551030, loss_sar_Teacher_1: 2.558982, avg_reader_cost: 0.00007 s, avg_batch_cost: 0.06324 s, avg_samples: 6.4, ips: 101.20260 samples/s, eta: 0:02:01
[2023/04/26 14:00:57] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:01:01] ppocr INFO: epoch: [2/20], global_step: 10, lr: 0.000075, acc: 0.296875, norm_edit_dis: 0.660843, Teacher_acc: 0.281250, Teacher_norm_edit_dis: 0.646359, dml_ctc_0: 2.486703, loss: 47.692673, dml_sar_0: 3.781648, loss_distance_l2_Student_Teacher_0: 0.016127, loss_ctc_Student_0: 17.237976, loss_ctc_Teacher_1: 18.846182, loss_sar_Student_0: 2.473749, loss_sar_Teacher_1: 2.431814, avg_reader_cost: 1.85277 s, avg_batch_cost: 2.18952 s, avg_samples: 25.6, ips: 11.69205 samples/s, eta: 0:03:10
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.04it/s]
[2023/04/26 14:01:02] ppocr INFO: cur metric, acc: 0.7058819377165072, norm_edit_dis: 0.8555673118511611, Teacher_acc: 0.7058819377165072, Teacher_norm_edit_dis: 0.8615196893021435, fps: 245.3526762211173
[2023/04/26 14:01:09] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:01:09] ppocr INFO: best metric, acc: 0.7058819377165072, is_float16: False, norm_edit_dis: 0.8555673118511611, Teacher_acc: 0.7058819377165072, Teacher_norm_edit_dis: 0.8615196893021435, fps: 245.3526762211173, best_epoch: 2
[2023/04/26 14:01:10] ppocr INFO: epoch: [2/20], global_step: 12, lr: 0.000092, acc: 0.328125, norm_edit_dis: 0.690385, Teacher_acc: 0.312500, Teacher_norm_edit_dis: 0.669508, dml_ctc_0: 2.532462, loss: 45.867073, dml_sar_0: 3.693516, loss_distance_l2_Student_Teacher_0: 0.016127, loss_ctc_Student_0: 16.743233, loss_ctc_Teacher_1: 18.089544, loss_sar_Student_0: 2.422435, loss_sar_Teacher_1: 2.354228, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.13000 s, avg_samples: 12.8, ips: 98.45820 samples/s, eta: 0:02:42
[2023/04/26 14:01:17] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:01:20] ppocr INFO: epoch: [3/20], global_step: 15, lr: 0.000117, acc: 0.343750, norm_edit_dis: 0.730049, Teacher_acc: 0.375000, Teacher_norm_edit_dis: 0.728490, dml_ctc_0: 2.575012, loss: 42.983696, dml_sar_0: 3.689892, loss_distance_l2_Student_Teacher_0: 0.016143, loss_ctc_Student_0: 15.840259, loss_ctc_Teacher_1: 16.510084, loss_sar_Student_0: 2.273115, loss_sar_Teacher_1: 2.270792, avg_reader_cost: 1.81219 s, avg_batch_cost: 2.11303 s, avg_samples: 19.2, ips: 9.08647 samples/s, eta: 0:03:19
[2023/04/26 14:01:21] ppocr INFO: epoch: [3/20], global_step: 18, lr: 0.000142, acc: 0.406250, norm_edit_dis: 0.742920, Teacher_acc: 0.406250, Teacher_norm_edit_dis: 0.737482, dml_ctc_0: 2.510089, loss: 40.768040, dml_sar_0: 3.573122, loss_distance_l2_Student_Teacher_0: 0.016070, loss_ctc_Student_0: 14.639436, loss_ctc_Teacher_1: 15.107124, loss_sar_Student_0: 2.199036, loss_sar_Teacher_1: 2.249400, avg_reader_cost: 0.00018 s, avg_batch_cost: 0.19373 s, avg_samples: 19.2, ips: 99.10585 samples/s, eta: 0:02:47
[2023/04/26 14:01:27] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:01:30] ppocr INFO: epoch: [4/20], global_step: 20, lr: 0.000158, acc: 0.453125, norm_edit_dis: 0.766498, Teacher_acc: 0.453125, Teacher_norm_edit_dis: 0.760936, dml_ctc_0: 2.486703, loss: 39.262600, dml_sar_0: 3.410472, loss_distance_l2_Student_Teacher_0: 0.016070, loss_ctc_Student_0: 14.195126, loss_ctc_Teacher_1: 14.880583, loss_sar_Student_0: 2.110768, loss_sar_Teacher_1: 2.179938, avg_reader_cost: 1.51004 s, avg_batch_cost: 1.71247 s, avg_samples: 12.8, ips: 7.47459 samples/s, eta: 0:03:10
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.16it/s]
[2023/04/26 14:01:31] ppocr INFO: cur metric, acc: 0.7647054325262161, norm_edit_dis: 0.8690476960783861, Teacher_acc: 0.7647054325262161, Teacher_norm_edit_dis: 0.8874300382064201, fps: 297.2129417355131
[2023/04/26 14:01:38] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:01:38] ppocr INFO: best metric, acc: 0.7647054325262161, is_float16: False, norm_edit_dis: 0.8690476960783861, Teacher_acc: 0.7647054325262161, Teacher_norm_edit_dis: 0.8874300382064201, fps: 297.2129417355131, best_epoch: 4
[2023/04/26 14:01:40] ppocr INFO: epoch: [4/20], global_step: 24, lr: 0.000225, acc: 0.578125, norm_edit_dis: 0.837620, Teacher_acc: 0.562500, Teacher_norm_edit_dis: 0.832193, dml_ctc_0: 2.486703, loss: 28.321648, dml_sar_0: 2.794102, loss_distance_l2_Student_Teacher_0: 0.015937, loss_ctc_Student_0: 9.101360, loss_ctc_Teacher_1: 9.593037, loss_sar_Student_0: 2.067773, loss_sar_Teacher_1: 2.066093, avg_reader_cost: 0.00036 s, avg_batch_cost: 0.26845 s, avg_samples: 25.6, ips: 95.36338 samples/s, eta: 0:02:37
[2023/04/26 14:01:48] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:01:50] ppocr INFO: epoch: [5/20], global_step: 25, lr: 0.000242, acc: 0.593750, norm_edit_dis: 0.843491, Teacher_acc: 0.593750, Teacher_norm_edit_dis: 0.838860, dml_ctc_0: 2.491289, loss: 25.849964, dml_sar_0: 2.720895, loss_distance_l2_Student_Teacher_0: 0.016070, loss_ctc_Student_0: 8.218940, loss_ctc_Teacher_1: 8.407476, loss_sar_Student_0: 2.047373, loss_sar_Teacher_1: 2.018259, avg_reader_cost: 1.89827 s, avg_batch_cost: 2.00341 s, avg_samples: 6.4, ips: 3.19456 samples/s, eta: 0:03:07
[2023/04/26 14:01:51] ppocr INFO: epoch: [5/20], global_step: 30, lr: 0.000325, acc: 0.703125, norm_edit_dis: 0.886191, Teacher_acc: 0.687500, Teacher_norm_edit_dis: 0.883845, dml_ctc_0: 2.217640, loss: 18.657734, dml_sar_0: 2.425120, loss_distance_l2_Student_Teacher_0: 0.017844, loss_ctc_Student_0: 5.449226, loss_ctc_Teacher_1: 5.216585, loss_sar_Student_0: 1.973358, loss_sar_Teacher_1: 1.985333, avg_reader_cost: 0.00031 s, avg_batch_cost: 0.32730 s, avg_samples: 32.0, ips: 97.76977 samples/s, eta: 0:02:33
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.03it/s]
[2023/04/26 14:01:52] ppocr INFO: cur metric, acc: 0.8235289273359251, norm_edit_dis: 0.9215686735870547, Teacher_acc: 0.7647054325262161, Teacher_norm_edit_dis: 0.9031863314590207, fps: 276.2874812070863
[2023/04/26 14:02:00] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:02:00] ppocr INFO: best metric, acc: 0.8235289273359251, is_float16: False, norm_edit_dis: 0.9215686735870547, Teacher_acc: 0.7647054325262161, Teacher_norm_edit_dis: 0.9031863314590207, fps: 276.2874812070863, best_epoch: 5
[2023/04/26 14:02:07] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:02:11] ppocr INFO: epoch: [6/20], global_step: 35, lr: 0.000408, acc: 0.750000, norm_edit_dis: 0.907018, Teacher_acc: 0.718750, Teacher_norm_edit_dis: 0.901633, dml_ctc_0: 1.908825, loss: 16.794374, dml_sar_0: 2.203660, loss_distance_l2_Student_Teacher_0: 0.019825, loss_ctc_Student_0: 4.135582, loss_ctc_Teacher_1: 4.643906, loss_sar_Student_0: 1.950754, loss_sar_Teacher_1: 1.946669, avg_reader_cost: 1.82641 s, avg_batch_cost: 2.25922 s, avg_samples: 32.0, ips: 14.16415 samples/s, eta: 0:02:31
[2023/04/26 14:02:11] ppocr INFO: epoch: [6/20], global_step: 36, lr: 0.000425, acc: 0.765625, norm_edit_dis: 0.912027, Teacher_acc: 0.734375, Teacher_norm_edit_dis: 0.901633, dml_ctc_0: 1.798453, loss: 16.704975, dml_sar_0: 2.128488, loss_distance_l2_Student_Teacher_0: 0.020530, loss_ctc_Student_0: 4.051152, loss_ctc_Teacher_1: 4.371264, loss_sar_Student_0: 1.950754, loss_sar_Teacher_1: 1.964270, avg_reader_cost: 0.00006 s, avg_batch_cost: 0.07013 s, avg_samples: 6.4, ips: 91.25448 samples/s, eta: 0:02:26
[2023/04/26 14:02:18] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:02:21] ppocr INFO: epoch: [7/20], global_step: 40, lr: 0.000492, acc: 0.765625, norm_edit_dis: 0.928090, Teacher_acc: 0.781250, Teacher_norm_edit_dis: 0.930848, dml_ctc_0: 1.542307, loss: 16.559875, dml_sar_0: 1.971985, loss_distance_l2_Student_Teacher_0: 0.022730, loss_ctc_Student_0: 4.051152, loss_ctc_Teacher_1: 4.187770, loss_sar_Student_0: 1.973358, loss_sar_Teacher_1: 1.964270, avg_reader_cost: 1.61422 s, avg_batch_cost: 1.94979 s, avg_samples: 25.6, ips: 13.12963 samples/s, eta: 0:02:24
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.11it/s]
[2023/04/26 14:02:22] ppocr INFO: cur metric, acc: 0.7647054325262161, norm_edit_dis: 0.9215686735870547, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.936274547289482, fps: 286.65281574630944
[2023/04/26 14:02:22] ppocr INFO: best metric, acc: 0.8235289273359251, is_float16: False, norm_edit_dis: 0.9215686735870547, Teacher_acc: 0.7647054325262161, Teacher_norm_edit_dis: 0.9031863314590207, fps: 276.2874812070863, best_epoch: 5
[2023/04/26 14:02:23] ppocr INFO: epoch: [7/20], global_step: 42, lr: 0.000500, acc: 0.765625, norm_edit_dis: 0.934104, Teacher_acc: 0.781250, Teacher_norm_edit_dis: 0.933005, dml_ctc_0: 1.524538, loss: 16.559875, dml_sar_0: 1.906030, loss_distance_l2_Student_Teacher_0: 0.023698, loss_ctc_Student_0: 4.051152, loss_ctc_Teacher_1: 4.187770, loss_sar_Student_0: 1.950754, loss_sar_Teacher_1: 1.970239, avg_reader_cost: 0.00013 s, avg_batch_cost: 0.14352 s, avg_samples: 12.8, ips: 89.18865 samples/s, eta: 0:02:15
[2023/04/26 14:02:31] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:02:33] ppocr INFO: epoch: [8/20], global_step: 45, lr: 0.000500, acc: 0.765625, norm_edit_dis: 0.930026, Teacher_acc: 0.796875, Teacher_norm_edit_dis: 0.933005, dml_ctc_0: 1.503053, loss: 16.187134, dml_sar_0: 1.906030, loss_distance_l2_Student_Teacher_0: 0.024378, loss_ctc_Student_0: 4.051152, loss_ctc_Teacher_1: 4.141407, loss_sar_Student_0: 1.973358, loss_sar_Teacher_1: 1.946446, avg_reader_cost: 1.83885 s, avg_batch_cost: 2.11040 s, avg_samples: 19.2, ips: 9.09780 samples/s, eta: 0:02:19
[2023/04/26 14:02:34] ppocr INFO: epoch: [8/20], global_step: 48, lr: 0.000500, acc: 0.781250, norm_edit_dis: 0.948117, Teacher_acc: 0.812500, Teacher_norm_edit_dis: 0.944629, dml_ctc_0: 1.412281, loss: 15.210090, dml_sar_0: 1.906030, loss_distance_l2_Student_Teacher_0: 0.024378, loss_ctc_Student_0: 3.901334, loss_ctc_Teacher_1: 3.904406, loss_sar_Student_0: 2.003076, loss_sar_Teacher_1: 1.951821, avg_reader_cost: 0.00019 s, avg_batch_cost: 0.18971 s, avg_samples: 19.2, ips: 101.20629 samples/s, eta: 0:02:07
[2023/04/26 14:02:42] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:02:44] ppocr INFO: epoch: [9/20], global_step: 50, lr: 0.000500, acc: 0.843750, norm_edit_dis: 0.958142, Teacher_acc: 0.812500, Teacher_norm_edit_dis: 0.951574, dml_ctc_0: 1.395251, loss: 13.235109, dml_sar_0: 1.900609, loss_distance_l2_Student_Teacher_0: 0.024182, loss_ctc_Student_0: 3.330377, loss_ctc_Teacher_1: 3.337240, loss_sar_Student_0: 1.972665, loss_sar_Teacher_1: 1.951821, avg_reader_cost: 1.75878 s, avg_batch_cost: 1.95696 s, avg_samples: 12.8, ips: 6.54077 samples/s, eta: 0:02:12
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.03it/s]
[2023/04/26 14:02:45] ppocr INFO: cur metric, acc: 0.7058819377165072, norm_edit_dis: 0.8982843735582117, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9399510157150888, fps: 259.4682338385401
[2023/04/26 14:02:45] ppocr INFO: best metric, acc: 0.8235289273359251, is_float16: False, norm_edit_dis: 0.9215686735870547, Teacher_acc: 0.7647054325262161, Teacher_norm_edit_dis: 0.9031863314590207, fps: 276.2874812070863, best_epoch: 5
[2023/04/26 14:02:46] ppocr INFO: epoch: [9/20], global_step: 54, lr: 0.000500, acc: 0.812500, norm_edit_dis: 0.958826, Teacher_acc: 0.812500, Teacher_norm_edit_dis: 0.951574, dml_ctc_0: 1.327349, loss: 11.988683, dml_sar_0: 1.906030, loss_distance_l2_Student_Teacher_0: 0.024343, loss_ctc_Student_0: 2.604888, loss_ctc_Teacher_1: 2.742768, loss_sar_Student_0: 2.014726, loss_sar_Teacher_1: 1.953370, avg_reader_cost: 0.00045 s, avg_batch_cost: 0.24758 s, avg_samples: 25.6, ips: 103.40221 samples/s, eta: 0:01:57
[2023/04/26 14:02:54] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:02:56] ppocr INFO: epoch: [10/20], global_step: 55, lr: 0.000500, acc: 0.843750, norm_edit_dis: 0.963351, Teacher_acc: 0.812500, Teacher_norm_edit_dis: 0.952128, dml_ctc_0: 1.285616, loss: 11.801168, dml_sar_0: 1.906030, loss_distance_l2_Student_Teacher_0: 0.024182, loss_ctc_Student_0: 2.250480, loss_ctc_Teacher_1: 2.545674, loss_sar_Student_0: 2.014726, loss_sar_Teacher_1: 1.971787, avg_reader_cost: 1.81690 s, avg_batch_cost: 1.99592 s, avg_samples: 6.4, ips: 3.20654 samples/s, eta: 0:02:04
[2023/04/26 14:02:58] ppocr INFO: epoch: [10/20], global_step: 60, lr: 0.000500, acc: 0.843750, norm_edit_dis: 0.958598, Teacher_acc: 0.828125, Teacher_norm_edit_dis: 0.957156, dml_ctc_0: 1.285616, loss: 11.988683, dml_sar_0: 1.970335, loss_distance_l2_Student_Teacher_0: 0.022085, loss_ctc_Student_0: 2.604888, loss_ctc_Teacher_1: 2.742768, loss_sar_Student_0: 2.034446, loss_sar_Teacher_1: 1.953370, avg_reader_cost: 0.09832 s, avg_batch_cost: 0.47314 s, avg_samples: 32.0, ips: 67.63387 samples/s, eta: 0:01:48
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.12it/s]
[2023/04/26 14:02:59] ppocr INFO: cur metric, acc: 0.8235289273359251, norm_edit_dis: 0.9318627851787538, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9355392536043606, fps: 298.36957686127477
[2023/04/26 14:03:07] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:03:07] ppocr INFO: best metric, acc: 0.8235289273359251, is_float16: False, norm_edit_dis: 0.9318627851787538, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9355392536043606, fps: 298.36957686127477, best_epoch: 10
[2023/04/26 14:03:15] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:03:18] ppocr INFO: epoch: [11/20], global_step: 65, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.965351, Teacher_acc: 0.828125, Teacher_norm_edit_dis: 0.962944, dml_ctc_0: 1.250344, loss: 11.587343, dml_sar_0: 2.019257, loss_distance_l2_Student_Teacher_0: 0.021519, loss_ctc_Student_0: 2.143510, loss_ctc_Teacher_1: 2.164038, loss_sar_Student_0: 2.023615, loss_sar_Teacher_1: 1.954419, avg_reader_cost: 1.82468 s, avg_batch_cost: 2.21027 s, avg_samples: 32.0, ips: 14.47785 samples/s, eta: 0:01:40
[2023/04/26 14:03:18] ppocr INFO: epoch: [11/20], global_step: 66, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.965351, Teacher_acc: 0.828125, Teacher_norm_edit_dis: 0.962944, dml_ctc_0: 1.250344, loss: 11.587343, dml_sar_0: 2.081790, loss_distance_l2_Student_Teacher_0: 0.021744, loss_ctc_Student_0: 2.143510, loss_ctc_Teacher_1: 2.164038, loss_sar_Student_0: 1.987734, loss_sar_Teacher_1: 1.954419, avg_reader_cost: 0.00006 s, avg_batch_cost: 0.06362 s, avg_samples: 6.4, ips: 100.59391 samples/s, eta: 0:01:37
[2023/04/26 14:03:26] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:03:29] ppocr INFO: epoch: [12/20], global_step: 70, lr: 0.000500, acc: 0.859375, norm_edit_dis: 0.968764, Teacher_acc: 0.828125, Teacher_norm_edit_dis: 0.962944, dml_ctc_0: 1.184436, loss: 10.934487, dml_sar_0: 2.081790, loss_distance_l2_Student_Teacher_0: 0.021407, loss_ctc_Student_0: 1.819838, loss_ctc_Teacher_1: 1.749742, loss_sar_Student_0: 2.023615, loss_sar_Teacher_1: 1.930135, avg_reader_cost: 1.82486 s, avg_batch_cost: 2.23206 s, avg_samples: 25.6, ips: 11.46921 samples/s, eta: 0:01:33
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.03it/s]
[2023/04/26 14:03:30] ppocr INFO: cur metric, acc: 0.882352422145634, norm_edit_dis: 0.9583333578431228, Teacher_acc: 0.882352422145634, Teacher_norm_edit_dis: 0.9509804209919093, fps: 302.351144261308
[2023/04/26 14:03:38] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:03:38] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9583333578431228, Teacher_acc: 0.882352422145634, Teacher_norm_edit_dis: 0.9509804209919093, fps: 302.351144261308, best_epoch: 12
[2023/04/26 14:03:39] ppocr INFO: epoch: [12/20], global_step: 72, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.970898, Teacher_acc: 0.859375, Teacher_norm_edit_dis: 0.967383, dml_ctc_0: 1.154088, loss: 10.656655, dml_sar_0: 1.948636, loss_distance_l2_Student_Teacher_0: 0.020692, loss_ctc_Student_0: 1.976193, loss_ctc_Teacher_1: 1.885220, loss_sar_Student_0: 1.938864, loss_sar_Teacher_1: 1.930135, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.13150 s, avg_samples: 12.8, ips: 97.33783 samples/s, eta: 0:01:27
[2023/04/26 14:03:46] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:03:49] ppocr INFO: epoch: [13/20], global_step: 75, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.970898, Teacher_acc: 0.875000, Teacher_norm_edit_dis: 0.967297, dml_ctc_0: 1.154088, loss: 10.656655, dml_sar_0: 1.915415, loss_distance_l2_Student_Teacher_0: 0.019598, loss_ctc_Student_0: 2.138191, loss_ctc_Teacher_1: 2.016405, loss_sar_Student_0: 1.898308, loss_sar_Teacher_1: 1.901832, avg_reader_cost: 1.81798 s, avg_batch_cost: 2.06453 s, avg_samples: 19.2, ips: 9.29994 samples/s, eta: 0:01:24
[2023/04/26 14:03:50] ppocr INFO: epoch: [13/20], global_step: 78, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.971230, Teacher_acc: 0.890625, Teacher_norm_edit_dis: 0.970617, dml_ctc_0: 1.154088, loss: 10.648561, dml_sar_0: 1.902517, loss_distance_l2_Student_Teacher_0: 0.019362, loss_ctc_Student_0: 2.138191, loss_ctc_Teacher_1: 2.016405, loss_sar_Student_0: 1.896412, loss_sar_Teacher_1: 1.886698, avg_reader_cost: 0.00018 s, avg_batch_cost: 0.19263 s, avg_samples: 19.2, ips: 99.67507 samples/s, eta: 0:01:16
[2023/04/26 14:03:58] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:04:00] ppocr INFO: epoch: [14/20], global_step: 80, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.975614, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.971639, dml_ctc_0: 0.993562, loss: 10.156738, dml_sar_0: 1.902517, loss_distance_l2_Student_Teacher_0: 0.018972, loss_ctc_Student_0: 1.733059, loss_ctc_Teacher_1: 1.787917, loss_sar_Student_0: 1.847017, loss_sar_Teacher_1: 1.857293, avg_reader_cost: 1.83297 s, avg_batch_cost: 2.05828 s, avg_samples: 12.8, ips: 6.21878 samples/s, eta: 0:01:16
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.03it/s]
[2023/04/26 14:04:01] ppocr INFO: cur metric, acc: 0.8235289273359251, norm_edit_dis: 0.9509804209919093, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.946568658881181, fps: 309.028912205883
[2023/04/26 14:04:01] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9583333578431228, Teacher_acc: 0.882352422145634, Teacher_norm_edit_dis: 0.9509804209919093, fps: 302.351144261308, best_epoch: 12
[2023/04/26 14:04:03] ppocr INFO: epoch: [14/20], global_step: 84, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.974403, Teacher_acc: 0.890625, Teacher_norm_edit_dis: 0.971053, dml_ctc_0: 1.062437, loss: 10.648561, dml_sar_0: 2.011392, loss_distance_l2_Student_Teacher_0: 0.018306, loss_ctc_Student_0: 2.138191, loss_ctc_Teacher_1: 2.016405, loss_sar_Student_0: 1.793074, loss_sar_Teacher_1: 1.853021, avg_reader_cost: 0.00147 s, avg_batch_cost: 0.27125 s, avg_samples: 25.6, ips: 94.37786 samples/s, eta: 0:01:06
[2023/04/26 14:04:10] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:04:13] ppocr INFO: epoch: [15/20], global_step: 85, lr: 0.000500, acc: 0.890625, norm_edit_dis: 0.974403, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.971497, dml_ctc_0: 0.985621, loss: 10.388481, dml_sar_0: 2.011392, loss_distance_l2_Student_Teacher_0: 0.017734, loss_ctc_Student_0: 1.746381, loss_ctc_Teacher_1: 1.789431, loss_sar_Student_0: 1.793074, loss_sar_Teacher_1: 1.849334, avg_reader_cost: 1.92948 s, avg_batch_cost: 2.05313 s, avg_samples: 6.4, ips: 3.11719 samples/s, eta: 0:01:07
[2023/04/26 14:04:14] ppocr INFO: epoch: [15/20], global_step: 90, lr: 0.000500, acc: 0.890625, norm_edit_dis: 0.973372, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.971497, dml_ctc_0: 0.953676, loss: 10.648561, dml_sar_0: 1.827707, loss_distance_l2_Student_Teacher_0: 0.016842, loss_ctc_Student_0: 2.138191, loss_ctc_Teacher_1: 2.030717, loss_sar_Student_0: 1.770331, loss_sar_Teacher_1: 1.823424, avg_reader_cost: 0.00032 s, avg_batch_cost: 0.32752 s, avg_samples: 32.0, ips: 97.70378 samples/s, eta: 0:00:55
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.08it/s]
[2023/04/26 14:04:15] ppocr INFO: cur metric, acc: 0.882352422145634, norm_edit_dis: 0.9656862946943364, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9392157220299674, fps: 313.4370516247011
[2023/04/26 14:04:23] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:04:23] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9656862946943364, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9392157220299674, fps: 313.4370516247011, best_epoch: 15
[2023/04/26 14:04:31] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:04:35] ppocr INFO: epoch: [16/20], global_step: 95, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.974855, Teacher_acc: 0.921875, Teacher_norm_edit_dis: 0.977304, dml_ctc_0: 0.901061, loss: 9.706725, dml_sar_0: 1.790451, loss_distance_l2_Student_Teacher_0: 0.016324, loss_ctc_Student_0: 1.481798, loss_ctc_Teacher_1: 1.571879, loss_sar_Student_0: 1.760932, loss_sar_Teacher_1: 1.790874, avg_reader_cost: 1.88214 s, avg_batch_cost: 2.34855 s, avg_samples: 32.0, ips: 13.62541 samples/s, eta: 0:00:46
[2023/04/26 14:04:35] ppocr INFO: epoch: [16/20], global_step: 96, lr: 0.000500, acc: 0.890625, norm_edit_dis: 0.972206, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.974565, dml_ctc_0: 0.947276, loss: 10.057051, dml_sar_0: 1.890679, loss_distance_l2_Student_Teacher_0: 0.016084, loss_ctc_Student_0: 1.628023, loss_ctc_Teacher_1: 1.803743, loss_sar_Student_0: 1.788018, loss_sar_Teacher_1: 1.804137, avg_reader_cost: 0.00006 s, avg_batch_cost: 0.06527 s, avg_samples: 6.4, ips: 98.04916 samples/s, eta: 0:00:44
[2023/04/26 14:04:43] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:04:46] ppocr INFO: epoch: [17/20], global_step: 100, lr: 0.000500, acc: 0.875000, norm_edit_dis: 0.971667, Teacher_acc: 0.890625, Teacher_norm_edit_dis: 0.972520, dml_ctc_0: 0.997186, loss: 10.057051, dml_sar_0: 1.790451, loss_distance_l2_Student_Teacher_0: 0.015053, loss_ctc_Student_0: 1.628023, loss_ctc_Teacher_1: 1.803743, loss_sar_Student_0: 1.795257, loss_sar_Teacher_1: 1.838318, avg_reader_cost: 1.79229 s, avg_batch_cost: 2.13328 s, avg_samples: 25.6, ips: 12.00030 samples/s, eta: 0:00:37
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.14it/s]
[2023/04/26 14:04:47] ppocr INFO: cur metric, acc: 0.8235289273359251, norm_edit_dis: 0.9473039525663024, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9428921904555743, fps: 317.4814907164166
[2023/04/26 14:04:47] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9656862946943364, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9392157220299674, fps: 313.4370516247011, best_epoch: 15
[2023/04/26 14:04:47] ppocr INFO: epoch: [17/20], global_step: 102, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.971667, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.972520, dml_ctc_0: 0.938884, loss: 10.015975, dml_sar_0: 1.790451, loss_distance_l2_Student_Teacher_0: 0.014847, loss_ctc_Student_0: 1.628023, loss_ctc_Teacher_1: 1.803743, loss_sar_Student_0: 1.833902, loss_sar_Teacher_1: 1.838318, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.13731 s, avg_samples: 12.8, ips: 93.21705 samples/s, eta: 0:00:33
[2023/04/26 14:04:55] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:04:59] ppocr INFO: epoch: [18/20], global_step: 105, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.971845, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.976959, dml_ctc_0: 0.914890, loss: 9.306868, dml_sar_0: 1.758324, loss_distance_l2_Student_Teacher_0: 0.014667, loss_ctc_Student_0: 1.488187, loss_ctc_Teacher_1: 1.634420, loss_sar_Student_0: 1.833902, loss_sar_Teacher_1: 1.815358, avg_reader_cost: 1.88010 s, avg_batch_cost: 2.32430 s, avg_samples: 19.2, ips: 8.26055 samples/s, eta: 0:00:28
[2023/04/26 14:05:01] ppocr INFO: epoch: [18/20], global_step: 108, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.971845, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.973967, dml_ctc_0: 0.936108, loss: 9.306868, dml_sar_0: 1.758324, loss_distance_l2_Student_Teacher_0: 0.014452, loss_ctc_Student_0: 1.488187, loss_ctc_Teacher_1: 1.572830, loss_sar_Student_0: 1.799711, loss_sar_Teacher_1: 1.787771, avg_reader_cost: 0.00020 s, avg_batch_cost: 0.33197 s, avg_samples: 19.2, ips: 57.83719 samples/s, eta: 0:00:22
[2023/04/26 14:05:08] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:05:11] ppocr INFO: epoch: [19/20], global_step: 110, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.973111, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.977477, dml_ctc_0: 0.912113, loss: 8.525324, dml_sar_0: 1.748102, loss_distance_l2_Student_Teacher_0: 0.014305, loss_ctc_Student_0: 1.239548, loss_ctc_Teacher_1: 1.284964, loss_sar_Student_0: 1.774445, loss_sar_Teacher_1: 1.782728, avg_reader_cost: 1.84409 s, avg_batch_cost: 2.02413 s, avg_samples: 12.8, ips: 6.32372 samples/s, eta: 0:00:19
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.15it/s]
[2023/04/26 14:05:12] ppocr INFO: cur metric, acc: 0.882352422145634, norm_edit_dis: 0.9620098262687297, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9502451273067879, fps: 302.45374529690474
[2023/04/26 14:05:19] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:05:19] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9620098262687297, Teacher_acc: 0.8235289273359251, Teacher_norm_edit_dis: 0.9502451273067879, fps: 302.45374529690474, best_epoch: 19
[2023/04/26 14:05:21] ppocr INFO: epoch: [19/20], global_step: 114, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.975663, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.982684, dml_ctc_0: 0.895687, loss: 8.459529, dml_sar_0: 1.800145, loss_distance_l2_Student_Teacher_0: 0.014305, loss_ctc_Student_0: 1.212382, loss_ctc_Teacher_1: 1.273640, loss_sar_Student_0: 1.764955, loss_sar_Teacher_1: 1.759473, avg_reader_cost: 0.00106 s, avg_batch_cost: 0.25626 s, avg_samples: 25.6, ips: 99.89697 samples/s, eta: 0:00:11
[2023/04/26 14:05:29] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:05:31] ppocr INFO: epoch: [20/20], global_step: 115, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.973489, Teacher_acc: 0.906250, Teacher_norm_edit_dis: 0.978094, dml_ctc_0: 0.895687, loss: 8.466958, dml_sar_0: 1.800377, loss_distance_l2_Student_Teacher_0: 0.014305, loss_ctc_Student_0: 1.189887, loss_ctc_Teacher_1: 1.273640, loss_sar_Student_0: 1.754431, loss_sar_Teacher_1: 1.782728, avg_reader_cost: 1.80880 s, avg_batch_cost: 2.02276 s, avg_samples: 6.4, ips: 3.16399 samples/s, eta: 0:00:09
[2023/04/26 14:05:32] ppocr INFO: epoch: [20/20], global_step: 120, lr: 0.000500, acc: 0.906250, norm_edit_dis: 0.975663, Teacher_acc: 0.921875, Teacher_norm_edit_dis: 0.983571, dml_ctc_0: 0.867986, loss: 8.299030, dml_sar_0: 1.800377, loss_distance_l2_Student_Teacher_0: 0.014451, loss_ctc_Student_0: 1.031372, loss_ctc_Teacher_1: 0.996874, loss_sar_Student_0: 1.749289, loss_sar_Teacher_1: 1.756403, avg_reader_cost: 0.00256 s, avg_batch_cost: 0.31664 s, avg_samples: 32.0, ips: 101.06031 samples/s, eta: 0:00:00
eval model:: 100%|████████████████████████████████| 1/1 [00:00<00:00,  1.10it/s]
[2023/04/26 14:05:33] ppocr INFO: cur metric, acc: 0.882352422145634, norm_edit_dis: 0.9620098262687297, Teacher_acc: 0.882352422145634, Teacher_norm_edit_dis: 0.9656862946943364, fps: 302.2460504173184
[2023/04/26 14:05:41] ppocr INFO: save best model is to ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:05:41] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9620098262687297, Teacher_acc: 0.882352422145634, Teacher_norm_edit_dis: 0.9656862946943364, fps: 302.2460504173184, best_epoch: 20
[2023/04/26 14:05:49] ppocr INFO: save model in ./output/rec_ppocr_v3_distillation/latest
[2023/04/26 14:05:49] ppocr INFO: best metric, acc: 0.882352422145634, is_float16: False, norm_edit_dis: 0.9620098262687297, Teacher_acc: 0.882352422145634, Teacher_norm_edit_dis: 0.9656862946943364, fps: 302.2460504173184, best_epoch: 20

As shown above, the recognition accuracy is already quite high, close to 90%.

Compare the recognition results of the default pre-trained model and the fine-tuned model:

# the default pre-trained model
!python3 ./PaddleOCR/tools/eval.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.checkpoints=./ch_PP-OCRv3_rec_train/best_accuracy

The model before fine-tuning performs noticeably worse:

[2023/04/26 14:07:25] ppocr INFO: metric eval ***************
[2023/04/26 14:07:25] ppocr INFO: acc:0.5882349480970893
[2023/04/26 14:07:25] ppocr INFO: norm_edit_dis:0.816106550749648
[2023/04/26 14:07:25] ppocr INFO: Teacher_acc:0.6470584429067983
[2023/04/26 14:07:25] ppocr INFO: Teacher_norm_edit_dis:0.832283011822318
[2023/04/26 14:07:25] ppocr INFO: fps:9.186677358642257

# the fine-tuned model
!python3 ./PaddleOCR/tools/eval.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.checkpoints=./output/rec_ppocr_v3_distillation/best_accuracy

[2023/04/26 14:07:57] ppocr INFO: Architecture : 
[2023/04/26 14:07:57] ppocr INFO:     Models : 
[2023/04/26 14:07:57] ppocr INFO:         Student : 
[2023/04/26 14:07:57] ppocr INFO:             Backbone : 
[2023/04/26 14:07:57] ppocr INFO:                 last_conv_stride : [1, 2]
[2023/04/26 14:07:57] ppocr INFO:                 last_pool_type : avg
[2023/04/26 14:07:57] ppocr INFO:                 name : MobileNetV1Enhance
[2023/04/26 14:07:57] ppocr INFO:                 scale : 0.5
[2023/04/26 14:07:57] ppocr INFO:             Head : 
[2023/04/26 14:07:57] ppocr INFO:                 head_list : 
[2023/04/26 14:07:57] ppocr INFO:                     CTCHead : 
[2023/04/26 14:07:57] ppocr INFO:                         Head : 
[2023/04/26 14:07:57] ppocr INFO:                             fc_decay : 1e-05
[2023/04/26 14:07:57] ppocr INFO:                         Neck : 
[2023/04/26 14:07:57] ppocr INFO:                             depth : 2
[2023/04/26 14:07:57] ppocr INFO:                             dims : 64
[2023/04/26 14:07:57] ppocr INFO:                             hidden_dims : 120
[2023/04/26 14:07:57] ppocr INFO:                             name : svtr
[2023/04/26 14:07:57] ppocr INFO:                             use_guide : True
[2023/04/26 14:07:57] ppocr INFO:                     SARHead : 
[2023/04/26 14:07:57] ppocr INFO:                         enc_dim : 512
[2023/04/26 14:07:57] ppocr INFO:                         max_text_length : 25
[2023/04/26 14:07:57] ppocr INFO:                 name : MultiHead
[2023/04/26 14:07:57] ppocr INFO:             Transform : None
[2023/04/26 14:07:57] ppocr INFO:             algorithm : SVTR
[2023/04/26 14:07:57] ppocr INFO:             freeze_params : False
[2023/04/26 14:07:57] ppocr INFO:             model_type : rec
[2023/04/26 14:07:57] ppocr INFO:             pretrained : None
[2023/04/26 14:07:57] ppocr INFO:             return_all_feats : True
[2023/04/26 14:07:57] ppocr INFO:         Teacher : 
[2023/04/26 14:07:57] ppocr INFO:             Backbone : 
[2023/04/26 14:07:57] ppocr INFO:                 last_conv_stride : [1, 2]
[2023/04/26 14:07:57] ppocr INFO:                 last_pool_type : avg
[2023/04/26 14:07:57] ppocr INFO:                 name : MobileNetV1Enhance
[2023/04/26 14:07:57] ppocr INFO:                 scale : 0.5
[2023/04/26 14:07:57] ppocr INFO:             Head : 
[2023/04/26 14:07:57] ppocr INFO:                 head_list : 
[2023/04/26 14:07:57] ppocr INFO:                     CTCHead : 
[2023/04/26 14:07:57] ppocr INFO:                         Head : 
[2023/04/26 14:07:57] ppocr INFO:                             fc_decay : 1e-05
[2023/04/26 14:07:57] ppocr INFO:                         Neck : 
[2023/04/26 14:07:57] ppocr INFO:                             depth : 2
[2023/04/26 14:07:57] ppocr INFO:                             dims : 64
[2023/04/26 14:07:57] ppocr INFO:                             hidden_dims : 120
[2023/04/26 14:07:57] ppocr INFO:                             name : svtr
[2023/04/26 14:07:57] ppocr INFO:                             use_guide : True
[2023/04/26 14:07:57] ppocr INFO:                     SARHead : 
[2023/04/26 14:07:57] ppocr INFO:                         enc_dim : 512
[2023/04/26 14:07:57] ppocr INFO:                         max_text_length : 25
[2023/04/26 14:07:57] ppocr INFO:                 name : MultiHead
[2023/04/26 14:07:57] ppocr INFO:             Transform : None
[2023/04/26 14:07:57] ppocr INFO:             algorithm : SVTR
[2023/04/26 14:07:57] ppocr INFO:             freeze_params : False
[2023/04/26 14:07:57] ppocr INFO:             model_type : rec
[2023/04/26 14:07:57] ppocr INFO:             pretrained : None
[2023/04/26 14:07:57] ppocr INFO:             return_all_feats : True
[2023/04/26 14:07:57] ppocr INFO:     algorithm : Distillation
[2023/04/26 14:07:57] ppocr INFO:     model_type : rec
[2023/04/26 14:07:57] ppocr INFO:     name : DistillationModel
[2023/04/26 14:07:57] ppocr INFO: Eval : 
[2023/04/26 14:07:57] ppocr INFO:     dataset : 
[2023/04/26 14:07:57] ppocr INFO:         data_dir : /home/aistudio/work/data/step_4_ocr/
[2023/04/26 14:07:57] ppocr INFO:         label_file_list : ['/home/aistudio/work/data/step_4_ocr/val_labels.txt']
[2023/04/26 14:07:57] ppocr INFO:         name : SimpleDataSet
[2023/04/26 14:07:57] ppocr INFO:         transforms : 
[2023/04/26 14:07:57] ppocr INFO:             DecodeImage : 
[2023/04/26 14:07:57] ppocr INFO:                 channel_first : False
[2023/04/26 14:07:57] ppocr INFO:                 img_mode : BGR
[2023/04/26 14:07:57] ppocr INFO:             MultiLabelEncode : None
[2023/04/26 14:07:57] ppocr INFO:             RecResizeImg : 
[2023/04/26 14:07:57] ppocr INFO:                 image_shape : [3, 48, 320]
[2023/04/26 14:07:57] ppocr INFO:             KeepKeys : 
[2023/04/26 14:07:57] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2023/04/26 14:07:57] ppocr INFO:     loader : 
[2023/04/26 14:07:57] ppocr INFO:         batch_size_per_card : 32
[2023/04/26 14:07:57] ppocr INFO:         drop_last : False
[2023/04/26 14:07:57] ppocr INFO:         num_workers : 4
[2023/04/26 14:07:57] ppocr INFO:         shuffle : False
[2023/04/26 14:07:57] ppocr INFO: Global : 
[2023/04/26 14:07:57] ppocr INFO:     cal_metric_during_train : True
[2023/04/26 14:07:57] ppocr INFO:     character_dict_path : /home/aistudio/PaddleOCR/ppocr/utils/ppocr_keys_v1.txt
[2023/04/26 14:07:57] ppocr INFO:     checkpoints : ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:07:57] ppocr INFO:     debug : False
[2023/04/26 14:07:57] ppocr INFO:     distributed : False
[2023/04/26 14:07:57] ppocr INFO:     epoch_num : 20
[2023/04/26 14:07:57] ppocr INFO:     eval_batch_step : [0, 10]
[2023/04/26 14:07:57] ppocr INFO:     infer_img : /home/aistudio/work/data/step_4_ocr/val/Field_WIN_20230220_14_56_07_Pro.jpg
[2023/04/26 14:07:57] ppocr INFO:     infer_mode : False
[2023/04/26 14:07:57] ppocr INFO:     log_smooth_window : 20
[2023/04/26 14:07:57] ppocr INFO:     max_text_length : 25
[2023/04/26 14:07:57] ppocr INFO:     pretrained_model : /home/aistudio/ch_PP-OCRv3_rec_train/best_accuracy
[2023/04/26 14:07:57] ppocr INFO:     print_batch_step : 5
[2023/04/26 14:07:57] ppocr INFO:     save_epoch_step : 300
[2023/04/26 14:07:57] ppocr INFO:     save_inference_dir : None
[2023/04/26 14:07:57] ppocr INFO:     save_model_dir : ./output/rec_ppocr_v3_distillation
[2023/04/26 14:07:57] ppocr INFO:     save_res_path : ./output/rec/predicts_ppocrv3_distillation.txt
[2023/04/26 14:07:57] ppocr INFO:     use_gpu : True
[2023/04/26 14:07:57] ppocr INFO:     use_space_char : True
[2023/04/26 14:07:57] ppocr INFO:     use_visualdl : False
[2023/04/26 14:07:57] ppocr INFO: Loss : 
[2023/04/26 14:07:57] ppocr INFO:     loss_config_list : 
[2023/04/26 14:07:57] ppocr INFO:         DistillationDMLLoss : 
[2023/04/26 14:07:57] ppocr INFO:             act : softmax
[2023/04/26 14:07:57] ppocr INFO:             dis_head : ctc
[2023/04/26 14:07:57] ppocr INFO:             key : head_out
[2023/04/26 14:07:57] ppocr INFO:             model_name_pairs : [['Student', 'Teacher']]
[2023/04/26 14:07:57] ppocr INFO:             multi_head : True
[2023/04/26 14:07:57] ppocr INFO:             name : dml_ctc
[2023/04/26 14:07:57] ppocr INFO:             use_log : True
[2023/04/26 14:07:57] ppocr INFO:             weight : 1.0
[2023/04/26 14:07:57] ppocr INFO:         DistillationDMLLoss : 
[2023/04/26 14:07:57] ppocr INFO:             act : softmax
[2023/04/26 14:07:57] ppocr INFO:             dis_head : sar
[2023/04/26 14:07:57] ppocr INFO:             key : head_out
[2023/04/26 14:07:57] ppocr INFO:             model_name_pairs : [['Student', 'Teacher']]
[2023/04/26 14:07:57] ppocr INFO:             multi_head : True
[2023/04/26 14:07:57] ppocr INFO:             name : dml_sar
[2023/04/26 14:07:57] ppocr INFO:             use_log : True
[2023/04/26 14:07:57] ppocr INFO:             weight : 0.5
[2023/04/26 14:07:57] ppocr INFO:         DistillationDistanceLoss : 
[2023/04/26 14:07:57] ppocr INFO:             key : backbone_out
[2023/04/26 14:07:57] ppocr INFO:             mode : l2
[2023/04/26 14:07:57] ppocr INFO:             model_name_pairs : [['Student', 'Teacher']]
[2023/04/26 14:07:57] ppocr INFO:             weight : 1.0
[2023/04/26 14:07:57] ppocr INFO:         DistillationCTCLoss : 
[2023/04/26 14:07:57] ppocr INFO:             key : head_out
[2023/04/26 14:07:57] ppocr INFO:             model_name_list : ['Student', 'Teacher']
[2023/04/26 14:07:57] ppocr INFO:             multi_head : True
[2023/04/26 14:07:57] ppocr INFO:             weight : 1.0
[2023/04/26 14:07:57] ppocr INFO:         DistillationSARLoss : 
[2023/04/26 14:07:57] ppocr INFO:             key : head_out
[2023/04/26 14:07:57] ppocr INFO:             model_name_list : ['Student', 'Teacher']
[2023/04/26 14:07:57] ppocr INFO:             multi_head : True
[2023/04/26 14:07:57] ppocr INFO:             weight : 1.0
[2023/04/26 14:07:57] ppocr INFO:     name : CombinedLoss
[2023/04/26 14:07:57] ppocr INFO: Metric : 
[2023/04/26 14:07:57] ppocr INFO:     base_metric_name : RecMetric
[2023/04/26 14:07:57] ppocr INFO:     ignore_space : False
[2023/04/26 14:07:57] ppocr INFO:     key : Student
[2023/04/26 14:07:57] ppocr INFO:     main_indicator : acc
[2023/04/26 14:07:57] ppocr INFO:     name : DistillationMetric
[2023/04/26 14:07:57] ppocr INFO: Optimizer : 
[2023/04/26 14:07:57] ppocr INFO:     beta1 : 0.9
[2023/04/26 14:07:57] ppocr INFO:     beta2 : 0.999
[2023/04/26 14:07:57] ppocr INFO:     lr : 
[2023/04/26 14:07:57] ppocr INFO:         decay_epochs : [700]
[2023/04/26 14:07:57] ppocr INFO:         name : Piecewise
[2023/04/26 14:07:57] ppocr INFO:         values : [0.0005, 5e-05]
[2023/04/26 14:07:57] ppocr INFO:         warmup_epoch : 5
[2023/04/26 14:07:57] ppocr INFO:     name : Adam
[2023/04/26 14:07:57] ppocr INFO:     regularizer : 
[2023/04/26 14:07:57] ppocr INFO:         factor : 3e-05
[2023/04/26 14:07:57] ppocr INFO:         name : L2
[2023/04/26 14:07:57] ppocr INFO: PostProcess : 
[2023/04/26 14:07:57] ppocr INFO:     key : head_out
[2023/04/26 14:07:57] ppocr INFO:     model_name : ['Student', 'Teacher']
[2023/04/26 14:07:57] ppocr INFO:     multi_head : True
[2023/04/26 14:07:57] ppocr INFO:     name : DistillationCTCLabelDecode
[2023/04/26 14:07:57] ppocr INFO: Train : 
[2023/04/26 14:07:57] ppocr INFO:     dataset : 
[2023/04/26 14:07:57] ppocr INFO:         data_dir : /home/aistudio/work/data/step_4_ocr/
[2023/04/26 14:07:57] ppocr INFO:         ext_op_transform_idx : 1
[2023/04/26 14:07:57] ppocr INFO:         label_file_list : ['/home/aistudio/work/data/step_4_ocr//train_labels.txt']
[2023/04/26 14:07:57] ppocr INFO:         name : SimpleDataSet
[2023/04/26 14:07:57] ppocr INFO:         transforms : 
[2023/04/26 14:07:57] ppocr INFO:             DecodeImage : 
[2023/04/26 14:07:57] ppocr INFO:                 channel_first : False
[2023/04/26 14:07:57] ppocr INFO:                 img_mode : BGR
[2023/04/26 14:07:57] ppocr INFO:             RecConAug : 
[2023/04/26 14:07:57] ppocr INFO:                 ext_data_num : 2
[2023/04/26 14:07:57] ppocr INFO:                 image_shape : [48, 320, 3]
[2023/04/26 14:07:57] ppocr INFO:                 max_text_length : 25
[2023/04/26 14:07:57] ppocr INFO:                 prob : 0.5
[2023/04/26 14:07:57] ppocr INFO:             RecAug : None
[2023/04/26 14:07:57] ppocr INFO:             MultiLabelEncode : None
[2023/04/26 14:07:57] ppocr INFO:             RecResizeImg : 
[2023/04/26 14:07:57] ppocr INFO:                 image_shape : [3, 48, 320]
[2023/04/26 14:07:57] ppocr INFO:             KeepKeys : 
[2023/04/26 14:07:57] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2023/04/26 14:07:57] ppocr INFO:     loader : 
[2023/04/26 14:07:57] ppocr INFO:         batch_size_per_card : 32
[2023/04/26 14:07:57] ppocr INFO:         drop_last : True
[2023/04/26 14:07:57] ppocr INFO:         num_workers : 4
[2023/04/26 14:07:57] ppocr INFO:         shuffle : True
[2023/04/26 14:07:57] ppocr INFO: profiler_options : None
[2023/04/26 14:07:57] ppocr INFO: train with paddle 2.4.1 and device Place(gpu:0)
[2023/04/26 14:07:57] ppocr INFO: Initialize indexs of datasets:['/home/aistudio/work/data/step_4_ocr/val_labels.txt']
W0426 14:07:57.791968 24777 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0426 14:07:57.796504 24777 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2023/04/26 14:08:00] ppocr INFO: The parameter type is float16, which is converted to float32 when loading
[2023/04/26 14:08:00] ppocr INFO: resume from ./output/rec_ppocr_v3_distillation/best_accuracy
[2023/04/26 14:08:00] ppocr INFO: metric in ckpt ***************
[2023/04/26 14:08:00] ppocr INFO: acc:0.882352422145634
[2023/04/26 14:08:00] ppocr INFO: is_float16:True
[2023/04/26 14:08:00] ppocr INFO: norm_edit_dis:0.9620098262687297
[2023/04/26 14:08:00] ppocr INFO: Teacher_acc:0.882352422145634
[2023/04/26 14:08:00] ppocr INFO: Teacher_norm_edit_dis:0.9656862946943364
[2023/04/26 14:08:00] ppocr INFO: fps:302.2460504173184
[2023/04/26 14:08:00] ppocr INFO: best_epoch:20
[2023/04/26 14:08:00] ppocr INFO: start_epoch:21
eval model:: 100%|████████████████████████████████| 1/1 [00:02<00:00,  2.46s/it]
[2023/04/26 14:08:02] ppocr INFO: metric eval ***************
[2023/04/26 14:08:02] ppocr INFO: acc:0.882352422145634
[2023/04/26 14:08:02] ppocr INFO: norm_edit_dis:0.9656862946943364
[2023/04/26 14:08:02] ppocr INFO: Teacher_acc:0.882352422145634
[2023/04/26 14:08:02] ppocr INFO: Teacher_norm_edit_dis:0.9656862946943364
[2023/04/26 14:08:02] ppocr INFO: fps:9.246709134798062

The default pre-trained model only reaches acc 0.588, while the fine-tuned model reaches 0.882.

Let's look at a concrete example of the recognized text:

plt.figure(figsize=(2, 2))
plt.imshow(cv2.imread('work/data/step_4_ocr/val/Unit_WIN_20230220_14_56_07_Pro.jpg'))
<matplotlib.image.AxesImage at 0x7f077c8789a0>

(image: main_files/main_59_1.png)

# the default pre-trained model
!python3 ./PaddleOCR/tools/infer_rec.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.pretrained_model=./ch_PP-OCRv3_rec_train/best_accuracy \
    Global.infer_img=work/data/step_4_ocr/val/Unit_WIN_20230220_14_56_07_Pro.jpg

[2023/04/26 14:10:19] ppocr INFO: infer_img: work/data/step_4_ocr/val/Unit_WIN_20230220_14_56_07_Pro.jpg
[2023/04/26 14:10:20] ppocr INFO: 	 result: {"Student": {"label": "uT", "score": 0.7819735407829285}, "Teacher": {"label": "UT", "score": 0.7493851184844971}}
[2023/04/26 14:10:20] ppocr INFO: success!

# the fine-tuned model
!python3 ./PaddleOCR/tools/infer_rec.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.pretrained_model=./output/rec_ppocr_v3_distillation/best_accuracy \
    Global.infer_img=work/data/step_4_ocr/val/Unit_WIN_20230220_14_56_07_Pro.jpg

[2023/04/26 14:11:01] ppocr INFO: infer_img: work/data/step_4_ocr/val/Unit_WIN_20230220_14_56_07_Pro.jpg
[2023/04/26 14:11:03] ppocr INFO: 	 result: {"Student": {"label": "μT", "score": 0.9993734359741211}, "Teacher": {"label": "μT", "score": 0.9988270998001099}}
[2023/04/26 14:11:03] ppocr INFO: success!

The default pre-trained model reads the image as uT, while the fine-tuned model correctly recognizes it as μT.

Image Recognition

With the models trained above, the pipeline is ready to be deployed.

Here we briefly walk through the recognition flow and its results.

First, read an image:

img_filename = './work/data/train/WIN_20230220_14_56_27_Pro.jpg'
img_raw = cv2.imread(img_filename)
plt.imshow(img_raw)
<matplotlib.image.AxesImage at 0x7f0777b39700>

(image: the raw input image)

Resize the image to 512×512 for screen segmentation:

img = cv2.resize(img_raw, (512, 512))
cv2.imwrite('./work/prediction/predict_img.jpg', img)
True

Use PaddleSeg to predict the screen segmentation mask:

!python ./PaddleSeg/tools/predict.py \
       --config ./work/configs/pp_liteseg_optic_disc_512x512_1k.yml \
       --model_path ./output/best_model/model.pdparams \
       --image_path ./work/prediction/predict_img.jpg \
       --save_dir ./work/prediction/seg/

Use OpenCV to crop the screen region out of the original image and save it (a rough sketch of the get_contours / get_det_img helpers follows below):

img_seg_predict = cv2.imread('work/prediction/seg/pseudo_color_prediction/predict_img.png')
biggest = get_contours(img_seg_predict)
img_det = get_det_img(img_seg_predict, img_raw, biggest)

plt.figure(figsize=(16, 6))
plt.subplot(131)
plt.imshow(img_seg_predict[..., ::-1])

plt.subplot(132)
plt.imshow(img_raw[..., ::-1])

plt.subplot(133)
plt.imshow(img_det[..., ::-1])   

<matplotlib.image.AxesImage at 0x7f0777ceaca0>

(image: segmentation mask, original image, and cropped screen, side by side)

cv2.imwrite('./work/prediction/predict_screen.jpg', img_det)
True
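get_contours and get_det_img are helper functions defined earlier in this notebook. For readers jumping straight to this section, the following is only a minimal sketch of what they roughly do; the _sketch names, the assumed corner ordering, and the warp output size are my own illustration, not the original implementation:

import cv2
import numpy as np

def get_contours_sketch(seg_img):
    # find the largest connected region in the pseudo-color mask
    gray = cv2.cvtColor(seg_img, cv2.COLOR_BGR2GRAY)
    contours, _ = cv2.findContours(gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    biggest = max(contours, key=cv2.contourArea)
    # approximate it with a polygon; for a clean mask this yields the 4 screen corners
    peri = cv2.arcLength(biggest, True)
    approx = cv2.approxPolyDP(biggest, 0.02 * peri, True)
    return approx.reshape(-1, 2)

def get_det_img_sketch(seg_img, raw_img, corners, out_size=(960, 540)):
    # scale the corners from the 512x512 mask back to the raw resolution,
    # then warp the screen region out of the original image
    h, w = raw_img.shape[:2]
    sh, sw = seg_img.shape[:2]
    pts = (corners.astype(np.float32) * np.float32([w / sw, h / sh]))[:4]
    dst = np.float32([[0, 0], [out_size[0], 0], [out_size[0], out_size[1]], [0, out_size[1]]])
    M = cv2.getPerspectiveTransform(pts, dst)  # assumes corners are ordered tl, tr, br, bl
    return cv2.warpPerspective(raw_img, M, out_size)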

Use PaddleClas to predict which instrument type this screen belongs to.

./PaddleClas/tools/infer.py has no option for saving its results, so the log is redirected to a file and the result is extracted manually.

!python ./PaddleClas/tools/infer.py \
    -c ./work/configs/ShuffleNetV2_x0_25.yaml  \
    -o Infer.infer_imgs=./work/prediction/predict_screen.jpg \
    -o Global.pretrained_model=./output/ShuffleNetV2_x0_25/best_model \
    > ./work/prediction/clas_log.txt
/opt/conda/envs/python35-paddle120-env/lib/python3.9/site-packages/sklearn/utils/multiclass.py:14: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
  from scipy.sparse.base import spmatrix
W0426 15:12:16.758939 31866 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0426 15:12:16.763239 31866 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
with open('./work/prediction/clas_log.txt') as f:
    prediction_clas = json.loads(f.readlines()[-1].strip()[1:-1].replace('\'', '\"'))
prediction_clas
{'class_ids': [0],
 'scores': [0.99865],
 'file_name': './work/prediction/predict_screen.jpg',
 'label_names': []}
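The last log line is a Python dict repr rather than strict JSON, so the quote replacement above can break if a value ever contains a quote. A slightly more robust sketch, assuming the printed result stays on a single line, is to parse it with ast.literal_eval:

import ast

with open('./work/prediction/clas_log.txt') as f:
    last_line = f.readlines()[-1].strip()

# tools/infer.py prints a list with one result dict per image, e.g. [{'class_ids': [0], ...}]
prediction_clas = ast.literal_eval(last_line)[0]
prediction_clas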

Here the instrument screen is predicted as class 0.

Select the mask set according to the screen class:

prediction_clas = prediction_clas['class_ids'][0]

predict_mask = None
if prediction_clas == 0:
    predict_mask = t0_mask
elif prediction_clas == 1:
    predict_mask = t1_mask
else:
    raise ValueError

keys = []
with open('./work/data/step_3_mask/class_list.txt') as f:
    for line in f.readlines()[1:]:
        keys.append(line.strip())
print(keys)
['Info_Probe', 'Freq_Set', 'Freq_Main', 'Val_Total', 'Val_X', 'Val_Y', 'Val_Z', 'Unit', 'Field']

Use the masks to crop out the key regions and save them (a sketch of get_mask_box follows below):

!mkdir ./work/prediction/ocr/

for key in keys:
    _mask = predict_mask[key]
    # skip fields whose mask is all zeros (the field is absent on this instrument type)
    if np.sum(_mask) > 0:
        box = get_mask_box(_mask, threshold=0, margin=0)
        ocr_img = img_det[box[1][1]:box[2][1], box[0][0]:box[1][0], :][..., ::-1]
        filename = key + '.jpg'
        filename = './work/prediction/ocr/'+filename
        cv2.imwrite(filename, ocr_img)
        print(filename)
./work/prediction/ocr/Info_Probe.jpg
./work/prediction/ocr/Freq_Set.jpg
./work/prediction/ocr/Val_Total.jpg
./work/prediction/ocr/Val_X.jpg
./work/prediction/ocr/Val_Y.jpg
./work/prediction/ocr/Val_Z.jpg
./work/prediction/ocr/Unit.jpg
./work/prediction/ocr/Field.jpg
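get_mask_box is likewise defined earlier in the notebook. Judging from how its return value is indexed above, it returns the four corner points of the mask's bounding box in the order top-left, top-right, bottom-right, bottom-left; a minimal sketch under that assumption:

import numpy as np

def get_mask_box_sketch(mask, threshold=0, margin=0):
    # bounding box of the pixels where mask > threshold, optionally padded by `margin`
    ys, xs = np.where(mask > threshold)
    x0, x1 = xs.min() - margin, xs.max() + margin
    y0, y1 = ys.min() - margin, ys.max() + margin
    # corner order: top-left, top-right, bottom-right, bottom-left
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]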

plt.imshow(cv2.imread('./work/prediction/ocr/Info_Probe.jpg'))
<matplotlib.image.AxesImage at 0x7f0777a59970>

(image: the cropped Info_Probe region)

Use PaddleOCR to recognize these cropped images:

!python3 ./PaddleOCR/tools/infer_rec.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.pretrained_model=./output/rec_ppocr_v3_distillation/best_accuracy \
    Global.infer_img=./work/prediction/ocr/ \
    Global.save_res_path=./work/prediction/predicts_ppocrv3_distillation.txt

Read the prediction results:

results_teacher = {}
results_student = {}
with open('./work/prediction/predicts_ppocrv3_distillation.txt') as f:
    for line in f.readlines():
        _filename, _result = line.strip().split('\t')
        _key = _filename.split('/')[-1].split('.')[0]
        _pred = json.loads(_result)  # one JSON result per cropped image
        results_student[_key] = _pred['Student']['label']
        results_teacher[_key] = _pred['Teacher']['label']
results_teacher, results_student
({'Field': '磁场',
  'Freq_Set': '100Hz',
  'Info_Probe': '探头:LF-01',
  'Unit': 'μT',
  'Val_Total': '1.8330',
  'Val_X': ':1.8077',
  'Val_Y': ':0.1609',
  'Val_Z': 'z-0.2573'},
 {'Field': '磁场',
  'Freq_Set': '100Hz',
  'Info_Probe': '探头:LF-01',
  'Unit': 'μm',
  'Val_Total': '1.8330',
  'Val_X': ':1.8077',
  'Val_Y': '.0.1609',
  'Val_Z': '.0.2573'})

Finally, some post-processing can be applied: for example, μm is clearly wrong and should be μT, and .0.2573 is not a valid number and should be 0.2573 (a sketch follows below).
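As a sketch of such post-processing (the rules below and the structured-output assembly are my own illustration of the idea, not part of the original pipeline): fix the known unit confusion, keep only the numeric part of the value fields, and assemble the structured output required by the task.

import json
import re

def postprocess(results):
    fixed = dict(results)
    # the recognizer sometimes outputs 'uT' / 'UT' / 'μm' for the unit; this meter's unit is μT
    if fixed.get('Unit') in ('uT', 'UT', 'μm'):
        fixed['Unit'] = 'μT'
    # value fields should be plain numbers; strip stray prefixes such as ':', '.', 'z-'
    # (sign handling is ignored in this sketch)
    for key in ('Val_Total', 'Val_X', 'Val_Y', 'Val_Z'):
        if key in fixed:
            m = re.search(r'\d+(?:\.\d+)?', fixed[key])
            if m:
                fixed[key] = m.group(0)
    return fixed

fields = ['Info_Probe', 'Freq_Set', 'Freq_Main', 'Val_Total',
          'Val_X', 'Val_Y', 'Val_Z', 'Unit', 'Field']
cleaned = postprocess(results_student)
final = [{k: cleaned.get(k, '')} for k in fields]
print(json.dumps(final, ensure_ascii=False))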

Summary

This article demonstrated how to use PaddleOCR/PaddleClas/PaddleSeg for end-to-end reading of a power-frequency field strength meter.

The accuracy of the key stages is summarized below:

| Model | Metric | Score |
| --- | --- | --- |
| PaddleSeg | mIoU | 0.9805 |
| PaddleClas | top1 | 1.0 |
| PaddleOCR | acc | 0.882352422145634 |

The final recognition accuracy on this dataset is close to 90%, and that was achieved with only 27 annotated images.

With more annotated data, data augmentation, and a stronger recognition model, the results would be even more useful in practice.

Some Thoughts

  • Why not use KIE or a detection-recognition pipeline?

    I did try the layoutxlm model as well as a detection-recognition pipeline (following 《基于PP-OCRv3的电表检测识别》), but with so few annotations the detection model's H-Mean only reached about 0.5, far behind the PaddleClas+Mask approach.

    Since detection is the critical step of the whole pipeline and any error there drastically reduces downstream recognition accuracy, the PaddleClas+Mask+PaddleOCR approach was chosen given the limited dataset.

    Moreover, unlike consumer scenarios such as electricity-meter monitoring, industrial scenes are relatively fixed: here the instruments are rarely replaced and only a few instrument types exist, which further favors the PaddleClas+Mask+PaddleOCR approach.

  • Trade-offs

    • Detection-recognition adapts better to new layouts; with PaddleClas+Mask, a single classification error breaks everything downstream.
    • Detection-recognition is less accurate here than PaddleClas+Mask.
  • Limitations

    Whichever approach is used, the data itself caps the achievable accuracy.

    If a new instrument type appears in the test set, any approach will likely need to be retrained.

    Even a person who has only learned to read one of these instruments would have to relearn when handed a new device, let alone a model; both the detection-recognition and the PaddleClas+Mask+PaddleOCR approaches share this limitation.

The instrument on the right of the figure above shows two frequency values while the one on the left shows only one; a layperson who had only seen the right-hand instrument would likewise be unable to read the left-hand one.

Open Issues

  • PaddleClas results have to be parsed from the log by hand; is there a simpler way?
  • The PaddleSeg/PaddleClas/PaddleOCR models can be exported and then deployed (see the sketch after this list).
  • Annotating more images and using better models would yield better results.
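For example, exporting the fine-tuned recognition model to an inference model for deployment could look like the following (paths reused from this project; the flags follow PaddleOCR's standard export script, so double-check them against the repository version you cloned):

# export the fine-tuned PP-OCRv3 recognition model as an inference model
!python3 ./PaddleOCR/tools/export_model.py \
    -c ./work/configs/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.pretrained_model=./output/rec_ppocr_v3_distillation/best_accuracy \
    Global.save_inference_dir=./output/rec_ppocr_v3_distillation_infer/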

Directory Layout

.
├── ch_PP-OCRv3_rec_train # pre-trained weights for the recognition model
│   └── best_accuracy.pdparams
├── main.ipynb # this notebook
├── output
│   ├── best_model # segmentation model training output
│   │   └── model.pdparams
│   ├── iter_1000 # segmentation model training output
│   │   ├── model.pdopt
│   │   └── model.pdparams
│   ├── rec # recognition results on the validation set
│   │   └── predicts_ppocrv3_distillation.txt
│   ├── rec_ppocr_v3_distillation # recognition model training output
│   │   ├── best_accuracy.pdopt
│   │   ├── best_accuracy.pdparams
│   │   ├── best_accuracy.states
│   │   ├── config.yml
│   │   ├── latest.pdopt
│   │   ├── latest.pdparams
│   │   ├── latest.states
│   │   └── train.log
│   ├── ShuffleNetV2_x0_25 # classification model training output
│   │   ├── best_model.pdopt
│   │   ├── best_model.pdparams
│   │   ├── best_model.pdstates
│   │   ├── eval.log
│   │   ├── infer.log
│   │   ├── latest.pdopt
│   │   ├── latest.pdparams
│   │   ├── latest.pdstates
│   │   └── train.log
│   └── vdlrecords.1682433212.log
├── PaddleClas
│   ├── ...
├── PaddleOCR
│   ├── ...
├── PaddleSeg
│   ├── ...
└── work
    ├── configs # config files for each model
    │   ├── ch_PP-OCRv3_rec_distillation.yml
    │   ├── pp_liteseg_optic_disc_512x512_1k.yml
    │   └── ShuffleNetV2_x0_25.yaml
    ├── data # data directory
    │   ├── added_prediction
    │   ├── digital_rec_hackon_train.zip # original data
    │   ├── pseudo_color_prediction
    │   ├── step_1_512_img
    │   ├── step_1_512_img_anno_seg_box.json # segmentation annotation file
    │   ├── step_1_screen
    │   ├── step_1_seg
    │   ├── step_2_clas
    │   ├── step_3_mask
    │   ├── step_3_screen_anno_kie.json # key-information annotation file
    │   ├── step_4_ocr
    │   └── train # extracted from the original data archive
    └── prediction # inference output directory
        ├── clas_log.txt
        ├── ocr
        ├── predict_img.jpg
        ├── predict_screen.jpg
        ├── predicts_ppocrv3_distillation.txt
        └── seg

