PaddleOCR - Training

MXL147

Last modified: 2023-08-25 10:29:15

Tags: deep learning

First published: 2023-06-05 16:05:48

Article link: https://blog.csdn.net/qq_39066502/article/details/131045983


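Training the text detection model

Download the pretrained model, extract it into pretrain_models/, and locate the config file that matches it, for example configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml. Then update the dataset paths in the Train and Eval sections: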

Global:
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/ch_db_res18/
  save_epoch_step: 1200
  # evaluation is run every 10 iterations, starting from iteration 0
  eval_batch_step: [0, 10]
Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/data/det_data/
    label_file_list:
      - /home/data/det_data/det_train_label.txt
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/data/det_data/
    label_file_list:
      - /home/data/det_data/det_val_label.txt
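
Before starting a long run, it can help to confirm that the label files referenced in the config are well formed. The sketch below assumes the usual PaddleOCR detection label layout: one line per image, holding the image path (relative to data_dir) and a JSON list of boxes, separated by a tab. The function name `check_det_labels` is hypothetical and not part of PaddleOCR.

```python
import json
import os


def check_det_labels(data_dir, label_file):
    """Return a list of (line_number, problem) tuples for a detection label file."""
    problems = []
    with open(label_file, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue
            parts = line.split("\t")
            if len(parts) != 2:
                problems.append((line_no, "expected 'image_path<TAB>json'"))
                continue
            img_path, annotation = parts
            # Image paths in the label file are resolved relative to data_dir.
            if not os.path.exists(os.path.join(data_dir, img_path)):
                problems.append((line_no, "image not found: " + img_path))
            try:
                boxes = json.loads(annotation)
            except json.JSONDecodeError:
                problems.append((line_no, "annotation is not valid JSON"))
                continue
            if not isinstance(boxes, list):
                problems.append((line_no, "annotation is not a JSON list"))
                continue
            for box in boxes:
                # Each box is assumed to carry its corner points and its text.
                if (not isinstance(box, dict)
                        or "points" not in box
                        or "transcription" not in box):
                    problems.append((line_no, "box lacks 'points' or 'transcription'"))
                    break
    return problems
```

Calling `check_det_labels("/home/data/det_data/", "/home/data/det_data/det_train_label.txt")` and printing the result would list any lines worth fixing; an empty list means no problems were found by these checks.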

Start training:

python tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.pretrained_model=pretrain_models/ch_ppocr_server_v2.0_det_train/best_accuracy \
        Global.epoch_num=50 Global.save_epoch_step=20 Global.save_model_dir=output/det/ Train.loader.batch_size_per_card=8 Train.loader.num_workers=2 

# Resume training from a checkpoint (pass the path prefix, without the .pdparams suffix)
python tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.checkpoints=output/det/latest \
        Global.epoch_num=50 Global.save_epoch_step=20 Global.save_model_dir=output/det/ Train.loader.batch_size_per_card=8 Train.loader.num_workers=2
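
The `-o` option overrides individual config entries with dotted keys such as `Global.epoch_num=50`. The sketch below illustrates the idea on a plain nested dict. `apply_overrides` is a hypothetical helper written for this example, not PaddleOCR's own implementation, and parsing values with `ast.literal_eval` is a simplifying assumption.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'A.b.c=value' style overrides to a nested dict in place."""
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        try:
            # Turn "50" into 50 and "[0, 10]" into a list.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal (e.g. a path) stays a string.
            value = raw_value
        keys = dotted_key.split(".")
        node = config
        for key in keys[:-1]:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


config = {"Global": {"epoch_num": 1200, "save_epoch_step": 1200}}
apply_overrides(config, [
    "Global.epoch_num=50",
    "Global.save_epoch_step=20",
    "Global.save_model_dir=output/det/",
    "Train.loader.batch_size_per_card=8",
])
print(config)
```

Values given on the command line win over those in the YAML file, which is why the run above trains for 50 epochs even though the config says 1200.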

The trained model is saved under output/det/.

Convert to an inference model:

python tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.pretrained_model=output/det/best_accuracy Global.save_inference_dir=inference/det/

The inference model is saved under inference/det/.

Convert to an ONNX model:

paddle2onnx --model_dir inference/det/ --model_filename inference.pdmodel --params_filename inference.pdiparams \
            --save_file onnx_model/det.onnx --opset_version 10 --input_shape_dict="{'x':[-1,3,-1,-1]}" --enable_onnx_checker True

The ONNX model is saved under onnx_model/.
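
The `--input_shape_dict` value in the paddle2onnx command uses -1 for every dimension that is left dynamic. Here is a small sketch of how to read it, assuming the usual NCHW layout for the input named `x`. The helper `describe_input_shape` is hypothetical; it only parses the string and does not call paddle2onnx.

```python
import ast

# Axis order assumed for image inputs: batch, channels, height, width (NCHW).
AXIS_NAMES = ("batch", "channels", "height", "width")


def describe_input_shape(shape_dict_text):
    """Map each input name to {axis: 'dynamic' or the fixed size}."""
    shapes = ast.literal_eval(shape_dict_text)
    described = {}
    for name, dims in shapes.items():
        described[name] = {
            axis: ("dynamic" if size == -1 else size)
            for axis, size in zip(AXIS_NAMES, dims)
        }
    return described


info = describe_input_shape("{'x':[-1,3,-1,-1]}")
print(info)
```

Under that assumption only the channel count is fixed at 3, so the exported model accepts any batch size and image size.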

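Training the direction classifier

Download the pretrained model, extract it into pretrain_models/, and locate the config file that matches it, for example configs/cls/cls_mv3.yml. Then update the dataset paths in the Train and Eval sections: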

Global:
  use_gpu: true
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/cls/mv3/
  save_epoch_step: 3
  # evaluation is run every 10 iterations, starting from iteration 0
  eval_batch_step: [0, 10]
Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/data/cls_data/
    label_file_list:
      - /home/data/cls_data/cls_train_label.txt
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/data/cls_data/
    label_file_list:
      - /home/data/cls_data/cls_val_label.txt
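
A heavily skewed label file can make the classifier favour one orientation, so it may be worth counting the labels first. The sketch below assumes each line of the classification label file holds an image path and a label (for example 0 or 180) separated by a tab. `count_cls_labels` is a hypothetical helper, not a PaddleOCR function.

```python
from collections import Counter


def count_cls_labels(label_file):
    """Count how many samples carry each orientation label."""
    counts = Counter()
    with open(label_file, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue
            _img_path, sep, label = line.partition("\t")
            if not sep:
                # Lines without a tab cannot be split into path and label.
                counts["<malformed>"] += 1
                continue
            counts[label.strip()] += 1
    return counts
```

For example, `count_cls_labels("/home/data/cls_data/cls_train_label.txt")` returns a `Counter` keyed by label string.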

Start training:

python tools/train.py -c configs/cls/cls_mv3.yml -o Global.pretrained_model=pretrain_models/ch_ppocr_mobile_v2.0_cls_train/best_accuracy \
        Global.epoch_num=50 Global.save_epoch_step=20 Global.save_model_dir=output/cls/ Train.loader.batch_size_per_card=8 Train.loader.num_workers=2

The trained model is saved under output/cls/.

Convert to an inference model:

python tools/export_model.py -c configs/cls/cls_mv3.yml -o Global.pretrained_model=output/cls/best_accuracy Global.save_inference_dir=inference/cls/

The inference model is saved under inference/cls/.

Convert to an ONNX model:

paddle2onnx --model_dir inference/cls/ --model_filename inference.pdmodel --params_filename inference.pdiparams \
            --save_file onnx_model/cls.onnx --opset_version 10 --input_shape_dict="{'x':[-1,3,-1,-1]}" --enable_onnx_checker True

The ONNX model is saved under onnx_model/.

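Training the text recognition model

Download the pretrained model, extract it into pretrain_models/, and locate the config file that matches it, for example configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml. Then update the dataset paths in the Train and Eval sections, as well as the character dictionary path (the default dictionary is used here):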

Global:
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_chinese_common_v2.0
  save_epoch_step: 3
  # evaluation is run every 10 iterations, starting from iteration 0
  eval_batch_step: [0, 10]
Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/data/res_data/
    label_file_list: ["/home/data/res_data/res_train_label.txt"]
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/data/res_data/
    label_file_list: ["/home/data/res_data/res_val_label.txt"]
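
Since recognition depends on the character dictionary, it is worth checking that every character in the labels appears in it. The sketch below assumes the label file stores an image path and its text separated by a tab, and that the dictionary file lists one character per line. `find_unknown_chars` is a hypothetical helper, and its `use_space_char` argument merely mirrors the idea of the PaddleOCR option of the same name.

```python
def find_unknown_chars(label_file, dict_file, use_space_char=True):
    """Return the set of label characters missing from the dictionary."""
    with open(dict_file, encoding="utf-8") as f:
        known = {line.strip("\r\n") for line in f}
    known.discard("")
    if use_space_char:
        # Treat the space as a valid character when it is enabled.
        known.add(" ")
    unknown = set()
    with open(label_file, encoding="utf-8") as f:
        for line in f:
            _img_path, sep, text = line.rstrip("\r\n").partition("\t")
            if sep:
                unknown.update(ch for ch in text if ch not in known)
    return unknown
```

An empty result from `find_unknown_chars("/home/data/res_data/res_train_label.txt", "ppocr/utils/ppocr_keys_v1.txt")` suggests the default dictionary covers the training text. Otherwise the returned characters would need to be added to a custom dictionary or removed from the labels.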

Start training:

python tools/train.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml -o Global.pretrained_model=pretrain_models/ch_ppocr_server_v2.0_rec_train/best_accuracy \
      Global.character_dict_path=ppocr/utils/ppocr_keys_v1.txt Global.epoch_num=50 Global.save_epoch_step=20 Global.save_model_dir=output/rec/ \
      Train.loader.batch_size_per_card=8 Train.loader.num_workers=2

The trained model is saved under output/rec/.

Convert to an inference model:

python tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml -o Global.pretrained_model=output/rec/best_accuracy \
      Global.save_inference_dir=inference/rec/ Global.character_dict_path=ppocr/utils/ppocr_keys_v1.txt

The inference model is saved under inference/rec/.

Convert to an ONNX model:

paddle2onnx --model_dir inference/rec/ --model_filename inference.pdmodel --params_filename inference.pdiparams \
            --save_file onnx_model/rec.onnx --opset_version 10 --input_shape_dict="{'x':[-1,3,-1,-1]}" --enable_onnx_checker True

The ONNX model is saved under onnx_model/.

Model inference

Run inference with the exported inference models:

python tools/infer/predict_system.py  --image_dir doc/imgs/00111002.jpg \
                                      --det_model_dir inference/det/ \
                                      --rec_model_dir inference/rec/ \
                                      --cls_model_dir inference/cls/ \
                                      --use_angle_cls True \
                                      --use_space_char True

Run inference with the exported ONNX models:

If needed, adjust the input size of the ONNX models first. The modified models are placed under onnx_inference/:

python tools/infer/predict_system.py --use_gpu=False --use_onnx=True \
                                    --det_model_dir=onnx_inference/det.onnx  \
                                    --rec_model_dir=onnx_inference/rec.onnx  \
                                    --cls_model_dir=onnx_inference/cls.onnx  \
                                    --image_dir=doc/imgs/00111002.jpg \
                                    --rec_char_dict_path=ppocr/utils/ppocr_keys_v1.txt
