Setting up and using PaddleOCR on Windows 10

你说的对.

Last modified: 2023-01-10 16:20:05

First published: 2023-01-04 11:59:14

Article link: https://blog.csdn.net/qq_44465286/article/details/128545187

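This article explains in detail how to set up PyCharm, Anaconda3, and the VS build environment on Windows 10, including downloading and installing the tools and editing environment variables. It then covers creating an Anaconda environment for PyCharm, installing PaddlePaddle and PaddleOCR, and handling problems that may come up during installation. Finally, it gives a PaddleOCR training configuration and the training command.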

Environment

PyCharm, Anaconda3, and the Visual Studio (VS) build tools on Windows 10

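Setting up the VS environment

1. Download the VS build tools from
https://visualstudio.microsoft.com/visual-cpp-build-tools/

2. Run the installer and select the components shown in the screenshot
[screenshot: component selection in the installer]
Pick the Windows 10 SDK version that matches your own machine.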

3. Update the environment variables
Create each variable listed here if it does not already exist, then add the given paths. Replace <SDK version> with the version folder installed on your machine, and separate multiple values with an ASCII semicolon ";".
INCLUDE: "C:\Program Files (x86)\Windows Kits\10\Include\<SDK version>\ucrt"
INCLUDE: "C:\Program Files (x86)\Windows Kits\10\Include\<SDK version>\shared"
LIB: "C:\Program Files (x86)\Windows Kits\10\Lib\<SDK version>\um\x64"
LIB: "C:\Program Files (x86)\Windows Kits\10\Lib\<SDK version>\ucrt\x64"
Path: "C:\Program Files (x86)\Windows Kits\10\bin\<SDK version>\x64"

4. Restart the computer once the configuration is done
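
After the restart, the variables from step 3 can be inspected from Python as a quick sanity check. This is a minimal sketch and not part of the original setup: the helper names `expected_entries` and `missing_sdk_entries` and the example version string `10.0.19041.0` are assumptions for illustration, so substitute the SDK version folder that exists on your machine.

```python
import ntpath
import os

KITS_ROOT = r"C:\Program Files (x86)\Windows Kits\10"


def expected_entries(sdk_version, kits_root=KITS_ROOT):
    """Directories from step 3, grouped by environment variable name."""
    join = ntpath.join  # always builds Windows-style paths
    return {
        "INCLUDE": [
            join(kits_root, "Include", sdk_version, "ucrt"),
            join(kits_root, "Include", sdk_version, "shared"),
        ],
        "LIB": [
            join(kits_root, "Lib", sdk_version, "um", "x64"),
            join(kits_root, "Lib", sdk_version, "ucrt", "x64"),
        ],
        "PATH": [join(kits_root, "bin", sdk_version, "x64")],
    }


def _normalize(path):
    # Windows paths are case-insensitive; ignore quotes and trailing slashes.
    return path.strip().strip('"').rstrip("\\/").lower()


def missing_sdk_entries(env, sdk_version):
    """Return {variable: [directories not found in that variable]}."""
    upper_env = {name.upper(): value for name, value in env.items()}
    missing = {}
    for name, wanted in expected_entries(sdk_version).items():
        raw = upper_env.get(name, "")
        present = {_normalize(p) for p in raw.split(";") if p.strip()}
        absent = [p for p in wanted if _normalize(p) not in present]
        if absent:
            missing[name] = absent
    return missing


if __name__ == "__main__":
    # "10.0.19041.0" is only an example version string (assumption).
    report = missing_sdk_entries(os.environ, "10.0.19041.0")
    print(report or "all SDK entries are present")
```

An empty result means every directory from step 3 appears in the matching variable. The check only compares strings; it does not verify that the directories exist on disk.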

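PyCharm environment configuration

1. Use Anaconda3 to create an environment with Python 3.6
2. Create a PyCharm project and select the environment you just created
3. Install paddlepaddle by running the following on the command line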

pip install paddlepaddle-gpu==2.4.1 -i https://mirror.baidu.com/pypi/simple

4. Download the PaddleOCR source code by running

git clone https://gitee.com/paddlepaddle/PaddleOCR.git

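5. In the PaddleOCR directory, open requirements.txt and change the opencv version; the newer version installed by default causes problems
[screenshot: edited opencv entry in requirements.txt]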

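6. Install PyMuPDF; the default installation causes problems
A PyMuPDF-1.18.4 wheel is available on Baidu Netdisk:
Link: https://pan.baidu.com/s/1l58nnT0tidqYjgn1Xb2C0g?pwd=rz4p
Extraction code: rz4p

Download it to the directory at the same level as the PyCharm project, then install it from the command line

pip install PyMuPDF-1.18.4-cp36-cp36m-win_amd64.whl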

7. If any other package fails to install, download its wheel from the page below and install it manually
https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely

8. Install PaddleOCR (the dependencies listed in requirements.txt) by running

cd PaddleOCR
pip install -r requirements.txt

If no errors appear, the installation succeeded. If lanms-neo==1.0.2 fails to install, the VS environment was not set up correctly.
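
A common reason a downloaded wheel refuses to install in steps 6 and 7 is that its Python tag does not match the interpreter in the conda environment (the PyMuPDF wheel above is tagged `cp36`, i.e. CPython 3.6). The sketch below illustrates that check; the helper names `wheel_python_tag` and `wheel_matches_interpreter` are hypothetical, and the comparison only covers CPython tags, not ABI or platform tags.

```python
import sys


def wheel_python_tag(filename):
    """Extract the Python tag (for example 'cp36') from a wheel file name."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Wheel names end with: ...-{python tag}-{abi tag}-{platform tag}
    if len(parts) < 5:
        raise ValueError(f"not a wheel file name: {filename!r}")
    return parts[-3]


def wheel_matches_interpreter(filename, version_info=None):
    """True if the wheel's Python tag names the given CPython version."""
    version_info = sys.version_info if version_info is None else version_info
    wanted = f"cp{version_info[0]}{version_info[1]}"
    # A tag may list several interpreters separated by dots.
    return wanted in wheel_python_tag(filename).split(".")


if __name__ == "__main__":
    name = "PyMuPDF-1.18.4-cp36-cp36m-win_amd64.whl"
    print(name, "matches this interpreter:", wheel_matches_interpreter(name))
```

Run it with the interpreter of the environment created in step 1. To check PaddlePaddle itself, `paddle.utils.run_check()` can be called from the same interpreter once the package is installed.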

Training a PaddleOCR model

1. Edit configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml as follows

Global:
  debug: false
  use_gpu: true  # whether to use the GPU
  epoch_num: 100  # number of training epochs
  log_smooth_window: 20  
  print_batch_step: 10
  save_model_dir: ./output/ch_PP-OCR_v3_det/  # directory where trained models are saved
  save_epoch_step: 10
  eval_batch_step:
  - 0
  - 100
  cal_metric_during_train: false
  pretrained_model: ./pretrain_models/ch_PP-OCRv3_det_distill_train/best_accuracy.pdparams  # path of the pretrained weights to load
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true

Architecture:
  name: DistillationModel
  algorithm: Distillation
  model_type: det
  Models:
    Student:
      pretrained:
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: MobileNetV3
        scale: 0.5
        model_name: large
        disable_se: true
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: True
      Head:
        name: DBHead
        k: 50
    Student2:
      pretrained:
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: MobileNetV3
        scale: 0.5
        model_name: large
        disable_se: true
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: True
      Head:
        name: DBHead
        k: 50
    Teacher:
      freeze_params: true
      return_all_feats: false
      model_type: det
      algorithm: DB
      Backbone:
        name: ResNet_vd
        in_channels: 3
        layers: 50
      Neck:
        name: LKPAN
        out_channels: 256
      Head:
        name: DBHead
        kernel_list: [7,2,2]
        k: 50

Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDilaDBLoss:
      weight: 1.0
      model_name_pairs:
      - ["Student", "Teacher"]
      - ["Student2", "Teacher"]
      key: maps
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
  - DistillationDMLLoss:
      model_name_pairs:
      - ["Student", "Student2"]
      maps_name: "thrink_maps"
      weight: 1.0
      key: maps
  - DistillationDBLoss:
      weight: 1.0
      model_name_list: ["Student", "Student2"]
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.002
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 5.0e-05

PostProcess:
  name: DistillationDBPostProcess
  model_name: ["Student"]
  key: head_out
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5

Metric:
  name: DistillationMetric
  base_metric_name: DetMetric
  main_indicator: hmean
  key: "Student"

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/train/Label.txt
    ratio_list: [1.0]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - CopyPaste:
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 960
        - 960
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 2
    num_workers: 2
    use_shared_memory: False

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/test/Label.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 2
    use_shared_memory: False
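
To get a feel for what `epoch_num`, `batch_size_per_card`, and `eval_batch_step` in this config imply, the arithmetic can be sketched in Python. This is an illustration under assumptions: a single GPU, a training set of 1000 images (the figure is only an example), and an evaluation rule of "every `interval` iterations after `start`", which is how I understand `eval_batch_step: [0, 100]` to be interpreted; check the PaddleOCR training script for the exact condition.

```python
import math


def iterations_per_epoch(num_images, batch_size_per_card, drop_last=False):
    """Number of training iterations in one epoch on a single card."""
    if drop_last:
        return num_images // batch_size_per_card
    return math.ceil(num_images / batch_size_per_card)


def eval_iterations(total_iterations, eval_batch_step):
    """Iterations at which evaluation runs, given [start, interval]."""
    start, interval = eval_batch_step
    return list(range(start + interval, total_iterations + 1, interval))


if __name__ == "__main__":
    num_images = 1000  # assumed dataset size, for illustration only
    per_epoch = iterations_per_epoch(num_images, batch_size_per_card=2)
    total = per_epoch * 100  # epoch_num: 100
    evals = eval_iterations(total, [0, 100])
    print(f"{per_epoch} iterations per epoch, {total} in total")
    print(f"{len(evals)} evaluations, the first at iteration {evals[0]}")
```

With a small `batch_size_per_card` such as 2, evaluating every 100 iterations means evaluation runs several times per epoch; raising the interval in `eval_batch_step` trades feedback frequency for training time.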

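2. Download the pretrained model from the official PaddlePaddle (飞桨) website
[screenshot: model download list]
Choose the "training model" download.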

3. Train the model

python tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml
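
Training fails early if any file referenced by the config is missing, so it can help to check the paths before launching. The sketch below does this with the paths taken from the config above; the helper name `missing_paths` is hypothetical, and the script assumes it is run from the PaddleOCR directory.

```python
from pathlib import Path

# Paths taken from the config in step 1, relative to the PaddleOCR directory.
REQUIRED = [
    "configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml",
    "pretrain_models/ch_PP-OCRv3_det_distill_train/best_accuracy.pdparams",
    "train_data/icdar2015/text_localization/train/Label.txt",
    "train_data/icdar2015/text_localization/test/Label.txt",
]


def missing_paths(repo_root, required=REQUIRED):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing before training:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All required files found; ready to run tools/train.py")
```

If you changed `data_dir`, `label_file_list`, or `pretrained_model` in your own copy of the config, update `REQUIRED` to match.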

Reference: https://www.ngui.cc/el/837915.html?action=onClick
Reference: https://blog.csdn.net/suiyingy/article/details/126682769
Reference: https://blog.csdn.net/qq_49627063/article/details/119134847