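Project link: https://github.com/VamosC/CLIP4STR
Download the project from the link above, then download the model weights clip4str_base16x16_d70bde1f2d.ckpt and ViT-B-16.pt.
First, set up the environment as described in the project's README.md: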
Requires `Python >= 3.8` and `PyTorch >= 1.12`.
The following commands are tested on a Linux machine with CUDA Driver Version `525.105.17` and CUDA Version `11.3`.
```
conda create --name clip4str python==3.8
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 -c pytorch
pip install -r requirements.txt
```
Detailed steps:
1. Create the virtual environment at a specified location
conda create --prefix /home/fxp/fxp/envs/CLIP4STR python==3.8
2. Activate the virtual environment
source activate /home/fxp/fxp/envs/CLIP4STR
3. Install the required libraries
1)
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 -c pytorch
2)
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
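After the installation, it can help to confirm that the environment actually meets the README minimums (`Python >= 3.8`, `PyTorch >= 1.12`). The following is only a minimal sketch: the helper names `parse_version`, `meets_minimum` and `check_environment` are made up for illustration and are not part of CLIP4STR. Versions are compared numerically, because a plain string comparison would rank `1.9` above `1.12`.

```python
import sys


def parse_version(text):
    """Turn '1.12.0+cu113' into (1, 12, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Compare two dotted version strings numerically, not as text."""
    return parse_version(found) >= parse_version(minimum)


def check_environment():
    """Return human-readable problems; an empty list means the minimums are met."""
    problems = []
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    if not meets_minimum(python_version, "3.8"):
        problems.append("Python %s is older than 3.8" % python_version)
    try:
        import torch  # only available once step 3 has completed
    except ImportError:
        problems.append("PyTorch is not installed in this environment")
    else:
        if not meets_minimum(torch.__version__, "1.12"):
            problems.append("PyTorch %s is older than 1.12" % torch.__version__)
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment meets the README minimums"]:
        print(line)
```

Run it with the Python interpreter of the activated environment, so that the check applies to that environment and not to the system Python.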
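Next, modify the path at line 22 of CLIP4STR-main/strhub/models/vl_str/system.py so that it points to the directory holding the downloaded CLIP weights (ViT-B-16.pt):
CLIP_PATH = '/PUT/YOUR/PATH/HERE/pretrained/clip'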
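A wrong CLIP_PATH usually only shows up when the model is loaded, so a quick check beforehand can save a failed run. The sketch below assumes that ViT-B-16.pt sits directly inside the CLIP_PATH directory; the helper `missing_clip_weights` is hypothetical and not part of the project.

```python
from pathlib import Path


def missing_clip_weights(clip_path, expected=("ViT-B-16.pt",)):
    """Return the expected weight files that are not present under clip_path."""
    root = Path(clip_path)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Placeholder copied from system.py; replace it with your own directory.
    clip_path = "/PUT/YOUR/PATH/HERE/pretrained/clip"
    missing = missing_clip_weights(clip_path)
    if missing:
        print("Missing under %s: %s" % (clip_path, ", ".join(missing)))
    else:
        print("All expected CLIP weights found")
```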
Then run the test as described in README.md:
bash scripts/read.sh 0 clip4str_base16x16_d70bde1f2d.ckpt misc/test_images
I configured the run parameters in PyCharm as follows:
/home/fxp/4tdisk/code/certificate_reader/CLIP4STR-main/weights/clip4str_base16x16_d70bde1f2d.ckpt
--images_path
/home/fxp/4tdisk/code/certificate_reader/CLIP4STR-main/misc/test_image
The test runs and outputs correct character recognition results.
Problems encountered during use:
1.
RuntimeError: The NVIDIA driver on your system is too old (found version 10010).
Please update your GPU driver by downloading and installing a new version from the
URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to:
https://pytorch.org to install a PyTorch version that has been compiled with your
version of the CUDA driver.
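Solution: install a newer GPU driver. For the driver installation, refer to the CSDN blog post:
"ubuntu重装cuda,cudnn,并挂载硬盘到home" (Reinstalling CUDA and cuDNN on Ubuntu and mounting a disk to home)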
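The number in the error message can be read directly. CUDA encodes versions as the integer `1000 * major + 10 * minor`, so `10010` means the installed driver supports at most CUDA 10.1. The sketch below decodes that value and compares it with CUDA 11.3; the 11.3 target is an assumption taken from the version the README reports as tested, and the helper names are made up for illustration.

```python
def decode_cuda_version(code):
    """Split CUDA's integer encoding (1000 * major + 10 * minor) into (major, minor)."""
    return code // 1000, (code % 1000) // 10


def driver_supports(driver_code, required):
    """True when the driver's CUDA version is at least the (major, minor) required."""
    return decode_cuda_version(driver_code) >= tuple(required)


if __name__ == "__main__":
    found = 10010  # value quoted in the RuntimeError above
    major, minor = decode_cuda_version(found)
    print("Driver supports up to CUDA %d.%d" % (major, minor))
    # (11, 3) is an assumption taken from the README's tested CUDA version.
    print("Enough for CUDA 11.3:", driver_supports(found, (11, 3)))
```

Under that assumption the driver is too old for the installed PyTorch build, which matches the fix above: upgrade the driver instead of downgrading PyTorch.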
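Follow-up:
I turned to this large model because I wanted to check manually annotated characters. However, the model's inference input size is 224*224. Recognition is decent on short text lines, but on very long text lines it performs worse than PaddleOCR's large server model, so in the end I did not use this model for label quality inspection.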
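One plausible reason for the weaker results on long lines is the fixed input size. As a rough illustration, the sketch below assumes the whole text-line crop is stretched to 224*224 without preserving its aspect ratio (an assumption, not something verified against the CLIP4STR preprocessing code) and computes how unevenly the two axes are scaled. The crop sizes are hypothetical examples, not measurements from my data.

```python
def resize_scales(width, height, target=224):
    """Horizontal and vertical scale factors when an image is stretched to target x target."""
    return target / width, target / height


def squeeze_ratio(width, height, target=224):
    """How many times more strongly the width is compressed than the height."""
    scale_x, scale_y = resize_scales(width, height, target)
    return scale_y / scale_x


if __name__ == "__main__":
    # Hypothetical crop sizes in pixels (width, height).
    for width, height in [(128, 32), (896, 32)]:
        scale_x, scale_y = resize_scales(width, height)
        print(width, height, scale_x, scale_y, squeeze_ratio(width, height))
```

For a 128*32 crop the width is squeezed 4 times more than the height, while for an 896*32 crop the factor is 28, so each character in a long line is left with far fewer horizontal pixels.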