Python text recognition software: CRAFT + CRNN text recognition tool

Latest recommended article published on 2024-06-16 09:42:17

weixin_39804603

Article tags: Python text recognition software

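This post introduces a tool that uses pretrained neural networks for text detection and recognition. It uses the CRAFT architecture to detect text regions and a CRNN to recognize the text. The author deployed the application on Heroku, but performance is limited by the platform's memory constraints. Installation guides for Windows and Linux are provided, along with a run example. The solution is currently a demo showing how to use a PyTorch model in a deployment environment.

Summary generated automatically by CSDN.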

Text detection and recognition

This repository contains a tool that detects regions containing text and recognizes the text in each region, one region at a time.
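The detect-then-recognize flow can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `detector` and `recognizer` are hypothetical callables standing in for the two networks, the box format `(x0, y0, x1, y1)` is an assumption, and the image is a plain nested list so the example runs without extra dependencies.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; the repository's real interfaces may differ.
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive
Image = List[List[int]]          # rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def read_text(
    image: Image,
    detector: Callable[[Image], Sequence[Box]],
    recognizer: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Detect text regions, then recognize them one by one."""
    # Sort top-to-bottom, then left-to-right, to approximate reading order.
    boxes = sorted(detector(image), key=lambda b: (b[1], b[0]))
    return [(box, recognizer(crop(image, box))) for box in boxes]
```

Passing the two stages in as callables keeps the sketch independent of any particular model; in practice each callable would wrap a loaded network.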

Description

Two pretrained neural networks are used. The first detects the places where text appears and returns their coordinates. Its structure is based on the CRAFT architecture.
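CRAFT-style detectors commonly report each region as a four-point polygon. Assuming that output format (the source does not state it), a polygon can be turned into an axis-aligned box that is safe to use for cropping. The function name `polygon_to_box` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def polygon_to_box(
    points: Sequence[Tuple[float, float]], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Return the (x0, y0, x1, y1) box enclosing a polygon, clamped to the image."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Round outward so the box never cuts into the text, then clamp to bounds.
    x0 = max(0, math.floor(min(xs)))
    y0 = max(0, math.floor(min(ys)))
    x1 = min(width, math.ceil(max(xs)))
    y1 = min(height, math.ceil(max(ys)))
    return x0, y0, x1, y1
```

An axis-aligned box is the simplest choice; strongly rotated text would need a perspective warp instead, which this sketch does not cover.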

The second network takes each detected region and recognizes the words inside it. A convolutional recurrent neural network (CRNN) is used for this step.
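CRNN models are usually trained with CTC loss and emit one class index per horizontal step of the image. The source does not say how this repository decodes that output, so the following is only a sketch of greedy CTC decoding, assuming index 0 is the blank symbol and a hypothetical lowercase alphabet.

```python
from typing import Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank


def ctc_greedy_decode(indices: Sequence[int], alphabet: str) -> str:
    """Collapse repeated indices and drop blanks to obtain the final string."""
    chars = []
    previous = BLANK
    for index in indices:
        # Emit a character only when it is not blank and not a repeat
        # of the immediately preceding step.
        if index != BLANK and index != previous:
            chars.append(alphabet[index - 1])  # index 1 maps to alphabet[0]
        previous = index
    return "".join(chars)
```

The blank between two identical indices is what allows double letters such as the "ll" in "hello" to survive the collapsing step.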

Example

Under construction

Deployment

I decided to deploy it on Heroku as a temporary solution, but the amount of memory available on that platform is not enough. You can check it on the Heroku app. I also added a Bootstrap template because it makes the whole solution more intuitive.
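One deployment detail worth noting: Heroku passes the port a web process must bind to in the `PORT` environment variable. How `app.py` handles this is not shown in the source, so the helper below is a hypothetical sketch that falls back to 8000 for local runs.

```python
import os


def resolve_port(default: int = 8000) -> int:
    """Use the platform-provided PORT if present, otherwise the local default."""
    return int(os.environ.get("PORT", default))
```

The returned value would then be handed to whatever web server the application starts.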

Windows Installation

To install it locally, run the following from your virtual environment:

python -m pip install -r requirements.txt

Linux Installation

To install it properly on Linux, you also have to install some additional packages:

apt-get update

apt-get install -y libsm6 libxext6 libxrender-dev

pip install opencv-python

If cv2 import errors still appear, also install:

pip install opencv-contrib-python
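To confirm whether the cv2 import problem is resolved, a small check like the one below can help. It is a generic sketch (the helper name `try_import` is hypothetical) that attempts the import and returns the error message, which typically names the missing shared library.

```python
import importlib
from typing import Tuple


def try_import(name: str) -> Tuple[bool, str]:
    """Try to import a module; return (success, error message)."""
    try:
        importlib.import_module(name)
    except ImportError as error:
        # Covers both a missing package and a missing system library.
        return False, str(error)
    return True, ""


if __name__ == "__main__":
    ok, message = try_import("cv2")
    print("cv2 import OK" if ok else "cv2 import failed: " + message)
```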

Then you can run:

python -m pip install -r requirements.txt

Run

To run it locally, first activate your environment.

> Windows

venv\Scripts\activate.bat

> Linux

source venv/bin/activate

Then run the app from the project root:

python app.py

If everything works, the app is served at localhost:8000 and shows a screen like the one in the screenshot.

[Screenshot: front_.PNG]
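If the page does not load in the browser, a quick script can tell whether anything is answering on that address. This is a generic standard-library sketch, not part of the project; the helper name `is_serving` is hypothetical.

```python
import urllib.error
import urllib.request


def is_serving(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # Bypass any configured proxy, since the target is the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, even if with an error status.
        return True
    except OSError:
        # Connection refused, timeout, or similar network failure.
        return False


if __name__ == "__main__":
    print(is_serving("http://localhost:8000"))
```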

Updates

I decided to remove argparse because, as I mentioned earlier, it was less intuitive than the web interface. The solution is not fast; it is more of a toy example that shows how to use a PyTorch model in a deployment environment.

The version used here depends on the CPU-only build of torch, which makes preprocessing and detection slower than on a GPU. I tested it with CUDA and it was much faster.
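Choosing between CPU and CUDA at runtime can be sketched as follows. This is an assumption about how one might do it, not the project's code; the helper `pick_device` is hypothetical, and it falls back to "cpu" when torch is missing or no GPU is visible.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-capable torch build sees a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # Without torch there is nothing to accelerate; report the CPU default.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Running on:", pick_device())
```

The returned string can be passed to a PyTorch module's `.to()` method so the same code runs on either kind of machine.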

If you have more information, drop me a line. If you like it, give it a star.

Draft: show how it works on a complex .tif example document.