Installing and Using Tesseract 4 on Ubuntu 18.04

QUANTRIUM GUIDES

Extracting information from scanned documents such as letters, write-ups, and invoices has become an integral part of many business processes. To do this, you need to set up OCR software that can pull the text out of these scanned documents or PDFs.

Here we take you through the process of building and installing Tesseract 4.x on your Ubuntu 18.04 machine. There are two ways to install Tesseract 4.x:

The first is to install the Tesseract 4.0.0 beta version. It is easy to install and takes only a couple of commands.

Alternatively, you can install Tesseract 4.1.1, the latest stable release of Tesseract at the time of writing. In this post, we show you how to install each of them on your Ubuntu 18.04 machine.

If you are not familiar with build tools and building from GitHub repositories, installing Tesseract 4.0.0 beta is the better option for you. If you are experienced in building and installing applications from GitHub repositories, you can skip the next section and jump directly to the section Installing Tesseract 4.1.1.

Installing Tesseract 4.0.0 beta

Installing the Tesseract 4.0.0 beta version is quite simple and can be done with the following apt commands:

$ sudo apt install tesseract-ocr
$ sudo apt install libtesseract-dev

Once you have run these two commands, check whether Tesseract was installed successfully by running the following command:

$ tesseract --version

After running this command, you should see something like this:

tesseract 4.0.0-beta.1 
leptonica-1.75.3

The exact version strings may differ slightly, but output along these lines means your installation was successful. If Tesseract is not installed properly, you will get errors instead. In that case, check your operating system and its version: these commands work only on Ubuntu 18.04 or higher.
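If you want to run this check from a script rather than by eye, the version line can be parsed. The following is a minimal sketch, assuming the first line of the output has the form `tesseract <major>.<minor>...` as shown above. The function names `tesseract_major_version` and `is_tesseract_4` are hypothetical helpers introduced here for illustration, not part of Tesseract.

```shell
# Hypothetical helper: read `tesseract --version` style text on stdin and
# print the major version number found on the first line.
# Assumes the first line looks like "tesseract 4.0.0-beta.1".
tesseract_major_version() {
  sed -n '1s/^tesseract \([0-9][0-9]*\)\..*/\1/p'
}

# Hypothetical helper: succeed only if the major version read from stdin is 4.
is_tesseract_4() {
  [ "$(tesseract_major_version)" = "4" ]
}

# Example use against a real installation (left as a comment so that loading
# this file does not require Tesseract to be present). The 2>&1 is a
# precaution in case a build prints its version on stderr:
#   tesseract --version 2>&1 | is_tesseract_4 && echo "Tesseract 4 found"
```

Reading from stdin keeps the helpers independent of the installation, so they can also be tried on saved output.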

Once Tesseract is installed, you can run the following command to check which languages your installed version supports:

$ tesseract --list-langs

You can expect the following output:

List of available languages (2):
eng
osd

Here eng means that Tesseract can recognize English, and osd means that it can detect orientation and script.
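A script that depends on a particular language can check this list before calling Tesseract. The sketch below assumes the output format shown above: one header line followed by one language code per line. The function name `has_tesseract_lang` is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: read `tesseract --list-langs` style text on stdin and
# succeed if the language code given as $1 is listed.
# Assumes a one-line header followed by one language code per line.
has_tesseract_lang() {
  awk -v lang="$1" '
    NR > 1 && $0 == lang { found = 1 }   # skip the header, match whole lines only
    END { exit !found }                  # exit status 0 if found, 1 otherwise
  '
}

# Example use against a real installation (left as a comment so that loading
# this file does not require Tesseract to be present):
#   tesseract --list-langs 2>&1 | has_tesseract_lang eng && echo "English is available"
```

Matching whole lines avoids false positives when one language code is a prefix of another.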

Congratulations! You have successfully installed Tesseract 4.0.0 beta on your system, and it is ready to use.

Installing Tesseract 4.1.1 on Ubuntu 18.04

In this section, we take you through the steps to build and install Tesseract 4.1.1 from Tesseract's GitHub repository:

Bef
