Python batch image saving starts fast and then slows down; Pytesseract is too slow. How can I make it process images faster?

I am using pytesseract in the code below:

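    import numpy as np
    import PIL.Image
    import pytesseract

    def fnd():
        # `list` holds the image file names and is defined elsewhere
        for fname in list:
            # Load the image and wrap it in an extra array dimension
            x = np.array([np.array(PIL.Image.open(fname))])
            print(x.size)
            for im in x:
                txt = pytesseract.image_to_string(image=im).strip()
                # Append the recognized text to the output file
                with open("Output.txt", "a+") as outfile:
                    outfile.write(txt)
        # Print every line that contains the word "cyber"
        with open("Output.txt") as openfile:
            for line in openfile:
                for part in line.split():
                    if "cyber" in part.lower():
                        print(line)
        return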

The list contains the names of images from a folder (2408×3506 pixels, 300 dpi, grayscale). Unfortunately, processing around 35 images takes about 1400-1500 seconds in total.

Is there a way I can reduce the processing time?

Solution

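Pytesseract is a thin wrapper around the tesseract command-line program: for every image you pass it, it writes the image to a temporary file, launches tesseract as a separate process, and reads the result back from an output file. Repeating that round trip for each of 35 images is unnecessary overhead. Instead, use a Python binding to the Tesseract API, which keeps the OCR engine loaded in the same process. This should be significantly faster.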
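As a rough illustration, here is a minimal sketch of that approach. It assumes the third-party tesserocr package, which is one of several Python bindings to the Tesseract API; the answer does not name a specific one. The helper names `find_keyword_lines` and `ocr_files` are hypothetical and only serve this example. The engine is initialized once and reused for every image, and the text is searched in memory instead of going through Output.txt.

```python
def find_keyword_lines(text, keyword="cyber"):
    """Return the lines of `text` that contain `keyword`, ignoring case."""
    keyword = keyword.lower()
    return [line for line in text.splitlines() if keyword in line.lower()]


def ocr_files(paths, keyword="cyber"):
    """OCR every image in `paths` and collect the lines containing `keyword`.

    Assumption: tesserocr is installed. It is imported here, inside the
    function, so that find_keyword_lines stays usable without it.
    """
    from tesserocr import PyTessBaseAPI

    matches = []
    # The engine is created once and reused for all images.
    with PyTessBaseAPI() as api:
        for path in paths:
            api.SetImageFile(path)        # load the image from disk
            text = api.GetUTF8Text()      # run recognition in-process
            matches.extend(find_keyword_lines(text, keyword))
    return matches
```

Calling `ocr_files(list)` with the list of file names from the question would return the matching lines, which can then be printed. How much time this saves depends on the images and the machine, so it is worth timing both versions on a few files first.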
