Fixing WindowsError: [Error 2] when recognizing CAPTCHAs with pytesseract on Windows

Tags: python, pytesseract, WindowsError, error, tesseract_cmd
Installing PIL + pytesseract
Installation is straightforward; see http://www.waitalone.cn/python-php-ocr.html

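Download Pillow from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and pick the build that matches your Python version (mine is 2.7). I hit a snag here: my machine is 64-bit, yet when I downloaded the 64-bit whl and installed it with pip, pip rejected it as an unsupported format. So I downloaded the 32-bit whl instead and, of all things, that one installed. Whatever....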
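
A likely explanation, though nothing in this post confirms it: pip matches a wheel against the bitness of the Python interpreter, not of the operating system, so a 32-bit Python on 64-bit Windows only accepts 32-bit wheels. A minimal sketch for checking which interpreter is in use (the function name `interpreter_bits` is made up for this example):

```python
import struct
import sys

def interpreter_bits():
    """Return the pointer size of the running interpreter, in bits."""
    # struct.calcsize('P') is the size of a C pointer in bytes
    return struct.calcsize('P') * 8

print('%d-bit Python %s' % (interpreter_bits(), sys.version.split()[0]))
```

If this reports 32 on a 64-bit machine, the 32-bit whl is the one that matches.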


Then:

pip install pytesseract

Once it has installed successfully, run this script:


from PIL import Image
from pytesseract import image_to_string
image = Image.open(r'7364.png')  # Open image object using PIL
print image_to_string(image)     # Run tesseract.exe on image

It fails with the following error:


Traceback (most recent call last):
  File "F:/spider/test.py", line 4, in <module>
    print image_to_string(image)     # Run tesseract.exe on image  
  File "C:\Users\tandazhao\spider_venv\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
    config=config)
  File "C:\Users\tandazhao\spider_venv\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
    stderr=subprocess.PIPE)
  File "C:\Python27\Lib\subprocess.py", line 711, in __init__
    errread, errwrite)
  File "C:\Python27\Lib\subprocess.py", line 959, in _execute_child
    startupinfo)
WindowsError: [Error 2] 
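
For context, Error 2 is the Windows "file not found" code. The traceback ends inside subprocess, which means the executable that pytesseract tried to launch could not be located. Below is a rough sketch for checking whether `tesseract` is reachable through PATH; the helper name `find_tesseract` is made up for this illustration, and the import fallback is only there so the snippet also runs on Python 2.7.

```python
try:
    from shutil import which                               # Python 3.3+
except ImportError:
    from distutils.spawn import find_executable as which   # Python 2.7

def find_tesseract(name='tesseract'):
    """Return the full path of the executable found on PATH, or None."""
    return which(name)

path = find_tesseract()
if path is None:
    print('tesseract is not on PATH; a full path has to be configured')
else:
    print('tesseract found at %s' % path)
```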


Searching online turned up this fix: in pytesseract.py, change

tesseract_cmd = 'tesseract' to tesseract_cmd = 'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

Fine, I changed it.

Ran it again and, well, got the same error again:

Traceback (most recent call last):
  File "F:/spider/test.py", line 4, in <module>
    print image_to_string(image)     # Run tesseract.exe on image  
  File "C:\Users\tandazhao\spider_venv\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
    config=config)
  File "C:\Users\tandazhao\spider_venv\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
    stderr=subprocess.PIPE)
  File "C:\Python27\Lib\subprocess.py", line 711, in __init__
    errread, errwrite)
  File "C:\Python27\Lib\subprocess.py", line 959, in _execute_child
    startupinfo)
WindowsError: [Error 2] 

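Heh. Looking closely at the command, I noticed that the \t in \tesseract.exe is being treated as an escape sequence (a tab character), so the path no longer points at the executable. The fix is to put an r in front of the string in tesseract_cmd = 'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe', which makes it a raw string: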

tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
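
The effect is easy to reproduce without tesseract installed. The sketch below uses a shortened tail of the path (the variable names are illustrative only) and compares the plain literal with the raw one:

```python
plain = 'Tesseract-OCR\tesseract.exe'    # \t is read as a single tab character
raw = r'Tesseract-OCR\tesseract.exe'     # raw string: backslash and t are kept

print('\t' in plain)           # True: the path contains a tab
print('\\' in plain)           # False: the backslash is gone
print('\\' in raw)             # True: the separator survived
print(len(raw) - len(plain))   # 1: two characters collapsed into one
```

Writing the separators as doubled backslashes would work just as well as the r prefix.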

Ran it again and it works; the CAPTCHA is recognized:

C:\Users\tandazhao\spider_venv\Scripts\python.exe F:/spider/test.py
7364

Process finished with exit code 0
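
Editing pytesseract.py inside site-packages works, but the change is lost whenever the package is reinstalled. pytesseract also lets the calling script set the same variable, as `pytesseract.pytesseract.tesseract_cmd`. Here is a minimal sketch of that approach, assuming tesseract lives at the path used above; `configure_tesseract` is a hypothetical helper, not part of pytesseract.

```python
import os

# Assumed install location, taken from the path used in this post
TESSERACT_EXE = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

def configure_tesseract(exe_path=TESSERACT_EXE):
    """Point pytesseract at exe_path; return True if that was possible."""
    if not os.path.isfile(exe_path):
        return False            # nothing at that path, leave settings alone
    try:
        import pytesseract
    except ImportError:
        return False            # pytesseract is not installed
    pytesseract.pytesseract.tesseract_cmd = exe_path
    return True

if configure_tesseract():
    print('pytesseract will use %s' % TESSERACT_EXE)
else:
    print('tesseract.exe or pytesseract not found; nothing was changed')
```

Call `configure_tesseract()` once before `image_to_string(image)`; the library file itself stays untouched.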


Hahaha


