Python textract helps you extract text from images and many kinds of documents.
Test environment:
1. Windows 7 / Windows 10 (64-bit)
2. Python 3.7 (64-bit)
Installing textract
pip install textract
Textract dependencies
A plain pip install textract is enough to extract text from docx, xlsx, and pptx files.
- To add OCR (optical character recognition) support, you also need to install tesseract:
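As a minimal sketch of what a pip-only install allows, the helper below reads one of those three formats. The names `extract_text` and `PIP_ONLY_EXTENSIONS` are hypothetical and used only for illustration. `textract.process` is the library's entry point, and it returns bytes that are UTF-8 encoded by default.

```python
from pathlib import Path

# Formats the text above says work after a plain `pip install textract`.
# This set is an illustrative assumption, not a list exported by textract.
PIP_ONLY_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def extract_text(path):
    """Return the text of a docx/xlsx/pptx file as a str (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in PIP_ONLY_EXTENSIONS:
        # Other formats (for example images) may need extra tools such as tesseract.
        raise ValueError(f"{suffix or 'no extension'}: may need extra dependencies")

    # Imported here so the extension check works even if textract is missing.
    import textract

    # textract.process returns bytes; decode them to get a regular string.
    return textract.process(str(path)).decode("utf-8")
```

Calling `extract_text("report.docx")` (a hypothetical file name) would return the document's text as a string, while an image path such as `scan.png` is rejected up front because it would need the OCR dependency.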