Following the guides below to run OCR in Python (recognizing text in PDF files):
https://pythontips.com/2016/02/25/ocr-on-pdf-files-using-python/
http://blog.topspeedsnail.com/archives/3571
Running the code raised this error:
wand.exceptions.PolicyError: not authorized `/tmp/xxx.pdf' @ .......
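wand is not authorized to read the file. This is a configuration issue in ImageMagick, which wand calls under the hood, and the fix is to edit /etc/ImageMagick-6/policy.xml: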
sudo vi /etc/ImageMagick-6/policy.xml
Find this line:
<policy domain="coder" rights="none" pattern="PDF" />
Change it to:
<policy domain="coder" rights="read|write" pattern="PDF" />
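The same edit can be scripted instead of done by hand in vi. The following is only a minimal sketch: `allow_coder` is a hypothetical helper name, and it assumes the entry is written with its attributes in exactly the order shown above (domain, rights, pattern). It works on the file's text and leaves every other entry untouched.

```python
import re


def allow_coder(policy_xml, pattern="PDF", rights="read|write"):
    """Return policy_xml with the coder entry for `pattern` set to `rights`.

    Hypothetical helper: assumes the attribute order domain, rights, pattern,
    as in the policy line quoted in this note.
    """
    entry = re.compile(
        r'(<policy\s+domain="coder"\s+rights=")[^"]*'
        r'("\s+pattern="' + re.escape(pattern) + r'"\s*/>)'
    )
    # Keep everything around the rights value and swap only the value itself.
    return entry.sub(lambda m: m.group(1) + rights + m.group(2), policy_xml)
```

To apply it, read /etc/ImageMagick-6/policy.xml, pass the text through `allow_coder`, and write the result back. Writing to that path needs root privileges, just like the sudo vi command above.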
File types other than PDF can trigger the same error; change the corresponding entries in the same way.
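To see which other file types are blocked, the policy file can be scanned for coder entries whose rights are "none". This is a rough sketch using only the standard library; `blocked_coders` is a hypothetical name, and it assumes the file parses as ordinary XML.

```python
import xml.etree.ElementTree as ET


def blocked_coders(policy_xml):
    """Return the patterns of coder policies whose rights are "none".

    Hypothetical helper: takes the text of a policy file, not a path.
    """
    root = ET.fromstring(policy_xml)
    return [
        policy.get("pattern")
        for policy in root.iter("policy")
        if policy.get("domain") == "coder" and policy.get("rights") == "none"
    ]
```

Each pattern returned names an entry that would need the same rights change as the PDF line before wand can process that file type.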
<policy domain="cache" nam