1. Non-scanned PDFs
Module
pip install pdf2docx
Code
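from pdf2docx import Converter
# Input PDF (it must have a real text layer, not scanned page images) and output .docx
pdf_file = 'C:/Users/woodwolf/Desktop/01.pdf'
docx_file = 'C:/Users/woodwolf/Desktop/02.docx'
cv = Converter(pdf_file)
# start is the 0-based first page; end=None converts up to the last page
cv.convert(docx_file, start=0, end=None)
cv.close()  # release the PDF file handle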
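For repeated use, the same conversion could be wrapped in a small function. The following is only a sketch: `default_docx_path` and `convert_pdf` are hypothetical helper names that are not part of pdf2docx, and the only library calls it relies on are `Converter(...)`, `convert(...)` and `close()`. It checks that the input file exists, derives the output name when none is given, and closes the converter even if the conversion fails.

```python
from pathlib import Path


def default_docx_path(pdf_path):
    """Hypothetical helper: same path as the PDF, with a .docx suffix."""
    return str(Path(pdf_path).with_suffix('.docx'))


def convert_pdf(pdf_path, docx_path=None, start=0, end=None):
    """Hypothetical wrapper around pdf2docx's Converter.

    start is the 0-based first page; end=None means up to the last page.
    Returns the path of the .docx file that was written.
    """
    if not Path(pdf_path).is_file():
        raise FileNotFoundError(pdf_path)
    docx_path = docx_path or default_docx_path(pdf_path)

    # Imported here so the path check above runs even if pdf2docx is missing.
    from pdf2docx import Converter

    cv = Converter(pdf_path)
    try:
        cv.convert(docx_path, start=start, end=end)
    finally:
        cv.close()  # release the PDF file handle even if conversion fails
    return docx_path
```

With the paths used above, a call might look like convert_pdf('C:/Users/woodwolf/Desktop/01.pdf', 'C:/Users/woodwolf/Desktop/02.docx').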
Output
[INFO] Start to convert C:/Users/woodwolf/Desktop/01.pdf
[INFO] [1/4] Opening document