Calling pdf2docx from Python

Latest recommended article published 2024-05-06 21:40:02

沐岚浩


Category: Applications. Tags: python, PDF to Word, free, pdf2word

Article link: https://blog.csdn.net/weixin_45729594/article/details/120240071


pdf2docx supports Windows and Linux and requires Python >= 3.6.

Documentation: API Documentation

Installation

pip install pdf2docx
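
To confirm that the install succeeded before running any of the samples, a small check like the following can help. This is only a sketch: installed_version is a hypothetical helper name, and it relies on importlib.metadata, which assumes Python 3.8 or newer (the library itself only asks for 3.6).

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version number if pip installed the package into this interpreter.
    print(installed_version("pdf2docx") or "pdf2docx is not installed")
```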

Usage

You can use either the Converter class or the parse() function.

Using the Converter class

Sample One

from pdf2docx import Converter
pdf_file = '/path/to/sample.pdf'
docx_file = '/path/to/sample.docx'
# convert pdf to docx
cv = Converter(pdf_file)
cv.convert(docx_file)      # all pages by default
cv.close()
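
The sample above converts every page. When only some pages are needed, convert() also accepts a pages list of page indexes, which are zero-based by default. The sketch below is one possible wrapper, not the library's recommended pattern: to_zero_based and convert_selected_pages are hypothetical names, and the try/finally is added so the PDF is closed even when conversion raises.

```python
def to_zero_based(page_numbers):
    """Turn 1-based page numbers (as a PDF viewer shows them) into 0-based indexes."""
    indexes = []
    for number in page_numbers:
        if number < 1:
            raise ValueError(f"page numbers start at 1, got {number}")
        indexes.append(number - 1)
    return indexes


def convert_selected_pages(pdf_file, docx_file, page_numbers):
    """Convert only the given 1-based pages of pdf_file into docx_file."""
    # Imported here so the helper above can be used without pdf2docx installed.
    from pdf2docx import Converter

    cv = Converter(pdf_file)
    try:
        # `pages` takes a list of 0-based page indexes.
        cv.convert(docx_file, pages=to_zero_based(page_numbers))
    finally:
        cv.close()  # release the PDF even if conversion fails
```

For example, convert_selected_pages('/path/to/sample.pdf', '/path/to/sample.docx', [1, 3]) would convert the first and third pages only.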

Sample Two

import os
from pdf2docx import Converter

def ConverterByFolder(sourcePath, targetPath):
    """
    Convert every .pdf file in sourcePath to a .docx file in targetPath.

    os.listdir(path) returns a list of the entry names in a directory.
    os.path.splitext(fileName) splits a name into a (root, extension)
    tuple, e.g. ('fun', '.png').
    """
    suffix = '.docx'
    for name in os.listdir(sourcePath):
        fileName, pdfSuffix = os.path.splitext(name)
        if pdfSuffix == ".pdf":
            # os.path.join picks the right separator on Windows and Linux
            pdf_file = os.path.join(sourcePath, name)
            docx_file = os.path.join(targetPath, fileName + suffix)
            cv = Converter(pdf_file)
            cv.convert(docx_file)  # all pages by default
            cv.close()
            print(name, "==" * 10, '>', "done")

if __name__ == '__main__':
    # raw strings keep the backslash in Windows paths literal
    ConverterByFolder(r"F:\Desktop", r"F:\Desktop")
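
The same folder loop can also be written with pathlib, which avoids building paths by hand. The version below is a sketch under a few assumptions of my own: plan_conversions and convert_folder are hypothetical names, the extension match is case-insensitive (so sample.PDF is picked up too), the target folder is created if missing, and a failed file is recorded instead of stopping the batch.

```python
from pathlib import Path


def plan_conversions(source_dir, target_dir):
    """Return (pdf_path, docx_path) pairs for every PDF directly inside source_dir."""
    source, target = Path(source_dir), Path(target_dir)
    pairs = []
    for pdf_path in sorted(source.iterdir()):
        # Compare the extension case-insensitively; skip sub-directories.
        if pdf_path.is_file() and pdf_path.suffix.lower() == ".pdf":
            pairs.append((pdf_path, target / (pdf_path.stem + ".docx")))
    return pairs


def convert_folder(source_dir, target_dir):
    """Convert each planned pair and return a list of (pdf_path, error) failures."""
    # Imported here so plan_conversions can be tried without pdf2docx installed.
    from pdf2docx import Converter

    Path(target_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf_path, docx_path in plan_conversions(source_dir, target_dir):
        try:
            cv = Converter(str(pdf_path))
            try:
                cv.convert(str(docx_path))  # all pages by default
            finally:
                cv.close()
        except Exception as exc:  # keep going; report the failure afterwards
            failed.append((pdf_path, exc))
    return failed
```

Separating the planning step from the conversion makes it possible to print or inspect the list of output paths before any file is written.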

Using the parse() function

Sample One

from pdf2docx import parse

pdf_file = '/path/to/sample.pdf'
docx_file = '/path/to/sample.docx'

# convert pdf to docx
parse(pdf_file, docx_file)

Sample Two

import os
from pdf2docx import parse

def ConverterByFolder_parse(sourcePath, targetPath):
    """
    Convert every .pdf file in sourcePath to a .docx file in targetPath.

    os.listdir(path) returns a list of the entry names in a directory.
    os.path.splitext(fileName) splits a name into a (root, extension)
    tuple, e.g. ('fun', '.png').
    """
    suffix = '.docx'
    for name in os.listdir(sourcePath):
        fileName, pdfSuffix = os.path.splitext(name)
        if pdfSuffix == ".pdf":
            # os.path.join picks the right separator on Windows and Linux
            pdf_file = os.path.join(sourcePath, name)
            docx_file = os.path.join(targetPath, fileName + suffix)
            parse(pdf_file, docx_file)
            print(name, "==" * 10, '>', "done")

if __name__ == '__main__':
    # raw strings keep the backslash in Windows paths literal
    ConverterByFolder_parse(r"F:\Desktop", r"F:\Desktop")
