Processing PDF files with Python

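Python can read a PDF file, extract its text, and print the content. To do this, first install the PyPDF2 module using the command below. This assumes pip is already installed in your Python environment.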

pip install pypdf2
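Method names changed between PyPDF2 releases, so it can help to confirm which version pip actually installed. The following is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and not part of PyPDF2.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("PyPDF2")
    if version is None:
        print("PyPDF2 is not installed; run: pip install pypdf2")
    else:
        print("PyPDF2 version:", version)
```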

Once the module is installed, you can use the methods it provides to read a PDF file.

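# Names below follow PyPDF2 2.x/3.x; older releases used PdfFileReader, getPage and extractText.
import PyPDF2

# Raw string so the backslash in the Windows path is kept literally
pdfName = r'path\Yiibaipoint.pdf'
read_pdf = PyPDF2.PdfReader(pdfName)
# Pages are indexed from 0
page = read_pdf.pages[0]
page_content = page.extract_text()
print(page_content)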

Running the program above produces the following output:

Yiibai Point originated from the idea that there exists a class of readers who respond better

to online content and prefer to learn new skills at their own pace from the comforts of their

drawing rooms.

The journey commenced with a single tutorial on HTML in 2006 and elated by the response

it generated, we worked our way to adding fresh tutorials to our repository which now

proudly flaunts a wealth of tutorials and allied articles on topics ranging from programming

languages to web designing to academics and much more.
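The example above assumes the file has at least one page and that the page yields extractable text. A slightly more defensive sketch follows; it assumes the PyPDF2 2.x/3.x names `PdfReader`, `pages` and `extract_text()`, and the helper names `page_text` and `show_first_page` are made up for illustration.

```python
def page_text(reader, index=0):
    """Return the text of page `index` (0-based), or None if the page does not exist."""
    if not 0 <= index < len(reader.pages):
        return None
    # Fall back to an empty string if nothing was extracted
    return reader.pages[index].extract_text() or ""


def show_first_page(path):
    """Print the first page of the PDF at `path`, or a notice if it has no pages."""
    # Imported here so page_text() itself does not depend on PyPDF2 being installed
    from PyPDF2 import PdfReader

    text = page_text(PdfReader(path))
    print(text if text is not None else "The PDF has no pages")
```

Calling `show_first_page(r'path\Yiibaipoint.pdf')` would then print the first page, and asking `page_text` for a page index that does not exist returns None instead of raising an error.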

Reading multiple pages

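To read a PDF with multiple pages and print each page along with its page number, loop over the pages and call get_page_number() (named getPageNumber() in older PyPDF2 releases). The example below uses a two-page PDF file; the content of each page is printed under its own page heading.

import PyPDF2

pdfName = r'Path\Yiibaispoint2.pdf'
read_pdf = PyPDF2.PdfReader(pdfName)
for i in range(len(read_pdf.pages)):
    page = read_pdf.pages[i]
    # get_page_number() is 0-based, so add 1 for display
    print('Page No - ' + str(1 + read_pdf.get_page_number(page)))
    page_content = page.extract_text()
    print(page_content)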

Running the example code above gives the following result:

Page No - 1

Yiibai Point originated from the idea that there exists a class of readers who respond better to

online content and prefer to learn new skills at their own pace from the comforts of their drawing

rooms.

Page No - 2

The journey commenced with a single tutorial on HTML in 2006 and elated by the response it

generated, we worked our way to adding fresh tutorials to our repository which now proudly flaunts

a wealth of tutorials and allied articles on topics ranging from p

rogramming languages to web

designing to academics and much more.
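The same loop can also be written with enumerate(), which supplies the page numbers directly instead of looking each one up on the reader. This is a minimal sketch assuming the PyPDF2 2.x/3.x names `PdfReader`, `pages` and `extract_text()`; `extract_pages` and `print_pdf` are hypothetical helper names, not part of the library.

```python
def extract_pages(reader):
    """Return a list of (page_number, text) pairs, numbering pages from 1."""
    results = []
    for number, page in enumerate(reader.pages, start=1):
        # Fall back to an empty string if nothing was extracted from the page
        text = page.extract_text() or ""
        results.append((number, text.strip()))
    return results


def print_pdf(path):
    """Print every page of the PDF at `path` under a 'Page No - N' heading."""
    # Imported here so extract_pages() can be used without PyPDF2 installed
    from PyPDF2 import PdfReader

    for number, text in extract_pages(PdfReader(path)):
        print("Page No - " + str(number))
        print(text)
```

Keeping the extraction in a function that returns data, rather than printing inside the loop, makes it easier to reuse the page text later, for example to search it or write it to a file.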
