Office operations with Python


志者～不俗



Some notes on document processing with Python, kept here so I can reuse them later.

PDF operations

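I had to submit application materials today and needed PDFs of some certificates. Photographing them on my phone and building the PDF with the CamScanner app (扫描全能王) leaves a watermark unless you pay for a VIP account, and the online PDF tools I found all want registration and payment as well, so I decided to just learn how to do it with Python.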

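First, use CamScanner to get a JPG of the certificate (photographing it directly also works, but CamScanner detects the borders and sharpens the image, which is quite nice; if I used it often I would pay for it). Then convert the JPG to a PDF with Python, following the tutorial below. Note that the photo must be in portrait orientation: a landscape photo ends up scaled sideways in the PDF, and I haven't figured out why. Rotating the image before converting doesn't help either, because the rotation adds black bars on both sides of the image, which go into the PDF too and then have to be cropped off. It is less trouble to prepare a portrait image from the start.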

https://blog.csdn.net/ycc297876771/article/details/81005298

The program is as follows:

from PIL import Image  # Pillow, needed for Image.open
from reportlab.lib.pagesizes import portrait
from reportlab.pdfgen import canvas

def imgtopdf(input_paths, outputpath):
    with Image.open(input_paths) as img:
        (maxw, maxh) = img.size  # read the image dimensions
    c = canvas.Canvas(outputpath, pagesize=portrait((maxw, maxh)))  # create a blank PDF sized to the image
    c.drawImage(input_paths, 0, 0, maxw, maxh)  # draw the image onto the PDF page
    c.showPage()
    c.save()

imgtopdf("2b.jpg", "2b.pdf")  # build the PDF from an image in the current directory
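A possible explanation for the landscape problem, which I have not checked against the exact symptom described above: reportlab's `portrait()` always returns the shorter side first, so a landscape image gets a portrait page while `drawImage` still draws it at its landscape size. The sketch below re-implements that behaviour as `portrait_like` purely for illustration, and `img_to_pdf_keep_orientation` is a hypothetical variant that passes the image size straight through as the page size. Both names are my own, not part of reportlab.

```python
def portrait_like(size):
    """Mimic what reportlab's portrait() does: put the shorter side first.

    Re-implemented here only to illustrate the behaviour; use the real
    reportlab.lib.pagesizes.portrait in actual code.
    """
    a, b = size
    return (b, a) if a >= b else (a, b)


def img_to_pdf_keep_orientation(input_path, output_path):
    """Hypothetical variant of imgtopdf that keeps the image's orientation."""
    # Imported here so the helper above can be tried without these packages.
    from PIL import Image
    from reportlab.pdfgen import canvas

    with Image.open(input_path) as img:
        width, height = img.size
    # Use the image size directly as the page size, without portrait(),
    # so a landscape image produces a landscape page of the same size.
    c = canvas.Canvas(output_path, pagesize=(width, height))
    c.drawImage(input_path, 0, 0, width, height)
    c.showPage()
    c.save()
    return (width, height)
```

For a 4000x3000 landscape photo, `portrait_like((4000, 3000))` gives `(3000, 4000)`, so the page no longer matches the image being drawn on it. Calling `img_to_pdf_keep_orientation("2b.jpg", "2b.pdf")` would instead create a 4000x3000 page.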

To merge several PDFs, follow the tutorial below, which merges every PDF in the current directory. Note that merging fails for some PDFs; I don't know whether that is a version issue:

"A simple Python script for merging PDFs" (www.jianshu.com)

The program is as follows:

import os
import re

import PyPDF2

def main():
    # Find all the PDF files in the current directory, skipping an earlier Merged.pdf.
    mypath = os.getcwd()
    pattern = r"\.pdf$"
    file_names_lst = [os.path.join(mypath, f) for f in os.listdir(mypath)
                      if re.search(pattern, f, re.IGNORECASE)
                      and not re.search(r'Merged\.pdf', f)]
    # Merge the files.
    opened_file = [open(file_name, 'rb') for file_name in file_names_lst]
    pdfFM = PyPDF2.PdfFileMerger()  # PyPDF2 before 3.0; later releases call this PdfMerger
    for file in opened_file:
        pdfFM.append(file)
    # Write the output file.
    with open(os.path.join(mypath, "Merged.pdf"), 'wb') as write_out_file:
        pdfFM.write(write_out_file)
    # Close all the input files.
    for file in opened_file:
        file.close()

if __name__ == '__main__':
    main()
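Two things about the merge script are worth a sketch: `os.listdir` returns files in no guaranteed order, so the page order of the result can vary, and one unreadable PDF aborts the whole merge. The version below is a minimal sketch under assumptions, not the tutorial's method: `pick_pdf_names` and `merge_pdfs` are names I made up, and the merge step assumes a PyPDF2 release that ships the `PdfMerger` class (2.x or 3.x).

```python
import os


def pick_pdf_names(names, output_name="Merged.pdf"):
    """Return the PDF file names from `names`, sorted case-insensitively,
    leaving out the merge output itself."""
    pdfs = [n for n in names
            if n.lower().endswith(".pdf") and n.lower() != output_name.lower()]
    return sorted(pdfs, key=str.lower)


def merge_pdfs(directory, output_name="Merged.pdf"):
    """Merge the PDFs in `directory` in name order; return the files skipped."""
    # Assumption: PyPDF2 2.x or 3.x, where the merger class is PdfMerger.
    from PyPDF2 import PdfMerger

    merger = PdfMerger()
    skipped = []
    for name in pick_pdf_names(os.listdir(directory), output_name):
        try:
            merger.append(os.path.join(directory, name))
        except Exception as exc:  # broad on purpose: report the file and move on
            skipped.append((name, repr(exc)))
    merger.write(os.path.join(directory, output_name))
    merger.close()
    return skipped
```

Calling `merge_pdfs(os.getcwd())` would write `Merged.pdf` and return a list of the files that could not be appended, which at least shows which PDFs are causing the failure.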

To remove certain pages, copy only the pages you want to keep into a new file:

"#Python# Software that deletes blank pages from a PDF charges money?! Just use Python" (www.jianshu.com)

The program is as follows:

import PyPDF2

original = r'1.pdf'
new = r'2.pdf'

original_pdf = PyPDF2.PdfFileReader(original)
# Page indices start at 0, so these are the 9th and 17th pages of the file.
page8 = original_pdf.getPage(8)
page16 = original_pdf.getPage(16)

# Only the pages added to the writer end up in the new file.
pdfWriter = PyPDF2.PdfFileWriter()
pdfWriter.addPage(page8)
pdfWriter.addPage(page16)

with open(new, 'wb') as f:
    pdfWriter.write(f)
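The script above lists the pages to keep, which gets tedious when only a few pages need to go. As a rough sketch of the opposite direction, the helper below turns a list of 1-based page numbers to delete into the 0-based indices to keep. `pages_to_keep` and `delete_pages` are hypothetical names of my own, and the second function assumes a PyPDF2 release with the newer `PdfReader` / `PdfWriter` names (2.x or 3.x) rather than the `PdfFileReader` API used above.

```python
def pages_to_keep(total_pages, pages_to_delete):
    """Return the 0-based indices that remain after deleting the given
    1-based page numbers. Numbers outside the document are ignored."""
    drop = {n - 1 for n in pages_to_delete}
    return [i for i in range(total_pages) if i not in drop]


def delete_pages(original, new, pages_to_delete):
    """Write `new` as a copy of `original` without the listed pages."""
    # Assumption: PyPDF2 2.x or 3.x, which provides PdfReader and PdfWriter.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(original)
    writer = PdfWriter()
    for i in pages_to_keep(len(reader.pages), pages_to_delete):
        writer.add_page(reader.pages[i])
    with open(new, "wb") as f:
        writer.write(f)
```

For example, `delete_pages("1.pdf", "2.pdf", [2, 4])` would copy every page except the second and fourth into `2.pdf`.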