[Python] Merging Two PDF Files

lin_xi_xi

Last modified 2024-05-29 19:48:58

First published 2024-05-29 19:48:24

Article link: https://blog.csdn.net/lin_xi_xi/article/details/139304143

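PDF files are part of everyday life: whether for study, work, or leisure, we regularly need to process them. Many PDF editors on the market, however, have to be downloaded and installed, and some of them are paid products. Python's PyPDF2 package can merge, split, and rotate PDF files without a separate PDF editor or plugin. Installing Python and the PyPDF2 package (for example with pip install PyPDF2) is enough to use these features on any computer, which makes routine PDF processing simpler and more efficient.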

PyPDF2 is also cross-platform: it runs on Windows, macOS, and Linux, so the same scripts work whichever operating system you use.

1. Merging two PDF files with Python

from PyPDF2 import PdfReader, PdfMerger

def merge_pdfs(file1, file2, output):
    pdf_merger = PdfMerger()
    pdf1 = PdfReader(file1)
    pdf2 = PdfReader(file2)
    # append() adds each document after the pages already queued, so file1
    # comes first. Calling merge(0, ...) twice would put file2 before file1.
    pdf_merger.append(pdf1)
    pdf_merger.append(pdf2)
    with open(output, 'wb') as f:
        pdf_merger.write(f)
    pdf_merger.close()

file1 = 'F:/file1.pdf'
file2 = 'F:/file2.pdf'
output = 'F:/combine-create.pdf'

merge_pdfs(file1, file2, output)
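
The same idea extends to more than two files. The following is a minimal sketch rather than part of the original post: natural_key and merge_many are hypothetical helper names, and ordering the inputs by file name is an assumption about what the reader wants. It relies on PdfMerger.append, which adds each document after the pages already queued.

```python
import re


def natural_key(name):
    """Sort key that compares digit runs as numbers, so 'page2' < 'page10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', name)]


def merge_many(paths, output):
    """Merge the given PDFs into `output`, ordered by natural file name."""
    # Imported here so natural_key stays usable without PyPDF2 installed
    from PyPDF2 import PdfMerger

    merger = PdfMerger()
    for path in sorted(paths, key=natural_key):
        merger.append(path)  # pages are kept in the order of the calls
    with open(output, 'wb') as f:
        merger.write(f)
    merger.close()
```

A call such as merge_many(['page10.pdf', 'page2.pdf', 'page1.pdf'], 'all.pdf') would then combine the files as page1, page2, page10. A plain alphabetical sort would place page10 before page2, which is why the numeric key is used here.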

2. Reading PDF document information:

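from PyPDF2 import PdfReader

# Open the PDF file
with open('example.pdf', 'rb') as file:
    # Create a PdfReader object (PdfFileReader raises an error in PyPDF2 3.0)
    pdf = PdfReader(file)

    # Get the number of pages
    num_pages = len(pdf.pages)
    print("Pages:", num_pages)

    # Get the document metadata (None if the file carries none)
    metadata = pdf.metadata
    if metadata is not None:
        print("Title:", metadata.title)
        print("Author:", metadata.author)
        # Raw PDF date string; the metadata object has no `created` attribute
        print("Created:", metadata.get('/CreationDate'))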
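
Metadata is optional in a PDF, so either the whole metadata object or individual fields may be missing. The helper below is a small sketch of one way to handle that. The name summarize_metadata and the '(not set)' placeholder are assumptions made for illustration; the function only expects an object with title, author, and subject attributes, which PyPDF2's metadata object provides.

```python
def summarize_metadata(metadata, num_pages):
    """Return a plain dict of common fields, tolerating missing metadata."""
    info = {'pages': num_pages}
    for field in ('title', 'author', 'subject'):
        # getattr falls back to None when metadata itself is None
        # or when the attribute does not exist
        value = getattr(metadata, field, None)
        info[field] = value if value is not None else '(not set)'
    return info
```

With a reader created as in the example above, summarize_metadata(pdf.metadata, len(pdf.pages)) returns a dictionary that is safe to print or log even for files that carry no metadata.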

4. Extracting text content:

from PyPDF2 import PdfReader

# Open the PDF file
with open('example.pdf', 'rb') as file:
    # Create a PdfReader object
    pdf = PdfReader(file)

    # Extract the text of the first page
    page = pdf.pages[0]
    text = page.extract_text()
    print(text)
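
To collect text from the whole document instead of only the first page, the same call can be applied to every page. This is a hedged sketch: extract_all_text is a hypothetical helper, and the page separator and the placeholder for empty pages are arbitrary choices. It accepts any sequence of objects that have an extract_text() method, such as the pages of a PdfReader.

```python
def extract_all_text(pages):
    """Join the text of every page, marking pages that yield no text."""
    chunks = []
    for number, page in enumerate(pages, start=1):
        text = (page.extract_text() or '').strip()
        if not text:
            # Scanned or image-only pages typically have no text layer
            text = '[no extractable text]'
        chunks.append(f'--- page {number} ---\n{text}')
    return '\n'.join(chunks)
```

With the reader from the example above, extract_all_text(pdf.pages) returns a single string. Pages that consist only of images come back marked as empty, because text extraction reads the text layer and does not perform OCR.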

5. Splitting a PDF document:

from PyPDF2 import PdfReader, PdfWriter

# Open the PDF file
with open('example.pdf', 'rb') as file:
    # Create a PdfReader object
    pdf = PdfReader(file)

    # Split the document, saving each page to its own file
    for page_num, page in enumerate(pdf.pages):
        output_pdf = PdfWriter()
        output_pdf.add_page(page)

        with open(f'page{page_num + 1}.pdf', 'wb') as output_file:
            output_pdf.write(output_file)
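
Splitting does not have to be one page per file. The sketch below groups consecutive pages into fixed-size chunks. The names chunk_ranges and split_into_chunks and the 'part' file prefix are hypothetical, chosen only for this illustration.

```python
def chunk_ranges(num_pages, chunk_size):
    """Return (start, stop) page-index pairs covering num_pages in order."""
    if chunk_size < 1:
        raise ValueError('chunk_size must be at least 1')
    return [(start, min(start + chunk_size, num_pages))
            for start in range(0, num_pages, chunk_size)]


def split_into_chunks(path, chunk_size, prefix='part'):
    """Write groups of chunk_size pages to prefix1.pdf, prefix2.pdf, ..."""
    # Imported here so chunk_ranges stays usable without PyPDF2 installed
    from PyPDF2 import PdfReader, PdfWriter

    with open(path, 'rb') as source:
        reader = PdfReader(source)
        ranges = chunk_ranges(len(reader.pages), chunk_size)
        for index, (start, stop) in enumerate(ranges, start=1):
            writer = PdfWriter()
            for page_num in range(start, stop):
                writer.add_page(reader.pages[page_num])
            with open(f'{prefix}{index}.pdf', 'wb') as output_file:
                writer.write(output_file)
```

For a 10-page document and a chunk size of 4, chunk_ranges yields three ranges, so split_into_chunks('example.pdf', 4) would write part1.pdf and part2.pdf with four pages each and part3.pdf with the remaining two.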

6. Adding a watermark:

from PyPDF2 import PdfReader, PdfWriter

# Keep both input files open until the output has been written,
# because PyPDF2 reads page data from them lazily
with open('example.pdf', 'rb') as file, open('watermark.pdf', 'rb') as watermark_file:
    # Create the reader objects and the writer
    pdf = PdfReader(file)
    watermark = PdfReader(watermark_file)
    output_pdf = PdfWriter()

    # Overlay the first page of the watermark file on every page
    for page in pdf.pages:
        page.merge_page(watermark.pages[0])
        output_pdf.add_page(page)

    # Save the watermarked PDF file
    with open('watermarked_document.pdf', 'wb') as output_file:
        output_pdf.write(output_file)

Further reading

PyPDF study guide: PyPDF2 installation (w3cschool)
