Han Xin's approach: the more, the better
The last dance: next up, 60 hands-on Python mini-programs
Operating system: macOS Sonoma 14.6.1
Python IDE: PyCharm 2024.1.4 (Community Edition)
Python version: 3.12
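This article only covers what each package mainly does, along with a few classic examples. To learn a package in more depth, refer to its official documentation.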
136 PyPDF2: PDF editing library
PyPDF2 version 3.0.1; see the official documentation.
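PyPDF2 is a Python library for working with PDF files. It can read, merge, split, and modify them. Its feature set is fairly basic, which makes it a good fit for simple PDF tasks. Main features:
- Read PDF files:
Extract text, metadata, and page information.
- Merge PDF files:
Combine several PDF files into a single file.
- Split PDF files:
Extract specific pages from a PDF file.
- Add watermarks:
Overlay one PDF file onto another as a watermark.
- Rotate pages:
Rotate the pages of a PDF file.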
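import PyPDF2


def read_pdf(file_path):
    """Return the extracted text of every page, separated by newlines."""
    with open(file_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        text = ""
        for page in reader.pages:
            # Pages without a text layer (e.g. scans) yield an empty string
            text += page.extract_text() + "\n"
    return text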
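
def merge_pdfs(pdf_list, output_path):
    """Append all pages of each PDF in pdf_list to one output file."""
    pdf_writer = PyPDF2.PdfWriter()
    for pdf_file in pdf_list:
        # Pass the path so PdfReader loads the file itself; page data is
        # resolved lazily and must still be readable when write() runs
        reader = PyPDF2.PdfReader(pdf_file)
        for page in reader.pages:
            pdf_writer.add_page(page)
    with open(output_path, "wb") as output_file:
        pdf_writer.write(output_file)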

def split_pdf(file_path, page_number, output_path):
    """Copy a single page (zero-based index) into a new PDF."""
    with open(file_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        pdf_writer = PyPDF2.PdfWriter()
        pdf_writer.add_page(reader.pages[page_number])
        # Write while the source file is still open
        with open(output_path, "wb") as output_file:
            pdf_writer.write(output_file)

def add_watermark(input_pdf, watermark_pdf, output_pdf):
    """Overlay the first page of watermark_pdf on every page of input_pdf."""
    with open(input_pdf, "rb") as original_file, open(watermark_pdf, "rb") as watermark_file:
        original_reader = PyPDF2.PdfReader(original_file)
        watermark_reader = PyPDF2.PdfReader(watermark_file)
        pdf_writer = PyPDF2.PdfWriter()
        for page in original_reader.pages:
            # merge_page draws the watermark content on top of the page
            page.merge_page(watermark_reader.pages[0])
            pdf_writer.add_page(page)
        # Write while both source files are still open
        with open(output_pdf, "wb") as output_file:
            pdf_writer.write(output_file)
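
def rotate_page(input_pdf, page_number, angle, output_pdf):
    """Rotate one page clockwise and save it as a new single-page PDF."""
    with open(input_pdf, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        pdf_writer = PyPDF2.PdfWriter()
        page = reader.pages[page_number]
        # rotate_clockwise() was removed in PyPDF2 3.0; rotate() takes a
        # clockwise angle that must be a multiple of 90
        page.rotate(angle)
        pdf_writer.add_page(page)
        # Write while the source file is still open
        with open(output_pdf, "wb") as output_file:
            pdf_writer.write(output_file)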

if __name__ == "__main__":
    print("Reading PDF:")
    print(read_pdf("example.pdf"))
    merge_pdfs(["file1.pdf", "file2.pdf"], "merged.pdf")
    print("Merged PDFs into merged.pdf")
split_pdf