Converting WeChat Official Account Articles to PDF - CSDN Blog

Article link: https://blog.csdn.net/qq_35859258/article/details/122566383

Converting images from WeChat Official Account articles to PDF (Python)

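0. Preface

I often come across good articles on WeChat Official Accounts and want to print them out to read at leisure, but printing them directly gives poor results. After some experimenting, I settled on a "download, convert, merge" workflow. The main steps are as follows: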

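1. Downloading the images

WeChat Official Account articles use images in WebP format, which produces smaller files than JPEG. However, many image viewers do not support it, and WebP files cannot be loaded directly into a PDF editor or PPT.
To download the images in an article, follow these steps:
- Open the article in a browser
- Scroll slowly to the bottom of the article so that every image loads; otherwise some images will be missing
- Save the page as "Webpage, Complete (*.htm;*.html)"
- This produces an HTML file and a folder with the same name
- Inside that folder, the files whose names start with 640 are the WebP images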
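
As a quick sanity check on the last step, WebP files can be recognized by their header: a RIFF container whose bytes 8 to 11 spell `WEBP`. The sketch below lists the files in a saved-page folder whose names start with `640` and reports which of them really are WebP. The function names `is_webp` and `find_webp_candidates` and the folder path are hypothetical, chosen only for illustration.

```python
import os


def is_webp(file_path):
    """Return True if the file starts with a RIFF/WEBP container header."""
    with open(file_path, "rb") as f:
        header = f.read(12)
    # Bytes 0-3 are b"RIFF", bytes 4-7 hold the payload size, bytes 8-11 are b"WEBP"
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WEBP"


def find_webp_candidates(folder, prefix="640"):
    """Return sorted names of regular files in `folder` that start with `prefix` and are WebP."""
    found = []
    for name in sorted(os.listdir(folder)):
        full = os.path.join(folder, name)
        if os.path.isfile(full) and name.startswith(prefix) and is_webp(full):
            found.append(name)
    return found


if __name__ == "__main__":
    # Hypothetical path: the folder the browser created next to the saved .html file
    saved_folder = r"F:\DOWNLOAD\article_files"
    if os.path.isdir(saved_folder):
        for name in find_webp_candidates(saved_folder):
            print(name)
```

Checking the header rather than the file name is useful here because the saved files have no extension to go by.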


2. Converting the images

WebP files cannot be imported directly into PPT or a PDF, so they need to be converted to a common image format such as PNG.
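
The conversion script later in this section relies on Pillow, so it can be worth confirming up front that the installed build is able to read WebP. A minimal sketch, assuming Pillow is installed; the helper name `webp_support_report` is hypothetical:

```python
import PIL
from PIL import features


def webp_support_report():
    """Return the installed Pillow version and whether WebP support is available."""
    return {
        "pillow_version": PIL.__version__,
        "webp_supported": features.check("webp"),
    }


if __name__ == "__main__":
    report = webp_support_report()
    print("Pillow version:", report["pillow_version"])
    print("WebP supported:", report["webp_supported"])
```

If the import itself fails, or WebP support is reported as unavailable, reinstalling Pillow is a reasonable first step.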

Add a .webp suffix to the files whose names start with 640

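import os

# Folder that contains the saved images; change this to your own location
path = "F:\\DOWNLOAD"

for filename in os.listdir(path):
    src = os.path.join(path, filename)
    # Only rename regular files that start with "640" and do not yet end in .webp
    if os.path.isfile(src) and filename.startswith("640") and not filename.lower().endswith(".webp"):
        os.rename(src, src + ".webp")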

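Use the following script to convert the WebP files to PNG.
- You may run into this error at this point:
  
  cannot import name '_imaging' from 'PIL'
  
  This usually means the installed Pillow library is outdated or its installation is broken, and it needs to be updated
- Uninstalling and reinstalling it completes the update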
```
pip uninstall pillow
pip install pillow
```
- Copy the code file below into the folder that contains the images to convert, then run the .py file to perform the conversion

# Purpose: convert every WebP file under the current working directory to PNG or JPG
# -*- coding: UTF-8 -*-
import os
from PIL import Image

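# Get the current working directory
CURRENT_PATH = os.getcwd()

# Target format
IMG_EXP = ".png"

# List every entry at the top level of the working directory
cur_all_files = os.listdir(CURRENT_PATH)
# Paths (without extension) of the images to convert
imgList = []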


# Walk a folder recursively and store the paths of its WebP files in the list
def findFileForImage(filePath):
    child_all_files = os.listdir(filePath)
    for child_file_name in child_all_files:
        sPath = os.path.join(filePath, child_file_name)
        if os.path.isdir(sPath):
            findFileForImage(sPath)
            # A folder is never an image itself, so skip the extension check
            continue
        n, e = os.path.splitext(child_file_name)
        if e.lower() == ".webp":
            imgList.append(os.path.join(filePath, n))


# Scan the working directory for WebP files; descend into any subfolder found
for file_name in cur_all_files:
    nPath = os.path.join(CURRENT_PATH, file_name)
    # Folder: search it recursively
    if os.path.isdir(nPath):
        findFileForImage(nPath)
        continue
    # File: store it if it is a WebP image
    name, ext = os.path.splitext(file_name)
    if ext.lower() == ".webp":
        imgList.append(os.path.join(CURRENT_PATH, name))


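# Convert the images
def convertImage():
    for webpPath in imgList:
        print(webpPath)

        # Open the image and read its pixel data into memory
        with Image.open(webpPath + ".webp") as img:
            img.load()
            # Save a copy with the new extension next to the original
            img.save(webpPath + IMG_EXP)
        # Delete the original WebP file (it is closed once the with block ends)
        os.remove(webpPath + ".webp")


# Run
convertImage()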
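
The "merge" step of the workflow is not shown in this section. One possible way to do it, offered as an assumption rather than as the author's method, is Pillow's multi-page PDF export (`save_all=True` together with `append_images`). The function name `merge_images_to_pdf`, the output name `article.pdf`, and the plain name-based ordering are illustrative choices, not taken from the original post.

```python
import os
from PIL import Image


def merge_images_to_pdf(image_paths, pdf_path):
    """Write the given images, in order, as the pages of a single PDF file."""
    if not image_paths:
        raise ValueError("no images to merge")
    pages = []
    for p in image_paths:
        with Image.open(p) as img:
            # PDF export does not accept an alpha channel, so convert to RGB;
            # convert() returns an independent in-memory copy
            pages.append(img.convert("RGB"))
    first, rest = pages[0], pages[1:]
    # save_all=True together with append_images produces a multi-page PDF
    first.save(pdf_path, "PDF", save_all=True, append_images=rest)
    return pdf_path


if __name__ == "__main__":
    folder = os.getcwd()
    # Assumption: plain name order matches the article order; reorder the list if it does not
    pngs = sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".png")
    )
    if pngs:
        merge_images_to_pdf(pngs, os.path.join(folder, "article.pdf"))
```

Each image becomes one page at its own pixel size, so it is worth checking the page order in the resulting PDF before printing.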