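While developing an app for a project, I needed to convert a PDF file to images page by page and, at the same time, adjust the size of the images (the requirement was on file size on disk, not on pixel dimensions). After some research I met the requirement with Python's "PyPDF2" library (see: http://blog.csdn.net/sweeper_freedoman/article/details/52994400) and "PythonMagick" library (see: http://blog.csdn.net/sweeper_freedoman/article/details/52994690). The script is short and is shown below.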
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
author : 蛙鳜鸡鹳狸猿
create_time : Tue Nov  1 17:38:06 CST 2016
program : *_* script of manipulating pdf *_*
"""
import sys

import PyPDF2
import PythonMagick


class ManImage:
    """
    Manipulate Image Object
    """

    def __init__(self, i_file, o_dire):
        """
        init args
        :param i_file: (str) input pdf file (eg: "/home/file.pdf")
        :param o_dire: (str) output image directory (eg: "/home/")
        """
        self.i_file = i_file
        self.o_dire = o_dire

    def __str__(self):
        traceback = "Executing under {0.argv[0]} of {1.i_file} into {2.o_dire}......".format(sys, self, self)
        return traceback

    def playpdf(self, ds):
        """
        split pdf file
        :param ds: (int) density; ds = 1024 gave roughly 1 MB of output in my test
        :return: split PNG image files
        """
        # open() works on Python 2 and 3; the original used the Python 2-only file().
        # PdfFileReader/getNumPages are the legacy PyPDF2 names (PyPDF2 3.x uses
        # PdfReader and len(reader.pages) instead).
        with open(self.i_file, "rb") as pdf:
            pages = PyPDF2.PdfFileReader(pdf).getNumPages()
        print('Totally get ***{0:^4}*** pages from "{1.i_file}", playpdf start......'.format(pages, self))
        try:
            for i in range(pages):
                image = PythonMagick.Image()
                image.density(str(ds))  # set the density before reading the page
                image.read(self.i_file + '[' + str(i) + ']')  # "[i]" selects page i (zero-based)
                image.magick("PNG")
                image.write(self.o_dire + str(i + 1) + ".png")  # output files are numbered from 1
                print("{0:>5} page OK......".format(i + 1))
        except Exception as e:  # "except Exception, e" is Python 2-only syntax
            print(str(e))
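ImageMagick selects a single PDF page by appending a zero-based index in square brackets to the file name, which is what the `image.read(...)` call in `playpdf` relies on. The sketch below isolates that naming logic using only the standard library. The helper names `page_spec` and `page_output` are hypothetical and not part of the original script, and the output path is built with `os.path.join` on the assumption that you would rather not depend on a trailing slash in the output directory.

```python
import os


def page_spec(pdf_path, index):
    """Return the ImageMagick read specifier for one zero-based PDF page."""
    return "{0}[{1}]".format(pdf_path, index)


def page_output(out_dir, index, ext="png"):
    """Return the output path for a page, numbered from 1 like the script."""
    return os.path.join(out_dir, "{0}.{1}".format(index + 1, ext))


if __name__ == "__main__":
    # Show the read specifier and output path for the first three pages.
    for i in range(3):
        print(page_spec("/home/file.pdf", i), "->", page_output("/home", i))
```

With helpers like these, the loop body in `playpdf` could pass `page_spec(self.i_file, i)` to `image.read` and `page_output(self.o_dire, i)` to `image.write`.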
Save the ManImage class above in a file named "class_image.py". Below is a simple example of calling it.
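#!/usr/bin/python
# -*- coding: utf-8 -*-
# te_author : 蛙鳜鸡鹳狸猿
# create_time : Tue Nov  1 17:38:06 CST 2016
# NOTICE : *_* script of converting .pdf to .png *_*
import sys

import class_image

i_file = sys.argv[1]  # input PDF file
o_dire = sys.argv[2]  # output directory; must end with "/" because paths are concatenated
ds = sys.argv[3]      # density; arrives as a string and is passed on as one

# Only run the conversion when the input file name ends with ".pdf".
if i_file[-4:] == ".pdf":
    class_image.ManImage(i_file=i_file, o_dire=o_dire).playpdf(ds=ds)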
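The calling script reads `sys.argv` positionally and silently does nothing when the input is not a ".pdf" file. As one possible alternative, here is a minimal sketch that parses the same three values with the standard `argparse` module and reports a bad input instead. The function name `parse_args` is hypothetical, and converting `ds` to an integer is an assumption based on the `(int)` note in the `playpdf` docstring.

```python
import argparse


def parse_args(argv=None):
    """Parse the input PDF, output directory and density arguments."""
    parser = argparse.ArgumentParser(description="Convert a PDF to PNG pages")
    parser.add_argument("i_file", help='input PDF file, e.g. "/home/file.pdf"')
    parser.add_argument("o_dire", help='output image directory, e.g. "/home/"')
    parser.add_argument("ds", type=int, help="density used when rendering pages")
    args = parser.parse_args(argv)
    # Reject anything that does not look like a PDF instead of skipping it silently.
    if not args.i_file.lower().endswith(".pdf"):
        parser.error("input file must end with .pdf")
    return args


if __name__ == "__main__":
    # Demo with an explicit argument list; pass None to read the real command line.
    demo = parse_args(["/home/file.pdf", "/home/", "1024"])
    print(demo.i_file, demo.o_dire, demo.ds)
```

The parsed values could then be handed to `class_image.ManImage(i_file=..., o_dire=...).playpdf(ds=...)` exactly as in the script above.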
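The calling script takes three command-line arguments: the input file, the output directory, and the image size setting (the density `ds`), which makes it quick and convenient to use. To process images directly, see: http://blog.csdn.net/sweeper_freedoman/article/details/53000520 and http://blog.csdn.net/sweeper_freedoman/article/details/69789307.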
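Because the requirement here is about file size on disk rather than pixel dimensions, it can help to check the generated pages after a run. The following is a minimal sketch using only the standard library; the function name `oversized_images` and the 1024-byte limit in the demo are hypothetical and not part of the original script.

```python
import os
import tempfile


def oversized_images(directory, limit_bytes, ext=".png"):
    """Return (name, size) pairs for images larger than limit_bytes, sorted by name."""
    result = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.lower().endswith(ext) and os.path.isfile(path):
            size = os.path.getsize(path)
            if size > limit_bytes:
                result.append((name, size))
    return result


if __name__ == "__main__":
    # Demo on a throwaway directory holding two fake "pages" of known size.
    demo_dir = tempfile.mkdtemp()
    for name, size in (("1.png", 100), ("2.png", 5000)):
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(b"\0" * size)
    print(oversized_images(demo_dir, limit_bytes=1024))
```

Pages reported by such a check would typically be regenerated with a lower `ds` value, since a lower density renders fewer pixels per page.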