Pdfkit
(https://pypi.org/project/pdfkit/ )
Python 2 and 3 wrapper for the wkhtmltopdf utility to convert HTML to PDF using WebKit. pdfkit is only a wrapper around wkhtmltopdf: it invokes the wkhtmltopdf executable to do the actual work.
Installation:
Run pip install pdfkit (https://blog.csdn.net/qq_35865125/article/details/106176741 ).
If you are using Python 3, run pip3 install pdfkit.
Note:
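This package is a wrapper around wkhtmltopdf, so calling its functions, for example pdfkit.from_url('http://google.com', 'out.pdf'), depends on wkhtmltopdf being available.
wkhtmltopdf must be installed on the system (https://wkhtmltopdf.org/downloads.html) and added to the system path after installation. On Windows you can download the installer directly; on Linux you can run
sudo apt-get install wkhtmltopdf
After adding the environment variable, restart PyCharm or the command-line window.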
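A quick way to confirm that the executable is visible on the system path before calling pdfkit is a lookup with the standard library. This is only a minimal sketch: the helper name find_wkhtmltopdf is made up for illustration and is not part of pdfkit.

```python
import shutil


def find_wkhtmltopdf(name="wkhtmltopdf"):
    """Return the full path of the executable if it is on PATH, else None.

    find_wkhtmltopdf is a hypothetical helper, not a pdfkit function.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_wkhtmltopdf()
    if path is None:
        print("wkhtmltopdf not found on PATH; install it or fix PATH first")
    else:
        print("wkhtmltopdf found at", path)
```

If the lookup returns None right after installation, the terminal or IDE was probably started before the environment variable changed.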
Example:
https://pypi.org/project/pdfkit/ :
import pdfkit
pdfkit.from_url('http://google.com', 'out.pdf')
pdfkit.from_file('test.html', 'out.pdf')
pdfkit.from_string('Hello!', 'out.pdf')
About wkhtmltopdf
wkhtmltopdf and wkhtmltoimage are open source (LGPLv3) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service.
There is also a C library, if you're into that kind of thing.
pdfkit's Python code
pdfkit is a wrapper around an executable, so its source shows one way to call a third-party executable from Python.
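The general idea of wrapping an executable can be sketched in a few lines. The class below is a hypothetical illustration, not pdfkit's code, and it uses the Python interpreter itself as the stand-in executable so that it runs without wkhtmltopdf installed.

```python
import subprocess
import sys


class ExecutableWrapper:
    """Hypothetical minimal wrapper around a command-line program."""

    def __init__(self, executable):
        self.executable = executable

    def run(self, *args):
        """Run the executable with args and return its stdout as text."""
        completed = subprocess.run(
            [self.executable, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        if completed.returncode != 0:
            raise RuntimeError(
                "command failed with code %d: %s"
                % (completed.returncode, completed.stderr.strip())
            )
        return completed.stdout


# The Python interpreter stands in for wkhtmltopdf so the sketch runs anywhere.
python = ExecutableWrapper(sys.executable)
output = python.run("-c", "print('hello from a child process')")
print(output.strip())
```

The wrapper hides the command line behind a normal method call and turns a non-zero exit code into a Python exception, which is the convenience a package like pdfkit offers its callers.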
Viewing the source:
Open a Python terminal and run:
import pdfkit
help(pdfkit)
The FILE entry in the help output shows where the code directory is.
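The same directory can also be found programmatically. The sketch below uses importlib from the standard library; package_dir is a name chosen here for illustration, and the call prints None when pdfkit is not installed in the current environment.

```python
import importlib.util
import os


def package_dir(module_name):
    """Return the directory holding a module's source, or None if unknown."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.has_location:
        return None
    return os.path.dirname(spec.origin)


# Prints None if pdfkit is not installed in the current environment.
print(package_dir("pdfkit"))
```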
api.py defines the commonly called functions, such as from_string and from_file:
def from_file(input, output_path, options=None, toc=None, cover=None, css=None,
              configuration=None, cover_first=False):
    """
    Convert HTML file or files to PDF document
    :param input: path to HTML file or list with paths or file-like object
    :param output_path: path to output PDF file. False means file will be returned as string.
    :param options: (optional) dict with wkhtmltopdf options, with or w/o '--'
    :param toc: (optional) dict with toc-specific wkhtmltopdf options, with or w/o '--'
    :param cover: (optional) string with url/filename with a cover html page
    :param css: (optional) string with path to css file which will be added to a single input file
    :param configuration: (optional) instance of pdfkit.configuration.Configuration()
    :param configuration_first: (optional) if True, cover always precedes TOC
    Returns: True on success
    """
    r = PDFKit(input, 'file', options=options, toc=toc, cover=cover, css=css,
               configuration=configuration, cover_first=cover_first)
    return r.to_pdf(output_path)
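Internally, from_file first creates an object of class PDFKit and then uses that object to do the work.
Class PDFKit is defined in pdfkit.py. Its member wkhtmltopdf presumably corresponds to the wrapped wkhtmltopdf executable, and its member configuration is responsible for finding wkhtmltopdf.
Class Configuration uses the subprocess module to locate the wkhtmltopdf executable installed on the machine.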
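The from_file docstring quoted earlier says that option keys may be written with or without '--'. One way a wrapper could turn such a dict into a command line is sketched here. build_command and its treatment of empty values as bare flags are assumptions made for illustration, not pdfkit's actual implementation.

```python
def build_command(executable, options, source, output_path):
    """Build an argument list for a wkhtmltopdf-style command line.

    Hypothetical helper: keys may be written with or without a leading '--'.
    A value of None or '' is treated as a bare flag (an assumption here).
    """
    command = [executable]
    for key, value in (options or {}).items():
        flag = key if key.startswith("--") else "--" + key
        command.append(flag)
        if value not in (None, ""):
            command.append(str(value))
    command.extend([source, output_path])
    return command


cmd = build_command(
    "wkhtmltopdf",
    {"page-size": "A4", "--margin-top": "10mm", "quiet": ""},
    "test.html",
    "out.pdf",
)
print(cmd)
```

Passing the command as a list rather than a single string lets subprocess run it without a shell, so file names containing spaces need no extra quoting.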
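Python's subprocess module
Since Python 2.4, the subprocess module can be used to spawn child processes, connect to their standard input/output/error, and obtain their return codes.
subprocess is intended to replace several older modules and functions, such as os.system, os.spawn*, os.popen*, popen2.* and commands.*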
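The three capabilities named above (feeding standard input, reading standard output and error, and getting the return code) can be seen together in one short example. The child program here is a throwaway Python snippet invented for the demonstration.

```python
import subprocess
import sys

# A small child program: echo stdin in upper case, warn on stderr, exit with 2.
child_code = (
    "import sys\n"
    "sys.stdout.write(sys.stdin.read().upper())\n"
    "sys.stderr.write('warning from child')\n"
    "sys.exit(2)\n"
)

process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
# communicate() sends the input, waits for exit, and collects both streams.
out, err = process.communicate("hello")

print("stdout:", out)
print("stderr:", err)
print("return code:", process.returncode)
```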