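pdfkit is a tool that converts HTML+CSS documents to PDF.
pdfkit is in fact a Python wrapper around wkhtmltopdf, a toolkit for converting HTML to PDF, so wkhtmltopdf has to be installed first. It usually has to be installed manually: download the build for your operating system from https://wkhtmltopdf.org/downloads.html. PS: note the installation directory, because it is needed below.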
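Since the installation directory matters later, a small helper can look the executable up instead of hard-coding the path. The following is only a sketch and not part of pdfkit: `find_wkhtmltopdf` and its `extra_candidates` parameter are hypothetical names, and it assumes the executable is called `wkhtmltopdf`.

```python
import os
import shutil


def find_wkhtmltopdf(extra_candidates=()):
    """Return a path to the wkhtmltopdf executable, or None if it is not found.

    Hypothetical helper: pdfkit does not ship this function.
    """
    # Look on PATH first; on Windows, shutil.which also tries extensions such as .exe.
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Fall back to explicit locations supplied by the caller, in order.
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

The returned path, if any, can then be passed as the `wkhtmltopdf` argument of `pdfkit.configuration(...)`.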
pdfkit itself is a third-party module, so it also has to be installed; pip is enough.
pip install pdfkit
Example
pdfkit can generate a PDF from a web page, an HTML file, or a string.
import pdfkit
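confg = pdfkit.configuration(wkhtmltopdf=r'C:\Python35\wkhtmltopdf.exe')  # Path to the wkhtmltopdf executable (raw string so the backslashes are taken literally); this is why the installation directory was worth noting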
url = 'https://blog.csdn.net/fenglepeng/article/details/103670893'
pdfkit.from_url(url, 'aaa.pdf', configuration=confg)
# from_url fetches the content from a URL.
# Three arguments are used here: the URL, the output file name, and the configuration holding the wkhtmltopdf path.
pdfkit.from_file('my.html', 'bbb.pdf', configuration=confg)
# from_file reads the content from a file.
# Three arguments are used here: the HTML file, the name of the generated PDF, and the configuration holding the wkhtmltopdf path.
html = '''
<div>
<h1>title</h1>
<p>content</p>
</div>
'''
pdfkit.from_string(html, 'ccc.pdf', configuration=confg)
# from_string reads the content from a string.
# Three arguments are used here: the string, the name of the generated PDF, and the configuration holding the wkhtmltopdf path.
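The three calls above differ only in where the HTML comes from, so they can be wrapped in one helper. This is a rough sketch under assumptions: `pick_converter` and `to_pdf` are hypothetical names that do not exist in pdfkit, the source is treated as a URL only if it starts with `http://` or `https://`, and `to_pdf` needs pdfkit and wkhtmltopdf to be installed.

```python
import os


def pick_converter(source):
    """Return the name of the pdfkit function that suits this source (hypothetical helper)."""
    if source.startswith(("http://", "https://")):
        return "from_url"
    if os.path.isfile(source):
        return "from_file"
    # Anything else is treated as raw HTML text.
    return "from_string"


def to_pdf(source, output_path, configuration=None):
    """Convert a URL, an HTML file, or an HTML string to a PDF file."""
    import pdfkit  # imported here so pick_converter works even without pdfkit installed

    convert = getattr(pdfkit, pick_converter(source))
    return convert(source, output_path, configuration=configuration)
```

For instance, `to_pdf('my.html', 'bbb.pdf', configuration=confg)` would end up calling `pdfkit.from_file`, provided `my.html` exists.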
API
def from_url(url, output_path, options=None, toc=None, cover=None, configuration=None, cover_first=False):
    """
    Convert a URL or URLs to a PDF document
    :param url: URL or list of URLs
    :param output_path: path to the output PDF file. If False, the file is returned as a string instead of being written to disk.
    :param options: (optional) dict with wkhtmltopdf global and page options, with or w/o '--'
    :param toc: (optional) dict with toc-specific wkhtmltopdf options, with or w/o '--'
    :param cover: (optional) string with url/filename with a cover html page
    :param configuration: (optional) instance of pdfkit.configuration.Configuration()
    :param cover_first: (optional) if True, cover always precedes TOC
    Returns: True on success
    """
def from_file(input, output_path, options=None, toc=None, cover=None, css=None, configuration=None, cover_first=False):
    """
    Convert HTML file or files to PDF document
    :param input: path to HTML file or list with paths or file-like object
    :param output_path: path to output PDF file. False means file will be returned as string.
    :param options: (optional) dict with wkhtmltopdf options, with or w/o '--'
    :param toc: (optional) dict with toc-specific wkhtmltopdf options, with or w/o '--'
    :param cover: (optional) string with url/filename with a cover html page
    :param css: (optional) string with path to css file which will be added to a single input file
    :param configuration: (optional) instance of pdfkit.configuration.Configuration()
    :param cover_first: (optional) if True, cover always precedes TOC
    Returns: True on success
    """
def from_string(input, output_path, options=None, toc=None, cover=None, css=None, configuration=None, cover_first=False):
    """
    Convert given string or strings to PDF document
    :param input: string with the desired text. Can be raw text or the contents of an HTML file
    :param output_path: path to output PDF file. False means file will be returned as string.
    :param options: (optional) dict with wkhtmltopdf options, with or w/o '--'
    :param toc: (optional) dict with toc-specific wkhtmltopdf options, with or w/o '--'
    :param cover: (optional) string with url/filename with a cover html page
    :param css: (optional) string with path to css file which will be added to an input string
    :param configuration: (optional) instance of pdfkit.configuration.Configuration()
    :param cover_first: (optional) if True, cover always precedes TOC
    Returns: True on success
    """
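The `options` and `output_path` parameters described above can be combined, for example to get the PDF in memory with a given page size. This is a sketch under assumptions: `build_options` and `render_to_bytes` are hypothetical names, the option keys are wkhtmltopdf command-line options written without the leading `--`, and `render_to_bytes` needs pdfkit and wkhtmltopdf to be installed.

```python
def build_options(page_size="A4", margin="0.75in", encoding="UTF-8"):
    """Build an options dict for pdfkit (hypothetical helper, not part of pdfkit)."""
    return {
        "page-size": page_size,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": encoding,
    }


def render_to_bytes(html, wkhtmltopdf_path=None):
    """Render an HTML string and return the PDF content instead of writing a file."""
    import pdfkit  # imported here so build_options works even without pdfkit installed

    # Only build a configuration when an explicit executable path is given.
    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path) if wkhtmltopdf_path else None
    # Passing False as output_path makes pdfkit return the PDF content.
    return pdfkit.from_string(html, False, options=build_options(), configuration=config)
```

Keeping the options in a separate function makes it easy to reuse the same page setup across `from_url`, `from_file`, and `from_string`.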