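Converting a PDF to images with Python
Use the Python library pdfplumber, which is built on pdfminer.six.
Converting a PDF to images with pdfplumber is quick and simple. pdfplumber also provides visual debugging support for PDF content extraction, as shown in the figure above.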
import pdfplumber

pdf = pdfplumber.open("ccf-2019.pdf")
for i, page in enumerate(pdf.pages):
    page.to_image(resolution=150).save('{}.png'.format(i))
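The loop above can also be wrapped in a small reusable function. The following is a minimal sketch, not part of pdfplumber itself: `convert_pdf_to_images` and `image_name` are hypothetical helper names, and the zero-padded, 1-based file names are an assumption chosen so that the output sorts in page order. It relies only on `pdfplumber.open`, `pdf.pages`, and `page.to_image(...).save(...)`, the same calls used in the snippet above.

```python
from pathlib import Path


def image_name(index, total, prefix="page"):
    """Return a zero-padded PNG file name for the 1-based page number `index`."""
    width = len(str(total))  # enough digits for the last page number
    return "{}-{:0{}d}.png".format(prefix, index, width)


def convert_pdf_to_images(pdf_path, out_dir=".", resolution=150):
    """Render every page of `pdf_path` to a PNG and return the written paths."""
    import pdfplumber  # imported lazily; requires `pip install pdfplumber`

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # The context manager closes the underlying file when the block exits.
    with pdfplumber.open(pdf_path) as pdf:
        total = len(pdf.pages)
        for number, page in enumerate(pdf.pages, start=1):
            target = out_dir / image_name(number, total)
            page.to_image(resolution=resolution).save(str(target))
            written.append(target)
    return written
```

Calling `convert_pdf_to_images("ccf-2019.pdf", out_dir="images")` would, under these assumptions, write one PNG per page into the `images` directory.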
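Converting a PDF to images on Linux
On Linux, the pdftoppm command-line tool makes it easy to convert a PDF to images. pdftoppm is part of the poppler-utils package.
Installation: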
sudo apt install poppler-utils
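Usage (the trailing demo is the output file prefix; without a prefix, pdftoppm writes the image data to standard output):
pdftoppm -png demo.pdf demo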
pdftoppm provides many options, such as cropping the image, scaling, setting the resolution, and selecting the range of pages to render.
Usage: pdftoppm [options] [PDF-file [PPM-file-prefix]]
  -f : first page to print
  -l : last page to print
  -o : print only odd pages
  -e : print only even pages
  -singlefile : write only the first page and do not add digits
  -r : resolution, in DPI (default is 150)
  -rx : X resolution, in DPI (default is 150)
  -ry : Y resolution, in DPI (default is 150)
  -scale-to : scales each page to fit within scale-to*scale-to pixel box
  -scale-to-x : scales each page horizontally to fit in scale-to-x pixels
  -scale-to-y : scales each page vertically to fit in scale-to-y pixels
  -x : x-coordinate of the crop area top left corner
  -y : y-coordinate of the crop area top left corner
  -W : width of crop area in pixels (default is 0)
  -H : height of crop area in pixels (default is 0)
  -sz : size of crop square in pixels (sets W and H)
  -cropbox : use the crop box rather than media box
  -mono : generate a monochrome PBM file
  -gray : generate a grayscale PGM file
  -png : generate a PNG file
  -jpeg : generate a JPEG file
  -jpegopt : jpeg options, with format <opt1>=<val1>[,<optN>=<valN>]*
  -tiff : generate a TIFF file
  -tiffcompression : set TIFF compression: none, packbits, jpeg, lzw, deflate
  -freetype : enable FreeType font rasterizer: yes, no
  -thinlinemode : set thin line mode: none, solid, shape. Default: none
  -aa : enable font anti-aliasing: yes, no
  -aaVector : enable vector anti-aliasing: yes, no
  -opw : owner password (for encrypted files)
  -upw : user password (for encrypted files)
  -q : don't print any messages or errors
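To call pdftoppm from a Python script, the options listed above can be assembled into an argument list and passed to `subprocess.run`. This is a minimal sketch under stated assumptions: `build_pdftoppm_command` and `run_pdftoppm` are hypothetical helper names, only the `-png`, `-jpeg`, `-tiff`, `-r`, `-f`, and `-l` flags from the usage text are used, and pdftoppm is assumed to be installed and on the PATH.

```python
import subprocess


def build_pdftoppm_command(pdf_path, prefix, fmt="png", dpi=150,
                           first_page=None, last_page=None):
    """Build the pdftoppm argument list from a few of its documented options."""
    if fmt not in ("png", "jpeg", "tiff"):
        raise ValueError("unsupported format: {}".format(fmt))
    command = ["pdftoppm", "-" + fmt, "-r", str(dpi)]
    if first_page is not None:
        command += ["-f", str(first_page)]  # first page to render
    if last_page is not None:
        command += ["-l", str(last_page)]   # last page to render
    # The input file and the output file prefix come last.
    command += [str(pdf_path), str(prefix)]
    return command


def run_pdftoppm(pdf_path, prefix, **options):
    """Run pdftoppm and raise CalledProcessError on a non-zero exit status."""
    command = build_pdftoppm_command(pdf_path, prefix, **options)
    subprocess.run(command, check=True)
```

For example, `run_pdftoppm("demo.pdf", "demo", first_page=1, last_page=3)` would render the first three pages to PNG files whose names start with `demo`. Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.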