Download from the official site: https://pdfbox.apache.org/download.html
Download pdfbox-app-3.0.3.jar
cd D:\pdfbox
Run the jar with no arguments to list the available commands:
java -jar pdfbox-app-3.0.3.jar
Usage: pdfbox [COMMAND] [OPTIONS]
Commands:
debug Analyzes and inspects the internal structure of a PDF document
decrypt Decrypts a PDF document
encrypt Encrypts a PDF document
decode Writes a PDF document with all streams decoded
export:images Extracts the images from a PDF document
export:xmp Extracts the xmp stream from a PDF document
export:text Extracts the text from a PDF document
export:fdf Exports AcroForm form data to FDF
export:xfdf Exports AcroForm form data to XFDF
import:fdf Imports AcroForm form data from FDF
import:xfdf Imports AcroForm form data from XFDF
overlay Adds an overlay to a PDF document
print Prints a PDF document
render Converts a PDF document to image(s)
merge Merges multiple PDF documents into one
split Splits a PDF document into number of new documents
fromimage Creates a PDF document from images
fromtext Creates a PDF document from text
version Gets the version of PDFBox
help Display help information about the specified command.
See 'pdfbox help <command>' to read about a specific subcommand
Run java -jar pdfbox-app-3.0.3.jar debug
# Export the image of each page from a scanned PDF file
java -jar pdfbox-app-3.0.3.jar export:images -prefix=test -i your_book.pdf
Output:
Writing image: test-1.jpg
Writing image: test-2.jpg
Writing image: test-3.png
……
# Combine multiple images into one PDF with fromimage
java -jar pdfbox-app-3.0.3.jar fromimage -o=book1.pdf -i=test-1.jpg -i=test-2.jpg -i=test-3.png -i=test-4.jpg
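The generated book1.pdf looks poor, and the command-line length limit caps the number of image files that can be passed (a scanned book usually has several hundred pages).
In the end this has to be handled by writing your own program.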
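As a possible starting point for such a program, here is a minimal Java sketch that uses only the standard library. It covers the file-count problem only: it sorts the exported page images by page number (so that test-10.jpg comes after test-2.jpg) and splits them into several shorter fromimage commands. The class name PageImages, the helper names, the part-N.pdf output names and the batch size are hypothetical choices for illustration. The -o= and -i= flags are copied from the fromimage command above. It does nothing about the visual quality of the result.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PageImages {
    // Matches names such as "test-12.jpg": anything, a dash, the page number, an extension.
    private static final Pattern PAGE = Pattern.compile(".*-(\\d{1,9})\\.\\w+");

    /** Returns the page number embedded in the file name, or -1 if there is none. */
    public static int pageNumber(String fileName) {
        Matcher m = PAGE.matcher(fileName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Keeps only names that carry a page number and sorts them numerically by page. */
    public static List<String> sortByPage(List<String> fileNames) {
        List<String> pages = new ArrayList<>();
        for (String name : fileNames) {
            if (pageNumber(name) >= 0) {
                pages.add(name);
            }
        }
        pages.sort(Comparator.comparingInt(PageImages::pageNumber));
        return pages;
    }

    /**
     * Builds one fromimage command per batch of at most batchSize pages.
     * Output names part-1.pdf, part-2.pdf, ... are an assumption of this sketch.
     */
    public static List<List<String>> fromImageCommands(List<String> pages, int batchSize, String jar) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> commands = new ArrayList<>();
        for (int start = 0; start < pages.size(); start += batchSize) {
            int end = Math.min(start + batchSize, pages.size());
            List<String> cmd = new ArrayList<>(List.of(
                    "java", "-jar", jar, "fromimage",
                    "-o=part-" + (commands.size() + 1) + ".pdf"));
            for (String page : pages.subList(start, end)) {
                cmd.add("-i=" + page);
            }
            commands.add(cmd);
        }
        return commands;
    }

    public static void main(String[] args) {
        // Sample names in the style printed by export:images above.
        List<String> names = List.of("test-10.jpg", "test-2.jpg", "test-1.jpg", "test-3.png");
        List<String> pages = sortByPage(names);
        for (List<String> cmd : fromImageCommands(pages, 2, "pdfbox-app-3.0.3.jar")) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

Each printed line is a command short enough to run on its own, and the resulting part PDFs could then be combined with the merge command from the list above. In a real run the file names would come from a directory listing rather than a hard-coded list.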