[Record] Trying to use xpdf to convert a non-copyable PDF to text or HTML

Latest recommended article published on 2023-02-16 14:21:15

weixin_39541750

Tags: converting PDF to HTML with xpdf

【Background】

Tinkering:

Along the way, I tried using xpdf to convert a PDF file whose text cannot be copied into text or HTML.

【Process】

1. Reference:

Go to:

->

->

->

->

->

Then, in the directory:

D:\tmp\dev_tools\python\pdf\xpdfbin-win-3.03\xpdfbin-win-3.03\bin64

run:

pdftotext.exe

The result: the document is still protected and its text cannot be copied:

D:\tmp\dev_tools\python\pdf\xpdfbin-win-3.03\xpdfbin-win-3.03\bin64>pdftotext.exe

pdftotext version 3.03
Copyright 1996-2011 Glyph & Cog, LLC
Usage: pdftotext [options] <PDF-file> [<text-file>]
  -f <int>          : first page to convert
  -l <int>          : last page to convert
  -layout           : maintain original physical layout
  -fixed <fp>       : assume fixed-pitch (or tabular) text
  -raw              : keep strings in content stream order
  -htmlmeta         : generate a simple HTML file, including the meta information
  -enc <string>     : output text encoding name
  -eol <string>     : output end-of-line convention (unix, dos, or mac)
  -nopgbrk          : don't insert page breaks between pages
  -opw <string>     : owner password (for encrypted files)
  -upw <string>     : user password (for encrypted files)
  -q                : don't print any messages or errors
  -cfg <string>     : configuration file to use in place of .xpdfrc
  -v                : print copyright and version info
  -h                : print usage information
  -help             : print usage information
  --help            : print usage information
  -?                : print usage information

D:\tmp\dev_tools\python\pdf\xpdfbin-win-3.03\xpdfbin-win-3.03\bin64>pdftotext.exe -htmlmeta D:\tmp\tmp_dev_root\python\answer_question\self\pdf_table_to_xml\pdf\spec183r21.0.pdf hart183.html

Permission Error: Copying of text from this document is not allowed.
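For batch runs, the same invocation can be scripted so that the permission failure is detected instead of being read off the console. The following is a minimal sketch in Python (3.7+), not part of the original experiment: the helper names `build_pdftotext_cmd`, `is_permission_error`, and `run_pdftotext` are hypothetical, and it assumes the "Permission Error" message appears on stdout or stderr. It only uses the `-htmlmeta` and `-opw` options listed in the usage text above; whether supplying an owner password lifts the restriction was not tested here.

```python
import subprocess

# Assumption: the failure is reported with this prefix, as in the run above.
PERMISSION_MARKER = "Permission Error"


def build_pdftotext_cmd(exe, pdf_path, out_path, htmlmeta=False, owner_password=None):
    """Assemble a pdftotext command line from options listed in its usage text."""
    cmd = [exe]
    if htmlmeta:
        cmd.append("-htmlmeta")            # simple HTML output with meta information
    if owner_password is not None:
        cmd += ["-opw", owner_password]    # owner password for encrypted files
    cmd += [pdf_path, out_path]
    return cmd


def is_permission_error(output_text):
    """Return True if the captured output contains the permission message."""
    return PERMISSION_MARKER in output_text


def run_pdftotext(cmd):
    """Run pdftotext and return (exit code, combined output, blocked flag)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    combined = proc.stdout + proc.stderr   # check both streams, to be safe
    return proc.returncode, combined, is_permission_error(combined)
```

A call such as `run_pdftotext(build_pdftotext_cmd("pdftotext.exe", "spec183r21.0.pdf", "hart183.html", htmlmeta=True))` would then report whether the document was blocked, assuming `pdftotext.exe` is on the PATH or given by its full path.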

2. So I tried to resolve the problem above:

but did not manage to solve it...

【Summary】

For now, xpdf still cannot convert this PDF into the desired HTML.
