PDF to HTML in Practice: Building pdf2htmlEX on Ubuntu

Official build instructions:


Building
Kostas edited this page on 29 Nov 2018 · 59 revisions

    - 欲速则不达。
    - Haste makes waste.

Environment

pdf2htmlEX can be built in a Unix-like environment:

    GNU/Linux (for Ubuntu 12.10+, Fedora)
    macOS
    Windows/Cygwin
    Windows/Mingw-w64
    Windows/MinGW, with some modifications to pdf2htmlEX. See pdf2htmlEX on TeX Wiki (in Japanese), special thanks to Haruhiko Okumura.

macOS

An easy way to install pdf2htmlEX on macOS is to use brew. After installing brew, open a terminal:

brew install pdf2htmlEX

Windows

An easy way to build pdf2htmlEX on Windows is MSYS2 + mingw-w64; see Build-pdf2htmlEX-on-Windows (in Chinese).

Dependencies

Fedora

sudo yum install  cmake gcc gnu-getopt java-1.8.0-openjdk libpng-devel fontforge-devel cairo-devel poppler-devel libspiro-devel freetype-devel  poppler-data libjpeg-turbo-devel git make gcc-c++

Manual

    CMake, pkg-config
    GNU Getopt
    C++ Compiler that supports C++11, for example
        GCC >= 4.6.3
        A recent version of Clang
    libspiro
    poppler >= 0.25.0 with xpdf headers (compile with --enable-xpdf-headers)
        Install libpng (and headers) BEFORE you compile poppler if you want PNG background images generated
        Install libjpeg (and headers) BEFORE you compile poppler if you want JPG background images generated
        Install poppler-data if you want CJK support
    fontforge (with header files)
        A recent version or my fork, use the pdf2htmlEX branch, which is a modified version of the 20140101 release.
        Older versions may or may not work.
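
Before running cmake, it can help to confirm that the command-line build tools are actually on PATH. The following is a minimal sketch, not part of pdf2htmlEX: the helper name missing_tools is hypothetical, and it only checks for executables, not for library headers or versions.

```shell
#!/bin/sh
# Print each named tool that cannot be found on PATH, one per line.
# missing_tools is an illustrative helper, not part of pdf2htmlEX.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Build tools used by the build steps on this page.
missing="$(missing_tools cmake pkg-config git make)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found."
fi
```

Anything it reports should be installed through the distribution's package manager before continuing.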

Optional

    To generate SVG background images and process Type 3 fonts
        cairo >= 1.10.0 with SVG support
        FreeType
        Add -DENABLE_SVG=OFF to cmake to disable it.
    To add hinting information for TTF fonts
        ttfautohint
        Run pdf2htmlEX with --external-hint-tool=ttfautohint
    To optimize CSS and JavaScript code with YUI Compressor and closure-compiler
        java >= 6

Compiling

git clone https://github.com/coolwanglu/pdf2htmlEX.git
cd pdf2htmlEX
cmake . && make && sudo make install

Stable releases can be found at https://github.com/coolwanglu/pdf2htmlEX/releases.

In order to create the debug version, add -DCMAKE_BUILD_TYPE=Debug to cmake.

Troubleshooting

If you installed poppler or fontforge into a place other than /usr (If you install them from source code, they are installed to /usr/local by default), you need to set up environment variables for pkg-config, and maybe also INCLUDE_PATH, LIBRARY_PATH and LD_LIBRARY_PATH because some GNU/Linux distributions do not set them up for you (e.g. Fedora). If you are not sure about this, just install those libraries to /usr by passing --prefix=/usr to configure.
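
As a sketch of the environment setup described above, assuming poppler and fontforge were installed under /usr/local (the PREFIX value is an assumption; adjust it to the real install location). GCC reads CPATH for extra include directories, so it is used here for the header search path.

```shell
#!/bin/sh
# Assumed install prefix of the source-built libraries.
PREFIX=/usr/local

# Prepend the prefix to each search path, keeping any existing value.
# ${VAR:+:$VAR} expands to ":<old value>" only when VAR is already set and non-empty.
export PKG_CONFIG_PATH="$PREFIX/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export LIBRARY_PATH="$PREFIX/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export CPATH="$PREFIX/include${CPATH:+:$CPATH}"
```

With these set in the current shell, pkg-config --modversion poppler should print the version of the source-built poppler if its .pc file is being found.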

If you see error messages about:

    goo/GooString.h, read the dependencies again, poppler should be compiled with --enable-xpdf-headers
    spiroentrypoints.h, install header files of libspiro
    undefined reference of Py_xxx, install header files of python-2.x
    libintl.h, install gettext and set your system include path accordingly.
    glib.h: No such file or directory, install the development header files of glib-2.0, and make sure that the location of glib.h is in INCLUDE_PATH.
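
The checklist above can be turned into a small lookup, for example to scan a saved build log. This is only a sketch: suggest_fix is a hypothetical helper, build.log is an assumed file name, and the hints simply restate the list above.

```shell
#!/bin/sh
# suggest_fix LINE: print a hint for one line of build output, based on the
# troubleshooting list above. Prints nothing for unrecognized lines.
suggest_fix() {
    case "$1" in
        *goo/GooString.h*)           echo "rebuild poppler with --enable-xpdf-headers" ;;
        *spiroentrypoints.h*)        echo "install the libspiro header files" ;;
        *"undefined reference"*Py_*) echo "install the python-2.x header files" ;;
        *libintl.h*)                 echo "install gettext and check the include path" ;;
        *glib.h*)                    echo "install the glib-2.0 development headers" ;;
    esac
}

# Scan a build log if one exists (build.log is an assumed file name),
# printing each distinct hint once.
if [ -f build.log ]; then
    while IFS= read -r line; do
        suggest_fix "$line"
    done < build.log | sort -u
fi
```

A log can be captured with, for example, make 2>&1 | tee build.log.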

Hacker's magical scripts

These scripts have been reported to be useful in various situations. They have NOT been tested by pdf2htmlEX's authors, use at your own risk! If you want to contribute, please create your own gist somewhere, and post a link and description here.

    rajeevkannav/pdf2htmlEX.sh install on Ubuntu 15.04

Hands-on steps:

  1. Clone the project

git clone https://github.com/coolwanglu/pdf2htmlEX.git
cd pdf2htmlEX
# Configure, build, and install
cmake . && make && sudo make install
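  2. cmake fails because it cannot find two libraries, poppler and libfontforge.
    poppler is the easy one: download the source from the official site, then build and install it:

wget https://poppler.freedesktop.org/poppler-0.33.0.tar.xz
tar -xf poppler-0.33.0.tar.xz
cd poppler-0.33.0/
# --enable-xpdf-headers is required by pdf2htmlEX (see the dependency list above)
./configure --enable-xpdf-headers --enable-poppler-glib
make
sudo make install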

  3. Install the fontforge library

git clone https://github.com/coolwanglu/fontforge.git fontforge.git
cd fontforge.git
git checkout pdf2htmlEX
./autogen.sh
./configure
make
sudo make install

# Note: none of the following works.
sudo apt-get install lfontforge
sudo apt-get install libfontforge1
sudo apt-get install libfontforge-dev

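  4. Switch to the pdf2htmlEX directory and run cmake .

  5. Then run make && sudo make install

The build kept failing, and a long search on Baidu turned up nothing.

I could not tell whether it was a version problem or a system problem.

In the end I opened the source and commented out the lines that triggered the errors, and the build finally passed.

  6. Trying it out

The results are really good. No complaints: it is a powerful tool!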

One-click install script linked from the official documentation:

HOME_PATH=$(cd ~/ && pwd)
LINUX_ARCH="$(lscpu | grep 'Architecture' | awk -F\: '{ print $2 }' | tr -d ' ')"
POPPLER_SOURCE="http://poppler.freedesktop.org/poppler-0.33.0.tar.xz"
FONTFORGE_SOURCE="https://github.com/fontforge/fontforge.git"
PDF2HTMLEX_SOURCE="https://github.com/coolwanglu/pdf2htmlEX.git"

echo "Updating all Ubuntu software repository lists ..."
apt-get update
echo "Installing basic dependencies ..."
apt-get install -qq -y cmake gcc libgetopt++-dev git
echo "Installing Poppler ..."
apt-get install -qq -y pkg-config libopenjpeg-dev libfontconfig1-dev libfontforge-dev poppler-data poppler-utils poppler-dbg

echo "Downloading poppler via source ..."
wget "$POPPLER_SOURCE"
tar -xvf poppler-0.33.0.tar.xz
cd poppler-0.33.0/
./configure --enable-xpdf-headers
make
make install

echo "Installing fontforge ..."
cd "$HOME_PATH"
apt-get install -qq -y packaging-dev pkg-config python-dev libpango1.0-dev libglib2.0-dev libxml2-dev giflib-dbg
apt-get install -qq -y libjpeg-dev libtiff-dev uthash-dev libspiro-dev
echo "cloning fontforge via source ..."
git clone --depth 1 "$FONTFORGE_SOURCE"
cd fontforge/
./bootstrap
./configure
make
sudo make install

echo "Installing Pdf2htmlEx ..."
cd "$HOME_PATH"
git clone --depth 1 "$PDF2HTMLEX_SOURCE"
cd pdf2htmlEX/
cmake .
make
sudo make install

echo 'export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

cd "$HOME_PATH" && rm -rf "poppler-0.33.0.tar.xz"
cd "$HOME_PATH" && rm -rf "poppler-0.33.0"
cd "$HOME_PATH" && rm -rf "fontforge"
cd "$HOME_PATH" && rm -rf "pdf2htmlEX"

The hassle-free way: Docker

docker pull bwits/pdf2htmlex-alpine
alias pdf2htmlEX="docker run -ti --rm -v ~/pdf:/pdf bwits/pdf2htmlex-alpine pdf2htmlEX"
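
The alias above always mounts ~/pdf. As a rough sketch of an alternative, a shell function can mount whichever directory holds the input file. The function name pdf2html_docker and the DOCKER override variable are hypothetical, and the sketch assumes the image uses /pdf as its working directory, as the alias implies.

```shell
#!/bin/sh
# pdf2html_docker FILE [OPTIONS...]: convert FILE with the Docker image by
# mounting the directory that contains it at /pdf.
# Assumption: the image's working directory is /pdf, as the alias suggests.
# DOCKER can be set to another command (e.g. echo) for a dry run.
pdf2html_docker() {
    if [ "$#" -lt 1 ]; then
        echo "usage: pdf2html_docker FILE [OPTIONS...]" >&2
        return 2
    fi
    file="$1"
    shift
    # Resolve the directory of the input file to an absolute path.
    dir="$(cd "$(dirname "$file")" && pwd)" || return 1
    ${DOCKER:-docker} run --rm -v "$dir:/pdf" bwits/pdf2htmlex-alpine \
        pdf2htmlEX "$@" "$(basename "$file")"
}

# Usage (output lands next to the input file):
#   pdf2html_docker ~/pdf/test.pdf --zoom 1.3
```

A function is used instead of an alias because aliases are not expanded in non-interactive shells by default, so a function also works inside scripts.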

Command usage:


This page lists some common recipes for pdf2htmlEX.

First things first

    It is highly recommended that you install ttfautohint and always add --external-hint-tool=ttfautohint to each of the following recipes. This tool enhances font rendering for all browsers on Windows.
    Double check you have poppler-data installed, for CJK characters.
    Double check you have run sudo make install, or pdf2htmlEX may not be executed correctly

The simplest case

Suppose you have a PDF file pdf/test.pdf, simply running

pdf2htmlEX --zoom 1.3 pdf/test.pdf

would produce a single HTML file test.html in the current directory.

Advanced

pdf2htmlEX -f 3 -l 5 --fit-width 1024 --bg-format jpg pdf/test.pdf

would convert only the 3rd, 4th and 5th pages, and fit the page width to 1024 pixels. Background images will be generated in the JPEG format.

For publishers

pdf2htmlEX --embed cfijo --dest-dir out pdf/test.pdf

would produce test.html and accompanying files in the out directory. This way all the resources (fonts, images, CSS and JavaScript) are stored in separate files, so that the viewer can take better advantage of browser caches.

For advanced publishers

pdf2htmlEX --embed cfijo --split-pages 1 --dest-dir out --page-filename test-%d.page pdf/test.pdf

would do something similar to the above, but each individual page is stored in a separate file. The files are named test-0.page, test-1.page and so on, as specified on the command line. There is still a test.html, which loads the pages dynamically through AJAX. This gives publishers full control: they can organize the pages as they like, for example to implement lazy page loading.

The Ultimate Hand

pdf2htmlEX --fallback 1 pdf/test.pdf

would also produce a single test.html, which, however, consists of images and hidden text. This mode provides maximum accuracy and compatibility, at the cost of larger file size. Use this mode only when pdf2htmlEX cannot correctly process your files otherwise.

More

Just remember man pdf2htmlEX and pdf2htmlEX --help are always your best friends.

Related documents:

pdf2htmlEx study notes: building on Ubuntu (tounicode cmap is not valid and got dropped for fo), CSDN blog

Building · coolwanglu/pdf2htmlEX Wiki · GitHub

https://gist.github.com/rajeevkannav/d07f822e209a22d07176

Building pdf2htmlEX on Linux: notes on the various pdf2html builds, CSDN blog

https://stackoverflow.com/questions/68757803/pdf2htmlex-cairofontengine-cc-error-during-dobuild
