This repository has been archived by the owner on Dec 9, 2018. It is now read-only.
Lu Wang edited this page Sep 29, 2013 · 70 revisions

General

Languages and libraries used in pdf2htmlEX

  • C++ (most of the code)
  • C (wrapper around Fontforge)
  • HTML (output)
  • CSS (complicated enough to be considered as a language)
  • JavaScript (UI actions/effects)
  • Python (scripts for testing / packaging)
  • Poppler (PDF parsing)
  • Fontforge (font manipulation)
  • jQuery (for the default UI)

pdf2htmlEX doesn't work for my pdf file

  • Bug reports are always welcome. Please file an issue with a link to the broken PDF file.
  • However, in some cases a bug cannot be fixed in time (or at all):
    • The file does not follow the PDF standard (it might still be displayed correctly in PDF viewers).
    • Something is wrong with the libraries used by pdf2htmlEX (poppler / fontforge).
    • pdf2htmlEX has a few technical limitations. See this page

I want more features!

  • Create a patch, or hire someone to do so.
  • Best hackers do not work for free.
  • But great ideas are more valuable than money.

'Cannot open the manifest file'

  • Run 'sudo make install' or 'make install', depending on your environment.

The generated HTML file freezes my Firefox

  • Don't zoom in too much
  • Use a smaller value for --font-size-multiplier

The generated HTML file looks awful

  • Check if your browser meets the requirements.

The generated HTML file is too large

  • Files embedded in the HTML are encoded in Base64, which makes them about 1/3 larger.
  • PDF has built-in compression support, but HTML has no such feature. Fortunately, most HTTP servers support compression (gzip/deflate). To estimate the actual network transfer cost, compress the HTML file with gzip; the result is usually smaller than the PDF.
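
The Base64 overhead and the effect of gzip can be checked with a few lines of Python. This is a minimal sketch using only the standard library; `size_report` is a hypothetical helper, and random bytes stand in for an embedded font or image that is already compressed. Real HTML output will compress differently, so treat the numbers as illustrative only.

```python
import base64
import gzip
import os


def size_report(payload: bytes) -> dict:
    """Return the raw, Base64, and gzipped-Base64 sizes in bytes."""
    encoded = base64.b64encode(payload)
    return {
        "raw": len(payload),
        "base64": len(encoded),
        "base64_gzip": len(gzip.compress(encoded)),
    }


# Random bytes are an assumption: they mimic data that cannot be
# compressed further, such as an embedded font or PNG image.
report = size_report(os.urandom(30_000))
print(report)  # "base64" is 40000, i.e. 1/3 larger than "raw"
```

For real output, running gzip on the generated HTML file and comparing the result with the size of the original PDF gives the figure that matters for network transfer.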

Text and Font

Text is correct but not readable

  • Install ttfautohint and run pdf2htmlEX with --external-hint-tool=ttfautohint
  • Try --auto-hint 1 with care; it is currently experimental.

I got incorrect text after copy & paste

  • Try running with --tounicode 1
  • Make sure you CAN copy & paste with a PDF viewer
    • If you cannot, neither can pdf2htmlEX
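
The text-related options above can be combined in one invocation. The following is a rough sketch of a script that assembles such a command line; `build_command` is a hypothetical helper, `input.pdf` is a placeholder file name, and the script assumes pdf2htmlEX and ttfautohint are installed and on the PATH. It only prints the command, it does not run it.

```python
import shlex


def build_command(pdf_path, tounicode=None, external_hint_tool=None):
    """Assemble a pdf2htmlEX argument list (hypothetical helper)."""
    args = ["pdf2htmlEX"]
    if external_hint_tool:
        # Flag spelling taken from this FAQ.
        args.append(f"--external-hint-tool={external_hint_tool}")
    if tounicode is not None:
        args += ["--tounicode", str(tounicode)]
    args.append(pdf_path)
    return args


cmd = build_command("input.pdf", tounicode=1, external_hint_tool="ttfautohint")
print(shlex.join(cmd))
# pdf2htmlEX --external-hint-tool=ttfautohint --tounicode 1 input.pdf
# To actually run it: subprocess.run(cmd, check=True)
```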

Image

There is no image generated

  • Make sure you did not specify --process-nontext 0
  • Make sure libpng (and its headers) was installed BEFORE poppler was compiled.

Generated text is too small to read

  • Try running with --zoom 2

Images are blurred

  • Try running with --hdpi 288 --vdpi 288
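
To see why a higher DPI gives sharper images, recall that PDF measures pages in points (1 point = 1/72 inch), so the pixel size of a rendered page grows linearly with the DPI. The sketch below shows the arithmetic for a US Letter page (612 × 792 points). `raster_size` is a hypothetical helper, the 144 DPI figure is only a comparison value, and the exact dimensions pdf2htmlEX produces may differ.

```python
POINTS_PER_INCH = 72  # PDF user-space unit: 1 point = 1/72 inch


def raster_size(width_pt, height_pt, hdpi, vdpi):
    """Pixel dimensions of a page rendered at the given resolution."""
    return (
        round(width_pt * hdpi / POINTS_PER_INCH),
        round(height_pt * vdpi / POINTS_PER_INCH),
    )


print(raster_size(612, 792, 144, 144))  # (1224, 1584)
print(raster_size(612, 792, 288, 288))  # (2448, 3168)
```

Doubling both values quadruples the pixel count, so expect larger background images and a larger HTML file in exchange for the sharper result.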