This issue was moved to a discussion.

converting html to pdf - no equations #10317

Closed
wrobell opened this issue Oct 21, 2024 · 3 comments

wrobell commented Oct 21, 2024

I am trying to convert a web page to PDF with:

$ pandoc --pdf-engine=xelatex -f html -t latex https://mlg.eng.cam.ac.uk/blog/2021/03/31/what-keeps-a-bayesian-awake-at-night-part-1.html -o what-keeps.pdf

The equations in the resulting PDF look like this:

[screenshot: 2024-10-21_17-33-52_654x105]

$ pandoc --version
pandoc 3.1.11.1
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: /home/wrobell/.local/share/pandoc
Copyright (C) 2006-2023 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

Seems to be related to #2758

wrobell added the bug label Oct 21, 2024

jgm (Owner) commented Oct 21, 2024

The HTML contains things like

<p>\begin{equation}
    p(D) = \int p(\theta) \prod_{n=1}^N p(D_n \cond \theta) \,\mathrm{d}\theta.
\end{equation}</p>

So you need to enable the raw_tex extension: try with -f html+raw_tex+tex_math_dollars.
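
To see how pandoc treats such a paragraph without fetching the whole page, the snippet can be saved to a small file and converted to LaTeX with and without the extension. This is only a sketch: `eq.html` and the `show_latex` helper are names made up here, and the result may vary between pandoc versions.

```shell
# Save the snippet quoted above as a minimal test page.
cat > eq.html <<'EOF'
<p>\begin{equation}
    p(D) = \int p(\theta) \prod_{n=1}^N p(D_n \cond \theta) \,\mathrm{d}\theta.
\end{equation}</p>
EOF

# Hypothetical helper: print pandoc's LaTeX output for a given reader format.
show_latex() {
  pandoc -f "$1" -t latex eq.html
}

# Compare the two readers by hand:
#   show_latex html
#   show_latex html+raw_tex
```

Only the LaTeX source is produced here, so no PDF engine is needed for the comparison.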

wrobell (Author) commented Oct 21, 2024

@jgm thanks for the suggestions. I have tried:

greymatter$ pandoc --pdf-engine=xelatex -f html+raw_tex+tex_math_dollars -t latex https://mlg.eng.cam.ac.uk/blog/2021/03/31/what-keeps-a-bayesian-awake-at-night-part-1.html -o what-keeps.pdf
Error producing PDF.
! Undefined control sequence.
l.183     p( X \cond

greymatter$ pandoc --pdf-engine=xelatex -f html+tex_math_dollars -t latex https://mlg.eng.cam.ac.uk/blog/2021/03/31/what-keeps-a-bayesian-awake-at-night-part-1.html -o what-keeps.pdf
Error producing PDF.
! Undefined control sequence.
l.190 variables (\(\hat X_{\text{MAP}} = \argmax

greymatter$ pandoc --pdf-engine=xelatex -f html+raw_tex -t latex https://mlg.eng.cam.ac.uk/blog/2021/03/31/what-keeps-a-bayesian-awake-at-night-part-1.html -o what-keeps.pdf
Error producing PDF.
! Undefined control sequence.
l.183     p( X \cond
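
The `l.183` and `l.190` in these logs are line numbers in the intermediate LaTeX file that pandoc hands to the engine. One way to find which commands might be undefined is to list every control sequence the file uses. The sketch below assumes a hypothetical helper `list_macros` and, as a stand-in for the real intermediate file, a `sample.tex` holding just the two truncated lines from the logs.

```shell
# Hypothetical helper: list every distinct control sequence (\name) in a
# LaTeX file, so unfamiliar ones such as \cond or \argmax stand out.
list_macros() {
  grep -oE '\\[A-Za-z]+' "$1" | sort -u
}

# Stand-in input: the two (truncated) lines reported in the error logs.
cat > sample.tex <<'EOF'
    p( X \cond
variables (\(\hat X_{\text{MAP}} = \argmax
EOF

list_macros sample.tex

# For the real document, the intermediate file could be written with e.g.
#   pandoc -s -f html+raw_tex+tex_math_dollars -t latex URL -o what-keeps.tex
# and then inspected with: list_macros what-keeps.tex
```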

jgm (Owner) commented Oct 21, 2024

Well, these aren't pandoc parsing errors; they are errors produced by the LaTeX engine (xelatex in your commands), presumably because the document expects some packages to be loaded that our default template doesn't load. You need to figure out what these are and load them (e.g. by adding \usepackage{PACKAGENAME} in header-includes).
See https://tug.ctan.org/info/symbols/comprehensive/symbols-a4.pdf
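
As a rough illustration of the header-includes approach: neither `\cond` nor `\argmax` is defined by the LaTeX kernel or by amsmath, so they may be custom macros rather than package commands. The sketch below writes a small preamble fragment and passes it with pandoc's `--include-in-header` (`-H`) option. The definitions are assumptions (a conditioning bar for `\cond`, an upright operator for `\argmax`); the page's own definitions may differ, and other macros may still be missing. `macros.tex` and `convert` are names invented for this example.

```shell
# Preamble fragment defining the two macros from the error logs.
# ASSUMPTION: \cond is a conditioning bar and \argmax an upright operator
# with limits; adjust to whatever the source page really intends.
cat > macros.tex <<'EOF'
\usepackage{amsmath}
\providecommand{\cond}{\mid}
\DeclareMathOperator*{\argmax}{arg\,max}
EOF

# Hypothetical wrapper around the command used in this thread, with the
# fragment included in the LaTeX header via -H.
convert() {
  pandoc --pdf-engine=xelatex -f html+raw_tex+tex_math_dollars -t latex \
    -H macros.tex "$1" -o "$2"
}

# Usage:
#   convert https://mlg.eng.cam.ac.uk/blog/2021/03/31/what-keeps-a-bayesian-awake-at-night-part-1.html what-keeps.pdf
```

If the engine then stops at another undefined control sequence, a matching definition can be added to `macros.tex` in the same way.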

Repository owner locked and limited conversation to collaborators Oct 21, 2024
jgm converted this issue into discussion #10319 Oct 21, 2024
