Feature Request: MathJax-node in Pandoc #3153
Comments
Can you clarify how this would help? As far as I can see, tex2html from mathjax-node produces regular HTML that depends on MathJax CSS and fonts. So MathJax fonts would have to be included to get self-contained output. Perhaps this can be done, though. Is there a way to get pure SVG output that doesn't depend on fonts? |
By the way, if we did do this, I think a |
My bad: I have installed the MathJax fonts locally so I didn't see it call resources from cdn.mathjax.org. I should have checked the source. I made ickc/pandoc-MathJax-node public. There's some test files to show the results of different provided |
As a sidenote, the |
+++ ickc [Oct 11 16 04:57 ]:
It can indeed parse mathml in math tags. |
Thanks for showing how to do this! But I wonder whether it's even necessary to make changes to pandoc. Look how easy it is. Create a markdown file,
Now all you need to create an HTML file with SVG math is:
I don't see any particular changes that are needed in pandoc, since this simple pipe suffices. It would be possible to add a new option |
OK, two reasons for more integration into pandoc:
So, proposal: add an option Unfortunately, since the HTML writer is pure, we can't do this piping in the writer. We need to do it in two places, |
Maybe a filter that passes each math element through
The remaining question then is whether this filter should be officially embedded in pandoc or kept as an external filter. |
I replied too quickly, before thinking it through: it doesn't quite work as an external filter since the |
It would certainly be possible to write a filter that works +++ ickc [Oct 14 16 15:14 ]:
|
Data URIs seem to have some disadvantages: increased file size (though gzip seems to reduce that increase to a negligible amount) and more computational complexity (battery life, rendering time, though today's smartphones are very capable, so that might be negligible too). And since HTML supports inline SVG, inline SVG might be better, i.e. the math elements are turned into raw HTML elements and left as inline HTML in the output. |
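To make the trade-off above concrete, here is a rough Python sketch that builds both embeddings for the same SVG string and measures their raw and gzipped sizes. The helper names (as_data_uri, as_inline, sizes) and the sample SVG are made up for illustration; they are not taken from any filter in this thread.

```python
import base64
import gzip


def as_data_uri(svg):
    """Wrap an SVG string in an <img> tag whose src is a base64 data URI."""
    encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
    return '<img src="data:image/svg+xml;base64,{}" alt="math">'.format(encoded)


def as_inline(svg):
    """Inline SVG needs no wrapping: the markup is emitted as raw HTML."""
    return svg


def sizes(markup):
    """Return (raw byte count, gzipped byte count) for a piece of markup."""
    raw = markup.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical stand-in for a rendered equation; real MathJax SVG is larger.
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">'
    + '<path d="M0 0L10 10"/>' * 50
    + "</svg>"
)
```

Calling sizes(as_data_uri(sample_svg)) and sizes(as_inline(sample_svg)) gives the two pairs to compare. Base64 always expands the payload to four output characters per three input bytes; how much of that gzip wins back depends on the actual SVG, so it is worth measuring on real output rather than on this toy sample.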
I've created a standalone filter -- try it out! |
I will try it soon. Would it be possible to bundle it in pandoc's binary installer, similar to citeproc? |
+++ ickc [Nov 20 16 15:54 ]:
Possible, yes, but I don't know if it's desirable, since it |
Beauty thy name is pandoc-tex2svg! I've moved to Homebrew so having to use Haskell was a bit of pain, but I've never had so much gain from so little pain! Thanks. |
+++ suknat [Dec 14 16 04:41 ]:
Beauty thy name is pandoc-tex2svg!
Unfortunately pandoc-tex2svg is too slow.
The reason is clear: there's a huge startup cost when node
loads the library.
If the filter were written in node, then the library could
be loaded once, not once for each equation.
I started writing this, actually, but got confused about
how to deal with the callbacks. If a JavaScript programmer
wants to collaborate on this, let me know.
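The "load once, not once per equation" idea can be sketched independently of node. The Python outline below walks pandoc's JSON AST, gathers every Math element, hands all the TeX sources to a renderer in a single call, and swaps the results in as raw HTML. It assumes the JSON layout used by recent pandoc versions (Math as {"t": "Math", "c": [type, tex]}); render_batch is a hypothetical placeholder for whatever long-lived MathJax process would do the real work.

```python
import html
import json
import sys


def collect_math(node, found):
    """Append every Math element in the AST (dicts with "t" == "Math") to found."""
    if isinstance(node, dict):
        if node.get("t") == "Math":
            found.append(node)
            return
        for value in node.values():
            collect_math(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_math(item, found)


def render_batch(sources):
    """Hypothetical placeholder: a real version would call MathJax once for all sources."""
    return ['<svg data-tex="{}"/>'.format(html.escape(tex, quote=True)) for tex in sources]


def convert(doc, render=render_batch):
    """Replace each Math element with a RawInline holding the rendered markup."""
    found = []
    collect_math(doc, found)
    sources = [node["c"][1] for node in found]  # "c" is [math type, TeX source]
    rendered = render(sources)  # a single call, so startup cost is paid once
    for node, markup in zip(found, rendered):
        node["t"] = "RawInline"
        node["c"] = ["html", markup]
    return doc


def main():
    """Filter entry point: pandoc pipes the JSON AST in on stdin and reads stdout."""
    json.dump(convert(json.load(sys.stdin)), sys.stdout)
```

To try it as a filter, add a call to main() at the bottom of the script and pass the script to pandoc with --filter. The design point is only that the renderer is invoked once per document; error reporting and display-versus-inline handling are left out.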
|
You can write a brew formula for it. I've seen quite a few pandoc filters written in Haskell that have brew formulas. You can take a look at those examples (a recent one is pandoc-sidenote). |
(I considered posting this in pandoc-discuss, but there isn't a lot of discussion over there, so maybe it is easier to index and reference when it is all kept together here.) Some updates:
|
In reply to #3153 (comment)
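I'm not a real JavaScript programmer, but I try :-) I looked into this. Firstly, how slow is pandoc-tex2svg? To measure, I removed caching from pandoc-tex2svg.hs, took math-samples.md and made bigger copies of it
(so for example math-samples-10.md is 480 lines long, because math-samples.md is 48 lines long) and timed how long the filter takes.
Next I wrote a pandoc filter in node (this took most of the time), and indeed it is confusing how to deal with the callbacks, and it needs a change to the pandoc-filter-node library (reported as mvhenderson/pandoc-filter-node#7). I found a workaround using the async/await feature, which is supported from NodeJS 7.6 onward, and came up with a filter (https://gist.github.com/shreevatsa/170a1a8f217b20d86b5836e5e4821021). With this, the times are as follows: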
So speed-wise, this seems reasonable. The filter may need some more work to be production-quality (for example it doesn't report errors), but it seems good enough for me for now! |
@shreevatsa - Fantastic! Can you publish your filter as
a github repository, not just a gist?
+++ Shreevatsa [Nov 09 17 15:57 ]:
… In reply to [1]#3153 (comment)
Unfortunately pandoc-tex2svg is too slow. The reason is clear:
there's a huge startup cost when node loads the library. If the
filter were written in node, then the library could be loaded once,
not once for each equation. I started writing this, actually, but
got confused about how to deal with the callbacks. If a JavaScript
programmer wants to collaborate on this, let me know.
I'm not a real JavaScript programmer, but I try :-) I looked into this.
Firstly, how slow is pandoc-tex2svg? To measure, I [2]removed caching
from pandoc-tex2svg.hs, took math-samples.md
([3]https://github.com/jgm/pandoc-tex2svg/blob/f4154482/math-samples.md
) and made bigger copies of it:
% cp math-samples.md math-samples-1.md
% function next() { cat "math-samples.md" "math-samples-$1.md" > "math-samples-$((1+$1)).md" }
% for i in {1..9}; next $i
(so for example math-samples-10.md is 480 lines long, because
math-samples.md is 48 lines long) and timed how long the filter takes:
% for i in {1..10}; do echo $i; time pandoc math-samples-$i.md --filter pandoc-tex2svg -s -t html5 -o math-samples-$i.html; done
1
5.71s user 0.68s system 105% cpu 6.031 total
2
11.40s user 1.32s system 107% cpu 11.847 total
3
17.04s user 1.99s system 107% cpu 17.671 total
4
22.92s user 2.63s system 107% cpu 23.686 total
5
28.86s user 3.35s system 108% cpu 29.744 total
6
34.87s user 4.03s system 108% cpu 35.785 total
7
41.05s user 4.78s system 108% cpu 42.138 total
8
46.26s user 5.30s system 109% cpu 47.266 total
9
52.46s user 6.02s system 109% cpu 53.421 total
10
58.43s user 6.70s system 109% cpu 59.451 total
Next I wrote a pandoc filter in node (this took most of the time), and
indeed it is confusing how to deal with the callbacks, and it needs a
change to the pandoc-filter-node library (reported it here:
[4]mvhenderson/pandoc-filter-node#7). I found a workaround using the
async/await feature which is supported from NodeJS 7.6 or later
(released recently in February 2017) (am not enough of a JavaScript
programmer to figure out how to make it work in earlier versions), and
came up with this filter:
[5]https://gist.github.com/shreevatsa/170a1a8f217b20d86b5836e5e4821021
With this, the times are as follows:
% for i in {1..10}; do echo $i; time pandoc math-samples-$i.md --filter ../pandoc-mathjax-svg-filter.js -s -t html5 -o math-samples-$i.html; done
1
0.83s user 0.09s system 99% cpu 0.922 total
2
0.95s user 0.09s system 108% cpu 0.955 total
3
1.12s user 0.09s system 112% cpu 1.075 total
4
1.25s user 0.10s system 114% cpu 1.179 total
5
1.38s user 0.10s system 115% cpu 1.278 total
6
1.44s user 0.10s system 114% cpu 1.340 total
7
1.57s user 0.10s system 115% cpu 1.447 total
8
1.76s user 0.11s system 117% cpu 1.589 total
9
1.85s user 0.11s system 117% cpu 1.662 total
10
1.93s user 0.11s system 117% cpu 1.745 total
So speed-wise, this seems reasonable. The filter may need some more
work to be production-quality (for example it doesn't report errors),
but it seems good enough for me for now!
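As a rough reading of the two timing runs above, the wall-clock totals can be split into a per-copy cost and a fixed cost. The Python sketch below does that with a simple endpoint estimate (not a proper regression); the numbers are the "total" columns copied from the two runs, and the function names are made up for illustration.

```python
# Wall-clock totals in seconds for math-samples-1.md .. math-samples-10.md,
# copied from the "total" column of the two runs above.
tex2svg_total = [6.031, 11.847, 17.671, 23.686, 29.744,
                 35.785, 42.138, 47.266, 53.421, 59.451]
node_total = [0.922, 0.955, 1.075, 1.179, 1.278,
              1.340, 1.447, 1.589, 1.662, 1.745]


def marginal_cost(totals):
    """Average extra seconds per additional copy of math-samples.md."""
    return (totals[-1] - totals[0]) / (len(totals) - 1)


def fixed_cost(totals):
    """Estimated time not explained by the per-copy cost (startup and similar)."""
    return totals[0] - marginal_cost(totals)


def speedup(slow, fast):
    """Ratio of total times for the largest input."""
    return slow[-1] / fast[-1]
```

With these figures, pandoc-tex2svg costs roughly 5.9 s per extra copy of the sample file with almost no fixed part, while the node filter costs roughly 0.09 s per copy on top of about 0.8 s of fixed cost. That is consistent with the explanation given earlier in the thread: the first pays the node startup once per equation, the second pays it once per run.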
References
1. #3153 (comment)
2. https://gist.github.com/shreevatsa/7be352a692fef4cdccc76d03b9f12bf8
3. https://github.com/jgm/pandoc-tex2svg/blob/f4154482/math-samples.md
4. mvhenderson/pandoc-filter-node#7
5. https://gist.github.com/shreevatsa/170a1a8f217b20d86b5836e5e4821021
|
@jgm Sure, published here: https://github.com/shreevatsa/pandoc-mathjax-filter It still needs to be made npm-installable and robust and all that, but hopefully I will either figure it out or someone who knows such things will help :-) |
I've added the link to the wiki, where there was already https://github.com/lierdakil/mathjax-pandoc-filter |
Haha oops… written nearly 3 years ago, and even the code looks somewhat similar. There is even a package (two actually) on npm. I thought I had thoroughly searched before getting here, but I must have been searching for the wrong thing. Anyway, I tried it and couldn't quite get it to work, so have added a note to that one as well. |
This suggestion has come up as a side track in a few issues: #2758 and jgm/pandoc-templates#219. I think it deserves a dedicated issue.
What Does It Do?
It is similar to what MathJax does, but on the "server-side". i.e. rather than render it every time it is viewed, it pre-renders it.
What Problems Can It Solve?
Alternative Rendering Engine for MathML Output
At the very least, it provides another choice for MathML output. If features unique to MathJax are needed, mathjax-node will be a better alternative to the existing MathML output options.
Static, Non-Rasterized and HTML Rendering Engine Independent
There are already a few choices in pandoc that produce truly static output (output that does not depend on JavaScript): MathML and PNG. PNGs are rasterized, and MathML support is broken. MathJax-node has CHTML and SVG options that are superior to both (MathJax also has MathML and PNG output; CHTML is the newest and fastest option).
Self-contained MathJax
It is difficult to make MathJax self-contained, since mathjax.js will request additional .js files. To make math self-contained, either you sacrifice MathJax (and use some other option which might lack some of the MathJax features you rely on), or you need to embed MathJax in your package (see mathjax/MathJax-grunt-cleaner to make the MathJax footprint smaller for embedding).
In this sense, MathJax-node provides a self-contained way of using MathJax, and would be useful in, say, ePub(3) output.
How to implement it in pandoc?
No idea, but the following are my observations:
MathJax-node/bin/tex2html could be used to create a filter that processes each math element and turns it into raw HTML.
It would then have an external dependency, installed with npm install mathjax-node -g. The install is big (63.8 MB) since it includes the whole of MathJax (54.5 MB). However, if pandoc only uses some particular features, MathJax can be trimmed with MathJax-grunt-cleaner to about 2 MB (say, for the CHTML option only). Maybe MathJax and MathJax-node could be forked to create a small-footprint version specifically for pandoc?
In terms of command-line options, one way is that whenever pandoc ... --self-contained ... --mathjax ... is used, it actually uses MathJax-node to provide a "self-contained MathJax". Alternatively, a --mathjax-node option could be provided.
"I want to do it now"
Install MathJax-node with npm install mathjax-node -g.
page2html as a post-processor
If the output is HTML5, MathJax-node/bin/page2html can be used as a post-processor after pandoc.
page2html is specifically said to
page2html as a pre-processor
I tried processing the markdown file directly with page2html and it seems OK. Basically, page2html is used here beyond what it is designed for, but any markdown syntax will probably be ignored because it is treated as text. I'm not sure whether raw HTML, divs, or spans used in the pandoc markdown source might confuse page2html, though.
In this case, page2html is used as a pre-processor to process the markdown first. The math becomes raw HTML, which should be fine when parsed by pandoc, and can then be output to any HTML-related format, including ePub(3).
Caveat