math support #6

Closed
blaenk opened this issue Jun 16, 2015 · 48 comments

Comments

@blaenk

blaenk commented Jun 16, 2015

This simply means enabling the use of $ and $$ delimiters (inline and block-level respectively, as in Pandoc) to very conveniently leverage something like MathJax. Hoedown also supports this.

@raphlinus
Collaborator

It should be relatively straightforward, but nothing is ever simple. Take *a $b*c$ for example. At the very least, the scanner for emphasis has to be aware of the math spans to properly skip over them (the rule is likely quite similar to the one for backtick code spans).

Not on the critical path for rustdoc, but worth doing.

@vyp

vyp commented Jun 17, 2015

Is this in commonmark?

@raphlinus
Collaborator

@vyp No, however it is one of the more commonly requested extensions. See http://talk.commonmark.org/t/mathematics-extension/457 for more discussion.

@vyp

vyp commented Jun 17, 2015

Oh sorry, I just saw #4. My thinking was that CommonMark might have guidelines on cases like *a $b*c$.

@vyp

vyp commented Jun 17, 2015

@raphlinus Ah yes, I'm familiar with that thread. Second commenter there.

@raphlinus
Collaborator

I'm very open to doing this. By far the most helpful thing from the community would be a precise spec. One got suggested in a reddit thread. But are these the right rules? Would the rules for _ just work? I'm not sure that even $1234 presents a serious problem, because it would need a closing $ to become a false positive. That thread also links this Survey of syntaxes for math in markdown, which is an incredibly useful resource for figuring this out.

@jeanm

jeanm commented Jan 9, 2017

I've also made a table to summarise some of the syntaxes. I've only focused on the platforms that are likely to be used by mathsy people, since mathsy people are the ones likely to use this feature anyway. Feel free to add a comment there if there's anything relevant I've missed!

> I'm not sure that even $1234 presents a serious problem, because it would need a closing $ to become a false positive.

Not by itself, but there might be more than one dollar sign in a sentence (say if you're making a list of prices)

@azerupi
Contributor

azerupi commented Jan 10, 2017

> Not by itself, but there might be more than one dollar sign in a sentence (say if you're making a list of prices)

That wouldn't be a problem I think. A space before an ending $ would not be allowed. So

This $15 dollar item and this $12 dollar item ...
                              ^ would not be counted as math ending character

@notriddle
Collaborator

There could be spaces in the middle of it, I assume. So this can happen.

It costs US$5 or CA$4.8.
           ^-------^

@jeanm

jeanm commented Jan 10, 2017

@notriddle yes that's what I had in mind. Also:

$1 and 10$

The exchange rate today is $1 to 1.36AU$

However these are pretty contorted cases. In this thread on the CommonMark forums John MacFarlane seems to agree that the dollar syntax works just fine:

> The problem with the \(..\) and \[..\] forms is that \( and \[ already have clear and important meanings in CommonMark (and other Markdown versions). They are escaped parentheses and brackets. It's very important to keep this behavior, or you're left without an easy way to write literal special characters when this is needed.
>
> Hence I prefer the $ syntax. Pandoc has supported this for a decade now, with some simple heuristics that prevent unwanted capture of regular $ characters. This has worked just fine. It's extremely uncommon to write things like US$50,000, and one can always escape the $ in cases like this. Anyway, my experience with pandoc is that this is not a pain point. People simply don't complain about unwanted capturing of $ characters.
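
For illustration, here is a minimal sketch of the heuristics the Pandoc manual documents for `$` math: the opening `$` must be followed by a non-space character, the closing `$` must be preceded by a non-space character and must not be followed by a digit, and `\$` is a literal dollar. The names `findInlineMath` and `findCloser` are hypothetical, and this is not Pandoc's actual implementation.

```javascript
// Sketch of pandoc-style `$` heuristics; both function names are hypothetical.
function findInlineMath(text) {
    const spans = [];
    let i = 0;
    while (i < text.length) {
        if (text[i] === '\\') { i += 2; continue; }   // \$ is a literal dollar
        const next = text[i + 1];
        // Opener: `$` followed by a non-space character (and not `$$`).
        if (text[i] === '$' && next !== undefined && next !== '$' && !/\s/.test(next)) {
            const close = findCloser(text, i + 1);
            if (close !== -1) {
                spans.push(text.slice(i + 1, close));
                i = close + 1;
                continue;
            }
        }
        i++;
    }
    return spans;
}

// Closer: `$` preceded by a non-space character and not followed by a digit.
function findCloser(text, from) {
    for (let j = from; j < text.length; j++) {
        if (text[j] === '\\') { j++; continue; }      // skip the escaped character
        if (text[j] === '$' && !/\s/.test(text[j - 1]) && !/[0-9]/.test(text[j + 1] || '')) {
            return j;
        }
    }
    return -1;
}
```

Under these rules the two price examples from earlier in the thread yield no math spans, while `$1 and 10$` is still captured, which matches the observation that only contorted cases slip through.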

@raphlinus
Collaborator

If you replace those $'s with _, you'll see that it doesn't form markup. The opener needs to have space (or punctuation) before it, and the closer needs to have space (or punctuation) after it. So it would need to be an example like Dinner was only $10, which puts the restaurant in the small$ category. This seems contrived enough to me that I don't think it would happen often. Perhaps somebody could analyze an existing corpus of text.
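
A rough sketch of that `_`-like rule, for experimenting with examples: an opener needs whitespace, punctuation, or the start of the text before it, and a closer needs whitespace, punctuation, or the end of the text after it. The additional requirement of a non-space character on the inner side of each delimiter is an assumption borrowed from the emphasis rules, and `findFlankedMath` and `isBoundary` are hypothetical names.

```javascript
// True for start/end of text, whitespace, or Unicode punctuation.
const isBoundary = (ch) => ch === undefined || /[\s\p{P}]/u.test(ch);

// Sketch only: pairs each valid opener with the first valid closer after it.
function findFlankedMath(text) {
    const spans = [];
    for (let i = 0; i < text.length; i++) {
        if (text[i] !== '$') continue;
        const canOpen = isBoundary(text[i - 1])
            && text[i + 1] !== undefined && !/\s/.test(text[i + 1]);
        if (!canOpen) continue;
        for (let j = i + 2; j < text.length; j++) {
            if (text[j] !== '$') continue;
            const canClose = !/\s/.test(text[j - 1]) && isBoundary(text[j + 1]);
            if (canClose) {
                spans.push(text.slice(i + 1, j));
                i = j;          // resume scanning after the closer
                break;
            }
        }
    }
    return spans;
}
```

With this rule `US$5 or CA$4.8` produces nothing because neither `$` can open, while the restaurant sentence above does produce one (unwanted) span.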

@raphlinus
Collaborator

What @jeanm said :)

@brendanzab

brendanzab commented Aug 5, 2018

For me, as mentioned in pikelet-lang/pikelet#109, what I'd really want is the ability to add HTML classes to code (both inline and block). Something like:

Inline math: {katex}`\Gamma \vdash x : \tau`

```{katex}
\Gamma \vdash x : \tau
```

With that I could then run KaTeX on all the things with that class.

That way at least Github wouldn't completely bork my maths in the previews.

@oberien
Contributor

oberien commented Jun 3, 2019

In heradoc I have implemented the following three ways of creating latex-math, such that it still renders well in plain CommonMark renderers without math support:

Inline-math: `$ foo`
Equation without numbering: ```$$\nfoo\n```
Equation with numbering: ```$$$\nfoo\n```

@CAD97

CAD97 commented Jun 3, 2019

Summarizing and continuing discussion from #100:

Whatever math syntax is chosen should piggyback off of `-delimited code blocks, both for nicer fallback in an engine without the math rendering extension, and to avoid the \-doubling issue that arises when, say, just running KaTeX or MathJax on the markdown engine output.

The rest of this post concerns mainly KaTeX since I have much more experience using KaTeX than MathJax. Probably, the most powerful solution would use an ahead-of-time translation (to MathML + JS? KaTeX pre-render? RRIR?) rather than relying on display-time JS translation, but a minimal solution should rely on the preexisting translation tools.

Interestingly, for block syntax, no extra ahead-of-time processing needs to be done for display-time integration with KaTeX. The markdown

```math
1 + 2
```

is translated to the html

<pre><code class="language-math">1 + 2
</code></pre>

and can be translated with KaTeX via JS like

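// Snapshot the live HTMLCollection first: replacing nodes while iterating it skips elements.
for (const math of Array.from(document.getElementsByClassName("language-math"))) {
    const span = document.createElement('span');
    katex.render(math.textContent, span, {displayMode: true});
    const pre = math.parentNode;            // the enclosing <pre>
    pre.parentNode.replaceChild(span, pre); // swap the <pre> for the rendered <span>
}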

into the following DOM:

<span>
  <span class="katex-display">
    <span class="katex">
      <span class="katex-mathml">
        <math>
          <semantics>
            <mrow>
              <mn>1</mn>
              <mo>+</mo>
              <mn>2</mn>
            </mrow>
            <annotation encoding="application/x-tex">1 + 2</annotation>
          </semantics>
        </math>
      </span>
      <span class="katex-html" aria-hidden="true">
        <span class="strut" style="height: 0.64444em;"></span>
        <span class="strut bottom" style="height: 0.72777em; vertical-align: -0.08333em;"></span>
        <span class="base">
          <span class="mord">1</span>
          <span class="mord rule" style="margin-right: 0.222222em;"></span>
          <span class="mbin">+</span>
          <span class="mord rule" style="margin-right: 0.222222em;"></span>
          <span class="mord">2</span>
        </span>
      </span>
    </span>
  </span>
</span>

which is the correct way for displaying KaTeX-rendered display-style text, and the exact same output generated with KaTeX's renderMathInElement and $$ blocks. [test source]

You can make the argument that co-opting the language-math class as "display with KaTeX" is semantically incorrect, especially since we're stripping the <pre><code> and replacing it with a <span>. (The original version shown in #100 does not do this; this transformation is completely optional and only helps in display cases where <code> has a visible style (such as rustdoc).) Co-opting language-mathml or language-LaTeX is definitely wrong (it should show the code), and the same argument can be made for language-MathJax and language-KaTeX. As a potential alternative, we could use ```$$/language-$$, as that evokes the $$ of traditional LaTeX display math guards (even though said usage is technically deprecated), and there isn't yet an esolang called "$$" to display the code of (there is for $).

Similarly, $` TeX `$ or `$ TeX $` can be display-time processed, but I'll omit the implementation here for brevity.

My next comment will discuss more "first-class" support and what that would mean.

@CAD97

CAD97 commented Jun 4, 2019

CommonMark Wiki on math extensions: https://github.com/commonmark/commonmark-spec/wiki/Deployed-Extensions#math

If you don't care about fallback behavior and are willing to change the parse tree in degenerate cases, \[ display-math \] and \( inline-math \) are almost perfect. There are two main problems:

  • \[ is useful in today's Markdown for escaping meta-uses of [ (i.e., links). (\( is allowed and renders as just ( in nearly every Markdown implementation, but [ is much more commonly the escape target than (. Ditto for right brackets.)
  • You have to have some idea about LaTeX nesting, because e.g. \[a+\text{b+$c$+d}+e\] is valid in KaTeX (fun fact! Using \(\) for the inner math breaks KaTeX's renderMathInElement). The "best effort" behavior is probably to match unescaped curly brackets and then end on the matching \]/\).

It's for this reason plus fallback that I really think any solution should actually lean on "code literal" syntax. This gives us growable fences for free. So then the "reasonable" choices per that are bracket-style \(` math `\) and \[` math `\] or dollar-style $` math `$ and $$` math `$$. The bracket-style requires parser support to avoid a false positive on (`code`) (which can be avoided by moving the close bracket within the code block and accepting a missing backslash for the leading as valid, or moving that inside as well), and the dollar-style can be done purely by display-time JS. (I have a codereview.SE post which implements basically this.)

What do I actually recommend? For now, I think $$` math `$$ and JS display-time processing. The brackets being part inside/outside looks wrong as the noscript fallback. For markdown engine support? The CommonMark wiki seems to suggest they're leaning towards optionally enabling single $ affixes for inline, which runs afoul of the known problem of having more than one textual $ in a paragraph. (And allowing processing based on "language type" on tilde fences only.) I still think that using a code fence is ideal for both simplicity of finding the end for implementation and user, but the exact way of specifying the fence as specially processed really remains up to the implementer of the CommonMark extension, at least until there's a blessed pattern for extending literal blocks.

Javascript to do both dollar-style and bracket-style
(() => {
    const todo = []; // don't mutate document while iterating

    function processMath() {
        if (arguments.length == 3) {
            const [prev, code, displayMode] = arguments;
            prev.splitText(prev.textContent.length - 1).remove();
            code.childNodes[0].splitText(code.textContent.length - 2).remove();
            const span = document.createElement('span');
            katex.render(code.textContent, span, {displayMode: displayMode, throwOnError: false});
            code.parentNode.replaceChild(span, code);
        } else if (arguments.length == 4) {
            const [prev, code, next, displayMode] = arguments;
            prev.splitText(prev.textContent.length - 1 - displayMode).remove();
            next.splitText(1 + displayMode); next.remove();
            const span = document.createElement('span');
            katex.render(code.textContent, span, {displayMode: displayMode, throwOnError: false});
            code.parentNode.replaceChild(span, code);
        } else {
            throw new Error(`Wrong number of arguments to ${processMath.name}: ${arguments.length}`);
        }
    }

    for (const code of document.getElementsByTagName('code')) {
        const prev = code.previousSibling;
        const next = code.nextSibling;

        if (prev && prev.nodeType === Node.TEXT_NODE) {
            // dollar style
            if (next && next.nodeType === Node.TEXT_NODE) {
                if (/\$\$$/.test(prev.textContent) && /^\$\$/.test(next.textContent)) {
                    todo.push(() => processMath(prev, code, next, true));
                    continue;
                }
                if (/\$$/.test(prev.textContent) && /^\$/.test(next.textContent)) {
                    todo.push(() => processMath(prev, code, next, false));
                    continue;
                }
            }

            // bracket style (start outside, end inside)
            if (/\[$/.test(prev.textContent) && /\\\]$/.test(code.textContent)) {
                todo.push(() => processMath(prev, code, true));
                continue;
            }
            if (/\($/.test(prev.textContent) && /\\\)$/.test(code.textContent)) {
                todo.push(() => processMath(prev, code, false));
                continue;
            }
        }
    }

    for (const f of todo) f();
})()

@cben

cben commented Jun 4, 2019

See also https://github.com/cben/mathdown/wiki/math-in-markdown with math syntaxes from a lot of markdown implementations.

While dollars and other TeX-like syntaxes seem most common, I second the recommendation for literal-based syntax, for example like GitLab, especially if you're considering embedding other non-markdown syntaxes (mermaid etc).

@brendanzab

brendanzab commented Jun 4, 2019

On a more philosophical level, I guess the fear I have with something like:

```math
```

..is how do you distinguish between talking about math syntax, and using the math renderer? Here it's fine - just use tex or latex for syntax highlighted stuff. This becomes trickier with something like mermaid, bob, or plantuml. The advantage of using the code fences is graceful degradation to renderers that don't support it (unlike with dollars, which look terrible). But maybe a different directive than the language name should be used to describe how it should be rendered?

@cben

cben commented Jun 4, 2019

There is a discussion on making that distinction:

@marcusklaas
Collaborator

Thanks to @CAD97, @cben and @brendanzab for the excellent considerations and resources!

It looks to me that there seems to be some consensus on using code fences for display math, although it's not yet clear what language specifier would be best. This seems like a good idea: it degrades gracefully on renderers without math support. The alternatives like \[ … \] or $$…$$ don't have this property or may clash with escaped links.

The inline case is more diverse and more difficult. Dollar affixes seem to be most widespread and will be very familiar to LaTeX users. There are some concerns about unintended math spans from the natural use of dollar signs, but from the discussion earlier (January of 2017) in this thread I understand that with the right heuristics this can be managed in a way like it is for emphasis. Personally, I am partial to code span based syntax, like $`…`$ as suggested by @CAD97 or simply double backtick delimited spans. Again, these have the advantage that they degrade gracefully, have low risk of false positives and their semantics is fairly clear as there is no need for tricky heuristics. The big downside is that they seem to be nowhere near as widely used as simple dollar affixes.

Since there is no clear consensus on math syntax, we could even opt to not introduce any additional syntax or semantics. This is already possible for displays by letting users decide the language that makes sense for them, bypassing the issue flagged by @brendanzab. We could do something similar for inline math by exposing the length of the delimiters for code spans. The decision to interpret double backticked code spans as math or not is then left to the user. This would eliminate the need for a math option entirely.
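
To make the "expose the delimiter length" idea concrete, here is a small standalone sketch. It is not pulldown-cmark's API: `tokenizeCodeSpans`, `renderInline`, and the `ticks` field are hypothetical, the `trim()` call is a simplification of CommonMark's space stripping, and no HTML escaping is done.

```javascript
// Splits text into plain-text and code-span tokens, recording the backtick run length.
function tokenizeCodeSpans(text) {
    const tokens = [];
    let plain = '';
    let pos = 0;
    while (pos < text.length) {
        if (text[pos] !== '`') { plain += text[pos++]; continue; }
        const open = pos;
        while (text[pos] === '`') pos++;
        const ticks = pos - open;
        // Look for a closing run of exactly the same length.
        let search = pos;
        let close = -1;
        while (search < text.length) {
            if (text[search] !== '`') { search++; continue; }
            const runStart = search;
            while (text[search] === '`') search++;
            if (search - runStart === ticks) { close = runStart; break; }
        }
        if (close === -1) { plain += '`'.repeat(ticks); continue; } // unmatched: literal
        if (plain) { tokens.push({ type: 'text', text: plain }); plain = ''; }
        tokens.push({ type: 'code', ticks, text: text.slice(pos, close).trim() });
        pos = close + ticks;
    }
    if (plain) tokens.push({ type: 'text', text: plain });
    return tokens;
}

// The caller, not the parser, decides which code spans count as math.
function renderInline(text, isMath = (token) => token.ticks === 2) {
    return tokenizeCodeSpans(text).map((token) => {
        if (token.type === 'text') return token.text;
        return isMath(token)
            ? `<span class="math">${token.text}</span>`
            : `<code>${token.text}</code>`;
    }).join('');
}
```

The design point is that the parser only reports `ticks`; whether double-backtick spans mean math stays a decision of the consumer.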

@ratmice

ratmice commented Jun 10, 2019

I have been meaning to find the time for writing up a proposal for,
https://talk.commonmark.org/t/consistent-attribute-syntax/272

With that in place we could get a graceful mechanism for dealing with inline code span language identifiers via ...{.math}, or even ...{.$}, personally at least, it is this aspect (how to represent math to the event handler), that I care about much more than the front-end decisions involving which format to parse. Because if we had support for that consistent attribute syntax, the parsing heuristics could piggy back onto all of that, merely adding the appropriate attribute.

Also @brendanzab I seem to recall (though really should check!) that the "info-string" of code fences can contain information besides the language of some form after the space, up-to newline. I hadn't seen this in pulldown_cmark though. Such that one could imagine something similar to:

```latex render
```

@CAD97

CAD97 commented Jun 10, 2019

Per the CommonMark dingus at least, the language-tag class is the up-to-first-whitespace and the rest of the line is ignored. That would indeed mean that it's available for "extra" instructions to the markdown engine. It currently is not used in any way.

Of course, any solution that uses the block info-string doesn't work for inline, so the general extension mechanism (whether it be {. .} or whatever) would have to be used.
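
As a sketch of how the unused remainder of a block info-string could carry such instructions: the language tag is everything up to the first whitespace, and the rest is kept for the consumer. The function names and the `render` keyword are assumptions taken from the `latex render` example above, not an existing convention.

```javascript
// Splits a fenced-code info string into its language tag and the remainder.
function parseInfoString(info) {
    const match = /^(\S+)\s*(.*)$/.exec(info.trim());
    if (!match) return { lang: '', rest: '' };
    return { lang: match[1], rest: match[2] };
}

// Hypothetical policy: render the block only when the remainder contains "render".
function wantsRender(info) {
    return parseInfoString(info).rest.split(/\s+/).includes('render');
}
```

For example, `latex render` would be rendered while a plain `latex` block would still be shown as source, which addresses the use/mention concern for blocks only.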

@SchrodingerZhu

@Timmmm

Timmmm commented Apr 4, 2021

Given that $ and $$ are the de facto standard, Pandoc hasn't experienced any issues with that syntax, and there's an implementation of it, is there a compelling reason to do anything else?

@SchrodingerZhu

SchrodingerZhu commented Apr 5, 2021 via email

@YJDoc2

YJDoc2 commented Sep 29, 2021

Hey, I was going through the discussion, and I have a suggestion which might not work, or might solve a couple of problems:
Add a CustomTag, with syntax as @custom( _ , _ ) which will have at least two arguments. Both can be quoted strings, which I feel might reduce the things to check for in parsing. This tag will effectively act as a hook point for libraries which want to extend or provide custom processing , such as for images or custom references and asset systems.
The rendering condition for this will be :

  1. Something in the processing pipeline should convert this into one of the other tags
  2. If this tag is found at time of final render, the second argument of this will be rendered (with/without)? HTML escaping

So for inline math, we can do

@custom("math","1+2+3")

which then can be processed by math processing section to produce appropriate latex stuff (:sweat_smile:)

The escaping of this can also be of a simple format: writing a \ before @ will escape the @, and then the rest of the text starting from custom... will be parsed as normal markdown.
As I said, this would not only solve inline math issue, but also allow others to extend syntax as required if needed for their applications.
Of course, this will have issues, and two of them I see are :

  1. Support for multiline second argument : if we decide to support multiline second argument, we will need to process multiline argument given for second correctly with appropriate bracket and quote matching
  2. Type of arguments : While writing this I have assumed the arguments to be quoted strings, so that we just have to check for escaped quotes and everything else is part of the string. If we allow unquoted text as argument, then we also have to make sure that brackets and other stuff is also matched correctly.
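
A minimal sketch of the single-line, quoted-string variant of this proposal, to show how little parsing it needs. Everything here is hypothetical (`CUSTOM_TAG`, `extractCustomTags`, and the returned field names); only `\"` is unescaped inside arguments so that TeX backslashes pass through untouched.

```javascript
// Matches @custom("kind","payload") unless the @ is preceded by a backslash.
// Each argument is a double-quoted string that may contain backslash escapes.
const CUSTOM_TAG = /(?<!\\)@custom\(\s*"((?:[^"\\]|\\.)*)"\s*,\s*"((?:[^"\\]|\\.)*)"\s*\)/g;

function extractCustomTags(text) {
    const unquote = (s) => s.replace(/\\"/g, '"'); // leave other backslashes alone
    const tags = [];
    for (const m of text.matchAll(CUSTOM_TAG)) {
        tags.push({ kind: unquote(m[1]), payload: unquote(m[2]), index: m.index });
    }
    return tags;
}
```

A later stage in the pipeline could then dispatch on `kind` (for example `"math"`) and fall back to emitting `payload` as text when nothing handles the tag.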

I feel this along with @CAD97 's codeblock math can solve the math problem, and also allow others to extend the syntax as required, and solve additional problems such as adding support of custom id extension or other things as people need in their own crates.

As said in start this might not work, so would really like to hear opinions on this.

Thanks :)

@YJDoc2

YJDoc2 commented Oct 5, 2021

Sorry to bother , but @marcusklaas , is this still active? 😅😅 If so, can you please take a look at my comment above? Thanks :)

@dkasak

dkasak commented Oct 6, 2021

@YJDoc2: as an extensible extension syntax, your suggested syntax is not bad. But I think the primary concern is that it is yet another different and completely non-standard way of doing things which is incompatible with all other implementations.

I think the primary question is not what a completely new syntax would look like, but rather why the widely supported $...$ and $$...$$ syntax shouldn't be adopted.

@CAD97

CAD97 commented Oct 6, 2021

> I think the primary question is [...] why the widely supported $...$ and $$...$$ syntax shouldn't be adopted.

The big one, even if you solve the "two $ in a paragraph turns into math when undesired" problem, is that it's not a widely supported markdown feature. (With testing, apparently MathOverflow actually does!)

Instead, most markdown renderers which "support" math syntax do it the naive (and problematic) way, which you can already do with pulldown-cmark: just render the markdown, then run KaTeX or MathJax on the output. This works okayish for very simple stuff, but you will quickly run into clashes where the math is interpreted as markdown.

For minimal support, a markdown engine has to treat $ pairs/fences as escapes from markdown syntax, as you're now writing syntax for a different language (LaTeX).

But it gets wrinkly, quickly. You also have to understand (a subset of) your embedded language in order to understand when it ends. For MathJax style math, this means at least allowing \$ in the math (passing through unchanged) as well as unescaped $ within matching {}.
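
A sketch of the minimal "understand when it ends" logic just described: skip backslash-escaped characters such as `\$`, and ignore any `$` that sits inside unescaped `{}`. The function name `findClosingDollar` is hypothetical, and this deliberately knows nothing else about TeX.

```javascript
// Returns the index of the `$` that closes a math span whose content begins at
// `start`, or -1 if there is none.
function findClosingDollar(text, start) {
    let depth = 0;
    for (let i = start; i < text.length; i++) {
        const ch = text[i];
        if (ch === '\\') { i++; continue; }                  // skip \$, \{, \}, ...
        if (ch === '{') depth++;
        else if (ch === '}') depth = Math.max(0, depth - 1); // tolerate stray }
        else if (ch === '$' && depth === 0) return i;
    }
    return -1;
}
```

Given the content of `$a + \text{b + $c$ + d} + e$`, the inner `$c$` is skipped because it is at brace depth 1, and the span ends at the final `$`.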

To be honest, I'm beginning to think the only somewhat reasonable approach (for $ fences) is to run the KaTeX/MathJax pass before the markdown parser.

@ratmice

ratmice commented Oct 6, 2021

I agree with @CAD97 almost entirely about pre-parsing.

The reason that I like the consistent-attribute-syntax extension + code fences is that it gives a pre-parser of $ and $$ a syntax to output and pass on to the markdown parser (assuming we could agree to support that extension). The attributes also provide a path to solving the use-mention distinction problem that was stated above.

I need a larger subset of latex than either KaTeX/MathJax support (i.e. I convert markdown to TeX, and the markdown contains code fences with TeX code to be rendered). The same problem occurs if say you want to embed a graph in markdown and pass it to a graph-drawing algorithm for rendering.

So my preference is to punt on $ and other syntax woes, adopt an approach which works for arbitrary renderers, then let people preparse $ into that.

@CAD97

CAD97 commented Oct 6, 2021

@ratmice : if passing the data through (mostly) unmodified is the only goal (and not human readability, since it's an internal step after preparsing), then the code block technique is sufficient. You can use however many ` are required to escape their presence in the data embed, and tag it with whatever textual prefix you aren't likely to accidentally use. And blocks are trivial, since you can use the language tag to stick the metadata.
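
As a sketch of the "however many ` are required" step: pick a backtick fence one longer than the longest backtick run in the payload, and pad with spaces when the payload itself starts or ends with a backtick. `wrapInCodeSpan` and the default `$` prefix and suffix are illustrative choices, not an agreed syntax.

```javascript
// Wraps `tex` in a code span that is safe regardless of backticks in the content.
function wrapInCodeSpan(tex, prefix = '$', suffix = '$') {
    const runs = tex.match(/`+/g) || [];
    const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
    const fence = '`'.repeat(longest + 1);
    // CommonMark strips one leading and trailing space from a code span when both
    // are present, so padding keeps edge backticks from merging with the fence.
    const pad = /^`|`$/.test(tex) ? ' ' : '';
    return `${prefix}${fence}${pad}${tex}${pad}${fence}${suffix}`;
}
```

A preparser could emit this form for each math span it finds and leave the rest of the document to the markdown parser.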

A general extension point for markdown would be great to have, but I don't think pulldown-cmark should be defining one. (That would be the domain of CommonMark, the actual specification.)

@ratmice

ratmice commented Oct 6, 2021

@CAD97 Yes, and I do agree. The specific extension I was referring to is https://talk.commonmark.org/t/consistent-attribute-syntax/272

I mentioned it in my comment a long time ago; I probably should have linked to it again so people didn't have to search through the plethora of comments, apologies. I haven't quite followed what is going on with this extension, or whether any other sufficient proposals exist.

@cben

cben commented Oct 6, 2021

> it's not a widely supported markdown feature ...
>
> Instead, most markdown renderers which "support" math syntax do it the naive (and problematic) way,

I wish I had kept some record in https://github.com/cben/mathdown/wiki/math-in-markdown of which tools take such buggy shortcuts. I've followed bug trackers in many tools, and the general trajectory seems to be:

  1. a majority of them start by buggy integration of KaTeX/MathJax after (or before) rendering.
  2. some users bump into bugs
  3. after enough user pressure, at least 50% become nearly bug free — either by doing it Right, or piling on enough kludges.
    (EDIT: it feels like ~half, but I don't have any numbers to back this up)

Running the math render before markdown is also buggy! You'll render math inside indented literal blocks, fenced literal blocks, HTML islands (including literals like <pre>) etc.

=> You really do need the markdown parser to understand the $ fences as a literal-like syntax. It doesn't need to fully parse the math, but it does need to leave it untouched and note where it starts and ends for later rendering by KaTeX/MathJax.
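
To illustrate why literal regions matter, here is a deliberately simplified sketch that blanks out fenced blocks and backtick code spans before looking for `$...$`, and records where each math span starts and ends. It is not a markdown parser: indented code blocks, HTML islands, and longer closing fences are not handled, and the function names are hypothetical.

```javascript
// Replaces fenced code blocks and code spans with spaces, preserving offsets.
function maskLiterals(markdown) {
    const blank = (match) => match.replace(/[^\n]/g, ' ');
    return markdown
        .replace(/^(`{3,}|~{3,})[^\n]*\n[\s\S]*?^\1[ \t]*$/gm, blank) // fenced blocks
        .replace(/(`+)[\s\S]*?\1/g, blank);                           // code spans
}

// Finds $...$ spans outside literals; start/end index into the original text.
function findMathOutsideLiterals(markdown) {
    const masked = maskLiterals(markdown);
    const spans = [];
    for (const m of masked.matchAll(/\$([^\s$](?:[^$]*[^\s$])?)\$/g)) {
        spans.push({ start: m.index, end: m.index + m[0].length, tex: m[1] });
    }
    return spans;
}
```

Running the same `$...$` search on the unmasked text would also pick up dollars inside code spans and fenced blocks, which is the class of bug described above.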

And yes nesting: $math... \text{non-math $nested math$ non-math} ...math$ makes life even more ...interesting 😁. Though supporting such nesting correctly is a corner case. IMHO any markdown parser that only understands the basic escaping $...\$...$ correctly is Acceptably Good because for those rare cases there exists a workaround: use the other style of fences inside: $math... \text{non-math \(nested math\) non-math} ...math$ or \(math... \text{non-math $nested math$ non-math} ...math\).

@dkasak

dkasak commented Oct 6, 2021

> [...] is that it's not a widely supported markdown feature.

Well, I count pandoc and MathOverflow as "widely" in this instance. It is the closest thing we have to a standard on the matter. I certainly have a lot of documents which use the syntax, and I'm interested in being able to parse them with Rust.

> But it gets wrinkly, quickly. You also have to understand (a subset of) your embedded language in order to understand when it ends.

I realize it's tricky, but pandoc does it, so we know it's not impossible! And it works well too.

> So my preference is to punt on $ and other syntax woes, adopt an approach which works for arbitrary renderers, then let people preparse $ into that.

But if parsing the $-based syntax is as tricky as @CAD97 describes, won't all those users preparsing $ into the arbitrary renderer syntax need to implement this tricky logic themselves? Isn't it better to implement this tricky logic correctly once, in pulldown-cmark?

@conanchen

just implement one, better than nothing.

@Eumeryx

Eumeryx commented Feb 6, 2022

I suggest using the math delimiter from this talk:

  • ${ and }$ for inline math
  • $${ and }$$ for block math

Using them has the following benefits:

  1. reduces ambiguities wrt other inline use of dollars, e.g: 100$ on that site and 150$ on the other
  2. keeps asymmetry among opening and closing delimiters, so that it is possible to automatically exclude pending delimiters to avoid errors or to perform massive substitutions with other delimiters
  3. it is still compatible with latex: indeed opening and closing {} are hidden by latex processors as superfluous symbols (but they are not in the meta language!), so it is back compatible with other solutions.
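
A quick sketch of how the asymmetric delimiters could be recognized. The regular expression and the name `findBracedMath` are illustrative only, and escaping rules are left out.

```javascript
// $${ ... }$$ is tried first so that block math is not mistaken for inline math.
const BRACED_MATH = /\$\$\{([\s\S]*?)\}\$\$|\$\{([\s\S]*?)\}\$/g;

function findBracedMath(text) {
    const spans = [];
    for (const m of text.matchAll(BRACED_MATH)) {
        const display = m[1] !== undefined;   // first alternative matched
        spans.push({ display, tex: display ? m[1] : m[2] });
    }
    return spans;
}
```

The price sentence from benefit 1 contains no `${`, so it yields no spans, and braces nested inside the math are fine as long as the span ends with `}$` or `}$$`.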

@cben

cben commented Feb 14, 2022

These benefits are good points. @Netsaver, I'm curious, is there any markdown software that specifically recognizes these delimiters? Googling, I also found you described the same syntax in pbek/QOwnNotes#529 but it's unclear to me whether there is a markdown processor actually requiring ${...}$ or do you just add { and } as a personal convention with processors that recognize $...$?

It's certainly possible to use that as just a convention, which begs the question should a tool care — why not recognize just dollars and let users insert braces if they wish — but that loses on benefits 1 and 2...

@CAD97

CAD97 commented May 23, 2022

GFM now supports math notation using $$:

https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/writing-mathematical-expressions

$$
1 + \text{Can I embed **markdown** in \texttt{\\text}}?
$$

$$ 1 + \text{Can I embed markdown in \texttt{\text}}? $$

(it is there, check the edit history... but it's not actually rendering on my end 🙃)

$`1`$ renders as

$1$

What's somewhat interesting is that GFM doesn't allow escaping $ with \$ (likely due to their use of MathJax not allowing such escapes), instead requiring escapes to be written <span>$</span>.

@Timmmm

Timmmm commented May 23, 2022

Yeah I assume we've all seen this critique of Github's implementation. It makes me all the more convinced that @Eumeryx's suggestion is a great idea.

@cben that article should answer your question "why not recognize just dollars and let users insert braces if they wish"!

@daira

daira commented Oct 18, 2022

The critique of GitHub's implementation now admits that "Most of this has been fixed now". The issues raised there were not at all fundamental.

My two-penneth: the highest priority for $\LaTeX$ in Rust documentation is that it is usable for the people reading and writing it. This strongly favours the more concise option of using $ and $$ delimiters. The more extraneous cruft, the less usable it will be for the intended purpose, which is to clearly convey the mathematical content both in rendered documentation, and when reading the original Rust source in a text editor or IDE. Mathematicians are very used to reading and writing $\LaTeX$ source, most often with the $ and $$ delimiters. This is even more true now that you can use Markdown-embedded $\LaTeX$ on the web at sites that many mathematically-oriented software engineers use every working day, including GitHub and HackMD, with precisely that syntax.

In my experience as a cryptographic engineer, there is a very significant usability gap between the $a + b$ syntax and, for example, the :math:‵a + b‵ syntax sometimes used when embedding math in reStructuredText. To understand how much overhead the latter results in, see this example (scroll down to the Conventions section).

All of the syntactic ambiguities with $...$ are entirely solvable, and in practice the solutions used in pandoc are sufficient. If users hit a case where their documentation is not rendered as intended, they will fix it. (They need to proof-read their rendered docs anyway, since there are many existing potential causes of misrendering!) Of course the rules that are chosen, whether from pandoc or something else, should be precisely documented.

@tgross35

tgross35 commented Jan 10, 2023

There’s some good news here: as of Chrome 109, MathML will be supported. That means the latest versions of Chrome, Firefox, Safari and Edge will all support it in about a month. https://caniuse.com/mathml

If you’re not familiar, it’s an XML-style format for displaying equations that can just be in an HTML block and rendered properly by the browser. So, this sidesteps the KaTeX vs. MathJax issue and instead means we just need a TeX->MathML parser, of which there are many (I don’t know of any written in Rust, though).

Test page for MathML if you want to check your browser: https://www.w3.org/Math/testsuite/mml2-testsuite/index.html (the files in TortureTests/Complexity are the best overview tests)

@tgross35

For the task of tex->mathml, this crate is available: https://crates.io/crates/latex2mathml. License is MIT

There haven't been any updates since 2020, but it looks like it's a "done" crate. If a dependency is an issue, the entire crate could likely be pulled into this repository.

@Martin1887
Collaborator

This is a complex but really useful feature. I will review the open pull requests and the different implementations in other parsers like mdBook relatively soon, but the final implementation will take some time.

@gengjun

gengjun commented Oct 3, 2023

wow, this thread has been open for 8 years! do we now have a way to just skip $ and double $$ pairs and let browser parse the math on client side ?

@tgross35

tgross35 commented Oct 3, 2023

I opened a proposal to the commonmark spec to add support for display blocks, which would theoretically allow math as well as mermaid/graphviz: commonmark/commonmark-spec#745

@Martin1887 Martin1887 added this to the v0.11 milestone Feb 10, 2024
@Martin1887
Collaborator

Closed by #734.
