MathJax 3 completion handler fails if the content contains processing instructions #2662

bthomale · 2021-03-24T16:02:47Z

Issue Summary

We found a bug where, when using tex-mml-chtml conversion, if the input file is xhtml containing a processing instruction, the completion handler never fires. Additionally, the javascript will show the following error in the console:

Unhandled Promise Rejection: TypeError: t.getAttribute is not a function. (In 't.getAttribute(e)', 't.getAttribute' is undefined)

We've replicated this in WebKit / Safari on macOS, iOS. And also in the embedded webkit view in Android.

Essentially, it's iterating through the content looking at the "class" attribute of various nodes. Somehow it is assuming they're all elements, and when it hits a processing instruction node (where getAttribute is not defined) it throws this error. You can see exactly what is happening if you turn on "break on exceptions" in the WebKit debugger.

We noticed that it only happens during TEX input processing, which we don't need. Turning off TEX works around the problem, which is what we have done. But it's still a bug, so I thought you might be interested in looking at it.

I've included 3 different versions of the file to illustrate this:

tex-mml-chtml-with-pi.xhtml

This is the one that fails. It has a completion handler defined that logs "done" to the console. If you open up the console you'll see the error I described, and "done" is never logged.

tex-mml-chtml-no-pi.xhtml

This illustrates that without the pi the example completes successfully and logs "done" to the console.

mml-chtml-with-pi.xhtml

This illustrates that with the pi, but with mml-chtml ONLY (no TEX) it also completes successfully and logs "done" to the console.

Steps to Reproduce:

Open the attached file tex-mml-chtml-with-pi.xhtml
Open the debug console
Refresh the page. You'll see the error and that "done" is never logged.

Technical details:

MathJax Version: 3.1.2
Client OS: macOS 11.2.3
Browser: Safari 14.0.3

^ I was able to replicate with that. Easier than trying to reproduce with mobile.

Supporting information:

tex-pi-bug.zip

See the above attached zip for the promised example files. Just open them in Safari.

dpvc · 2021-03-24T16:28:00Z

Thanks for the report. I can reproduce the issue. Frankly, I have never seen these processing instructions before, so was unaware even of their existence! I will have to modify the code to skip over non-Element nodes.

pkra · 2021-03-24T19:58:24Z

FWIW, XML processing instructions are not valid in (x)html5.

dpvc · 2021-03-24T20:11:24Z

Thanks @pkra. I'm not able to find a reference to that. I see where they are not allowed in HTML5, but I can't find a restriction in XHTML5. Do you have a citation for that?

pkra · 2021-03-24T20:46:16Z

(I don't mean for this to have any bearing on what the MathJax team decides to support.)

IIUC it's difficult to find because there's not really such a thing as XHTML5. E.g., https://en.wikipedia.org/wiki/HTML5#XHTML_5_(XML-serialized_HTML_5) writes

XHTML 5 is simply XML-serialized HTML 5 data (that is, HTML 5 constrained to XHTML's strict requirements, e.g., not having any unclosed tags), sent with one of XML media types.

So HTML5 does not allow processing instructions and XML serialization only adds restrictions, it does not remove any.
In any case the nu validator will complain about these documents (though initially about the missing doctype, then about the intial instruction).

That being said, given that this is coming from VitalSource (who, ahem, should really become a MathJax sponsor?), it probably comes from an epub file and, alas, epubcheck does allow processing instructions. [Still, this should really be sorted out in production]

dpvc · 2021-03-24T20:49:46Z

@pkra, thanks for the additional information. I was mostly just curious about what the various standards allow. I have worked out a means of ignoring those nodes that should resolve the issue, but your comment piqued my interest.

bthomale · 2021-03-24T21:03:15Z

As far as I know, processing instructions are valid in xhtml (and valid in epub files). And yes, this came from debugging an issue with a real epub on our platform, so people are using these.

Thanks for fixing this for us!

…athJax#2662)

dpvc · 2021-03-24T23:58:09Z

I have made a pull request to resolve the issue. In the meantime, you can use the following configuration to work around the problem for now.

MathJax = {
  startup: {
    ready: function() {
      //
      //  Patch the HTMLAdaptor's kind() method to return '' for node types we can't process.
      //
      const {HTMLAdaptor} = MathJax._.adaptors.HTMLAdaptor;
      HTMLAdaptor.prototype.kind = function (node) {
        const n = node.nodeType;
        return (n === 1 || n === 3 || n === 8 ? node.nodeName.toLowerCase() : '');      
      };
      
      //
      //  Subclass the DOM string walker to handle unknown node types
      //
      const {HTMLDomStrings} = MathJax._.handlers.html.HTMLDomStrings;
      class myDomStrings extends HTMLDomStrings {
        handleContainer(node, ignore) {
          if (this.adaptor.kind(node) === '') {
            this.pushString();
            return [this.adaptor.next(node), ignore];
          } else {
            return super.handleContainer(node, ignore);
          }
        }
      }
      
      //
      //  Use our DOM string class instead of the usual one
      //
      MathJax.config.options = {DomStrings: new myDomStrings()};
      
      //
      //  Do the usual processing
      //
      MathJax.startup.defaultReady();
      
      //
      //  Inform us when it's done
      //
      MathJax.startup.promise.then(function() {
       	console.log("done");
      });
    }
  }
};

dpvc · 2021-03-24T23:58:52Z

PS, this could be put into a separate script file that is loaded before MathJax itself if you need it for multiple pages.

Have HTMLDomStrings skip node types that it can't process. (mathjax/MathJax#2662)

dpvc added this to the 3.1.3 milestone Mar 24, 2021

dpvc added Accepted Issue has been reproduced by MathJax team v3 labels Mar 24, 2021

dpvc self-assigned this Mar 24, 2021

dpvc added a commit to mathjax/MathJax-src that referenced this issue Mar 24, 2021

Have HTMLDomStrings skip node types that it can't process. (mathjax/M…

e686e53

…athJax#2662)

dpvc mentioned this issue Mar 24, 2021

Have HTMLDomStrings skip node types that it can't process. (mathjax/MathJax#2662) mathjax/MathJax-src#658

Merged

dpvc added Ready for Review Test Needed labels Mar 24, 2021

dpvc removed their assignment Mar 24, 2021

dpvc added the Code Example Contains an illustrative code example, solution, or work-around label Apr 1, 2021

dpvc added a commit to mathjax/MathJax-src that referenced this issue Apr 8, 2021

Merge pull request #658 from mathjax/issue2662

3d59e72

Have HTMLDomStrings skip node types that it can't process. (mathjax/MathJax#2662)

dpvc added Merged Merged into develop branch and removed Ready for Review labels Apr 8, 2021

dpvc added Fixed v3.1 and removed Merged Merged into develop branch labels Apr 27, 2021

dpvc closed this as completed Apr 27, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MathJax 3 completion handler fails if the content contains processing instructions #2662

MathJax 3 completion handler fails if the content contains processing instructions #2662

bthomale commented Mar 24, 2021 •

edited

Loading

dpvc commented Mar 24, 2021

pkra commented Mar 24, 2021

dpvc commented Mar 24, 2021

pkra commented Mar 24, 2021 •

edited

Loading

dpvc commented Mar 24, 2021

bthomale commented Mar 24, 2021

dpvc commented Mar 24, 2021

dpvc commented Mar 24, 2021 •

edited

Loading

MathJax 3 completion handler fails if the content contains processing instructions #2662

MathJax 3 completion handler fails if the content contains processing instructions #2662

Comments

bthomale commented Mar 24, 2021 • edited Loading

Issue Summary

Steps to Reproduce:

Technical details:

Supporting information:

dpvc commented Mar 24, 2021

pkra commented Mar 24, 2021

dpvc commented Mar 24, 2021

pkra commented Mar 24, 2021 • edited Loading

dpvc commented Mar 24, 2021

bthomale commented Mar 24, 2021

dpvc commented Mar 24, 2021

dpvc commented Mar 24, 2021 • edited Loading

bthomale commented Mar 24, 2021 •

edited

Loading

pkra commented Mar 24, 2021 •

edited

Loading

dpvc commented Mar 24, 2021 •

edited

Loading