Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MathJax 3 completion handler fails if the content contains processing instructions #2662

Closed
bthomale opened this issue Mar 24, 2021 · 8 comments
Labels
Accepted Issue has been reproduced by MathJax team Code Example Contains an illustrative code example, solution, or work-around Fixed Test Needed v3 v3.1
Milestone

Comments

@bthomale
Copy link

bthomale commented Mar 24, 2021

Issue Summary

We found a bug where, when using tex-mml-chtml conversion, if the input file is xhtml containing a processing instruction, the completion handler never fires. Additionally, the javascript will show the following error in the console:

Unhandled Promise Rejection: TypeError: t.getAttribute is not a function. (In 't.getAttribute(e)', 't.getAttribute' is undefined)

We've replicated this in WebKit / Safari on macOS, iOS. And also in the embedded webkit view in Android.

Essentially, it's iterating through the content looking at the "class" attribute of various nodes. Somehow it is assuming they're all elements, and when it hits a processing instruction node (where getAttribute is not defined) it throws this error. You can see exactly what is happening if you turn on "break on exceptions" in the WebKit debugger.

We noticed that it only happens during TEX input processing, which we don't need. Turning off TEX works around the problem, which is what we have done. But it's still a bug, so I thought you might be interested in looking at it.

I've included 3 different versions of the file to illustrate this:

  1. tex-mml-chtml-with-pi.xhtml

This is the one that fails. It has a completion handler defined that logs "done" to the console. If you open up the console you'll see the error I described, and "done" is never logged.

  1. tex-mml-chtml-no-pi.xhtml

This illustrates that without the pi the example completes successfully and logs "done" to the console.

  1. mml-chtml-with-pi.xhtml

This illustrates that with the pi, but with mml-chtml ONLY (no TEX) it also completes successfully and logs "done" to the console.

Steps to Reproduce:

  1. Open the attached file tex-mml-chtml-with-pi.xhtml
  2. Open the debug console
  3. Refresh the page. You'll see the error and that "done" is never logged.

Technical details:

  • MathJax Version: 3.1.2
  • Client OS: macOS 11.2.3
  • Browser: Safari 14.0.3

^ I was able to replicate with that. Easier than trying to reproduce with mobile.

Supporting information:

tex-pi-bug.zip

See the above attached zip for the promised example files. Just open them in Safari.

@dpvc
Copy link
Member

dpvc commented Mar 24, 2021

Thanks for the report. I can reproduce the issue. Frankly, I have never seen these processing instructions before, so was unaware even of their existence! I will have to modify the code to skip over non-Element nodes.

@dpvc dpvc added this to the 3.1.3 milestone Mar 24, 2021
@dpvc dpvc added Accepted Issue has been reproduced by MathJax team v3 labels Mar 24, 2021
@dpvc dpvc self-assigned this Mar 24, 2021
@pkra
Copy link
Contributor

pkra commented Mar 24, 2021

FWIW, XML processing instructions are not valid in (x)html5.

@dpvc
Copy link
Member

dpvc commented Mar 24, 2021

Thanks @pkra. I'm not able to find a reference to that. I see where they are not allowed in HTML5, but I can't find a restriction in XHTML5. Do you have a citation for that?

@pkra
Copy link
Contributor

pkra commented Mar 24, 2021

(I don't mean for this to have any bearing on what the MathJax team decides to support.)

IIUC it's difficult to find because there's not really such a thing as XHTML5. E.g., https://en.wikipedia.org/wiki/HTML5#XHTML_5_(XML-serialized_HTML_5) writes

XHTML 5 is simply XML-serialized HTML 5 data (that is, HTML 5 constrained to XHTML's strict requirements, e.g., not having any unclosed tags), sent with one of XML media types.

So HTML5 does not allow processing instructions and XML serialization only adds restrictions, it does not remove any.
In any case the nu validator will complain about these documents (though initially about the missing doctype, then about the intial instruction).

That being said, given that this is coming from VitalSource (who, ahem, should really become a MathJax sponsor?), it probably comes from an epub file and, alas, epubcheck does allow processing instructions. [Still, this should really be sorted out in production]

@dpvc
Copy link
Member

dpvc commented Mar 24, 2021

@pkra, thanks for the additional information. I was mostly just curious about what the various standards allow. I have worked out a means of ignoring those nodes that should resolve the issue, but your comment piqued my interest.

@bthomale
Copy link
Author

As far as I know, processing instructions are valid in xhtml (and valid in epub files). And yes, this came from debugging an issue with a real epub on our platform, so people are using these.

Thanks for fixing this for us!

@dpvc
Copy link
Member

dpvc commented Mar 24, 2021

I have made a pull request to resolve the issue. In the meantime, you can use the following configuration to work around the problem for now.

MathJax = {
  startup: {
    ready: function() {
      //
      //  Patch the HTMLAdaptor's kind() method to return '' for node types we can't process.
      //
      const {HTMLAdaptor} = MathJax._.adaptors.HTMLAdaptor;
      HTMLAdaptor.prototype.kind = function (node) {
        const n = node.nodeType;
        return (n === 1 || n === 3 || n === 8 ? node.nodeName.toLowerCase() : '');      
      };
      
      //
      //  Subclass the DOM string walker to handle unknown node types
      //
      const {HTMLDomStrings} = MathJax._.handlers.html.HTMLDomStrings;
      class myDomStrings extends HTMLDomStrings {
        handleContainer(node, ignore) {
          if (this.adaptor.kind(node) === '') {
            this.pushString();
            return [this.adaptor.next(node), ignore];
          } else {
            return super.handleContainer(node, ignore);
          }
        }
      }
      
      //
      //  Use our DOM string class instead of the usual one
      //
      MathJax.config.options = {DomStrings: new myDomStrings()};
      
      //
      //  Do the usual processing
      //
      MathJax.startup.defaultReady();
      
      //
      //  Inform us when it's done
      //
      MathJax.startup.promise.then(function() {
       	console.log("done");
      });
    }
  }
};

@dpvc
Copy link
Member

dpvc commented Mar 24, 2021

PS, this could be put into a separate script file that is loaded before MathJax itself if you need it for multiple pages.

@dpvc dpvc added the Code Example Contains an illustrative code example, solution, or work-around label Apr 1, 2021
dpvc added a commit to mathjax/MathJax-src that referenced this issue Apr 8, 2021
Have HTMLDomStrings skip node types that it can't process. (mathjax/MathJax#2662)
@dpvc dpvc added Merged Merged into develop branch and removed Ready for Review labels Apr 8, 2021
@dpvc dpvc added Fixed v3.1 and removed Merged Merged into develop branch labels Apr 27, 2021
@dpvc dpvc closed this as completed Apr 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Issue has been reproduced by MathJax team Code Example Contains an illustrative code example, solution, or work-around Fixed Test Needed v3 v3.1
Projects
None yet
Development

No branches or pull requests

3 participants