-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTML parser test failures after #6993 #6439
Comments
This is intended to match Chromium and WebKit. Fixes #5117.
cc @whatwg/html-parser |
Thanks @TRowbotham! The rationale for changing the reprocess the token part was this:
So, evidently, things wouldn't work with "process the token using the rules for". What about my original question? Given this case:
What happens if the spec says to reprocess the token? Does it reprocess the token still in the "in foreign content" mode, and again say to reprocess the token (i.e. get stuck in an infinite loop)? If yes, maybe it can be fixed by setting a flag before reprocessing the token, and checking for that flag in the tree construction dispatcher (and resetting the flag there also)? |
Yes, the given case causes the parser to get stuck in an infinite loop if it just says reprocess the token. It seems I slightly misunderstood the meaning of reprocess the token, but perhaps it works in our favor. I understood it as "reprocess the token using the current insertion mode", and this seemingly appears to yield the desired result; the given case does not result in an infinite loop and the 2 test failures I was seeing go away. Is there anything wrong with changing the spec to read "reprocess the token using the current insertion mode" instead of "Process the token using the rules for the "in body" insertion mode."? If the intent is to break out of foreign content to whatever the last non-foreign element was, then using whatever the last insertion mode was kinda makes sense. |
I can't confidently say whether or not this is the right interpretation. This should probably be made clearer. 🙂
As long as it does what we need and is unambiguous, it's good. I'd like to hear what others think, in particular @hsivonen |
Is there any difference between: + <p>Reprocess the token according to the rules given in the section corresponding to the current
+ <span>insertion mode</span> in HTML content.</p> and + <p>Process the token according to the rules given in the section corresponding to the current
+ <span>insertion mode</span> in HTML content.</p> (i.e., "Reprocess the token" vs "Process the token") The clause in "Any other end tag" in foreign content says "Process the token ...". |
PR: #6455 |
This is the sort of thing that's hard to review without writing the code. I'll try to get to the code soon. |
The regression was introduced in f690ad9 (whatwg#6399). For the case: <table><math><p>foo the <p> token would not have foster parenting enabled, thus inserting it into the table. Fixes whatwg#6439.
corresponding changes in HTML spec are: whatwg/html@f690ad9 and follow-up discussion at whatwg/html#6439
* wip on updating html5lib-tests * fix up parse error parsing * add better debug output * wip * wip * wip * wip * adjust all switches to BogusComment (according to html5gum) * wip * wip * wip * wip * wip * wip * wip (test3 done) * fix test1 * wip on entities.test * get rid of addnl_allowed in charref tokenizer * remove bogusname??? * fix escapeFlag.test: End tag surrounded by bogus comment in RCDATA or RAWTEXT (in state RawData(Rawtext)) * update html5lib tests * Revert "remove bogusname???" This reverts commit 575b077. * wip restore bogusname * more bugfixes * Revert "wip restore bogusname" This reverts commit eb28165. * fix a bug when peeking characters in BeforeAttributeValue * make eat() pre-process input characters input where it matters (JSON-escaped): "<!DOCTYPE0\r\nPUBLIC'" * update charref states * add regression tests, skip broken test * fix hang * fix bug where ignore_lf was not reset during unconsuming * fix webkit02.dat-26 test * fix wbekit02.dat-22 * fix ack self-closing * fix tests26.dat-19 * fix foreign-fragment.dat-65 corresponding changes in HTML spec are: whatwg/html@f690ad9 and follow-up discussion at whatwg/html#6439 * fix search-element.dat-0 * fix search-element.dat-1 * fix bug in charref tokenizer wrt newline normalization
I'm seeing 2 test failures with the recent change from #6399; in particular, the change to use the "in body insertion mode" directly, instead of reprocessing with the current insertion mode. In both cases[1][2], we lose out on foster parenting with the switch to using the "in body insertion mode", which causes the
<p>baz
part to be inserted into the table rather than outside the table. Switching back to previous behavior where reprocessing the token uses the current insertion mode seems to resolve the failures.[1] https://github.com/html5lib/html5lib-tests/blob/master/tree-construction/tests9.dat#L270
[2] https://github.com/html5lib/html5lib-tests/blob/master/tree-construction/tests10.dat#L236
The text was updated successfully, but these errors were encountered: