Restrict the main element to be used once per document #3354

annevk · 2018-01-15T11:45:32Z

Multiple main elements are still allowed if all but one have the hidden attribute set.

This also slightly alters the definition of the body element as it effectively encompasses all content of a document, except for some metadata.

Fixes #100.

/acknowledgements.html ( diff )
/dom.html ( diff )
/grouping-content.html ( diff )
/indices.html ( diff )
/sections.html ( diff )

Multiple main elements are still allowed if all but one have the hidden attribute set. This also slightly alters the definition of the body element as it effectively encompasses all content of a document, except for some metadata. Fixes #100.

foolip · 2018-01-15T12:33:14Z

@tobie, I think there's a differ bug here, outside of the changes made. Note how https://html.spec.whatwg.org/multipage/dom.html#phrasing-content and https://whatpr.org/html/3354/dom.html#phrasing-content both say "Text, in the context of content models" but in https://whatpr.org/html/3354/97d8432...38465fa/dom.html it's "Text , in the context of content models" with more whitespace. Seems like this is the case in lots of other places too. Can you take a look?

tobie · 2018-01-15T13:04:53Z

@foolip: yeak that's an unfortunate but known issue of the HTML diff service that pr-preview uses. It adds line breaks all over the place, which obviously creates issues with inline elements.

foolip

Looking at which elements still have just "flow content" as their content model, and which get "flow content excluding main", I'm not sure I understand the reasoning.

The places where main is already disallowed are nav, aside, header and footer. And then header and footer are disallowed in address, dt, th.

Also disallowing main in address, dt and th seems fine, if not particularly useful. I don't quite see why it should be disallowed in article, section, blockquote, td, figure or form. It doesn't particularly make sense, but neither does dd, dialog, figcaption or li where it is still allowed.

I'd like to err on the side of allowing, basically.

foolip · 2018-01-15T12:40:04Z

source

@@ -15172,7 +15175,7 @@ interface <dfn>HTMLBodyElement</dfn> : <span>HTMLElement</span> {};
   <dd w-nohtml>Uses <code>HTMLBodyElement</code>.</dd>
  </dl>

-  <p>The <code>body</code> element <span>represents</span> the main content of the document.</p>
+  <p>The <code>body</code> element <span>represents</span> the contents of the document.</p>


To be clear, this is a change that could have been made without the other changes in this PR, right? This was inaccurate before as well, since headers and footers could also be inside the body. If that's not the right way to think of it, I might be missing something that I should be paying attention to elsewhere in these changes.

Yeah, I haven't actually looked at when this was written, but that seems correct.

It seems this description of the body element goes back to 2006, before the main element was introduced.

Thanks for checking that.

foolip · 2018-01-15T12:55:13Z

source


-  <p class="note">While there is no restriction as to the number of <code>main</code> elements in a
-  document, web developers are encouraged to stick to a single element.</p>
+  <p>A document must not have more than one <code>main</code> element, unless all but one of the


I don't think the difference between should and must will matter much, but is there some logic to when we use which? Looks like we have 12 "element must not" and 3 "element should not", but I can't really tell why.

Is it intentional that having only <main hidden> isn't allowed? Seems like a harmless thing to allow, if, for example, a script populates it and then removes the hidden attribute.

I think must is preferable if you can, but sometimes the requirement might be hard to fulfill and then should can be reasonable.

"must" wins in number of occurrences, so OK.

foolip · 2018-01-15T12:57:15Z

source

+   data-x="attr-hidden">hidden</code> attribute on those that are not current:
+
+   <pre>&lt;!doctype html>
+&lt;html lang=en-CA>


Good example!

foolip · 2018-01-15T13:19:58Z

source

@@ -16118,7 +16121,8 @@ Space is not the only void</pre>
   <dt><span data-x="concept-element-contexts">Contexts in which this element can be used</span>:</dt>
   <dd>Where <span>flow content</span> is expected.</dd>
   <dt><span data-x="concept-element-content-model">Content model</span>:</dt>
-   <dd><span>Flow content</span>, but with no <code>header</code>, <code>footer</code>, or <code>main</code> element descendants.</dd>
+   <dd><span>Flow content excluding main</span>, but with no <code>header</code> or


I'd prefer "Flow content excluding main" to be expanded in place, treating main more like other elements. Here: "Flow content, but with no header, footer or main element descendants." (unchanged, I think)

I started with that and expected the review feedback to go the other way given that repeating yourself too much is never great. Happy to flip it back though.

I may well have asked for the opposite if you did, who knows :)

I think the changes will be fewer if we trim the cases where main is disallowed, but let's start with figuring that out, and then inline the changes if the changes are no more than currently.

foolip · 2018-01-15T13:20:48Z

source

@@ -42744,7 +42779,8 @@ interface <dfn>HTMLTableCellElement</dfn> : <span>HTMLElement</span> {
   <dt><span data-x="concept-element-contexts">Contexts in which this element can be used</span>:</dt>
   <dd>Where <span>flow content</span> is expected.</dd>
   <dt><span data-x="concept-element-content-model">Content model</span>:</dt>
-   <dd><span>Flow content</span>, but with no <code>form</code> element descendants.</dd>
+   <dd><span>Flow content excluding main</span>, but with no <code>form</code> element


This is the form element, any reason to change this?

For the same reason I changed td and blockquote. If there's one main in a document it doesn't make sense as a descendant of them.

annevk · 2018-01-15T14:27:40Z

Given where main makes sense out of these elements it seems it basically has to be a child of either body or div. Given that it doesn't make sense as being "flow content". I've updated this PR as appropriate. While doing so I also found that the indexes needed updating too.

annevk · 2018-01-15T14:45:52Z

It'd be good to get feedback from @stevefaulkner on that last bit though as the W3C fork was in a somewhat inbetween state, where you could still nest main in td, blockquote, or section for example. I suspect that was an oversight, but would be good to be sure.

stevefaulkner · 2018-01-15T15:14:37Z

@annevk, section not an oversight, as there is no rule to stop use of a section which effectively contains everything in the document it seemed reasonable to allow in this case. Also given sections practical meaning as = div, it seemed OK to allow. As for blockquote and td yes they should be included in restrictions, it was an oversight.

foolip · 2018-01-15T15:59:00Z

@tobie, seems like something's wrong with https://whatpr.org/html/3354/97d8432...8f1b75f/dom.html, commit 8f1b75f reverted "Flow content excluding main" but it's shown as an addition.

foolip · 2018-01-15T15:54:41Z

source

@@ -15142,7 +15140,8 @@ interface <dfn>HTMLStyleElement</dfn> : <span>HTMLElement</span> {
   <dt><span data-x="concept-element-contexts">Contexts in which this element can be used</span>:</dt>
   <dd>As the second element in an <code>html</code> element.</dd>
   <dt><span data-x="concept-element-content-model">Content model</span>:</dt>
-   <dd><span>Flow content</span>.</dd>
+   <dd>Zero or one <code>main</code> element optionally intermixed with <span>flow


Should be zero or more to allow for <main hidden>, right?

foolip · 2018-01-15T16:29:04Z

Since we're down to debating what should happen with each case where flow content is allowed, here an exhaustive listing of the elements where "content model" has "flow content" in its description, and the validity of <main> within:

element	was allowed	now allowed
address	yes	no (change)
article	yes	no (change)
aside	no	no
blockquote	yes	no (change)
body	yes	yes
caption	yes	no (change)
dd	yes	no (change)
details	yes	no (change)
dialog	yes	no (change)
div	yes	yes
dt	yes	no (change)
fieldset	yes	no (change)
figcaption	yes	no (change)
figure	yes	no (change)
footer	no	no
form	yes	no (change)
header	no	no
li	yes	no (change)
main	yes	no (change)
nav	no	no
section	yes	no (change)
td	yes	no (change)
th	yes	no (change)

This is somewhat simplified because the content model of some elements, like figure, is more complicated.

tobie · 2018-01-15T16:55:44Z

@foolip: OK, seems there's an AWS issue that's not being properly handled by pr-preview. Have you observed this elsewhere?

foolip · 2018-01-15T17:21:50Z

@tobie, this is the first time I know that I've looked at a preview again after PR changes, so couldn't say if this has happened elsewhere.

annevk · 2018-01-15T19:11:23Z

@stevefaulkner I think I'd prefer trying it out with restricting it from within section too as I think it makes little sense, but that's subject to review from @sideshowbarker.

Speaking of which, it'd be good to have an HTML checker PR lined up as well so when this change is made folks can verify the impact on their own sites.

stevefaulkner · 2018-01-15T20:48:01Z

@annevk I am sanguine in regards to its restriction, as I recall it was restricted, but after discussion with Marcos Caceres I changed it to allow, but can't find the thread :-(

foolip · 2018-01-15T22:31:53Z

I'd also like to hear from @sideshowbarker if there are some principles about validator warnings/errors we should keep in mind, or indeed principles about conformance generally.

As it is, I'm inclined to err on the side of allowing even things that seemingly don't make much sense unless we think they are both plausible mistakes and would cause user confusion/harm. Going for just body and div as the allowed parents, I'd like to take a look at what proportion of documents out there would become invalid. I don't want to go overboard with measuring and statistics, but it would be very bad if we made lots of harmless things invalid because we hadn't thought them plausible.

marcoscaceres

Just noting that @foolip question, "Is it intentional that having only <main hidden> isn't allowed?" wasn't fully addressed yep. Might be worth clarifying that - so noting it in case it was forgotten.

Apart from that, LGTM!

annevk · 2018-01-16T07:54:34Z

I tried to think that through and I think requiring at least one without hidden is better, for progressive enhancement. Otherwise you'd end up with a mostly blank document if JavaScript is disabled, assuming <main> is used correctly to mark up the main contents.

foolip · 2018-01-16T09:01:25Z

I don't think we should make any attempt to ensure that documents have some useful content with JavaScript disabled. Currently, <body></body> is valid, and so is <table></table>, both cases that don't make sense on their own.

annevk · 2018-01-16T09:26:10Z

That is incorrect, we have a should requirement on that being not the case. See palpable content.

foolip · 2018-01-16T10:46:49Z

https://html.spec.whatwg.org/multipage/dom.html#palpable-content says:

As a general rule, elements whose content model allows any flow content or phrasing content should have at least one node in its contents that is palpable content and that does not have the hidden attribute specified.

Note: Palpable content makes an element non-empty by providing either some descendant non-empty text, or else something users can hear (audio elements) or view (video or img or canvas elements) or otherwise interact with (for example, interactive form controls).

This requirement is not a hard requirement, however, as there are many cases where an element can be empty legitimately, for example when it is used as a placeholder which will later be filled in by a script, or when the element is part of a template and would on most pages be filled in but on some pages is not relevant.

Conformance checkers are encouraged to provide a mechanism for authors to find elements that fail to fulfill this requirement, as an authoring aid.

That's a lot of leeway for validators to point out things that might be wrong, and I'd really prefer to err on the side of allowing that which might be reasonable.

annevk · 2018-01-16T11:03:27Z

When a document has multiple main elements, at least all but one must have the hidden attribute specified. The remaining main element should not have the hidden attribute specified.

Note: this requirement is analogous to that for [palpable content].

sideshowbarker · 2018-01-16T14:32:15Z

FYI, I’m in the process of adding use counters to the HTML checker backend to get some relevant data back on main usage — for example, the percentage of documents checked that contain main at all, any main with the hidden attribute, multiple unhidden main, any main within section, etc.

I had hoped to get those use counters added today my time but it’s looking like I won’t until some time tomorrow (= 8+ hours from now).

Relates to whatwg/html#3354

annevk · 2018-01-19T09:39:06Z

See #3354 (comment) for an alternative suggestion that makes it a "should not", but given the use case it seems reasonable to just allow it as @foolip has been suggesting all along.

foolip · 2018-01-19T10:46:03Z

@sideshowbarker

@foolip, given #3354 (comment), it seems maybe you already agree the cases other than main in section or form aren’t worth spending more time together on — so that is, for those other cases, the spec should define those as non-conforming (as @annevk wrote it already in this PR). Right?

Thanks for doing all that research! Are your tables for just the parent element so that <body><form><div><main> would could in the <div> category?

In any case, your numbers to confirm that it's really only section and form that are worth worrying about, as the rest makes less sense and is less common.

So I dunno maybe that httparchive data needs to doublechecked to see why it unexpectedly shows main in section being used only 1.3 times as often as main in form.

Bugs are certainly possible, but shouldn't we expect rather different results from these two ways of measuring, just like how Chrome's use counters can show high usage for something that's still very hard to find in httparchive? For the checker the bias is people who tried to get it right. Since the <form> bits seem to be from an ASP framework, maybe that's just used a lot and people can't control the all of the markup and thus don't bother to validate?

So, should we allow form ancestors? I think it will look a bit curios that we allowed it but not for example li. @stevefaulkner, would it make sense to allow it only if the form has no accessible name, and what does that map to? Having a name attribute?

foolip · 2018-01-19T10:47:35Z

@annevk another thing that kinda fell off the radar is <div hidden><main>. Should we bother with that case? @sideshowbarker, at what point does it just start to get annoying to explain in the error messages for the validator?

annevk · 2018-01-19T10:59:12Z

@foolip per feedback from @AliceWonderMiscreations I think we should allow all main elements to be hidden, if any are present, which would include that case.

foolip · 2018-01-19T11:04:57Z

I certainly support that change, but I was wondering about the case where an ancestors has the hidden attribute. The spec as written requires it to be on the main elements.

annevk · 2018-01-19T11:12:53Z

Oh I see, I'd rather not go there and keep things as simple as we can.

stevefaulkner · 2018-01-19T11:16:40Z

@foolip

would it make sense to allow it only if the form has no accessible name, and what does that map to? Having a name attribute?

Makes sense. In this case accessible name is provided from aria-label/aria-labelledby with fallback to title attribute if present and neither of previous are present.

Note: as far as I am aware the name attribute is not used in accname calculation on any element.

…o area)

sideshowbarker · 2018-01-19T11:25:57Z

I was wondering about the case where an ancestors has the hidden attribute.

Oh I see, I'd rather not go there and keep things as simple as we can.

Me too. At least unless/until we get feedback from users who are frustrated by it being that way (e.g., bug reports against the HTML checker).

@sideshowbarker, at what point does it just start to get annoying to explain in the error messages for the validator?

Well for the record here and since you asked, I don’t think it’d not be any more annoying to explain in a checker message than what the current message already says, which is just:

Error: A document must not include more than one visible main element.

Seems like I wouldn’t even need to change that message at all — since it already doesn’t what element the hidden attribute happens to be on.

And also as far as implementing “the case where an ancestors has the hidden attribute” on the checker backend, it wouldn’t be much additional work over what’s already implemented. (The existing check is already written in custom Java code rather than being expressed in the RelaxNG schema.)

annevk · 2018-01-19T11:29:49Z

@foolip @stevefaulkner I thought there was agreement that allowing <form> was fine because of the ASP.NET pattern? If we require something of that <form> that would basically not help existing content. I also don't see why we'd equate form with li.

stevefaulkner · 2018-01-19T11:35:32Z

@annevk @foolip

I thought there was agreement that allowing
was fine because of the ASP.NET pattern?

yes

If we require something of that that would basically not help existing content.

i thought @foolip meant disallowing only when the form element had an accessible name, which i would be surprised if it it effects any of the asp.net cases, but am fine with not restricting, as its likely that it will be caught by nesting restrictions anyway. Yeah I am unsure about the li connection as it is seriously unlikely that it would be used to wrap the main content of a document.

sideshowbarker · 2018-01-19T11:52:01Z

Are your tables for just the parent element so that <body><form><div><main> would could in the <div> category?

Not just for the parent element but instead for any ancestor. So it records a count for <body><form><div><main> in three places: as part of the total 47684, for main in form as well as for main in div.

For the checker the bias is people who tried to get it right.

I kind of wish that were the case, but for better or worse I think the reality is that most documents which get fed to the checker are not necessarily from people trying to get it right. Whenever I grab a random URL from the checker logs, I find the document at the URL typically has many markup errors and many CSS errors — often literally dozens of both.

I think in part the circumstance behind that fact is, a lot of URLs are getting sent to the checker by some kind of automatic means not be the authors who wrote them but instead … in some other weird way for some odd reason(s).

And that’s the situation even after having set up a lot of filtering on the checker backend to block certain bad-behaving browser extensions and other things that are clearly junk requests.

So I think it’s not necessarily safe to assume much at all on scale about the average usage pattern for the checker. I think for every document it processes that’s submitted from an author/developer trying to do the right thing, there’s at least one document it processes that’s a unbelievably bad mess of really horrible markup and CSS.

annevk · 2018-01-19T11:59:22Z

I made it to allow only "form without accessible name" as that seems reasonable. I also updated how I defined the hierarchy restriction to make it more similar to what we do for area as that avoids a lot of duplication throughout the document. I added Alice Wonder to the acknowledgments section (@AliceWonderMiscreations let me know if you prefer a different name there).

I think this is ready for a fresh round of reviews as I've addressed all the nits I found myself.

foolip · 2018-01-19T12:13:14Z

OK, so let's leave the hidden attribute ancestor stuff alone, it'd be a pretty straightforward tweak later.

foolip

This all looks good to me now!

foolip · 2018-01-19T12:23:24Z

source


-  <p class="note">While there is no restriction as to the number of <code>main</code> elements in a
-  document, web developers are encouraged to stick to a single element.</p>
+  <p>A <dfn>hierarchically correct <code>main</code> element</dfn> is one whose ancestor elements


This is great, simplified a lot of the other bits. Thanks!

annevk · 2018-01-19T19:32:30Z

I'll land this next Tuesday, barring any objections. It'd be great to get explicit confirmation from @AliceWonderMiscreations on #3354 (comment) before then though.

Also, thanks everyone for the help and my apologies about the delay in getting this done and all the angst it has caused over the intervening years. (To be clear, I realize more needs to be done and I will focus on that next.)

aardrian · 2018-01-19T20:02:36Z

@iandevlin:

@stevefaulkner Yes, ASP.NET requires everything to be wrapped in a <form> because the entire framework revolves around POST commands.

I thought that mostly disappeared in ASP.NET. If Microsoft is moving away / has moved away from it, that may not be sufficient justification.

AliceWonderMiscreations · 2018-01-19T20:39:54Z

#3354 is fine way to reference me, thank you.

sideshowbarker · 2018-01-31T23:11:24Z

The HTML checker now also implements the “hierarchically correct” restriction; validator/validator@8024cd8

JaninaSajka · 2018-02-01T14:31:41Z

Awesome! Thanks, Mike! Janina Michael[tm] Smith writes:

…

The HTML checker now also implements the “hierarchically correct” restriction; validator/validator@8024cd8 -- You are receiving this because you are on a team that was mentioned. Reply to this email directly or view it on GitHub: #3354 (comment)

-- Janina Sajka Linux Foundation Fellow Executive Chair, Accessibility Workgroup: http://a11y.org The World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI) Chair, Accessible Platform Architectures http://www.w3.org/wai/apa

annevk added document conformance accessibility Affects accessibility labels Jan 15, 2018

annevk mentioned this pull request Jan 15, 2018

Consider aligning WHATWG main element definition with W3C definition #100

Closed

nit

38465fa

foolip reviewed Jan 15, 2018

View reviewed changes

only allow main in body/div

8f1b75f

foolip reviewed Jan 15, 2018

View reviewed changes

annevk added 2 commits January 15, 2018 19:52

zero or more

701e6bc

it's actually more complicated

433ec9c

marcoscaceres approved these changes Jan 16, 2018

View reviewed changes

sideshowbarker added a commit to validator/validator that referenced this pull request Jan 17, 2018

Add use counter for <main>

d7307e2

Relates to whatwg/html#3354

attempt at defining the restrictions in terms of ancestors (similar t…

be4c49d

…o area)

annevk added 2 commits January 19, 2018 12:27

simplify requirement

cac0c21

ack Alice Wonder

8460af5

annevk added 4 commits January 19, 2018 12:36

nits

783612c

tiny nit

0bc0404

main element ancestors abstraction didn't work on closer inspection

5050f0f

add accessible name restrictions to form

8200648

foolip approved these changes Jan 19, 2018

View reviewed changes

sideshowbarker approved these changes Jan 20, 2018

View reviewed changes

annevk merged commit 1dec930 into master Jan 23, 2018

annevk deleted the annevk/main-restriction branch January 23, 2018 08:10

Comandeer mentioned this pull request Jul 4, 2018

Main content of the page is no longer a common idiom without dedicated element #3795

Closed

Restrict the main element to be used once per document #3354

Restrict the main element to be used once per document #3354

Conversation

annevk commented Jan 15, 2018 • edited by pr-preview bot Loading

foolip commented Jan 15, 2018

tobie commented Jan 15, 2018

foolip left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

annevk Jan 15, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

annevk commented Jan 15, 2018

annevk commented Jan 15, 2018

stevefaulkner commented Jan 15, 2018 • edited Loading

foolip commented Jan 15, 2018

Choose a reason for hiding this comment

foolip commented Jan 15, 2018 • edited Loading

tobie commented Jan 15, 2018

foolip commented Jan 15, 2018

annevk commented Jan 15, 2018

stevefaulkner commented Jan 15, 2018

foolip commented Jan 15, 2018

marcoscaceres left a comment

Choose a reason for hiding this comment

annevk commented Jan 16, 2018

foolip commented Jan 16, 2018

annevk commented Jan 16, 2018

foolip commented Jan 16, 2018

annevk commented Jan 16, 2018

sideshowbarker commented Jan 16, 2018

annevk commented Jan 19, 2018

foolip commented Jan 19, 2018

foolip commented Jan 19, 2018

annevk commented Jan 19, 2018

foolip commented Jan 19, 2018

annevk commented Jan 19, 2018

stevefaulkner commented Jan 19, 2018 • edited Loading

sideshowbarker commented Jan 19, 2018

annevk commented Jan 19, 2018

stevefaulkner commented Jan 19, 2018

sideshowbarker commented Jan 19, 2018 • edited Loading

annevk commented Jan 19, 2018 • edited Loading

foolip commented Jan 19, 2018

foolip left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

annevk commented Jan 19, 2018

aardrian commented Jan 19, 2018

AliceWonderMiscreations commented Jan 19, 2018

sideshowbarker commented Jan 31, 2018

JaninaSajka commented Feb 1, 2018 via email

annevk commented Jan 15, 2018 •

edited by pr-preview bot

Loading

annevk Jan 15, 2018 •

edited

Loading

stevefaulkner commented Jan 15, 2018 •

edited

Loading

foolip commented Jan 15, 2018 •

edited

Loading

stevefaulkner commented Jan 19, 2018 •

edited

Loading

sideshowbarker commented Jan 19, 2018 •

edited

Loading

annevk commented Jan 19, 2018 •

edited

Loading