Schema.org microdata failing validation #1885

Closed
fletc3her opened this issue Feb 9, 2016 · 9 comments

@fletc3her

I have some Schema.org microdata style markup which is failing the AMP validator.

<div itemscope itemtype="http://schema.org/Photograph" itemid="http://www.example.com/Pomeranian.html">
<link itemprop="mainEntityOfPage" href="http://www.example.com/Pomeranian.html">
<meta itemprop="headline" content="Pomeranian">
<meta itemprop="datePublished" content="2016-02-08">
...
</div>

The code validates through Google's Structured Data Testing tool, but I'm seeing it flagged with these AMP validation errors:

The attribute 'itemid' may not appear in tag 'div'
The mandatory attribute 'rel' is missing in tag 'link rel='.

Since Schema.org metadata is recommended by the spec, it seems like common microdata formats like this should validate.

@Gregable
Member

Thanks for the report. This is definitely something we should fix up.

@Gregable Gregable self-assigned this Feb 10, 2016
@rudygalfi rudygalfi added this to the M1 milestone Feb 10, 2016

@EnigmaSolved

Can the Validator also be updated to allow rel="canonical", a la the following tweak to the original example (I'm assuming in this case that the AMP url of the page this is displayed on would be something like http://www.example.com/Pomeranian.html/amp/):

<div itemscope itemtype="http://schema.org/Photograph" itemid="http://www.example.com/Pomeranian.html">
<link itemprop="mainEntityOfPage" rel="canonical" href="http://www.example.com/Pomeranian.html">
<meta itemprop="headline" content="Pomeranian">
<meta itemprop="datePublished" content="2016-02-08">
...
</div>

Currently the above throws the following AMP error:
The attribute 'rel' in tag 'link rel=' is set to the invalid value 'canonical'.

Or is it better to just leave the rel="canonical" off now that the <link> element can omit the rel attribute if itemprop is present? I had assumed that rel="canonical" would be desirable for consistency, given that's how it is handled when putting this <link> element in the page <head> (eg,

<link itemprop="mainEntityOfPage" rel="canonical" href="http://example.ampproject.org/article-metadata.html" />
).

@Gregable
Member

@EnigmaSolved, I think @honeybadgerdontcare 's change already addresses this. I created a document which looks like:

<!doctype html>
<html ⚡>
<head>
  <meta charset="utf-8">
  <link itemprop="mainEntityOfPage" rel="canonical" href="./regular-html-version.html" />
  <meta name="viewport" content="width=device-width,minimum-scale=1">
  <style>body {opacity: 0}</style><noscript><style>body {opacity: 1}</style></noscript>
  <script async src="https://cdn.ampproject.org/v0.js"></script>
</head>
<body>
Hello, world.
</body>
</html>

And it validates just fine. Is it possible that your error is already fixed? Alternatively, perhaps there is more than one <link rel=canonical ...> tag on the page, of which we allow only one?

If neither of these theories pans out, can you send a more complete document on which you are still seeing this validation error? Thanks.
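
The "more than one canonical link" theory can be checked quickly outside the validator. The following is a minimal sketch using only the Python standard library; it is not part of the AMP validator, and the names `CanonicalLinkCounter` and `canonical_links` are made up for this example. It simply lists every <link> whose rel contains "canonical", wherever it appears in the page.

```python
from html.parser import HTMLParser


class CanonicalLinkCounter(HTMLParser):
    """Collects the href of every <link> whose rel contains 'canonical'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Called for both <link ...> and self-closing <link ... />.
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list; a valueless rel parses as None.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.hrefs.append(attr_map.get("href"))


def canonical_links(html_text):
    """Return the hrefs of all canonical links found in html_text."""
    parser = CanonicalLinkCounter()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


# Hypothetical page with one canonical link in <head> and one in <body>.
doc = """<head>
<link rel="canonical" href="http://www.example.com/example-post/" />
</head>
<body>
<div itemscope itemtype="http://schema.org/BlogPosting">
<link itemprop="mainEntityOfPage" rel="canonical" href="http://www.example.com/example-post/">
</div>
</body>"""

print(len(canonical_links(doc)))  # prints 2
```

A count above one would suggest that a second rel="canonical" link, for example one inside a microdata block in the body, is what the validator is objecting to.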

@EnigmaSolved

@Gregable, yeah, I think we can leave things as-is. What I was doing was something along the lines of:

<!doctype html> 
<html ⚡> 
<head> 
<meta charset="utf-8"> 
<link itemprop="mainEntityOfPage" rel="canonical" href="http://www.example.com/example-post/" /> 
<meta name="viewport" content="width=device-width,minimum-scale=1"> 
<style>body {opacity: 0}</style><noscript><style>body {opacity: 1}</style></noscript> 
<script async src="https://cdn.ampproject.org/v0.js"></script>
</head>
<body> 

<div itemscope itemtype="http://schema.org/BlogPosting" itemid="http://www.example.com/example-post/"> 
<link itemprop="mainEntityOfPage" rel="canonical" href="http://www.example.com/example-post/"> 
<h2 itemprop="headline">Post Headline</h2>
 ... 
</div>

</body> </html>

The reason I got to exploring all of this is that the method Google illustrates for including mainEntityOfPage within BlogPosting (https://developers.google.com/structured-data/rich-snippets/articles) is the following, which no longer validates in the AMP Validator (it throws two errors: The mandatory attribute 'charset' is missing in tag 'meta charset=utf-8'. and The parent tag of tag 'meta' is 'div', but it can only be 'head'.):
<meta itemscope itemprop="mainEntityOfPage" itemType="https://schema.org/WebPage" itemid="https://google.com/article"/>

And Google complains if there's no mainEntityOfPage within the BlogPosting element, so I went looking for an alternate way to include it there, which led me to the following (which I now understand is usually used in the page <head>):
<link itemprop="mainEntityOfPage" rel="canonical" href="http://www.example.com/example-post/" />

But I don't believe the rel="canonical" is required by Google (within the BlogPosting element). So with the recent Validator fix I can just leave off that rel="canonical" (while of course retaining the one in the page <head>) and then the AMP Validator and Google are both happy. In other words, the following works fine:

<!doctype html> 
<html ⚡> 
<head> 
<meta charset="utf-8"> 
<link itemprop="mainEntityOfPage" rel="canonical" href="http://www.example.com/example-post/" /> 
<meta name="viewport" content="width=device-width,minimum-scale=1"> 
<style>body {opacity: 0}</style><noscript><style>body {opacity: 1}</style></noscript> 
<script async src="https://cdn.ampproject.org/v0.js"></script>
</head>
<body> 

<div itemscope itemtype="http://schema.org/BlogPosting" itemid="http://www.example.com/example-post/"> 
<link itemprop="mainEntityOfPage" href="http://www.example.com/example-post/"> 
<h2 itemprop="headline">Post Headline</h2>
 ... 
</div>

</body> </html>

:)

@Gregable
Member

Glad it worked out in the end. One of the other issues here is that the error messages we are producing are a bit misleading. For instance, you get The mandatory attribute 'charset' is missing in tag 'meta charset=utf-8'. when the tag you are trying to set up is not the <meta charset>. This is still a work in progress too.

@EnigmaSolved

Yeah, I understand about the error messages. I figured out pretty quickly that the Validator just didn't like the <meta> tag (or that particular version of one) in that place. There are going to be so many edge cases that it'll be hard to anticipate them all and create accurate errors for each.

You know, it might work better (or be more manageable) to make the errors less specific (just reference the tag or property that the error is about), and then have lots of examples on the linked-to page of what could be going on. And for errors that seem to show up a lot (do you all have a way of tracking that via the Validator?) you can certainly design more specific messages. But it seems it'd be less work for you to update a list of examples here on GitHub (especially for edge cases) than to try to code for so many variations on errors (like the above example). Just a thought. :)

@Gregable
Member

It's a good idea. We will probably have to do that in some cases. I suppose it's ironic that more specific messages end up worse in this case.

Right now, we're doing a few iterations to see if we can improve the context of these error messages. In this case, it shouldn't be too tough. The logic should be something like "if your meta tag doesn't have a charset attribute, don't generate errors related to the meta charset tag spec."
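
As an illustration only, that rule could be sketched as below. This is not the validator's code: the function name `meta_errors`, the simplified rules, and the message strings are all assumptions made up for the example. The point is just that the charset-specific checks run only when the tag actually carries a charset attribute.

```python
def meta_errors(attrs, parent_tag):
    """Return illustrative error strings for a single <meta> tag.

    attrs is a dict of attribute name -> value (None for valueless
    attributes); parent_tag is the name of the enclosing element.
    """
    if "charset" in attrs:
        # Only a tag that has a charset attribute is matched against
        # the (simplified, assumed) meta charset rules.
        errors = []
        if parent_tag != "head":
            errors.append("The tag 'meta charset' can only appear in 'head'.")
        if (attrs["charset"] or "").lower() != "utf-8":
            errors.append("The attribute 'charset' must be 'utf-8'.")
        return errors

    # No charset attribute: never mention the meta charset spec.
    if parent_tag != "head":
        return [f"The tag 'meta' may not appear inside '{parent_tag}'."]
    return []


# The microdata-style tag from this thread, placed inside a <div>.
microdata_meta = {"itemscope": None, "itemprop": "mainEntityOfPage"}
for message in meta_errors(microdata_meta, "div"):
    print(message)
```

With this shape, the microdata-style <meta> inside a <div> still produces an error, but the message talks about the tag's placement rather than a missing charset.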

@EnigmaSolved

Sounds good, and I'm glad I could be helpful!

@nikse

nikse commented Mar 4, 2016

Hello @Gregable, I am still having the problem described in the original post.

error:

The parent tag of tag 'meta' is 'div', but it can only be 'head'. (see https://www.ampproject.org/docs/reference/spec.html#required-markup)

html:

<div class="article-route" itemscope="true" itemtype="https://schema.org/NewsArticle">
<meta itemscope="true" itemprop="mainEntityOfPage" itemtype="https://schema.org/WebPage" itemid="#{url}">
...
</div>

I am using <script async="" src="https://cdn.ampproject.org/v0.js"></script> as my amp js

Any advice?
