-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove the need for wrapping comment-open/comment-close text in data blocks #97
Conversation
index.html
Outdated
of the script element within the HTML document located by a URL (see [[!DOM]]). | ||
A <a>JSON-LD processor</a> MUST extract only the specified data block's contents | ||
parsing it as a standalone <a>JSON-LD document</a> | ||
and MUST NOT create a <a>dataset</a>.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why must it not create a dataset; what does it create if not a dataset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would want a targeted extraction to extract exactly what's expressed in the data block, and not alter it by putting it within a dataset.
If I'm extracting all data blocks into a single "thing", then I'd expect that thing to be a dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, wording needs to be changed, as the embedded JSON-LD may, itself, create a dataset, which should not be prohibited. I think you mean that an quads extracted from a targeted script should not be combined into a graph/dataset comprised of other markup in the document, including other JSON-LD script elements, embedded Mircodata, RDFa, or whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit 63017bd in this branch should resolve this, and I think we can merge.
This will require a corresponding change in the API doc. |
…f script escape sequences. This supports w3c/json-ld-syntax#97.
I am fine with this. Thanks to @BigBlueHat and @gkellogg! |
index.html
Outdated
"http://schema.org/name": [{"@value": "Encoding Issues"}], | ||
"http://schema.org/description": [ | ||
{"@value": "Issues list such as unescaped </script> or -->"} | ||
{"@value": "Issues list such as unescaped </script> or -- >"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment--would expansion of the raw JSON-LD result in this variation or would those retain the entities from the original?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "other comment" (since it's from a different repo's PR review) w3c/json-ld-api@b969d20#r31502674
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original included JSON-LD in a script element with escaped entities. When extracted, and subsequently processed, the entities are resolved.
There's some trickery (-- >
vs. -->
) because of issues in the spec itself, as --
can not appear in a comment, which it would otherwise. Code in both common.js and extract-examples explicitly manages this.
index.html
Outdated
@@ -8434,7 +8434,7 @@ <h3>Graph Containers</h3> | |||
JSON-LD content can be easily embedded in HTML [[HTML52]] by placing | |||
it in a <a data-cite="HTML52/semantics-scripting.html#the-script-element">Script element</a> with the <code>type</code> attribute set to | |||
<code>application/ld+json</code>. Doing so creates a | |||
<a data-cite="https://www.w3.org/TR/html52/semantics-scripting.html#data-block">data block</a>.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Meant to go back and fix that once I saw how the magic worked. 😉
JSON-LD script elments MUST only contain valid JSON. HTML entity-based character encoding is used to avoid accidental script tag parsing problems--as described in the note in Restrictions for contents of <script> elements: https://www.w3.org/TR/html52/semantics-scripting.html#restrictions-for-contents-of-script-elements
I was not able to sort out when only the first script element would be chosen as the preceding text talks about always creating a data set.
4861a36
to
a52fb4c
Compare
@BigBlueHat can we pull this separate from w3c/json-ld-api#51? We need to resolve the dataset description, but handling entity escapes in script tags seems to be a bigger knot that I'd rather not hole the rest of this PR request up for, and maybe should be handled differently in w3c/json-ld-api#51 as a separate issue as well. |
…ing combined with any other content from the same HTML document.
This issue was discussed in a meeting.
View the transcriptJSON-LD script elements (data blocks) MUST only contain valid JSONBenjamin Young: #96 Benjamin Young: issue related to #97 Gregg Kellogg: what I pushed up there addresses the different aspects of the HTML script content. … HTML provides 2 mechanisms: specific backslash escapes in HTML comments, or HTML entities … Benjamin proposed to remove the comments, to keep it valid JSON, consistently with the declared content-type. Benjamin Young: it’s similar to injecting HTML tags in a <textarea>, without breaking the textarea. Rob Sanderson: to be sure i understand, the issue is if you have: <script>{"label": " |
Merging, as the remaining issue is separated to #100. |
…f script escape sequences. This supports w3c/json-ld-syntax#97.
This issue was discussed in a meeting.
View the transcript4.1. Remove the need for wrapping comment-open/comment-close text in data blocksBenjamin Young: #97 Gregg Kellogg: most of these are related to PRs just discussed, but not all Gregg Kellogg: #96 Benjamin Young: current draft talks now about html entities for escaping Gregg Kellogg: given that it has to be valid json-ld, how do we deal with escaped entities … leaves us with problem of how to represent <script> tags and comments in json? … so we could close 97 and leave 100 for further discussion Benjamin Young: one option is to mark propose closing and discuss more later Gregg Kellogg: DanBri involved and noted in discussion (of this or another related issue 100) Proposed resolution: close #97 as fixed with recent merges; other issues raised in #100 (Benjamin Young) Gregg Kellogg: +1 David Newbury: +1 Tim Cole: +1 Benjamin Young: +1 Ivan Herman: +1 Resolution #2: close #97 as fixed with recent merges; other issues raised in #100 |
Addresses the concerns expressed in #96.
This PR also includes putting the
<base>
handling content in its own section, and clarifies the extraction of a single, targeted data block. I can move those to separate PRs if desired, as they do introduce requirements unrelated to the topic in #96. They do clarify/specify the use of targeting vs. not targeting specific data blocks. But my apologies for the mixed PR.Preview | Diff