Indexing without a predicate #19

Closed
gkellogg opened this issue Jun 30, 2018 · 25 comments

@gkellogg
Member

gkellogg commented Jun 30, 2018

For consideration by the JSON-LD 1.1 WG...

Assuming a nested set of resources where leaf nodes are frequently repeated, it is difficult to find the definition of the node after compaction. Imagine a classification that is used on the second item in a list, and again on the 26th. It would be nice to have a place to look up the label for the classification, instead of repeating it on both 2 and 26. Similarly, information about repeated people, services, or anything else could benefit from this pattern.

As prior art, and use case for inclusion, JSON API has the notion of "included" -- a slot where you can put resources that are included in others, such that developers can always know where to find them. In my work, this has come up with repeated services in IIIF, and classifications, people and places when describing the cultural heritage objects they relate to.

The identifier map pattern is already in this space, but it is insufficient: it requires a predicate to map to, and here the relationship is to a resource nested somewhere in the data structure, not to the top-level resource. Framing support would also be needed, as an extension to @embed: @never, so that the inclusions are not embedded in the object data but instead carry a pointer to where they should go.

Example data:

{
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {
      "id":"2",
      "classification": "enum:c6",
      "service": "enum:s2"
    },
    { "id": "3...26 go here", "type": "eg:X"}, 
    {
      "id": "27",
      "classification": "enum:c6"
    }    
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}

Playground example with identifier map: http://tinyurl.com/yd5z87xg
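
For a JSON-only consumer, the benefit of the pattern above is that a reference resolves with a single dictionary lookup instead of a tree traversal. A minimal Python sketch, assuming the `included` key and the data shape from the example; the helper name `resolve_label` is hypothetical and not part of any JSON-LD API:

```python
import json

# Trimmed copy of the example data above; items 3..26 are omitted.
doc = json.loads("""
{
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {"id": "2", "classification": "enum:c6", "service": "enum:s2"},
    {"id": "27", "classification": "enum:c6"}
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }
}
""")


def resolve_label(document, reference):
    """Return the label of a referenced resource, or None if it is not included.

    Hypothetical helper: it treats "included" as a plain JSON map.
    """
    entry = document.get("included", {}).get(reference)
    return entry.get("label") if entry else None


# Items 2 and 27 both resolve to the same single definition.
labels = [resolve_label(doc, item["classification"]) for item in doc["eg:items"]]
print(labels)
```

This only reads the JSON; it says nothing about what graph a JSON-LD processor would produce from the `included` block.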

The inclusion term could either be a new keyword like @id (@included or @inclusions) that was then re-aliased in the context (to, e.g. included), or it could be a new keyword value for @container (included: {@container: @included}). I think the former is the (slightly) better design, as it makes it more obviously a field rather than a data structure. It would only be usable in a resource that is not nested within other resources (e.g. the top level JSON object ... which might be in an array or @graph). Framing could then use the same keyword: @embed: @included.

Original issue: Indexing without a predicate #650

@gkellogg
Member Author

I wonder if this could leverage the @nest capability, which if you'll recall, allows you to nest properties of a node under an intermediate property. If that value were a node reference, then that could mean to apply the properties of the referenced node to the referencing node. For example:

{
  "@context": {
    "@vocab": "http://example/",
    "id": "@id", 
    "type": "@type",
    "eg": "http://example/",
    "classification": {"@type": "@id", "@nest": "@id"},
    "included": {"@container": "@id"}
  },
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {
      "id":"2",
      "classification": "enum:c6",
      "service": "enum:s2"
    },
    { "id": "3...26 go here", "type": "eg:X"}, 
    {
      "id": "27",
      "classification": "enum:c6"
    }    
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}

Here, the "classification": {"@type": "@id", "@nest": "@id"} could signal that nested properties are found through the referenced id, and that strings are interpreted as IRIs. This could expand to something like the following:

[{
  "@id": "1",
  "@type": ["http://example/Thing-with-Items"],
  "http://example/items": [{
    "@id": "2",
    "@type": ["http://example/Type", "http://example/Service"],
    "http://example/label": [
      {"@value": "Classification 6"},
      {"@value": "Login Service"}
    ]
  }, {
    "@id": "2",
    ...
  }, {
    "@id": "27",
    "@type": ["http://example/Type"],
    "http://example/label": [{"@value": "Classification 6"}]
  }],
  "http://example/included": [{
    "@type": "http://example/Type", "http://example/label": [{"@value": "Classification 6"}]
  }, {
    "@type": "http://example/Person", "http://example/label": [{"@value": "Person 1"}]
  }, {
    "@type": "http://example/Service", "http://example/label": [{"@value": "Login Service"}]
  }]
}]

Of course, there may be issues with this, but it leverages the nesting concept and keyword, and avoids the need to add a new keyword.
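
To make the intended effect of "@nest": "@id" concrete, here is a rough Python sketch that works on plain JSON rather than through a JSON-LD processor. It copies the properties of each referenced node onto the referencing node, which is how item 2 ends up with two types and two labels in the expansion above. The function name `merge_referenced` and its parameters are hypothetical:

```python
def merge_referenced(item, included, reference_terms):
    """Copy the properties of each referenced node onto the referencing node.

    Assumption: `reference_terms` lists the terms that would be declared
    with "@nest": "@id"; `included` is the identifier map from the document.
    """
    # Keep the item's own properties, minus the reference terms themselves.
    merged = {k: v for k, v in item.items() if k not in reference_terms}
    for term in reference_terms:
        if term not in item:
            continue
        for prop, prop_value in included.get(item[term], {}).items():
            existing = merged.get(prop, [])
            if not isinstance(existing, list):
                existing = [existing]
            merged[prop] = existing + [prop_value]
    return merged


included = {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"},
}
item_2 = {"id": "2", "classification": "enum:c6", "service": "enum:s2"}
merged_item = merge_referenced(item_2, included, ["classification", "service"])
print(merged_item)
```

Note that the merged node no longer records which label came from which reference, which is one of the issues with this approach.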

@workergnome

I can say that this is a feature that would make my life significantly better--we've been using @embed: @always to deal with this which is not an optimal solution. And it would be really, really nice to have more alignment between JSON-LD and JSON-API, and this is one of the major differences that makes that hard to do.

@azaroth42
Contributor

Per json-ld-api#33, I prefer a new keyword rather than overloading permutations of existing keywords, which has proven confusing in the past. That said, @nest: @id seems reasonably close to the desired semantics for the relevant keywords and functionality, and @nest is new in 1.1.

@iherman
Member

iherman commented Sep 5, 2018

I think I understand the intention (references to other syntaxes help a lot), but I do not understand the exact mechanism. The playground example doesn't help, all three snippets on the screen look identical to me (the @context are all the same). Can someone provide a clearer proposal?

(If my understanding is remotely correct, it is also based on the new @container feature involving @id. I must admit that I am increasingly worried about overloading meanings. "@container":"@list" is used for the characterization of objects, whereas "@container":"@id" is for indexing. These are very different notions in my mind, and I would prefer to make things explicit by using some sort of @index keyword. Overloading meanings makes the language more difficult to understand.)

@azaroth42
Contributor

Discussion on WG call of 2018-09-14 led to the conclusion that this is likely a framing issue, not a syntax issue. Framing based solutions to be explored.

(For the sake of issue management, we'll leave the issue here)

@azaroth42
Contributor

azaroth42 commented Sep 17, 2018

In light of the framing discussion, how about something like a new value for the @embed flag that allows the frame to say that it should be put in @included rather than inline or omitted?

The syntax document would need to define "@included" so that it could be aliased to something else, otherwise expansion would try to apply the default vocab to it as a regular term.

Then the example in the issue would be generated by something like:

Data:

{
  "@context": {
    "included": "@included",
    "id": "@id"
  },
  ...
}

Frame:

{
 "items": [
  {
   "classification": { "@embed": "@included" },
   "service": {"@embed": "@included"}
  }
 ]
}
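
As a rough illustration of what such an @embed: @included flag might do to the framed output, here is a Python sketch over plain JSON. It is not the framing algorithm: the function name `hoist_to_included`, the `id` alias, and the output key `included` are all assumptions, and the input must be a single top-level JSON object.

```python
def hoist_to_included(tree, hoisted_properties, id_key="id"):
    """Move node objects found under the given properties into a top-level
    "included" map, leaving only the node identifier in place.

    Assumption: `hoisted_properties` stands in for the properties that the
    frame marks with the proposed "@embed": "@included".
    """
    included = {}

    def to_reference(node):
        # Only node objects that carry an identifier can be hoisted.
        if not (isinstance(node, dict) and id_key in node):
            return node
        node_id = node[id_key]
        body = {k: v for k, v in node.items() if k != id_key}
        # Keep the first non-empty description seen for each identifier.
        if not included.get(node_id):
            included[node_id] = body
        return node_id

    def visit(value):
        if isinstance(value, list):
            return [visit(v) for v in value]
        if not isinstance(value, dict):
            return value
        result = {}
        for key, child in value.items():
            child = visit(child)
            if key in hoisted_properties:
                if isinstance(child, list):
                    child = [to_reference(c) for c in child]
                else:
                    child = to_reference(child)
            result[key] = child
        return result

    output = visit(tree)
    output["included"] = included
    return output


framed = {
    "id": "1",
    "items": [
        {"id": "2",
         "classification": {"id": "enum:c6", "label": "Classification 6"},
         "service": {"id": "enum:s2", "label": "Login Service"}},
        {"id": "27",
         "classification": {"id": "enum:c6", "label": "Classification 6"}},
    ],
}
result = hoist_to_included(framed, {"classification", "service"})
print(result)
```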

@azaroth42
Contributor

On the WG call of 2018-12-14, we discussed the feature in the abstract and two separate implementation patterns.

The feature is, in JSON-LD terms, to have a way to not embed nodes in the compacted JSON tree where they're encountered, but instead have the nodes serialized in a separate part of the tree and referenced from where they're encountered. In essence, it is a desire to exclude some nodes from being embedded in the tree and instead treat them like "striped" RDF/XML, or as if they were disconnected nodes but without requiring @graph: [] at the root of the document.

  1. The reference could be completely internal. Just a way to create a pointer within the document, that could be arbitrarily generated like a blank node.
  2. The reference could be the URI of the resource, making the references object function very similarly to an identifier map, just without the predicate.

My original issue assumed option 2, but the use case could be solved with option 1 as well. I think option 1 would require more machinery than option 2, as the references would need to be understood in the context.

@iherman
Member

iherman commented Dec 16, 2018

This issue was discussed in a meeting.

  • No actions or resolutions
View the transcript 5.5. Indexing without a predicate
Benjamin Young: #19
Rob Sanderson: I opened issue 19 initially in the CG
… based on an observation by 2 communities
Rob Sanderson: last time we discussed this we thought it might be a good fit for framing
… for creating it; and the context definition for understanding what the particular … is
… you could have an “included block” that contains hundreds of items, which you only have to reference and don’t have to add in full every time you use them
Gregg Kellogg: looks very much like “itemref” of microdata
… unwinding this in compaction might be challenging to say at least
Benjamin Young: the current container options don’t work because they force you to have a name or these?
Rob Sanderson: no, the issue we like to solve -> if you don’t have the entire graph in memory, you don’t necessarily find the reference for everything
… i.e. optimizing search
Gregg Kellogg: just noting that there’s a CG started on updating N3
… one of the things they tackle first is adding formula/pattern
… in RDFa we did something similar by adding some reasoning
Ivan Herman: I’m looking at the example and I’m not sure I understand what the graph is you want to generate out of that?
Rob Sanderson: a very connected one
… that allows to avoid having to define a specific node every time it is encountered
Ivan Herman: relative URIs within JSON-LD?
… couldn’t we use an internal URL, like one would use in turtle
Pierre-Antoine Champin: I’m a bit confused by Ivan’s last example
Ivan Herman: the original problem is “we don’t want to repeat things”
… one way to do this in turtle is to use internal URIs
Pierre-Antoine Champin: could I extend this by having only the included key in the top-level object and a set of ids in the corresponding object?
… basically what we discussed during TPAC
Rob Sanderson: we can already do that, so you can have the pattern RDF XML kinda uses
… but it’s a very good point
… some things we want to have nested, some enumerated
… potentially a framing algo could say “put these refs separate, but those others nested”
Rob Sanderson: in JSON schema there’s a definitions block one can reference
Benjamin Young: https://json-schema.org/latest/relative-json-pointer.html
Rob Sanderson: $ref something something magic
… I was thinking more about arbitrary ids, or something along those lines
Benjamin Young: $ref is actually here https://json-schema.org/latest/json-schema-core.html#rfc.section.8.3.2.p.1
Gregg Kellogg: it kinda looks like nesting?
Gregg Kellogg: [gives example]
Rob Sanderson: I don’t think it’s a graph container, as there’s only one graph. It could be either an identifier container or a nesting container, without a mapped predicate
Pierre-Antoine Champin: [thinks about possible hacks to do that]
Ivan Herman: not sure which way we want to go
Benjamin Young: no more calls for the rest of the year

gkellogg added a commit that referenced this issue Dec 21, 2018
…ied URL for specifying context or frame.

For #19.
gkellogg added a commit that referenced this issue Jan 12, 2019
…ied URL for specifying context or frame.

For #19.
@iherman
Member

iherman commented Feb 9, 2019

This issue was discussed in a meeting.

  • RESOLVED: Continue to explore @nest with additional features, such as @container:@id, as a solution to issue #19
  • ACTION: gkellogg and pchampin to explore effect of @nest+@container:@id on compaction and expansion
View the transcript 3. “itemref”, issue 19
Rob Sanderson: issue occurs when resource occurs multiple times in the graph. What would be nice that if you knew that terms got used repeatedly…
… would be nice if you had references from the inclusion to included. JSON API calls it “included”
… JSON Schema has $ref.
David Newbury: https://jsonapi.org/format/1.1/#document-top-level
David Newbury: an example of it in the JSON-API spec is here: https://jsonapi.org/format/1.1/#document-compound-documents
Rob Sanderson: useful in graph context so you can use references rather than values
… is this a frame issue or syntax? We decided both - could go into framing to know that “included” is not a predicate, it is the inclusion
… references block rather than <base#>included.
Gregg Kellogg: did you consider the RDFa approach, where there is a way to output triples where after parsing there is a reasoning step?
Ivan Herman: I thought that was more directly done…
Gregg Kellogg: … that was microdata. RDFa is more directly — reasoner takes triples and outputs w/ different subject.
Jeff Mixter: is there a way to solve this with @graph?
… I have a first block of JSON which is object outside of a graph and add subgraphs with aliased keyword
Ivan Herman: this is mixing levels — syntax is similar but this is not a graph
Gregg Kellogg: inverse properties? Included have reverse relationships to items that are included
… if classification_of is a term that is an @reverse – achieves separation of concerns but also includes expanding, compacting and framing for round trip
Rob Sanderson: would still need an @nest property.
Ivan Herman: there are two ways to look at this:
… 1) enum:c6 is an internal reference that we could handle with fragment id in graph, but I have an extra triple in the graph …
… you get extra links
… 2) conceptually expect value of enum:c6 to be physically replicated and put back into the node
… itemref did the replication option
… JSON Schema creates a fragment identifier, but is this what you are looking for?
David Newbury: our use case is the latter case
… because in a JSON only environment, knowing where to go is difficult.
Ivan Herman: Option 2) requires duplication and massaging in graph…
Rob Sanderson: gregg’s proposal w/ included : {"@container": "@id"} (sort of) works
Ivan Herman: included should be a nest
David Newbury: how do I get option 2) (included under classification)?
Gregg Kellogg: we’d still need an inverse thing. If I have an id map but want to say it is sort of transparent…
Ivan Herman: if a term is defined to be @nest, does @id still work or do you ignore that once and for all?
Gregg Kellogg: @nest allows me to use an intermediate property to hold things which are pushed up. We want subtree to be somewhere else
Ivan Herman: if included is @nest, is @container: @id still valid?
Gregg Kellogg: round tripping is an issue as well.
Benjamin Young: posted playground example above that uses “embedded”. Seems to do what you want. Note that “included” is an array in
… json API not an object. Also introducing a non-JSON reference mechanism
Ivan Herman: what you do is define a graph, not the content of the graph
Rob Sanderson: there is a blank node _:b0 which has a name and a type
Gregg Kellogg: use a preprocessing tool or do it the way RDFa does it?
David Newbury: I could do this but it wouldn’t be valid JSON-LD …
Gregg Kellogg: It would be, but it wouldn’t be the graph you are looking for
Harold Solbrig: (discussion about examples on FTF document… w/ @nest and rather than containing , references object…)
David Newbury: in practice we use @id in our main document and use a placeholder in data, but that requires an additional piece of semantic data
Pierre-Antoine Champin: 2 questions. 1) Do we agree that the enum term should be defined as well? (a: yes)
… 2) is "@type": "@nest" the way it would be written? (a: no)
Rob Sanderson: nest: https://www.w3.org/TR/json-ld11/#ex-65-defining-property-nesting
Gregg Kellogg: could handle it with n3 reasoning?
… it seems like we are trying to do things at a totally different level.
Adam Soroka: one other wrinkle … this would play oddly with a streaming processor.
Gregg Kellogg: this is the reason we did rdfa the way we did
Ivan Herman: in rdfa we define terms and additional semantic rules, which is what we do here.
Gregg Kellogg: it has already been done, we could just reference it.
Gregg Kellogg: https://www.w3.org/TR/html-rdfa/#property-copying
Ivan Herman: done through RDF, but way too complicated…
Pierre-Antoine Champin: reminds me of the very first version of RDF rdf:aboutEach
Rob Sanderson: http://tinyurl.com/ydgfcgl4
Harold Solbrig: (azaroth using playground example between jane and john…)
Ivan Herman: copying vs. referencing. We can say that copying stuff is outside json-ld.
… reference, however, might be doable. What do we need to make the example on the screen (enum:c6, … in issue #19) work
… . included is there because of bookkeeping. The approach feels natural
… if included is nested, you take it out of the equation altogether…
Rob Sanderson: needs to be a new syntax ("@id": "@nest"?)
Simon Steyskal: works as expected on playground but @id: @nest doesn’t work
Pierre-Antoine Champin: https://json-ld.org/playground-dev/#startTab=tab-nquads&json-ld=%7B%22%40context%22%3A%5B%22http%3A%2F%2Fschema.org%2F%22%2C%7B%22labels%22%3A%7B%22%40id%22%3A%22%40nest%22%7D%7D%5D%2C%22%40type%22%3A%22Person%22%2C%22labels%22%3A%5B%7B%22familyName%22%3A%22Doe%22%7D%2C%7B%22givenName%22%3A%22Jane%22%7D%5D%7D
Gregg Kellogg: is there a way through @nest to subsume @graph while defining a bush
Gregg Kellogg: today, nesting requires the object
Gregg Kellogg: There’s obviously work to be done…
Rob Sanderson: how much?
Gregg Kellogg: (waffles and ponders…) involves extending id of nesting… there a lot of angles to this, man.
David Newbury: to clarify, we’re not addressing framing right now, correct?
Ivan Herman: workergnome – is this approach still ok? Does it accomplish what you want?
Proposed resolution: Continue to explore @nest with additional features, such as @container:@id, as a solution to issue #19 (Rob Sanderson)
Ivan Herman: +1
Simon Steyskal: +1
Rob Sanderson: +1
Jeff Mixter: +1
Harold Solbrig: +1
Gregg Kellogg: +1
David I. Lehn: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
David Newbury: +1
Resolution #3: Continue to explore @nest with additional features, such as @container:@id, as a solution to issue #19
Benjamin Young: +1
Action #1: gkellogg and pchampin to explore effect of @nest+@container:@id on compaction and expansion

@iherman
Member

iherman commented Mar 29, 2019

This issue was discussed in a meeting.

  • ACTION: propose a concrete solution, considering link and nest (Rob Sanderson)
  • ACTION: propose a concrete solution, considering link and nest (David Newbury)
View the transcript Indexing without a predicate
Rob Sanderson: Link: #19
Rob Sanderson: we discussed at the F2F
… there was an action of gkellogg and pchampin to look into it
Gregg Kellogg: I didn’t have time to look into it yet
Pierre-Antoine Champin: me neither
Rob Sanderson: when an item appears randomly in multiple places in the document,
… it would be nice to put this item in a kind of “bucket” where its full description is stored,
… rather than to have to browse the full document to find the random place where the full description is included
Ivan Herman: this is essentially the ‘itemref’ feature of microdata
… copying that mechanism in JSON-LD seems complicated, but maybe not impossible
Dave Longley: sounds like a framing issue, similar to "@anywhere"
Rob Sanderson: this is not only related to framing, you need something in the context as well
Gregg Kellogg: this is indeed very much like ‘itemref’
… my concern is that it will be complicated if we want to ensure round-trip (compaction/expansion)
… like we do for other features
… that could be done using default and framing, but seems like a very complex solution
Dave Longley: we do have special keywords in the framing compaction algorithm that are treated differently to avoid dropping undefined terms, etc.
David Newbury: is there a way to handle this as a post-processing step?
Gregg Kellogg: the RDFa reference mechanism involves looking in the graph, adding triples and remove triples that were part of the pattern
Ivan Herman: if we do that (i.e., reproduce the RDFa ref mechanism) people will run away screaming
… what we are trying to do is some sort of internal references, essentially relative URIs
… it would still require to define a bush and not a tree, which forces us to use @graph,
… but it might work
Dave Longley: if we consider working in memory, consider @link which is implemented in the Javascript processor
… to ensure that an object is stored only in one place
Gregg Kellogg: the problem is that you would typically create cycles internally
… I’m not sure relative URIs can be used without introducing a level of indirection
Action #3: propose a concrete solution, considering link and nest (Rob Sanderson)
Action #4: propose a concrete solution, considering link and nest (David Newbury)
Gregg Kellogg: if we are moving towards better support for streaming profiles
… we can’t rely on in-memory storage only
… You would need a lot of bookkeeping to handle this.

@azaroth42
Contributor

Requirement: Functionality that allows resource nodes to be serialized at a particular location in the JSON tree, rather than where they are encountered. There isn't necessarily a direct relationship between the top resource node in the tree and the resource node to be serialized (similar to @nest) and the serialization should be a JSON object where the URI of the node is the key (similar to @container @id).

Rationale: There are many use cases (e.g. IIIF, Linked Art, JSON-API,...) where nodes are referred to sporadically throughout the graph, but without any particular obvious first location. Embedding them always would be overkill in many situations, and the cost to find the first occurrence is arbitrarily high based on the size of the tree. Instead, having them at a knowable location makes this a single look up, rather than a tree traversal.

Proposed Solution:

Introduce a new keyword @included which may be present only as a key at the top of the serialized tree. It may be aliased. It functions like @container @id, in that the keys are terms that resolve to URIs, and the values are JSON objects that have the properties of the resource. Resources in the tree can refer to the URIs of nodes in the @included property in the normal way.

Like @nest, @included in instance data does not generate a triple during expansion, instead expansion simply descends into the JSON object and processes each key/value into the graph.

And now the solution splits into two options for discussion:

Option A - framing:

In framing, @included is a new value for the @embed directive. @embed: @included means to instead embed the resource encountered in the @included instance data property, rather than where it is currently encountered.

Option B - compaction:

In a context, a property may be defined as @type: @included. This means that all of the values of the property are to be compacted into the @included structure and the (compacted) URI of the resource is used where the reference to the resource is encountered. As @included only works for resources, in these situations it also has the same meaning as @type: @id in that the value is a URI.

@iherman
Member

iherman commented May 4, 2019

This issue was discussed in a meeting.

  • No actions or resolutions
View the transcript Indexing without a predicate
Benjamin Young: link: #19
Gregg Kellogg: Related w3c/json-ld-wg#52
Benjamin Young: this issue is also known as @included
… proposed by azaroth
… there is a related proposal by gkellogg
Gregg Kellogg: there are several ways of doing something like id-ref
… one of them would be to combine @nest and @container:@id
… Rob’s proposal would better be handled in expansion (this is where syntactic sugar is removed).
… Properties declared as e.g. @container:@include would look into a special @include container.
… Problem with compaction, which can not easily reverse this kind of extension.
… More appropriate in Framing.
… Seems quite complex and convoluted, with a lot of corner cases.

@azaroth42
Contributor

azaroth42 commented Jul 12, 2019

A hopefully simpler proposal and example:

Introduce a new keyword @included which may be present only as a key at the top of the serialized tree. It may be aliased. The value space is always a JSON object, functionally equivalent to an @id container, in that the keys are always URIs (allowing for compaction as normal), and the values are the serialization of the resource identified by that URI.

@included does not generate a triple during expansion. Instead, expansion descends into the JSON object when its URI is encountered in the graph.

@included is generated during compaction when the algorithm encounters @type: @included in a context for a property definition.

For example, this data:

{
  "@context": {
    "eg": "https://example.com/ns/",
    "hasService": {"@id": "eg:hasService", "@type": "@id"},
    "label": "eg:label",
    "Thing": "eg:Thing",
    "Service": "eg:Service"
  },
  "@id": "https://example.org/1",
  "@type": "Thing",
  "hasService": {
    "@id": "https://example.org/service",
    "@type": "Service",
    "label": "My Service"
  }
}

Expands in 1.0 and 1.1 to this form:

  {
    "@id": "https://example.org/1",
    "@type": [
      "https://example.com/ns/Thing"
    ],
    "https://example.com/ns/hasService": [
      {
        "@id": "https://example.org/service",
        "@type": [
          "https://example.com/ns/Service"
        ],
        "https://example.com/ns/label": [
          {
            "@value": "My Service"
          }
        ]
      }
    ]
  }

The following would also expand to the same form:

{
  "@context": {
    "eg": "https://example.com/ns/",
    "hasService": {"@id": "eg:hasService", "@type": "@included"},
    "label": "eg:label",
    "Thing": "eg:Thing",
    "Service": "eg:Service",
    "included": "@included"
  },
  "@id": "https://example.org/1",
  "@type": "Thing",
  "hasService": "https://example.org/service",
  "included": {
    "https://example.org/service": {
        "@type": "Service",
        "label": "My Service"
     }
  }
}

When compacting, the @type: @included definition for hasService would trigger the creation of the included (as aliased from @included), and insert the compacted form of https://example.org/service into the JSON object.
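
The compaction step described above could be sketched roughly as follows, in Python over plain JSON. This is an illustration only: it handles top-level properties only, performs no IRI compaction, and the function name `compact_with_included` is hypothetical. The "@type": "@included" term definition is the proposal from this comment, not an existing JSON-LD feature.

```python
def compact_with_included(doc):
    """Hoist embedded nodes of "@type": "@included" properties into the
    (possibly aliased) @included map of a single top-level object."""
    context = doc["@context"]
    # Terms whose definition carries the proposed "@type": "@included".
    hoisted = {
        term for term, definition in context.items()
        if isinstance(definition, dict) and definition.get("@type") == "@included"
    }
    # Alias for the proposed keyword, falling back to the keyword itself.
    alias = next((t for t, d in context.items() if d == "@included"), "@included")

    included = {}
    output = {}
    for key, value in doc.items():
        if key in hoisted and isinstance(value, dict) and "@id" in value:
            # Store the node body under its identifier and leave a reference.
            included[value["@id"]] = {k: v for k, v in value.items() if k != "@id"}
            value = value["@id"]
        output[key] = value
    if included:
        output[alias] = included
    return output


embedded = {
    "@context": {
        "eg": "https://example.com/ns/",
        "hasService": {"@id": "eg:hasService", "@type": "@included"},
        "label": "eg:label",
        "Thing": "eg:Thing",
        "Service": "eg:Service",
        "included": "@included",
    },
    "@id": "https://example.org/1",
    "@type": "Thing",
    "hasService": {
        "@id": "https://example.org/service",
        "@type": "Service",
        "label": "My Service",
    },
}
compacted = compact_with_included(embedded)
print(compacted["hasService"], compacted["included"])
```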

@gkellogg
Member Author

gkellogg commented Jul 22, 2019

I have a straw man implementation of the expansion part of this, which raises some cases to consider:

  • I presume that if the JSON document is an array that each object member may have its own included block.
  • What if an included block exists but no term has "@type": "@included"? Presume it's simply ignored.
  • What if multiple properties map to @included in the top-level object? Error, or first found wins?
  • What if a property references a missing included key? Presume that it's simply dropped.
  • What if a property with @type: @included contains values that aren't strings mapping to IRIs? I presume they're added to the output as if there were no included.
  • What if @included appears in a non top-level object?
  • What if a value in an included block includes a property with @type: @included? Creates an order-dependent expansion, or requires that included value expansion does not happen until the value is processed, not at the beginning. Seems like this should be disallowed.
  • What if the included map includes the key @none? error or ignored?
  • What if the value of the included key is not a map? Error or dropped?
  • What if a property has @type: @included, but there is no included block? Presume it defaults to an empty map.
  • Can included keys be vocabulary relative? Why not just treat them as strings?

Basic implementation

  1. After looking for type-scoped contexts, if there is no included map, and a key expanding to @included is found, create included from the value, by expanding keys document-relative and expanding the values using the expansion algorithm, and pass this to subsequent invocations of the expansion algorithm.
  2. After looking for @type: @json, if the property has a type mapping of @included, the value must be a string or an array of strings. Expand each value relative to either the document or the vocabulary, depending on the type-mapping of the property term and add the concatenation of any included map value found for that expanded value.
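
The two steps above can be sketched as a pre-expansion pass in Python. This is a simplification under stated assumptions, not the straw man implementation itself: keys of the included map are compared as plain strings (no IRI expansion), only one top-level object is handled, aliases are taken in sorted key order (which is what the "Only uses first entry" example suggests), and a reference with no entry raises the "missing @included referent" error. The function name `inline_included` is hypothetical.

```python
def inline_included(node, context):
    """Replace references held by "@type": "@included" properties with the
    matching values from the node's included block."""
    def is_alias(key):
        return key == "@included" or context.get(key) == "@included"

    # Step 1: find the included map; the first alias in sorted order wins,
    # and a missing block behaves like an empty map.
    included_keys = sorted(k for k in node if is_alias(k))
    included = node[included_keys[0]] if included_keys else {}

    output = {}
    for key, value in node.items():
        if is_alias(key):
            continue  # the block itself produces no property
        definition = context.get(key)
        if isinstance(definition, dict) and definition.get("@type") == "@included":
            references = value if isinstance(value, list) else [value]
            resolved = []
            for reference in references:
                if reference not in included:
                    raise ValueError("missing @included referent")
                found = included[reference]
                # Step 2: concatenate whatever the map holds for each reference.
                resolved.extend(found if isinstance(found, list) else [found])
            value = resolved
        output[key] = value
    return output


context = {"@vocab": "http://example.org/", "data": {"@type": "@included"}}
node = {
    "data": ["http://example.org/data1", "http://example.org/data2"],
    "@included": {
        "http://example.org/data1": {"@type": "Data", "label": "label"},
        "http://example.org/data2": {"@type": "Data2", "label": "label2"},
    },
}
inlined = inline_included(node, context)
print(inlined)
```

The result is an ordinary document with embedded nodes, which a regular expansion run could then process.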

@gkellogg
Member Author

Three other observations:

  1. as the expansion is entirely syntactic, why use IRI expansion for the keys? They could just be strings.
  2. Compaction would need to invent an identifier to use as a key of @included, as none remains in the expanded form. Plus, it would result in compacting the whole content of the property. If in compacted form the property of type @included had multiple string values, each one would be inserted when expanding, but could not reasonably be reconstructed when compacting.
  3. If the same value were repeated for different properties, it might be algorithmically challenging to match that with an existing value of @included.

We may want to skip compacting altogether. We previously thought about doing it with framing, although many of the same challenges remain.
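
The round-trip problem can be seen in a small Python sketch. The helper `inline` below is a hypothetical stand-in for the syntactic substitution performed during expansion, and the `ex:` keys are made up for illustration. Two different compacted inputs produce the same result, so neither the keys nor the grouping can be recovered afterwards:

```python
def inline(references, included):
    """Concatenate the included values for each reference (illustrative only)."""
    resolved = []
    for reference in references:
        found = included[reference]
        resolved.extend(found if isinstance(found, list) else [found])
    return resolved


# Two references, each pointing at one value.
a = inline(
    ["ex:d1", "ex:d2"],
    {"ex:d1": {"label": "one"}, "ex:d2": {"label": "two"}},
)

# One reference pointing at an array of two values.
b = inline(
    ["ex:both"],
    {"ex:both": [{"label": "one"}, {"label": "two"}]},
)

# Both inputs collapse to the same output, and the keys are gone.
print(a == b)
```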

@gkellogg
Member Author

Some expansion examples:

Includes referenced identifier

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }]
}]

Includes with array value

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": [{
      "@type": "Data",
      "label": "label"
    }, {
      "@type": "Data2",
      "label": "label2"
    }]
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }, {
    "@type": ["http://example.org/Data2"],
    "http://example.org/label": [{"@value": "label2"}]
  }]
}]

Included in different top-level objects of an array

[{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}, {
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data2",
      "label": "label2"
    }
  }
}]

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }]
}, {
  "http://example.org/data": [{
    "@type": ["http://example.org/Data2"],
    "http://example.org/label": [{"@value": "label2"}]
  }]
}]

Multiple values for an @included property

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": ["http://example.org/data1", "http://example.org/data2"],
  "@included": {
    "http://example.org/data1": {
      "@type": "Data",
      "label": "label"
    },
    "http://example.org/data2": {
      "@type": "Data2",
      "label": "label2"
    }
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }, {
    "@type": ["http://example.org/Data2"],
    "http://example.org/label": [{"@value": "label2"}]
  }]
}]

@included with no @type: @included

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@id"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=>

[{
  "http://example.org/data": [{"@id": "http://example.org/data"}]
}]

Only uses first entry mapping to @included

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"},
    "includedA": "@included",
    "includedB": "@included"
  },
  "data": "http://example.org/data",
  "includedB": {
    "http://example.org/data": {
      "@type": "Data2",
      "label": "label2"
    }
  },
  "includedA": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }]
}]

Missing @included

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data"
}

=> Error: "missing @included referent"

Missing @included referent

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/other": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=> Error: "missing @included referent"

@included in a non-top-level object

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "block": {
    "data": "http://example.org/data",
    "@included": {
      "http://example.org/data": {
        "@type": "Data",
        "label": "label"
      }
    }
  }
}

=> Error: "invalid @included map"

@included term within @included map

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label",
      "data": "http://example.org/data"
    }
  }
}

=> Error: "invalid @included map"

@included not a map

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": true
}

=> Error: "invalid @included map"
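
The examples above can be approximated in a few lines of Python. This is only a rough sketch of the behaviour described in this comment, not an implementation of any JSON-LD processor: it works on a single top-level object, it does no IRI or context expansion, and the names `resolve_included`, `IncludedError` and the `included_terms` parameter (the set of terms typed `@included`) are hypothetical names made up for illustration.

```python
from typing import Any


class IncludedError(ValueError):
    """Carries the error strings used in the examples above."""


def resolve_included(node: dict[str, Any], included_terms: set[str]) -> dict[str, Any]:
    """Replace string values of @included-typed terms with entries from the @included map.

    Assumption: `included_terms` stands in for the term definitions that a
    real processor would read from the context.
    """
    included = node.get("@included")
    if included is not None and not isinstance(included, dict):
        raise IncludedError("invalid @included map")

    result: dict[str, Any] = {}
    for key, value in node.items():
        if key in ("@context", "@included"):
            continue  # neither survives in the output of this sketch
        if isinstance(value, dict) and "@included" in value:
            # @included is only allowed in the top-level object
            raise IncludedError("invalid @included map")
        if key not in included_terms:
            result[key] = value
            continue

        resolved = []
        for ref in value if isinstance(value, list) else [value]:
            if not isinstance(ref, str):
                resolved.append(ref)  # node and value objects pass through untouched
                continue
            if included is None or ref not in included:
                raise IncludedError("missing @included referent")
            entry = included[ref]
            for obj in entry if isinstance(entry, list) else [entry]:
                # entries must be maps and must not use @included-typed terms themselves
                if not isinstance(obj, dict) or any(k in included_terms for k in obj):
                    raise IncludedError("invalid @included map")
                resolved.append(obj)
        result[key] = resolved
    return result
```

As in the examples, the referenced object replaces the string reference and the `@included` map itself is dropped from the result.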

@azaroth42
Contributor

What if multiple properties map to @included in the top-level object? Error, or first found wins?

I think it should be an error. I think multiple maps should be okay, but only one should be present. With multiple and picking the first, it would be determined by the key order ... which isn't deterministic without further processing (which we already specify, but isn't lovely). Alternatively, if the keys are unique across the maps, they could be merged ... but that seems like a challenge for creation, even if we don't specify it in 1.1. So ... my thought would be error.

What if a property references a missing included key? Presume that it's simply dropped.

e.g. that there's a reference but no included block with that value as a key? If the reference value MUST expand to an IRI, then there's still a valid triple, just no data with the URI as the subject. So I would keep the triple where the IRI is the object.

And this is why I think the keys in the block and the reference should be required to expand to IRIs.

What if @included appears in a non top-level object?

I think that should be an error. There's no point having non-top-level inclusion blocks, as you would never find them.

What if the included map includes the key @none? error or ignored?

Error, as it doesn't expand to an IRI.

What if the value of the included key is not a map? Error or dropped?

Error.

Can included keys be vocabulary relative? Why not just treat them as strings?

That is an excellent question. I think they should be vocabulary relative, such that you could do crazy things like having RDFS descriptions in the @included to give the domain / range of properties that are defined in the context.

We may want to skip compacting altogether. We previously thought about doing it with framing, although many of the same challenges remain.

I think the strongest use case is expansion -- there's data that looks like this in the wild, and the structure helps the target audience of JSON developers do their thing. If we don't specify how to generate that form in compact() in 1.1 ... that's not the end of the world.

@azaroth42
Contributor

Thanks for the work on this Greg!!

@ajs6f
Member

ajs6f commented Jul 24, 2019

What if a property references a missing included key? Presume that it's simply dropped.

e.g. that there's a reference but no included block with that value as a key? If the reference value MUST expand to an IRI, then there's still a valid triple, just no data with the URI as the subject. So I would keep the triple where the IRI is the object.

IIUC, in @azaroth42's formulation, a bit of JSON can produce triples with the URI either as subject or object, which seems like a good opportunity for surprise. I think omission (or erroring out) is a bit more intuitive. But I'm not confident that I understand this very well, without "in the wild" examples.

@azaroth42
Contributor

I think the case from the examples is this one:

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/other": {
      "@type": "Data",
      "label": "label"
    }
  }
}

e.g. data references "http://example.org/data", but that isn't a key in @included.
However there is a triple: <base> data <http://example.org/data>
Otherwise that document expands to an empty set of triples.

Certainly not a hill I'm going to die on, but if it doesn't expand, then I think the @type:@included should always error if the value of the property does not have a reference in the @included map. E.g. it cannot be a string, a number, or a JSON object, only a value that expands to a URI or an array of URIs, all of which are present (after expansion) in the map.

@gkellogg
Member Author

Certainly not a hill I'm going to die on, but if it doesn't expand, then I think the @type:@included should always error if the value of the property does not have a reference in the @included map. E.g. it cannot be a string, a number, or a JSON object, only a value that expands to a URI or an array of URIs, all of which are present (after expansion) in the map.

That's the current interpretation as I laid out in Missing @included referent.

I'll create a PR for the API with a processing description and tests based on the examples I outlined above, plus some others.

Also, to note, @type in a term definition is used to interpret string values, so such a property may have a value object or node object value which would be expanded as if it were the value of any other property.

@gkellogg
Member Author

gkellogg commented Jul 27, 2019

Based on discussion today, it sounds like the group would like to do something different, where we would rely on node references to do the job, and find a way to be able to do something like @graph, but where the graph name is something like @default. Thinking about this more, I don't think we can really make this work, as there's nowhere you could put in "@id": "@default", and trying to do it in the context doesn't leave a really satisfactory expanded form.

Looking more at included from json.api, maybe we want to do something like that, where the value of @included would be treated as an array of node objects in the same graph as the containing node object, and the @included would survive expansion. An example from above might look like the following:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "@base": "http://example.org/base/",
    "id": "@id", 
    "type": "@type",
    "enum": "http://example.org/enum#",
    "classification": {"@type": "@id"},
    "service": {"@type": "@id"},
    "included": "@included"
  },
  "id": "http://example.org/base/1",
  "type": "Thing-with-Items",
  "items": [{
    "id":"http://example.org/base/2",
    "classification": "enum:c6",
    "service": "enum:s2"
  },
    3...26 go here
  {
    "id": "http://example.org/base/27",
    "classification": "enum:c6"
  }],
  "included": [{
    "@id": "enum:c6", "type": "Type", "label": "Classification 6"
  }, {
    "@id": "enum:s2", "type": "Service", "label": "Login Service"
  }]
}

This would expand to something like the following:

[{
  "@id": "http://example.org/base/1",
  "@type": ["http://example.org/Thing-with-Items"],
  "http://example.org/items": [{
    "@id": "http://example.org/base/2",
    "http://example.org/classification": [{
       "@id": "http://example.org/enum#c6"
    }],
    "http://example.org/service": [{
      "@id": "http://example.org/enum#s2"
    }]
  }, {
    "@id": "http://example.org/base/27",
    "http://example.org/classification": [{
      "@id": "http://example.org/enum#c6"
    }]
  }],
  "@included": [{
    "@id": "http://example.org/enum#c6",
    "@type": ["http://example.org/Type"],
    "http://example.org/label": [{"@value": "Classification 6"}]
  }, {
    "@id": "http://example.org/enum#s2",
    "@type": ["http://example.org/Service"],
    "http://example.org/label": [{"@value": "Login Service"}]
  }]
}]

If you flatten this, the @included would go away, and the node references and node objects would effectively be merged together, generating the same triples as we had before.
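
To make the merging concrete, here is a small Python sketch of the idea. It is a simplification and not the flattening algorithm from the API spec: it assumes every node object carries an `@id` (no blank node labelling) and it ignores `@graph`, `@list` and `@reverse`. The function name `merge_nodes` is hypothetical.

```python
from typing import Any


def merge_nodes(expanded: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collect every node object, including those under @included, into one flat list."""
    node_map: dict[str, dict[str, Any]] = {}

    def visit(node: dict[str, Any]) -> dict[str, str]:
        """Merge `node` into node_map and return a node reference to it."""
        node_id = node["@id"]
        target = node_map.setdefault(node_id, {"@id": node_id})
        for key, values in node.items():
            if key == "@id":
                continue
            if key == "@included":
                # included nodes are merged like any other node; the key itself disappears
                for inc in values:
                    visit(inc)
                continue
            bucket = target.setdefault(key, [])
            for value in values:
                if isinstance(value, dict) and "@id" in value and "@value" not in value:
                    value = visit(value)  # embedded node or reference becomes a reference
                if value not in bucket:
                    bucket.append(value)
        return {"@id": node_id}

    for top in expanded:
        visit(top)
    return [node_map[node_id] for node_id in sorted(node_map)]
```

Run on the expanded form above, the node for `http://example.org/enum#c6` ends up with its type and label, while the `classification` values stay plain node references to it.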

If we chose to, we could even allow a term definition aliasing @included that used @container: @id to get the form of an id map:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "@base": "http://example.org/base/",
    "id": "@id", 
    "type": "@type",
    "enum": "http://example.org/enum#",
    "classification": {"@type": "@id"},
    "service": {"@type": "@id"},
    "included": {"@id": "@included", "@container": "@id"}
  },
  "id": "http://example.org/base/1",
  "type": "Thing-with-Items",
  "items": [{
    "id":"http://example.org/base/2",
    "classification": "enum:c6",
    "service": "enum:s2"
  },
    3...26 go here
  {
    "id": "http://example.org/base/27",
    "classification": "enum:c6"
  }],
  "included": {
    "enum:c6": {"type": "Type", "label": "Classification 6"},
    "enum:s2": {"type": "Service", "label": "Login Service"}
  }
}

This is effectively equivalent to the json.api use of included, while adhering more closely to JSON-LD syntax and processing rules.
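
The id-map form and the array form of the included block carry the same node objects. A minimal Python sketch of the conversion, assuming the keys are used verbatim (no IRI expansion) and using the hypothetical helper name `id_map_to_array`:

```python
from typing import Any


def id_map_to_array(id_map: dict[str, Any], id_key: str = "@id") -> list[dict[str, Any]]:
    """Turn {"enum:c6": {...}} into [{"@id": "enum:c6", ...}]."""
    nodes = []
    for node_id, value in id_map.items():
        # a key may map to one node object or to an array of them
        for obj in value if isinstance(value, list) else [value]:
            node = {id_key: node_id}
            node.update(obj)  # in this sketch an id already present on the object wins
            nodes.append(node)
    return nodes
```

For example, `id_map_to_array({"enum:c6": {"type": "Type", "label": "Classification 6"}})` gives a one-element list whose node has `"@id": "enum:c6"` next to the original `type` and `label` entries.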

@iherman
Member

iherman commented Aug 2, 2019

This issue was discussed in a meeting.

  • RESOLVED: focus @included text and example on original inclusion use case; mention value of it as an @graph replacement for bushes–and reference primer for further reading
  • RESOLVED: close issue #19 with merger of @included related PRs
4.1. Indexing with @included
Ivan Herman: -> #208 issue PR
Ivan Herman: -> #19 issue itself
Benjamin Young: We made good progress on this last week.
Gregg Kellogg: It is @included now.
bigbluehat: https://jsonapi.org/
@included comes from the JSON.API spec, and we are adopting this.
… Right now it’s just a container for collecting node objects that don’t have a direct rel with the node in which they are contained.
… There’s been some exchange on the issue highlighting a bush-like use for included.
… In JSON-LD 1.0, using the top-level graph to collect nodes is a corner case. Everywhere else, graphs are treated as named graphs.
… Included doesn’t carry that baggage.
… So @included can be used in favor of @graph in these places.
… In 1.0, you can’t have a graph name being a property of another node. With @included you can.
… We can’t use @graph to define a default graph.
… Except when it is the only property in a top-level object.
Benjamin Young: How does the actual inclusion take place?
Gregg Kellogg: The shape is similar to JSON.API. The value is seen as an array of node objects. If you have a node that is a value of a prop….
… Needed in jsonapi for node references that are not included in the main document as references, but should be included aside.
… Included blocks can be nested, and will be flattened out when done.
… You won’t compact back to included blocks after flattening.
… You can use included in a frame and have it match on diff subjects.
… The name @included is out of sync with other keywords.
… Dave suggested @include
… Just like jsonapi
Ivan Herman: It’s becoming bikeshedding
… I would go further than what you did. Ex 1.1.1 and 1.1.2 (bushes) should be removed.
… We should convince people to not use those forms anymore.
Gregg Kellogg: https://pr-preview.s3.amazonaws.com/w3c/json-ld-syntax/pull/208.html#example-111-using-graph-to-explicitly-express-the-default-graph
Gregg Kellogg: By removing these we won’t lose anything. We would have to remove everything after the note.
Gregg Kellogg: https://pr-preview.s3.amazonaws.com/w3c/json-ld-syntax/pull/208.html#example-103-simple-data-with-several-top-level-nodes
Gregg Kellogg: Also, we may want to reverse example 103 and 104. To clarify writing bushes.
Ivan Herman: Referring to @graph is misleading, as it has not been explained yet there.
Gregg Kellogg: We may want to change an example in the @graph section then.
… We can also just leave that out and use it in the best practices document.
Ivan Herman: Yes
Benjamin Young: Flattened representation will still contain @graph, so readers will have to know what it does.
… This plumbing shift is significant.
… This is going to cause issues for the json people that are operating on the flattened form.
Gregg Kellogg: If you flatten with a context, it would introduce a graph to contain it. This would change the shape dramatically.
… Same with framing. In 1.1 we don’t use an @graph at the top level if not needed.
… We could change the algo to use included instead.
… But we may not want to do that.
… So do we want to replace the main usage of @graph to @included?
… Included allows embedded nodes to go to one place. Like in jsonapi, they don’t want to include referenced things inline, but only a reference to an included block.
Benjamin Young: Useful for reducing payload size. And only including referenced things once.
… These are just IRI references?
Gregg Kellogg: There is no magic going on.
Ivan Herman: In JSON-LD it used to be hard to do these indexed references.
… The bush features can now be expressed in two different ways.
… These things happen.
… It’s a matter of taste which one you prefer.
… I personally always hated graph for representing bushes.
… I like this new included representation for bushes.
… I don’t want to hide the fact that bushes can be described with included instead of the graph ‘hack’.
Gregg Kellogg: We can say that included can also be used without other props in node object for describing node objects without semantic relationship.
Ivan Herman: I’m fine with that.
Benjamin Young: https://pr-preview.s3.amazonaws.com/w3c/json-ld-syntax/pull/208.html#included-blocks-to-be-flattened
Benjamin Young: If @included were @graph, this would make a named graph?
Gregg Kellogg: I think this would make one or two named graphs.
Benjamin Young: With included it won’t make named graphs.
Gregg Kellogg: Yes, just objects.
Benjamin Young: I see the value, but not keen on the new keyword.
… I think we need to explain these next to each other, with their nuances.
… The initial reason for this feature was not meant to displace graph.
… It was meant to bring in other referenced objects in the document.
Gregg Kellogg: What jsonld always had was the ability to reference node
… by defining @id or @vocab you can define that thing.
… our mission is to use json in the wild where this is a pattern of usage
Benjamin Young: You said exactly what I was typing.
… it would be good to use an example from jsonapi
Gregg Kellogg: jsonapi examples are quite long, with a lot of nesting
… we have a test case from jsonapi
… may be too long for here. But may be good for best practices document.
… It would overly complicate the spec to include here.
Benjamin Young: This solves the jsonapi case by aliasing included to @included.
Gregg Kellogg: Yes, you can have multiple properties that have multiple aliases.
… included can be a nested object
Ivan Herman: Can we talk about things that go to the primer?
Gregg Kellogg: What to do with example?
Ivan Herman: Switch the order of problem of Rob. In the primer we will have to spend more words on the fact that there are different things that can be used to do the same thing.
… We must have a primer.
… The current doc is already huge.
Benjamin Young: We need distinction in the main spec explaining diff between included and graph
Ivan Herman: To be honest, at this level there is no real diff between the examples.
… this is a side-effect with included.
… we should not fiddle around with that
Benjamin Young: The graph foundation exists in flattened output, and this won’t go away.
… this needs clarification
Gregg Kellogg: The use of included on its own is a by-product of the feature.
… it does not need its own description in the spec
… There are use cases where that can be useful
… That better lies in a non-normative text.
Benjamin Young: The bush usage would go to the primer
… focus of the text would go back to inter-document referencing.
Ivan Herman: We are falling back to other extreme that I don’t agree with
… we are hiding a feature of included
… In the inclusion part we should mention the alternative representation of bushes.
… because current graph-based bushes are a hack
… It has been around for a while, but we still should mention it.
Benjamin Young: This is an accidental feature since recently.
… it is essential to flattened output. I don’t see it as a hack.
… For non-turtle/trig users.
Ivan Herman: For semweb folks you don’t care it is a hack.
… we get into taste issues
… I don’t want to hide it.
Proposed resolution: focus @included text and example on original inclusion use case; mention value of it as an @graph replacement for bushes–and reference primer for further reading (Benjamin Young)
Ivan Herman: +0.5
Gregg Kellogg: +1
Benjamin Young: +1
Tim Cole: +1
Ruben Taelman: +1
Ivan Herman: If we have a primer, a reference can be put into it in CR
Resolution #2: focus @included text and example on original inclusion use case; mention value of it as an @graph replacement for bushes–and reference primer for further reading
Ivan Herman: include or included?
Benjamin Young: People may expect a URI to be included. But this is wrong.
Ruben Taelman: what was the keyword in JSON API?
Gregg Kellogg: included
Ruben Taelman: in that case, it might be helpful to be consistent with that
Tim Cole: +1 for @included
Proposed resolution: close issue #19 with merger of @included related PRs (Benjamin Young)
Tim Cole: +1
Ivan Herman: +1
Ruben Taelman: +1
Benjamin Young: +1
Gregg Kellogg: +1
Resolution #3: close issue #19 with merger of @included related PRs