
How can compaction algorithms be used to transform a JSON-LD document with unknown @context value(s) to a JSON-LD document with only known @context value(s)? #610

Open
TallTed opened this issue Jul 24, 2024 · 10 comments

@TallTed
Member

TallTed commented Jul 24, 2024

Challenge arose in VCDM and Data Integrity

https://github.com/w3c/vc-data-model/blob/59ed20b1886d34976fa9e729c0b70556cd298e39/index.html#L4375-L4378

Applications MAY
use JSON-LD <a data-cite="JSON-LD11-API#compaction-algorithms">compaction
algorithms</a> to transform a document that uses an unknown JSON-LD context
to one that does not, so the new document's terms will match expectations.

I could not find instructions for the above described transformation. I think that such guidance should allow the application developer to make the described transformation without becoming a JSON-LD expert.

I think that the application developer must —

  1. expand the JSON-LD that has unknown JSON-LD context(s)
  2. remove the declaration of the unknown JSON-LD context(s)
  3. (maybe?) add declaration of known JSON-LD context(s)
  4. compact the JSON-LD

Seems best to me, to add such documentation to JSON-LD docs, which can then be cited by VCDM, Data Integrity, and others.

(attn: @msporny, @dlongley)

@gkellogg
Member

gkellogg commented Jul 24, 2024

You're correct that compaction involves expansion. Expanding a JSON-LD document eliminates any contexts it uses, so there is no declaration of an unknown JSON-LD context. Compaction takes a specific (known) context to use for the compaction, and the result will include an @context element referencing the context used to do the compaction.
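
As a rough illustration of the point above, here is a deliberately simplified sketch. It is a toy model, not the JSON-LD 1.1 algorithms: it assumes flat documents whose `@context` is an inline object mapping terms to IRIs, and it leaves values untouched (real expansion also wraps values and returns arrays). The function names `expand` and `compact` and all `vocab.example` IRIs are made up for the example.

```javascript
// Toy model only: flat documents, inline contexts of the form { term: IRI }.
// This is NOT the JSON-LD 1.1 Expansion or Compaction Algorithm.

function expand(doc) {
  const context = doc["@context"] || {};
  const expanded = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "@context") continue; // contexts do not survive expansion
    // Use the term's IRI if defined; keep keys that already look like IRIs.
    const iri = context[key] ?? (key.includes(":") ? key : null);
    if (iri !== null) expanded[iri] = value;
  }
  return expanded;
}

function compact(doc, knownContext) {
  const expanded = expand(doc); // compaction expands first
  const termByIri = new Map(
    Object.entries(knownContext).map(([term, iri]) => [iri, term])
  );
  // The result carries only the context that was passed in.
  const result = { "@context": knownContext };
  for (const [iri, value] of Object.entries(expanded)) {
    // IRIs with no term in the known context stay as full IRIs.
    result[termByIri.get(iri) ?? iri] = value;
  }
  return result;
}

const incoming = {
  "@context": {
    name: "https://vocab.example/name",
    nick: "https://vocab.example/nickname"
  },
  name: "Ada",
  nick: "ada99"
};
const known = { fullName: "https://vocab.example/name" };

const expandedDoc = expand(incoming);
const recompacted = compact(incoming, known);
console.log(expandedDoc);
console.log(recompacted);
```

In this sketch the incoming context is used only while expanding; the output refers to `known` alone, with `fullName` as a term and the nickname property left as a full IRI.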

During discussion, the point seemed to be that if a document used a term that did not expand properly, the corresponding entry would be dropped, although I don't see that concern explicitly in the text you cite.

  • If the term is encountered as part of context processing, there are cases in Create Term Definition that will raise an invalid term definition error, among others. Other places can result in an error as well; for example, if the term does not expand, you get an invalid IRI mapping error.
  • If the term is encountered when expanding the document (say, as an entry key), look at step 13.3 of the Expansion Algorithm, where if a term expands to null (which it will, if undefined), the whole entry is skipped, so it is absent from the expanded document and therefore never reaches compaction. There has been some thought about optionally generating an error instead (suggested by @dlongley), but there's no issue tracking this right now.
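
The second case can be pictured with a small toy model. This is a sketch under stated assumptions, not the Expansion Algorithm itself: it handles only flat documents with an inline term-to-IRI context. The function `expandKeys`, its `strict` option (standing in for the "optionally generate an error" idea), and the `cats.example` IRI are all hypothetical.

```javascript
// Toy model only: flat document, inline context of the form { term: IRI }.
// `strict` is a hypothetical option, not part of any JSON-LD API.
function expandKeys(doc, { strict = false } = {}) {
  const context = doc["@context"] || {};
  const expanded = {};
  const dropped = [];
  for (const [key, value] of Object.entries(doc)) {
    if (key === "@context") continue;
    // A key expands to its defined IRI, to itself if it already looks
    // like an IRI, or to null if it is an undefined term.
    const iri = context[key] ?? (key.includes(":") ? key : null);
    if (iri === null) {
      if (strict) throw new Error(`Undefined term: ${key}`);
      dropped.push(key); // the whole entry is skipped
      continue;
    }
    expanded[iri] = value;
  }
  return { expanded, dropped };
}

const doc = {
  "@context": { cats: "https://cats.example/vocab#cats" },
  cats: 3,
  dogs: 2 // not defined by the context
};

const { expanded, dropped } = expandKeys(doc);
console.log(expanded);
console.log(dropped);
```

Here `dogs` never appears in `expanded`, so nothing downstream (including compaction) can see it; with `{ strict: true }` the same input throws instead.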

I'm not sure what else needs to be added to make the suggested statement unambiguous.

@TallTed
Member Author

TallTed commented Jul 24, 2024

During discussion, the point seemed to be

I definitely expressed myself unclearly, as this point was not my intent at all.

I am thinking that something like the following should be added somewhere — could be any or all of JSON-LD, VCDM, or DI —

A JSON-LD document that declares an unknown `@context` value can be expanded
(which removes all `@context` declarations), and then re-compacted with one or
more known `@context` declarations. JSON-LD terms that are not found within
that known `@context` will remain as absolute URIs in the new document, while
JSON-LD terms that _are_ found within the known `@context(s)` will be changed
to plain literal terms.

Examples of such before and after documents — showing the changes of dummy "unknown" @context declarations and literal JSON keys to dummy "known" @context declarations, literal JSON keys, and URI JSON keys — would probably help comprehension.

@msporny
Member

msporny commented Jul 24, 2024

Yes, what @TallTed says above is much closer to the language I was hoping for. As to where this language goes, I'm a bit ambivalent, but it would be nice for the JSON-LD spec to speak directly to that notion.

@dlongley
Contributor

dlongley commented Jul 24, 2024

Just to be clear, this issue is not about "invalid or missing" term definitions. This is about receiving a document with an @context containing some previously unseen URL or object and re-compacting that document to one that uses a well-known context (i.e., one that uses term definitions that some code has been written against).

For example:

const incomingDoc = {
  "@context": "https://never-seen-before.example",
  // to code that doesn't know ^ this context, it must not
  // try to understand the meaning of the JSON key from
  // its literal string of characters
  "shouldBeConsideredOpaque": "some value"
};

const wellKnownContext = {
  "@context": {
    "cats": "https://cats.com#cats"
  }
};

const recompactedDocument = compact(incomingDoc, wellKnownContext);

if(recompactedDocument.cats !== undefined) {
  // do some cat stuff
} else {
  throw new Error("Sorry no cats!");
}

If we need a more concrete use case, what was discussed over in the VCWG was a case where the incoming document used multiple contexts, one that defined international driver's license terms and one that defined US-only driver's license terms. The consumer only understood the international driver's license terms, so they could recompact the document against that context alone, removing the US-only context:

const incomingDoc = {
  "@context": [
    "https://international-dl.example",
    "https://usa-dl.example"
  ],
  "international_dl_field": "some international value",
  "usa_dl_field": "some US value"
};

const wellKnownContext = "https://international-dl.example";
const documentLoader = url => {
  if(url === wellKnownContext) {
    return {
      // return `RemoteDocument` object with static context
      contextUrl: null,
      documentUrl: url,
      document: {
        "@context": {
          "international_dl_field": "https://international-dl.example/vocab#dl_field"
        }
      }
    };
  }
  return someDefaultNetworkDocumentLoader(url);
};

const recompactedDocument = compact(incomingDoc, wellKnownContext, documentLoader);
// Note: the US fields will be fully expanded now to URLs and ignored,
// and `recompactedDocument` looks like:
/*
{
  "@context": "https://international-dl.example",
  "international_dl_field": "some international value",
  "https://usa-dl.example/vocab#usa_dl_field": "some US value"
}
*/

if(recompactedDocument.international_dl_field !== undefined) {
  // do some international DL stuff
} else {
  throw new Error("Sorry no international DL stuff!");
}

@TallTed
Member Author

TallTed commented Jul 24, 2024

To help eliminate confusion within this issue, @gkellogg, please <strike> or otherwise edit #610 (comment) such that only the first and last paragraphs remain in play. If the rest of that comment needs to be pursued further, I suggest that it go into another issue.

As far as how "to make the suggested statement unambiguous" — ambiguity is not my concern. Removing the current requirement that implementers of VCDM or DI fully grok JSON-LD is my concern; implementers of VCDM or DI should generally be able to follow only the algorithms/recipes therein, which are much simpler and more focused than those in JSON-LD.

@gkellogg
Member

Still not sure what needs to be added to the spec; could it be just a best practice?

The act of compacting a document always expands it first, which specifically is there to remove contexts, so (presuming that a document loader doesn't restrict it) an unknown context is used for the expansion, but the provided context (wellKnownContext) is used for compaction. This is just the way that JSON-LD work, and I don't see what adding any text would accomplish.

If you want to discuss a use case for re-compacting a JSON-LD document to eliminate unknown contexts, it would seem to be just "compact the document using the well-known context".

@msporny
Member

msporny commented Jul 29, 2024

This is just the way that JSON-LD work, and I don't see what adding any text would accomplish.

Yes, you're right in "that's the way JSON-LD works". However, it seems like we need to say /something/ to avoid permathreads like this:

w3c/vc-data-integrity#272

Granted, only part of that permathread is about this issue, but it's clear that people don't quite understand how basic JSON-LD compaction works (nor probably want to learn about how it works). Pointing them to the existing section in the JSON-LD specification on compaction and expansion didn't seem to help either. The guidance that @TallTed is asking for would probably be fine as a BCP, but I'm not sure if some in that thread would agree.

We just merged some text this weekend that made an attempt at some guidance here:

https://w3c.github.io/vc-data-integrity/#validating-contexts

Perhaps, ideally, we wouldn't have that section in the Data Integrity specification, but would rather put it in a JSON-LD WG specification. Whether that's in the core JSON-LD spec, or a BCP document, is up to the WG to decide.

@gkellogg
Member

I wouldn't be averse to adding some informative paragraphs, or a sub-section to the Compaction Algorithm, describing how compaction can be used to remove/replace unknown contexts with a well-known context, along with some text that describes why you might want to do this. But you guys are probably in the best position to create such a PR.

@TallTed
Member Author

TallTed commented Jul 29, 2024

The act of compacting a document always expands it first

So far as I have found, nothing has explicitly said that in such simple language, until this thread. Perhaps I've overlooked it.

This is just the way that JSON-LD work, and I don't see what adding any text would accomplish.

People who don't already know that "[this] is just the way that JSON-LD [works]" would benefit by having even just that sentence added to the spec, but I think a few more sentences would be better. I don't think it needs to be more than a few paragraphs, if that much.

I believe the JSON-LD algorithms express that "compacting a [JSON-LD] document always expands it first", but understanding those algorithms requires a fairly deep dive into technical lingo, and making one's brain pretend it's silicon for long enough to walk through the algorithm oneself, which should not be necessary for all readers nor all deployers of these technologies.

A developer of a tool that they're linking to a JSON-LD processing library should be able to just know (and even this may be more than they really need to know) that if their tool asks the library to compact a given JSON-LD document based on (for instance) their corporate standard @context declarations, then the JSON-LD library will (1) expand the original JSON-LD document based on the @context declarations it contains, (2) replace those original @context declarations with the one(s) provided as input to the library's compact (or recompact) routine, and (3) return a compacted JSON-LD document that uses the @context declarations they provided and leaves out the original @context declarations.

It might be better for such a library to make available an interface or API that starts with "replace existing @context..." and then walks through submission of file specification(s), URI(s), and/or plaintext @context values, which are then used to perform the expansion-and-(re)compaction described above.
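
One way to picture such a "replace existing @context" interface is the sketch below. It is only a toy: it assumes flat documents and contexts that are plain term-to-IRI maps, and it is not how any real JSON-LD library is implemented. The names `replaceContext`, `resolveContext`, and `loadContext` are hypothetical, and the in-memory `contexts` map stands in for a document loader; the URLs and IRIs are the dummy ones from the driver's license example earlier in this thread.

```javascript
// Toy model only; hypothetical helper names. Contexts are { term: IRI } maps.
const contexts = new Map([
  ["https://international-dl.example", {
    international_dl_field: "https://international-dl.example/vocab#dl_field"
  }],
  ["https://usa-dl.example", {
    usa_dl_field: "https://usa-dl.example/vocab#usa_dl_field"
  }]
]);

// Stand-in for a document loader: only statically known URLs resolve.
function loadContext(url) {
  const context = contexts.get(url);
  if (!context) throw new Error(`Unknown context URL: ${url}`);
  return context;
}

// Merge one or more context references (URLs or inline maps) into one map.
function resolveContext(ref, load) {
  const refs = Array.isArray(ref) ? ref : [ref];
  return Object.assign(
    {},
    ...refs.map(r => (typeof r === "string" ? load(r) : r))
  );
}

function replaceContext(doc, newContextRef, load) {
  const oldTerms = resolveContext(doc["@context"] ?? {}, load);
  const newTerms = resolveContext(newContextRef, load);
  const termByIri = new Map(
    Object.entries(newTerms).map(([term, iri]) => [iri, term])
  );
  // (2) the result declares only the replacement context
  const result = { "@context": newContextRef };
  for (const [key, value] of Object.entries(doc)) {
    if (key === "@context") continue;
    // (1) expand the key using the document's own contexts
    const iri = oldTerms[key] ?? (key.includes(":") ? key : null);
    if (iri === null) continue; // undefined terms are dropped
    // (3) compact against the replacement context, else keep the full IRI
    result[termByIri.get(iri) ?? iri] = value;
  }
  return result;
}

const incomingDoc = {
  "@context": ["https://international-dl.example", "https://usa-dl.example"],
  international_dl_field: "some international value",
  usa_dl_field: "some US value"
};

const replaced = replaceContext(
  incomingDoc, "https://international-dl.example", loadContext
);
console.log(replaced);
```

The caller supplies only the document and the context it wants; the expand-then-compact sequence stays hidden behind the one call, which is roughly the experience described above.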

@TallTed
Member Author

TallTed commented Jul 29, 2024

I also suggest avoiding such TLAs (Three Letter Acronyms) as BCP, unless expansion is provided nearby. There are many possible interpretations of BCP, and it's not immediately clear whether "Best Current Practice(s)" is what was intended (though it seems likely).

@gkellogg gkellogg moved this to Future Work in JSON-LD Management Oct 19, 2024

4 participants