Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Schema for Thing Model #972

Closed
sebastiankb opened this issue Sep 30, 2020 · 21 comments
Closed

JSON Schema for Thing Model #972

sebastiankb opened this issue Sep 30, 2020 · 21 comments
Assignees
Labels
Propose closing Problem will be closed shortly if there is no veto. Thing Model Topic related to Thing Models

Comments

@sebastiankb
Copy link
Contributor

The Thing Model can be not used to do validation based on the TD JSON Schema. We need an own JSON Schema definition for Thing Model.

There can be diff file in the future to show the difference between TD and TM.

@sebastiankb sebastiankb added the Thing Model Topic related to Thing Models label Sep 30, 2020
@egekorkan
Copy link
Contributor

From what I understand, the TM will be rather flexible to point that only an @context is necessary? Should the current schema file be copied and all the required fields are removed to make it flexible to any precision but at the same time validate any key that is used in a TM?

@danielpeintner
Copy link
Contributor

The idea that was discussed was that it would be ideal having a patch (between the 2 schemas) or a base schema that is the same for both (TD and ThingModel) and that we can easily point to the differences. Not sure though if this is possible...

@egekorkan
Copy link
Contributor

I think that I at least don't have the knowledge on how to do a patch between two schemas.

Sorry to ping you like this @handrews but is there way to have a schema that has required fields for some properties of an object and another schema that has the exact structure but has less or no keys required in some of the required fields?
Below is a simplified example:

  • Schema 1 that corresponds to the current TD spec's status, having required fields in some places:
{
    "properties": {
        "id": {
            "type": "string"
        }
    },
    "required": ["id"],
    "type": "object"
}
  • Schema 2 that would be a schema for the TD Model, where most object fields are not required but need to be validated if they are present:
{
    "properties": {
        "id": {
            "type": "string"
        }
    },
    "type": "object"
}

We would need a solution that reduces writing it twice. I have thought of doing a $ref to the simpler schema from the more complex one (the current TD Schema) and adding required when needed but that would mean writing the whole structure of the schema twice.

Here is the schema in question by the way: https://github.com/w3c/wot-thing-description/blob/master/validation/td-json-schema-validation.json

@handrews
Copy link

handrews commented Oct 1, 2020

@egekorkan JSON Schema is an additive constraint system, so once you've applied a constraint you can't remove it.

Each subschema produces a validation result that depends on the instance and the subschema's keywords. That validation result doesn't "remember" how it was produced, so you can't negate any of the reasons.

When you say "writing the whole structure twice" do you mean because you have required keywords that get applied deeper in an object structure? Because $ref to simpler schema and then adding required on top of that is the recommended approach.

What draft of JSON Schema are you working with now? (FYI, OpenAPI 3.1-rc0 is using draft 2019-09, and OpenAPI 3.1 final will used the next draft, due out before the end of the year so probably draft 2020-12, so we expect that next draft to be very broadly implemented for not just validation but also code generation and other use cases). 2019-09 and 2020-12 have better extensibility, and while a custom keyword vocabulary still can't remove constraints, you could have a custom keyword that applied required at depth- you would probably not be the only people considering that sort of keyword.

@egekorkan
Copy link
Contributor

Thank you for responding quickly!

What draft of JSON Schema are you working with now?

We are in draft7 but I will update it to 2019-09. I didn't know about the release date of 2020-X and the changes it brings so I didn't consider it that much. What you are proposing seems to be exactly what we want, i.e. a custom keyword that applied required at depth.

When you say "writing the whole structure twice" do you mean because you have required keywords that get applied deeper in an object structure?

Yes, I think :) Below would be a more concrete example as if the links keyword and its href keyword are required in the current TD spec but not by the Thing Model spec:

  • The simpler schema with less required terms for the Thing Model:
{
    "$schema ": "http://json-schema.org/draft-07/schema#",
    "$id":"myid",
    "type": "object",
    "properties": {
        "title": {
            "type": "string"
        },
        "links": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "href": {"type": "string"},
                "rel": {"type": "string"}
            }
          }
        },
        "security": {"type": "string"}
    },
    "required": ["title"]
}
  • The extended schema with more required terms for the Thing Description:
{
    "$schema ": "http://json-schema.org/draft-07/schema#",
    "definitions":{
      "simpleSchema":{
        //schema above, this will be located in another file of course
      }
    },
    "type": "object",
    "$ref":"#/definitions/simpleSchema",
    "properties": {
        "links": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["href"]
          }
        }
    },
    "required": ["title","security","links"]
}

So in here, to be able to say that href is required in a link object, I had to partially rewrite the schema structure.

@handrews
Copy link

handrews commented Oct 1, 2020

@egekorkan yeah, the deepRequired idea (which has been proposed with several different syntaxes) would let you somehow stuff that links/items/href required into the top level. This would not be a JSON Schema standard keyword, but in 2019-09 you can use a meta-schema to indicate that you require an implementation to understand additional vocabularies, and that it should error out if it does not understand them.

That means people can write plugin systems to load vocabulary handlers. We've been working on being more clear about what a keyword can and can't do, so there are reasonable bounds on the plugin interface. Hopefully 2020-12 will be more clear on that.

Exactly what will be in 2020-12 is still in flux, but there are two proposed breaking changes:

  • (breaking from 2019-09 only) replace the new-in-2019-09 $recursive* keywords with $dynamic* that are a lot easier to understand. These are advanced keywords useful when extending recursive structures like meta-schemas. They are otherwise unlikely to be used.
  • (breaking from every version) split items (which currently has a single-schema form and an array-of-schema forms) into prefixItems (replacing the array-of-schema form) and items (single schema form, applies to everything after prefixItems, which means in the absence of prefixItems it behaves exactly like single-schema items does today). This also allows us to drop additionalItems as items+prefixItems now handles that case- this is good because there was a weird overlap in items and additionalItems that caused confusion.

While the prefixItems idea has been well-received so far, and as far as we know, array-form items is fairly rarely used, I'm a bit nervous on that which is why I'm bringing it up here to see if it sounds problematic.

@egekorkan
Copy link
Contributor

Alright, even more motivation to properly understand meta-schemas and 2019-09 for me :)
Interesting development regarding items as well. The TD supports both types of items and also has some assertions regarding additionalItems but from what I have seen in our plugfests, there has not been any prefixItems style usage of items. The validation schema for TDs also do not use it to validate TDs but it allows items keyword to be done in two ways. This will need to change :)

@handrews
Copy link

handrews commented Oct 2, 2020

The TD supports both types of items and also has some assertions regarding additionalItems but from what I have seen in our plugfests, there has not been any prefixItems style usage of items

Good to know. And yeah, the

{
  "type": "array",
  "items": [{"type": "boolean"}, {"type": "integer"}],
  "additionalItems": {"type": "string"}
}

style usage, which would become

{
  "type": "array",
  "prefixItems": [{"type": "boolean"}, {"type": "integer"}],
  "items": {"type": "string"}
}

seems to have always been pretty rare. The most common example is if you're trying to describe something like a variable-length function argument signature, where the first few arguments have specific meaning and then you can tack on however many arguments of some general type as you want. But that's not a common use case. And everyone is tired of having to say "single-schema items" or "array-form items" as it gets cumbersome.

The ambiguity, just in case it's relevant, is this:

{
  "type": "array",
  "additionalItems": {"type": "string"}
}

at a glance, to most people that looks like it should behave the same as this:

{
  "type": "array",
  "items": {"type": "string"}
}

but technically additionalItems is ignored if array-form items is not present. Which is confusing and has been inconsistently implemented, and people have argued back and forth over what it should do. So we got rid of the ambiguity.

@danielpeintner
Copy link
Contributor

In the scripting API task force we can produce an ExposedThing by initially passing in a TD fragment which can be then enhanced with handlers et cetera.

Technically the TD fragment we have in scripting is very similar to a TD model (it is aligned with a TD but does not have the same strong requirements). Moreover we also want to have a proper way to validate the input (see w3c/wot-scripting-api#287).

The validation we have in mind is to use the JSON schema we already have and remove all all "required": { ... } keywords. Doing so checks for example whether the types are correct but does not impose any requirements of presence of X or Y.

I believe that this new JSON Schema may be also applicable to Thing Model but I also read the following in this thread

Schema 2 that would be a schema for the TD Model, where most object fields are not required but need to be validated if they are present:

What means "most object fields are not required"?

I think it would be beneficial to have just one other JSON schema besides the main JSON schema which can be used in multiple use-cases and not multiple JSON schemas. Please not that in discovery a similar input/validation is expected for executing queries.

Note: I also found some mentioning of "td model" in the JSON schema provided. Is this mentioning of "td model still" correct? It is also the only appearance.
"description": "JSON Schema for validating TD instances against the TD model. TD instances can be with or without terms that have default values"

@sebastiankb
Copy link
Contributor Author

From today's TD call:

  • we should clarify if there is a minimum of requirements of mandatory terms in Thing Model like "title" and "@context"
  • clarify the relations to partial TD?
  • for this we need some input from @mlagally and @danielpeintner

@danielpeintner
Copy link
Contributor

From the scripting perspective (and the need for a partial TD) we need a JSON Schema that is essentially a copy of the JSON schema defined in TD + removing all required terms.

The questions is whether this works for ThingModel as well or there are mandatory terms like title ...
This would mean we have VERY similar JSON Schemas with just some slight differences. I would like to see not too many variants.

@relu91
Copy link
Member

relu91 commented Jan 21, 2021

It may help the discussion to have a look at those two PRs about Partial TDs: w3c/wot-architecture#577 w3c/wot-scripting-api#289 (still a draft).

The questions is whether this works for ThingModel as well or there are mandatory terms like title ...
This would mean we have VERY similar JSON Schemas with just some slight differences. I would like to see not too many >variants.

Like mathematicians wonder if P = NP, we have our question: PartialTD = ThingModel ? 🤣

@egekorkan
Copy link
Contributor

By the way, once this is decided, I would like to provide small scripts that basically take the original schema and do the changes on it. This way, we would avoid discrepancy between different schemas when the TD vocabulary changes.

@handrews
Copy link

@egekorkan if the schema isn't too complex in structure, you can just write it without the requireds and allOf different sets of requireds to get the correct behaviors. If the schema is more complex, I recommend just maintaining a set of either application/json-patch+json or application/merge-patch+json documents for the deltas.

@egekorkan
Copy link
Contributor

@handrews the schema is sadly a bit too complex at the moment. Nothing unusual or difficult to understand but big for sure: https://github.com/w3c/wot-thing-description/blob/master/validation/td-json-schema-validation.json

Do you recommend actually manually removing the unwanted keywords? I would still store the differences but also provide a script to generate them from the main schema.

@handrews
Copy link

@egekorkan yeah that looks like a reasonable but non-trivial schema. When you have a lot of references it's hard to layer in more constraints and have all of the references reach the layered definitions. Draft 2020-12's $dynamicAnchor and $dynamicRef substantially address that limitation, although not without cost in terms of easily reading a large set of schemas.

If you store the differences as JSON Patch or JSON Merge Patch (the media types I listed above), your entire script is basically someLibrary.applyPatch(schema, changes)

@relu91
Copy link
Member

relu91 commented Jan 22, 2021

If you store the differences as JSON Patch or JSON Merge Patch (the media types I listed above), your entire script is basically someLibrary.applyPatch(schema, changes)

Thanks for the suggestion it may help the definition of the algorithm that I am writing for partialTDs.

@sebastiankb
Copy link
Contributor Author

sebastiankb commented Feb 3, 2021

After having some internal discussion about the placeholder concept in Thing Models, the TM JSON Schema needs some further definitions. To see the motivation lets have this TM snippet:

                "water": {
                    "type": "integer",
                    "minimum": {{WATER_MINIMUM}},
                    "maximum": 100
                }

However, this is not working since this is not a valid JSON. The workaround would be

                "water": {
                    "type": "integer",
                    "minimum": "{{WATER_MINIMUM}}",
                    "maximum": 100
                }

When it comes to substitution the string have to be converted to an number representation back by removing the '"' character. The proposal would be compliant to the json-placeholder-replacer lib:

https://www.npmjs.com/package/json-placeholder-replacer

This means that the JSON schema for the TM should provide for number-based terms such as for minimum and maximum in additional string typing to enable the use of the placeholder feature.

@sebastiankb
Copy link
Contributor Author

From today's TD call:

  • it seems that the JSON Schema for Partial TD and TM have to be different
  • TM have some constraint such as rel=extend that should be only valid in TM or having placeholder feature (needs additional string types for number-based terms)
  • there was also a discussion if TM may be not a valid JSON anymore but this would result to more complexity in terms of tools support etc.

@egekorkan
Copy link
Contributor

I think that this PR can be closed now. Swapping to propose closing label

@egekorkan egekorkan added Propose closing Problem will be closed shortly if there is no veto. and removed PR available labels Jul 27, 2021
@egekorkan
Copy link
Contributor

Closing since we have a TM schema and its generator now. However, there are some really informative comments in this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Propose closing Problem will be closed shortly if there is no veto. Thing Model Topic related to Thing Models
Projects
None yet
Development

No branches or pull requests

5 participants