Add 'minLength' and 'maxLength' to StringSchema #889

Closed
RoboPhred opened this issue Apr 10, 2020 · 13 comments
Labels
Propose closing Problem will be closed shortly if there is no veto. validation Topic related to Normative Parsing, Validation, Consumption

Comments

@RoboPhred

RoboPhred commented Apr 10, 2020

Currently, StringSchema provides no method to describe the expected length of the string.

Both min and max length should be able to be specified. There is currently no way to enforce that the string has content (minLength: 1), nor is there a way to indicate that the device may have a maximum allowable length for its property (maxLength: 64). Max length in particular is important for memory constrained devices.

These properties currently exist on JSON Schema, and should be supported here.
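To make the two keywords concrete, here is a minimal sketch in Python. It assumes the JSON Schema convention that string length is counted in characters (Unicode code points) rather than bytes. The function names `satisfies_length` and `utf8_size` are hypothetical and not part of any WoT or JSON Schema library.

```python
from typing import Optional


def satisfies_length(value: str,
                     min_length: Optional[int] = None,
                     max_length: Optional[int] = None) -> bool:
    """Return True if value respects the optional minLength/maxLength bounds.

    Length is counted in Unicode code points, which is what len() returns
    for a Python str.
    """
    length = len(value)
    if min_length is not None and length < min_length:
        return False
    if max_length is not None and length > max_length:
        return False
    return True


def utf8_size(value: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


print(satisfies_length("", min_length=1))                # False
print(satisfies_length("kitchen", 1, 64))                # True
print(satisfies_length("x" * 65, max_length=64))         # False
# 64 characters, each taking two bytes in UTF-8
print(satisfies_length("\u00e9" * 64, max_length=64))    # True
print(utf8_size("\u00e9" * 64))                          # 128
```

The last two lines suggest that a device with a fixed-size byte buffer may need a byte-level check in addition to maxLength, since a string can pass the character limit and still exceed the buffer once encoded.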

@egekorkan
Contributor

Actually, there is the following sentence in the specification:

It is noted that data schema definitions within Thing Description instances are not limited to this defined subset and may use additional terms found in JSON Schema using a TD Context Extension for the additional terms as described in § 7. TD Context Extensions, otherwise these terms are semantically ignored by TD Processors (for details about semantic processing, please refer to § D. JSON-LD Context Usage and the documentation under the namespace IRIs, e.g., https://www.w3.org/2019/wot/td).

We are aware that some JSON Schema keywords are not included. At least until JSON Schema becomes a standard, we did not want the DataSchema vocabulary to inherit 100% of JSON Schema (I am only 90% sure about the reason). However, you can use such keywords even without a context extension, since a JSON Schema validator will detect the relevant keywords anyway. The context extension gives you the guarantee that they are well defined and properly understood by TD or JSON-LD parsers.

You can, however, ask why these are not included when, e.g., minItems is included. I would guess that nobody was using these keywords and nobody requested them. Your use case does make sense though :) If there is enough interest, they can be added in V1.1.

@RoboPhred
Author

RoboPhred commented Apr 14, 2020

I am exploring using the TD context extensions you suggested, but I am unable to find the IRI* to use in the context extension for the full json-schema spec.

Ultimately, I need a way to transform the context-annotated fragment of the TD into a json-schema object to pass to a validator. I have been experimenting with jsonld-cli, and I think the process is to expand the TD, retrieve the target (in this case, a cancellation DataSchema provided in the webhook example), and condense it back down by specifying the json-schema IRI as the context.

Unfortunately, I have not been able to complete this process owing to not finding the right context to use for the json-schema. I have tried using https://www.w3.org/2019/wot/json-schema# as defined in § 4 Namespaces, but that results in the error Dereferencing a URL did not result in a valid JSON-LD object. Even if it worked, that also seems tied to the WOT specification of DataSchema, which wouldn't have the validation properties I am looking for.

Can you point me in the right direction for which context IRI to use for json-schema? Does one exist, or has the json-schema standardization process not produced one yet?

* Forgive me if this is the wrong acronym. This is the first time I have seen JSON-LD and I am still getting used to the spec.

@egekorkan
Contributor

I am exploring using the TD context extensions you suggested, but I am unable to find the IRI* to use in the context extension for the full json-schema spec.

So this is something we do not specify. However, this document here can provide better explanations. I am not aware of an RDF notation of the entire JSON Schema specification. However, from what I understand, you do not need to parse a TD in a JSON-LD way to do validation.

Just to make sure I understand you correctly, let's say you have the following TD, which uses the minLength and maxLength keywords from JSON Schema:

{
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "title": "MyLampThing",
    "properties": {
        "status" : {
            "type": "string",
            "minLength":4,
            "maxLength":10,
            "forms": [{"href": "https://mylamp.example.com/status"}]
        }
    }
}

So this means that a response delivered by the Thing should be something like "hithere" but not "hi" or "hithereeveryoneintheworld". You, as the Consumer or Client interacting with the Thing, want to validate the responses of the Thing. Given a validator with an API like myValidator.validate(schema, payload), you would call myValidator.validate(td.properties.status, currentPayload). Any JSON Schema validator should ignore the keyword "forms" but take everything else into account for validation.
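The validation step described above can be sketched as follows. This is not a real JSON Schema validator: `validate_string` is a hypothetical helper that only understands `type`, `minLength`, and `maxLength`, and it skips every other key (such as `forms`), which is the behavior this comment expects from a JSON Schema validator.

```python
from typing import Any, Dict

# The "status" property from the TD above, as a plain dict.
status_schema: Dict[str, Any] = {
    "type": "string",
    "minLength": 4,
    "maxLength": 10,
    "forms": [{"href": "https://mylamp.example.com/status"}],
}


def validate_string(schema: Dict[str, Any], payload: Any) -> bool:
    """Check payload against the string-related keywords of schema.

    Only "type", "minLength" and "maxLength" are inspected; any other key,
    including the TD-specific "forms", is ignored.
    """
    if schema.get("type") == "string" and not isinstance(payload, str):
        return False
    if isinstance(payload, str):
        if "minLength" in schema and len(payload) < schema["minLength"]:
            return False
        if "maxLength" in schema and len(payload) > schema["maxLength"]:
            return False
    return True


for candidate in ("hithere", "hi", "hithereeveryoneintheworld"):
    # Prints True for "hithere" and False for the other two.
    print(candidate, validate_string(status_schema, candidate))
```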

If this is not your case, please provide more concrete examples or a "minimal working/not working case"

@RoboPhred
Author

RoboPhred commented Apr 16, 2020

I am more interested in specifying requirements for values of writable properties.

For example, a thing exposes a location property that is mandatory and has a maximum length, which is settable by consumers. This property needs to specify two things:

  • It cannot be an empty string (minLength: 1)
  • It must be 64 characters or less (maxLength: 64)

A web interface that consumes TDs to generate configuration pages for things will be able to use this schema to properly validate the user's input before sending it off to the thing.
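One way such a configuration page could use these constraints is to mirror them onto an HTML text input, whose `minlength` and `maxlength` attributes let the browser check the value before it is submitted. The sketch below is an assumption about how a frontend might work, not something defined by the TD specification, and `render_text_input` is a hypothetical helper.

```python
from html import escape
from typing import Any, Dict


def render_text_input(name: str, schema: Dict[str, Any]) -> str:
    """Build an <input> element for a string DataSchema.

    minLength/maxLength are copied to the HTML minlength/maxlength
    attributes. HTML's minlength check does not apply to an empty value,
    so "required" is added whenever minLength is at least 1.
    """
    attrs = ['type="text"', f'name="{escape(name, quote=True)}"']
    min_length = schema.get("minLength")
    max_length = schema.get("maxLength")
    if min_length is not None:
        attrs.append(f'minlength="{int(min_length)}"')
        if min_length > 0:
            attrs.append("required")
    if max_length is not None:
        attrs.append(f'maxlength="{int(max_length)}"')
    return "<input " + " ".join(attrs) + ">"


location_schema = {"type": "string", "minLength": 1, "maxLength": 64}
print(render_text_input("location", location_schema))
# <input type="text" name="location" minlength="1" required maxlength="64">
```

Browser-side checks like this are a convenience for the user; the Thing would still need to enforce the limits itself.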

I am interested in finding the right context for json-schema because I am trying to keep my TD generator library strictly conformant to the spec, and allowing arbitrary properties with the assumption that the client knows what to do about it goes against that. I was hoping to end up with something like this

{
    "@context": [
        "https://www.w3.org/2019/wot/td/v1",
        {"jsc": "path/to/json-schema"}
    ],
    "properties": {
        "location": {
            "type": "string",
            "jsc:minLength": 1,
            "jsc:maxLength": 64,
            "forms": [{"href": "https://mylamp.example.com/location"}]
        }
    }
}

The client would have knowledge that the DataSchema from the TD is json-schema compatible, and would now also know any jsc keys are also json-schema compatible.
I was hoping to use RDF expand/compress to pivot the context from TD to json-schema on properties.location, in order to reinterpret the property into a form a json-schema validator can accept. This would be needed as the jsc: prefix on the existing json-schema keys would make them not match if passed as-is.

The process should end up with something like:

{
  "@context": [
    "path/to/json-schema",
    {"td": "https://www.w3.org/2019/wot/td/v1"}
  ],
  "type": "string",
  "minLength": 1,
  "maxLength": 64,
  "td:forms": [{"td:href": "https://mylamp.example.com/location"}]
}

which can be processed by a json-schema validator.
(How it would know type is equivalent across td and json-schema I have not solved yet, perhaps I would check the td prefix and tell the json-schema to account for it. It is much more feasible to do this on the limited properties of DataSchema than it is to do so on the far more expansive json-schema spec).
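The intended result of that pivot can be approximated with plain key renaming. The sketch below is only an illustration: it does not perform JSON-LD expansion or compaction, it does not resolve any context, and it leaves out the `@context` entry. The function name `pivot_to_json_schema` and the set of TD-only terms (here just `forms`) are assumptions made for this example.

```python
from typing import Any, Dict, FrozenSet


def _prefix_keys(value: Any, prefix: str) -> Any:
    """Recursively add prefix to every dict key found inside value."""
    if isinstance(value, dict):
        return {prefix + k: _prefix_keys(v, prefix) for k, v in value.items()}
    if isinstance(value, list):
        return [_prefix_keys(item, prefix) for item in value]
    return value


def pivot_to_json_schema(
    data_schema: Dict[str, Any],
    jsc_prefix: str = "jsc:",
    td_prefix: str = "td:",
    td_only_terms: FrozenSet[str] = frozenset({"forms"}),
) -> Dict[str, Any]:
    """Rename keys so JSON Schema terms are unprefixed and TD terms are prefixed."""
    pivoted: Dict[str, Any] = {}
    for key, value in data_schema.items():
        if key.startswith(jsc_prefix):
            # "jsc:minLength" -> "minLength"
            pivoted[key[len(jsc_prefix):]] = value
        elif key in td_only_terms:
            # "forms" -> "td:forms", including nested keys such as "href"
            pivoted[td_prefix + key] = _prefix_keys(value, td_prefix)
        else:
            # Terms assumed to be shared by both vocabularies, e.g. "type"
            pivoted[key] = value
    return pivoted


location = {
    "type": "string",
    "jsc:minLength": 1,
    "jsc:maxLength": 64,
    "forms": [{"href": "https://mylamp.example.com/location"}],
}
print(pivot_to_json_schema(location))
```

Treating `type` as shared is exactly the open question raised in the parenthetical above; here it is simply passed through unchanged.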

I'm not sure how realistic my insistence on rigidly adhering to the TD and JSON-LD specs is, but so far it has worked out and I want to see how far I can take it.

@sebastiankb sebastiankb added the validation Topic related to Normative Parsing, Validation, Consumption label Apr 16, 2020
@egekorkan
Contributor

I think you do have a valid use case, and we are not aware of another vocabulary that contains all the JSON Schema keywords. Our solution is to add them to the JSON Schema vocabulary. However, as we decided on 17.04.2020, we would like you to contribute your WoT implementation to the implementation report. For this, we would like to ask you to explain your implementation in one paragraph and provide TDs that are produced by this implementation. I am guessing that this (https://github.com/RoboPhred/wutwot/) is your WoT implementation? If you agree, I will give you more details on how to do that.
If you agree, we would also like to invite you to one of the calls so that you can introduce the implementation to us.

@RoboPhred
Author

RoboPhred commented Apr 17, 2020

Sure, I can do a write-up. I only recently pivoted the project to the W3C TD so I haven't fully exercised the spec, but there is enough here to give feedback on.

The produced TDs are a bit behind the rest of the library however, as I am just beginning to rewrite it to take into account forms; most of it is still spitting out Mozilla WOT. The core library follows the spec much more closely, but the web endpoint needs work. This might all be sorted out depending on when you want the writeup though.

What sort of information would you want in a call?

@RoboPhred
Author

On further reflection, I probably don't have much to demonstrate until I get around to implementing the web frontend for this project. Up until now I have been strictly following the standard and not using any extensions. That will change once I start working on the web frontend, as I will need extensions to explain to the frontend how the management-oriented things and affordances work. Until then, I would just be presenting back your own spec and nothing further.

I could talk about future direction and my needs in that area, but until I dogfood out the design I don't have any hard data to give.

@sebastiankb
Contributor

fyi PR #896 is working on this topic

@sebastiankb
Contributor

PR is merged

@sebastiankb sebastiankb added the Propose closing Problem will be closed shortly if there is no veto. label Oct 5, 2020
@egekorkan
Contributor

Could we mark this as at risk in the TD document? There are no known implementations of this feature.

@danielpeintner
Contributor

Could we mark this as at risk in the TD document? There are no known implementations of this feature.

I believe the same applies to minItems for ArraySchema which raises a more important question: Do we say somewhere what happens if the metadata is not respected? I don't find anything that says "refuse" or so... I just find "It can be used for validation." which is very vague...

@egekorkan
Contributor

I think it should be rejected following a protocol-level error message but the general error messages should be described in a protocol-agnostic way as discussed at w3c/wot-scripting-api#200

@sebastiankb
Contributor

The latest draft now includes these two terms.


4 participants