Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object-based contains #1077

Closed
gregsdennis opened this issue Feb 21, 2021 · 24 comments · Fixed by #1092
Closed

Object-based contains #1077

gregsdennis opened this issue Feb 21, 2021 · 24 comments · Fixed by #1092

Comments

@gregsdennis
Copy link
Member

gregsdennis commented Feb 21, 2021

Inspired by this SO question.

We have contains to verify that at least one item in an array matches a subschema. And through additionalProperties and patternProperties we recognize that objects can have dynamic properties.

It seems like we should have a way to specify that at least one property matches a subschema.

It could make sense to extend contains to do this when it receives an object, or we could define a new keyword. If we extend contains, then minContains and maxContains would naturally follow. format (in all its glory) already works on multiple value types, so it's not something we haven't had before.

@jdesrosiers
Copy link
Member

I'm sure this has come up before, but I did a search in the vocabularies repo and didn't find anything.

I should point out that there is a way to express object-contains-at-least-one without a new keyword. It would be sugar for,

{
  "not": {
    "additionalProperties": {
      "not": { ... schema ... }
  }
}

Just like this is sugar for array-contains-at-least-one (aka: contains),

{
  "not": {
    "items": {
      "not": { ... schema ... }
  }
}

There is no workaround for minContains/maxContains for arrays or objects.

I wouldn't be opposed to adding a keyword or extending the scope of contains to cover objects.

This has come up twice in the last week and I vaguely remember seeing a similar issue not long ago as well. This indicates that there is need for something like this. While there is a workaround, unless they're a JSON Schema expert they're unlikely to figure it out on their own. I also feel like it fits a hole in JSON Schema, if we support a contains operation on arrays, why not objects.

Generally, I don't like the idea of having a keyword that does more than one thing, but the semantics of contains for array and object really are the same, so I wouldn't be opposed.

@gregsdennis
Copy link
Member Author

{
  "not": {
    "additionalProperties": {
      "not": { ... schema ... }
  }
}

I don't follow how this (or the array one) works like contains.

@jdesrosiers
Copy link
Member

@gregsdennis Let's look at the array one first because I think it's a little easier to follow.

{
  "items": {
    "not": { "required": ["foo"] }
  }
}

This schema asserts that none of the items in the array can have a property "foo". If an instance validates to false against this schema, then we know it must have at least one item with a "foo" property. Since that false result is the result we actually want, we negate it with not.

{
  "not": {
    "items": {
      "not": { "required": ["foo"] }
    }
  }
}

additionalProperties works just like items except on objects, so the same pattern works there too.

@gregsdennis
Copy link
Member Author

gregsdennis commented Feb 21, 2021

Cases:

[{ "foo": "bar" }, { "baz": 1 }]:

  • required returns: true for /0, false for /1
  • not inverts that: false for /0, true for /1
  • items combines those: false overall
  • not inverts that: true overall ✔️

[{ "foo": "bar" }, { "foo": 1 }]:

  • required returns: true for both
  • not inverts that: false for both
  • items combines those: false overall
  • not inverts that: true overall ✔️

[{ "baz": "bar" }, { "baz": 1 }]:

  • required returns: false for both
  • not inverts that: true for both
  • items combines those: true overall
  • not inverts that: false overall ✔️

Okay.... sorry. Had to work through that. Thought it might be helpful for others, too.

@m-mohr
Copy link

m-mohr commented Mar 18, 2021

I'm strong supporter for this. We also have numerous use cases to make good schemas for the STAC specification: https://stacspec.org/

There we often need to check whether an object contains an object with specific requirements. An example:
https://gist.github.com/m-mohr/26dfeca72f2e8402f13565d6248a3346

@jdesrosiers
Copy link
Member

jdesrosiers commented Mar 24, 2021

There doesn't seem to be an opposition to this idea. I'd like to write up a PR sometime soonish if there are no objections. I think the only thing left to get consensus on is whether to say contains applies to objects as well as arrays or to define new objectContains/objectContainsMin/objectContainsMax keywords that apply to objects. I think I could go either way.

@gregsdennis
Copy link
Member Author

I personally like expanding contains.

@jdesrosiers
Copy link
Member

jdesrosiers commented Apr 14, 2021

There seems to be consensus for expanding contains. I'm going to start writing this up in a few days unless someone tells me to hold off.

@jdesrosiers
Copy link
Member

It just occurred to me, should we do the same thing for uniqueItems? We could make it unique and have it apply to arrays and object values.

@gregsdennis
Copy link
Member Author

gregsdennis commented Apr 16, 2021

I don't see why it can't be supported (no conflicts, etc.), but it seems edge-case-y to me.

Maybe it's no less edge-case-y than contains supporting objects.

@awwright
Copy link
Member

awwright commented Apr 21, 2021

Following up my comment in #1092

What happens to schemas that allow an array or an object; that specify "contains" and "properties" and didn't intend for "contains" to apply to the object form?

@gregsdennis
Copy link
Member Author

@awwright that seems like an edge case to me.

Even so, I'd separate those cases and specify an array for the contains:

- allOf
  - type=array & contains
  - properties

I can't think of a single use where you'd want to describe those in the same schema object.

@m-mohr
Copy link

m-mohr commented Apr 21, 2021

While this is true for new schemas, this could be a breaking change for existing schemas. If you already have such a semantic with type: object & array and using contains, you may break things with the contains keyword now applying also to objects. To be backward compatible, it may indeed be better to make it a new keyword.

@karenetheridge
Copy link
Member

IME it is very rare for a schema to accept either an array or an object in a particular position. Generally it is one or the other so I don't see a problem in practice with one keyword being applicable to both instance types.

@jdesrosiers
Copy link
Member

I agree that the most pure JSON Schema way of doing it is to make a separate keyword, but I also agree that I don't see a way that using the same keyword would cause a problem. So, I could go either way. I think what we need is a really good example schema of where it would be problematic.

@Relequestual
Copy link
Member

My preference would be for a new keyword, however I also agree that expanding the definiton would probably make it simpler to understand.

While something accepting an array or an object may be an edge case, we do have to consider edge cases.

@jdesrosiers
Copy link
Member

It sounds like there are still no strong opinions on whether to use new keywords or the same keywords, but preference for new keywords is growing. I think the preference for new keywords has a slight majority. Would anyone advocating for same keywords be upset if we went with new keywords?

@gregsdennis
Copy link
Member Author

I personally like the simplicity of using the same keyword mostly because the min/max keywords would follow suit. It's really easy to reason about.

The single benefit of new keywords is backwards compatibility in an extreme edge case. I'm not sure that it's worth the effort. Do (can) we have metrics on how often people are combining object requirements and array requirements in the same schema (without using anyOf or a similar mechanism)?

@jdesrosiers
Copy link
Member

Do (can) we have metrics on how often people are combining object requirements and array requirements in the same schema (without using anyOf or a similar mechanism)?

I can only speak anecdotally, but I don't think I've ever seen anyone do that.

@Relequestual
Copy link
Member

We could do a small analysis of schemas in https://github.com/schemastore/schemastore/

Maybe if I find some time I can do that next week! But someone please feel free to jump in before me.

I'm strongly in favour of a new keyword. Mainly because we often run into problems when keywords can work in multiple ways. Although, given we're not convinced the potential for problems or confustion is very noticable, I guess I'd have to reduce my objection to "mildly concerning".

I'll tweet this issue and share it elsewhere and see if we can get any further community feedback.

@patrickkwang
Copy link

As far as I can tell, there are no schemas in schemastore that use contains while allowing that the instance be an object.
https://gist.github.com/patrickkwang/359afcc479e819721cf94254e3a80a9e

I'm a fan of extending contains rather than creating a new keyword.

@jdesrosiers
Copy link
Member

Unfortunately, I don't see consensus forming here. There is clear consensus that this feature should exist, but not what form it should take. I want to try a consensus measuring activity to get a better idea of where we are at on this. Please react to this comment based on the legend below.

  • 👍 Strong preference for objectContains
  • ❤️ Slight preference for objectContains
  • 😕 No preference
  • 🎉 Slight preference for contains
  • 😄 Strong preference of contains
  • 👎 This feature shouldn't be added

@gregsdennis
Copy link
Member Author

You used all the emojis in your voting system. How do I show support for the voting system itself? 😉

@jdesrosiers
Copy link
Member

I was hoping for more reactions, but it appears that there is general consensus for overloading contains. The only reaction for objectContains was a slight preference and the only strong preference reactions were in favor of contains. Therefore, I think we can proceed with review of #1092.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants