-
-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft 2019+ tests incorrectly depend on implementations supporting $schema
-less schemas but they are not required to process them
#311
Comments
There's no real reason other than that it'd be noise (a property needed in all schemas) -- it's also valid to leave it out even though it's highly recommended, so even if we did add it everywhere you'd still have a few (one at least) that tests the behavior without it. Your implementation doesn't allow explicitly choosing which version to apply? E.g. in mine, when I run the suite, for each folder, I explicitly use the appropriate draft, even though mine works as you say defaulting to the latest if you don't specify. |
@Julian given that we're strongly encouraging folks to start using The idea that |
Quite the opposite. If you specify with |
This comment was marked as off-topic.
This comment was marked as off-topic.
Right that's what I mean -- in mine, there are 3 choices -- if you specify with That being said though I'm definitely not against adding (My usual neuroses about not using large sweeping automated formatting changes there though still may mean we have to do this slowly enough to review the patches individually)
I don't think the test suite should be a guide there -- it contains in many cases intentionally pathological examples, but yeah fair enough I guess I shouldn't push back if I'm not disagreeing :D |
I'd say it's far more important to support reading and respecting
Yup. This is particularly critical with
Sure, we can have a test for this, but we only need one, and it should only test behavior specified by the specification. So I'm not sure what it should tests. The most recent specification does state a default meta-schema, but I believe it is the first to explicitly do so.
@Julian are you really saying that you're not OK with mechanically inserting the same |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
From a ticket closed as a dupe, sharing this again below:
To me, I don't feel the recommendation really applies strongly here, for reasons I intended to describe in json-schema-org/json-schema-spec#324 -- namely, what I think we should optimize for in the test suite is readability, not perfection -- and while that recommendation is perfect for schemas in isolation which otherwise wouldn't know what On the other hand I do like:
since it seems nice to be able to rely on something someone already needs to support, but I don't think, again in my opinion, that this will work really. Because the test suite will always need to also include a test where But yeah happy to either elaborate myself, or hear again from others. I know Greg and Henry may still be in favor of this. |
I don't think there's been a specific reason we don't include
Guessing what, exactly? JSON Schema should be suitably well defined to operate without a meta-schema. What's written in the spec is authoritative, what appears in the meta-schema is merely informative. |
@Julian, the role of However, as of 2019-09, we made reliable extensibility a major goal and developed Basic QA principles say that you can't write test cases that combine implementation-defined behavior with positive functionality tests (for those who don't know, my career bounced back and forth between systems QA and Development before settling into Technical Director roles spanning both departments). Doing so means that every test outcome is impacted by implementation-specific choices rather than standards conformance. So on those grounds, if someone came to me with a test plan where nearly every test case involved an implementation-defined condition, I would reject the plan. The larger topic here is whether the JSON Schema org supports the goal of reliable extensibility. If we do, then we really need to stop treating The reason for this is that the test suite makes it very clear that So, as currently designed, the test suite is undercutting an (apparent?) major project goal. It's very frustrating to see how little attention many implementations pay to If the JSON Schema org does not have a consensus that this is a major project goal, then we should open a discussion on that in the discussions repo, because there's no point in debating this detail without clear context on why we care about If we do have consensus on this, then the topic of how to best test But the minimum first step would be to get the implementation-defined condition out of all of the other test cases. The test suite should rely on well-specified functionality ( I would prefer to extend this to all drafts, as it would significantly help normalize |
I'm responding quickly not to say I've already understood what you wrote (which yeah thanks! I'll read it carefully) but just because:
If this is true, you already convinced me. Can you or someone point me at where this is?
I'll just state on the record that I do not intend the test suite to say this, and if we don't have it, it's purely an "accident" of lack of effort until now on my part, something I clearly want to fix. |
Thanks, @Julian From §8.1.1 The "$schema" keyword:
There's slightly more clear guidance around unknown §8.1.2.1 Default vocabularies:
Even if we take that last paragraph as guidance, it doesn't actually say what draft of the vocabularies in question should be used. The 2nd paragraph even notes that an implementation might refuse to process the schema. So there's an implicit allowance that a conforming implementation could outright refuse to the run the test suite. That might not be something that we want to imply, but that's what it says. IIRC, I did not want to require implementations to process an unknown thing as that could be a security risk. Regarding:
Oh, I did not mean to imply that you did! I apologize, I should have worded that better. I'm trying to work on my tendency to sound accusatory on this sort of thing. I also think it was not unreasonable given the history. When I went to reply to this issue, at first I was confused over why we hadn't been using |
My reading skills (of the spec) apparently leave what to be desired, since I figured you were referring to that paragraph, but I have up until this moment probably never internalized the last half of that paragraph, i.e.:
and was always focused on "yeah OK SHOULD, but doesn't need to". That's quite clear, so I'm now very much +1 on the change, and indeed as you say now even when we do now have tests without
Don't worry about it :) I'll very much assume good intent (at least until I look silly doing so :D). |
For my own complete clarity, and thanks again for bearing with me until now, I want to also cite something from json-schema-org/json-schema-spec#439 (which I intend to get back to and merge in the next few days), which covers why that section means an immediate +1, which is:
whereas it's now clear that a correct implementation, under
may perfectly well have decided that their "implementation-defined behavior" is "blow up and error" (EDIT: see the hidden comments below, there may be disagreement about this precise "implementation-defined behavior, but the point stands for some other behavior). So yes, anyone is welcome to do this (on all drafts, or ones with that language, I have no strong opinion since there's going to be "boilerplate" anyhow in most places now). If no one gets to it sometime soon, I'll get to it myself, probably after json-schema-org/json-schema-spec#439 and a few other issues are closed first. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Let's move any discussion of what the set of valid implementation-defined behaviors are, and whether they include "error out", elsewhere (like the ticket linked) so that this ticket can just be about what the spec says, not what it should say -- @awwright are you disagreeing there (about what it seems to say)? Or are you making a point about what you'd like it to say? (To me it seems very clear that what the spec says is that an implementation may choose to do something other than what the test suite assumes it does today). But speak up if you're disagreeing. |
The effect on testing is: if you're changing the meta-schema being used for running the tests, based on the draft directory, then you're not testing for compatibility with schemas in the wild (where this information is unavailable).
|
I thought it was implicit that the directory name for the tests would indicate which specification version to use for running those tests. If that's not clear enough, we should say so explicitly in a README. That is -- it's not okay for an implementation to just use whatever version it likes (i.e. its preferred default version) for all tests, as obviously the tests won't all pass if the versions don't match up, so what is the point? The different directories exist for the express purpose of demonstrating the different expected behaviours for each specification version. |
You are confusing intent with happenstance. It's not intentional that there are few to no tests involving I'd been intending to write more, using a new metaschema that used the format-validation vocabulary, but there was some confusion and disagreement about the existing format tests and I ran out of spoons. We could certainly add others that omitted certain vocabularies to ensure implementations did the right thing, but someone has to do it, and we're (almost) all volunteers. Maybe that will get better with the increase in paid staff. But please don't infer the lack of certain things is signalling some sort of intent. Output formats matter too, and we have no tests for those either, because it's complicated and someone has to do the work in thinking about how it's all going to work (and then make the case to everyone else to get it merged). |
The directory name indicates the draft that the schema was written to. It doesn't imply there's a way to pick the specification version you use for running a test. This is implied because if I see a JSON Schema in the wild, I would have no way of knowing which specification version it was for, it makes sense tests should work this way too. Testing multiple versions of a validator would be like saying "Should I test this JSON parser with RFC 4627, RFC 7159, or RFC 8259?" That is: while there's different versions of the JSON specification, there's only one version of JSON, and implementations are expected to be compatible with documents written for any of the RFCs, even when we don't know which one. If I wrote a validator by comparing the input to the known tests, and returning the expected result, it would pass all the tests, but it would quickly fail against instances in the wild. Likewise, if you're changing the behavior of the validator depending on the directory it was found in, that's a little bit cheating the tests. Adding $schema to the tests would be a slight variation on this, since $schema is a pretty good heuristic of specification version (this was intentional), but not necessarily: { "$schema": "https://json-schema.org/draft-03/schema"
, "$ref": "https://json-schema.org/draft-07/schema#"
, "type": "object"
} ... This is a schema that uses the draft-03 vocabulary to match all valid object-form draft-07 schemas (no boolean schemas). (Yes, according to the JSON Schema text, I would expect it to work this way!) |
What we have today is not a heuristic and not cheating :), it's explicitly the test suite telling anyone using it "stuff in this folder is written to be run against the draft whose folder it's in". There's no ambiguity, that's the current public interface of this suite itself. There are indeed suites which test different RFC implementations of e.g. URIs, even though from staring at one you don't know what version it's written for. Such a thing doesn't sound like an alien concept to me for JSON either, and I wouldn't be surprised if they existed, heck they'd be useful to us even if we ever had a But it seems we're already all agreed (well, at least unless @karenetheridge says they disagree) to add |
Even if this was the intention, there's no such thing as "running against a draft" (any more than I can parse JSON "against RFC 4627"). Sure, we can talk about how JSON was defined as of a specific publication. But this does not imply that a JSON parser (or a JSON Schema validator) should toggle between RFCs/publications/versions. Could you speak more specifically to this argument (there's normally no way to distinguish drafts and the tests shouldn't invent this parameter)?
What do you mean exactly? We can add a file that tests the $schema keyword in various combinations. But the bulk of the tests should omit it. |
@karenetheridge I was not ascribing intent to the authors of the test suite, and please note my apology to @Julian yesterday for not making that 100% clear, and allow me to extend it to you and everyone else who has worked on the test suite. It was an observation of the effect, regardless of the intent.
Up to draft-07, the requirements around |
This isn't true. Each version/draft of JSON Schema adds new functionality. As you pointed out with your draft-3/draft-7 example, boolean schemas aren't a thing in draft 3, but they are in draft 7. Similarly, drafts 6+ no longer recognize Regardless, I think that discussion, while it may feed into this one, should be handled elsewhere. All of that said, while my validator does allow the user to define what draft a non-draft-specific schema should be processed in, making this test suite possible, I'm on the side of including
|
@awwright you have a different view on how this does or should work than most of the other people in the project, which is a point of discussion at ietf-wg-httpapi/mediatypes#20 among other places. That discussion should be had and settled elsewhere before we make decisions one way or another based on any of our views. @Julian , it occurred to me to check the validation spec, and I see that it says:
It's possible to read that as "if your implementation is a validator, in the absence of I vaguely recall that I wanted to draw a sharper line between a generalized extensible JSON Schema implementation (where it would not make sense to default It's pretty wobbly reasoning, and definitely not covered by proper normative language, but I could see interpreting this to indicate that the test suite, as a test suite for validators, implies a default I would prefer to take a more general view, in hopes that the test suite will broaden its scope beyond validation, e.g. to annotation collection, although of course annotation collection is not mutually exclusive with validation and usually relies on it, so... ¯\(ツ)/¯ Anyway, I have |
I don't see any practical effect from adding Instead of adding the keyword everywhere, I think the README should be strengthened to clearly indicate that it is intended that each directory's tests are to be run against an implementation that is configured to use that specification version, and skipped if that version is not supported (there is no sense in me running the draft4 tests against my implementation, as I know there will be failures and I don't claim to support that version). |
I disagree, and it seems there's context here where some of this discussion has occurred, but let's focus on the things which are relevant to this ticket, which concerns simply whether we add It does seem that now both you and @karenetheridge are indeed disagreeing with the intended next steps here. Can one or both of you explain how you read the specification then? The section @handrews linked seems clear, so it seems the burden should be on you to explain it some other way. Specifically, I hypothetically have written an implementation which can accept schemas written either for draft 7 semantics or draft2020 semantics. This hypothetical implementation will return "true", i.e. valid, for all instances, when presented with a schema which does not have Henry has pointed to:
which to me seems to say very clearly that I may indeed decide to do so. And, if I may do so, I will have difficulty running the test suite under my implementation, because around half of the tests in the draft2020 folder will fail when run against my draft2020 validator! So in short to me, the spec seems clear, and effectively means we MUST add What have I missed? (n.b. we definitely should not collapse the suite into one giant file or directory, so that's not one of the options anyone's proposing I believe.) |
I was using JSON for simplicity; but my point holds for standards that add features over time. Consider ECMAScript: There's no standardized way to specify which ECMA publication is used for running code... You get whatever your particular version of V8 implemented, which this is usually sufficient regardless of the version of ECMA you wrote your code to.
I was trying to call attention to how the "type" keyword would have been ignored according to the draft-03 I-D, but this is no longer the case, so even though I'm specifying a draft-3 $schema, the newer behavior applies.
Can you detail how this follows? For example, ECMAScript, or CSS have no version identifiers, but they add functionality with each new release. |
I'll respond with another question: where in the specification does it say that My conclusion is: we have no way of indicating whatsoever what version of the specification we intend the implementation to use when running tests. Adding a $schema keyword won't make a whit of difference there. If we want a particular version of the spec to be used, we have to say so out of band. We should fix the spec to make this more clear: as I've described a few times before, I've assumed the algorithm that $schema keywords are followed sequentially until one gets to one where the $id matches the $schema, and then it's either a URI the implementation recognizes, in which case the spec version is derived from that, or it's not, and evaluation fails/dies. But the spec doesn't actually say any of this. The spec assumes that its version is the only one in existence at any time, and its publication has superceded all previous publications. This is in conflict with the test suite that very clearly acknowledges that multiple versions of the spec exist, and that implementations can support any number of them. Therefore: the test suite needs to be explicit about the intent for the tests in each directory. |
This comment was marked as off-topic.
This comment was marked as off-topic.
I'm not sure what you mean by "practical effect." The important thing here is testing the specification. How the tests are organized and configured is a secondary concern. It doesn't matter whether it's possible to figure the processing semantics out from outside of the specification (e.g. the directory structure) or not. If it is both possible to figure out the right processing semantics from within the spec, and a requirement that implementations do so correctly, then that is what we should test. Perhaps the most important argument is that it's possible to write a conforming implementation that only uses Our test suite should properly tests validators that actually rely on There's also, in my view, no real reason not to do it. Adding I just added it to all of 2020-12's test cases, by hand, in vi, no scripting, no IDE, no nothing. It took a few slightly tedious minutes, but I needed to stop staring at another problem so that was convenient. I'm happy to do the others. I understand @Julian's concerns over mass changes, and am also happy to do whatever verification work is needed to make everyone comfortable with it (I'm writing a little script that goes through and checks for missing or mismatched Finally, there is clearly a larger discussion to be had regarding how we expect implementations to process schemas. In my experience, people use I believe that we should orient the test suite towards real-world usage rather than convenience for test authors. Particularly when the inconvenience is not that substantial. |
From §8.1.1 The "$schema" keyword:
I think this pretty clear. It identifies the dialect, the dialect identifies the semantics / processing rules. I will grant you that it is not detailed, and could use some "MUST" etc., but there's a lot of other material in the spec that reinforces this. From §3 Overview:
Plus the entirety of From §9.3.2 Differing and Default Dialects (under §9.3 Compound documents):
This is pretty clear that All of this is about resources setting their own processing rules, not someone externally choosing them. |
Adding (It would be interesting to see if this impacts any of the major implementations before merging any such PR, just out of cauction.) |
I'm happy to download the change and run it on mine. Adding |
I'll post a PR in the next day or two. Got a bit distracted by some other stuff. |
The original question for this issue was, "Do we need After (lengthy) discussions in Slack and some time to think, I've concluded that no, the test cases don't need Warning: reasoning aheadConsider the schema But that doesn't matter as long as the result is correct. My question assumes that implementations will have "Draft X Compability" modes, but that's not the case, and it's not required. Basically, if my validator is given that schema and If it's compliant for all test cases within a single draft folder, then it's said to be compliant with that draft. None of this is dependent upon If, in some future version, we change the semantics of The test suite isn't asking, "Can the implementation process a schema with meta-schema X?" (Maybe that's a question it should ask, but if so it should do it as distinct test cases.) It's asking, "Can the implementation give me the desired result as indicated by the specification?" That I have to configure my implementation to properly handle some test cases is outside the scope of the test suite. Secondarily, it's my responsibility to document that such a configuration needs to be performed before processing some cases. * CaveatBecause the 2019-09 and 2020-12 spec allow for implementations to refuse to process schemas without the This needs to be fixed by adding We will continue discussion on "expected behavior when |
@gregsdennis I was only planning to do 2019-09 and 2020-12 in the PR, so I think we're good here. I had noted earlier that there is no language in draft-07 or earlier that says anything at all about what happens without |
In the wider discussion, we're conflating a few more issues than we realised (I think). We have "compliance" and we have "support". You can be "compliant" with a version of the spec while not "supporting" said version of the spec. Currently, for 2019-09 and 2020-12, based on the discussion, we need to add I'm going to strongly suggest that we have and ADR included in any PR that looks to resolve this issue, mostly to avoid us hashing this out again. |
I concur with #311 (comment) However, if it's true that this effect was not intended, I think we should put this on hold and fix that first, since the resolution will affect this issue. json-schema-org/community#189 @Relequestual Can you offer a definition for these terms? It would seem to me that schemas are "compliant" or not, whereas validators have a range of features that they "support" (or don't support). |
Again, it was 100% intended that conforming implementations can refuse to process schemas that do not have The only part that was unintentional was that I forgot to forbid completely arbitrary behavior. But that doesn't impact the reasoning here. Fixing that doesn't change anything about the need for Please stop treating the intended meaning of this clause as a bug. There are bugs related to it, but fundamentally it says what it was meant to say. |
Please stop discussing. The change is accepted, it's the correct one, we can summarize the back and forth above in an ADR if helpful for documentation (I can do that). Henry send a PR whenever you can. |
$schema
-less schemas but they are not required to process them
For example, the first test in the
const
file isThere is no
$schema
declaration. That means that this is a valid draft-6, -7, and -2019-09 schema.My implementation assumes latest, unless it can be determined that another one should be used (via keywords used or
$schema
declaration), but that may not be the case for others.This means that while we intend for this to be evaluated as draft 2019-09, it may not be, depending on how the implementation reads non-specific schemas.
The text was updated successfully, but these errors were encountered: