"Published specification" should reference the specification #32

Closed
awwright opened this issue Mar 31, 2022 · 16 comments · May be fixed by #43

Comments

@awwright

awwright commented Mar 31, 2022

When I'm looking up a media type registration, I would expect the "Published specification" to link directly to the specification that specifies how I can parse and interpret it.

The document currently cites itself as the specification document, and then normatively references several other documents. While this may work technically, this seems like an unnecessary level of indirection. I vaguely recall seeing this pattern before, but it's somewhat confusing, and if we can't link directly to the spec, I think it deserves some explanation.

RFC 2854 is an example that directly references an external specification document.

I would also expect a change controller to be listed. Why are these listed as "n/a"? I'm not sure what that means, in this context.

@ioggstream
Collaborator

@awwright I think these should be compiled with the support of IANA once the actual content of the document is done. Feel free to open a PR where you think it's useful.

@dret
Collaborator

dret commented Apr 8, 2022 via email

@ioggstream
Collaborator

Labels are already here :) owners should be able to add them.

@awwright
Author

awwright commented Apr 8, 2022

which media type are you talking about?

All of them are written this way (sorry I should have mentioned this)

@dret dret added the yaml label Apr 9, 2022
@jdesrosiers
Contributor

When I'm looking up a media type registration, I would expect the "Published specification" to link directly to the specification that specifies how I can parse and interpret it.

This is a tricky case because there is no one published specification. There isn't even a fixed set of specifications. Each release of JSON Schema or OpenAPI is its own specification and future releases will be added to the list. We don't want to pin the media type to any specific release or have to go through the ceremony of updating the media type registration every time there is a new release. It's even more difficult for JSON Schema because the vocabulary system allows third parties to create their own dialects of JSON Schema, so the source of published specifications isn't even limited to the JSON Schema organization.

I think not linking directly to any one specification is the right thing. The media type defines how to identify which version of OpenAPI or dialect of JSON Schema the document conforms to. That's just about all these media types need to do. That's why I think it's reasonable for this document to cite itself as the specification document. Linking to the current specification, or to all the existing specifications, is an option, but I wouldn't want anyone reading it to be confused and think those are the only options. I think the alternative is to maintain a registry, but that doesn't seem necessary.

@awwright
Author

This is a tricky case because there is no one published specification. There isn't even a fixed set of specifications. Each release of JSON Schema or OpenAPI is its own specification and future releases will be added to the list.

I didn't mean to imply that every specification that describes a new feature for a format/protocol must be listed. Optional extensions to protocols and media types don't need to update the registration. Only the essential semantics need to be referenced.

For example, Cookies weren't mentioned in the first couple of releases of HTTP/1.1 at all (RFC 2068, 2616). (Recent releases of HTTP now point out how the Cookie header is inconsistent with the header syntax.)

However, I don't think the problem is that there's "no one published specification"—all of the references we need are listed in the normative references; their URLs just need to be copied to their respective "Published specification" field.

We don't want to pin the media type to any specific release or have to go through the ceremony of updating the media type registration every time there is a new release.

You can link to a document that changes. This is how HTML is defined (maybe it's an odd exception). Even if that's not possible, the change controller can update the registration without much fuss. (I wouldn't describe it as a "ceremony".)

I think not linking directly to any one specification is the right thing. The media type defines how to identify what version of OpenAPI or dialect of JSON Schema the document conforms.

A Standards tree registration requires a written spec with expert review. This seems reasonable to me. I think the likely outcome of not linking to a spec, or not fully writing it out, would be that IANA assigns a media type in the Vendor tree (instead of the Standards tree).

@jdesrosiers
Contributor

You can link to a document that changes.

Agreed. If such a document existed for any of these media types we wouldn't have a problem. My point is that no such document exists.

the change controller can update the registration without much fuss.

Good to know. Do you have any suggested reading you can share so I can better understand how that process works?

I think the likely outcome of not linking to a spec ...

What spec would you link to? This is the problem. The current release of JSON Schema is just one dialect out of many. There are many more that need to be covered by this media type including third-party dialects such as OpenAPI and MongoDB. These are not limited to extensions or add-ons to official JSON Schema releases (although some might be). The way the vocabulary system works, almost anything is possible. That's why it makes sense to me that the media type only define how to identify the dialect and delegate the rest of the semantics to that dialect. The dialects are the "extensions" like you mentioned that shouldn't require additional review or media type registration updates. The "expert review" only needs to cover the aspects that are currently in the document. The specific dialects are just extensions.
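The delegation idea above can be sketched roughly as follows. This is only an illustration, assuming the dialect is read from the `$schema` keyword; the names `register_dialect`, `evaluate`, and the example URI are hypothetical and do not come from any specification or library.

```python
from typing import Any, Callable, Dict

# A handler implements one dialect's semantics: (schema, instance) -> valid?
Handler = Callable[[dict, Any], bool]

# Open registry: any party can add a handler for its own dialect URI.
_HANDLERS: Dict[str, Handler] = {}

def register_dialect(uri: str, handler: Handler) -> None:
    """Associate a dialect URI with the code that implements it."""
    _HANDLERS[uri] = handler

def evaluate(schema: dict, instance: Any) -> bool:
    """Identify the dialect, then delegate all other semantics to it."""
    uri = schema["$schema"]    # KeyError if the document declares no dialect
    handler = _HANDLERS[uri]   # KeyError if no handler is registered for it
    return handler(schema, instance)

# Hypothetical third-party dialect that only accepts strings.
register_dialect("https://example.com/my-dialect", lambda s, i: isinstance(i, str))
```

In this sketch the media type layer knows nothing about keywords; it only routes the document to whichever dialect it names.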

@awwright
Author

awwright commented Apr 13, 2022

My point is that no such document exists.

What spec would you link to?

I don't think anyone would be confused if application/schema+json links to https://json-schema.org/specification.html (i.e. as a Table of Contents).

Good to know. Do you have any suggested reading you can share so I can better understand how that process works?

The top of https://www.iana.org/assignments/media-types/media-types.xhtml lists a few different RFCs, though not all of them apply (I forget off-hand which would be most important).

The current release of JSON Schema is just one dialect out of many. There are many more that need to be covered by this media type including third-party dialects such as OpenAPI and MongoDB.

From a media type perspective, these may not be application/schema+json documents strictly speaking. For example, keywords might follow a specific release of JSON Schema Validation, but as a whole, not necessarily follow other requirements.

For example, MongoDB will store the JSON Schema as BSON document or some data structure; to produce a valid "application/schema+json" document, it would probably need to add a "$schema" keyword then stringify it as JSON.
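A minimal sketch of that last step, going from an in-memory schema to a serialized document that declares its dialect. The function name `to_schema_json` and the choice of default dialect are assumptions for illustration; this is not MongoDB's API.

```python
import json

# Assumed default dialect for illustration; an application would choose its own.
DEFAULT_DIALECT = "https://json-schema.org/draft/2020-12/schema"

def to_schema_json(schema: dict, dialect: str = DEFAULT_DIALECT) -> str:
    """Serialize an in-memory schema as JSON text that declares its dialect."""
    document = dict(schema)                  # shallow copy; caller's data is not mutated
    document.setdefault("$schema", dialect)  # keep an existing "$schema" if present
    return json.dumps(document)

stored = {"type": "object", "required": ["name"]}
text = to_schema_json(stored)  # JSON text suitable for an HTTP body
```

Until this step happens, the application is working with the data model only, not with a media type.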

For HTTP responses with Content-Type: application/schema+json, for example, you just follow the rules laid out in the specification. The specification has to document how forward compatibility is implemented, how old and deprecated behavior is handled, what each party is required to do and what they are required to accept, and so on.

Unfortunately JSON Schema omits much of this, or it internally contradicts itself. This was why I asked json-schema-org/community#119

For example, you suggest that we should be able to use "$schema" to select an alternate vocabulary or dialect; but nowhere in the specification does it mention how to handle unknown values of "$schema". It should probably error or give an indeterminate validation result, but this isn't explicitly mentioned, and would likely come up in expert review.
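One way the "indeterminate" option could look, as a sketch only. The `Outcome` type, the `KNOWN_DIALECTS` set, and the `validate` signature are hypothetical, and treating a missing "$schema" the same as an unknown one is an assumption made here, not something the specification states.

```python
from enum import Enum
from typing import Any, Callable

class Outcome(Enum):
    VALID = "valid"
    INVALID = "invalid"
    INDETERMINATE = "indeterminate"  # the schema could not be evaluated at all

# Hypothetical set of dialects this implementation understands.
KNOWN_DIALECTS = {"https://json-schema.org/draft/2020-12/schema"}

def validate(schema: dict, instance: Any,
             evaluate: Callable[[dict, Any], bool]) -> Outcome:
    """Refuse to guess when the declared dialect is not recognized."""
    dialect = schema.get("$schema")  # None when the keyword is absent
    if dialect not in KNOWN_DIALECTS:
        return Outcome.INDETERMINATE
    return Outcome.VALID if evaluate(schema, instance) else Outcome.INVALID
```

Returning a third state instead of a boolean avoids reporting a false positive for a schema whose semantics are unknown; raising an error would be the other option mentioned above.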

Likewise, I may have been too optimistic in my view that we could just remove functionality from JSON Schema and assume that implementations would continue to support it. While it's legal to implement since-removed behavior, this isn't prescribed, and maybe we should have done that (e.g. instead of removing the behavior, move it to a separate section called "Deprecated Keywords").

@jdesrosiers
Contributor

From a media type perspective, these may not be application/schema+json documents strictly speaking.

That depends on how you define the media type. You want this to be a media type for standard JSON Schema. I wrote this up to be capable of describing all types of dialects and that's the root of our conflict here.

@awwright
Author

That depends on how you define the media type. You want this to be a media type for standard JSON Schema. I wrote this up to be capable of describing all types of dialects and that's the root of our conflict here.

Can you speak a little bit more to this, to make sure I'm understanding you correctly?

First, maybe give me an example of how or why it would depend on the definition. Strictly speaking, many applications like MongoDB cannot be using application/schema+json, because at no point does it handle stringified JSON—it doesn't matter how we define the media type, they're using the data model, which is slightly lower-level.

Second—by "wrote this up" do you mean https://ietf-wg-httpapi.github.io/mediatypes/draft-ietf-httpapi-rest-api-mediatypes.html#section-2.2?

My understanding, when I wrote this issue, was that we would copy 14.1. "application/schema+json". One of the effects of the language in this repository is that the meaning of $schema going over HTTP (or email) will be different than the same value found in a JavaScript implementation. That doesn't seem correct to me.

@dret
Collaborator

dret commented Apr 22, 2022 via email

@awwright
Author

I like your comparison to application/markdown @dret, that's a useful comparison. There's one important difference, though: documents are usually delivered in Markdown because that's how they were authored by humans, and rendering to HTML (or RFCXML, etc.) is secondary. If there's an error or incompatibility, that's not the same kind of big problem as a false positive in JSON Schema. I think JSON Schema is more like a scripting language in this regard.

@jdesrosiers
Contributor

Sorry I haven't had the time to keep up with this. @awwright I'm not trying to ignore your concerns, I just don't have the capacity right now and won't for a few more weeks. At that time, I suggest we get together and discuss this in detail.

For now, I'll address this briefly.

the media type should clearly state [...] it's a "family of pretty similar dialects" and even then it may make sense to list to some popular ones known, but to mention that the list open.

This is exactly what it does right now. I agree that it's unfortunate that JSON Schema has become fragmented, but that is the situation we find ourselves in. I'd rather find a way to include those dialects than to dismiss them as not-really-JSON-Schema especially because the community knows these things only as "JSON Schema". Excluding them would be confusing at best. I think our hands are tied a bit because we aren't introducing something new, we're trying to standardize something that exists in the wild and should work for existing implementations (at least the major ones).

@awwright
Author

No worries, I've been trying to make the Friday calls but I keep getting pulled into other gigs.

it's unfortunate that JSON Schema has become fragmented

I went on a little bit of a tangent and so spun off a reply at https://github.com/orgs/json-schema-org/discussions/169, but I hope you can address my first question: I think the problem of fragmentation has greatly improved, so how do you figure this?

@ioggstream ioggstream added this to the WGLC for REST API milestone Apr 25, 2022
ioggstream added a commit that referenced this issue May 13, 2022
@ioggstream
Collaborator

@awwright addressed in YAML #42 for now.

@ioggstream
Collaborator

The YAML part was moved out and fixed there.

@ioggstream ioggstream removed the yaml label Jun 20, 2022
@darrelmiller darrelmiller moved this from In Discussion to Closed in HttpApi Active Issues Jul 23, 2023
4 participants