URI in Profile triggers CORS Unsafe Request Header Byte rule #436

Open
azaroth42 opened this issue Jun 24, 2024 · 7 comments
Labels
needs discussion spec:w3c tag-needs-resolution Issue the Technical Architecture Group has raised and looks for a response on.

Comments

@azaroth42
Contributor

In the IANA registration [1], we define a media type parameter called 'profile'. Its value is a space-separated list of URIs, for which we registered six initial values. These can be composed together, and new values can be added for other "constraints or conventions".

The IIIF specifications use this functionality, for example to define the specific structure of the response in an API [2] as part of the media type. Similarly in Linked Art, we do the same [3].

However, in the WHATWG specification for fetch [4], it says that the value for the Accept header is NOT CORS safe if it has more than 128 bytes (which multiple URIs might easily cause) or (more importantly) if the value contains an unsafe header byte. The unsafe header bytes include the character ":", which prevents any URI or CURIE with a namespace prefix separated by ":" from being CORS safe.
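
The rule described above can be sketched as a small check. This is only an illustration, not code from any specification: the helper names `isUnsafeByte` and `isSafelistedAcceptValue` are hypothetical, the byte list is transcribed from the Fetch Standard's definition of a CORS-unsafe request-header byte, and the value is assumed to be plain ASCII.

```javascript
// CORS-unsafe request-header bytes as listed in the Fetch Standard:
// control bytes below 0x20 (except HT, 0x09), DEL (0x7F), and the
// characters " ( ) : < > ? @ [ \ ] { }
const UNSAFE_BYTES = new Set([
  0x22, 0x28, 0x29, 0x3a, 0x3c, 0x3e, 0x3f, 0x40,
  0x5b, 0x5c, 0x5d, 0x7b, 0x7d, 0x7f,
]);

function isUnsafeByte(byte) {
  return (byte < 0x20 && byte !== 0x09) || UNSAFE_BYTES.has(byte);
}

// Hypothetical helper: true if a browser could send this Accept value
// without a preflight. Assumes an ASCII value, so bytes equal characters.
function isSafelistedAcceptValue(value) {
  const bytes = new TextEncoder().encode(value);
  if (bytes.length > 128) return false;
  return !bytes.some(isUnsafeByte);
}
```

Under these assumptions, `application/ld+json` passes the check, while any value carrying a profile URI fails it because of the ":" after the scheme. A quoted profile value fails as well, since `"` is also on the list.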

This means that we cannot use the JSON-LD media type as registered for content negotiation via the accept header according to the fetch specification, which was much of the rationale for the profile parameter.

To resolve this, either WHATWG would need to change fetch, or W3C/IANA would need to change the definition of the media type and give some registration function for possible profile values, then all downstream specifications would need to register a safe profile value to use.

I've added the tag-needs-resolution label, as I think that's the level this would need to run up to :(

[1] https://www.w3.org/TR/json-ld11/#iana-considerations
[2] https://iiif.io/api/presentation/3.0/#63-responses
[3] https://linked.art/api/1.0/json-ld/#introduction
[4] https://fetch.spec.whatwg.org/#ref-for-cors-unsafe-request-header-byte

@azaroth42 azaroth42 added spec:w3c needs discussion tag-needs-resolution Issue the Technical Architecture Group has raised and looks for a response on. labels Jun 24, 2024
@davidlehn
Contributor

You say "NOT CORS safe", but I think the issue is that the header will not be "CORS-safelisted"? When that is the case, a preflight request will be used. If that succeeds, the request should be sent. This doesn't seem like a blocker for using any of these headers, just a bit more server complexity than the "simple requests" CORS case that doesn't do a preflight. You probably need to set up an Access-Control-Allow-Headers header to allow Accept, and others as needed, and probably want Access-Control-Max-Age to cache preflight results if appropriate.

@pchampin had a w3id.org issue with CORS and preflights the other day, where I learned more about this topic and safelists and simple requests and redirects. perma-id/w3id.org#4185 and perma-id/w3id.org#4196.

I tried a simple fetch from a browser console, and it does appear to behave as described above. With no profile, the preflight is skipped; adding a profile with a URL triggers a preflight, and then the request is sent. You do need a server set up to handle CORS headers, but then it works.

await fetch(
  "https://example.com/test",
  {headers: {"Accept": "application/ld+json;profile=http://www.w3.org/ns/json-ld#expanded"}}
)
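
On the server side, the preflight handling mentioned above might look roughly like the following. This is a sketch under stated assumptions: `corsHeaders` is a hypothetical helper and not part of any framework, the resource is assumed to be public (hence the wildcard origin), and the allowed methods and cache lifetime are arbitrary example values.

```javascript
// Hypothetical helper: compute CORS response headers for one request.
// `method` is the HTTP method; `headers` maps lower-cased request
// header names to their values.
function corsHeaders(method, headers) {
  const response = {
    // Assumption: the resource is public, so any origin may read it.
    "Access-Control-Allow-Origin": "*",
  };
  // A preflight is an OPTIONS request that announces the real method.
  const isPreflight =
    method === "OPTIONS" && "access-control-request-method" in headers;
  if (isPreflight) {
    response["Access-Control-Allow-Methods"] = "GET, HEAD";
    // Allow Accept explicitly, since its value is no longer safelisted.
    response["Access-Control-Allow-Headers"] = "Accept";
    // Let the browser cache the preflight result for ten minutes.
    response["Access-Control-Max-Age"] = "600";
  }
  return { isPreflight, headers: response };
}
```

A server would typically answer a preflight with an empty 204 response carrying these headers, and add only `Access-Control-Allow-Origin` to the actual GET response.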

@BigBlueHat
Member

Of note: https://www.w3.org/TR/dx-prof-conneg/

@pchampin
Contributor

This was discussed during the json-ld meeting on 13 November 2024.

View the transcript

Issue Discussion

bigbluehat: We're working through the project list.

gkellogg: added issues that are class 1-3.

subtopic w3c/json-ld-syntax#436

<gb> Issue 436 URI in Profile triggers CORS Unsafe Request Header Byte rule (by azaroth42) [spec:w3c] [needs discussion] [tag-needs-resolution]

gkellogg: might just create "tokens" for profile parameters.

gkellogg: tokens not being namespaced is mitigated by the fact that the media-type is the namespace.

bigbluehat: So, it treats the media-type as the namespace.
… Profile parameters not having a colon is wide-reaching

gkellogg: not sure how we update guidance for using profile parameters.

bigbluehat: This would be a breaking change for web annotations.
… That would mean web annotations needs their own media type.

niklasl: dlehn's reply may mean this isn't as horrible as it seems.
… I think the datasets working group has done something with this.

pchampin: This doesn't seem to be a problem where things can't work, but making them work is tricky, due to pre-flight requests.
… If we expect a server to support profile-based content-negotiation, it doesn't come automatically.
… If you want to support this, you'll also need to support pre-flight requests.

pchampin: This is difficult to configure and easily forgotten.

bigbluehat: There were some suggestions for defining enumerated values (tokens).

<pchampin> I think it wouldn't hurt to define "short names" for the profiles in addition to the currently defined IRIs
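
One way to picture the "short names" idea, purely as an illustration: the group has not defined any tokens, so the names below are hypothetical and simply reuse the fragment of each of the six profile URIs registered for JSON-LD 1.1. `resolveProfiles` is likewise a made-up helper.

```javascript
// Namespace of the six profile URIs registered with the JSON-LD 1.1 media type.
const JSONLD_NS = "http://www.w3.org/ns/json-ld#";

// Hypothetical short names: the fragment part of each registered URI.
const SHORT_NAMES = [
  "expanded", "compacted", "context", "flattened", "frame", "framed",
];

// Hypothetical helper: expand a space-separated profile value that may
// mix short names and full URIs into a list of full URIs.
function resolveProfiles(profileValue) {
  return profileValue
    .trim()
    .split(/\s+/)
    .filter((item) => item.length > 0)
    .map((item) => (SHORT_NAMES.includes(item) ? JSONLD_NS + item : item));
}
```

Because a short name contains no ":", a value such as `application/ld+json;profile=expanded` would stay within the CORS-safelisted rules, while existing URI values would keep working unchanged.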

bigbluehat: The key is to not make it a breaking change.
… This would affect the media-type registration.

niklasl: Aren't link headers defined similarly, where there are pre-defined tokens and IRIs may also be used?

bigbluehat: Browsers have made decisions which are affecting what we can do.

<bigbluehat> > When processing the "profile" media type parameter, it is important to note that its value contains one or more URIs and not IRIs. In some cases it might therefore be necessary to convert between IRIs and URIs as specified in section 3 Relationship between IRIs and URIs of [RFC3987].

https://www.w3.org/TR/json-ld11/#iana-considerations

<niklasl> application/ld+json;profile="http://iiif.io/api/presentation/3/context.json"

niklasl: I think it would be good to add tokens. Rob's specific problems are more about the other uses of profiles.
… I wonder if our solution would be considered a solution for the issue; maybe parts of the issue can't be solved in the JSON-LD spec. Might recommend IIIF to use profile negotiation.
… But, using pre-flight does work, so that would be on their end.
… It's more that we put forward the design pattern and it has become more tricky.

bigbluehat: The ramifications of this are not just expand/compact/... Rob's point is for other specifications that used the same pattern.
… Now we know to avoid it.

<niklasl> See also: https://www.w3.org/TR/dx-prof-conneg/ (and https://profilenegotiation.github.io/I-D-Profile-Negotiation/I-D-Profile-Negotiation.html )

bigbluehat: There's reason to document this in the best-practices document. How this affects other specs would mean that they cannot treat profile as being extensible, and will need a new media type.

gkellogg: we might create a registry to allow other specifications to add their profile parameters without needing a new media-type.

bigbluehat: niklasl shared a document on using the profile parameter for content negotiation.

pchampin: Reaching out to the TAG would be a good idea, as other specs rely on this, and they would be impacted.
… I'd like to see their thoughts and how much we should make the effort to try to change this.
… Regarding the spec, note that this is a working draft which has been inactive for a while. This might not be the strongest argument to take before the TAG. (The dataset exchange WG)
… Part of the reason that spec is stalled is that there are contentious discussions with IETF on where it belongs.

<niklasl> From the dx-prof-conneg draft: During 2018, DXWG members had a longer discussion with the JSON-LD WG at the annual forum TPAC in Lyon, France and it was concluded that the "profile" parameter in the Accept and Content-Type headers should be seen to convey profiles that are specific to the Media Type [such as JSON-LD's expanded .... ]

pchampin: But, is there enough interest in IETF to continue the work?

niklasl: There are aspects of the draft that go into whether the profile parameter of the media type is the right way to go.
… The design of IIIF and Activity Streams I appreciate more when not looking at it from an RDF perspective.
… These are more useful at the intersection of JSON and RDF, which makes it easier to create specifications in a distributed way.
… If I believed (from RDF perspective) that format is irrelevant, general content negotiation works well.
… I can see how the TAG might argue from one of these perspectives. Maybe we shouldn't invent media-types on the fly.

<pchampin> https://www.w3.org/TR/vc-data-model-2.0/#media-type-precision

pchampin: Regarding the value of using JSON-LD media-type with parameter vs a new media-type, VC has had to rely on this for a while.
… The current solution is to have a dedicated media-type with additional language to explain the relationship between the two media types.
… We might point other specs to that solution.

<niklasl> +1 to mentioning that "third" point of view (very pertinent IMHO)

bigbluehat: I think we need to move on and come back to this issue.
… It would be great to write some of these things up on the issue so that we have something coherent to bring to the TAG.
… IETF has shifted their approach, and we're stuck in the middle. In the meantime, we can collect thoughts in the issue.
… I don't think we know enough to lay out the preferred solution.
… If we go the short-name route, we run the risk of turning into a registry.

<bigbluehat> w3c/json-ld-syntax#443

<gb> Issue 443 `@protected` creates unresolvable conflicts when the same term is defined in two contexts top-level (by trwnh) [spec:editorial] [wr:commenter-agreed-partial] [class-2]


@azaroth42
Contributor Author

Thanks for discussing it! I think we can close this particular issue -- the pre-flight does indeed solve the issue, and implementation notes in various specs in future versions could head it off completely.

That said, at a conference in Amsterdam this week there were at least 6 organizations wanting a solution for profile negotiation for non JSON-LD serializations where the profile param isn't available, and (sorry Rob, Lars and Nick) not as complicated as https://www.w3.org/TR/dx-prof-conneg/ ... just to update and try to get acceptance for Accept-Profile in the IETF: https://datatracker.ietf.org/doc/html/draft-svensson-profiled-representations-01

(The q values being one driver for this, rather than the link header on the request approach of dx-prof-conneg)
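
To illustrate why q values matter here, a server-side selection step could look like the sketch below. It deliberately does not parse any header: the wire syntax of `Accept-Profile` is defined by the draft and is not reproduced here. `selectProfile` and the `{ uri, q }` shape are assumptions made for this example only.

```javascript
// Hypothetical helper: pick the supported profile the client prefers most.
// `preferences` is a list of { uri, q } objects already parsed from the
// request; a missing q defaults to 1, and q = 0 means "not acceptable".
// `supported` is the list of profile URIs the server can produce.
function selectProfile(preferences, supported) {
  let best = null;
  for (const { uri, q = 1 } of preferences) {
    if (q <= 0 || !supported.includes(uri)) continue;
    // Strict comparison keeps the earliest entry when q values tie.
    if (best === null || q > best.q) best = { uri, q };
  }
  return best === null ? null : best.uri;
}
```

A `null` result means that none of the requested profiles is available, and the server then has to decide whether to fall back to a default representation or to refuse the request.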

Perhaps there's sufficient convergence to at least have a call or two?

@rob-metalinkage

If a less complicated option can be identified then I'd be happy to revisit dx-connegp to accommodate. We have one simplification idea in the wings, but I am currently buried in the details of how profiles themselves need to be defined at scale, for application-level information content choices, not just limited serialisation options, which feel more like media-type concerns.

@gkellogg
Member

This issue was discussed in the 2024-12-11 meeting

Topic: Profile Negotiation

Gregg Kellogg: We did have an open issue on this
... and now we have a good bit of feedback
... Herbert, Bob would you like to kick off the discussion?
Herbert Van de Sompel: I did send a link to provide an overview
... this discussion has been going on forever and has yet to have a resolution
... my hope is to sort out how to proceed
... we have a W3C route, and IETF route, etc.
... is there interest in pursuing this work? and if so where? and whom?
... as a sample of how hard this is, my two co-authors have both been radio silent in response to this recent interest
... it needs solving at the process level as much as technical
Gregg Kellogg: My concern is what issues exist for JSON-LD and what can we do about them
... one issue is the emergent problem of using URIs as profile values
... that is now recently undermined by other groups
... so an alternative to that may be using a `Link` header
... so our general question is how can we apply best practices here?
Herbert Van de Sompel: The two existing drafts overlap a good bit
... and the IETF work got copied into the W3C work
Niklas Lindström: I think we should say what we mean by "profile" here
... one is something like data shapes--which is not what I mean
... and the data shapes WG is talking about those in a sense more like application profiles from Dublin Core
... those are a preferred selection of terms from vocabularies
... I wouldn't say that's an emerging concern
... we are hoping that profile negotiation gets attention to help with this need
... but I'm not clear on what we mean by `profile` in the JSON-LD specs
Herbert Van de Sompel: If you look at the I-D, both of the things you've described are covered
... one is representation related
... and the other is application/vocabulary related which is independent...though interdependent of the representation
Pierre-Antoine Champin: I'm the team contact for JSON-LD and other groups
... like Data Set Exchange
... I'd like to see this work progress
... despite the growing list of issues
... what we talk about as profile negotiation in JSON-LD seems to differ a bit
... it doesn't really require content negotiation
... if someone sends me a document with `profile` that uses an IRI in the value, fetch will fail due to CORS
... I think it's great that this issue triggered a larger discussion
... there has also been some activity around the profile recommendation...which sadly stopped about 6 months ago
Niklas Lindström: For reference, at the Nat. library of Sweden, we experiment with prof-neg to allow consumers to select one of some (currently predefined) sets of "selection of vocabularies": https://github.com/libris/librisxl/blob/develop/rest/API.md#profile-negotiation *orthogonal to serialization format (all within the bounds of RDF)
... part of the reason it stalled is the ambiguity between the current drafts
... are there plans to take either of these further?
Herbert Van de Sompel: There was an attempt to talk to IETF dispatch about this work
... they decide where it fits within the IETF
... and pointed us to the HTTP API
... I've worked there and did the Link Set spec
... but that group came back and said "the industry" wasn't interested
... however, one is not required to go that route, but could publish a personal informational spec--which is what I did for Memento
... and Memento has been used by Web Archives
... a W3C route would be another route forward
... any of these have scoping concerns
... the I-D would be singularly about the negotiation
Pierre-Antoine Champin: Maybe we can dig into some of this offline
... and try to focus a bit on what JSON-LD is dealing with specifically
Gregg Kellogg: #436
#436 -> Issue 436 URI in Profile triggers CORS Unsafe Request Header Byte rule (by azaroth42) [spec:w3c] [needs discussion] [tag-needs-resolution]
Herbert Van de Sompel: That issue you mention with CORS and content-type is purely HTTP related
... there was some solution that Rob was happy with--preflight
Gregg Kellogg is scribing.
Benjamin Young: In JSON-LD, the profile is used as a parameter on a media-type; it's been surfaced in sub-groups, such as web annotation.
... We went that way to avoid registration issues, but CORS is creating problems
... Since then, the IETF has encouraged more top-level media types.
... The verifiable credentials group is using a new top-level media-type.
... We thought that using a parameter to express a preference would be the right way to go.
... Now, we need advice on other ways to handle this.
Herbert Van de Sompel: Thanks for those details, bigbluehat
... one of our issues talks about that specifically
... I did not know that using a `profile` parameter didn't work
Benjamin Young: Neither did we
Herbert Van de Sompel: I think it's actually good news
... one could use the `Accept-Profile` approach instead...since normal content negotiation no longer works
... so, we could use the `Profile` header instead
... what if a client expresses a `profile` attribute and an `Accept-Profile` header
... but now if `profile` no longer works, then that ambiguity is cleared up
Gregg Kellogg: The `Link` header is the other option
Herbert Van de Sompel: There's a problem with that also?
Gregg Kellogg: The `Link` header can be used for both request and response, correct?
Herbert Van de Sompel: The `Accept-Profile` is for requests, `Profile` is for responses
... the `Link` header could be used. I've never seen a problem with it
... but I do see an issue with using it alongside `Accept`
... you could also use URIs there...even multiples
Gregg Kellogg: And JSON-LD's existing use of `profile` shows use of multiple URIs
... we even showed how to use `profile` to specify the frame to use
... probably some security concerns with that one...but perhaps some allow listing for those could help
... but those would likely be application specific
... probably where JSON-LD wants to go
... is to move away from `profile`
... and instead look to using the `Link` header or `Accept-Profile`/`Profile` to specify that
... not every application is hampered by CORS as the browsers are
... but we may want to deprecate that use
... so for the purpose of this group, that's what we're hoping to focus on
Herbert Van de Sompel: So, you could continue to use `profile` as a parameter
... and you can use them in `Accept` and `Content-Type`
... but as soon as you start doing profile negotiation, you use `Accept-Profile` instead
Pierre-Antoine Champin: So, if we went the individual specification route at the IETF, can a W3C spec depend on it?
Ivan Herman: No
Pierre-Antoine Champin: If we go that route, we need to have a normative reference
... we would end up relying on that
... so we'd need them to be full specifications to move forward
... we have had these conversations with the IETF
... and they said it needed to be there...but then no one will let it be developed there
... so we are at an impasse
... I will reach out again to see if there's a path forward there
... or to say we will do it in the Data Set Exchange WG to move it forward
... so this group can reference the work
Herbert Van de Sompel: IIRC, from the conversations in 2021, if the W3C would express "real interest", the IETF Dispatch could be moved
... Darrell Miller would be someone to contact there
... even if someone goes non-standard track
... someone can officially register HTTP headers
... I did this with Memento
... so, non-standard track documents can register headers
Ted Thibodeau Jr.: I'm a little concerned that these documents are only consumable via a web server
... if that profile is only expressed in the content type, how do I know what the profile is elsewhere
Gregg Kellogg: There are other places where HTTP headers are used
Ted Thibodeau Jr.: This means that I cannot have a local copy of that document
Pierre-Antoine Champin: TallTed you could make the same remark about content-types....
Herbert Van de Sompel: You can download it
Ted Thibodeau Jr.: Where's the profile info go?
Benjamin Young: The profile is typically in the context; profile negotiation is to specify the format, which isn't needed to consume the document.
... If the document is self-contained, you don't need the profile. It's used when requesting a particular form of a document with the same semantic information.
... If you need a file extension, you MUST use a top-level media-type.
Herbert Van de Sompel: That does raise questions around self-contained-ness of the representation
... that doesn't hold for all media types
Ivan Herman: Coming back to what can be done or not done
... I reacted to the individual submission
... but there are actually not rules against them per se
... but the question is really about stability
... if you get a specification that meets certain requirements
... then if it is stable enough than it can be referenced
... for example (though perhaps controversial), we have a green light to reference schema.org
... there is little probability that that will disappear in the coming years
... the other thing, if it's on the IETF turf, can we standardize it?
... there are examples. One recently.
... in the VC WG, there were issues with multikey and multibase
... the standing of the group had been that it should be done at the IETF
... it never happened there for reasons I do not know
... but we had to move on
... so now we have parts of those specifications in our VC specifications to solve for lack of progress at the IETF
Herbert Van de Sompel: If you go the IETF route, even with a personal spec, you do get an RFC which is stable
Ivan Herman: Yes, and we can always go to the management to get permission
Benjamin Young: I think we've centered around where to specify things; can we circle back to what the JSON-LD community needs?
... We have Accept-Profile and Profile headers to consider, as well as the Link header.
... Does the JSON-LD group feel that there is a way forward using these?
Herbert Van de Sompel: You could decide to step away from classic content negotiation
... and totally move to `Accept-Profile` and `Link`
... which would actually solve a problem with the I-D
Gregg Kellogg: That seems like the way forward for us
... and I don't think we'd have a problem linking to those
... with some wording that indicates some non-normative description around the issues with CORS
... and how to handle conflicting info like `profile` in `Accept` or `Content-Type` as well as `Accept-Profile` and `Profile`
Herbert Van de Sompel: In case you get in touch with folks at the IETF, please copy me
... I am willing to put cycles to get this draft done
... and I'll find my co-authors...hopefully
Pierre-Antoine Champin: I'm happy to help you move this forward
Niklas Lindström: Thank you for moving this forward!
Antoine Isaac: +1
... in the JSON-LD situation, we would probably need to consider...
... can you negotiate for two combinable profiles?
... we currently have asking for the frame and the specific vocab/context for that same JSON-LD
Herbert Van de Sompel: You can provide space delimited URIs in `Profile`
... I'd need to check on `Accept-Profile`
... if you would have the notion of combined profiles, you would have to give that combination another URI
... you'd need a "super URI"
... that points at the other three
... I think that's how it would work
... but it's been years...so I'd have to look again
Herbert Van de Sompel: There is a profile URI for framed now, correct?
Niklas Lindström: Yes
Herbert Van de Sompel: Does it fit with this notion of profile? ... it feels a bit different
Gregg Kellogg: It was a similar notion
... to define how you would like to get a document back
... we expected client to get JSON-LD with the context they wanted
... so we used these to be more explicit
... and you might still have a vocabulary profiling happen
... and maybe these are too many concepts into one vehicle
Antoine Isaac: Reacting with some history
... we discussed making the framing profile and the more semantic profile
... in the end, we had concluded that how we did it was fine
... but that's just a memory from Lyon

@rob-metalinkage

Hi, the meeting times are not workable for me on a regular basis, and I thought the group had decided to go a different way than dx-connegp.

But very interesting discussion. If there is energy to revisit the dx-connegp draft and try to find a simple solution that meets needs, I am happy to participate. OGC input to this work has been on hold pending other work on best practices for OpenAPI and JSON Schema, and combination mechanisms for JSON-LD annotations of existing schemas. We were planning to revisit all our redirect and negotiation infrastructure in March/April.

Dx-connegp already has its own concept of implementation profiles. My feeling here is that it may be possible to define a very lightweight one to meet these needs with a little relaxation of the conceptual model, such as making the discovery of available profiles optional; it's by far the most complex part.

Projects
Status: Discuss-Call
Development

No branches or pull requests

6 participants