Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define and use reserved percent-encode set #32

Merged
merged 2 commits into from
Feb 20, 2018

Conversation

mgiuca
Copy link
Collaborator

@mgiuca mgiuca commented Feb 8, 2018

This fixes the set of characters that are encoded when substituting into a placeholder, to include all the "reserved" and illegal characters from RFC 2396 (matching the behaviour of encodeURIComponent).

There is no appropriate set to use in the URL Standard. whatwg/url#369 discusses adding this (also needed for properly specifying registerProtocolHandler), but for now, I'm just putting it in the Web Share Target spec.

Closes #9.


Preview | Diff

@mgiuca mgiuca requested a review from marcoscaceres February 8, 2018 08:03
@mgiuca
Copy link
Collaborator Author

mgiuca commented Feb 8, 2018

Hey Marcos,

I have a bunch of upcoming changes to Web Share Target. Do you want me to run them by you, or should I just get some people at Google to review? I don't want to bother you.

Also @annevk --- this makes a start towards whatwg/url#369. I know this should be defined in URL Standard but I'd like to get WST working properly first. Then I'll work towards migrating it upstream.

Copy link
Member

@marcoscaceres marcoscaceres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fairly superficial review from me, as I don't know the technical details of exactly what you are trying to do here. I guess @annevk will say if the technical aspects are wrong wrt "reserved percent-encode set".

from [[ECMAScript]].
</p>
<p class="issue">
The <a>reserved percent-encode set</a> should be defined in [[!URL]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change this to: "Discussions are underway to define 'reserved percent-encode set' in the [[!URL]] spec. To follow the discussion, see issue#369".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best for issue descriptions to describe the problem (which won't change until it's resolved and then the issue box can be deleted), rather than give a status update on the discussion (which can get out of date). I think this is fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense.

@marcoscaceres
Copy link
Member

Do you want me to run them by you, or should I just get some people at Google to review? I don't want to bother you.

Sure! It's not a bother at all - but I'm not completely up to speed on the spec. I'm always happy to do an initial review of things if you need another pair of eyes.

@mgiuca
Copy link
Collaborator Author

mgiuca commented Feb 8, 2018

OK I'll wait and see if @annevk has any thoughts about this. Note that I think we could spend awhile debating the exact set of characters to put into this set. Since I have other changes I'd like to make, I'd prefer to land this first and then discuss the character set later.

@mgiuca
Copy link
Collaborator Author

mgiuca commented Feb 9, 2018

@ericwilligers PTAL

@ericwilligers
Copy link
Contributor

Need to omit ! * ' ( ) as these are in the "unreserved set".

@mgiuca
Copy link
Collaborator Author

mgiuca commented Feb 16, 2018

@ericwilligers Thanks.

Need to omit ! * ' ( ) as these are in the "unreserved set".

For clarity: in the RFC 2396 "unreserved set".

The problem (as discussed offline, but just for anyone else's benefit) is that I made the list based on the RFC 3986 unreserved set, but I was intending to use the RFC 2396 set.

The reason to use the older RFC 2396 set is that encodeURIComponent uses that set, as does Chrome's generic "escape a string for a URL" function (implying that use of that set is pretty common). We don't need to use that function in Chrome, if we decide to use a different set here, but the path of least resistance seems to be using that (smaller) set. So I will remove those 5 characters.

@mgiuca mgiuca force-pushed the reserved-encode-set branch from 31b0de8 to 55f95c8 Compare February 16, 2018 06:04
@annevk
Copy link
Member

annevk commented Feb 19, 2018

If you actually want the URL parser to change to use this I think it would make more sense to a depth-first approach since that may or may not be successful. If you want to do this regardless of whether that can change I suppose this makes sense...

@mgiuca
Copy link
Collaborator Author

mgiuca commented Feb 20, 2018

@annevk I think you are saying that if I don't want a special exception here forever, I should make the change in URL Standard first, before doing it here.

That's fair, but since this is a draft (which will go through many rounds of review), I think it's fine to have it here now. The possible directions we go on after this are:

  1. The reserved set (possibly with changes) gets added to URL Standard, and then we refer to it from here, or
  2. We find a way to use an existing standard (perhaps we directly call ECMAScript's encodeURIComponent) so we don't need to define a separate set, or
  3. We decide to just leave the set in this spec as an isolated thing. (Though that is not ideal, and it should at least share the list of codepoints with registerProtocolHandler).

All three of those can happen, so I'll just land this for now.

@mgiuca mgiuca merged commit b1bd475 into w3c:master Feb 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[URL substitution approach] Need to define URL escape character set
5 participants