-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why invent a new URL template syntax? #31
Comments
Good question. Really it's two questions: a) what should the syntax be, and b) what should the processing model (inside the spec language) be? SyntaxThe reason we aren't using the "%s" placeholder syntax from We can't just use positional placeholders because the fields {title, text, url} have no strong ordering, are all optional, and we may want to expand them in the future with further optional values. So we need a named placeholder syntax. Options we considered are:
Given that you provide a name, Python's "s" seems redundant. So we originally settled on the Regardless of the specifics, we can't just use the simple "%s" from rPH. Processing modelGiven that we have a different syntax to rPH, it's a bit hard to share a processing model. Unfortunately, there is extra complexity with WST compared to rPH, due to:
So, I don't think we can precisely share syntax or model with rPH, but we do have leeway to change the syntax. |
I think you should at least use the ugly syntax as otherwise you cannot represent actual braces in these URLs. (Or you'd have to modify the URL parser to support templated URLs. Not sure we'd want to go that far though.) |
There is no such thing as a brace in a URL. Or do you mean, we won't be able to support URLs with %7b and %7d in them, because running the parser on the template will convert the template braces into %7b and %7d, and thus we won't be able to distinguish them at substitution time? I share that concern, but...
I don't think it's a great policy to choose a syntax which we both consider "the ugly one" over a cleaner and equally unambiguous syntax, just because it simplifies the implementation. I could be wrong though. I also agree we don't want to do a major refactor of the URL parser to support this case. |
Yes, you won't be able to distinguish at substitution time. At least for me that qualifies as much more substantive than a concern. |
Sorry, I wasn't clear. I'm not suggesting that the behaviour be that %7b and %7d are treated as template substitution fields. That would just be broken. In other words, the following test case needs to work correctly: "share_target": {"url_template": "?foo=%7btitle%7d&text={text}"} If invoked with text = "abc", this should navigate to "?foo=%7btitle%7d&text=abc" --- "%7btitle%7d" should not be substituted. So I'm not suggesting that we compromise the semantics. Rather, it's possible to implement the above behaviour correctly if we are careful not to percent-escape the braces before substitution. The current spec text doesn't have this problem, but it does have a problem of not detecting syntax errors in the template at parse time. I have a fix for this but it's a bit of a hack. It seems that fixing #25 without breaking percent-escaped braces is complicated, but certainly do-able. Or, we could change the syntax. I was saying above that it seems wrong to compromise on the preferred syntax just because it makes the implementation messier. But perhaps that is a trade-off we are willing to make. |
As I said before, the only way to properly make that distinction is by branching the URL parser. I'm not sure that's a good idea. |
@plinss raised on w3ctag/design-reviews#221 the existence of RFC 6570, an existing specification for URL template substitution. I raised a number of issues with directly referencing this RFC (in particular, it seems largely incompatible with URL Standard), but the existence of this standard, which uses the exact same template notation as I currently do ("{text}") makes me think we should try to keep this syntax rather than switching to "%(text)"; the only reason to do that being to make parsing easier. After more carefully reading URL Standard and experimenting, I think we can actually go ahead with the existing syntax, if we are also going to remove expansion in the path (which I think we want to do, for now, due to #30). Because '{' and '}' are actually allowed characters in the query and fragment (they are not in the query or fragment percent-encode set):
So we could specify this as just run the URL Parser, against the manifest base, then in addition, check that braces are balanced in the query and fragment parts. At launch time, run "replace placeholders" on the query and fragment parts. '{}' are thus treated as variable delimiters, while '%7b' and '%7d' are literal brace characters. The only thing is that it would not be possible for an unquoted '{' or '}' to appear in the final URL (which is something that is allowed by the URL Standard). I'm not really sure why '{' and '}' are allowed to be left unquoted in URL Standard, since in RFC 3986, they were prohibited in all parts of the URL. Now, this approach would prevent us from allowing placeholders to appear in the path at a later time. If we did want to do that, we could monkey-patch the URL Standard, by saying (from the Web Share Target spec), "Run the URL Parser, but with U+007B ({) and U+007D (}) removed from the path percent-escape set." And then apply the above technique. |
Please don't. Detail the complexity there. I'm still not a big fan of using a different syntax from the established precedent. |
FWIW I don't think the %s in registerProtocolHandler is a very strong precedent, being a two-browser API with no ability for multiple named placeholders. But yes, better to update the URL Standard with some special "URL template parsing mode" or similar, than to monkey-patch it. |
I think the other problem with partially adopting some syntax from elsewhere is developer confusion as it seems unlikely we'll be able to adopt RFC 6570 wholesale. So what looks familiar cannot actually be processed with the same kind of tooling. |
I doubt that developers will be confused upon encountering a curly brace syntax ("{text}") that is slightly different to that of RFC 6570. For one thing, I doubt many developers are even aware of this RFC (I wasn't). I originally based the syntax on a subset of Python's string format syntax. I don't think anyone is going to be confused due to RFC 6570 if we say, "put your placeholder in curly braces like this: /foo?bar={text}" (without mentioning that RFC).
Yes absolutely. I think while WST is a draft spec, it makes sense to express this as a monkey patch, but as we prepare to properly standardize it, we would migrate those changes into the URL standard. Having said that, no such change to URL standard is required if we don't allow placeholders in the path, because '{' and '}' are valid characters in the query and fragment. And I don't intend to allow placeholders in the path, due to #30. |
Spec has been updated to remove the URL template (for unrelated reasons); we now construct a URL from individual key/value pairs, as per form submission. |
Can't you reuse the processing model we have for
registerProtocolHandler()
? Maybe even abstract that and make it reusable?The text was updated successfully, but these errors were encountered: