Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patterns as a string #22

Closed
blakeembrey opened this issue Oct 15, 2020 · 10 comments
Closed

Patterns as a string #22

blakeembrey opened this issue Oct 15, 2020 · 10 comments

Comments

@blakeembrey
Copy link

blakeembrey commented Oct 15, 2020

Creating a meta proposal to bring together the discussion around #20, #19, #14, and #13. Feel free to correct me or close if I haven't thought about some edge cases in the individual locations that the pattern intends to be used, I'm less familiar with things like service worker scopes and CSP.

I'm making a couple of assumptions:

  1. Using a single string is the simplest API for developers
  2. Being incompatible with path-to-regexp is acceptable (on this point, it already is - just in a different way)
  3. You don't use * (wildcards) as the default behavior
  4. You enforce the scheme/hostname to be defined (e.g. http://, https?:// or *://)

With these assumptions, I think you could create a simpler API that mirrors the URL API. For example, new URLPattern('*', window.location) would be supported. You could always enforce a full URL instead of partial URLs, similar to the current URL API too.

In the service worker use-case, you can have it throw if the "origin" doesn't match the static prefix of the URLPattern. There was already some need for this sort of behavior due to the scope matching ordering: https://github.com/WICG/urlpattern/blob/master/explainer.md#scope-match-ordering.

Finally, on the path matching "magic", if you remove the magic prefix/suffix and make it explicit you can just do new URLPattern('/:foo{/bar}?', 'http://{:subdomain.}*.example.com').

@wanderview
Copy link
Member

Being incompatible with path-to-regexp is acceptable (on this point, it already is - just in a different way)

Yes, but I'd like to minimize changes. Over time I find myself removing differences from my proposal and just converging on path-to-regexp. For example, I plan to remove my magic "." suffix behavior for hostnames now.

And to be clear, I mean converge on path-to-regexp on a per-component basis. How we perform matches across the whole URL seems to sit atop the path-to-regexp matching engine in my mind.

With these assumptions, I think you could create a simpler API that mirrors the URL API

If we can get a string-only system to work, then that would be great. I'm still expecting to need a long form structured approach as a fallback, though, since there are a lot of URL corner cases that may be hard to work into a short form single string. For example, URLs can have port numbers, username:password@, etc. I think a short form could ignore those, but trying to make a general purpose signal string parsing approach that handles potential matches in all those less common cases could get quite complex.

Finally, on the path matching "magic", if you remove the magic prefix/suffix and make it explicit

In regards to new URLPattern('/:foo{/bar}?') instead of the prefix magic, this would unfortunately not work for the service worker use case. We can't easily divine that the regular expression here is safe for the service worker to process. Having the behavior in the optional/repeat modifier allows service worker scopes to take advantage of it.

@blakeembrey
Copy link
Author

this would unfortunately not work for the service worker use case.

Gotcha, I think I misunderstood this part. Supporting the magic prefix is explicitly to make service worker computation easier? Can you help me understand this part?

We can't easily divine that the regular expression here is safe for the service worker to process.

I'm not proposing supporting regular expressions FWIW, the /bar would be a literal string while /:foo would be what's supported in the current specification. I think maybe what I was suggestion may have been more clear written as /foo{/:bar}?, although I would imagine both are equally valid but I may be incorrect about that regarding the previous point on service workers.

@wanderview
Copy link
Member

Oh I see. Its a new syntax that scopes what the modifier applies to? I guess that could work for service workers.

That seems possible. I do hesitate to deviate too much from the existing path-to-regexp, but this does seem a lot less magical than the existing optional prefix behavior.

How would we evaluate if this is something that would work for developers?

@blakeembrey
Copy link
Author

I do hesitate to deviate too much from the existing path-to-regexp, but this does seem a lot less magical than the existing optional prefix behavior. How would we evaluate if this is something that would work for developers?

Totally makes sense, it is supported in the current release of v6 - I can try and find if anyone is using it in the wild yet.

@wanderview
Copy link
Member

FYI, I am starting from the latest path-to-regexp for my prototyping. So my initial implementation will include the {foo}? type grouping, but not implement the "optional prefix" behavior.

@wanderview
Copy link
Member

I think this issue mainly boils down to what I am calling "short form" where we will accept a string encompassing patterns for the entire URL. I haven't prototyped this yet. I still think its likely there are cases we won't be able to easily disambiguate, so having the structured long form will be a good backup.

@blakeembrey
Copy link
Author

One thing I remembered (and can't find if I wrote this in a separate issue) so the behavior of matching segments in non-pathname locations. For example, foo.:domain.com would probably be expected to break on . instead of matching foo.bar.baz.com. Without knowing the position in the URL, which I've seen you trying to figure out, you won't know which is the default character to break on.

@wanderview
Copy link
Member

My plan is to first break the string into URL components. Then each component is compiled as a pattern. This should allow us to use a different separator in the hostname vs the pathname, etc.

@wanderview
Copy link
Member

Note, I was able to successfully support string construction in the implementation. So you can do:

new URLPattern('http{s}?://*.example.com/foo/:image.jpg');

It also supports relative strings:

new URLPattern('/foo/:image.jpg', self.location);

@wanderview
Copy link
Member

This is now codified in the spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants