Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URL does not match the web; does not allow changing protocol from about: #49319

Open
ljharb opened this issue Aug 25, 2023 · 34 comments
Open

URL does not match the web; does not allow changing protocol from about: #49319

ljharb opened this issue Aug 25, 2023 · 34 comments
Labels
whatwg-url Issues and PRs related to the WHATWG URL implementation.

Comments

@ljharb
Copy link
Member

ljharb commented Aug 25, 2023

Version

20.5.1

Platform

Darwin NAME.local 20.6.0 Darwin Kernel Version 20.6.0: Sun Nov 6 23:17:00 PST 2022; root:xnu-7195.141.46~1/RELEASE_X86_64 x86_64

Subsystem

URL

What steps will reproduce the bug?

Run Object.assign(new URL('about:blank'), { protocol: 'http' }).protocol === 'http:' in the REPL.

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior? Why is that the expected behavior?

true. It's expected because every browser does it, and because it works fine if the initial URL is an http/https one.

What do you see instead?

false

Additional information

No response

@ljharb
Copy link
Member Author

ljharb commented Aug 25, 2023

Note this bug dates back to node v10.0.0, so it'd be great to backport this back as far as is possible (to all non-EOL lines, ideally)

@Ethan-Arrowood
Copy link
Contributor

Since Node switch to ada, does this bug still occur? Is it potentially an issue with the spec?

@Ethan-Arrowood
Copy link
Contributor

cc @anonrig

@ljharb
Copy link
Member Author

ljharb commented Aug 25, 2023

the bug in node seems to be in every version that has URL; from 10.0.0 onwards, so it doesn't seem switching to ada caused or fixed it.

@ljharb
Copy link
Member Author

ljharb commented Aug 25, 2023

I doubt it's an issue with the spec, since it works fine if the original URL is http or https, and it works fine on Chrome and Firefox and Safari - meaning it's solely a bug in node's implementation.

@bakkot
Copy link
Contributor

bakkot commented Aug 25, 2023

This might actually be an issue with the spec: the protocol setter calls basic URL parse passing it scheme start state as state override, and that algorithm eventually hits

which suggests to me that you should not be able to change the protocol from a special scheme to a non-special scheme, or conversely. http is special, about is not. So that matches Node's behavior here.

But, generally speaking WHATWG specs are supposed to document reality, so if no browsers implement the behavior in the spec, I believe that would be regarded as an issue in the spec.

@ljharb
Copy link
Member Author

ljharb commented Aug 25, 2023

aha, interesting, thanks for that context

@ljharb
Copy link
Member Author

ljharb commented Aug 25, 2023

Some more fun quirks from browsers doing their own thing here: Object.assign(new URL('about:blank'), { protocol: 'http' }).toString() === 'http://blank/' but Object.assign(new URL('about:blank'), { protocol: 'http' }, { protocol: 'about' }).toString() === 'about://blank/'.

@wraithgar
Copy link
Contributor

I don't think browsers are the best thing to follow on this one.

Object.assign(new URL('git+file://a'), { protocol: 'file:'}).toString()
"file:///"

I can't imagine updating the spec to allow THAT.

@anonrig
Copy link
Member

anonrig commented Aug 25, 2023

This might actually be an issue with the spec: the protocol setter calls basic URL parse passing it scheme start state as state override, and that algorithm eventually hits

which suggests to me that you should not be able to change the protocol from a special scheme to a non-special scheme, or conversely. http is special, about is not. So that matches Node's behavior here.

But, generally speaking WHATWG specs are supposed to document reality, so if no browsers implement the behavior in the spec, I believe that would be regarded as an issue in the spec.

This is 100% correct.

@ljharb
Copy link
Member Author

ljharb commented Aug 25, 2023

True, but the entire point of node implementing URL was to follow browsers, iirc, and any deviations (even from weird behavior) contradicts the goal of increasing portable code.

@bakkot
Copy link
Contributor

bakkot commented Aug 25, 2023

I've asked about this in the WHATWG matrix channel; will report back if anything useful comes out of it.

@anonrig
Copy link
Member

anonrig commented Aug 25, 2023

True, but the entire point of node implementing URL was to follow browsers, iirc, and any deviations (even from weird behavior) contradicts the goal of increasing portable code.

What browsers follow is defined by the WHATWG specification, and they don't follow the specification (even the web-platform tests results show it). Unfortunately, we follow it strictly (and getting punished by it). Referring: https://wpt.fyi/results/url?label=experimental&label=master&aligned

@anonrig anonrig added the whatwg-url Issues and PRs related to the WHATWG URL implementation. label Aug 25, 2023
@anonrig
Copy link
Member

anonrig commented Aug 26, 2023

Closing the issue since this is an intended behavior of the whatwg url specification.

@anonrig anonrig closed this as not planned Won't fix, can't repro, duplicate, stale Aug 26, 2023
@ljharb
Copy link
Member Author

ljharb commented Aug 26, 2023

@anonrig closing doesn't feel appropriate - the specification is irrelevant; the only point of having URL in node is to match browsers.

@anonrig anonrig reopened this Aug 26, 2023
@ronag
Copy link
Member

ronag commented Aug 26, 2023

The question here is IMO. Are we aiming for spec or web compliance? wdyt @benjamingr

@anonrig
Copy link
Member

anonrig commented Aug 26, 2023

cc @nodejs/url @jasnell

@benjamingr
Copy link
Member

@ronag we follow the spec and if we want the spec changed then we should engage with WHATWG in good faith about it like @bakkot did and/or open an issue in whatwg/url

I don't think this issue is particularly high priority to be honest so waiting for WHATWG is fine?

@benjamingr
Copy link
Member

@ljharb open an issue in WHATWG to change the behavior or the spec?

@ljharb
Copy link
Member Author

ljharb commented Aug 27, 2023

I’ll certainly follow the whatwg discussion, and it’s fine if this is low priority, but issues aren’t closed due to being low priority usually - and it seems like “we follow the spec” vs “we follow the browser” for URL is a tsc consensus thing - has that been discussed?

@anonrig anonrig added the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Aug 27, 2023
@benjamingr
Copy link
Member

and it seems like “we follow the spec” vs “we follow the browser” for URL is a tsc consensus thing - has that been discussed?

I'm happy to (and it seems so is @anonrig ) to bring this up at the TSC meeting but I don't think that'd be super useful.

We operate by lazy consensus. This issue was opened, several collaborators commented. None of them are in actual disagreement about Node doing what the spec says from what I can tell. We can count TSC members too but that shouldn't matter. The collaborators "own" the code and the TSC mediates technical disagreements and votes on them if it comes to that.

From what I can tell unless someone who commented disagrees this isn't actionable to the TSC? (cc @Ethan-Arrowood @ronag @anonrig ) The TSC will 100% discuss it for 1-2 minutes and then say something like "we will monitor what the WHATWG effort ends up with".

Unless you want to bring "let's deviate from the spec when browsers do" as an item? In that case it should probably be a separate issue.

@ljharb
Copy link
Member Author

ljharb commented Aug 27, 2023

That all makes sense; I’m fine just leaving the issue open pending the whatwg discussion regardless.

@ljharb
Copy link
Member Author

ljharb commented Aug 27, 2023

As for the larger question, a ton of design decisions have been made in node for the rationale “we follow browsers, so code is more portable” - including ESM’s resolution compared to that of CJS, using URLs instead of file paths, etc. If matching browsers isn’t the goal, then those decisions and URL following the spec instead of browsers are contradicting each other.

@ronag
Copy link
Member

ronag commented Aug 27, 2023

I agree there is an inconsistency in our decision making depending what module/people are involved.

@anonrig
Copy link
Member

anonrig commented Aug 27, 2023

@benjamingr This is an issue that surfaces every X amount of time which is not specific to url itself. For example, couple of months ago I had a write a blog post to use it as a reference whenever a opaque-origin related issue is created on this repository where the issuer is stating Chrome and Firefox is doing the right thing and Node.js & Safari has a bug. Ref: https://www.yagiz.co/url-parsing-and-browser-differences

Personally, I think we need consensus towards what the users expect, and what the specification is. From what the documentation is saying URL class is under the section: "WHATWG URL API", but even that was not sufficient for this discussion. Therefore, I raise my concern to resolve this once in for all.

PS: Browsers are trying to conform to the web-platform-tests as part of Interop 2023, and eventually either the URL specification will change or the browsers will follow-up with the specification implementation.

@benjamingr
Copy link
Member

benjamingr commented Aug 27, 2023

@anonrig what actionable outcome do you expect from the discussion other than "we should follow the spec reasonably and consult with WinterCG when making server side specific deviations" (which isn't actionable)?

@ronag
Copy link
Member

ronag commented Aug 27, 2023

I think that's an acceptable outcome, no? Then at least we have something to fallback to when these conversations start...

@lemire
Copy link
Member

lemire commented Aug 27, 2023

The relevant part of the specification is this:

Capture d’écran, le 2023-08-27 à 18 04 01

Reference: https://url.spec.whatwg.org

We find explicit test cases that come with the specification such as ...

        {
            "href": "nonsense:///test",
            "new_value": "https",
            "expected": {
                "href": "nonsense:///test",
                "protocol": "nonsense:"
            }
        },

So it is very much deliberate.

In fact, there is an issue open on this very topic:

whatwg/url#674

cc @TimothyGu @annevk @achristensen07

@ljharb
Copy link
Member Author

ljharb commented Aug 28, 2023

(Note that whatwg/url#782 would obviate my use case for mutating an URL object, which would mean i don't care what this does as long as it matches browsers)

@anonrig
Copy link
Member

anonrig commented Aug 28, 2023

@anonrig what actionable outcome do you expect from the discussion other than "we should follow the spec reasonably and consult with WinterCG when making server side specific deviations" (which isn't actionable)?

A reasonable outcome would be to have a consensus on whether we are going to follow WHATWG URL specification (as we do right now), or try to match browser implementations, as suggested.

@GeoffreyBooth
Copy link
Member

A reasonable outcome would be to have a consensus on whether we are going to follow WHATWG URL specification (as we do right now), or try to match browser implementations, as suggested.

I think we wouldn’t want to change to match browsers only to then see browsers change to start following the spec; I think this needs to get resolved at the WHATWG level before we do anything. I expect that WHATWG will update the spec to match browsers and then we’ll update accordingly, but who knows.

@ljharb
Copy link
Member Author

ljharb commented Aug 28, 2023

I totally agree with that plan; it remains useful in general tho to clarify whether the thing to follow is "web reality" or "the html spec and friends".

@LiviaMedeiros
Copy link
Contributor

Out of curiosity, what's the goal of using Object.assign on URL instances? The order of operations is very subtle, consider at least this:

Object.assign(new URL('file://localserver/dev/null'), {
  username: 'livia',
  protocol: 'https:',
}).href === 'https://localserver/dev/null';

Object.assign(new URL('file://localserver/dev/null'), {
  protocol: 'https:',
  username: 'livia',
}).href === 'https://livia@localserver/dev/null';

If the goal is to have WHATWG URL-compliant url.format(urlObject) before any alternative is standardized, it will require to work around a lot of quirks since URL specs are not friendly to arbitrary protocol swaps.

@ljharb
Copy link
Member Author

ljharb commented Aug 29, 2023

It’s just a short way to do a string of assignments from an object.

@anonrig anonrig removed the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
whatwg-url Issues and PRs related to the WHATWG URL implementation.
Projects
None yet
Development

No branches or pull requests

10 participants