Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reject non-IPv4 hostnames that end in numbers. #619

Merged
merged 9 commits into from
Aug 5, 2021

Conversation

MattMenke2
Copy link
Contributor

@MattMenke2 MattMenke2 commented Jul 8, 2021

This is aimed at addressing issue #560. If the last component of a URL's hostname is numeric, it's parsed as an IPv4 hostname, and if that fails, the URL's host is rejected. e.g., "foo.0", "bar.0.09", "a.1.2.0x.", "1.2.3.4.5" were all previously considered valid non-IPv4 hostnames, but are now all rejected. See #560 for more discussion on why allowing these are potentially concerning.

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

url.bs Outdated Show resolved Hide resolved
@MattMenke2 MattMenke2 changed the title Reject non-numeric hostnames that end in numbers. Reject non-IPv4 hostnames that end in numbers. Jul 9, 2021
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-editor review based on implementing jsdom/whatwg-url: LGTM with nits.

url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
domenic added a commit to jsdom/whatwg-url that referenced this pull request Jul 14, 2021
MattMenke2 and others added 3 commits July 14, 2021 18:25
Co-authored-by: Domenic Denicola <d@domenic.me>
Co-authored-by: Domenic Denicola <d@domenic.me>
Co-authored-by: Domenic Denicola <d@domenic.me>
url.bs Outdated Show resolved Hide resolved
Co-authored-by: Timothy Gu <timothygu99@gmail.com>
Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like something we should be doing in name of security. Copying @valenting to be sure, but I think I can say Mozilla is supportive of this effort.

url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
url.bs Outdated Show resolved Hide resolved
domenic pushed a commit to web-platform-tests/wpt that referenced this pull request Jul 26, 2021
Follows whatwg/url#619, and adds more coverage to this area generally.
@annevk
Copy link
Member

annevk commented Jul 26, 2021

The tests for this have merged. I'll push a fixup commit for the outstanding comments and merge this Wednesday in case folks have some late comments/concerns.

@annevk
Copy link
Member

annevk commented Jul 28, 2021

To be sure, @MattMenke2 do you agree with my changes here? Can you file browser bugs?

@MattMenke2
Copy link
Contributor Author

To be sure, @MattMenke2 do you agree with my changes here? Can you file browser bugs?

I agree with them, and I'm happy to file bugs, though not sure when I'll get to it.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Jul 30, 2021
…t end in numbers but are not IPv4 IPs, a=testonly

Automatic update from web-platform-tests
Reject hostnames that end in numbers but are not IPv4 IPs

Follows whatwg/url#619, and adds more coverage to this area generally.
--

wpt-commits: a85b7d62bd2788e36e6e83c043f86b3ba1e6a9ec
wpt-pr: 29666
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Jul 31, 2021
…t end in numbers but are not IPv4 IPs, a=testonly

Automatic update from web-platform-tests
Reject hostnames that end in numbers but are not IPv4 IPs

Follows whatwg/url#619, and adds more coverage to this area generally.
--

wpt-commits: a85b7d62bd2788e36e6e83c043f86b3ba1e6a9ec
wpt-pr: 29666
@valenting
Copy link
Collaborator

This seems like something we should be doing in name of security. Copying @valenting to be sure, but I think I can say Mozilla is supportive of this effort.

This seems like the right thing to do.
I have filed https://bugzilla.mozilla.org/show_bug.cgi?id=1723456 for the Firefox work.
Since it seems the WPT tests have already landed, I'll try to implement this as soon as possible.

jamienicol pushed a commit to jamienicol/gecko that referenced this pull request Aug 4, 2021
…t end in numbers but are not IPv4 IPs, a=testonly

Automatic update from web-platform-tests
Reject hostnames that end in numbers but are not IPv4 IPs

Follows whatwg/url#619, and adds more coverage to this area generally.
--

wpt-commits: a85b7d62bd2788e36e6e83c043f86b3ba1e6a9ec
wpt-pr: 29666
jamienicol pushed a commit to jamienicol/gecko that referenced this pull request Aug 4, 2021
…t end in numbers but are not IPv4 IPs, a=testonly

Automatic update from web-platform-tests
Reject hostnames that end in numbers but are not IPv4 IPs

Follows whatwg/url#619, and adds more coverage to this area generally.
--

wpt-commits: a85b7d62bd2788e36e6e83c043f86b3ba1e6a9ec
wpt-pr: 29666
@annevk
Copy link
Member

annevk commented Aug 5, 2021

I filed a WebKit bug, but cannot find a Chromium bug. @MattMenke2 @domenic did I miss something?

@MattMenke2
Copy link
Contributor Author

I filed a WebKit bug, but cannot find a Chromium bug. @MattMenke2 @domenic did I miss something?

https://crbug.com/1149194 is assigned to me and is technically due to this issue (and why I filed this issue), but it's not exactly obvious that it's the same issue. I've file another bug about this, and assigned it to me.

@annevk annevk merged commit ab0e820 into whatwg:main Aug 5, 2021
domenic added a commit to jsdom/whatwg-url that referenced this pull request Aug 5, 2021
@achristensen07
Copy link
Collaborator

Do we really want to make https://example.4294967295/ invalid but https://example.4294967296/ valid?

@MattMenke2
Copy link
Contributor Author

I'm not following - the spec doesn't seem to do that? If the last value parses as a number, we use the IPv4 parser, and if it fails, we reject the hostname. The number parser parses both 4294967295 and 4294967296. There's also the all ASCII digits fallback, but that's not needed here, anyways, since the IPv4 number parser doesn't have a limit of 2^32-1.

@achristensen07
Copy link
Collaborator

Oh, right. Sorry for the noise.

blueboxd pushed a commit to blueboxd/chromium-legacy that referenced this pull request Aug 26, 2021
The URL spec has been updated to parse any hostname ending with a
numeric component as IPv4 unconditionally, with no fallback to
considering it a domain name if it fails to parse as an IPv4 address.
See whatwg/url#619.

This CL updates the url/ parser accordingly.

Bug: 1237032
Change-Id: Ieb0ee3073f65931f97a690b0d852ab9761c328ae
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3111118
Reviewed-by: Ryan Sleevi <rsleevi@chromium.org>
Reviewed-by: Xinghui Lu <xinghuilu@chromium.org>
Reviewed-by: Chris Fredrickson <cfredric@chromium.org>
Reviewed-by: Mike West <mkwst@chromium.org>
Commit-Queue: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/main@{#915327}
webkit-commit-queue pushed a commit to WebKit/WebKit that referenced this pull request Sep 2, 2021
https://bugs.webkit.org/show_bug.cgi?id=228826

Patch by Alex Christensen <achristensen@webkit.org> on 2021-09-02
Reviewed by Tim Horton.

LayoutTests/imported/w3c:

* web-platform-tests/html/infrastructure/safe-passing-of-structured-data/structured-cloning-error-stack-optional.sub.window-expected.txt:
This test has its expectations change because, in our test infrastructure, it loads a url with host "www1.127.0.0.1" which this change now rejects.
When run from wpt.fyi this is not the case.
* web-platform-tests/url/a-element-expected.txt:
* web-platform-tests/url/a-element-origin-expected.txt:
* web-platform-tests/url/a-element-origin-xhtml-expected.txt:
* web-platform-tests/url/a-element-xhtml-expected.txt:
* web-platform-tests/url/failure-expected.txt:
* web-platform-tests/url/resources/urltestdata.json:
* web-platform-tests/url/url-constructor.any-expected.txt:
* web-platform-tests/url/url-constructor.any.worker-expected.txt:
* web-platform-tests/url/url-origin.any-expected.txt:
* web-platform-tests/url/url-origin.any.worker-expected.txt:

Source/WTF:

This implements a recent change to the URL specification at whatwg/url#619
Chrome has made the same change in https://crbug.com/1237032

Since there are no TLDs that are only numbers and some think it might be confusing to have a valid URL
like http://example.com.127.0.0.1/ we prevent URLs that end in a segment between dots that contains only
numbers from parsing successfully.

* wtf/URLParser.cpp:
(WTF::URLParser::parse):
(WTF::dnsNameEndsInNumber):
(WTF::URLParser::parseHostAndPort):
I give more information in the return type so one can tell what the code is doing.
We only check if it is valid or not, but for documentation purposes I think it's useful to return more information.
* wtf/URLParser.h:

Tools:

* TestWebKitAPI/Tests/WTF/URLParser.cpp:
(TestWebKitAPI::TEST_F):
Update expectations for a few strange URLs that are no longer valid.

LayoutTests:

* fast/url/ipv4-expected.txt:
* fast/url/ipv4.html:

Canonical link: https://commits.webkit.org/241270@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@281963 268f45cc-cd09-0410-ab3c-d52691b4dbfc
bertogg pushed a commit to Igalia/webkit that referenced this pull request Sep 6, 2021
https://bugs.webkit.org/show_bug.cgi?id=228826

Patch by Alex Christensen <achristensen@webkit.org> on 2021-09-02
Reviewed by Tim Horton.

LayoutTests/imported/w3c:

* web-platform-tests/html/infrastructure/safe-passing-of-structured-data/structured-cloning-error-stack-optional.sub.window-expected.txt:
This test has its expectations change because, in our test infrastructure, it loads a url with host "www1.127.0.0.1" which this change now rejects.
When run from wpt.fyi this is not the case.
* web-platform-tests/url/a-element-expected.txt:
* web-platform-tests/url/a-element-origin-expected.txt:
* web-platform-tests/url/a-element-origin-xhtml-expected.txt:
* web-platform-tests/url/a-element-xhtml-expected.txt:
* web-platform-tests/url/failure-expected.txt:
* web-platform-tests/url/resources/urltestdata.json:
* web-platform-tests/url/url-constructor.any-expected.txt:
* web-platform-tests/url/url-constructor.any.worker-expected.txt:
* web-platform-tests/url/url-origin.any-expected.txt:
* web-platform-tests/url/url-origin.any.worker-expected.txt:

Source/WTF:

This implements a recent change to the URL specification at whatwg/url#619
Chrome has made the same change in https://crbug.com/1237032

Since there are no TLDs that are only numbers and some think it might be confusing to have a valid URL
like http://example.com.127.0.0.1/ we prevent URLs that end in a segment between dots that contains only
numbers from parsing successfully.

* wtf/URLParser.cpp:
(WTF::URLParser::parse):
(WTF::dnsNameEndsInNumber):
(WTF::URLParser::parseHostAndPort):
I give more information in the return type so one can tell what the code is doing.
We only check if it is valid or not, but for documentation purposes I think it's useful to return more information.
* wtf/URLParser.h:

Tools:

* TestWebKitAPI/Tests/WTF/URLParser.cpp:
(TestWebKitAPI::TEST_F):
Update expectations for a few strange URLs that are no longer valid.

LayoutTests:

* fast/url/ipv4-expected.txt:
* fast/url/ipv4.html:

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@281963 268f45cc-cd09-0410-ab3c-d52691b4dbfc
mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this pull request Oct 14, 2022
The URL spec has been updated to parse any hostname ending with a
numeric component as IPv4 unconditionally, with no fallback to
considering it a domain name if it fails to parse as an IPv4 address.
See whatwg/url#619.

This CL updates the url/ parser accordingly.

Bug: 1237032
Change-Id: Ieb0ee3073f65931f97a690b0d852ab9761c328ae
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3111118
Reviewed-by: Ryan Sleevi <rsleevi@chromium.org>
Reviewed-by: Xinghui Lu <xinghuilu@chromium.org>
Reviewed-by: Chris Fredrickson <cfredric@chromium.org>
Reviewed-by: Mike West <mkwst@chromium.org>
Commit-Queue: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/main@{#915327}
NOKEYCHECK=True
GitOrigin-RevId: f21b724918b277919a898464926568d79d595016
@blattersturm
Copy link

blattersturm commented Oct 23, 2022

Why'd you folks do this without offering any means to disable this behavior?

This breaks compatibility irreparably for our software, see a forum post with a user noticing this being broken (in fact, due to GitHub's 'download ZIP' feature exporting folders ending with -0.1.0 etc.).

I genuinely keep being amazed by how web browser vendors love to break compatibility at random. From the issue you linked earlier, metrics from 'Chrome' don't contain use cases from downstream users of Chromium, nor do we have a way to report such metrics to 'Chrome' in the first place or manpower to track every single little change you people are making.

Not all of us are working for big US Silicon Valley tech firms like e.g. this @MattMenke2 person is, with their Google trillions backing every decision they make. In fact, why's the people 'approving' this (@domenic, for example) also working for the same company? I'm not seeing anyone in this discussion who's not working for a US tech firm with a 1T valuation.

@MattMenke2
Copy link
Contributor Author

Why'd you folks do this without offering any means to disable it?

This breaks compatibility irreparably for our software, see a forum post in this regard.

I genuinely keep being amazed by how web browser vendors love to break compatibility at random. From the issue you linked earlier, metrics from 'Chrome' don't contain use cases from downstream users of Chromium, nor do we have a way to report such metrics to 'Chrome' in the first place or manpower to track every single little change you people are making.

See issue #560, linked in the first post for more details. The specs around the host parser, the public suffix list, and the definition of site, are not very compatible in this case. This was a fix for potential security issues, due to https://foo.127.1 having an eTLD+1 of 127.1, and thus a schemeful site of https://127.1, which as a URL, maps to https://127.0.0.1/.

@blattersturm
Copy link

blattersturm commented Oct 23, 2022

The specs around the host parser, the public suffix list, and the definition of site, are not very compatible in this case.

Whatever this may mean, sure, but you should be aware web technologies are also often embedded as libraries in third-party applications, where these use cases may not necessarily apply. Breaking compatibility based only on 'web technology used in a web browser' is shortsighted.

The project I'm running is in no position to work around this, sadly, either, which puts us in a fairly difficult position, something I don't think you folks working for $1T+ companies can ever dream to imagine.

We'd continue backporting security fixes to versions of Chromium that aren't affected here except each time you guys fix a 0-day exploit the details of the issue seem to lead to 'access denied' pages as well for anyone not in your Big Tech Bubble so it's a bit difficult to do so other than by periodically bumping browser releases and then getting hit by whimsical changes like this which wouldn't have been an issue if a) GitHub didn't have these weird suffixed directories or b) we would've had perfect foresight or the funding to plan ahead and know that people would place raw directories obtained from GitHub and that web browsers would change URL parsing rules at random breaking it even when we did know the former would happen.

Paraphrasing the response to the end user report on our end to show the complexity involved here:

Possible strategies:

  1. revert Chromium commit: problematic (huge commit), effort, will make future upgrades more difficult. most viable option though, sigh.
  2. 'just accept it being broken': impossible, lots of resources that are downloaded from GH like this :/ even informing users won't work
  3. work around it by removing weird chars from resource name in NUI on client-side: would break concatenation
  4. work around it by removing weird chars from resource name on client side in general: would break weird 'anticheat scripts'
  5. work around it by removing weird chars from resource name server side: would require a server update to not break, and could still confuse users/code/.. that assume resource names aren't canonicalized :/

Edited a bit later:

Anyway, I'm sorry for barging in here like this and questioning this decision - been fairly stressed personally as of late and hitting blocking regressions with getting a way-delayed changeset out the door is kind of wearing me out at some point. I'll probably end up backing out the change to //url/ tomorrow and deal with having it be in our embedded Chromium patchset for the rest of this project's lifespan.

At least, I guess this can serve as a reminder of how little changes can still break downstream compatibility - see also the classical xkcd about 'every change breaks someone's workflow'. 😓

@MattMenke2
Copy link
Contributor Author

I'm sorry that this hit you, but end user security is our highest priority. We don't like breaking things for developers - I mean, we are human, and we don't like making folks (justifiably) unhappy. Breakages are sometimes inevitable, unfortunately, due to the cobbled-together nature of web standards, which often didn't take security into consideration, or overlook how features interact with each other.

In terms of modifying Chromium: Note that the commit that changed behavior had to completely refactor a method, because the new behavior required extra checking not present in the old code. So you could perhaps come up with a smaller change in terms of production code to make the new code behave like the old code (I think it's the "if (family != CanonHostInfo::IPV4)" block? It's pretty ugly code, unfortunately). The rest of the CL was tests, which, yes, you'd need to revert, if you run tests and want them to pass.

blattersturm added a commit to citizenfx/chromium that referenced this pull request Nov 1, 2022
The changes in whatwg/url#619 broke NUI URLs such as
`https://cfx-nui-dr-scratching-3.1.0/`, which is a compat break for our
use case.

Since we don't rely on URL display to end users as much as browsers do,
the security risks leading to this change do not apply here.

This both reverts the behavior from f21b724,
as well as 6998eb4 which is required to
not fail mojo deserialization on IP-style hosts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants