We probably should not change the case of the domain #129

Closed
lemire opened this issue Jan 25, 2023 · 5 comments

Comments

@lemire
Member

lemire commented Jan 25, 2023

When you receive a domain name or label, you should preserve its case. The rationale for this choice is that we may someday need to add full binary domain names for new services; existing services would not be changed.

RFC 1034: https://www.rfc-editor.org/rfc/rfc1034

I do not find anything at https://url.spec.whatwg.org/#url-parsing saying that we should lowercase the domain. The spec refers to case-insensitive matching, which can be accomplished by lowercasing the strings for the comparison, but that is not the same thing as storing the lowercased version of the domain.

We should, similarly, check whether other strings that we manipulate should be stored with their case changed.
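
To illustrate the distinction between matching case-insensitively and storing a lowercased domain, here is a minimal Python sketch. It is not how this library works; `PreservedHost` and `ascii_lower` are hypothetical names invented for the example, and only ASCII letters are folded.

```python
from dataclasses import dataclass


def ascii_lower(text: str) -> str:
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


@dataclass(frozen=True)
class PreservedHost:
    """Hypothetical wrapper that keeps the host exactly as it was received."""

    original: str

    def matches(self, other: str) -> bool:
        # Compare lowercased copies; the stored value is never modified.
        return ascii_lower(self.original) == ascii_lower(other)


host = PreservedHost("www.eXample.cOm")
print(host.original)                    # the original spelling is still available
print(host.matches("WWW.EXAMPLE.COM"))  # comparison ignores ASCII case
```

Under this design, the lowercase form exists only for the duration of a comparison, which is the behavior RFC 1034 appears to recommend.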

@anonrig
Member

anonrig commented Feb 2, 2023

Is there a specific test case that's holding us back from resolving this issue? If there isn't, happy to resolve this. @lemire

@lemire
Member Author

lemire commented Feb 2, 2023

It depends on whether RFC 1034 is still considered valid.

@karwa

karwa commented Feb 9, 2023

Lowercasing domains is part of IDNA. It has also been a part of previous URL standards, so it is established practice. I suppose that means the specific advice from RFC 1034 is obsolete (I can't speak to whether the rest of it is still relevant).

@miguelteixeiraa
Contributor

Also:
https://github.com/web-platform-tests/wpt/blob/master/url/resources/IdnaTestV2.json#L212

  {
    "input": "www.eXample.cOm",
    "output": "www.example.com"
  }

So I think we are probably good changing the case of the domain.
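
As a rough check of what that test vector expects, here is a small Python sketch. It is only an approximation: `normalize_ascii_host` is a hypothetical helper that handles pure-ASCII hosts by lowercasing them, whereas real IDNA processing involves considerably more than case folding.

```python
import json

# The test vector quoted above, embedded as a JSON string.
WPT_CASE = json.loads('{"input": "www.eXample.cOm", "output": "www.example.com"}')


def normalize_ascii_host(host: str) -> str:
    """Lowercase a pure-ASCII host; reject anything that needs real IDNA handling."""
    if not host.isascii():
        raise ValueError("non-ASCII hosts are outside the scope of this sketch")
    return host.lower()


result = normalize_ascii_host(WPT_CASE["input"])
print(result == WPT_CASE["output"])
```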

@lemire
Member Author

lemire commented Feb 28, 2023

> So I think we are probably good changing the case of the domain.

Well. Yeah. I am going to close this issue, but the fact that the output is in lower case does not imply that we ought to discard the case.

Still, I think that we will discard it.
