urllib.parse.splituser has no suitable replacement #80072

Open
jaraco opened this issue Feb 3, 2019 · 2 comments
Labels
3.8 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@jaraco
Member

jaraco commented Feb 3, 2019

BPO 35891
Nosy @jaraco

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-02-03.15:11:00.002>
labels = ['3.8', 'type-bug', 'library']
title = 'urllib.parse.splituser has no suitable replacement'
updated_at = <Date 2019-02-03.15:11:09.508>
user = 'https://github.com/jaraco'

bugs.python.org fields:

activity = <Date 2019-02-03.15:11:09.508>
actor = 'jaraco'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-02-03.15:11:00.002>
creator = 'jaraco'
dependencies = []
files = []
hgrepos = []
issue_num = 35891
keywords = []
message_count = 1.0
messages = ['334793']
nosy_count = 1.0
nosy_names = ['jaraco']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue35891'
versions = ['Python 3.8']

@jaraco
Member Author

jaraco commented Feb 3, 2019

The deprecation of splituser (bpo-27485) has the undesirable effect of leaving the programmer without a suitable alternative. The deprecation warning says to use urlparse instead, but urlparse doesn't provide access to the credential or address components of a URL.

Consider for example:

>>> import urllib.parse
>>> url = 'https://user:password@host:port/path'
>>> parsed = urllib.parse.urlparse(url)
>>> urllib.parse.splituser(parsed.netloc)
('user:password', 'host:port')

It's not readily obvious how one might get those two values, the credential and the address, from parsed. Sure, you can get username and password. You can get hostname and port. But if what you want is to remove the credential and keep the address, or extract the credential and pass it unchanged as a single string to something like an _encode_auth handler, that's no longer possible without some careful handling--because of possible None values, re-assembling a username/password into a colon-separated string is more complicated than simply doing a ':'.join.
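For illustration, here is a minimal sketch of recovering those two values from the netloc by hand. The helper name split_userinfo is hypothetical (it is not part of urllib.parse), and the example uses a numeric port so the URL is well formed:

```python
from urllib.parse import urlparse


def split_userinfo(netloc):
    """Hypothetical helper: split a netloc into (userinfo, address).

    userinfo is None when the netloc contains no '@'.
    """
    # rpartition splits on the last '@', so an '@' inside the
    # credential stays on the userinfo side.
    userinfo, have_info, address = netloc.rpartition('@')
    return (userinfo if have_info else None), address


parsed = urlparse('https://user:password@host:8080/path')
userinfo, address = split_userinfo(parsed.netloc)
print(userinfo)  # user:password
print(address)   # host:8080
```

This is exactly the kind of hand-rolled parsing that the issue argues urllib.parse should not leave to the programmer.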

This recommendation and limitation led to issues in production code and ultimately the inline adoption of the deprecated function, summarized here.

I believe if splituser is to be deprecated, the netloc should provide a suitable alternative - namely that a urlparse result should supply address and userinfo. Such functionality would make it easier to transition code that currently relies on splituser for more than to parse out the username and password.

Even better would be for the urlparse result to support _replace operations on these attributes... so that one wouldn't have to construct a netloc just to construct a URL that replaces only some portion of the netloc, so one could do something like:

>>> parsed = urllib.parse.urlparse(url)
>>> without_userinfo = parsed._replace(userinfo=None).geturl()
>>> alt_port = parsed._replace(port=443).geturl()

I realize that because of the nesting of abstractions (namedtuple for the main parts), that maybe this technique doesn't extend nicely, so maybe the netloc itself should provide this extensibility for a usage something like this:

>>> parsed = urllib.parse.urlparse(url)
>>> without_userinfo = parsed._replace(netloc=parsed.netloc._replace(userinfo=None)).geturl()
>>> alt_port = parsed._replace(netloc=parsed.netloc._replace(port=443)).geturl()

It's not as elegant, but likely simpler to implement, with netloc being extended with a _replace method to support replacing segments of itself (and still immutable)... and is dramatically less error-prone than the status quo without splituser.
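For comparison, the status quo can be sketched roughly as follows. Neither userinfo nor a netloc-level _replace exists today, so the only field that can be replaced is the whole netloc string. The helper name strip_userinfo and the example URL are assumptions made for illustration:

```python
from urllib.parse import urlparse


def strip_userinfo(url):
    """Hypothetical helper: return url with any 'user:password@' removed."""
    parsed = urlparse(url)
    # Everything after the last '@' is the address (host and optional port).
    address = parsed.netloc.rpartition('@')[2]
    # ParseResult is a namedtuple, so _replace works on the netloc field.
    return parsed._replace(netloc=address).geturl()


print(strip_userinfo('https://user:password@example.com:8080/path'))
# https://example.com:8080/path
```

Replacing only the port is harder to do this way, since the address then has to be split again with IPv6 brackets taken into account.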

In any case, I don't think it's suitable to leave it to the programmer to have to muddle around with their own URL parsing logic. urllib.parse should provide some help here.

@jaraco jaraco added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error 3.8 (EOL) end of life labels Feb 3, 2019
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@Avasam

Avasam commented Oct 30, 2024

_NetlocResultMixinStr (or the proper superclass) could have userinfo and hostinfo properties that are essentially just:

class _NetlocResultMixinStr(...):
    ...
    @property
    def userinfo(self):
        return ":".join([info for info in self._userinfo if info is not None])
    @property
    def hostinfo(self):
        return ":".join([info for info in self._hostinfo if info is not None])

    # or even

    @property
    def userinfo(self):
        if self.username:
            return self.username + (f":{self.password}" if self.password else "")
        return None
    @property
    def hostinfo(self):
        if self.hostname:
            return self.hostname + ("" if self.port is None else f":{self.port}")
        return None
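As a rough, runnable approximation of the second variant, the same logic can be written as free functions over the public attributes of a urlparse result. The names userinfo_of and hostinfo_of are hypothetical, and this is only a sketch: hostname is lowercased and loses IPv6 brackets, and an empty password is dropped, so the output is not guaranteed to match the original netloc text.

```python
from urllib.parse import urlparse


def userinfo_of(parsed):
    """Hypothetical: rebuild 'user[:password]' from a parse result, or None."""
    if parsed.username:
        return parsed.username + (f":{parsed.password}" if parsed.password else "")
    return None


def hostinfo_of(parsed):
    """Hypothetical: rebuild 'host[:port]' from a parse result, or None."""
    if parsed.hostname:
        return parsed.hostname + ("" if parsed.port is None else f":{parsed.port}")
    return None


parsed = urlparse("https://user:password@host:8080/path")
print(userinfo_of(parsed))  # user:password
print(hostinfo_of(parsed))  # host:8080
```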
