gh-101000: Add os.path.splitroot() #101002

Merged: 31 commits, Jan 27, 2023

Conversation

barneygale
Contributor

@barneygale barneygale commented Jan 12, 2023

This PR introduces os.path.splitroot(). See #101000 for motivation.

In ntpath, the implementation derives from splitdrive(). The splitdrive() function now calls splitroot() and returns (drive, root + tail). Other functions now call splitroot() rather than splitdrive(). In most cases this replaces their own parsing of path roots, and it avoids the extra stack frame that going through splitdrive() would add.
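
As a rough illustration of the relationship described above (not the code from the PR), splitdrive() can be expressed in terms of splitroot(). The helper name `splitdrive_via_splitroot` is hypothetical, and the demo is guarded because `ntpath.splitroot()` only exists on Python versions that include this change (3.12 and later).

```python
import ntpath


def splitdrive_via_splitroot(path):
    """Hypothetical helper: derive splitdrive()'s result from splitroot().

    Assumes ntpath.splitroot() is available (Python 3.12+).
    """
    drive, root, tail = ntpath.splitroot(path)
    return drive, root + tail


# Guarded demo: only runs where splitroot() exists.
if hasattr(ntpath, "splitroot"):
    for path in ("C:/Users/Sam", "//Server/Share/Users/Sam", "Users/Sam"):
        # The derived result should agree with the real splitdrive().
        assert splitdrive_via_splitroot(path) == ntpath.splitdrive(path)
        print(path, "->", ntpath.splitroot(path))
```

ntpath is importable on every platform, so the sketch can be tried on Linux or macOS as well as Windows.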

In posixpath, the normpath() function now calls splitroot() rather than parsing path roots itself.
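
For readers unfamiliar with POSIX root handling, here is a minimal str-only sketch of the splitting rule, written independently of the PR. The function name `posix_splitroot_sketch` is hypothetical; the real function also accepts bytes and os.PathLike objects. The rule: the drive is always empty, exactly two leading slashes form a distinct root, and three or more collapse to a single-slash root.

```python
def posix_splitroot_sketch(p):
    """Hypothetical str-only sketch of POSIX root splitting.

    Returns (drive, root, tail); drive is always '' on POSIX.
    """
    if p[:1] != "/":
        # Relative path: no root at all.
        return "", "", p
    if p[1:2] != "/" or p[2:3] == "/":
        # One leading slash, or three or more: root is a single slash.
        return "", "/", p[1:]
    # Exactly two leading slashes: kept as a distinct root.
    return "", "//", p[2:]


for path in ("/home/sam", "//home/sam", "///home/sam", "home/sam"):
    print(repr(path), "->", posix_splitroot_sketch(path))
```

In every case, concatenating the three returned parts reproduces the original path.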

In pathlib, path constructors now call splitroot() rather than using a slow OS-agnostic implementation. Performance:

$ ./python -m timeit -s 'from pathlib import PureWindowsPath' 'PureWindowsPath("C:/", "foo", "bar")'
50000 loops, best of 5: 6.04 usec per loop  # before
50000 loops, best of 5: 4.03 usec per loop  # after
$ ./python -m timeit -s 'from pathlib import PurePosixPath' 'PurePosixPath("/", "etc", "hosts")'
100000 loops, best of 5: 3.11 usec per loop  # before
100000 loops, best of 5: 2.77 usec per loop  # after
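
To make concrete what the benchmarked constructors have to parse, the short sketch below inspects the drive, root and parts of the same two paths. It uses only public pathlib attributes and says nothing about the internal implementation.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same constructor calls as in the timings above.
win = PureWindowsPath("C:/", "foo", "bar")
posix = PurePosixPath("/", "etc", "hosts")

# Each constructor must separate the drive and root from the remaining parts.
print(repr(win.drive), repr(win.root), win.parts)
print(repr(posix.drive), repr(posix.root), posix.parts)
```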

Future work:

  • Improve performance by using native nt._path_splitroot()

@barneygale barneygale marked this pull request as ready for review January 12, 2023 22:57
@barneygale barneygale requested a review from eryksun January 12, 2023 22:57

@h-vetinari h-vetinari left a comment

Drive-by comments from seeing this PR on discourse, feel free to disregard if I'm saying/asking something stupid.

@barneygale
Contributor Author

> Drive-by comments from seeing this PR on discourse, feel free to disregard if I'm saying/asking something stupid.

Thanks for the review! All good feedback I think!

Member

@AlexWaygood AlexWaygood left a comment

I'm persuaded that this is a good idea. Thanks for working on this!

Here's a docs review. Haven't got to looking at the implementation yet (will do soon).

@AlexWaygood AlexWaygood self-requested a review January 15, 2023 19:24
Co-authored-by: Eryk Sun <eryksun@gmail.com>
barneygale and others added 2 commits January 16, 2023 18:43
... and not belabour the fact that the empty string may be returned as
any/all items.
Member

@AlexWaygood AlexWaygood left a comment

I believe there are currently no tests that os.path.splitroot works with os.PathLike objects. Just trivial tests like this should do fine, but we should make sure it's tested:

def test_path_splitdrive(self):
    self._check_function(self.path.splitdrive)
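
A standalone sketch of what such a PathLike test could look like is shown below. This is an assumption about the shape of the test, not the test that was added to the CPython suite: `FakePath` and `SplitrootPathLikeTest` are hypothetical names, and the test class is skipped where `os.path.splitroot()` is unavailable (before Python 3.12).

```python
import os
import unittest


class FakePath:
    """Hypothetical minimal os.PathLike wrapper, for illustration only."""

    def __init__(self, path):
        self.path = path

    def __fspath__(self):
        return self.path


@unittest.skipUnless(hasattr(os.path, "splitroot"),
                     "needs os.path.splitroot() (Python 3.12+)")
class SplitrootPathLikeTest(unittest.TestCase):
    def test_path_splitroot(self):
        # A PathLike argument should give the same result as its str form.
        for raw in ("/etc/hosts", "etc/hosts", ""):
            with self.subTest(raw=raw):
                self.assertEqual(os.path.splitroot(FakePath(raw)),
                                 os.path.splitroot(raw))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SplitrootPathLikeTest)
result = unittest.TextTestRunner().run(suite)
```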

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Member

@AlexWaygood AlexWaygood left a comment

This is looking really good to me now, and I'm very close to hitting "approve". My only concern (other than my comment about the NEWS entry) is that I'm still not sure the test coverage is quite there. It looks like the tests for splitdrive() account for a lot of edge cases that aren't really tackled in the tests for splitroot() yet, e.g.

# Issue #19911: UNC part containing U+0130
self.assertEqual(ntpath.splitdrive('//conky/MOUNTPOİNT/foo/bar'),
                 ('//conky/MOUNTPOİNT', '/foo/bar'))

and

tester('ntpath.splitdrive("//?/VOLUME{00000000-0000-0000-0000-000000000000}/spam")',
       ('//?/VOLUME{00000000-0000-0000-0000-000000000000}', '/spam'))

It's true that, since splitdrive() now uses splitroot(), these edge cases are in some sense already covered: the tests for splitdrive() will start failing if a bug is later introduced to splitroot(). But it will be highly confusing if the tests for splitdrive() start failing while the tests for splitroot() all still pass, when the bug is actually in the implementation of splitroot().
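
One way to picture the direct coverage being asked for is to restate the two splitdrive() cases above as splitroot() expectations. This is a sketch, not the tests that were eventually committed: the `CASES` table is hypothetical, and the checks are guarded because `ntpath.splitroot()` requires Python 3.12 or later.

```python
import ntpath

# Hypothetical table: the splitdrive() edge cases above, restated for splitroot().
CASES = [
    # Issue #19911: UNC part containing U+0130
    ("//conky/MOUNTPOİNT/foo/bar",
     ("//conky/MOUNTPOİNT", "/", "foo/bar")),
    ("//?/VOLUME{00000000-0000-0000-0000-000000000000}/spam",
     ("//?/VOLUME{00000000-0000-0000-0000-000000000000}", "/", "spam")),
]

if hasattr(ntpath, "splitroot"):
    for path, expected in CASES:
        drive, root, tail = ntpath.splitroot(path)
        assert (drive, root, tail) == expected
        # splitdrive() should stay consistent with splitroot().
        assert ntpath.splitdrive(path) == (drive, root + tail)
```

With checks like these, a regression in splitroot() fails a splitroot() test directly instead of surfacing only through splitdrive().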

@barneygale
Contributor Author

Hm. I could rename test_splitdrive to test_splitroot and adjust all the test cases - would that address your concern? (I'd add a new set of tests for splitdrive() that would cover just the basics)

@AlexWaygood
Member

> Hm. I could rename test_splitdrive to test_splitroot and adjust all the test cases - would that address your concern? (I'd add a new set of tests for splitdrive() that would cover just the basics)

Yeah, I think that would make sense!

Member

@AlexWaygood AlexWaygood left a comment

Looks great to me. Thanks, as ever, for your patience and perseverance!

@eryksun, Any further comments from you? :)

@AlexWaygood AlexWaygood added the type-feature (A feature request or enhancement) and topic-pathlib labels Jan 23, 2023
@AlexWaygood
Member

(Planning to merge in a few days, unless @eryksun has any further feedback :)

@AlexWaygood AlexWaygood merged commit e5b08dd into python:main Jan 27, 2023
mdboom pushed a commit to mdboom/cpython that referenced this pull request Jan 31, 2023
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Labels
topic-pathlib type-feature A feature request or enhancement