Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-90208: Suppress OSError exceptions from pathlib.Path.glob() #104141

Merged

Conversation

barneygale
Copy link
Contributor

@barneygale barneygale commented May 3, 2023

pathlib.Path.glob() now suppresses all OSError exceptions, except those raised from calling is_dir() on the top-level path.

Previously, glob() suppressed ENOENT, ENOTDIR, EBADF and ELOOP errors and their Windows equivalents. PermissionError was also suppressed unless it occurred when calling is_dir() on the top-level path. However, the selector would abort prematurely if a PermissionError was raised, and so glob() could return incomplete results.

Rationale

Globbing is a shell feature, per the glob module docs:

The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell

When implemented by shells, globbing matches as much as possible without aborting.

In fact, other Python functions that walk a directory tree usually suppress errors, e.g.:

  • glob.glob()
  • os.walk() and os.fwalk() (configurable)
  • pathlib.Path.walk() (configurable)

The rationale is the same: an error from one file/directory isn't fatal for the walk, and we can continue walking elsewhere.

Finally, as noted above, pathlib.Path.glob() already ignores the most common kinds of OSError. This shows some recognition of the above rationale, though incomplete.

`pathlib.Path.glob()` now suppresses all OSError exceptions, except
those raised from calling `is_dir()` on the top-level path.

Previously, `glob()` suppressed ENOENT, ENOTDIR, EBADF and ELOOP
errors and their Windows equivalents. PermissionError was also
suppressed unless it occurred when calling `is_dir()` on the
top-level path. However, the selector would abort prematurely
if a PermissionError was raised, and so `glob()` could return
incomplete results.
@barneygale
Copy link
Contributor Author

FAO @AlexWaygood, who pointed out the messy exception handling and poor test coverage in #104103

@barneygale barneygale changed the title GH-90208 - Suppress OSError exceptions from pathlib.Path.glob() GH-90208: Suppress OSError exceptions from pathlib.Path.glob() May 4, 2023
@AlexWaygood
Copy link
Member

AlexWaygood commented May 6, 2023

(I'm not 100% confident I understand all the subtleties here, so I'd rather leave this one for Eryk or Steve to review, if that's okay!)

@barneygale
Copy link
Contributor Author

Marking as a draft because it would be good to land #104292 first.

@barneygale barneygale marked this pull request as ready for review May 10, 2023 17:21
@barneygale
Copy link
Contributor Author

@JelleZijlstra would you be interested in reviewing this? Thank you!

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale looks solid, but this is enough of a behavior change that I'd probably not backport.

Your description made me ask why we ignore only that specific set of exceptions (ENOENT, ENOTDIR, EBADF, ELOOP). It looks like that set is used throughout pathlib in places where it makes more sense than here, so there may not have been a strong intent to ignore only those specific errors for globs.

@barneygale
Copy link
Contributor Author

Thanks! I agree on not backporting.

@barneygale barneygale merged commit 94f30c7 into python:main May 11, 2023
carljm added a commit to carljm/cpython that referenced this pull request May 11, 2023
* main: (27 commits)
  pythongh-87849: fix SEND specialization family definition (pythonGH-104268)
  pythongh-101819: Adapt _io.IOBase.seek and _io.IOBase.truncate to Argument Clinic (python#104384)
  pythongh-101819: Adapt _io._Buffered* methods to Argument Clinic (python#104367)
  pythongh-101819: Refactor `_io` futher in preparation for module isolation (python#104369)
  pythongh-101819: Adapt _io.TextIOBase methods to Argument Clinic (python#104383)
  pythongh-101117: Improve accuracy of sqlite3.Cursor.rowcount docs (python#104287)
  pythonGH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (python#92185)
  pythongh-104357: fix inlined comprehensions that close over iteration var (python#104368)
  pythonGH-90208: Suppress OSError exceptions from `pathlib.Path.glob()` (pythonGH-104141)
  pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182)
  pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364)
  pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539)
  pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292)
  pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202)
  pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013)
  pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355)
  pythongh-103960: Dark mode: invert image brightness (python#103983)
  pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253)
  pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354)
  pythongh-101819: Harden _io init (python#104352)
  ...
carljm added a commit to carljm/cpython that referenced this pull request May 11, 2023
* main:
  pythongh-87849: fix SEND specialization family definition (pythonGH-104268)
  pythongh-101819: Adapt _io.IOBase.seek and _io.IOBase.truncate to Argument Clinic (python#104384)
  pythongh-101819: Adapt _io._Buffered* methods to Argument Clinic (python#104367)
  pythongh-101819: Refactor `_io` futher in preparation for module isolation (python#104369)
  pythongh-101819: Adapt _io.TextIOBase methods to Argument Clinic (python#104383)
  pythongh-101117: Improve accuracy of sqlite3.Cursor.rowcount docs (python#104287)
  pythonGH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (python#92185)
  pythongh-104357: fix inlined comprehensions that close over iteration var (python#104368)
  pythonGH-90208: Suppress OSError exceptions from `pathlib.Path.glob()` (pythonGH-104141)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants