[flake8-use-pathlib] Recommend Path.iterdir() over os.listdir() (PTH208) #14509

Merged
5 commits merged into astral-sh:main on Nov 27, 2024

Conversation

InSyncWithFoo
Contributor

Summary

Resolves #14490.

Test Plan

cargo nextest run and cargo insta test.

Contributor

github-actions bot commented Nov 21, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+52 -0 violations, +0 -0 fixes in 3 projects; 52 projects unchanged)

apache/airflow (+14 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:1134:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:2181:29: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/breeze/src/airflow_breeze/commands/sbom_commands.py:299:33: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/breeze/src/airflow_breeze/utils/cdxgen.py:142:22: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/check_files.py:204:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/check_files.py:217:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/check_files.py:230:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ docs/build_docs.py:457:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ providers/src/airflow/providers/amazon/aws/hooks/sagemaker.py:177:63: PTH208 Use `pathlib.Path.iterdir()` instead.
+ providers/tests/openlineage/plugins/test_execution.py:60:81: PTH208 Use `pathlib.Path.iterdir()` instead.
+ providers/tests/sftp/hooks/test_sftp.py:177:39: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/ci/pre_commit/version_heads_map.py:47:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests_common/test_utils/system_tests_class.py:103:17: PTH208 Use `pathlib.Path.iterdir()` instead.
... 1 additional changes omitted for project

bokeh/bokeh (+11 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ examples/server/app/simple_hdf5/main.py:19:28: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/command/subcommands/__init__.py:54:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/sphinxext/bokeh_gallery.py:134:18: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/sphinxext/bokeh_gallery.py:160:21: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/sphinxext/bokeh_releases.py:76:47: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/support/util/examples.py:186:33: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test___init___subcommands.py:48:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test_json__subcommands.py:111:54: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test_json__subcommands.py:123:50: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test_json__subcommands.py:135:50: PTH208 Use `pathlib.Path.iterdir()` instead.
... 1 additional changes omitted for project

zulip/zulip (+27 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ corporate/tests/test_stripe.py:137:18: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/run_hooks.py:63:38: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/setup_venv.py:165:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:143:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:308:21: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:351:27: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:370:64: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:657:26: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/documentation_crawler/documentation_crawler/spiders/check_help_documentation.py:34:29: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/lib/test_script.py:102:34: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/setup/generate_landing_page_images.py:29:21: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/setup/generate_zulip_bots_static_files.py:48:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/data_import/mattermost.py:879:8: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/data_import/slack.py:1431:8: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/data_import/slack.py:818:22: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/lib/sounds.py:10:22: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/compilemessages.py:98:23: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/convert_mattermost_data.py:61:12: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/convert_rocketchat_data.py:39:12: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/export.py:138:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/export_single_user.py:41:47: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_delete_unclaimed_attachments.py:75:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_delete_unclaimed_attachments.py:94:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_import_export.py:268:30: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_urls.py:37:20: PTH208 Use `pathlib.Path.iterdir()` instead.
... 2 additional changes omitted for project

Changes by rule (1 rules affected)

code    total  + violation  - violation  + fix  - fix
PTH208  52     52           0            0      0

@MichaReiser added the "rule" (Implementing or modifying a lint rule) and "preview" (Related to preview mode features) labels on Nov 22, 2024
@MichaReiser
Member

I noticed two common patterns when reviewing the ecosystem checks:

if os.listdir("dir"): ...

if "file" in os.listdir("dir"):

The first requires using len, list, or any because the object returned by Path.iterdir() is always truthy, even when the directory is empty.

The second mainly becomes more verbose. I'm interested in more opinions on whether we should exclude them. Wdyt @sbrugman?

https://github.com/apache/airflow/blob/440c224af5592f9007eef43d1dbe9025aa34e177/docs/build_docs.py#L456

https://github.com/bokeh/bokeh/blob/829b2a75c402d0d0abd7e37ff201fbdfd949d857/examples/server/app/simple_hdf5/main.py#L19
https://github.com/zulip/zulip/blob/65f05794ee59d638ad054ae6602d8ebc980fb637/scripts/lib/zulip_tools.py#L657
https://github.com/zulip/zulip/blob/65f05794ee59d638ad054ae6602d8ebc980fb637/zerver/data_import/mattermost.py#L879
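As an illustration of the two patterns above, here is a minimal sketch. The helper names (`is_nonempty_listdir`, `is_nonempty_naive`, `is_nonempty_any`, `contains_name`) are hypothetical and not part of Ruff or the linked projects; the sketch only shows why a mechanical swap of `os.listdir()` for `Path.iterdir()` changes behaviour in a truthiness check, and why the membership check gets wordier.

```python
import os
import tempfile
from pathlib import Path


def is_nonempty_listdir(directory: str) -> bool:
    # Original pattern: os.listdir() returns a list, and an empty list is falsy.
    return bool(os.listdir(directory))


def is_nonempty_naive(directory: str) -> bool:
    # Naive rewrite: iterdir() returns an iterator object, which is truthy
    # whether or not the directory has any entries.
    return bool(Path(directory).iterdir())


def is_nonempty_any(directory: str) -> bool:
    # One possible rewrite: any() stops after the first entry it sees.
    return any(Path(directory).iterdir())


def contains_name(directory: str, name: str) -> bool:
    # Second pattern: iterdir() yields Path objects rather than bare names,
    # so the membership test has to compare against .name.
    return any(entry.name == name for entry in Path(directory).iterdir())


with tempfile.TemporaryDirectory() as empty_dir:
    results = (
        is_nonempty_listdir(empty_dir),
        is_nonempty_naive(empty_dir),
        is_nonempty_any(empty_dir),
        contains_name(empty_dir, "file"),
    )
```

On an empty directory only the naive rewrite reports `True`, which is the behaviour change a future autofix would have to avoid.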

@InSyncWithFoo
Contributor Author

The first requires using len, list, or any

To be pedantic, len() can't be used on an iterator, so only the other two should be suggested in that case.
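A small sketch of that point, using a temporary directory and a made-up file name (`a.txt`) purely for illustration: calling `len()` on the result of `iterdir()` raises `TypeError`, while `any()` and `list()` both accept it.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    (directory / "a.txt").touch()

    # len() needs an object with __len__, which the iterator lacks.
    try:
        len(directory.iterdir())
        len_error = None
    except TypeError as exc:
        len_error = exc

    # any() works directly on the iterator.
    has_entries = any(directory.iterdir())

    # list() materialises the entries first, after which len() is fine.
    entry_count = len(list(directory.iterdir()))
```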

@sbrugman
Contributor

sbrugman commented Nov 25, 2024

The pathlib rules should flag all os.path cases. When these rules are active I assume users made the decision to favour pathlib over os.path, and partially excluding some examples will be unexpected.

It could be good to already include these cases in the tests to make sure they are covered when autofix is implemented later. The complexity here is in the fix, not in the detection of the violation.

Going over the ecosystem results I realise that os.scandir should also be flagged (and is closer to Path.iterdir as it produces a generator). @InSyncWithFoo it's probably worth adding this as PTH209. The non-trivial fixes stem from unidiomatic use of os.path in the first place imo.

Using listdir for checking that a file does not exist is even a candidate for its own rule, as this first lists all files in a directory and then only checks one:

'demo_data.hdf5' not in os.listdir(app_dir)

Idiomatic os.path solution:

not os.path.exists(os.path.join(app_dir, 'demo_data.hdf5'))

Pathlib equivalent:

not (app_dir / "demo_data.hdf5").exists()

Checking if a directory is empty with os.listdir is also wasteful:

not os.listdir(api_dir)

Users should probably write something like:

next(os.scandir(api_dir), None) is None

Pathlib equivalent:

next(api_dir.iterdir(), None) is None
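To sanity-check the rewrites above, here is a runnable sketch that compares the `os.listdir()` forms with the pathlib forms on a temporary directory. The helper names `is_missing` and `is_empty` are hypothetical, and the directory is created only for the demonstration.

```python
import os
import tempfile
from pathlib import Path


def is_missing(app_dir: Path, name: str) -> bool:
    # Direct existence check; no directory listing is built.
    return not (app_dir / name).exists()


def is_empty(api_dir: Path) -> bool:
    # Pull at most one entry; the None default means there was nothing to pull.
    return next(api_dir.iterdir(), None) is None


with tempfile.TemporaryDirectory() as tmp:
    app_dir = Path(tmp)

    # Empty directory: both styles should agree.
    listdir_before = ("demo_data.hdf5" not in os.listdir(app_dir), not os.listdir(app_dir))
    pathlib_before = (is_missing(app_dir, "demo_data.hdf5"), is_empty(app_dir))

    (app_dir / "demo_data.hdf5").touch()

    # After creating the file: both styles should agree again.
    listdir_after = ("demo_data.hdf5" not in os.listdir(app_dir), not os.listdir(app_dir))
    pathlib_after = (is_missing(app_dir, "demo_data.hdf5"), is_empty(app_dir))
```

One caveat worth keeping in mind: the forms are not strictly equivalent for broken symlinks, since `os.listdir()` lists the link while `exists()` follows it and returns `False`.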

@InSyncWithFoo
Contributor Author

it's probably worth adding this as PTH209.

Got it. I'll get to that the day after tomorrow.

@InSyncWithFoo
Contributor Author

InSyncWithFoo commented Nov 25, 2024

@sbrugman On second thought... should PTH208 also check for os.scandir? These two seem sufficiently similar in both functionality and rule implementation.

@sbrugman
Contributor

No, adding a separate rule is consistent with other rules:

https://docs.astral.sh/ruff/rules/os-remove/
https://docs.astral.sh/ruff/rules/os-unlink/

The distinction could be useful for users who would like to fix os.scandir (since its replacement is also a generator and requires little effort), but not os.listdir for instance.
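To make the difference in migration effort concrete, here is a rough sketch with hypothetical helper names. An `os.scandir()` loop already iterates over entry objects with a `.name` attribute, so it maps onto `Path.iterdir()` almost line for line, whereas `os.listdir()` callers work with a list of bare strings and may rely on list behaviour elsewhere.

```python
import os
import tempfile
from pathlib import Path


def names_via_listdir(directory: str) -> list[str]:
    # os.listdir() eagerly builds a list of bare entry names (str).
    return sorted(os.listdir(directory))


def names_via_scandir(directory: str) -> list[str]:
    # os.scandir() returns an iterator of os.DirEntry objects;
    # the with-block closes it once the loop is done.
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries)


def names_via_iterdir(directory: str) -> list[str]:
    # Path.iterdir() yields Path objects; .name recovers the bare name.
    return sorted(entry.name for entry in Path(directory).iterdir())


with tempfile.TemporaryDirectory() as tmp:
    for filename in ("a.txt", "b.txt"):
        (Path(tmp) / filename).touch()

    listed = names_via_listdir(tmp)
    scanned = names_via_scandir(tmp)
    iterated = names_via_iterdir(tmp)
```

All three helpers return the same names here; the difference lies in what each call hands back to the caller.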

Member

@MichaReiser MichaReiser left a comment


Thanks @sbrugman. That makes sense to me. @InSyncWithFoo happy to merge this rule once we've added tests demonstrating the patterns linked in #14509 (comment)

@InSyncWithFoo
Contributor Author

@MichaReiser Done. I added them to the rule's documentation as well.

@MichaReiser enabled auto-merge (squash) on November 27, 2024 09:50
@MichaReiser merged commit 187974e into astral-sh:main on Nov 27, 2024
20 checks passed
@InSyncWithFoo deleted the PTH208 branch on November 27, 2024 09:57
Development

Successfully merging this pull request may close these issues.

New flake8-pathlib rule: os.listdir (PTH208)
3 participants