Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch sys.path on a per-file basis #7357

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

DanielNoord
Copy link
Collaborator

  • Add yourself to CONTRIBUTORS if you are a new contributor.
  • Add a ChangeLog entry describing what your PR does.
  • If it's a new feature, or an important bug fix, add a What's New entry in
    doc/whatsnew/<current release.rst>.
  • Write a good description on what the PR does.

Type of Changes

Type
🐛 Bug fix

Description

Closes #7339

I agree with @gcbirzan-plutoflume that the updated test outputs seem like they are actually beter. The change itself also makes sense imo, as Python also fixes sys.path once. See the issue for some further discussion.

@DanielNoord DanielNoord added Bug 🪲 Needs backport Needs to be cherry-picked on the current patch version by a pylint's maintainer Import system labels Aug 25, 2022
@DanielNoord DanielNoord added this to the 2.15.1 milestone Aug 25, 2022
@coveralls
Copy link

coveralls commented Aug 25, 2022

Pull Request Test Coverage Report for Build 2948717043

  • 18 of 19 (94.74%) changed or added relevant lines in 3 files are covered.
  • 16 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.01%) to 95.318%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pylint/lint/utils.py 11 12 91.67%
Files with Coverage Reduction New Missed Lines %
pylint/checkers/variables.py 16 97.36%
Totals Coverage Status
Change from base Build 2926961300: 0.01%
Covered Lines: 16939
Relevant Lines: 17771

💛 - Coveralls

tests/test_functional.py Outdated Show resolved Hide resolved
pylint/lint/utils.py Show resolved Hide resolved
from pylint.lint.expand_modules import get_python_path


def get_python_path(filepath: str) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we cache this and/or fix_import_path now that we launch the function a lot ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix_import_path is an iterator that changes sys.path so I don't think we can cache that.

I'll cache get_python_path.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh never mind, I don't think get_python_path ever gets called with the same argument, as it is part of the exploration of modules and we don't "explore the same module twice".

Copy link
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this looks promising. Some trivial cosmetic things.

tests/test_functional.py Outdated Show resolved Hide resolved
tests/test_functional.py Outdated Show resolved Hide resolved
tests/lint/unittest_lint.py Outdated Show resolved Hide resolved
@jacobtylerwalls
Copy link
Member

@DanielNoord Have you tested this with namespace packages? Will it be a problem that is_namespace() is cached in astroid?

@DanielNoord
Copy link
Collaborator Author

@DanielNoord Have you tested this with namespace packages? Will it be a problem that is_namespace() is cached in astroid?

Hmm, I'm not sure. Do you think it should?

The primer failures show that this has significant performance regressions so I'm going to investigate (and fix) those before merging.

Co-authored-by: Jacob Walls <jacobtylerwalls@gmail.com>
@jacobtylerwalls
Copy link
Member

I guess I was wondering if we should be doing less caching: if a module name that was cached by astroid's is_namespace may or may not be a namespace depending on whether sys.path was patched in pylint.

@DanielNoord
Copy link
Collaborator Author

I guess I was wondering if we should be doing less caching: if a module name that was cached by astroid's is_namespace may or may not be a namespace depending on whether sys.path was patched in pylint.

If anything I think this should reduce the occurrence of this issue (the original issue describes a similar issue actually).

My thinking of this is as follows.
Take file A1, A2 and B1. A en B represent two packages which are both supplied on the command line, and after expand_modules A turns out to have two files.
Currently we would patch sys.path with both A and B, which is what is creating the issue as described in the original issue. As far as I can see (for now) whether or not A is a namespace package only depends on the file structure of package A. The patching of B onto sys.path should never correctly make A namespace and can only do so incorrectly as in the issue.
Since we know patch sys.path with A or A1/A2 if we can be even more specific (something,) vs. (file.filepath,) we never lint A1/A2 while patching B. This should avoid the previous issues.

The only thing I can think of that would be problematic is if package A imports package B before we lint package B and we determine during that import that B is namespace. However, I think that if we determine that it is namespace then it is likely correct: if you execute package A normally the interpreter also wouldn't magically patch package B onto your sys.path I think. So even if we cache some package as namespace before actually linting it is likely correct?

Not sure if I have explained myself enough here.

Also, lru_cache doesn't seem to work sadly 😢

@github-actions
Copy link
Contributor

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 008b348

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.15.1, 2.15.2 Sep 1, 2022
@DanielNoord
Copy link
Collaborator Author

Not really sure what to do with this PR. It's clear that creating the contextmanager for every file has significant performance impact, but I also can't think of a solution without doing so...

The contextmanager itself also isn't that large so there isn't really anything to refactor to make it quicker to set up..

@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.15.4 milestone Sep 16, 2022
@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.15.4, 2.15.5 Oct 10, 2022
@cdce8p cdce8p modified the milestones: 2.15.5, 2.15.6 Oct 21, 2022
@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.15.6, 2.15.7 Nov 14, 2022
@Pierre-Sassoulas Pierre-Sassoulas removed this from the 2.15.7 milestone Nov 23, 2022
@Pierre-Sassoulas Pierre-Sassoulas added the Needs decision 🔒 Needs a decision before implemention or rejection label Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug 🪲 Import system Needs backport Needs to be cherry-picked on the current patch version by a pylint's maintainer Needs decision 🔒 Needs a decision before implemention or rejection
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Modules of files specified on the command line can be cached with the wrong name
5 participants