Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve backend loading from backend-path #165

Merged
merged 14 commits into from
Aug 19, 2023

Conversation

abravalheri
Copy link
Contributor

@abravalheri abravalheri commented Feb 22, 2023

Closes #164
Closes #45

My understanding is that these changes are worth implementing (to increase robustness) even if pip improves its environment isolation.

Approach

Instead of changing sys.path and relying on importlib.import_module to do the right thing, this PR proposes to use importlib.machinery.PathFinder.find_spec to actively restrict the paths to be searched when looking for the backend.

Other details

Since the description of BackendUnavailable1 seems more appropriate than the description for BackendInvalid2 for the circumstances described in the #164, I modified (in a backwards compatible way) how the errors are reported. I think this makes sense with the rest of the alterations...

Footnotes

  1. docstring: Will be raised if the backend cannot be imported in the hook process.

  2. docstring: Will be raised if the backend is invalid.

@abravalheri abravalheri marked this pull request as ready for review February 22, 2023 23:35
@abravalheri abravalheri changed the title Improve backend loading from backend-path Improve/fix? backend loading from backend-path Feb 23, 2023
spec = _find_spec_in_path(module_name, pathitems)
module = module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will execute the module with sys.meta_path unmodified. Even if you fixed loading this particular module from the right place, modules it imports during execution will be found by the loaders in sys.meta_path, which is even more dangerous than loading the specified backend module from the wrong place IMO.

# Parent packages need to be imported to ensure everything comes from pathitems.

Exactly, but children and packages with a different top-level name too!

IMO the correct way to handle this is to simply prepend your PathFinder to sys.meta_path while we perform the build, and just import as normal, instead of this custom import hack 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the review, Filipe.

In f14e5df I tried to implement the suggested approach. Please let me know if I misunderstood your comment (I can always revert and try something different).

Copy link
Member

@FFY00 FFY00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @abravalheri!

@abravalheri
Copy link
Contributor Author

Hi @takluyver, I was wondering if you could review this PR (if you have the time) or share your thoughts about the issue.

I don't know if there would be time for a pyproject-hooks release before pip's next release (very unlikely, but who knows...).

Previously `_in_process` would only modify `sys.path`, use
`importlib.import_module` and hope the backend is loaded from the right
path.

This would lead to misleading error messages in some cases when the
environment isolation is not perfect.

This change modifies the behavior by using
`importlib.machinery.PathFinder.find_spec` to locate the backend module
within a set of path entries.
… when error regards import problems.
This should address review comments and increase the robustness of the
solution.
@abravalheri
Copy link
Contributor Author

@pradyunsg, I rebased the PR on top of the latest changes in main. Please let me know if any other changes are necessary.

@abravalheri
Copy link
Contributor Author

Hello @pypa/pyproject-hooks-committers, I wonder if this direction for solving the issue is something that the project would be interested, or if it would be better for me to close the PR and assume that the issue would be handled with a different approach.

@takluyver
Copy link
Member

I always find it tough to understand the various bits of custom import machinery, so I don't relish trying to understand either the problem or the solution here (and I'm too tired to tackle it right now, when I happen to have a few minutes to look at it).

I see you've added tests, and @FFY00 was happy with it, so I'll give people a few more days to make any comments, and if there are none, I'll trust you and merge this. Feel free to ping me again in a week or so if I forget (which is all too possible 😉 )

self.traceback = traceback
super().__init__(message or "Error while importing backend")


class BackendInvalid(Exception):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this exception, since this is no longer being caught and used?

Copy link
Contributor Author

@abravalheri abravalheri Aug 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the review @pradyunsg.

I was considering that removing BackendInvalid would cause backward compatibility problems (e.g. it seems to be in __init__.py's __all__, some folks might consider that public API, so it might break someone's script if they do a from pyproject_hooks import BackendInvalid).

I am happy to remove the class if the project is OK with taking the (potential) backwards incompatibility. Alternative, also happy to change the docstring to "deprecated" or something like that. What is your take on that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added 7bb74e2 in response to this review comment.
But I am also happy to completely remove the class or do something different.

Just let me know and I will implement the change.

Copy link
Member

@pradyunsg pradyunsg Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... I don't see anyone using this on GitHub so it seems reasonably safe to remove.

We can bump the major version since this is backwards-incompatible (and version numbers are cheap), and then we don't have to bother with having dead code in the codebase that no one relies on. The fix from the users' perspective is relatively straightforward here too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems to be in __init__.py's __all__, some folks might consider that public API,

FWIW, I'd be surprised if anyone doesn't consider it public API. 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problems, in a318f8e, I removed the unused exception class.

In the case a deprecation cycle is preferred we can just revert this commit.

@pradyunsg
Copy link
Member

Closes (potentially?) #45

This definitely closes that, no? I'll let you edit the description, but my reading is that it definitely does change this behaviour.

@abravalheri
Copy link
Contributor Author

This definitely closes that, no? I'll let you edit the description, but my reading is that it definitely does change this behaviour.

Perfect, I updated the description. Thank you very much for the confirmation @pradyunsg.

@pradyunsg pradyunsg changed the title Improve/fix? backend loading from backend-path Improve backend loading from backend-path Aug 8, 2023
@abravalheri
Copy link
Contributor Author

abravalheri commented Aug 18, 2023

I have updated the PR to take into consideration external feedback I received about PathFinder (i.e. to avoid using PathFinder with nested modules).

Luckily, importlib machinery is smart enough to find nested modules based on the parent's path.
I copied the if "." in fullname trick from editables.

@pradyunsg pradyunsg merged commit 084b02e into pypa:main Aug 19, 2023
18 checks passed
@pradyunsg
Copy link
Member

Thanks @abravalheri! ^.^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants