pip resolver backtracking deepness option #10235
Comments
To clarify - are you OK with pip failing after it hits the limit you specify? More precisely:
We've had similar suggestions in the past, but generally people aren't comfortable with one or more of the points noted above, so it's not actually as useful as it seems. Think of it this way. If a "number of backtracks" limit is useful to you, you can get something similar by killing pip after a certain period of time. If you can explain why that's not suitable, you'll probably have an answer as to why a backtrack limit option isn't useful either...
@pfmoore Let me describe what I feel most people want, and OP can correct me if I'm wrong.

Let's say Package A and Package B are specified in the requirements. Package A has 1000 possible versions (v1 to v1000) and Package B has 10 possible versions (v1 to v10). A valid install is Package A v998 with Package B v8, but for whatever reason pip currently backtracks through Package A first, so it checks Package A v1 to v1000 before backtracking through Package B and eventually finding the valid solution. This feels to the user like a waste of time, because they know pip doesn't need to check all 1000 versions of Package A.

The idea would then be: if I set a "maximum backtrack depth" of 5, the behavior changes so that pip only checks Package A from v1000 down to v996 and Package B from v10 down to v6. If a valid solution exists in that range, pip installs it; if not, pip reports an error that it couldn't install given the requirements and the limits on backtracking.

I'm not sure how easy something like that would be to implement in the current pip architecture, but that's what I think most people are asking for. FYI, conda's default package resolution does something similar automatically: it has a "current_repodata.json" containing the metadata for only the latest version of each package (a backtracking depth of 1), and if that fails it downloads the full "repodata.json" and backtracks on that (first using a restrictive search to limit the size of the possible dependency tree, then, if that fails, a more flexible search over all satisfiable solutions to the requirements).
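The capped-backtracking idea above can be sketched as a toy resolver. All names and data structures here are invented for illustration and have nothing to do with pip's internals: it tries versions newest-first and, when `max_candidates` is set, simply refuses to look past the newest N versions of any package.

```python
def resolve(universe, deps, max_candidates=None):
    """Toy backtracking resolver (illustration only, not pip's algorithm).

    universe: {package: [versions, newest first]}
    deps: {(package, version): {other_package: set of allowed versions}}
    Returns a {package: version} pin for every package, or None.
    """
    packages = list(universe)

    def backtrack(i, pinned):
        if i == len(packages):
            return pinned  # every package pinned consistently
        pkg = packages[i]
        candidates = universe[pkg]
        if max_candidates is not None:
            # The proposed "maximum backtrack depth": never look past
            # the newest N versions of any package.
            candidates = candidates[:max_candidates]
        for v in candidates:
            # Check the candidate against everything already pinned,
            # in both directions.
            ok = True
            for other, ov in pinned.items():
                allowed = deps.get((other, ov), {}).get(pkg)
                if allowed is not None and v not in allowed:
                    ok = False
                    break
                allowed = deps.get((pkg, v), {}).get(other)
                if allowed is not None and ov not in allowed:
                    ok = False
                    break
            if ok:
                result = backtrack(i + 1, {**pinned, pkg: v})
                if result is not None:
                    return result
        return None  # no candidate worked within the allowed range

    return backtrack(0, {})
```

On the scenario above (A v998 + B v8 is the only valid pair), a cap of 5 still finds the solution while a cap of 2 reports failure instead of searching the whole space.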
@pfmoore It should be added as an optional flag, so that I can use it when I know the package I'm about to install could be a mess of sub-dependencies. The current approach is not viable in many situations, and killing pip is not an alternative. The flag I'm describing would give more control over the installation procedure, avoiding many of the problems the new resolver causes with its endless attempts to reconcile incoherent requirements. In many cases those conflicts come from dependencies of dependencies whose requirements don't actually affect the correct functioning of the target package you want to install, because only very specific features of those dependencies are used, features untouched by the changes introduced in the problematic sub-dependency versions.
@notatallshaw Thanks for the explanation. I think you're asking for something different, which is an "only consider the latest N versions of any package" flag. That might be useful; I don't think that's something we've tried. It's probably something you could prototype using a proxy index, reading the PyPI pages and removing all but the latest N versions before forwarding them on. That might be a useful experiment (and might be helpful for people in general, even before anything gets implemented in pip).

@SMH17 I'm still not sure I follow precisely. You may want to research a bit more into how resolvelib works, as your description doesn't quite match the actual process (we don't do a breadth-first search, precisely, so it's not clear what "depth" would mean here).
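A minimal sketch of the filtering step such a proxy index could apply, assuming plain `X.Y[.Z]` version strings (real distribution filenames, pre-releases, and epochs would need proper version parsing, e.g. from the `packaging` library):

```python
def latest_n(versions, n):
    """Keep only the newest n of a project's version strings.

    Naive numeric sort: handles "1.2", "1.10.3", etc., but not
    pre-releases, local versions, or epochs.
    """
    def key(v):
        return tuple(int(part) for part in v.split("."))
    return sorted(versions, key=key, reverse=True)[:n]
```

A proxy would apply this to each project page before forwarding it on, so the resolver never even sees the older candidates.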
Can you be more precise? We know that extended install times aren't viable, but because dependency resolution is NP-hard, and because of the way Python packages store their dependencies (a problem conda avoids, to an extent), we can only really work in terms of trade-offs. We can't, much as we would like to, solve this for every use case; there will always be some that trigger worst-case behaviour.

Also, if you can clarify why killing pip isn't an alternative, I'd appreciate it. After all, any "stop after you've tried this many options" setting is just as much a way of killing pip before it completes as a "stop after 20 minutes" flag, or an external "run pip but kill after 20 minutes" script. Frankly, I don't see why any one is worse than another.
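The external "run pip but kill after 20 minutes" wrapper mentioned above can be as small as this standard-library sketch (the helper name is invented; the 20-minute figure is just the example from the discussion):

```python
import subprocess
import sys

def run_with_time_budget(cmd, timeout):
    """Run a command, killing it if it exceeds `timeout` seconds.

    Returns the command's exit code, or 1 if the budget was exceeded.
    """
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing
        # is left running at this point.
        print("command did not finish within the time budget; giving up")
        return 1

# e.g. give pip 20 minutes to resolve and install:
# run_with_time_budget(
#     [sys.executable, "-m", "pip", "install", "somepackage"],
#     timeout=20 * 60,
# )
```

On Unix the same effect is available without any script via coreutils' `timeout` command in front of the pip invocation.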
Well, yes. If the requirements are "incoherent", pip will have problems. So will any tool.
We can't do anything about that. If package A says it needs package B, we have to install B. We can't say "well, it looks like you're not using the part of A that needs B, so we'll just ignore that requirement". If you want finer-grained requirements, talk to the authors of A and get them to add extras that let you say you don't need B. But pip has to accept what you (and the package authors) say you need.
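For reference, the extras mechanism described here is something the package author declares in their metadata; a hypothetical `pyproject.toml` fragment (all project names invented) might look like:

```toml
[project]
name = "package-a"
dependencies = ["core-dep"]

[project.optional-dependencies]
# Users who need the B-backed features opt in with
#   pip install "package-a[b]"
# everyone else never pulls in package-b at all.
b = ["package-b"]
```

This moves the "do I actually need this sub-dependency?" decision to the one place it can be made reliably: the package's own metadata, rather than a resolver heuristic.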
The point is that when checking requirements, the deeper you go into the dependency tree of the package you want to install, the less likely it is that incoherent requirements affect the functions you actually need. So although a perfect solution to dependency incoherence is impossible, control over the depth of inspection could give a better compromise.
It sounds to me like you're somehow hoping that if we have a "deepness" flag, when we hit it, pip would still complete the install, rather than failing. I don't see how that would work. There may be a viable strategy in there, but I can't see what it is, sorry. If someone wants to flesh this out into an actual PR, maybe it would be clearer, but otherwise I'm still confused as to what you're proposing.
@pfmoore It's not about hoping for something. Incoherent dependency requirements are a very common situation. Right now you can either let the resolver cycle indefinitely for hours just to watch it fail to find a configuration that satisfies all constraints, or not let it try to resolve anything at all. That isn't the most sensible way to handle incoherent requirements.
Hmm, I don't think pip's going to do that (if I understand you). It sounds like a fundamental change in the assumptions and expectations on which pip is based. But I may be misunderstanding you. Feel free to provide code demonstrating what you mean (either as a pip PR or as a standalone utility/library) if you think it's an approach with a realistic chance of doing better than pip's current approach. I'm going to add the "awaiting PR" label to this issue to indicate that I think we need working code to make progress here, not just abstract design discussions.
The new pip backtracking resolver is not viable in many real scenarios and causes a huge waste of time. When you try to install a package, it tries to resolve each dependency the package requests, which leads to continuous attempts to find the most appropriate version, attempts that in many situations cannot succeed because of incoherent requirements between the package you want to install and the requirements of some dependency.

A flag is needed to specify the maximum level of backtracking the resolver is allowed to perform when trying to fix dependency conflicts.