Modern installer eats a lot of RAM when installing big wheels #7983
Comments
Found the culprit: it is in validate_record. More specifically, it looks like the garbage collector can't keep up with self._zipfile.read(item). RAM usage increases while that line executes, but it also seems to drop considerably when I place a breakpoint on the next line. I'm unsure what the best fix is, although a quick workaround for Poetry would be to call
If I'm not mistaken, Poetry already computes a hash on each wheel record in a more memory-friendly way (whether that's validated is not yet clear to me), so it looks like it's executing the same operations twice. Setting My proposal is then to permanently set
I would think that pypa/installer can be taught not to eat all that RAM. It reads the files in the wheel for two reasons: to get their size and to calculate their hash.
The size can be read directly from the zipfile's ZipInfo, so there's no need to open the file at all for that. The hash can be calculated incrementally, so there's no need to read the whole file into memory.
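To illustrate the two points above, here is a minimal sketch using only the standard library. It is not pypa/installer's code: record_entry, build_demo_archive and CHUNK_SIZE are hypothetical names, and the hex digest is used only for readability. The size comes from ZipInfo.file_size, and the hash is fed one chunk at a time, so memory use stays around the chunk size regardless of how large the member is.

```python
import hashlib
import io
import zipfile

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size (1 MiB), not a value from installer


def record_entry(zf, name, chunk_size=CHUNK_SIZE):
    """Return (sha256 hex digest, uncompressed size) for one archive member."""
    info = zf.getinfo(name)
    digest = hashlib.sha256()
    # Stream the member instead of calling zf.read(name), which would
    # materialise the whole decompressed file in memory at once.
    with zf.open(info) as member:
        for chunk in iter(lambda: member.read(chunk_size), b""):
            digest.update(chunk)
    # The size is already stored in the archive's directory entry.
    return digest.hexdigest(), info.file_size


def build_demo_archive(payload):
    """Build a small in-memory zip standing in for a wheel (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("pkg/data.bin", payload)
    buf.seek(0)
    return buf
```

Usage would look like opening the wheel with zipfile.ZipFile and calling record_entry(zf, name) for each name in zf.namelist().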
It might be ok to turn it off until there is an improvement in
Interesting. IIUC, the calculation is done there to create a RECORD file from the content of the wheel (because we can't rely on the included RECORD file 😉). I suppose it's not validated. Either, we are using |
That's exactly what's happening. But I also confirm that the sha256 of files inside the archive is currently computed twice:
To get the fastest installation, I would keep hash computation disabled on validation and find a way to use the hashes computed during install to raise warnings. It should be possible by hacking around the
Let's prefer to try and propose improvements to pypa/installer that work for Poetry. That's playing a longer game, but it gets to a better place. E.g. perhaps it could expose an API for wheel installation that trusts the RECORD already in the wheel. That would seem a very reasonable thing to do for users who have validated that the RECORD is correct. (Perhaps the second calculation is nearly free anyway, given that it happens while already copying the bytes around?)
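For what "nearly free while copying" could look like, here is a rough sketch. It is an assumption about the general technique, not a description of how installer or Poetry actually writes files, and copy_and_hash is a hypothetical helper. Each chunk is hashed in the same loop that writes it to the destination, so the data is read only once.

```python
import hashlib
import io


def copy_and_hash(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks; return (sha256 hex digest, bytes copied)."""
    digest = hashlib.sha256()
    size = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Hash and write the same chunk, so no second pass over the data is needed.
        digest.update(chunk)
        dst.write(chunk)
        size += len(chunk)
    return digest.hexdigest(), size
```

Both src and dst only need read() and write(), so the source could be a member opened from the zipfile and the destination a file opened in binary mode.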
Issue
Poetry's modern installer uses a lot of RAM when installing big wheels, such as the PyTorch one (>1 GB) that I provided in the example. RAM usage starts increasing while Poetry is Installing... the wheel, and it spikes to over 10 GB during the operation. RAM usage goes back to acceptable values before the Installing... operation is done.
Deactivating the modern installer (poetry config installer.modern-installation false) fixes the problem. RAM usage does not go beyond 1 GB there.
I haven't debugged deeply enough to tell whether it's a Poetry issue or a pypa/installer issue, but Poetry is certainly affected by it. This is bad for CI environments, where resources are more limited, and it currently causes poetry install to go OOM.
cc #6409