Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of data files when creating venvs. #1632

Merged
merged 6 commits into from
Feb 28, 2022

Conversation

jsirois
Copy link
Member

@jsirois jsirois commented Feb 28, 2022

Previously, Pex failed to install distributions containing data files
correctly. The data files were located under site-packages (which is
where package data, aka resources, belong) instead of at the
sys.prefix as mandated by distutils.

See: https://docs.python.org/3/distutils/setupscript.html#distutils-additional-files

Fixes #1630

This allows vendoring to perform the same fixups we do for resolves and
leave vendored distributions with whole metadata.
This uses the RECORD as the basis for determining which files to copy
and where they belong, fixing the issues with data files not being
properly located.
@jsirois
Copy link
Member Author

jsirois commented Feb 28, 2022

Thanks in advance - this is large.

It's not quite as large as it looks though. 1/2 the delta-lines are taken up by the re-vendor.

buffer.write(cast(bytes, line))
first_non_shebang_line = None

def _fixup_record(self, modified_scripts=None):
Copy link
Member Author

@jsirois jsirois Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified_scripts = list(self._fixup_scripts())
self._fixup_record(modified_scripts=modified_scripts)

def _fixup_scripts(self):
Copy link
Member Author

@jsirois jsirois Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# https://github.com/pypa/pip/issues/1150
#
# We also deal with this, accepting either `-` or `_` when a `-` is expected.
version_pattern = re.sub(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

N.B.: An existing unit test and another IT that both started failing drove / forced this fix.


# The RECORD is a csv file with the path to each installed file in the 1st column.
# See: https://www.python.org/dev/peps/pep-0376/#record
for line, (path, fingerprint, file_size) in enumerate(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the janky state of some PyPI artifacts, I'm guessing malformed RECORD files may exist, and this unpacking may fail? Possibly worth checking the length of the parsed line before attempting to unpack and skipping instead of failing on any messed up lines?

Copy link
Member Author

@jsirois jsirois Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thankfully Pex is immune from all but vendored Pip`s quirks here because Pex uses it to do the install; i.e. vendored Pip writes the RECORD.

"""Represents the Pip installation scheme used for installing a wheel.

For more about installation schemes, see:
https://docs.python.org/3/install/index.html#alternate-installation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right link to put here? The link refers to distutils but these schemes are pip flags, so that link is a bit confusing (to me at least) in this context. E.g., naively, I don't see --root or --target mentioned there but I do see --home ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the definition at any rate. Pip comes up with custom schemes but those are only defined in code.

"""Re-installs the installed wheel in a venv.

N.B.: A record of reinstalled files is returned in the form of an iterator that must be
consumed to drive the installation to completetion.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
consumed to drive the installation to completetion.
consumed to drive the installation to completion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to get this in a follow-up that touches this file. I routinely pass on typo fixes in non-public docs since it burns trees and the proof the typo is not severe is already there in the fact it could be understood enough to be corrected.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think typos are worth a CI round, but can be incidentally fixed in other changes.

@jsirois
Copy link
Member Author

jsirois commented Feb 28, 2022

Thanks Benjy. I'll be circling back here as part of #1631 and I'll address your typo feedback and any further feedback from others there or separately.

@jsirois jsirois merged commit 70a2cad into pex-tool:main Feb 28, 2022
@jsirois jsirois deleted the issues/1630 branch February 28, 2022 06:01
@jsirois jsirois mentioned this pull request Feb 28, 2022
2 tasks
jsirois added a commit to jsirois/pex that referenced this pull request Mar 1, 2022
The fix in pex-tool#1632 introduced a regression for the narrow case of `--venv`
mode PEXes using `--no-venv-site-packages-copies` (symlinks) and running
with a `PEX_PATH` including other PEXes with duplicate distributions
where those distributions provide only top-level modules (as opposed to
packages). Fix this issue in `Record.reinstall` and also fix duplicate
handling in `PEX.resolve` and `PEXEnvironment.activate`. Add tests that
cover all of this.

Fixes pex-tool#1637
jsirois added a commit that referenced this pull request Mar 1, 2022
The fix in #1632 introduced a regression for the narrow case of `--venv`
mode PEXes using `--no-venv-site-packages-copies` (symlinks) and running
with a `PEX_PATH` including other PEXes with duplicate distributions
where those distributions provide only top-level modules (as opposed to
packages). Fix this issue in `Record.reinstall` and also fix duplicate
handling in `PEX.resolve` and `PEXEnvironment.activate`. Add tests that
cover all of this.

Fixes #1637
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pex fails to layout venvs properly.
2 participants