Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch Pex installed wheels to --prefix scheme. #1661

Merged
merged 5 commits into from
Mar 12, 2022

Conversation

jsirois
Copy link
Member

@jsirois jsirois commented Mar 10, 2022

Previously Pex used Pip's --target scheme which had both known bugs
(pypa/pip#7658) and unknown quirks that Pex
was failing to fully be able to work around. Switch to the --prefix
scheme which exactly mirrors the scheme venvs use so that venvs can be
created with content of all sorts placed where it belongs.

This removes fragile parsing and interpretation of the RECORD; now Pex
only creates a RECORD, which is much more straight forward, when
building a venv.

Partially addresses #1655 by switching to sha256 for all external
artifact hashing. Only internal hashing remains for:

  1. interpreters / INTERP-INFO
  2. venvs and unzipped_pexes / PEX-INFO pex_hash (but this is a hash
    that includes all distributions' sha256 hashes).

Fixes #1656
Closes #1662

@jsirois
Copy link
Member Author

jsirois commented Mar 10, 2022

The existing tests proved an excellent gauntlet for all the shuffling in pex/pep_376.py. That said, a few new helpers deserve their own unit tests which I'll be adding here before un-drafting. I wanted to get a CI burn. All 18 test shards (9 IT, 9 Unit) pass on my machine, but there's always macOS and I'd like to suss things out before bugging folks to look at this fairly huge PR.

jsirois added 2 commits March 10, 2022 16:46
Now that PEXes only run in unzipped mode and venv mode, they always
will be isolating from loose sources and the resources copy mechanism
is no longer needed.
Previously Pex used Pip's `--target` scheme which had both known bugs
(pypa/pip#7658) and unknown quirks that Pex
was failing to fully be able to work around. Switch to the `--prefix`
scheme which exactly mirrors the scheme venvs use so that venvs can be
created with content of all sorts placed where it belongs.

This removes fragile parsing and interpretation of the RECORD; now Pex
only creates a RECORD, which is much more straight forward, when
building a venv.

Fixes pex-tool#1656
@jsirois jsirois marked this pull request as ready for review March 11, 2022 21:03
@jsirois
Copy link
Member Author

jsirois commented Mar 11, 2022

Alright - this is good to go for review. Thanks in advance - it's big.

# a custom scheme - we just delete the file and create it on-demand for venv re-installs.
os.unlink(os.path.join(self.prefix_dir, self.rel_base_dir, self.relative_path))

# An example of the installed wheel chroot we're aiming for:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of path gymnastics in this PR. They all center around this final layout. The pip install --prefix scheme is the same as a venv layout:

venv/
  bin/
  include/
  lib/
    pythonX.Y/
      site-packages/
  <data>

The main trick is to use that scheme, since its non-buggy unlike pip install --target, but preserve the zipapp mode semantics by moving all the site-packages to the root and shifting everything remaining under .prefix since dotted names are never Python importable. At venv creation time this re-shuffle is reversed and a valid RECORD written out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I think this would be helpful to include in the comments, including the trick about .prefix being non-importable

).decode("utf-8")


def test_old_venv_tool_vs_new_pex(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test passes before and after this change.

assert b"4.0\n" == subprocess.check_output(args=[os.path.join(venv, "pex"), "--version"])


def test_new_venv_tool_vs_old_pex(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test requires pex.venv.pex falling back when it fails to InstalledWheel.load(...) fails. Fails without that.

assert b"4.0\n" == subprocess.check_output(args=[os.path.join(venv, "pex"), "--version"])


def test_mixed_pex_root(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails without the switch from using sha1 hashes for ~/.pex/installed_wheels to sha256 hashes.

@jsirois
Copy link
Member Author

jsirois commented Mar 11, 2022

N.B..: The only incompatibility here is forward compatiblity for pex-tools venv for versions 2.1.{68,69,70} operating against newer PEXes. There is simply no way to patch that up. All other forward and backward compatibility matrix items are good to go.

@@ -52,6 +53,30 @@ class CompatibilityTags(object):
See: https://www.python.org/dev/peps/pep-0425/#use
"""

@classmethod
def from_wheel(cls, wheel):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, meant to unit test this. I'll hit that in a follow up unless there is other feedback to address since its all heavily IT'd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, enough good feedback to make this worth adding in too.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's big, but as always, well written and tested. Thank you for diving into the weeds

pex/resolver.py Show resolved Hide resolved
pex/pep_376.py Show resolved Hide resolved
pex/pep_376.py Outdated Show resolved Hide resolved
pex/pep_376.py Show resolved Hide resolved
pex/pep_376.py Outdated Show resolved Hide resolved
pex/pep_376.py Show resolved Hide resolved
pex/pep_376.py Outdated Show resolved Hide resolved
Comment on lines +359 to +370
# We only try to link regular files since linking a symlink on Linux can
# produce another symlink, which leaves open the possibility the src_entry
# target could later go missing leaving the dst_entry dangling.
if link and not os.path.islink(src_entry):
try:
os.link(src_entry, dst_entry)
continue
except OSError as e:
if e.errno != errno.EXDEV:
raise e
link = False
shutil.copy(src_entry, dst_entry)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use a helper function for this highlighted code block since it's duplicated with the above. Ack on the link variable being in an outer scope; could maybe have the helper return a bool, then caller updates link based on that.

(FYI these two functions have been the hardest for me to understand - symlinks in general trip me up still.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't understand any of this at all! I run this stuff and poke around in the ~/.pex/ root to view the structure and figure out what's wrong, fix it up. It's amazingly complex to browse, but very instructive. Especially a --venv mode ~/.pex/venvs/ ... structure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see about the factoring. It bothers me too, but its also super-titchy and I spend hours fixing this up after tweaks. Luckily tests bleed volumes when any of this is wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its also super-titchy

Definitely. I at first was going to propose DRYing the rest like the finally structure, but that seemed too complex. These 10 lines looked like the only place that is identical in both functions.

I'll see about the factoring.

Sg. My comment is not blocking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to defer this to a follow up.

# a custom scheme - we just delete the file and create it on-demand for venv re-installs.
os.unlink(os.path.join(self.prefix_dir, self.rel_base_dir, self.relative_path))

# An example of the installed wheel chroot we're aiming for:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I think this would be helpful to include in the comments, including the trick about .prefix being non-importable

@jsirois
Copy link
Member Author

jsirois commented Mar 11, 2022

Makes sense. I think this would be helpful to include in the comments, including the trick about .prefix being non-importable

I guess. This is a very old and very central trick for PEX though! Have you not wondered why .bootstrap and .deps were named as they are?!

@Eric-Arellano
Copy link
Contributor

Have you not wondered why .bootstrap and .deps were named as they are?!

Ah! Wow. I figured it had something to do with being private, I did not make the connection that it disables Python imports.

@jsirois jsirois merged commit d5eaaac into pex-tool:main Mar 12, 2022
@jsirois jsirois deleted the issues/1656 branch March 12, 2022 01:03
jsirois added a commit to jsirois/pex that referenced this pull request Mar 22, 2022
The switch from sha1 to sha256 hashes in pex-tool#1661 did not cover this
feature. Simplify and use the existing machinery to get at a PEX's
distributions.

Fixes pex-tool#1683
@jsirois jsirois mentioned this pull request Mar 22, 2022
jsirois added a commit that referenced this pull request Mar 22, 2022
The switch from sha1 to sha256 hashes in #1661 did not cover this
feature. Simplify and use the existing machinery to get at a PEX's
distributions.

Fixes #1683
jsirois added a commit to jsirois/pex that referenced this pull request Nov 10, 2022
This was all accounted for with `--no-build-isolation` and tox venv
pinning with an extensive comment about all this to boot. For some
reason unclear to me, I undid `--no-build-isolation` in
pex-tool#1661 though.

This just re-instates that flag and re-runs vendoring to get back to
stable wheel versions.
jsirois added a commit that referenced this pull request Nov 10, 2022
This was all accounted for with `--no-build-isolation` and tox venv
pinning with an extensive comment about all this to boot. For some
reason unclear to me, I undid `--no-build-isolation` in
#1661 though.

This just re-instates that flag and re-runs vendoring to get back to
stable wheel versions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Secure Pex against sha1 collision attacks. Problems building venvs from certain distributions.
2 participants