Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpreter constraints do not play well with lock files. #12200

Closed
jsirois opened this issue Jun 14, 2021 · 19 comments
Closed

Interpreter constraints do not play well with lock files. #12200

jsirois opened this issue Jun 14, 2021 · 19 comments

Comments

@jsirois
Copy link
Contributor

jsirois commented Jun 14, 2021

Say I have interpreter constraints "CPython>=3.6,<4" in-play and I want to generate a lockfile for a resolve. There needs to be a lockfile per-interpreter these constraints select since requirements can have environment markers and these can cause an interpreter-specific resolve. Since environment markers are so broad in scope, patch version of interpreters can have different resolves and even the same patch version and same platform can have a different resolve due to fields like platform_version and platform_release: https://www.python.org/dev/peps/pep-0508/#environment-markers

Put more simply, clearly if I have a concrete interpreter in-hand I can run a resolve for it and then generate a lock file for it. If I don't, I can't *.

With interpreter constraints I may only have a subset of possible interpreters though and so I can only generate a subset of the needed lockfiles. In the leading example, say I try to create the lockfile on a machine with just CPython 3.6.5 on June 13th 2021. I will generate a lockfile for that interpreter and check it in. Now, say, 2 months later I go back to that commit to re-build things, but on a machine with only CPython 3.9.1. I have no lockfile, and so it will need to be regenerated. The problem here is that the lockfile may not be close at all to the one generated 2 months ago. Many new versions of distributions may have been published and this can result in the new CPython 3.9.1 lockfile behaving quite differently than the CPython 3.6.5 lock file. Towards the worst end of this, the new behavior could be buggy or broken.

  • Technically there may be a way to perform a much too large resolve that ignores environment markers and some wheel tags to collect all possible distributions needed for all possible interpreters in an IC range.
@jsirois
Copy link
Contributor Author

jsirois commented Jun 14, 2021

@Eric-Arellano this struck me and it seemed good to write it down. You may or may not want to include this in your design doc, but it clearly seems a problem to contend with in some way. Whether that means outlawaing constraints being used with lockfiles or ... I'm not sure.

@stuhood
Copy link
Member

stuhood commented Jun 14, 2021

Hm, yea... this is an interesting case.

Working in our favor is the possibility that users might view this as "not having the 'right'/'consistent' interpreter" for their lockfile, rather than the other way around?

@jsirois
Copy link
Contributor Author

jsirois commented Jun 16, 2021

Working in our favor is the possibility that users might view this as "not having the 'right'/'consistent' interpreter" for their lockfile, rather than the other way around?

I'm not so sure. It was the user after all that wrote down the "CPython>=3.6,<4" interpreter constraints. Presumably that meant they considered CPython 3.6 through CPython 3.9 were all the right interpreters. Assuming they meant what they said I'd guess they have the opposite reaction: I told Pants this should work for "CPython>=3.6,<4" and yet, when I asked Pants to generate the lockfile it did 1/4 of the job without warning or failure. So I checked in the lockfile assuming Pants did the whole job at the time and got burned 2 months later. (Here 1/4 is generous since 3.6, 3.7, 3.8, 3.9 each subdivide further per the patch version (environment marker python_full_version) and the other more esoteric environment markers a resolve can divide on).

Afaict this is not an easy problem unless maybe we can side-step it in some creative (???) or brute way (disallow ICs + lockfile combo). It really seems like the Python ecosystem has shot itself in the foot with environment markers being overly broad. There really is no way to generate a lockfile usable with anything other than 1 exact interpreter. I think in practice you mostly can, but you definitely cannot on paper unless you do the asterisk thing above and perform an over-resolve.

@stuhood
Copy link
Member

stuhood commented Jul 2, 2021

I talked more with @Eric-Arellano about this yesterday. My conclusion at least was that Pants should effectively treat a range as a declaration by the user that all of the versions that can be matched by that range are "effectively equivalent" for them. Yes, it's been very common for people to do things like CPython>=3.6,<4, but I think that users are learning that Python is changing fast enough that open ended ranges like that are foolhardy.

Instead, if folks are trying to test/package with multiple versions that they know aren't equivalent, they'll need to create multiple binaries/tests (optionally via a macro, e.g. https://blog.pantsbuild.org/python-3-migrations#running-python-2-and-python-3-in-parallel).

So: I think that this is fine. Perhaps we can do something to detect and warn about overly wide ranges, but it doesn't seem like we need to execute multiple resolves per range.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 3, 2021

My conclusion at least was that Pants should effectively treat a range as a declaration by the user that all of the versions that can be matched by that range are "effectively equivalent" for them.

That makes sense I think for all? uses of ICs except lock files. A lock file says we have a consistent resolve independent of the time we go to resolve things or the (valid) interpreter we're using at the moment. If that lock file is only written for one minor version in a range that covers >1 minor version, well, we've failed the whole lock file concept and not in fact locked things.

It seems to me the only right thing to do with an IC range that covers more than one minor version is to fail to generate the lock file on any machine that doesn't cover the range. If you want to lock things for a project, you had better have a dev env setup to cover the project requirements. That seems totally reasonable to me and just means we'd need a few things:

  • Be able to calculate CPython>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,<=3.9 -> {CPython 2.7, CPython3.6, CPython3.7, CPython 3.8, CPython 3.9} or fail.
  • Always fail CPython>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,<4 since <4 is open ended - things break the minute 3.10 is released for example. The failure can note the remedy is picking a fully closed range.
  • If the full required set of pythons is present locally, generate a lock file for each or a lock file that covers each (depends how you format the lock file).

@stuhood
Copy link
Member

stuhood commented Jul 6, 2021

My conclusion at least was that Pants should effectively treat a range as a declaration by the user that all of the versions that can be matched by that range are "effectively equivalent" for them.

That makes sense I think for all? uses of ICs except lock files. A lock file says we have a consistent resolve independent of the time we go to resolve things or the (valid) interpreter we're using at the moment. If that lock file is only written for one minor version in a range that covers >1 minor version, well, we've failed the whole lock file concept and not in fact locked things.

To be clear: generating the lockfile is one thing: consuming it is another. If you generate the lockfile on one minor Python version with a reasonably tight range, there is a very high probability (implied by "reasonably tight range") that that lockfile will work fine when consumed by any other Python within the range. And that's sufficient to provide you the safety and stability guarantees of a lockfile, afaict?

Critically, the file should not be re-generated if you change Pythons within the range (this is the reason for us to treat the range as a declaration that the Pythons in that range are equivalent). So it will only be regenerated if the input requirements/platforms/ranges change.

It seems to me the only right thing to do with an IC range that covers more than one minor version is to fail to generate the lock file on any machine that doesn't cover the range.

Given the above, I do not think that we need to "cover the range". We can lean on the declaration of equivalency.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 12, 2021

Using #12312 for illustration on Pants itself, I want to highlight what your declaration of equivalency actually means in practice for a - I assume we agree - "reasonable" IC range - Pants' itself:

$ ./pants --python-setup-interpreter-constraints="['==3.7.*']" lock src:: tests::
08:41:02.59 [INFO] Completed: Building pip_compile.pex with 1 requirement: pip-tools==6.2.0
08:41:11.17 [INFO] Completed: Generate lockfile for 19 requirementses: PyYAML<5.5,>=5.4, ansicolors==1.1.8, fasteners==0.16, freezegun==1.1.0, humbug==0.2.6, packaging==20.9, pex==2.1.42, psutil==5.8.0, pytest<6.3,>=6.0.1, request... (222 characters truncated)
08:41:11.17 [INFO] Wrote lockfile to 3rdparty/python/lockfile.txt
$ mv 3rdparty/python/lockfile.txt 3rdparty/python/lockfile.txt.37
$ ./pants --python-setup-interpreter-constraints="['==3.8.*']" lock src:: tests::
08:41:46.13 [INFO] Completed: Building pip_compile.pex with 1 requirement: pip-tools==6.2.0
08:41:56.68 [INFO] Completed: Generate lockfile for 19 requirementses: PyYAML<5.5,>=5.4, ansicolors==1.1.8, fasteners==0.16, freezegun==1.1.0, humbug==0.2.6, packaging==20.9, pex==2.1.42, psutil==5.8.0, pytest<6.3,>=6.0.1, request... (222 characters truncated)
08:41:56.68 [INFO] Wrote lockfile to 3rdparty/python/lockfile.txt
$ mv 3rdparty/python/lockfile.txt 3rdparty/python/lockfile.txt.38
$ ./pants --python-setup-interpreter-constraints="['==3.9.*']" lock src:: tests::
08:42:41.35 [INFO] Completed: Building pip_compile.pex with 1 requirement: pip-tools==6.2.0
08:42:51.35 [INFO] Completed: Generate lockfile for 19 requirementses: PyYAML<5.5,>=5.4, ansicolors==1.1.8, fasteners==0.16, freezegun==1.1.0, humbug==0.2.6, packaging==20.9, pex==2.1.42, psutil==5.8.0, pytest<6.3,>=6.0.1, request... (222 characters truncated)
08:42:51.35 [INFO] Wrote lockfile to 3rdparty/python/lockfile.txt
$ mv 3rdparty/python/lockfile.txt 3rdparty/python/lockfile.txt.39
$ diff3 3rdparty/python/lockfile.txt.3*
====
1:2c
  # This file is autogenerated by pip-compile with python 3.7
2:2c
  # This file is autogenerated by pip-compile with python 3.8
3:2c
  # This file is autogenerated by pip-compile with python 3.9
====1
1:97,102c
  importlib-metadata==4.6.1 \
      --hash=sha256:079ada16b7fc30dfbb5d13399a5113110dab1aa7c2bc62f66af75f0b717c8cac \
      --hash=sha256:9f55f560e116f8643ecf2922d9cd3e1c7e8d52e683178fecd9d08f6aa357e11e
      # via
      #   pluggy
      #   pytest
2:96a
3:96a
====1
1:272,274c
      # via
      #   -r requirements.in
      #   importlib-metadata
2:266c
3:266c
      # via -r requirements.in
====1
1:279,282c
  zipp==3.5.0 \
      --hash=sha256:957cfda87797e389580cb8b9e3870841ca991e2125350677b2ca83a0e99390a3 \
      --hash=sha256:f5812b1e007e48cff63449a5e9f4e7ebea716b4111f9c4f9a645f91d579bf0c4
      # via importlib-metadata
2:270a
3:270a

Afaict this is no bueno. Not from a locked resolve standpoint (I don't want dep drift to shoot me in the foot tomorrow) and certinly not from a security (supply chain) standpoint.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 12, 2021

To summarize this a bit more, if a Pants dev on a modern machine (has Python 3.8 or 3.9 1st on the PATH or as the only python3) generates a lock file for Pants, that lockfile will be missing entries for importlib-metadata and zipp and so both of those are subject to possible float or hijack until sometime later someone with a 3.7 generates the lockfile.

This is why it seems to me we only have 2 options:

  1. Fail lockfile generation unless we "cover the range" *.
  2. Disallow ICs.
  • Even covering the range is not good enough. Enironment markers allow selecting dependencies based on effectively arbitrary constrains (for example platform_version: https://www.python.org/dev/peps/pep-0508/#environment-markers). I think though, in practice (a TC database query could prove this) that python_version is the dominant marker in use. I have never seen python_full_version used in the wild nor the more exotic markers. It's only for this reason that I think 1 could be good enough as opposed to simply declaring equivalency which the above shows just doesn't actually work in practice (minor version-dependent backports are a common case for one). We could be absolutely sure though here by post-processing a resolve and looking at all the resolved dists metadata to see what environment markers were actually used and if anything other than python_version, we could then fail the lock operation even if the IC range was covered since, post resolve, we have proof we also needed to cover environment marker X which we could not.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 12, 2021

I feel like I've repeated this argument several times; so I'm not sure what I'm missing or how to resolve. The summary is that python dist resolution allows ~arbitrary interpreter specific dep selection and this necessarily means a resolve for one interpreter is never guaranteed to be correct for another interpreter (unless you can post inspect any environment markers actually used). That fact combined with the two roles of a lockfile afaict leads to all the rest. Perhaps we disagree on the role of a lockfile.

@stuhood
Copy link
Member

stuhood commented Jul 12, 2021

Using #12312 for illustration on Pants itself, I want to highlight what your declaration of equivalency actually means in practice for a - I assume we agree - "reasonable" IC range - Pants' itself:

This is a really good example, and helps to prove your point... I am definitely surprised to see shifts in resolves within that range. So maybe it is the case that we need to cover the range in terms of major versions, and have a lockfile per major version. I'm definitely sketched out by this from a performance perspective.

Regardless of whether you take the "declaration of equivalency" approach or the "cover the range" approach, it still seems like everything points to encouraging users to narrow their ranges significantly. Maybe the "cover the range" approach is safer, because rather than warning for the purposes of safety, we can warn for the purposes of performance ("[warn] Hey, did you know that >=3.7,<3.10 will result in 3 different resolves/lockfiles? Consider using a narrower range, or set --resolve-width-threshold=4 to silence this warning.")

@Eric-Arellano
Copy link
Contributor

This is a really good example

An excellent example indeed. #12316 fails because I generated the lockfile with Python 3.9 so importlib-metadata is left off, and our CI running Py37 fails. Note that I can't run Py37 on my M1, which is a great example of my concern here. How, for example, would John generate a lockfile for macOS machines when he wants to bump some requirements? Having to ping a macOS user would be frustrating.

I was intentionally taking a brief punt on this issue so that I could land some of the less controversial stuff and get more inspiration. I suspect we'll need to solve this sooner.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 12, 2021

I'm definitely sketched out by this from a performance perspective.

You shouldn't be. Pex already resolves for every discovered interpreter on a machine that fits an IC range today (in parallel). We should be able to hit exactly that perf profile here too.

...it still seems like everything points to encouraging users to narrow their ranges significantly.

Maybe, but I think there is no way to avoid the fact that there simply will be use cases that use large ranges. We cannot tell those folks "don't do that".

Maybe the "cover the range" approach is safer, because rather than warning for the purposes of safety, we can warn for the purposes of performance ("[warn] Hey, did you know that >=3.7,<3.10 will result in 3 different resolves/lockfiles? Consider using a narrower range, or set --resolve-width-threshold=4 to silence this warning.")

I really think we should do post-resolve processing of dist-info/METADATA and use Requires-Python and Requires-Dist metadata to exactly determine the breadth of validity of a lock. I derailed things a bit with the "cover the range" terminology. Its actually about covering the environment range as selected by environment markers. Its just that the most common environment marker to use only picks out python minor versions (python_version). Whether we warn or fail can be debated, but certainly that can be an Pants option.

I suspect we'll need to solve this sooner.

Yes. Alot of design effort has been focused on UX - there are actual thorny fundamental does it even work issues to sort out though before even getting to that. A swing in focus is needed to make sure we ship something that at base works.

@stuhood
Copy link
Member

stuhood commented Jul 12, 2021

You shouldn't be. Pex already resolves for every discovered interpreter on a machine that fits an IC range today (in parallel). We should be able to hit exactly that perf profile here too.

Not quite: lint/test use

:param internal_only: Whether we ever materialize the Pex and distribute it directly
to end users, such as with the `binary` goal. Typically, instead, the user never
directly uses the Pex, e.g. with `lint` and `test`. If True, we will use a Pex setting
that results in faster build time but compatibility with fewer interpreters at runtime.
, which has Pants choose a single interpreter to use in order to bypass that fanout.

Agreed on the rest.

@Eric-Arellano
Copy link
Contributor

I really think we should do post-resolve processing of dist-info/METADATA and use Requires-Python and Requires-Dist metadata to exactly determine the breadth of validity of a lock. I derailed things a bit with the "cover the range" terminology. Its actually about covering the environment range as selected by environment markers. Its just that the most common environment marker to use only picks out python minor versions (python_version). Whether we warn or fail can be debated, but certainly that can be an Pants option.

@jsirois are you envisioning that the end result is a single lockfile? Consider pantsbuild/pants w/ >=3.7,<3.10. So, we try to run pip-compile with 3.7, 3.8, and 3.9. Assume that works: what do we do w/ the 3 resulting files and their conflict? Fwict a few options:

  1. Check in 3 distinct lockfiles.
  2. Try to automatically merge them.
  3. Have user manually merge, similar to what you have to do with generate_constraints.sh.

1 has a complication that Pants then needs to know how to consume the distinct lockfiles, like when to load Py37 vs Py38 lockfile for this loose range.

For 2, iirc, you mentioned we could teach Pex and/or pip-compile to generate the lockfile against multiple interpreters in a single run? It could add any necessary env markers.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 13, 2021

My experiment over in Pex uses 1 json file with a nested object for the original requirements strings and then a nested object for each interpreter / platform containing the interpreter / platforms lock info. Pex can consume that mono-lockfile and pick out the relevant interpreter / platform section when reading the lockfile to do a resolve later.

So ... I think that's just a different format for 1. Instead of 3 distinct files, 1 file containing 3 distinct sections.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 13, 2021

This example I have laying around only has 1 lock, but I think you get the idea:

$ cat lock.json | jq .
{
  "pex_version": "2.1.42",
  "requirements": [
    "requests"
  ],
  "resolves": {
    "manylinux_2_33_x86_64-cp-39-cp39": [
      {
        "name": "certifi",
        "sha256": "50b1e4f8446b06f41be7dd6338db18e0990601dce795c2b1686458aa7e8fa7d8",
        "source_artifact": null,
        "version": "2021.5.30"
      },
      {
        "name": "chardet",
        "sha256": "f864054d66fd9118f2e67044ac8981a54775ec5b67aed0441892edb553d21da5",
        "source_artifact": null,
        "version": "4.0.0"
      },
      {
        "name": "idna",
        "sha256": "b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0",
        "source_artifact": null,
        "version": "2.10"
      },
      {
        "name": "requests",
        "sha256": "c210084e36a42ae6b9219e00e48287def368a26d03a048ddad7bfee44f75871e",
        "source_artifact": null,
        "version": "2.25.1"
      },
      {
        "name": "urllib3",
        "sha256": "753a0374df26658f99d826cfe40394a686d05985786d946fbe4165b5148f5a7c",
        "source_artifact": null,
        "version": "1.26.5"
      }
    ]
  }
}

@Eric-Arellano
Copy link
Contributor

@jsirois If we were to go with 3 distinct requirements-txt style lockfiles, how feasible do you think it'd be to teach Pex when to use what? For example, if we were to use pip-tools suggested naming format:

Note that if you are deploying on multiple Python environments (read the section below), then you must commit a seperate output file for each Python environment. We suggest to use the {env}-requirements.txt format (ex: win32-py3.7-requirements.txt, macos-py3.6-requirements.txt, etc.).

Could we have Pants feed all relevant constraints to Pex, and then Pex chooses which to use from the set? Rather than Pants having to determine which to use.

--

I'm trying to scope things to determine if it's feasible to stick with pip-compile or if we need to start using Pex's proposed proprietary format.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 14, 2021

Have you figured out how to post-process a resolve's dist-info/METADATA outside Pex / using pip-compile yet? It seems that nugget comes before formats.

Eric-Arellano added a commit that referenced this issue Jul 16, 2021
…Lambdex, and Protobuf MyPy (#12357)

These tools are simple to support: they run entirely independently of user code, and they get their interpreter constraints from a subsystem rather than from user code.

*Black is temporarily not used because it does not play nicely with interpreter constraints: #12200.

### Tools that still need lockfiles

Need to consider user interpreter constraints:

* Bandit
* Flake8
* setuptools
* pip-tools

Need to be mixed into user lockfiles (or have shading):

* MyPy
* Pylint
* Pytest
* IPython(?)
Eric-Arellano added a commit that referenced this issue Jul 20, 2021
…ons` (#12371)

To robustly handle lockfiles and interpreter constraints (#12200), we should generate a lockfile for each major/minor Python version. Regardless of the lockfile format we choose from #12362, this will be useful to do.

This new helper function allows us to convert a range like `>=3.6,<3.9` into a set of ICs like `[==3.6.*", "==3.7.*", "==3.8.*"]`. We can then iterate over each element and generate a lockfile for those ICs.

This code is careful to handle patch versions correctly. Many users have reported setting things like `!=3.6.3`, e.g. to avoid system Python, and we need to preserve that.

[ci skip-rust]
@Eric-Arellano
Copy link
Contributor

Poetry robustly handles environment markers, including for interpreter constraints and platforms: #12470 (comment)

If we end up using Poetry for generation, I think this issue is moot.

Eric-Arellano added a commit that referenced this issue Aug 13, 2021
…-compile (#12549)

**Disclaimer**: This is not a formal commitment to Poetry, as we still need a more rigorous assessment it can handle everything we need. Instead, this is an incremental improvement in that Poetry handles things much better than pip-compile. 

It gets us closer to the final result we want, and makes it much more ergonomic to use the experimental feature—like `generate_all_lockfiles.sh` now not needing any manual editing. But we may decide to switch from Poetry to something like pdb or Pex.

--

See #12470 for why we are not using pip-compile. 

One of the major motivations is that Poetry generates lockfiles compatible with all requested Python interpreter versions, along with Linux, macOS, and Windows. Meaning, you no longer need to generate the lockfile in each requested environment and manually merge like we used to. This solves #12200 and obviates the need for #12463.

--

This PR adds only basic initial support. If we do decide to stick with Poetry, some of the remaining TODOs:

- Handle PEP 440-style requirements.
- Hook up to `[python-setup]` and `[python-repos]` options.
- Hook up to caching.
- Support `--platform` via post-processing `poetry.lock`: #12557
- Possibly remove un-used deps/hashes to reduce supply chain attack risk: #12458

--

Poetry is more rigorous than pip-compile in ensuring that all interpreter constraints are valid, which prompted needing to tweak a few of our tools' constraints.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants