Switch from get-pip to ensurepip #951

Closed
edmorley opened this issue Aug 21, 2024 · 5 comments · Fixed by #955

Comments

@edmorley
Contributor

edmorley commented Aug 21, 2024

The images in this repo currently use https://github.com/pypa/get-pip to bootstrap pip. However, all versions of Python supported here now include the ensurepip module (added in Python 3.4), which could be used instead (along with a suitable pip install -U command to upgrade to the chosen pinned pip, etc.):
https://docs.python.org/3/library/ensurepip.html
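
As a rough illustration of the proposal above, here is a minimal sketch of the two commands that would replace get-pip.py: one to install the pip bundled with Python, and an optional one to move to a pinned release. The helper name `bootstrap_commands` and the pin `24.2` are hypothetical and not taken from this repo; only `python -m ensurepip` and `pip install -U` come from the text.

```python
import sys


def bootstrap_commands(pinned_pip=None, python=sys.executable):
    """Return the commands that would bootstrap pip without get-pip.py.

    `pinned_pip` is a hypothetical pin such as "24.2"; when it is None the
    pip version bundled with ensurepip is left as-is.
    """
    # Step 1: install the pip wheel bundled with this Python
    # (ensurepip does not access the network).
    commands = [[python, "-m", "ensurepip"]]
    if pinned_pip is not None:
        # Step 2 (optional): upgrade to the chosen pinned pip release.
        commands.append(
            [python, "-m", "pip", "install", "-U", f"pip=={pinned_pip}"]
        )
    return commands


if __name__ == "__main__":
    for command in bootstrap_commands(pinned_pip="24.2"):
        print(" ".join(command))
```

The commands are returned as argument lists so they could be passed to `subprocess.run` unchanged, or joined into a Dockerfile `RUN` line.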

Switching to ensurepip would remove an external dependency and the script version bumping that comes with it (e.g. 1980d7b x8 for each upstream get-pip change).

Would you be open to a PR that made that switch?

@tianon
Member

tianon commented Aug 21, 2024

Oh, very interesting proposal -- I can't remember why we explicitly disabled usage of ensurepip previously. 🤔

From what I can tell, in 3.12+ it effectively just installs itself (no downloading or other modules), so we could even stop scraping pip versions from it and just "trust" it to do the right thing, perhaps even just by removing our explicit flag that disables it during build/install. 🤔

Does it maybe install to a different path than the explicit/get-pip version? I'm trying to jog my memory as to why that fence is there, but I really am coming up blank. 😅

@edmorley
Contributor Author

edmorley commented Aug 21, 2024

There are two ways we could use ensurepip:

  1. By continuing to disable ensurepip during the Python compile, but then running it explicitly in a separate layer afterwards (the same way that get-pip is run in a separate layer)
  2. By re-enabling it during the Python build, and having pip be installed in the same layer as Python again

When I filed this issue I was thinking of (1).
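
The two options above could be sketched roughly as follows, with each inner list standing for one image layer. This is an illustration only: the `build_layers` helper is hypothetical, and the `--without-ensurepip` / `--with-ensurepip=install` flags are CPython's configure options as I understand them, not commands quoted from this repo's Dockerfiles.

```python
def build_layers(approach):
    """Return a list of layers; each layer is a list of shell commands."""
    if approach == 1:
        return [
            # Layer 1: build Python with ensurepip disabled.
            ["./configure --without-ensurepip", "make", "make install"],
            # Layer 2: run ensurepip explicitly, the way get-pip is run today.
            ["python3 -m ensurepip"],
        ]
    if approach == 2:
        return [
            # Single layer: `make install` runs ensurepip as part of the build.
            ["./configure --with-ensurepip=install", "make", "make install"],
        ]
    raise ValueError(f"unknown approach: {approach!r}")


if __name__ == "__main__":
    for approach in (1, 2):
        layers = build_layers(approach)
        print(f"approach ({approach}): {len(layers)} layer(s)")
```

Approach (2) ends up one layer shorter, at the cost of tying the pip install to the Python build layer.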

Searching issues/PRs for ensurepip prior art, I found:

So it appears that:

  • The use of ensurepip (during the Python compile) for Python 3 images meant the implicit packages from get-pip were no longer installed.
  • This was first fixed by running an extra pip install step after the Python compile, to install setuptools and wheel.
  • However, to prevent invalidation of the Python layer every time a new pip/setuptools/wheel release occurred, this was then switched to disabling ensurepip during the build and calling get-pip in a later layer for all Python versions. In addition, the way get-pip was updated meant the images would only get bumped for new pip releases, and not also every time setuptools was released.
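
The layer invalidation concern in the last point can be illustrated with a deliberately simplified model, in which each layer's cache key depends on its own instruction plus every layer before it. This is an assumption made for illustration, not Docker's actual cache algorithm, and the names `layer_keys` and `dockerfile` are hypothetical.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per layer; each key also covers all earlier layers."""
    keys = []
    parent = ""
    for instruction in instructions:
        # Chain the parent key in, so a change early on invalidates later layers.
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def dockerfile(pip_version, pip_in_python_layer):
    """Model the build as one combined layer or as Python + pip layers."""
    build = "build and install Python"
    if pip_in_python_layer:
        return [f"{build}; install pip {pip_version}"]
    return [build, f"install pip {pip_version}"]


if __name__ == "__main__":
    old = layer_keys(dockerfile("24.0", pip_in_python_layer=False))
    new = layer_keys(dockerfile("24.2", pip_in_python_layer=False))
    print("Python layer reused across a pip bump:", old[0] == new[0])
```

With pip in its own layer, a pip bump leaves the expensive Python layer's key unchanged; with a combined layer, the same bump changes the only key and forces a full rebuild.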

I think things have changed a fair bit since then:

  • The images no longer default to the latest pip/setuptools (and instead use the versions that ship with that Python version), so the concern over too many setuptools updates doesn't apply. (xref How about updating setuptools, wheel, pip and installing setuptools-scm from git? #365)
  • Python 2 images (where ensurepip doesn't exist in the stdlib) are no longer being built, so ensurepip can be used unconditionally, without the prior complexity of split get-pip/ensurepip codepaths.

As such, using ensurepip does seem viable now? The decision would be whether to use it via approach (1) or (2). Since the "too many setuptools releases" concern doesn't apply any more (given that the images now track the bundled versions), approach (2) also seems more viable than it was in the past, and would mean fewer layers (if that's still something that people care about).

@tianon
Member

tianon commented Aug 23, 2024

Nice, thank you for the detailed digging! This is really helpful. ❤️

Looking at https://docs.python.org/3/library/ensurepip.html, --default-pip seems interesting (and something get-pip.py is currently doing by default). For Python itself, we're currently setting up some symlinks manually, but maybe there's a way we can pass through --default-pip to the Python install process directly so we don't have to add pip to that list? 👀
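
For context on what --default-pip changes, here is a small sketch of which pip script names ensurepip installs in each mode, going by the ensurepip documentation linked above. The helper `pip_script_names` is hypothetical and only models the naming rules; it doesn't install anything.

```python
def pip_script_names(python_version, default_pip=False, altinstall=False):
    """Model the script names ensurepip installs for a Python "X.Y" version."""
    if default_pip and altinstall:
        # ensurepip itself rejects this combination.
        raise ValueError("cannot combine default_pip and altinstall")
    major = python_version.split(".")[0]
    names = [f"pip{python_version}"]  # pipX.Y is always installed
    if not altinstall:
        names.insert(0, f"pip{major}")  # pipX is skipped for --altinstall
    if default_pip:
        names.insert(0, "pip")  # the unversioned script needs --default-pip
    return names


if __name__ == "__main__":
    print(pip_script_names("3.12"))
    print(pip_script_names("3.12", default_pip=True))
    print(pip_script_names("3.12", altinstall=True))
```

So without --default-pip, an unversioned `pip` command would still have to come from somewhere else, such as the manually created symlinks mentioned above.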

@edmorley
Contributor Author

I have a WIP of this locally, but will wait until #954 is merged before continuing with it, to avoid conflicts.

edmorley added a commit to edmorley/python that referenced this issue Aug 30, 2024
Since:
* All versions of Python that are actively built by this repo now
  include the `ensurepip` module.
* The policy of these images is now to use the same pip version as the
  one bundled with `ensurepip` (rather than always upgrading as pip
  releases occur) to avoid breaking changes, and for parity with the
  `venv` module.
* As such, we might as well actually use `ensurepip` to install pip
  (since it installs the exact pip version we want) rather than manually
  doing the same using `get-pip.py`.

Now that the pip/setuptools versions track (or mostly track, in the case
of setuptools) the ensurepip versions, the concerns over frequent
invalidation of the Python layer no longer apply, and so the
pip/setuptools install can now be part of the Python layer, reducing
layer count by one.

This change is a no-op in terms of pip/setuptools/wheel versions,
since the pip versions being used already exactly matched the
`ensurepip` version of pip.

Closes docker-library#951.
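
The "no-op" claim in the commit message above could be checked inside an image with a sketch along these lines. The helper names are hypothetical; `ensurepip.version()` and `importlib.metadata.version()` are standard library calls, but this assumes both ensurepip and pip are present in the interpreter being checked.

```python
def is_noop(installed_pip, bundled_pip):
    """True when the installed pip is exactly the one bundled with ensurepip."""
    return installed_pip.strip() == bundled_pip.strip()


def current_versions():
    """Return (installed pip version, pip version bundled with ensurepip)."""
    # Imported lazily so the comparison helper works even without pip present.
    import ensurepip
    from importlib.metadata import version

    return version("pip"), ensurepip.version()
```

Inside a built image, `is_noop(*current_versions())` should then be true if the switch really didn't change the pip version.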
@edmorley
Contributor Author

I've opened #955 for this, and chose approach (2) from #951 (comment).

There's no rush to look at that PR - I think it makes sense to wait until #954 has been released for a bit to reduce the chance of further churn :-)

edmorley added a commit to edmorley/python that referenced this issue Sep 7, 2024