From adc69cbe067bc8c8329dadca6ee195ff7b497eb1 Mon Sep 17 00:00:00 2001 From: Pradyun Gedam Date: Fri, 11 Feb 2022 15:06:43 +0000 Subject: [PATCH] Add a dedicated topic page on "Secure Installs" This moves the content about hash checking mode into here, along with adding minor additional recommendation of using `--no-binary :all:`. --- docs/html/cli/pip_install.rst | 158 +----------------- .../reference/requirements-file-format.md | 2 +- docs/html/topics/index.md | 1 + docs/html/topics/secure-installs.md | 100 +++++++++++ 4 files changed, 104 insertions(+), 157 deletions(-) create mode 100644 docs/html/topics/secure-installs.md diff --git a/docs/html/cli/pip_install.rst b/docs/html/cli/pip_install.rst index 43e0ebfea86..cfff4f7e270 100644 --- a/docs/html/cli/pip_install.rst +++ b/docs/html/cli/pip_install.rst @@ -239,164 +239,10 @@ Wheel Cache This is now covered in :doc:`../topics/caching`. -.. _`hash-checking mode`: - -Hash-Checking Mode +Hash checking mode ------------------ -Since version 8.0, pip can check downloaded package archives against local -hashes to protect against remote tampering. To verify a package against one or -more hashes, add them to the end of the line:: - - FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \ - --hash=sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7 - -(The ability to use multiple hashes is important when a package has both -binary and source distributions or when it offers binary distributions for a -variety of platforms.) - -The recommended hash algorithm at the moment is sha256, but stronger ones are -allowed, including all those supported by ``hashlib``. However, weaker ones -such as md5, sha1, and sha224 are excluded to avoid giving a false sense of -security. - -Hash verification is an all-or-nothing proposition. 
Specifying a ``--hash`` -against any requirement not only checks that hash but also activates a global -*hash-checking mode*, which imposes several other security restrictions: - -* Hashes are required for all requirements. This is because a partially-hashed - requirements file is of little use and thus likely an error: a malicious - actor could slip bad code into the installation via one of the unhashed - requirements. Note that hashes embedded in URL-style requirements via the - ``#md5=...`` syntax suffice to satisfy this rule (regardless of hash - strength, for legacy reasons), though you should use a stronger - hash like sha256 whenever possible. -* Hashes are required for all dependencies. An error results if there is a - dependency that is not spelled out and hashed in the requirements file. -* Requirements that take the form of project names (rather than URLs or local - filesystem paths) must be pinned to a specific version using ``==``. This - prevents a surprising hash mismatch upon the release of a new version - that matches the requirement specifier. -* ``--egg`` is disallowed, because it delegates installation of dependencies - to setuptools, giving up pip's ability to enforce any of the above. - -.. _`--require-hashes`: - -Hash-checking mode can be forced on with the ``--require-hashes`` command-line -option: - -.. tab:: Unix/macOS - - .. code-block:: console - - $ python -m pip install --require-hashes -r requirements.txt - ... - Hashes are required in --require-hashes mode (implicitly on when a hash is - specified for any package). These requirements were missing hashes, - leaving them open to tampering. These are the hashes the downloaded - archives actually had. You can add lines like these to your requirements - files to prevent tampering. - pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa - more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0 - -.. 
tab:: Windows - - .. code-block:: console - - C:\> py -m pip install --require-hashes -r requirements.txt - ... - Hashes are required in --require-hashes mode (implicitly on when a hash is - specified for any package). These requirements were missing hashes, - leaving them open to tampering. These are the hashes the downloaded - archives actually had. You can add lines like these to your requirements - files to prevent tampering. - pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa - more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0 - - -This can be useful in deploy scripts, to ensure that the author of the -requirements file provided hashes. It is also a convenient way to bootstrap -your list of hashes, since it shows the hashes of the packages it fetched. It -fetches only the preferred archive for each package, so you may still need to -add hashes for alternatives archives using :ref:`pip hash`: for instance if -there is both a binary and a source distribution. - -The :ref:`wheel cache ` is disabled in hash-checking mode to -prevent spurious hash mismatch errors. These would otherwise occur while -installing sdists that had already been automatically built into cached wheels: -those wheels would be selected for installation, but their hashes would not -match the sdist ones from the requirements file. A further complication is that -locally built wheels are nondeterministic: contemporary modification times make -their way into the archive, making hashes unpredictable across machines and -cache flushes. Compilation of C code adds further nondeterminism, as many -compilers include random-seeded values in their output. However, wheels fetched -from index servers are the same every time. They land in pip's HTTP cache, not -its wheel cache, and are used normally in hash-checking mode. 
The only downside -of having the wheel cache disabled is thus extra build time for sdists, and -this can be solved by making sure pre-built wheels are available from the index -server. - -Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`. -See :doc:`../topics/repeatable-installs` for a comparison of hash-checking mode -with other repeatability strategies. - -.. warning:: - - Beware of the ``setup_requires`` keyword arg in :file:`setup.py`. The - (rare) packages that use it will cause those dependencies to be downloaded - by setuptools directly, skipping pip's hash-checking. If you need to use - such a package, see :ref:`Controlling - setup_requires `. - -.. warning:: - - Be careful not to nullify all your security work when you install your - actual project by using setuptools directly: for example, by calling - ``python setup.py install``, ``python setup.py develop``, or - ``easy_install``. Setuptools will happily go out and download, unchecked, - anything you missed in your requirements file—and it’s easy to miss things - as your project evolves. To be safe, install your project using pip and - :ref:`--no-deps `. - - Instead of ``python setup.py develop``, use... - - .. tab:: Unix/macOS - - .. code-block:: shell - - python -m pip install --no-deps -e . - - .. tab:: Windows - - .. code-block:: shell - - py -m pip install --no-deps -e . - - - Instead of ``python setup.py install``, use... - - .. tab:: Unix/macOS - - .. code-block:: shell - - python -m pip install --no-deps . - - .. tab:: Windows - - .. code-block:: shell - - py -m pip install --no-deps . - -Hashes from PyPI -^^^^^^^^^^^^^^^^ - -PyPI provides an MD5 hash in the fragment portion of each package download URL, -like ``#md5=123...``, which pip checks as a protection against download -corruption. Other hash algorithms that have guaranteed support from ``hashlib`` -are also supported here: sha1, sha224, sha384, sha256, and sha512. 
Since this -hash originates remotely, it is not a useful guard against tampering and thus -does not satisfy the ``--require-hashes`` demand that every package have a -local hash. +This is now covered in :doc:`../topics/secure-installs`. Local Project Installs ---------------------- diff --git a/docs/html/reference/requirements-file-format.md b/docs/html/reference/requirements-file-format.md index f70bffc77e3..cf1d434eb6a 100644 --- a/docs/html/reference/requirements-file-format.md +++ b/docs/html/reference/requirements-file-format.md @@ -111,7 +111,7 @@ The options which can be applied to individual requirements are: - {ref}`--install-option ` - {ref}`--global-option ` -- `--hash` (for {ref}`Hash-Checking mode`) +- `--hash` (for {ref}`Hash-checking mode`) ## Referring to other requirements files diff --git a/docs/html/topics/index.md b/docs/html/topics/index.md index 01037044bc3..011205a111d 100644 --- a/docs/html/topics/index.md +++ b/docs/html/topics/index.md @@ -16,5 +16,6 @@ configuration dependency-resolution local-project-installs repeatable-installs +secure-installs vcs-support ``` diff --git a/docs/html/topics/secure-installs.md b/docs/html/topics/secure-installs.md new file mode 100644 index 00000000000..f012842b2ac --- /dev/null +++ b/docs/html/topics/secure-installs.md @@ -0,0 +1,100 @@ +# Secure installs + +By default, pip does not perform any checks to protect against remote tampering and involves running arbitrary code from distributions. It is, however, possible to use pip in a manner that changes these behaviours, to provide a more secure installation mechanism. 
+ +This can be achieved by doing the following: + +- Enable {ref}`Hash-checking mode`, by passing {any}`--require-hashes` +- Disallow source distributions, by passing {any}`--only-binary :all: <--only-binary>` + +(Hash-checking mode)= + +## Hash-checking Mode + +```{versionadded} 8.0 + +``` + +This mode uses local hashes, embedded in a requirements.txt file, to protect against remote tampering and network issues. These hashes are specified using a `--hash` [per requirement option](per-requirement-options). + +Note that hash-checking is an all-or-nothing proposition. Specifying `--hash` against _any_ requirement will activate this mode globally. + +To add hashes for a package, append them to its requirement line as follows: + +``` +FooProject == 1.2 \ + --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \ + --hash=sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7 +``` + +### Additional restrictions + +- Hashes are required for _all_ requirements. + + This is because a partially-hashed requirements file is of little use and thus likely an error: a malicious actor could slip bad code into the installation via one of the unhashed requirements. + + Note that hashes embedded in URL-style requirements via the `#md5=...` syntax suffice to satisfy this rule (regardless of hash strength, for legacy reasons), though you should use a stronger hash like sha256 whenever possible. + +- Hashes are required for _all_ dependencies. + + If there is a dependency that is not spelled out and hashed in the requirements file, it will result in an error. + +- Requirements must be pinned (either to a URL, filesystem path or using `==`). + + This prevents a surprising hash mismatch upon the release of a new version that matches the requirement specifier. + +### Forcing Hash-checking mode + +It is possible to force hash-checking mode to be enabled, by passing the `--require-hashes` command-line option.
+ +This can be useful in deploy scripts, to ensure that the author of the requirements file provided hashes. It is also a convenient way to bootstrap your list of hashes, since it shows the hashes of the packages it fetched. It fetches only the preferred archive for each package, so you may still need to add hashes for alternative archives using {ref}`pip hash`: for instance, if there is both a binary and a source distribution. + +### Hash algorithms + +The recommended hash algorithm at the moment is sha256, but stronger ones are allowed, including all those supported by `hashlib`. However, weaker ones such as md5, sha1, and sha224 are excluded to avoid giving a false sense of security. + +### Multiple hashes per package + +It is possible to use multiple hashes for each package. This is important when a package offers binary distributions for a variety of platforms or when it is important to allow both binary and source distributions. + +### Interaction with caching + +The {ref}`locally-built wheel cache ` is disabled in hash-checking mode to prevent spurious hash mismatch errors. + +These would otherwise occur while installing sdists that had already been automatically built into cached wheels: those wheels would be selected for installation, but their hashes would not match the sdist ones from the requirements file. + +A further complication is that locally built wheels are nondeterministic: contemporary modification times make their way into the archive, making hashes unpredictable across machines and cache flushes. Compilation of C code adds further nondeterminism, as many compilers include random-seeded values in their output. + +However, wheels fetched from index servers are required to be the same every time. They land in pip's HTTP cache, not its wheel cache, and are used normally in hash-checking mode.
The only downside of having the wheel cache disabled is thus extra build time for sdists, and this can be solved by making sure pre-built wheels are available from the index server. + +### Using hashes from PyPI (or other index servers) + +PyPI (and certain other index servers) provides a hash for the distribution, in the fragment portion of each download URL, like `#sha256=123...`, which pip checks as a protection against download corruption. + +Other hash algorithms that have guaranteed support from `hashlib` are also supported here: sha1, sha224, sha384, sha256, and sha512. Since this hash originates remotely, it is not a useful guard against tampering and thus does not satisfy the `--require-hashes` demand that every package have a local hash. + +## Repeatable installs + +Hash-checking mode also works with {ref}`pip download` and {ref}`pip wheel`. See {doc}`../topics/repeatable-installs` for a comparison of hash-checking mode with other repeatability strategies. + +```{warning} +Beware of the `setup_requires` keyword arg in {file}`setup.py`. The (rare) packages that use it will cause those dependencies to be downloaded by setuptools directly, skipping pip's hash-checking. If you need to use such a package, see {ref}`controlling setup_requires `. +``` + +## Do not use setuptools directly + +Be careful not to nullify all your security work by installing your actual project by using setuptools' deprecated interfaces directly: for example, by calling `python setup.py install`, `python setup.py develop`, or `easy_install`. + +These will happily go out and download, unchecked, anything you missed in your requirements file and it’s easy to miss things as your project evolves. To be safe, install your project using pip and {any}`--no-deps`. + +Instead of `python setup.py install`, use: + +```{pip-cli} +$ pip install --no-deps . +``` + +Instead of `python setup.py develop`, use: + +```{pip-cli} +$ pip install --no-deps -e . +```
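
Editorial illustration (not part of the patch): the `--hash` lines described under "Hash-checking Mode" above are sha256 digests of the distribution archives, which is what `pip hash` reports for a local file. The sketch below shows one way such a requirement line could be assembled from archives that have already been downloaded and vetted. The function names `sha256_of` and `requirement_line` are hypothetical helpers, not pip APIs, and the indentation of the continuation lines is an arbitrary choice.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def sha256_of(archive: Path) -> str:
    """Return the hex sha256 digest of a local archive, read in chunks."""
    digest = hashlib.sha256()
    with archive.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def requirement_line(name: str, version: str, archives: Iterable[Path]) -> str:
    """Build a pinned requirement with one --hash option per archive.

    Hypothetical helper: pass every archive that may be installed for this
    pin (for example the sdist and each platform wheel), since hash-checking
    mode rejects any archive whose hash is not listed.
    """
    parts = [f"{name} == {version}"]
    for archive in archives:
        parts.append(f"    --hash=sha256:{sha256_of(archive)}")
    # Join with a trailing backslash so the options continue the same line.
    return " \\\n".join(parts)
```

Hashes generated this way are only as trustworthy as the local archives they were computed from, so this is a convenience for writing the requirements file rather than a verification step in itself.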