Add unified lockfile, pip interoperability #124

jvansanten · 2021-11-11T13:57:19Z

This PR does two major things that I have personally found useful:

1. Adds support for pip dependencies (see "Support solving for dependencies that are only installable via pip", #122).
2. Adds a structured, multi-platform lock file (see also "Allow customization and extension of lockfile metadata headers", #106, as well as "New lock file format", mamba-org/mamba#1209).

Along the way, it factors details of conda invocation and environment solving into dedicated submodules for easier reuse. This is particularly important for the PyPI solver, which uses poetry internally, but could potentially be replaced with something else.

Point 2 conflicts with existing work by @maresb in #106, which I had missed when I started on this, and so may have unintentionally duplicated. I would be happy to merge these efforts, as it appears that we made different choices, both for good reasons. In the following I will try to explain my choices.

My long-term goal for this is to be able to automatically update conda environments (including pip deps) used in CI with Renovate. For that I need to be able to:

1. Solve for pip dependencies layered on top of the conda env.
2. Update the current solution to use a newer version of a single target package, without upgrading packages outside the dependency graph of the target.
3. Install different subsets of dependencies without re-solving (e.g. development deps in CI, main deps in production).
4. Specify all configuration info (such as channels and target platforms) in the source files.

Requirement 1 is addressed by including an (optional) pypi solver using poetry as described in #122.

Requirement 2 could be partially addressed using the existing env and explicit formats, but is significantly easier when it can be parsed in a structured way, especially if the parser has to be written again in TypeScript for Renovate. Requirement 3 implies an intermediate file containing all dependencies that can be filtered at install time.

Requirement 4 is addressed by adding support for some extra sections (platforms in environment.yaml, tool.conda-lock.platforms and tool.conda-lock.channels in pyproject.toml) in source files.

The new lock file is conda-lock.toml, for no better reason than that it is straightforward to extract information from TOML files with awk. YAML would also be fine. The lock contents are mostly a flat list of packages of the form:

```toml
[[package]]
name = "photospline"
version = "2.0.7"
manager = "conda"
platform = "linux-64"
url = "https://conda.anaconda.org/conda-forge/linux-64/photospline-2.0.7-py39ha552708_0.tar.bz2"
hash = "a8fcc6d1c0e2525159bd83643b2e636c"
optional = false
category = "main"

[package.dependencies]
cfitsio = ">=3.470,<3.471.0a0"
libgcc-ng = ">=9.3.0"
libstdcxx-ng = ">=9.3.0"
numpy = ">=1.19.5,<2.0a0"
python = ">=3.9,<3.10.0a0"
python_abi = "3.9.* *_cp39"
suitesparse = ">=5.7.2,<6.0a0"
```

This unified lock file is created implicitly by every invocation of `conda-lock lock`. If `-k explicit` or `-k env` is specified, the contents are also converted into either explicit or env formats for use with `conda env create`. There is also a `conda-lock render` command that produces an explicit or env lock file for each platform, filtering by platform, optional, and category. `conda-lock install -f conda-lock.toml` internally renders for the current platform, then passes the rendered content on to conda-lock.
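To make the render step concrete, here is a rough sketch of filtering the flat package list down to an explicit-style file for one platform. It is not the code from this PR: the function name `render_explicit`, its parameters, and the plain-dict representation of each `[[package]]` entry are assumptions for illustration. Only the field names come from the example entry above, and pip entries are simply skipped here.

```python
from typing import Iterable, Mapping, Sequence


def render_explicit(
    entries: Iterable[Mapping[str, object]],
    platform: str,
    categories: Sequence[str] = ("main",),
    include_optional: bool = False,
) -> str:
    """Render conda entries for one platform as an explicit-style lock file.

    Hypothetical helper: each entry is a dict with the same keys as a
    [[package]] table (manager, platform, category, optional, url, hash).
    """
    lines = ["@EXPLICIT"]
    for entry in entries:
        if entry["manager"] != "conda":
            continue  # pip entries need separate handling; skipped in this sketch
        if entry["platform"] != platform:
            continue  # every entry belongs to exactly one platform
        if entry["category"] not in categories:
            continue  # e.g. drop "dev" packages for a production install
        if entry["optional"] and not include_optional:
            continue
        # Explicit files list one package URL per line, with the hash after '#'.
        lines.append(f"{entry['url']}#{entry['hash']}")
    return "\n".join(lines) + "\n"
```

Because the filter only reads the stored entries, the same lock file can serve CI (`categories=("main", "dev")`) and production (`categories=("main",)`) without re-solving.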

Finally, conda-lock lock --update TARGET extracts the previous solution from the lock file, uses it to populate the metadata of a fake conda env, and updates the target package and any dependencies that need to be bumped. The procedure is similar for updating pypi dependencies.
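As an illustration of what "outside the dependency graph of the target" means, the sketch below walks the `dependencies` tables of the locked entries to find everything reachable from the target; any package not in that set should keep its locked version. This is only a model of the requirement, not how the PR implements it (the PR hands the previous solution to the conda solver). The function name and the simplified name-to-names mapping are assumptions.

```python
from typing import Dict, Iterable, Set


def dependency_closure(target: str, dependencies: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the target plus every package reachable through its dependencies.

    `dependencies` maps a package name to the names it depends on
    (version constraints are ignored in this simplified sketch).
    """
    seen: Set[str] = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already visited; also guards against cycles
        seen.add(name)
        stack.extend(dependencies.get(name, ()))
    return seen


# Toy graph: only packages reachable from the target may move.
graph = {
    "photospline": ["cfitsio", "numpy", "python"],
    "numpy": ["python"],
    "cfitsio": [],
    "python": [],
    "requests": ["python"],
}
may_change = dependency_closure("photospline", graph)
must_stay = set(graph) - may_change  # here: just "requests"
```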

What do you think?

In the absence of an external interface to pip's resolver (see e.g. pypa/pip#7819), this uses Poetry's resolution logic to convert pip requirements from environment.yaml to either transitive dependencies (in the case of env output) or direct references (in the case of explicit output). In explicit mode these are emitted as comment lines that `conda-lock install` can unpack and pass to `pip install` inside of the target environment.

conda-lock now records its solution in conda-lock.toml, in a form roughly inspired by poetry.lock. Each entry has a platform and a category (e.g. "main" or "dev"), which allows you to extract a solution for a target platform and extras set without re-solving. This can be done either with `conda-lock render`, creating an environment or explicit lockfile that can be installed with vanilla conda, or `conda-lock install` to render and install in one go. This makes it possible for `conda-lock` to take all its configuration from the source file.



mariusvniekerk · 2021-11-12T15:20:14Z

conda_lock/conda_solver.py

```diff
+@contextmanager
+def fake_conda_environment(locked: Iterable[LockedDependency], platform: str):
```


Neat solution

We could also potentially make use of $PREFIX/conda-meta/pinned (https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-pkgs.html#preventing-packages-from-updating-pinning) if we want to prevent movement.

I had forgotten about explicit pinning. It looks like conda and micromamba can be convinced to apply minimal updates, but I did end up resorting to pinning to make minimal updates work with mamba. What I really want is a kind of advisory pinning, i.e. "do not update this unless it prevents you from updating the actual target," but that probably only exists deep in the libsolv configuration that mamba uses.
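For reference, the hard-pinning fallback discussed here could look roughly like the sketch below, which writes every locked package except the update target into `$PREFIX/conda-meta/pinned`, one spec per line. The helper name `write_pinned` and its arguments are hypothetical and this is not the code from the PR; it only assumes the pinned-file location and `name ==version` spec form from the conda docs linked above.

```python
from pathlib import Path
from typing import Mapping


def write_pinned(prefix: Path, locked_versions: Mapping[str, str], target: str) -> Path:
    """Pin every locked package except `target` to its current version.

    Hypothetical helper: `locked_versions` maps package name -> locked version.
    """
    conda_meta = prefix / "conda-meta"
    conda_meta.mkdir(parents=True, exist_ok=True)
    pinned = conda_meta / "pinned"
    lines = [
        f"{name} =={version}"
        for name, version in sorted(locked_versions.items())
        if name != target  # leave the target free to move
    ]
    pinned.write_text("\n".join(lines) + "\n")
    return pinned
```

Note that this is a hard pin: unlike the "advisory" pinning wished for above, it would block the update outright if the target needs a newer version of a pinned package.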

yeah thats probably a far future thing. Maybe a @wolfv question

For weird reasons I sometimes cannot comment under a comment in github. Anyways, @mariusvniekerk I didn't quite get the

yeah thats probably a far future thing. Maybe a @wolfv question

What did you mean exactly? :)

I think that was referring to #124 (comment), which got chopped up between review rounds.

Yeah. It's not necessary for this change. There is probably some work needed on the mamba side to better support that for our purposes.


mariusvniekerk · 2021-11-12T16:32:14Z

Tests are failing due to assumptions of a py39 stdlib. Python 3.7 is still under support for conda-forge so we have to keep that one at least as a minimal python version requirement.

jvansanten · 2021-11-12T16:33:04Z

> Tests are failing due to assumptions of a py39 stdlib. Python 3.7 is still under support for conda-forge so we have to keep that one at least as a minimal python version requirement.

Right, that was me being lazy. Will clean up.
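One py39-only stdlib feature named in the commit list is `functools.cache`. A common way to shim it on older interpreters is sketched below; this shows the general pattern and is not necessarily the exact shim added in this PR. The `square` function exists only to demonstrate the caching.

```python
import functools

try:
    # Available in the standard library from Python 3.9 onwards.
    from functools import cache
except ImportError:
    def cache(user_function):
        """Fallback: an unbounded lru_cache behaves like functools.cache."""
        return functools.lru_cache(maxsize=None)(user_function)


calls = []


@cache
def square(x):
    calls.append(x)  # record real invocations so the caching is observable
    return x * x
```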


maresb · 2021-11-14T21:33:14Z

This looks very interesting. I'm very interested in what @mariusvniekerk would see as a good way forward between our two PRs and the discussion of lockfile formats. Whatever the direction is, I hope we can make some rich metadata for the lockfile format. Would it make sense for me to finish out my PR?


jvansanten · 2021-11-19T10:47:16Z

I'm reasonably happy with this now. It took a lot of incremental commits to get the tests to pass on Windows, so squashing is probably advisable.


mariusvniekerk · 2021-11-19T21:59:31Z

🎆 Wow congratulations on slaying the beast

maresb · 2021-11-19T22:19:47Z

Ya, this looks extremely impressive.

mariusvniekerk · 2021-11-22T19:27:20Z

@jvansanten Feel free to squash whatever you want to squash. Going to go over this thing this afternoon.


mariusvniekerk · 2021-11-22T20:22:16Z

conda_lock/conda_solver.py

```diff
+@contextmanager
+def fake_conda_environment(locked: Iterable[LockedDependency], platform: str):
```


yeah thats probably a far future thing. Maybe a @wolfv question


wolfv · 2021-11-23T14:50:56Z

For weird reasons I sometimes cannot comment under a comment in github. Anyways, @mariusvniekerk I didn't quite get the

> yeah thats probably a far future thing. Maybe a @wolfv question

What did you mean exactly? :)

jvansanten · 2021-11-24T10:12:04Z

> For weird reasons I sometimes cannot comment under a comment in github. Anyways, @mariusvniekerk I didn't quite get the
>
> > yeah thats probably a far future thing. Maybe a @wolfv question
>
> What did you mean exactly? :)

I think that was referring to #124 (comment), which got chopped up between review rounds.

jvansanten added 17 commits October 20, 2021 15:10

Support pip dependencies from poetry

532a807

Dependencies marked with `source = "pypi"` are delegated to the pip section of the generated conda env; all others are assumed to be available from conda channels.

Preliminary update support

b4fb601

Pip only for now; conda update support is slightly trickier

Enable update support for conda

67e95e1

Get index info from tarballs

9ad0895

Significantly faster and less memory-hungry than fetching the entire index, at least for conda-forge.

Allow exact versions

bf39929

Allow platforms to be specified in pyproject.toml

de8849b

Make poetry dependency optional

fecb7cc

Refactor Lockfile as pydantic model

aace69d

Simplify multi-platform solution

8f57c77

Do not attempt to combine locked dependencies for different platforms, as they can legitimately have different versions (e.g. libgfortran5 for linux-64 is 6 major versions ahead of osx-64). Instead, allow exactly one platform, url, and hash per item.

Support osx-arm64

5631a0a

Handle non-JSON error output from micromamba

79102a8

Parse lockfile on render

03beb2e

Parse lockfile on render

c8452f7

Ignore editable pip deps

d058f02

Merge remote-tracking branch 'upstream/main' into unified-lockfile

2a5425e

mariusvniekerk reviewed Nov 12, 2021


jvansanten and others added 7 commits November 12, 2021 17:48

Treat None as empty update

8a7fbb0

Co-authored-by: Marius van Niekerk <marius.v.niekerk@gmail.com>

Use Mapping for read-only dicts

323ec0c

Remove typing-extensions mapping

8c85671

Now that regro/cf-scripts#1465 is merged

Explicitly check for conda manager

63a8cf5

Remove stray spec_hash() left over from testing

07a61f8

Treat None as empty update

445f8ee

ci: add poetry to requirements

a716ed7

jvansanten added 2 commits November 15, 2021 15:07

Add shim for functools.cache on py < 3.9

f1e866d

Remove _get_repodata_for_package

3cc7806

Unneeded now that there is a structured lockfile

jvansanten added 6 commits November 19, 2021 14:25

Record paths of source files in locked metadata

010c89e

This makes it possible for 3rd-party tools to find a single, canonically-named lock file and use that to find the arbitrarily-named source files it was created from.

cleanup: remove unused imports

c37cae7

Use version validation on install

8e326d1

Use native paths to find relative paths

2abb8ff

in an attempt to get relative paths to work on Windows

Require source files to exist

372c14f

PosixPath.resolve() happily returns an absolute path to a nonexistent file if it has no parents, but WindowsPath.resolve() just stops if strict=False. This causes os.path.commonpath to choke on the resulting relative path.
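The `commonpath` failure can be reproduced on any OS by calling the Windows flavour of `os.path` directly through `ntpath`. The sketch below only demonstrates the error when an absolute and a relative path are mixed; the wrapper name `common_root_or_none` and the example paths are made up for illustration.

```python
import ntpath
from typing import Optional, Sequence


def common_root_or_none(paths: Sequence[str]) -> Optional[str]:
    """Return the common parent of Windows-style paths, or None on failure.

    ntpath.commonpath raises ValueError when absolute and relative paths
    are mixed, which is what an unresolved (still relative) path triggers.
    """
    try:
        return ntpath.commonpath(paths)
    except ValueError:
        return None


# Both paths absolute: a common root exists.
ok = common_root_or_none([r"C:\proj\conda-lock.yml", r"C:\proj\src\environment.yml"])

# One path left relative (as if resolve() had not made it absolute): no common root.
broken = common_root_or_none([r"C:\proj\conda-lock.yml", "environment.yml"])
```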

tests: ensure fake source file exists

7dcc4de

mariusvniekerk reviewed Nov 22, 2021


jvansanten added 7 commits November 24, 2021 11:05

Switch lockfile format from toml to yaml

f8041e2

Clarify relationship between LINK and FETCH

91ed7a4

Add type annotations for parse_conda_requirement

831bca2

Document extensions to environment.yml

545cfdd

Take a lockfile path from the command line

1f5f224

Use source file paths from lockfile by default

df566dc

Add universal tags for all osx variants

9ac7af6

mariusvniekerk merged commit 754d754 into conda:main Nov 29, 2021

jvansanten mentioned this pull request Dec 9, 2021

Support solving for dependencies that are only installable via pip #122

Closed

itamarst mentioned this pull request Dec 13, 2021

Channels list should match that of the original environment.yaml simplistix/picky-conda#3

Closed

scottyhq mentioned this pull request Jan 11, 2022

update schedule section of cookiecutter uwhackweek/jupyterbook-template#41

Merged

scottyhq mentioned this pull request Feb 1, 2022

Documented basic usage in README out of sync with released pypi version (0.13.2) #138

Closed

thomasjpfan mentioned this pull request Feb 10, 2022

Pin dependencies version in CI with lock files with a bot that auto-updates the lockfiles scikit-learn/scikit-learn#22425

Closed

mariusvniekerk mentioned this pull request Mar 3, 2022

Support pip packages and pip GH installs in a conda environment.yaml file list. #4

Closed

matthewfeickert mentioned this pull request Oct 13, 2023

Provide support or recommendation for how to interact with conda-lock lockfiles jupyterhub/repo2docker#1312

Open
