Alphabetic ordering of packages #491
Comments
The main problem with the current setup is that when you add a new package, the lock file is shuffled. With an alphabetically sorted setup, it's a breeze to audit the lock file changes. I hope you consider and accept this proposal.
I suspect that this is an artifact of the old explicit format, where the package URLs are listed without the dependencies. And because the dependency info is not included with the explicit format, the only way to convey the correct installation order is to topologically sort the dependencies beforehand. With the new format, since topological sorting is fast and easy, I see no reason to do it beforehand. (@mariusvniekerk, please correct me in case I'm missing something.) I'd be happy to consider a PR. (I think it should be really trivial.)
Thanks a lot @FelixSchwarz for looking through the history. It really seems like toposort was chosen simply to fix the package order. It seems to me like as long as we ensure that explicit lockfiles are toposorted, then there should be no problem with sorting the unified YAML format alphabetically. Also, when toposort is computed breadth-first like here, then we could easily track the
So one thing that is done with the sorting is that it explicitly uses the same topological sort as
It should be relatively simple to perform alphabetic (maybe (platform, manager, package)) ordering on serializing the Lockfile to disk and to just call conda-lock/conda_lock/lockfile/v2prelim/models.py Lines 74 to 75 in f64b74e
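A rough sketch of what such an ordering could look like when serializing. The `LockedPackage` dataclass and its fields are hypothetical stand-ins for illustration, not conda-lock's actual models, and the (platform, manager, name) key is just the ordering floated above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockedPackage:
    # Hypothetical stand-in for a lockfile entry; not conda-lock's real model.
    name: str
    manager: str   # e.g. "conda" or "pip"
    platform: str  # e.g. "linux-64"
    version: str


def sorted_for_serialization(packages):
    """Return the packages ordered by (platform, manager, name)."""
    return sorted(packages, key=lambda p: (p.platform, p.manager, p.name))


packages = [
    LockedPackage("zlib", "conda", "linux-64", "1.2.13"),
    LockedPackage("requests", "pip", "linux-64", "2.31.0"),
    LockedPackage("numpy", "conda", "osx-64", "1.26.0"),
    LockedPackage("numpy", "conda", "linux-64", "1.26.0"),
]
ordered = sorted_for_serialization(packages)
```

Because the key depends only on each entry's own fields, adding or removing an unrelated package never moves the existing entries relative to one another.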
We should agree on an ordering of the keys. I actually had in mind Putting
With categories, is it possible to have multiple versions of the same package? Then we also have to sort by category/version/build.
@baszalmstra, nope, there's a single solution having fixed versions over all categories. Then categories allow for selecting a subset of packages within that single solution.
Thanks @baszalmstra for the excellent suggestion! This is now available in v2.5.0.
Checklist
What is the idea?
I noticed that conda-lock.yml files are topologically sorted. I was wondering why this was chosen over alphabetic sorting. It feels to me that sorting the packages alphabetically makes the format more stable than topological sorting. If a package version changes and it adds or removes a dependency, this might change the order of packages in the file, resulting in a large change when viewing the diff.
Why is this needed?
Ideally, small changes to the lock file would also result in small changes in a diff/patch. Large diffs cause reviewers to skip reviewing lock-file changes, even though a small change might be significant.
Since the dependencies of packages are also stored, the topological sorting of packages can relatively easily be reconstructed from the file itself (Mamba already does this, rattler doesn't need it).
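To illustrate the reconstruction, here is a minimal sketch using Python's standard-library `graphlib` (Python 3.9+). The `dependencies` mapping is made-up example data standing in for what a lock file stores, and `install_order` is a hypothetical helper, not a function from conda-lock, Mamba, or rattler:

```python
from graphlib import TopologicalSorter

# Made-up example data: package name -> names of the packages it depends on.
dependencies = {
    "numpy": {"libblas", "python"},
    "python": {"openssl", "zlib"},
    "libblas": set(),
    "openssl": set(),
    "zlib": set(),
}


def install_order(deps):
    """Return an order in which every package comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = install_order(dependencies)
```

The on-disk order of the entries does not matter here: the same dependency mapping can be read from an alphabetically sorted file and toposorted at install time.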
What should happen?
Instead of sorting the packages in the file topologically we sort them by name, by platform, by version. I think this will result in the smallest possible diff when:
Which I think are the most common causes of change.
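A small sketch of the diff-size argument, using made-up package data and a simplified one-line-per-package rendering (not the real conda-lock.yml format). Entries are sorted by (name, platform, version), and the hypothetical `render` helper stands in for serialization:

```python
import difflib


def render(packages):
    """Render (name, platform, version) tuples as sorted, lockfile-like lines."""
    return [
        f"{name} {platform} {version}"
        for name, platform, version in sorted(packages)
    ]


before = [
    ("numpy", "linux-64", "1.26.0"),
    ("python", "linux-64", "3.11.5"),
    ("zlib", "linux-64", "1.2.13"),
]
# Adding one package should touch exactly one line of the rendered file.
after = before + [("openssl", "linux-64", "3.1.3")]

diff = list(difflib.unified_diff(render(before), render(after), lineterm="", n=0))
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
```

With this ordering, `changed` contains only the single added line for the new package; none of the existing lines move.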
Additional Context
Would love to hear your thoughts! :)