Replies: 9 comments 6 replies
-
@tgamblin This is the proposal for the work I want to lock myself in a room and get a jump on next week, when there won't be any meetings. @rblake-llnl I think you will find this interesting.
-
Should this be a discussion instead of an issue?
-
@adamjstewart I'm not sure. Do we want to push all feature requests to discussions? If so then definitely yes.
-
Should we have a separate repo for proposals like this? e.g., like k8s KEPs: https://github.com/kubernetes/enhancements. It might be useful to add this written step to the design process for larger features, not just for discussions but to give people a chance to see the full idea. I think Spack's getting large enough that the writing will help -- there are a lot of interconnected issues for many of the features.
-
Personally I think issues are fine for minor feature requests, but big changes that require discussion and planning might be better as discussions. I think a formal system for SEPs (Spack Enhancement Proposals) sounds like overkill. Anyway, sorry for derailing this conversation. Feel free to hide these comments as off-topic once we make a decision.
-
We could call them "Spack Upgrades" just to confuse anyone who has ever used another package manager...
-
Converting this to a discussion; I agree large feature requests should be discussions.
-
1. The git ref / version congruency assumption is false, but I think it still works!
It seems that However, I think that something like adding a variant
To be horribly pedantic because this word is used a lot: I think interchangeably might be a more appropriate replacement than seamlessly. Could be wrong!
2. [nonblocking] Pedantry
Was the intention here to say "Fetching git versions is a previously-solved problem"? The performance concerns described in 2C seem to contradict the "fundamentally easy" part, imho. I could be missing something here.
Seems great here, but I might also recommend looking for prior art on "tracking git versions through other version systems". I'm not sure if svn -> git will have to perform some of this interspersing of versions, for example. But this section looks good.
imho keep this as a followup issue even if we can't do it for v1 of this feature!
One possible way to avoid this type of edge case is to adopt a deterministic strategy like sbt's dynver plugin, which records more than just git info (the version string can also be used to determine, e.g., whether the working tree is dirty, and it carries a timestamp).
3. Counterproposal to "Clone Management" via Fetching Tags Only
Why would cloning for each comparison be necessary? Do we need to have their source code at some point to infer the version? I thought that the list of tags and their mapping to git checksums was defining the version comparison process. Have we investigated methods to avoid performing a clone each time?
Proposed v1/v2 tag fetching
I believe there is a v1 and a v2 standard that git servers perform for "ref advertisement", which is a process we can latch onto specifically to avoid reading any other git data. That version is controlled by the git server administrator, but a v2 request can gracefully degrade into a v1 version. Most git servers open HTTP and GIT interfaces, as well as SSH for places like github.
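As a rough sketch of reading only the advertised refs: `git ls-remote --tags` lists a remote's tags without fetching any objects. The helper names below (`parse_ls_remote`, `list_remote_tags`) are hypothetical and not part of Spack; the sketch assumes a `git` executable on the PATH and the usual tab-separated `<sha>\t<ref>` output, where `^{}` entries give the commit an annotated tag points at.

```python
import subprocess
from typing import Dict


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map tag name -> commit SHA from `git ls-remote --tags` output."""
    tags: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            # Peeled entry: the commit behind an annotated tag wins.
            tags[name[:-3]] = sha
        else:
            # Lightweight tag, or the tag object of an annotated tag.
            tags.setdefault(name, sha)
    return tags


def list_remote_tags(url: str) -> Dict[str, str]:
    """Ask the server for its tags only; no clone, no object download."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", url],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ls_remote(result.stdout)
```

Calling `list_remote_tags("https://github.com/spack/spack")` would need network access; the parser can be exercised offline on captured output.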
Example v1
> git init
> git fetch --tags https://github.com/spack/spack
> git tag
NCSA-v2
efischer/docs-v1
old-install-layout
releases/latest
v0.10.0
...
4. Sync Up Downstream
The reason I am familiar with git ref checkouts is the same reason that led me to create #20359 and #20407 -- because I was extremely familiar with a git FUSE project which made checkouts instantaneous and removed the need for explicit push/pull by syncing new refs remotely as they were created. This article is also useful for describing the aspects of git that can contribute to performance issues (as in the precise sequence of internal changes, not just the individual commands): https://www.atlassian.com/git/tutorials/monorepos. Completely separately from the above, I wanted to use this opportunity to introduce some situations those two PRs could become useful in, and I think some of your proposal elements were ideal for this.
This seems like a lot of extra error-handling and user behavior we'd need to define (a whole new service protocol of sorts) in order to support a feature that works purely by comparing specs. It reminds me of how java URIs used to contact DNS.
5. #20407 Mirror requires no modification to support VCS, or any other feature.
However, if we used #20407, we would have a few more constraints:
On the Spack user side:
6. #20359 Avoid any Unnecessary I/O
-
Should we be concerned about tags being removed or replaced in a git repo?
-
1. Motivation
Many users pin packages to particular git commits, tags, or branches, rather than released versions. This is particularly relevant to developers, who often track several packages through their release processes simultaneously, with version constraints that propagate through their entire DAG. Therefore, Spack should allow users to seamlessly treat git refs, and other Version Control System (VCS) references, as versions in a Spack spec, while maintaining appropriate comparisons between those versions and the versions, version ranges, and version lists used as control-flow in the Spack package.
2. Approach
Fetching git versions is a fundamentally easy problem -- the only difficulty there is determining when to use git for packages that have both git and url download options. The real difficulty is how to build from versions that are not directly comparable to the (usually semver) version constraints in the build recipe.
2.A Comparing Versions
For version comparisons, we need to compare a git commit to a semver version. Assuming the release versions are tagged in git, it is feasible to compare them to a git version. To compare semver S to git version G, first we compute the merge-base of S (as tagged in git) with the default branch, MS, and the merge-base of G with the default branch, MG. If MG is an ancestor of MS, G is "less than" S, and vice-versa. If MG and MS are equal, then G is "less than" S if it is an ancestor of S, and vice-versa. If MG = MS and G and S are not directly comparable, then G and S are not comparable versions, and neither is "less than" the other, but they are not equal.
If the release versions are not tagged in git, there is no automatic way to compare them to git versions. However, the package file may contain a translation from release versions to git commits, which would allow the prior algorithm to proceed for such a package.
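The comparison in section 2.A can be sketched on a toy commit graph. Everything here is illustrative: the function names and the `EXAMPLE` graph are hypothetical, and the merge-base is simplified to "the first commit on the default branch's first-parent chain that is an ancestor of the given commit". Against a real repository one would more likely ask git itself (`git merge-base` and `git merge-base --is-ancestor`).

```python
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]  # commit id -> list of parent ids


def ancestors(graph: Graph, commit: str) -> Set[str]:
    """Return the commit plus everything reachable through parent links."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def is_ancestor(graph: Graph, older: str, newer: str) -> bool:
    """True if `older` is `newer` or one of its ancestors."""
    return older in ancestors(graph, newer)


def merge_base_with_default(graph: Graph, commit: str, default_tip: str) -> str:
    """Simplified merge-base: walk the default branch back by first parent."""
    reachable = ancestors(graph, commit)
    node = default_tip
    while node not in reachable:
        parents = graph[node]
        if not parents:
            raise ValueError(f"{commit} shares no history with the default branch")
        node = parents[0]
    return node


def compare(graph: Graph, g: str, s: str, default_tip: str) -> Optional[int]:
    """Return -1 if g < s, 1 if g > s, 0 if identical, None if incomparable."""
    if g == s:
        return 0
    mg = merge_base_with_default(graph, g, default_tip)
    ms = merge_base_with_default(graph, s, default_tip)
    if mg != ms:
        # Both merge-bases lie on the default branch, so one precedes the other.
        return -1 if is_ancestor(graph, mg, ms) else 1
    if is_ancestor(graph, g, s):
        return -1
    if is_ancestor(graph, s, g):
        return 1
    return None  # same merge-base, divergent histories


# Toy history: a <- b <- c <- d on the default branch (tip d, release tagged at c),
# feature commits f1 and f2 branch from b, and r1 is a release-branch commit on c.
EXAMPLE: Graph = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["c"],
    "f1": ["b"], "f2": ["b"], "r1": ["c"],
}
```

With this graph, `compare(EXAMPLE, "f1", "c", "d")` treats the feature commit as older than the release tagged at `c`, while `f1` and `f2` come out incomparable because they share a merge-base and neither contains the other.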
2.B When to resolve comparisons
There are two primary options for when to resolve comparisons between git versions and release versions.
2.B.i Compare as necessary
The easiest way to check comparisons between git versions and release versions would be to compute the comparison as necessary. This involves the least complex caching (although we may cache individual comparisons with this method).
2.B.ii Compare once, use range
Initial planning for this effort focused on the possibility of computing a "range" for each git version of the closest release versions that could be ascertained to be "less than" and "greater than" the git commit. While this method initially seemed to create fewer comparisons (and more obvious caching) than the method to compare as necessary, caching individual comparisons in the ad-hoc method should result in similar performance with less complexity of design.
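Caching individual comparisons could look roughly like the sketch below. `ComparisonCache` is a hypothetical name, and the wrapped `compare` callable stands in for whatever comparison routine is eventually implemented; it is assumed to return -1, 0, 1, or None for incomparable pairs. The sketch also assumes keys are resolved commit identifiers, since a cache keyed on a moving branch name would go stale.

```python
from typing import Callable, Dict, Optional, Tuple

Comparison = Optional[int]  # -1, 0, 1, or None when the pair is incomparable


class ComparisonCache:
    """Memoize pairwise version comparisons, including the mirrored pair."""

    def __init__(self, compare: Callable[[str, str], Comparison]) -> None:
        self._compare = compare
        self._results: Dict[Tuple[str, str], Comparison] = {}
        self.misses = 0  # number of times the underlying comparison ran

    def __call__(self, g: str, s: str) -> Comparison:
        key = (g, s)
        if key not in self._results:
            self.misses += 1
            result = self._compare(g, s)
            self._results[key] = result
            # The reverse comparison follows by symmetry, so store it as well.
            self._results[(s, g)] = None if result is None else -result
        return self._results[key]
```

A plain dictionary is used instead of `functools.lru_cache` so that the mirrored pair can be filled in without a second call.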
2.C When to clone
While git versioning will be most useful to developers, it is also of use to large-scale coordination projects like the XSDK. For that reason, we will avoid relying on developer packages (specified by the develop key in a Spack environment) for version comparisons. However, cloning for each comparison would be prohibitive from a performance perspective, in addition to being problematic on systems without internet access.
To resolve this, we propose to modify the structure of Spack mirrors to better support VCS software. For each type of VCS understood by Spack (currently git, hg, and svn) we propose a top-level directory under the Spack mirror which contains a single instance of the VCS repo per package. The builtin Spack mirror (the "source cache") will attempt to pull this repo any time it encounters a git ref it cannot find, as well as any time it checks out a branch. User-made mirrors will be static, and will throw errors on git refs they cannot find, as will the builtin mirror when fetching does not resolve the issue. Spack will attempt to pull from the internet when such an error is raised, but fail gracefully if it cannot be reached.
3. Project Breakdown
We propose three separate pull requests to Spack:
3.A Mirror VCS software separately
This PR will establish the mirror subdirectories for each VCS system, and manage when we pull/fail for unfound git refs for each type of mirror. It will leave for future work the details for non-git VCS systems, while reserving their subdirectory names for that future use.
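The mirror behaviour from section 2.C might be modelled as in the sketch below. All class and function names are hypothetical and do not correspond to existing Spack APIs; the per-VCS layout is assumed to be `<mirror root>/<vcs>/<package>`, and the "pull whenever a branch is checked out" rule is left out for brevity.

```python
from pathlib import PurePosixPath
from typing import Callable, Iterable, List, Optional, Set

SUPPORTED_VCS = ("git", "hg", "svn")


def vcs_repo_path(mirror_root: str, vcs: str, package: str) -> PurePosixPath:
    """One repository per package under a per-VCS top-level directory."""
    if vcs not in SUPPORTED_VCS:
        raise ValueError(f"unsupported VCS: {vcs}")
    return PurePosixPath(mirror_root) / vcs / package


class RefNotFound(LookupError):
    """Raised when a mirror does not contain the requested ref."""


class StaticMirror:
    """User-made mirror: never updated, errors on refs it does not have."""

    def __init__(self, refs: Iterable[str]) -> None:
        self.refs: Set[str] = set(refs)

    def lookup(self, ref: str) -> str:
        if ref not in self.refs:
            raise RefNotFound(ref)
        return ref


class SourceCache(StaticMirror):
    """Builtin mirror: tries one pull from upstream before giving up."""

    def __init__(self, refs: Iterable[str], pull: Callable[[], Iterable[str]]) -> None:
        super().__init__(refs)
        self._pull = pull  # returns the refs available upstream

    def lookup(self, ref: str) -> str:
        if ref not in self.refs:
            try:
                self.refs.update(self._pull())
            except ConnectionError:
                pass  # offline: fall through to the ordinary error below
        return super().lookup(ref)


def resolve(ref: str, mirrors: List[StaticMirror]) -> Optional[str]:
    """Try each mirror in order; return None instead of raising if all fail."""
    for mirror in mirrors:
        try:
            return mirror.lookup(ref)
        except RefNotFound:
            continue
    return None
```

Returning None from `resolve` stands in for "fail gracefully"; a real implementation would presumably surface a user-facing error message instead.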
3.B Automatic interpolation of git versions by FetchStrategy
This PR will allow Spack to (a) determine whether an unknown version is a git reference or an unknown release version, and (b) fetch a package based on a git reference when it is not known to the package file, even when the type of the reference (branch, tag, commit) is unknown. Other VCS systems will be left for future work.
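One possible heuristic for step (a), offered only as a sketch: check the requested string against versions the package already declares, then against the tags and branches known for the repository, and finally against the shape of a full commit hash. The function name, the lookup order, and the restriction to full 40-character SHA-1 hashes are all assumptions; abbreviated hashes are deliberately not handled.

```python
import re
from typing import Iterable

_FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def classify_version(
    version: str,
    known_versions: Iterable[str],
    tags: Iterable[str],
    branches: Iterable[str],
) -> str:
    """Return 'release', 'tag', 'branch', 'commit', or 'unknown'."""
    if version in set(known_versions):
        return "release"  # already declared in the package file
    if version in set(tags):
        return "tag"
    if version in set(branches):
        return "branch"
    if _FULL_SHA1.fullmatch(version):
        return "commit"
    return "unknown"  # could be an unlisted release version or a typo
```

The "unknown" result is where a fallback would be needed, for example treating the string as an unlisted release version.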
3.C Automatic comparison of git versions to release versions
Implement the algorithm described in section 2.A, along with appropriate caching of results for performance.
3.D Future PRs implied by this work, but not included
3.D.i Mirror HG software separately
3.D.ii Mirror SVN software separately
3.D.iii Automatic interpolation of HG versions
3.D.iv Automatic interpolation of SVN versions
3.D.v Automatic comparison of HG versions to release versions
This will require independent research into the appropriate comparison algorithm
3.D.vi Automatic comparison of SVN versions to release versions
This will require independent research into the appropriate comparison algorithm