[Upstream Issue] Libgit2 and transitively Stacked Git do not support sparse checkouts #195
Comments
Thanks for writing up this issue, @NonLogicalDev. It would be nice if libgit2 supported sparse checkout natively. That said, it doesn't seem like that project has the will to take on the risk of sparse checkout any time soon. When reimplementing StGit in Rust, a key design decision was how to interact with the git repository. Initially, I tried orienting almost entirely to libgit2--it is fast and its abstractions are pretty nice. However, as I implemented more and more of StGit, I ran into a bunch of compatibility problems; tests would mostly pass, but various edge cases would fail. Patch application and merge resolution (in various contexts) just don't work correctly. So where libgit2 didn't get the job done, I reverted to using Coming back to sparse checkouts, it might be that there is yet another subset of operations that StGit is currently doing with libgit2 that could be converted to Perhaps the line we can draw for libgit2's use in StGit is to only use it as an object and configuration database. I.e. looking up refs, commits, trees, blobs, and config, but any significant mutations of repository or working tree state should be done by I can think of a couple of concrete next steps for this issue:
If you do not use worktrees for your sparse checkout, the documentation implies that it should be fine to delete that configuration option ( A coworker of mine has a fork of Rust's git2 which implements (ignores? not sure) the |
@arxanas, you seem to be pointing out a detail related to how the sparse checkouts feature intersects with the multiple worktrees feature, which is interesting and potentially useful toward the intersection of those two features in StGit, but does not provide a solution path for StGit support of sparse checkouts. I just want to make sure I'm not missing something here. |
@jpgrayson For the end user, you can restore StGit support for sparse checkouts, assuming you don't also use multiple worktrees for that same checkout, by deleting the problematic Or you as the developer can update StGit to use the forked |
My assumption is that StGit has a plethora of problems with sparse checkouts, independent of their intersection with multiple worktrees. I performed a quick test scenario where I enabled sparse checkouts on a small subset of a Linux kernel repo using a vanilla/default worktree. The patch resulting from Performing that same scenario again, but with From my point of view, the paths toward solving sparse checkouts with StGit and multiple worktrees with StGit are mostly orthogonal, and solving the major sparse checkout issues does not require using a libgit2 patched to comprehend per-worktree configs. I still feel like I may be missing a point you're trying to make, @arxanas, so apologies if I'm not understanding your point.
@arxanas I had the same experience as @jpgrayson when testing on my repo. I tried both building the PR that ignores the per-worktree config extension and simply unsetting the worktree config extension enable switch. In read-only use the experience was butter smooth, like it ought to be; the problems start when you try to interact with the actual index. I.e. At that point LibGit really goes off the rails, as it thinks that all of the It is a difficult tradeoff to make: support only partial functionality (i.e. no sparse checkouts) and keep the blazing speed of LibGit. I really like the Rust version, it is so darn fast! Big props @jpgrayson. I just wish I could use it for work haha.
Sorry, I misunderstood the scope of this issue to be about crashing only. My own project git-branchless more or less worked after deleting I use the hybrid approach where libgit2 is mostly an object database, and I shell out to Feel free to copy any code from there which you find useful. Maybe of note:
Well, I think you will still need to deal with the fact that |
Thanks for this additional context, @arxanas. And thanks for the links to |
This relates to sparse checkout support in StGit (#195).

Since libgit2 does not support git's sparse checkout feature, getting worktree file status via the git2 API in a sparsely checked-out worktree indicates that all known files outside the sparse checkout cone(s) are deleted. This is the first of several problems preventing `stg refresh` from working correctly in a sparsely checked-out worktree.

The next problem is that when building the index for the refresh, the git2 API is further tripped up and "adds" deletion entries for all the out-of-cone files.

By using `git` (via the stupid module) for status and index operations, `stg refresh` is made to work with sparse checkout.

The stupid module is converted from a single file into a directory (stupid/mod.rs) to make room for some higher-level abstractions for status and changed files.

The new stupid::status module provides abstractions for handling the output of `git status --porcelain=v2`. Care is taken to minimize allocations and defer processing. Allocations are minimized by having Statuses hold onto the status output buffer (the only allocation) and hand out references into that buffer from its various methods.

A similar approach is used in stupid::diff, which provides an abstraction over the modified file list coming from `git diff-tree --name-only`.

To make `stg pop` work, worktree and index status checks are now performed using `git status` instead of libgit2. Eventually, all StGit commands will need to use this approach to avoid problems in sparse checkout worktrees.
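The buffer-holding approach described in this commit message can be illustrated with a minimal sketch. This is not StGit's actual stupid::status code: the `Statuses` and `Entry` types below, and their methods, are hypothetical names chosen for illustration. The sketch assumes `git status --porcelain=v2` was run without `-z`, handles only ordinary (`1`), unmerged (`u`), and untracked (`?`) lines, and does not unquote paths that git quotes.

```rust
/// Owns the raw output of `git status --porcelain=v2` (the only allocation).
/// Hypothetical type for illustration; not StGit's real implementation.
pub struct Statuses {
    buf: String,
}

/// One parsed status line. All string fields borrow from the owning buffer.
#[derive(Debug, PartialEq)]
pub enum Entry<'a> {
    /// Ordinary changed entry: `1 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <path>`
    Changed { xy: &'a str, path: &'a str },
    /// Unmerged entry: `u <XY> <sub> <m1> <m2> <m3> <mW> <h1> <h2> <h3> <path>`
    Unmerged { xy: &'a str, path: &'a str },
    /// Untracked entry: `? <path>`
    Untracked { path: &'a str },
}

impl Statuses {
    pub fn from_output(buf: String) -> Self {
        Statuses { buf }
    }

    /// Parse lazily, line by line; nothing is copied out of the buffer.
    pub fn entries(&self) -> impl Iterator<Item = Entry<'_>> + '_ {
        self.buf.lines().filter_map(parse_line)
    }

    /// True if any entry is unmerged, i.e. there are conflicts.
    pub fn has_conflicts(&self) -> bool {
        self.entries().any(|e| matches!(e, Entry::Unmerged { .. }))
    }
}

fn parse_line(line: &str) -> Option<Entry<'_>> {
    let (kind, rest) = line.split_once(' ')?;
    match kind {
        "1" => {
            // Eight space-separated fields; the path is last and may contain spaces.
            let mut fields = rest.splitn(8, ' ');
            let xy = fields.next()?;
            let path = fields.nth(6)?;
            Some(Entry::Changed { xy, path })
        }
        "u" => {
            // Ten space-separated fields; the path is last.
            let mut fields = rest.splitn(10, ' ');
            let xy = fields.next()?;
            let path = fields.nth(8)?;
            Some(Entry::Unmerged { xy, path })
        }
        "?" => Some(Entry::Untracked { path: rest }),
        // Header lines (`#`), renames (`2`), and ignored files (`!`) are skipped here.
        _ => None,
    }
}
```

Because every `Entry` borrows from the single `String` held by `Statuses`, a check such as `has_conflicts()` can run without allocating per-file strings.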
This relates to sparse checkout support (#195). Conflict checks and index/worktree cleanness checks are converted to using Stupid::statuses() for all StGit commands. This eliminates a common class of problem that can prevent StGit from working in a sparse checkout worktree.
This affects transaction-time checkout and the various checkouts performed by `stg branch`. This further improves StGit support for sparse checkout worktrees (#195).
The new Stupid::with_temp_index() method replaces the Index::with_temp_index() and Index::with_temp_index_file() extension methods. This is motivated by sparse checkouts (#195) since git2::Index does not comprehend sparse checkouts.
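As a rough illustration of running git against a temporary index rather than git2::Index, here is a minimal sketch using only the standard library. The free functions `with_temp_index` and `git_with_index` below are hypothetical and are not the real Stupid::with_temp_index() method; the temporary file naming is also an assumption. The only git behavior relied upon is that git honors the `GIT_INDEX_FILE` environment variable.

```rust
use std::ffi::OsStr;
use std::path::Path;
use std::process::Command;

/// Build a `git` command that reads and writes `index_path` instead of the
/// default index. Hypothetical helper for illustration.
pub fn git_with_index<I, S>(index_path: &Path, args: I) -> Command
where
    I: IntoIterator<Item = S>,
    S: AsRef<OsStr>,
{
    let mut cmd = Command::new("git");
    cmd.env("GIT_INDEX_FILE", index_path).args(args);
    cmd
}

/// Run `f` with the path of a temporary index file inside `dir`, then remove
/// the file if `f` (or git) created it. The file name is an assumption.
pub fn with_temp_index<T>(dir: &Path, f: impl FnOnce(&Path) -> T) -> T {
    let path = dir.join(format!("index-temp-{}", std::process::id()));
    let result = f(&path);
    // Ignore the error: the file may never have been created.
    let _ = std::fs::remove_file(&path);
    result
}
```

A caller might, for example, pass a closure that runs `git_with_index(path, ["read-tree", "HEAD"])` followed by further index commands, all of which then see the same temporary index and leave the default index untouched.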
This relates to sparse checkout support (#195). git2 does not comprehend sparse checkouts, and thus using any index <=> worktree operations is likely to break sparse checkouts. Most important index operations have already been converted to using git commands; this gets all the remaining instances. N.B. in run_pre_commit_hook(), GIT_INDEX_FILE is no longer being supplied explicitly. In all of StGit's use cases, either using the default index or inheriting GIT_INDEX_FILE from the calling environment should be okay.
The new t7000-sparse-checkout.sh test script exercises various StGit commands in the context of a sparse checkout worktree. Relates-to: #195
I've done some work on this problem. See above commits. For all the use cases I've tried, StGit now seems to work in a sparse checkout worktree. I'd love to know if this works for your use cases, @NonLogicalDev. Maybe checkout the new Some additional details: The strategy was to use
These changes do seem to affect runtime performance, at least as measured by how long the test suite takes, perhaps on the order of 10%. This seems like a small price to pay for correctness. Also, these changes shouldn't have much, if any, effect on the most latency-sensitive StGit commands, such as
@jpgrayson ❤️ , let me get to testing this right away! |
I think I may have uncovered another issue:
Strange one:
I will cut a separate issue with that, I think I just happened to have old |
The stack upgrade path (which is where the "Malformed version 4 meta" error originates) is triggered by the presence of the I'll note that stack metadata version 4 was introduced in StGit 1.0 and stack metadata version 5 was introduced in StGit 1.2. Also, I believe this metadata issue is orthogonal to sparse checkout. You should be able to run StGit on a new branch to test sparse checkout. And you can probably recover from the metadata problem by deleting |
@jpgrayson I played around with new But I had to disable
I will need to do further testing if sparse checkout related commands still work with that setting un-set. |
Thanks for the note about |
Since 0.13, a few changes have been made to git2, including rust-lang/git2-rs#791, which should address the extensions.worktreeConfig issue in #195. The upgrade required fixing uses of [ConfigEntries][1] because it is no longer an iterator. However, it provides an API that lends itself well to `while let`, giving a for-loop-like experience, so this was an easy fix. [1]: https://docs.rs/git2/0.15.0/git2/struct.ConfigEntries.html
AFAIK StGit is doing okay with sparse checkouts at this point. Closing this issue. We can reopen as necessary. |
The vanilla STG still does not work:
I have created a PR to fix this with a workaround. It has been working without issues for us at my company for the past few months. |
By default, the `libgit2` library fails any operation if it encounters an unrecognised extension configured in a repo, as per the Git documentation.

ref: https://git-scm.com/docs/repository-version

> If a version-1 repository specifies any extensions.* keys that the
> running git has not implemented, the operation MUST NOT proceed.
> Similarly, if the value of any known key is not understood by the
> implementation, the operation MUST NOT proceed.

If a repository uses sparse checkout mode, it force-enables the `worktreeconfig` extension, which allows setting configuration on a per-worktree basis, to enable some of the sparse checkout functionality.

Since `stg` uses the `libgit2` library only as an interface to the object database and for looking up aliases in the git configuration (i.e. read-only operations), and uses the git CLI for everything else, it is relatively safe to ignore per-worktree git configuration.

This patch adds the `worktreeconfig` extension to the `libgit2` extension whitelist.

CLOSES stacked-git#195
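The rule quoted from the Git documentation, and the effect of whitelisting an extension, can be sketched roughly as follows. This is not libgit2's code: the function `check_repo_format` and its parameters are hypothetical, and treating version 0 as having no extension check is a simplifying assumption of this sketch.

```rust
/// Decide whether a reader may operate on a repository, given its
/// `core.repositoryformatversion`, the `extensions.*` keys it sets, and the
/// extension names the reader claims to understand (the "whitelist").
/// Hypothetical function for illustration only.
pub fn check_repo_format(
    version: u32,
    extensions: &[&str],
    known: &[&str],
) -> Result<(), String> {
    match version {
        // Simplifying assumption: extensions are not checked for version 0.
        0 => Ok(()),
        1 => {
            for ext in extensions {
                // Config variable names are case-insensitive, so
                // `worktreeConfig` and `worktreeconfig` name the same key.
                if !known.iter().any(|k| k.eq_ignore_ascii_case(ext)) {
                    return Err(format!("unsupported extension: {}", ext));
                }
            }
            Ok(())
        }
        v => Err(format!("unsupported repository format version: {}", v)),
    }
}
```

With `known` lacking `worktreeconfig`, a version-1 repository that sets `extensions.worktreeConfig` is rejected; adding the name to `known` lets the same repository through, which is the effect the patch describes.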
`libgit2` library by default fails any operation if it encounters any unrecognised extensions configured in a repo, as per Git documentation. ref: https://git-scm.com/docs/repository-version > If a version-1 repository specifies any extensions.* keys that the > running git has not implemented, the operation MUST NOT proceed. > Similarly, if the value of any known key is not understood by the > implementation, the operation MUST NOT proceed. If a repository is using a sparse checkout mode it force enables the `worktreeconfig` extensions which allows setting configuration in repo on a per worktree basis, to enable some of the sparse checkout functionality. Since `stg` is using `libgit2` library only as an interface to object database and for looking up alias lookup from git configuration (i.e. Read Only operation) and uses git cli for everything else, it is relatively safe to ignore per worktree git configuration. This patch adds `worktreeconfig` extension to the `libgit2` extension whitelist. CLOSES stacked-git#195 Signed-off-by: Oleg Utkin <oleg@nonlogical.io>
`libgit2` library by default fails any operation if it encounters any unrecognised extensions configured in a repo, as per Git documentation. ref: https://git-scm.com/docs/repository-version > If a version-1 repository specifies any extensions.* keys that the > running git has not implemented, the operation MUST NOT proceed. > Similarly, if the value of any known key is not understood by the > implementation, the operation MUST NOT proceed. If a repository is using a sparse checkout mode it force enables the `worktreeconfig` extensions which allows setting configuration in repo on a per worktree basis, to enable some of the sparse checkout functionality. Since `stg` is using `libgit2` library only as an interface to object database and for looking up alias lookup from git configuration (i.e. Read Only operation) and uses git cli for everything else, it is relatively safe to ignore per worktree git configuration. This patch adds `worktreeconfig` extension to the `libgit2` extension whitelist. CLOSES #195 Signed-off-by: Oleg Utkin <oleg@nonlogical.io>
Upstream Issues:
`extensions.worktreeconfig` for `repositoryformatversion=1`: libgit2/libgit2#6044

Adding this issue to track upstream issues of LibGit2 related to sparse checkout support.

Our company uses a huge monorepo, and we have recently onboarded onto Git sparse checkouts, which seem to be unsupported in LibGit2, which the Rust version of StGit uses.

Despite my attempts at custom builds with some PRs I found online for LibGit2, I could only get read-only workflows like `stg series` to succeed. I could not get `stg push` to work, as it seems the sparse index is throwing LibGit2 way off.