-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support the extension for worktree-specific config #6202
Support the extension for worktree-specific config #6202
Conversation
7d84792
to
a20800c
Compare
39d036e
to
f4cc52c
Compare
I've realized while writing up a corresponding PR to rust-lang/git2-rs that inserting a value here could have a detrimental impact on compatibility with existing bindings. Does this need to wait for a version break of some sort? Or do I need to come up with an alternate implementation of If we want to keep the existing int-comparison based overrides, we should probably take the opportunity to insert some wiggle room by making these values multiples of ten instead. |
Pinging on this. What's the next step here? Does this patch break ABI? Is there anything we can actually do about that? |
Thanks for the ping @vermiculus - I'm happy to review this, I should be able to take some time today. You're correct that it does break ABI, and no, I don't think that there's anything that we can do about this. If this is the correct ordering, then we'll give people a heads up in the next release but I'm afraid that there's little else that we can do here. |
Thanks for taking the time 😃 If there's no way to avoid breaking ABI, will it be just as much work for consumers if we change all the enum values to allow for future inserts? GIT_CONFIG_LEVEL_PROGRAMDATA = 10,
GIT_CONFIG_LEVEL_SYSTEM = 20,
// etc. I can't see why or how we'd ever have another value (even at a conceptual level), but I imagine the same was said before |
Coming back around to this. What's the next step here? Expecting I'll need to rebase to resolve the new conflicts, that I don't expect those conflicts to be terribly difficult to resolve. |
Yep, sorry for the conflicts. 😢 If you rebase, I think that it's ready to go. |
f4cc52c
to
7809dbb
Compare
No worries! Conflicts are a sign of active development 😉 (edit: also, holy cow is #6133 exciting! Definitely agree dogfooding is going to be a boon here -- especially for applications that might want to link to a porcelain like https://github.com/magit/libegit2.) Rebased; looks like the conflicts were resolved automatically. This is ready to roll from my end assuming you're good with #6202 (comment) and the remaining open questions in the top post. |
Pinging on this again 😄 |
While testing this, I found the following bug: #6246 After that patch is merged (or integrated here) libgit2 is able to open repos with |
Sorry for the delay, @vermiculus. And thanks, again. |
Hmm. Not sure what’s going on with the tests. It’s possible that this was a problem caused by #6252 but I’ll have to take a closer look tomorrow. |
🤔 Doesn't look like it's related to #6252; these test failures look legitimate. I'm afraid that I will need to refamiliarize myself with the config code to offer any suggestions for how to proceed. |
7809dbb
to
8bda4a6
Compare
a5a6eee includes some logging output for investigation. It appears the search space is null for these two calls:
|
I'm investigating I think I've learned that at Once we get into Continuing investigation on that assumption. |
ec1a85d
to
b1ca747
Compare
I believe the above patch is in the right direction (based on ~three hours' learning about config backends...) though some tests are still failing. I believe this is because 'backends' is not sorted by config-level. I'll try to come back to this in a few hours. |
src/libgit2/config.c
Outdated
return 0; | ||
case BACKEND_USE_DELETE: | ||
/* return the first backend that contains the given config-name */ | ||
backend->backend->get(backend->backend, name, &entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I need to lock/unlock the backend here? Is there a risk of data getting written out from underneath me?
@ethomson Just saw that https://github.com/libgit2/libgit2/labels/v1.8 was added. Cool! As it happens, I'm also looking at the possibility of spending some $DAY_JOB hours on this in the near future. Any chance I could get permission to run the tests again? |
ded2432
to
e1b0bcb
Compare
@ethomson We're hoping to look at this again next week; Well it's not exactly kosher, but I was able to find a way to at least run some of the tests on my fork by merging |
Introduce the logical concept of a worktree-level config. The new value sits between _LOCAL and _APP to allow `git_config_get_*` to 'just work'. The assumption of how `git_config_get_*` works was tested experimentally by setting _WORKTREE to some nonsense value (like -3) and watching the new test fail.
Now that GIT_CONFIG_LEVEL_WORKTREE exists logically, define and load $GIT_DIR/config.worktree into that level.
c513014
to
11a859e
Compare
Now that worktree-level configuration can be read from disk and manipulated in memory, we should be able to say we support 'extensions.worktreeConfig'.
432510f
to
c34c6ed
Compare
This structure provides for cleaner diffs in upcoming commits.
c34c6ed
to
71bb851
Compare
@ethomson I believe this is ready to go – or at least ready for review. The only failures I see in my CI runs are of the form:
and this may even be expected in my somewhat-hacked-together setup (force-pushing to Let me know if there's any way I can help expedite review. Thanks! |
09d26d9
to
daeef15
Compare
It would seem that `get_backend_for_use` is primarily used when writing config data -- either to set keys or delete them (based on the possible values of `backend_use`). When git-config(1) is used for side-effects, it will modify only the local (repository-level) configuration unless explicitly overridden. From git-config(1): --local For writing options: write to the repository .git/config file. This is the default behavior. `get_backend_for_use` does not have the ability to specify a config level and typically is expected (it seems) to 'do the right thing'. Taking its cue from git-config(1), don't update worktree-specific config unless it's the only option. If that functionality is needed by consumers, I assume they would find the appropriate backend with `git_config_open_level` and feed that `git_config` object through to the `git_config_set_*` functions (as demonstrated in the provided test).
When deleting a key from a repository with multiple config backends (like a --local config and a --worktree config), it's important to return the correct backend to modify. This patch ensures that we don't return a backend that is incapable of deleting a given piece of configuration when that is the required use.
Before passing the config key name to backend->get(), it needs to be normalized first. Not all current callers are performing this normalization, so perform it centrally instead. Co-authored-by: Brian Lyles <brianmlyles@gmail.com>
This looks like a simple copy/paste error.
Now that get_backend_for_use can return other error codes (by virtue of key-name normalization), make sure to propagate the appropriate error code when used.
daeef15
to
4cd2d17
Compare
I'm happy with two things intermingled in a single PR, no need to split it out. I'm sorry that I haven't had time to 👀 this yet, but I'm hopeful to do it this weekend. 🙏 |
fixes #6650 |
src/libgit2/config.c
Outdated
/* If we're trying to delete a piece of config, make | ||
sure the backend we return actually defines it in | ||
the first place. */ | ||
if (use == BACKEND_USE_DELETE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to go through and refresh my understanding of our configuration logic. This change makes sense if the perspective of libgit2 is when we say delete a configuration key, then just delete it wherever it is. (And I think that's the semantics that our current config data sort of implies.)
This is different than git config
, which takes a config level (eg git config --global ...
) to operate on.
But I think that this change actually suggests a deeper problem with our API. IIUC, the goal of this change is to go find the config that we should delete a key in. What should we do if there's a multivar split over two configuration files? Instead of trying to find the backend in get_backend_for_use
, should we iterate over all backends?
(Not asking you to make this change, asking for your opinion on the API.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To expand on that — I think that an API that lets you specify a key to delete and the config level to delete it from makes sense. I think that an API that lets you specify a key to delete and then deletes all occurrences across all config levels makes sense. I'm less convinced that an API that lets you specify a key to delete and then deletes the first occurrence makes sense.
I think that we have the latter option — where it deletes the first occurrence — and I wonder if that's really the right API? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should we do if there's a multivar split over two configuration files?
Certainly this case would surprise folks; I would not expect to get different results with repeated calls:
git_config_delete_entry(cfg, "foo.bar"); // deletes --local
git_config_delete_entry(cfg, "foo.bar"); // deletes --global
That just feels 'wrong'. It would feel much more appropriate to say
git_config_delete_entry_at_level(cfg, GIT_CONFIG_LEVEL_LOCAL, "foo.bar"); // deletes --local
git_config_delete_entry_at_level(cfg, GIT_CONFIG_LEVEL_GLOBAL, "foo.bar"); // deletes --global
// with companion functions
git_config_set_string_at_level(cfg, GIT_CONFIG_LEVEL_LOCAL, "foo.bar", "baz");
git_config_get_entry_at_level(&entry, cfg, GIT_CONFIG_LEVEL_LOCAL, "foo.bar");
git_config_get_entry_with_inheritence(&entry, cfg, "foo.bar");
// ^^ this may be git_config_get_entry unchanged?
// etc.
It was surprising to us to learn that the current config API did not work this way. In a vacuum, I would consider git_config_set_string_at_level
to be the correct abstraction for a C library: explicit and unambiguous even if a little verbose.
I feel comfortable enough with the config code at this point to tackle such a thing in a separate PR if that's a direction you'd like to go. (In fact, I've rather been enjoying this codebase – it's a nice change of pace for me 😄.) It probably deserves a more robust design process, though.
It would also make sense to me to incorporate the worktree-config support only into that new API if that helps resolve concerns with this PR, but that will come with different considerations for doing due diligence to ensure consumer applications continue to 'work reasonably correctly'. (Although maybe that's not as high of a concern in this context as long as we make the appropriate deprecation warnings; I've not worked with this particular kind of project in a FOSS setting before.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not positive but my thinking here is that maybe we should keep things mostly as they are but, yes, avoid surprise:
I think that the "simple" git_config
functions should probably mirror the git command line itself. Reads return the "effective" configuration (applying the config files in order of priority and giving a result.) But we change writes/deleted to always refer to the local.
If somebody wants a specific level then they can open that file and operate on it directly. I don't think we necessarily need APIs to read and write at a specific level directly, I think that people can probably go get the config file that they're interested in.
But this rather presupposes that git's model is right here — and I'm increasingly convinced that it actually is.
The rather tricky part is the user configuration(s). We may need to be more thoughtful than "always write to local".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was reviewing this comment once more before writing up what this would look like
I think that the "simple" git_config functions should probably mirror the git command line itself. Reads return the "effective" configuration (applying the config files in order of priority and giving a result.) But we change writes/deleted to always refer to the local.
I think that's a good instinct as it would follow some principle of least surprise.
What I'm not sure of is what is meant by this:
The rather tricky part is the user configuration(s). We may need to be more thoughtful than "always write to local".
By 'user configurations', do you mean things like --global
and --system
config?
Here's what I'm going to propose:
For reads: keep the current implementation and use the most specific level available to us.
For writes:
- If there is exactly one config backend, use it.
- If
--local
is available, use it.- Signal an error (as
git
does).
The alternative to (3) (which might be the 'more thoughtful' approach, as best as I can imagine one) would be to set the most specific level of config available to us if --local
is not available, but I rather think that might be too smart for this 'plumbing' function. I do think the need of 'set this at the system/global level' would be met with git_config_open_level
followed by git_config_set_string
following the logic proposed above. (Some wrapper 'porcelain' could be devised as above, but it does seem like that would be low-value API.)
I have some errands to run, but I hope to have a patch here in a few hours' time. I'll just plop one commit on top so there are two versions to easily compare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed c82bef3 which takes the approach outlined above. Before the final version, I would like to add a test that git_config_open_level(&out, cfg, GIT_CONFIG_LEVEL_LOCAL)
only ever returns a config with one backend (even in degenerate cases) since I'm now relying on that behavior in the base case of get_backend_for_use
.
Note that tests introduced in #4217 are non-trivially altered (@pks-t), but I believe the test is still effective in demonstrating that the readonly
flag is respected.
There is another problem test_remote_httpproxy__env
that I've not tracked down yet; some details are in the commit though. I'll add that I've only gotten test_remote_httpproxy__env
to fail when running the full suite; I'm not able to get it to fail on its own or even just running the remote
test suite. There might be something wrong with the test; it's hard to say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a little time staring at this over the weekend and there's much more complexity here than I expected. You may have identified some of these problems as well. In particular, backends are shared across multiple git_config
instances. (The repository has a configuration with multiple backends with an ordering. When you create a snapshot, you get a new configuration with read only backends. And you can open_level
, you get a new configuration object with only one of those backends.)
The other "gotcha" is that we support users of the library to create their own custom configuration, GIT_CONFIG_LEVEL_APP
(or higher) which is higher-priority than the system/global/etc configuration. And we have no idea what they're doing. So trying to be clever about where writes go needs to take into account the APP
level which may or may not exist and may or may not be writeable.
The current flow today is "write to the highest level backend that isn't readonly", which may be APP
and if it's not it will be LOCAL
.
So, because of this shared state, you can't simply make the worktree backend "not writable" for the repository configuration, since it will also be not writable when you open_level
it, which you do want to write to. 🙃
And trying to hardcode any logic about defaulting to the local configuration is tricky, since there might be APP
-levels that you need to contend with (and that are being written to today and we would like to preserve backcompat).
My initial thought was that we could re-use the readonly
bit (or possibly make a new "writes need to be explicit" bit) and make the worktree config readonly when it's created via git_repository_open
, but that doesn't work, because subsequent open_level
will happily just give you the existing worktree config which is... readonly.
Instead, I'm coming back to the notion that configuration needs separate read and write ordering. Providing a set_writeorder
API allows callers (in particular, git_repository
) to set that they want LOCAL
to be the only write target during configuration creation. Users can subsequently add an APP
level, which will be be the highest-level writer (but they can change that themselves with set_writeorder
).
Ultimately, set_writeorder
only applies to the things that are in your config object. So if you want to override the default write order of the thing(s) you just added, you will need to call it again after adding them. (You can't reconfigure your write order.)
This required us to pull apart the refcounted backend object from the list of backends that are in a config so that they can be separate. Alas. In any case, I pushed up a branch ethomson/worktree-config
(diff). I'd love your feedback.
I also found and fixed some interesting bugs, notably that you can't git_transaction_free
an uncommitted transaction today. And that if you start a transaction and then mess around with backends, you're going to have a bad time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we were typing our comments at roughly the same time. 😅
Right, so, my concern here is definitely around the CONFIG_LEVEL_APP
no longer being a write target. I think that's critical to backcompat... Our CLI uses it to map -c
command-line arguments into a configuration backend (though this is read-only). I know that at least one hosting provider uses CONFIG_LEVEL_APP
, and ... I have no idea if they're using it to write config data or only read?
I'm loathe to break it, though. 😬
src/libgit2/config.c
Outdated
if (backend->backend->readonly) | ||
continue; | ||
|
||
/* git-config doesn't update worktree-level config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is interesting — isn't this true of all git config
entry levels? IOW, doesn't git config
always operate on the local configuration unless otherwise specified?
I actually think that this exposes an oddity in our whole ordering. Our configuration levels appear to be trying to support a scenario where we have a system of ordering so that higher priority backends can override the data in lower priority backends. (eg, if I have foo.bar
in system, and foo.bar
in local, then the foo.bar
in local is used.)
But we also want a system that's relatively flexible so that library users can plug in their own backends. (In other words, we don't want to just always default to the local backend, which is what we would do in the absence of any pluggability.)
The wrinkle here is that you're introducing a new backend that should take priority over reads but should not take priority over writes. I think that we should consider doing two things:
- For the sort of generic, unopinionated configuration system, we should have two priority ordered lists — one for read priority, and one for write priority
- When a repository is created by default, it should set the read priority in the order that we've specified here. The write priority should only have a single option, which is the local configuration. (By default, calling set without a specific entry level should probably just always go there.)
This lets users have control over the ordering, but sets a default that aligns with git.
Does this make sense? Am I entirely off base? Am I overthinking things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you're over-thinking things. I agree that the different behavior for reading/writing is not something that's nicely explicit in the current API.
This feels very related to #6202 (comment). I will have to noodle on these points more (I've got some household chores to do in the meantime), but I suspect my other reply might be a start toward a resolution here.
In this version: 1. If there is exactly one config backend, use it. 2. If `--local` is available, use it. 3. Signal an error (as `git` does). This closely simulates the behavior of git-config. A test does fail, however, introduced in [1]: test_config_readonly__writing_to_cfg_with_ro_precedence_succeeds Reading the message, I think the spirit of the readonly functionality is still satisfied and the test can be adjusted. Another test also fails: test_remote_httpproxy__env This one eludes me -- because it only fails *sometimes*. It does not fail if I just run that suite or surrounding suites. When it does fail, it fails on the very first assertion: /* HTTP proxy is ignored for HTTPS */ cl_setenv("HTTP_PROXY", "http://localhost:9/"); assert_proxy_is(NULL); I'm not sure what kind of test interaction could be going on here, but I suspect (given my changes) that the problem is somewhere within `http_proxy_config`. [1]: 95f29fb
Fixed via #6756 |
Fix #6044.
Tasks:
extensions.worktreeconfig
on the allow-list (don't bomb out a la [Feature] Supportextensions.worktreeconfig
forrepositoryformatversion=1
#6044)Open questions:
enum config_scope
?Does this break ABI?Yes, but evidently there's no way to avoid this.(meta) What subsystem even is this? (for the commit subject prefix)