Support multiple working copies #13

martinvonz · 2021-04-07T04:52:54Z

No description provided.

martinvonz · 2021-11-13T02:31:18Z

I'm debating what the UX should be when a commit that's checked out in one working copy gets rewritten.

First some background: We currently keep track of the current checkout in the repo "view", which is part of the "operation" object in the operation log. This is considered the source of truth for which commit is checked out in the working copy. That record gets updated within a transaction, before the working copy itself gets updated. The working copy has its own record of what's currently checked out. That's supposed to follow the record in the view object. For example, if a transaction committed and then the power was cut, we'll end up with a mismatch. In such cases, next time jj is run, we should detect that the working copy commit isn't the intended commit, and automatically update to the new commit. That's not implemented yet, but it should be done regardless of support for multiple working copies.

For support for multiple working copies, I plan to extend the view object to keep track of the current checkout in each working copy. The question is what should happen if you run a command in one working copy that impacts the checkout in another working copy. For example, you might rebase a whole tree of commits, including the current working copy's checkout and another working copy's checkout. I see a few options:

1. Update the other working copies' checked-out commits and also update the working copy files. This will of course only work if the other working copy is on a file system that's available (e.g. not on a USB stick or a network file system). It also requires the central repo to keep track of where every working copy is.
2. Update the other working copies' checked-out commits, but don't update their working copy files. The working copy would then be stale and it would be updated later when the user runs a command in that working copy. There's a small risk that the old working copy's checkout would be GC'd at this point.
3. Don't update the other working copies' checked-out commits but leave them visible. This would mean that you'd end up with two visible commits with the same change ID, which by definition means that you'll have divergence. The user would then have to manually update and hide the old commit, or run some command that we don't yet have.
4. Don't update the other working copies' checked-out commits and hide them. This would mean that your working copy would not appear in jj log but jj log -r @ would still show it. The change would not be considered divergent. The user can recover by updating to the new commit. We would probably want to make jj status highlight the fact that the working copy's commit is not currently visible.
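To make option 2 concrete, here is a minimal sketch of what "update the checked-out commits but not the files" could look like in the view. The `View` struct, the type aliases and `update_checkouts()` are hypothetical stand-ins for illustration, not jj's actual API.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for jj's real types; all names here are assumptions.
type CommitId = String;
type WorkspaceId = String;

#[derive(Debug, Default)]
struct View {
    /// The commit each workspace is supposed to have checked out.
    checkouts: HashMap<WorkspaceId, CommitId>,
}

impl View {
    /// Option 2: after a rewrite, move every workspace's checkout that pointed
    /// at a rewritten commit. No files on disk are touched here.
    /// `rewritten` maps old commit IDs to their replacements.
    /// Returns the (sorted) workspaces whose checkout moved, i.e. the ones
    /// whose working copy files are now stale.
    fn update_checkouts(&mut self, rewritten: &HashMap<CommitId, CommitId>) -> Vec<WorkspaceId> {
        let mut moved = Vec::new();
        for (workspace_id, commit_id) in self.checkouts.iter_mut() {
            if let Some(new_id) = rewritten.get(commit_id.as_str()) {
                *commit_id = new_id.clone();
                moved.push(workspace_id.clone());
            }
        }
        moved.sort();
        moved
    }
}
```

The returned list is only there to show which workspaces would need their files updated the next time a command runs in them.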

arxanas · 2021-11-13T06:42:11Z

It might be worth delineating two specific workflows that the user can pick from at a given time: live or on-demand sync. Similar to auto-fetch/auto-push systems for Git, or whatever Fossil does with auto-syncing. In particular, the same live sync workflow would extend to the use-case of syncing across multiple devices.

I think it would be hard to pick a single workflow that aligns with user expectations for multiple checkouts.

martinvonz · 2021-11-13T18:06:01Z

Interesting idea. I'll have to go read about what Fossil does.

For reference, Git does 3 or 4 (they're the same there since it has no concept of change ID and it only moves the current branch pointer on rebase [1]). Git still has the main repo keep track of each worktree's location. IIRC, it does that mostly to prevent GC of commits checked out in other worktrees. I don't know if the reflog for each worktree's HEAD is stored in the main repo or in the worktree (it's easy to check, of course, I just haven't yet).

[1] By the way, @arxanas, since git move can move multiple branches, I suppose that means you may want to check that the branches you're about to move are not checked out in other working copies (I haven't checked if you already do, but it seems like something that's easy to overlook).

Having a concept of a "workspace" will be useful for adding support for multiple workspaces (#13). You can think of the "workspace" as a repo combined with a working copy. A workspace corresponds 1:1 with a `.jj/` directory. It's pretty close to what other VCS simply call a "repo", but I've ended up using the word "repo" for what Git calls a "bare repo".

The `Repo` doesn't do anything with the `WorkingCopy` except keeping a reference to it for its users to use. In fact, the entire lib crate doesn't do anything with the `WorkingCopy`. It therefore seems simpler to have the users of the crate manage the `WorkingCopy` instance. This patch does that by letting `Workspace` own it. By not keeping an instance in `Repo`, which is `Sync`, we can also drop the `Arc<Mutex<>>` wrapping. I left `Repo::working_copy()` for convenience for now, but now it creates a new instance every time. It's only used in tests. This further decoupling should help us add support for multiple working copies (#13).

This is another step towards removing coupling between the repo and the working copy, so we can have multiple working copies for a single repo (#13).

martinvonz · 2021-11-26T05:54:32Z

I just pushed some commits refactoring ReadonlyRepo so it no longer has working_copy() and working_copy_path(). The coupling was already very weak, but now it's gone completely. It wasn't a necessary change, as we could instead have made ReadonlyRepo::working_copy() simply depend on which working copy it was loaded from, but it is cleaner to separate it completely. It was always a bit ugly how the WorkingCopy had to be kept in a Mutex only because ReadonlyRepo is Sync. Now that ReadonlyRepo doesn't have a WorkingCopy, we don't need the Mutex and we can benefit from Rust's ownership rules.

I've added a "workspace" concept and a Workspace type, representing a working copy and a .jj/ directory. .jj/working_copy/ will keep the working copy state ("index"/"dirstate" in Git-/Mercurial-speak) for the workspace. All other directories in .jj/ (i.e. .jj/store/, .jj/op_store/, .jj/op_heads/, .jj/index/) will be shared between all workspaces backed by the same repo. Perhaps we should move them into .jj/repo/ or something to clarify that.

I still haven't decided which of the solutions for updating "other" workspaces I like best.

martinvonz · 2022-01-15T18:36:00Z

I've put this work on hold for a while but I hope to come back to it now. One idea I remember having earlier was to record the operation ID of the last successful update in .jj/working_copy/. I don't remember the details, but thinking a bit more about it now, it seems to make sense.

By having the operation ID there, we can detect if the working copy is stale. It may be stale because a process had crashed before it finished updating the working copy, or maybe it was a workspace that was stored on a disconnected USB stick. It could also be that it was stale because we went with solution 2 above and a process had rebased the commit in another workspace connected to the same repo.

The working copy's associated operation ID can also help us address a hack related to concurrent updates of the working copy. If we have recorded the operation ID there, then we simply reload at that operation instead of the current operation.

If we store the operation ID in the .jj/working_copy/, then the commit ID that's recorded there won't be needed anymore since it can be found in the operation. We may still want to store it so we don't need to look up the operation when updating the working copy, but it would be just a cache of the checkout recorded in the operation.

When there are concurrent operations that want to update the working copy, it's useful to know which operation was the last to successfully update the working copy. That can help us decide how to resolve a mismatch between the repo view's record and the working copy's record. If we detect such a difference, we can look at the working copy's operation ID to see if it was updated by an operation before or after we loaded the repo. If the working copy's record says that it was updated at operation A and we have loaded the repo at operation B (after A), we know that the working copy is stale, so we can automatically update it (or tell the user to run some command to update it if we think that's more user-friendly). Conversely, if we have loaded the repo at operation A and the working copy's record says that it was updated at operation B, we know that there was some concurrent operation that updated it. We can then decide to print a warning telling the user that we skipped updating because of the conflict. We already have logic for not updating the working copy if the repo is loaded at an earlier operation, but maybe we can drop that if we record the operation in the working copy (as this patch does).
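The A/B reasoning above can be sketched as an ancestry check in the operation log. This is only an illustration: `OpLog`, `Freshness` and `check_freshness()` are made-up names, operation IDs are plain integers, and treating every non-ancestor operation (including a sibling) as "concurrent" is an assumption of the sketch.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical: real operation IDs are hashes, not integers.
type OperationId = u32;

#[derive(Debug, PartialEq)]
enum Freshness {
    /// The working copy was last updated at the operation we loaded.
    Fresh,
    /// The working copy was updated at an earlier operation: safe to update it.
    Stale,
    /// The working copy was updated by an operation we did not load from:
    /// skip the update and warn instead.
    ConcurrentlyUpdated,
}

/// Minimal operation log: each operation maps to its parent operations.
struct OpLog {
    parents: HashMap<OperationId, Vec<OperationId>>,
}

impl OpLog {
    /// Walks parent links from `descendant` looking for `ancestor`.
    fn is_ancestor(&self, ancestor: OperationId, descendant: OperationId) -> bool {
        let mut stack = vec![descendant];
        let mut seen = HashSet::new();
        while let Some(op) = stack.pop() {
            if op == ancestor {
                return true;
            }
            if seen.insert(op) {
                if let Some(parents) = self.parents.get(&op) {
                    stack.extend(parents.iter().copied());
                }
            }
        }
        false
    }
}

/// `wc_op` is the operation recorded in the working copy state;
/// `repo_op` is the operation the repo was loaded at.
fn check_freshness(op_log: &OpLog, wc_op: OperationId, repo_op: OperationId) -> Freshness {
    if wc_op == repo_op {
        Freshness::Fresh
    } else if op_log.is_ancestor(wc_op, repo_op) {
        Freshness::Stale
    } else {
        Freshness::ConcurrentlyUpdated
    }
}
```

With a linear log 1 <- 2 <- 3, a working copy recorded at operation 1 and a repo loaded at 3 comes out as `Stale`, while a working copy recorded at 3 and a repo loaded at 2 comes out as `ConcurrentlyUpdated`.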

Now that we have the operation ID recorded in the working copy state, we can tell if the working copy is stale. When it is, we update it to the repo view's checkout.

It's clearly `Workspace`'s job to create `.jj/working_copy/`, I must have just forgotten to move it there.

The `.jj/` directory contains information about two distinct parts: the repo and the working copy. Most subdirectories are related to the repo; only `.jj/working_copy/` is about the working copy. Let's move the repo-related bits into a new `.jj/repo/` subdirectory. That makes it clearer that they're related to the repo. It will probably also be easier to manage when we have support for multiple workspaces backed by a single repo.

This patch teaches the `View` object to keep track of the checkout in each workspace. It serializes that information into the `OpStore`. For compatibility with existing repos, the existing field for a single workspace's checkout is interpreted as being for the workspace called "default". This is just an early step towards support for multiple workspaces. Remaining things to do:

* Record the workspace ID somewhere in `.jj/` (maybe in `.jj/working_copy/`)
* Update existing code to use the workspace ID instead of assuming it's always "default" as we do after this patch
* Add a way of indicating in `.jj/` that the repo lives elsewhere and make it possible to load a repo from such workspaces
* Add a command for creating additional workspaces
* Show each workspace's checkout in log output
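The compatibility rule for existing repos could look roughly like this when loading a stored view. `StoredView`, its field names and `load_view()` are assumptions for the sketch and don't reflect the actual serialized format; letting an explicit "default" entry win over the legacy field is also an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical on-disk form of a view. `legacy_checkout` stands in for the
/// old single-workspace field; `checkouts` is the new per-workspace map.
struct StoredView {
    legacy_checkout: Option<String>,
    checkouts: BTreeMap<String, String>,
}

/// In-memory view: workspace ID -> checked-out commit ID.
struct View {
    checkouts: BTreeMap<String, String>,
}

fn load_view(stored: StoredView) -> View {
    let mut checkouts = stored.checkouts;
    if let Some(commit_id) = stored.legacy_checkout {
        // Interpret the old field as the checkout of the workspace called
        // "default", unless the new map already has an entry for it.
        checkouts.entry("default".to_string()).or_insert(commit_id);
    }
    View { checkouts }
}
```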

This patch makes it so the workspace ID can be stored in `.jj/working_copy/checkout`. The workspace ID is still always "default".

When checking out a new commit, we look at the old checkout to see if it's empty so we should abandon it. We currently use the default workspace's checkout. We need to respect the workspace ID we're given in `MutableRepo::check_out()`, and we need to be able to deal with that workspace not existing yet (i.e. this being the first checkout in that workspace).
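A rough sketch of the two requirements (use the given workspace's old checkout, and tolerate there being none). This `MutableRepo` is a toy with made-up fields, not the real type; in particular, tracking "empty" commits in a set is just a placeholder for the real emptiness check.

```rust
use std::collections::{HashMap, HashSet};

/// Toy stand-in for the real `MutableRepo`; all fields are hypothetical.
struct MutableRepo {
    /// Workspace ID -> checked-out commit ID.
    checkouts: HashMap<String, String>,
    /// Commits considered empty (placeholder for the real check).
    empty_commits: HashSet<String>,
    /// Commits abandoned so far, in order.
    abandoned: Vec<String>,
}

impl MutableRepo {
    fn check_out(&mut self, workspace_id: &str, new_commit: &str) {
        // `insert` returns the previous checkout for *this* workspace, or
        // `None` if this is the first checkout in the workspace.
        let old = self
            .checkouts
            .insert(workspace_id.to_string(), new_commit.to_string());
        if let Some(old_commit) = old {
            if old_commit != new_commit && self.empty_commits.contains(&old_commit) {
                self.abandoned.push(old_commit);
            }
        }
    }
}
```

The sketch ignores the case where another workspace still has the old commit checked out.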

We detect concurrent working copy changes by checking that the old commit matches the repo's view. We should use the current workspace when looking up the checkout in the view.

When updating the working copy after committing a transaction, we should update it based on the right checkout.

Before committing the working copy, we check if the working copy is checked out to the commit we expect based on the repo's view. We always use the default workspace's checkout, so we need to fix that.

When importing Git HEAD, we already use the right workspace ID for the new checkout, but the old checkout we abandon is always the default workspace's. We should fix that even if we will never support sharing a working copy with Git in a non-default workspace.

If the workspace is shared with a Git repo, we sometimes update Git's HEAD ref. We should get the new checkout from the right workspace ID when doing that (though I'm not sure we'll ever support sharing the working copy with Git in a non-default workspace).

…#13) `jj new` will update onto the new commit if the previous commit was the current checkout. That code needs to use the current workspace's checkout.

`jj status` shows the status for the default workspace. Make it use the current workspace instead.

Because we record each workspace's checkout in the repo view, we can -- unlike other VCSs -- let the user refer to any workspace's checkout in revsets. This patch adds syntax for that, so you can show the contents of the checkout in workspace "foo" with `jj show foo@`. That won't automatically commit that workspace's working copy, however.
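As an illustration of the `foo@` syntax, here is a small sketch that maps a symbol to a workspace ID, treating a bare `@` as the current workspace. `parse_checkout_symbol()` is a hypothetical helper, not the actual revset parser.

```rust
/// Returns the workspace ID a `name@` symbol refers to, or `None` if the
/// symbol does not end in `@`. A bare `@` refers to `current_workspace`.
fn parse_checkout_symbol<'a>(symbol: &'a str, current_workspace: &'a str) -> Option<&'a str> {
    let name = symbol.strip_suffix('@')?;
    if name.is_empty() {
        Some(current_workspace)
    } else {
        Some(name)
    }
}
```

The resulting workspace ID would then be looked up in the view's per-workspace checkouts; as noted above, that lookup doesn't commit the other workspace's working copy.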

We don't use the `current_checkout` keyword in our default templates, but let's still fix it, so it refers to the current workspace.

We should highlight (with bright colors by default) the current workspace's checkout, not the default workspace's checkout.

As part of creating a new repository, we create an open commit on top of the root and set that as the current checkout. Now that we have support for multiple checkouts in the model, we also have support for zero checkouts, which means we don't need to create that commit on top of the root when creating the repo. We can therefore move that out of `ReadonlyRepo`'s initialization code and let `Workspace` instead take care of it. A user-visible effect of this change is that we now create one operation for initializing the repo and another one for checking out the root commit. That seems fine, and will be consistent with the additional operation we will create when adding further workspaces.

In workspaces added after the initial one, the idea is to have `.jj/repo` be a file whose contents is a path to the location of the repo directory in some other workspace.
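A minimal sketch of how a loader might resolve the repo location under that scheme: if `.jj/repo` is a directory, use it; otherwise read it as a file containing a path. `find_repo_dir()` is a hypothetical name, and the sketch returns the stored path as-is (how a relative path would be resolved isn't specified here).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Given a workspace's `.jj/` directory, returns the repo directory.
fn find_repo_dir(jj_dir: &Path) -> io::Result<PathBuf> {
    let repo_path = jj_dir.join("repo");
    if repo_path.is_dir() {
        // Initial workspace: the repo lives right here.
        return Ok(repo_path);
    }
    // Additional workspace: `.jj/repo` is a file whose contents is the path
    // to the repo directory in some other workspace.
    let contents = fs::read_to_string(&repo_path)?;
    Ok(PathBuf::from(contents.trim_end()))
}
```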

With all the groundwork done, everything should just work with multiple workspaces now. So let's add a command for creating workspaces.

It seems helpful to show in the log output which commit is checked out in which workspace, so let's try that. I made it only show the information if there are multiple checkouts for now.

…nce (#13) When you run `jj co abc123` and that commit is already checked out, we just print a message. The condition for that assumed that the checkout existed, which it won't if you just ran `jj workspace forget`. Let's avoid that crash, especially since `jj co` is an easy way to restore the working copy if you had accidentally run `jj workspace forget` (though `jj undo` is even easier).

martinvonz · 2022-02-03T06:10:28Z

You can now create additional workspaces with jj workspace add <path to new workspace>, list workspaces with jj workspace list, and forget workspaces with jj workspace forget. Each workspace's checked-out commit shows up in the default log template (they're recorded in the repo view, so that's easy and cheap to do). You can also refer to them using e.g. jj diff --from workspace1@ --to workspace2@ to show a diff from workspace1's checkout to workspace2's checkout. Plain @ is now a short form to refer to the current workspace's checkout.

I went with solution 2 above, meaning that all checkouts are rebased along with commits they point to, but the working copy is not updated. If you run a command in the workspace after its checkout has been updated from another workspace, then it'll get automatically updated. Perhaps we should change that to just print a warning, in case the user was running a build or tests, or maybe it's just confusing.

I consider this done now and we can open new bugs for any improvements to it.

martinvonz added the enhancement label Apr 7, 2021

martinvonz mentioned this issue Nov 11, 2021

worktree equivalent #41

Closed

arxanas mentioned this issue Nov 13, 2021

move: Ensure that moved branches behave sensibly across worktrees arxanas/git-branchless#215

Open

martinvonz added a commit that referenced this issue Nov 26, 2021

repo: remove working_copy(), only used in tests

c08adba

martinvonz added a commit that referenced this issue Jan 29, 2022

workspace: take over creation of .jj/working_copy/ from repo.rs (#13)

81edd92

It's clearly `Workspace`'s job to create `.jj/working_copy/`, I must have just forgotten to move it there.

martinvonz added a commit that referenced this issue Jan 29, 2022

workspace: rename some symbols related to .jj/ to jj_dir (#13)

35f0c17

martinvonz added a commit that referenced this issue Jan 29, 2022

tests: avoid depending on .jj/ structure in test_bad_locking (#13)

35e0b85

martinvonz added a commit that referenced this issue Feb 2, 2022

working_copy: keep track of workspace ID (#13)

fb8fbdc

martinvonz added a commit that referenced this issue Feb 2, 2022

view: add workspace_id argument to set_checkout() (#13)

766c01a

martinvonz added a commit that referenced this issue Feb 2, 2022

view: merge checkouts for all workspaces, not just the default one (#13)

6d3e201

martinvonz added a commit that referenced this issue Feb 2, 2022

rewrite: update all checkouts, not just the default workspace's (#13)

ac0d040

martinvonz added a commit that referenced this issue Feb 2, 2022

cli: update working copy to current workspace's checkout (#13)

daaf735

martinvonz added a commit that referenced this issue Feb 2, 2022

cli: add an early return and reduce indentation (#13)

9cabb7e

martinvonz added a commit that referenced this issue Feb 2, 2022

cli: make jj status show status for the current workspace (#13)

5b46fa3

martinvonz added a commit that referenced this issue Feb 2, 2022

templater: make current_checkout be about the current workspace (#13)

92af544

martinvonz added a commit that referenced this issue Feb 2, 2022

cli: make jj [obs]log highlight current workspace's checkout (#13)

f2e7086

martinvonz added a commit that referenced this issue Feb 3, 2022

workspace: add a function for initializing additional workspace (#13)

c09a4e1

martinvonz added a commit that referenced this issue Feb 3, 2022

cli: add a command for adding an additional workspace (#13)

5da9d60

martinvonz added a commit that referenced this issue Feb 3, 2022

cli: add a command for listing workspaces (#13)

4eddb72

martinvonz added a commit that referenced this issue Feb 3, 2022

cli: add a command for forgetting a workspace (#13)

ef60da0

martinvonz closed this as completed Feb 3, 2022

martinvonz mentioned this issue Sep 5, 2022

cli: commit working copy before recovering from stale state #515

Closed

2 tasks
