Merge updates to serialized status
Includes these pull requests:

	#1
	#6
	#10
	#11
	#157
	#212
	#260
	#270

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
dscho committed Sep 18, 2024
2 parents b83f0d2 + 52dc6c6 commit 97eaa01
Showing 18 changed files with 2,256 additions and 31 deletions.
22 changes: 22 additions & 0 deletions Documentation/config/status.txt
@@ -77,3 +77,25 @@ status.submoduleSummary::
the --ignore-submodules=dirty command-line option or the 'git
submodule summary' command, which shows a similar output but does
not honor these settings.

status.deserializePath::
EXPERIMENTAL. Pathname to a file containing cached status results
generated by `--serialize`. This will be overridden by
`--deserialize=<path>` on the command line. If the cache file is
invalid or stale, git will fall back and compute status normally.

status.deserializeWait::
EXPERIMENTAL. Specifies what `git status --deserialize` should do
if the serialization cache file is stale, and whether it should
fall back and compute status normally. This will be overridden by
`--deserialize-wait=<value>` on the command line.
+
--
* `fail` - cause git to exit with an error when the status cache file
is stale; this is intended for testing and debugging.
* `block` - cause git to spin and periodically retry the cache file
every 100 ms; this is intended to help coordinate with another git
instance concurrently computing the cache file.
* `no` - immediately fall back if the cache file is stale. This is the default.
* `<timeout>` - time (in tenths of a second) to spin and retry.
--
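
The four accepted values can be pictured as a small policy object. The sketch below is illustrative only and is not git's implementation: `WaitPolicy` and `parse_deserialize_wait` are hypothetical names, and treating every other value as a `<timeout>` integer is an assumption drawn from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WaitPolicy:
    """Hypothetical container for a parsed status.deserializeWait value."""
    mode: str                                # "fail", "block", "no", or "timeout"
    timeout_seconds: Optional[float] = None  # only set when mode == "timeout"


def parse_deserialize_wait(value: str) -> WaitPolicy:
    """Map a configuration value onto a policy (assumed rules, see lead-in)."""
    text = value.strip()
    if text in ("fail", "block", "no"):
        return WaitPolicy(mode=text)
    # Assumption: anything else is a <timeout> in tenths of a second.
    tenths = int(text)  # raises ValueError for non-numeric input
    if tenths < 0:
        raise ValueError("timeout must not be negative")
    return WaitPolicy(mode="timeout", timeout_seconds=tenths / 10)
```

For example, `parse_deserialize_wait("25")` describes a policy that keeps retrying for 2.5 seconds before giving up on the cache file.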
35 changes: 35 additions & 0 deletions Documentation/git-status.txt
@@ -151,6 +151,21 @@ ignored, then the directory is not shown, but all contents are shown.
threshold.
See also linkgit:git-diff[1] `--find-renames`.

--serialize[=<path>]::
(EXPERIMENTAL) Serialize raw status results to a file or stdout
in a format suitable for use by `--deserialize`. If a path is
given, serialization data will be written to that path *and* normal
status output will be written to stdout. If path is omitted,
only binary serialization data will be written to stdout.

--deserialize[=<path>]::
(EXPERIMENTAL) Deserialize raw status results from a file or
stdin rather than scanning the worktree. If `<path>` is omitted
and `status.deserializePath` is unset, input is read from stdin.
--no-deserialize::
(EXPERIMENTAL) Disable implicit deserialization of status results
from the value of `status.deserializePath`.

<pathspec>...::
See the 'pathspec' entry in linkgit:gitglossary[7].

@@ -424,6 +439,26 @@ quoted as explained for the configuration variable `core.quotePath`
(see linkgit:git-config[1]).


SERIALIZATION and DESERIALIZATION (EXPERIMENTAL)
------------------------------------------------

The `--serialize` option allows git to cache the result of a
possibly time-consuming status scan to a binary file. A local
service/daemon watching file system events could use this to
periodically pre-compute a fresh status result.

Interactive users could then use `--deserialize` to simply
(and immediately) print the last-known-good result without
waiting for the status scan.

The binary serialization file format includes some worktree state
information allowing `--deserialize` to reject the cached data
and force a normal status scan if, for example, the commit, branch,
or status modes/options change. The format cannot, however, indicate
when the cached data is otherwise stale -- that coordination belongs
to the task driving the serializations.
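
As a rough illustration of the division of labour described above, the following sketch builds the two command lines a background task and an interactive user might run. The helper names and the cache path are hypothetical. The options are marked EXPERIMENTAL, so the commands only work with a git build that includes this change.

```python
import subprocess
from typing import List


def serialize_command(cache_path: str) -> List[str]:
    """Command a background task could run to refresh the cache file."""
    return ["git", "status", f"--serialize={cache_path}"]


def deserialize_command(cache_path: str) -> List[str]:
    """Command an interactive user could run to print the cached result."""
    return ["git", "status", f"--deserialize={cache_path}"]


def refresh_cache(repo_dir: str, cache_path: str) -> int:
    """Run the serialize command inside repo_dir and return git's exit code.

    With a path given, normal status output also goes to stdout, so it is
    discarded here. Deciding when to call this (for example on file system
    events) is left to the caller, as the text above notes.
    """
    result = subprocess.run(
        serialize_command(cache_path),
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode
```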


CONFIGURATION
-------------

107 changes: 107 additions & 0 deletions Documentation/technical/status-serialization-format.txt
@@ -0,0 +1,107 @@
Git status serialization format
===============================

Git status serialization enables git to dump the results of a status scan
to a binary file. This file can then be loaded by later status invocations
to print the cached status results.

The file contains the essential fields from:
() the index
() the "struct wt_status" for the overall results
() the contents of "struct wt_status_change_data" for tracked changed files
() the list of untracked and ignored files

Version 1 Format:
=================

The V1 file begins with a required header section followed by optional
sections for each type of item (changed, untracked, ignored). Individual
item sections are only present if necessary. Each item section begins
with an item-type header with the number of items in the section.

Each "line" in the format is encoded using pkt-line with a final LF.
Flush packets are used to terminate sections.

-----------------
PKT-LINE("version" SP "1")
<v1-header-section>
[<v1-changed-item-section>]
[<v1-untracked-item-section>]
[<v1-ignored-item-section>]
-----------------
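
To make the framing concrete, here is a minimal sketch of pkt-line encoding and decoding, assuming the usual git convention of a four-hex-digit length prefix that counts itself, with `0000` as the flush packet. The function names are hypothetical and this is not the code git uses.

```python
import io
from typing import BinaryIO, Iterator, Optional

FLUSH_PKT = b"0000"  # a zero length marks a flush packet


def encode_pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its length as 4 hex digits (prefix included)."""
    length = len(payload) + 4
    if length > 0xFFFF:
        raise ValueError("payload does not fit in a 4-hex-digit length")
    return b"%04x" % length + payload


def read_pkt_lines(stream: BinaryIO) -> Iterator[Optional[bytes]]:
    """Yield each payload in turn; yield None for every flush packet."""
    while True:
        prefix = stream.read(4)
        if not prefix:
            return  # clean end of input
        if len(prefix) != 4:
            raise ValueError("truncated length prefix")
        length = int(prefix, 16)
        if length == 0:
            yield None
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        payload = stream.read(length - 4)
        if len(payload) != length - 4:
            raise ValueError("truncated payload")
        yield payload
```

A reader built on this can collect payloads until it sees `None`, which closes the current section.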


V1 Header
---------

The v1-header-section fields are taken directly from "struct wt_status".
Each field is printed on a separate pkt-line. Lines for NULL string
values are omitted. All integers are printed with "%d". OIDs are
printed in hex.

v1-header-section = <v1-index-headers>
<v1-wt-status-headers>
PKT-LINE(<flush>)

v1-index-headers = PKT-LINE("index_mtime" SP <sec> SP <nsec> LF)

v1-wt-status-headers = PKT-LINE("is_initial" SP <integer> LF)
[ PKT-LINE("branch" SP <branch-name> LF) ]
[ PKT-LINE("reference" SP <reference-name> LF) ]
PKT-LINE("show_ignored_files" SP <integer> LF)
PKT-LINE("show_untracked_files" SP <integer> LF)
PKT-LINE("show_ignored_directory" SP <integer> LF)
[ PKT-LINE("ignore_submodule_arg" SP <string> LF) ]
PKT-LINE("detect_rename" SP <integer> LF)
PKT-LINE("rename_score" SP <integer> LF)
PKT-LINE("rename_limit" SP <integer> LF)
PKT-LINE("detect_break" SP <integer> LF)
PKT-LINE("sha1_commit" SP <oid> LF)
PKT-LINE("committable" SP <integer> LF)
PKT-LINE("workdir_dirty" SP <integer> LF)
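
The header grammar above could be consumed along these lines. This is a sketch under stated assumptions: the input is the list of pkt-line payloads that precede the header's flush packet, `parse_v1_header` is a hypothetical name, and the payloads are assumed to decode as UTF-8.

```python
from typing import Dict, Iterable, Tuple, Union

HeaderValue = Union[int, str, Tuple[int, int]]

# Fields whose value is kept as text; every other field is a "%d" integer.
STRING_FIELDS = {"branch", "reference", "ignore_submodule_arg", "sha1_commit"}


def parse_v1_header(payloads: Iterable[bytes]) -> Dict[str, HeaderValue]:
    """Turn 'key SP value LF' payloads into a dictionary."""
    header: Dict[str, HeaderValue] = {}
    for raw in payloads:
        text = raw.decode("utf-8").rstrip("\n")
        key, _, value = text.partition(" ")
        if key == "index_mtime":
            sec, nsec = value.split(" ")
            header[key] = (int(sec), int(nsec))
        elif key in STRING_FIELDS:
            header[key] = value
        else:
            header[key] = int(value)
    return header
```

Optional fields such as `branch` are simply absent from the result when their line was omitted.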


V1 Changed Items
----------------

The v1-changed-item-section lists all of the changed items with one
item per pkt-line. Each pkt-line contains: a binary block of data
from "struct wt_status_serialize_data_fixed" in a fixed header where
integers are in network byte order and OIDs are in raw (non-hex) form.
This is followed by one or two raw pathnames (not c-quoted) with NUL
terminators (both NULs are always present even if there is no rename).

v1-changed-item-section = PKT-LINE("changed" SP <count> LF)
[ PKT-LINE(<changed_item> LF) ]+
PKT-LINE(<flush>)

changed_item = <byte[4] worktree_status>
<byte[4] index_status>
<byte[4] stagemask>
<byte[4] score>
<byte[4] mode_head>
<byte[4] mode_index>
<byte[4] mode_worktree>
<byte[4] dirty_submodule>
<byte[4] new_submodule_commits>
<byte[20] oid_head>
<byte[20] oid_index>
<byte[*] path>
NUL
[ <byte[*] src_path> ]
NUL
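
The layout above amounts to a 76-byte fixed block (nine 4-byte integers plus two 20-byte OIDs) followed by two NUL-terminated pathnames. The sketch below decodes one such payload with Python's `struct` module. `ChangedItem` and `parse_changed_item` are hypothetical names, and reading the integers as unsigned is an assumption, since the text only says they are in network byte order.

```python
import struct
from typing import NamedTuple

# "!" selects network byte order with no padding: 9 * 4 + 20 + 20 = 76 bytes.
FIXED = struct.Struct("!9I20s20s")


class ChangedItem(NamedTuple):
    worktree_status: int
    index_status: int
    stagemask: int
    score: int
    mode_head: int
    mode_index: int
    mode_worktree: int
    dirty_submodule: int
    new_submodule_commits: int
    oid_head: bytes   # raw 20-byte OID, not hex
    oid_index: bytes  # raw 20-byte OID, not hex
    path: bytes
    src_path: bytes   # empty when there is no rename


def parse_changed_item(payload: bytes) -> ChangedItem:
    """Decode one changed-item pkt-line payload."""
    fixed = FIXED.unpack_from(payload, 0)
    rest = payload[FIXED.size:]
    path, sep, rest = rest.partition(b"\0")
    if not sep:
        raise ValueError("missing NUL after path")
    src_path, sep, _ = rest.partition(b"\0")
    if not sep:
        raise ValueError("missing NUL after src_path")
    # Anything after the second NUL (such as a trailing LF) is ignored.
    return ChangedItem(*fixed, path, src_path)
```

Pathnames stay as raw bytes because the format does not c-quote them.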


V1 Untracked and Ignored Items
------------------------------

These sections are simple lists of pathnames. They ARE NOT
c-quoted.

v1-untracked-item-section = PKT-LINE("untracked" SP <count> LF)
[ PKT-LINE(<pathname> LF) ]+
PKT-LINE(<flush>)

v1-ignored-item-section = PKT-LINE("ignored" SP <count> LF)
[ PKT-LINE(<pathname> LF) ]+
PKT-LINE(<flush>)
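
A reader for these two sections might look like the following sketch. It assumes the caller has already split the stream into one section's pkt-line payloads with the flush packet excluded, and `parse_path_section` is a hypothetical name. Only a single trailing LF is removed from each entry, because unquoted pathnames may themselves contain LF bytes.

```python
from typing import List, Sequence, Tuple


def parse_path_section(payloads: Sequence[bytes]) -> Tuple[str, List[bytes]]:
    """Return (section kind, raw pathnames) for one item section."""
    if not payloads:
        raise ValueError("empty section")
    head = payloads[0]
    if head.endswith(b"\n"):
        head = head[:-1]
    kind_raw, _, count_raw = head.partition(b" ")
    kind = kind_raw.decode("ascii")
    if kind not in ("untracked", "ignored"):
        raise ValueError("unexpected section type: " + kind)
    count = int(count_raw.decode("ascii"))
    # Strip exactly one trailing LF per pathname; keep the rest untouched.
    paths = [p[:-1] if p.endswith(b"\n") else p for p in payloads[1:]]
    if len(paths) != count:
        raise ValueError("item count does not match section header")
    return kind, paths
```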
2 changes: 2 additions & 0 deletions Makefile
@@ -1199,6 +1199,8 @@ LIB_OBJS += wrapper.o
LIB_OBJS += write-or-die.o
LIB_OBJS += ws.o
LIB_OBJS += wt-status.o
LIB_OBJS += wt-status-deserialize.o
LIB_OBJS += wt-status-serialize.o
LIB_OBJS += xdiff-interface.o

BUILTIN_OBJS += builtin/add.o