Add the experimental git survey command to analyze (large) local repositories #667

Merged
merged 25 commits into from
Jul 2, 2024
0b18176
survey: stub in new experimental `git-survey` command
jeffhostetler Apr 29, 2024
49a0456
survey: add command line opts to select references
jeffhostetler Apr 29, 2024
7fff8f6
survey: collect the set of requested refs
jeffhostetler Apr 29, 2024
69486b8
survey: calculate stats on refs and print results
jeffhostetler Apr 29, 2024
520208f
survey: stub in treewalk of reachable commits and objects
jeffhostetler Apr 29, 2024
47d937a
survey: add traverse callback for commits
jeffhostetler Apr 29, 2024
3c0b8ae
survey: add vector of largest objects for various scaling dimensions
jeffhostetler May 1, 2024
b71068c
survey: add pathname of blob or tree to large_item_vec
jeffhostetler May 15, 2024
4b17ac4
survey: add commit-oid to large_item detail
jeffhostetler May 15, 2024
c9855a4
survey: add commit name-rev lookup to each large_item
jeffhostetler May 20, 2024
fe7aceb
survey: add --json option and setup for pretty output
jeffhostetler May 21, 2024
aec2d90
survey: add pretty printing of stats
jeffhostetler May 21, 2024
78a9ef0
t8100: create test for git-survey
jeffhostetler May 29, 2024
a30915a
survey: add --no-name-rev option
jeffhostetler Jun 4, 2024
3f67799
survey: started TODO list at bottom of source file
jeffhostetler Jun 17, 2024
644434e
survey: expanded TODO list at the bottom of the source file
jeffhostetler Jun 28, 2024
3f0f64e
survey: expanded TODO with more notes
jeffhostetler Jul 1, 2024
ecc070f
Merge branch 'jh/experimental-survey'
dscho Jul 1, 2024
d23dcf7
fixup! survey: calculate stats on refs and print results
dscho Jul 1, 2024
38de1fa
fixup! survey: add pretty printing of stats
dscho Jul 1, 2024
dd802f5
fixup! survey: calculate stats on refs and print results
dscho Jul 1, 2024
2eead28
fixup! survey: calculate stats on refs and print results
dscho Jul 1, 2024
3168768
fixup! survey: add pretty printing of stats
dscho Jul 1, 2024
5ae1d0c
survey: clearly note the experimental nature in the output
dscho Jul 1, 2024
45c981e
fixup! survey: stub in new experimental `git-survey` command
dscho Jul 1, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -165,6 +165,7 @@
/git-submodule
/git-submodule--helper
/git-subtree
/git-survey
/git-svn
/git-switch
/git-symbolic-ref
2 changes: 2 additions & 0 deletions Documentation/config.txt
@@ -531,6 +531,8 @@ include::config/status.txt[]

include::config/submodule.txt[]

include::config/survey.txt[]

include::config/tag.txt[]

include::config/tar.txt[]
41 changes: 41 additions & 0 deletions Documentation/config/survey.txt
@@ -0,0 +1,41 @@
survey.namerev::
Boolean to show/hide `git name-rev` information for
each reported commit and the containing commit of each
reported tree and blob.

survey.progress::
Boolean to show/hide progress information. Defaults to
true when interactive (stderr is bound to a TTY).

survey.showBlobSizes::
A non-negative integer value. Requests details on the <n>
largest file blobs by size in bytes. Provides a default
value for `--blob-sizes=<n>` in linkgit:git-survey[1].

survey.showCommitParents::
A non-negative integer value. Requests details on the <n>
commits with the most number of parents. Provides a default
value for `--commit-parents=<n>` in linkgit:git-survey[1].

survey.showCommitSizes::
A non-negative integer value. Requests details on the <n>
largest commits by size in bytes. Generally, these are the
commits with the largest commit messages. Provides a default
value for `--commit-sizes=<n>` in linkgit:git-survey[1].

survey.showTreeEntries::
A non-negative integer value. Requests details on the <n>
trees (directories) with the most number of entries (files
and subdirectories). Provides a default value for
`--tree-entries=<n>` in linkgit:git-survey[1].

survey.showTreeSizes::
A non-negative integer value. Requests details on the <n>
largest trees (directories) by size in bytes. This set
will usually be equal to the `survey.showTreeEntries`
set, but may be skewed by very long file or subdirectory
entry names. Provides a default value for
`--tree-sizes=<n>` in linkgit:git-survey[1].

survey.verbose::
Boolean to show/hide verbose output. Defaults to false.
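
The `survey.show*` settings above each request the <n> largest items along one dimension. One way to collect such a set during a single pass is a small bounded vector kept in descending order. The following is only a minimal sketch under stated assumptions: `top_vec`, `top_item`, `top_vec_init`, `top_vec_offer`, and the `TOP_N_MAX` cap are hypothetical names invented for illustration, and this is not the code added by this pull request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOP_N_MAX 16 /* hypothetical cap, chosen only for this sketch */

struct top_item {
	uint64_t size;  /* metric being ranked, e.g. blob size in bytes */
	char name[64];  /* label, e.g. an abbreviated OID or a pathname */
};

struct top_vec {
	size_t nr_wanted; /* the <n> requested by an option or config value */
	size_t nr_used;   /* number of slots currently filled */
	struct top_item items[TOP_N_MAX];
};

static void top_vec_init(struct top_vec *vec, size_t nr_wanted)
{
	assert(nr_wanted <= TOP_N_MAX);
	memset(vec, 0, sizeof(*vec));
	vec->nr_wanted = nr_wanted;
}

/* Offer one item; it is kept only if it ranks among the n largest so far. */
static void top_vec_offer(struct top_vec *vec, uint64_t size, const char *name)
{
	size_t k, pos;

	if (!vec->nr_wanted)
		return; /* n == 0 means no details were requested */

	/* Find the first slot holding a strictly smaller item. */
	for (pos = 0; pos < vec->nr_used; pos++)
		if (size > vec->items[pos].size)
			break;
	if (pos == vec->nr_wanted)
		return; /* vector is full and the new item is not larger */

	if (vec->nr_used < vec->nr_wanted)
		vec->nr_used++;

	/* Shift smaller items down; the last one falls off when full. */
	for (k = vec->nr_used - 1; k > pos; k--)
		vec->items[k] = vec->items[k - 1];

	vec->items[pos].size = size;
	strncpy(vec->items[pos].name, name, sizeof(vec->items[pos].name) - 1);
	vec->items[pos].name[sizeof(vec->items[pos].name) - 1] = '\0';
}
```

With `n` as small as the documented default of 10, a linear insertion like this costs little per object; a heap would only start to matter for much larger `n`.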
108 changes: 108 additions & 0 deletions Documentation/git-survey.txt
@@ -0,0 +1,108 @@
git-survey(1)
=============

NAME
----
git-survey - EXPERIMENTAL: Measure various repository dimensions of scale

SYNOPSIS
--------
[verse]
(EXPERIMENTAL!) `git survey` <options>

DESCRIPTION
-----------

Survey the repository and measure various dimensions of scale.

As repositories grow to "monorepo" size, certain data shapes can cause
performance problems. `git-survey` attempts to measure and report on
known problem areas.

Ref Selection and Reachable Objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this first analysis phase, `git survey` will iterate over the set of
requested branches, tags, and other refs and treewalk over all of the
reachable commits, trees, and blobs and generate various statistics.

OPTIONS
-------

--progress::
Show progress. This is automatically enabled when interactive.

--json::
Print results in JSON rather than in a human-friendly format.

--[no-]name-rev::
Print `git name-rev` output for each commit, tree, and blob.
Defaults to true.

Ref Selection
~~~~~~~~~~~~~

The following options control the set of refs that `git survey` will examine.
By default, `git survey` will look at tags, local branches, and remote refs.
If any of the following options are given, the default set is cleared and
only refs for the given options are added.

--all-refs::
Use all refs. This includes local branches, tags, remote refs,
notes, and stashes. This option overrides all of the following.

--branches::
Add local branches (`refs/heads/`) to the set.

--tags::
Add tags (`refs/tags/`) to the set.

--remotes::
Add remote branches (`refs/remotes/`) to the set.

--detached::
Add HEAD to the set.

--other::
Add notes (`refs/notes/`) and stashes (`refs/stash`) to the set.

Large Item Selection
~~~~~~~~~~~~~~~~~~~~

The following options control the optional display of large items under
various dimensions of scale. The OIDs of the `n` largest objects will be
displayed in reverse sorted order. For each, `n` defaults to 10.

--commit-parents::
Shows the OIDs of the commits with the most parent commits.

--commit-sizes::
Shows the OIDs of the largest commits by size in bytes. These are
usually the ones with the largest commit messages.

--tree-entries::
Shows the OIDs of the trees with the most number of entries. These
are the directories with the most number of files or subdirectories.

--tree-sizes::
Shows the OIDs of the largest trees by size in bytes. This set
will usually be the same as the vector of number of entries unless
skewed by very long entry names.

--blob-sizes::
Shows the OIDs of the largest blobs by size in bytes.

OUTPUT
------

By default, `git survey` will print information about the repository in a
human-readable format that includes overviews and tables.

CONFIGURATION
-------------

include::config/survey.txt[]

GIT
---
Part of the linkgit:git[1] suite
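
The "Ref Selection" rules in the manual page above (a default set of tags, local branches, and remote refs; any explicit option clears that default; `--all-refs` overrides the rest) can be modeled as a small bitmask resolution. This is a hedged sketch, not the implementation from `builtin/survey.c`: the enum, the `ref_opts` struct, and `resolve_ref_kinds` are hypothetical names. It also assumes that `--all-refs` covers only the kinds the text lists (branches, tags, remotes, notes, and stashes), since the text does not say whether HEAD is included.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags, one per ref-selection option in the manual page. */
enum ref_kind {
	REF_BRANCHES = 1 << 0, /* --branches */
	REF_TAGS     = 1 << 1, /* --tags */
	REF_REMOTES  = 1 << 2, /* --remotes */
	REF_DETACHED = 1 << 3, /* --detached (HEAD) */
	REF_OTHER    = 1 << 4  /* --other (notes and stashes) */
};

/* Default set per the documentation: tags, local branches, remote refs. */
#define REF_DEFAULT ((unsigned)(REF_BRANCHES | REF_TAGS | REF_REMOTES))

/* Assumption: --all-refs means exactly the kinds the text enumerates. */
#define REF_ALL ((unsigned)(REF_BRANCHES | REF_TAGS | REF_REMOTES | REF_OTHER))

struct ref_opts {
	int all_refs;       /* non-zero when --all-refs was given */
	unsigned requested; /* OR of ref_kind flags given explicitly */
};

/* Resolve the command-line state into the set of ref kinds to examine. */
static unsigned resolve_ref_kinds(const struct ref_opts *opts)
{
	assert(opts != NULL);

	if (opts->all_refs)
		return REF_ALL;         /* overrides every other option */
	if (opts->requested)
		return opts->requested; /* explicit options replace the default */
	return REF_DEFAULT;         /* nothing given: use the default set */
}
```

Under this model, giving only `--tags` yields just `REF_TAGS`, because the presence of any explicit option replaces the default set rather than adding to it.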
1 change: 1 addition & 0 deletions Makefile
@@ -1332,6 +1332,7 @@ BUILTIN_OBJS += builtin/sparse-checkout.o
BUILTIN_OBJS += builtin/stash.o
BUILTIN_OBJS += builtin/stripspace.o
BUILTIN_OBJS += builtin/submodule--helper.o
BUILTIN_OBJS += builtin/survey.o
BUILTIN_OBJS += builtin/symbolic-ref.o
BUILTIN_OBJS += builtin/tag.o
BUILTIN_OBJS += builtin/unpack-file.o
1 change: 1 addition & 0 deletions builtin.h
@@ -229,6 +229,7 @@ int cmd_status(int argc, const char **argv, const char *prefix);
int cmd_stash(int argc, const char **argv, const char *prefix);
int cmd_stripspace(int argc, const char **argv, const char *prefix);
int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
int cmd_survey(int argc, const char **argv, const char *prefix);
int cmd_switch(int argc, const char **argv, const char *prefix);
int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
int cmd_tag(int argc, const char **argv, const char *prefix);