PyGit is a lightweight Git client written in Python. It was written to strengthen my understanding of how Git works under the hood by exploring its internals. You can read about its development here.
Currently, PyGit supports the following Git commands:
- `init`
- `log`
- `cat-file`
- `hash-object`
The `init` command creates a new Git repository with the following structure:
- The `.git` Folder: Contains all repository files.
  - Key Folders: `branches`, `objects`, `refs`
  - Description File: A description of the Git repo (usually unused).
  - HEAD File: Specifies the reference that the repository's head currently points to (`ref: refs/heads/master` by default).
  - Config File: Specifies repository details such as the repository format version.
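The layout above can be sketched in a few lines of Python. This is a minimal illustration, not PyGit's actual code: the function name `init_repo`, the description text, and the exact config contents are assumptions, and it creates only the folders and files listed above.

```python
import os


def init_repo(path):
    """Create a minimal .git layout under `path` (hypothetical helper)."""
    git_dir = os.path.join(path, ".git")

    # Key folders named in the list above.
    for folder in ("branches", "objects", "refs"):
        os.makedirs(os.path.join(git_dir, folder), exist_ok=True)

    # Description file: free text, usually unused.
    with open(os.path.join(git_dir, "description"), "w") as f:
        f.write("Unnamed repository; edit this file to name the repository.\n")

    # HEAD file: points at the default branch.
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/master\n")

    # Config file: assumed minimal contents with the repository format version.
    with open(os.path.join(git_dir, "config"), "w") as f:
        f.write("[core]\n\trepositoryformatversion = 0\n\tbare = false\n")

    return git_dir
```

Calling `init_repo(".")` would create the `.git` folder in the current directory and return its path.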
Git stores repository information using four types of objects:
- Blob: Raw file data (e.g., the contents of `main.c`).
- Commit: Metadata about a commit.
- Tag: A named reference to a specific commit or object, including tag date and creator.
- Tree: Represents a directory by relating file names to blobs and subdirectories to other trees.
Git objects are compressed using `zlib`. Once decompressed, they follow this format:
- Header: Identifies the object type (e.g., `blob`, `commit`), followed by a space.
- Size: Object size in bytes, written as ASCII text and followed by a null byte.
- Data: The actual object contents (e.g., file data, commit metadata).
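As a rough sketch of this format, the Python below builds and parses an object using only the standard library. The function names are hypothetical and not taken from PyGit. It assumes the header, size, and data are laid out as `<type> <size>\0<data>`, and that the object ID is the SHA-1 of those uncompressed bytes, which is Git's default hash.

```python
import hashlib
import zlib


def serialize_object(obj_type, data):
    """Return the uncompressed object bytes: b'<type> <size>\\x00<data>'."""
    header = obj_type.encode("ascii") + b" " + str(len(data)).encode("ascii")
    return header + b"\x00" + data


def hash_object(obj_type, data):
    """Compute the object ID as the SHA-1 hex digest of the serialized object."""
    return hashlib.sha1(serialize_object(obj_type, data)).hexdigest()


def parse_object(compressed):
    """Decompress a stored object and split it into (type, data)."""
    raw = zlib.decompress(compressed)
    space = raw.index(b" ")            # end of the type header
    null = raw.index(b"\x00", space)   # end of the ASCII size field
    obj_type = raw[:space].decode("ascii")
    size = int(raw[space + 1:null].decode("ascii"))
    data = raw[null + 1:]
    if size != len(data):
        raise ValueError(f"size mismatch: header says {size}, got {len(data)}")
    return obj_type, data
```

For example, `parse_object(zlib.compress(serialize_object("blob", b"hello")))` returns `("blob", b"hello")`, which is roughly the round trip that `hash-object` and `cat-file` rely on.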
Development of PyGit was guided by two key resources:
- Official Git Documentation: An invaluable source for understanding Git's internals, including the structure of Git objects.
- Write Yourself a Git: A step-by-step guide for building your own Git client, particularly helpful for parsing Git objects.