Skip to content

Latest commit

 

History

History
167 lines (121 loc) · 6.62 KB

concepts.mkd

File metadata and controls

167 lines (121 loc) · 6.62 KB

The working tree, the index and the repository

The local directory that contains all of the managed files is called the working tree.

The repository contains, among other things, a collection of commits. It also includes references to certain commits, called heads.

Unlike many other version control tools, Git does not record changes directly from the working tree to the repository. Before being commited to the repository, changes are first recorded in the index. This is a staging area where you add each individual change. When all the desired changes are staged they can be added to the repo in a single commit.

Repositories

Your working copy of a repository is a complete repo: history, branches, etc. This not only means that you can have full version control even offline, but also that logs, branches, commit information, etc are local and fast.

Commits

  • a set of file contents, called blobs, which is a snapshot of the working tree when the commit was made.
  • references to parent commits (usually there is only one.)
  • information about the author, date, etc.
  • a SHA1 identifier.

In a Git repository history there is only one commit without any parent reference, the initial commit.

The history is a directed acyclic graph of commits, with the references to parent commits pointing back in time, ultimately reaching the first commit. You can start from any commit and follow the parent pointers all the way back to that first commit. This is how history works.

Heads and HEAD

A head is a named reference to a commit. By default repos start with only one head, called master, but a repo can have any number of heads.

HEAD is the reference to what is currently checked out in your working tree. If you are working on the branch feature then HEAD points to the last commit in the branch history.

HEAD is the alias to the currently active head, head is a named reference to a commit in the repo. This distinction is important and a common convention in Git documentation.

What happens when you commit?

  • Git creates the commit object.
  • The parent of the commit is the commit referenced by HEAD.
  • If you are in a branch, the branch head is changed to point to the new commit.
  • HEAD is updated to point to the new commit.

Naming commits

There are several ways of referencing a commit in git commands:

  • The full SHA1 identifier of the commit object.
  • Enough characters of the commit identifier.
  • Named reference (a branch or tag name).
  • Relative from a known commit:
    • HEAD^ is the parent of HEAD.
    • HEAD^^ is the grand-parent of HEAD.
    • e7b20f1b~4 is the commit 4 steps back in history from e7b20f1b.

The full list is in the git-revisions manpage.

Branches

In Git a branch and a head are basically the same. A branch is represented by a head and every head represents a branch. Usually when the term "branch" is used, it means the head commit and the history that lead to it. When the term "head" is used it usually means the commit object that is referenced by the particular head.

The default branch is called master.

When you create a branch, Git creates a head that point to the commit from which you are branching.

What happens when you merge?

Assuming you are merging the branch feature into master:

  • Git identifies the common ancestor of feature and master. Let's call this ancestor-commit.
  • If feature references ancestor-commit, do nothing.
  • If master references ancestor-commit, do a fast-forward.
  • Otherwise, create a diff of ancestor-commit and feature.
  • Attempt to apply the diff to master.
  • If there are no conflicts, Git creates a new commit with two parents: the commits referenced by feature and master.
  • If there are no conflicts, master and HEAD are updated to this new commit.
  • If there are conflicts, Git inserts the conflict markers in the appropriate files, creates conflict files and no commit is created. The user is informed of the conflict.

Resolving conflicts

After resolving the conflict you add the changes to the stage and manually commit. Git remembers that you are in the middle of a merge and sets the parents of the new commit correctly. It also pre-populates the commit message with a note on the merge and a list of files that had conflicts.

Fast-forward merges

This is an optimization. In a fast-forward merge no merge commit is created, Git just updates the heads.

Remote repositories

Git follows a distributed model of version control. To identify remote repositories Git uses a remote-specification. A remote-specification can be a path on the same filesystem, a remote repo over ssh, http, etc.

Cloning a repo

When you clone a repo Git performs the following steps:

  • It creates a local directory. The default is the name of the remote repo.
  • It initializes a repository in the new directory.
  • All the commit objects and head references are copied from the remote repo.
  • A remote repository reference is created (the default name is origin) and it points to the repo that you are cloning.
  • For each head reference in the remote repo a remote head is created in the local repo. The remote heads are named origin/<remote-head-name>.
  • A remote head in the local repo, corresponding to the current active head in the remote repo is set to track it.

Remote repository references and remote heads work in a similar way as local heads. You can rebase, merge, show specific commits, show the logs, etc.

When you interact and modify a cloned repository most changes are done in your clone only. If you want to modify the remote repo you have to do it explicitly.

Fetching and rebasing vs. pulling

A pull operation from a remote repository does a fetch from the remote repository reference, followed by a merge from the tracked branch into the current active local branch. This can create some unnecessary merge commits, which is the reason that some people are against pulling.

Pushing changes

When you push to a remote repository Git does the following:

  • Add the new commit objects to the remote repo.
  • Updates the remote head to reference the same commit it references in the local repo.
  • Updates the remote head reference in the local repo.

You should only push changes that will result in a fast-forward merge on the remote repository. Otherwise the remote repo will have dangling commits.


Sources