Yet Another Git Guide

Posted on Sep 24, 2021

Introduction

I use Git (and its companions like GitHub) a lot, but I’m still learning new things about it every day. As I do so, I’ll add to this document, written in the form of a tutorial to myself. This is something I’d like to keep public so that I can point friends and collaborators to this, to make sure we’re on the same page.

I don’t aim or claim to be comprehensive here, so I’ll only cover what’s relevant for how I personally use Git. I find the following to be useful resources:

  • Git manual pages. These are nice if you know what you’re looking for, but hard to browse because the manual is split up into pages for each subcommand. To see these, run man git-<subcommand>, or find the page online.
  • Atlassian’s Advanced Tips for Git. These are lengthier compared to what I have here, with better illustrations, and distributed over several articles.
  • Scott Chacon and Ben Straub’s Pro Git, especially the Git Internals chapter. Goes into a lot of detail about the inner workings of Git.
  • Katie Sylor-Miller’s Oh Shit, Git!?!. Quick and dirty (language) guide for common Git maladies.
  • Roger Dudler’s git - the simple guide. Nice place to get started, with nice illustrations.

If you come across this and have any suggestions or corrections, please reach out.

Useful commands

Here is a summary of commands that you may find useful when working with Git.

Check state of local repo:

git status

Commits

To stage a file (include it in the next commit):

git add <file>

To restore a file in the working tree from the index (also known as the staging area), discarding its unstaged changes:

git restore <file>

To unstage a file (restore the index from HEAD):

git restore --staged <file>
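
To see how staging, unstaging, and restoring interact, here is a minimal sketch that runs in a throwaway repository. The file name notes.txt, its contents, and the identity settings are placeholders of mine, not anything Git prescribes.

```shell
set -e
# Throwaway repo; the identity and file name below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q

echo "v1" > notes.txt
git add notes.txt && git commit -q -m "Add notes"

echo "v2" > notes.txt            # edit the working tree
git add notes.txt                # stage the edit
git status --short               # "M  notes.txt": staged

git restore --staged notes.txt   # index reset to HEAD; working tree still has v2
git status --short               # " M notes.txt": modified but unstaged

git restore notes.txt            # working tree restored from the index
cat notes.txt                    # v1
```

Note the order: unstaging first and restoring second throws the edit away entirely, which is probably not what you want for real work.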

To move a file (rename it in the next commit):

git mv <src> <dst>
mv <src> <dst> && git add <src> <dst> # equivalent

To delete a file (remove it in the next commit):

git rm <file>
rm <file> && git add <file> # equivalent
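
As a quick check of the "equivalent" claims above, the sketch below does a rename and a delete by hand and then asks Git what got staged. The file names are made up for the example, and this assumes a reasonably recent Git (2.x), where git add also stages removals.

```shell
set -e
# Throwaway repo; the identity and file names below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q

printf 'hello\nworld\n' > old.txt
echo "scratch" > tmp.txt
git add old.txt tmp.txt && git commit -q -m "Add files"

mv old.txt new.txt && git add old.txt new.txt   # manual rename
rm tmp.txt && git add tmp.txt                   # manual delete

# Lists what is staged: old.txt renamed to new.txt (R100) and tmp.txt deleted (D),
# the same result git mv and git rm would have produced.
git diff --cached -M --name-status
```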

To make a commit:

git commit                      # opens up $EDITOR to compose commit message
git commit -v                   # short for --verbose; show diff in $EDITOR
git commit -m <commit-message>  # short for --message; inline commit message

To amend last commit with state of current index:

git commit --amend              # opens up $EDITOR to amend commit message

Note that --amend does not modify the last commit in place; it replaces it with a new commit. If the original commit has already been pushed elsewhere, this will create a divergence in commit history, which will require merges, rebases, or force pushes to fix.

To abort an in-progress commit in the interactive editor, just leave the commit message blank.

To undo the last commit:

git reset --soft HEAD^ # HEAD^ is shorthand for "parent of the most recent commit"
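
The following sketch shows both behaviours in a throwaway repository: amending swaps the last commit for a new one (with a new hash), and a soft reset to HEAD^ drops the last commit while keeping its changes staged. File names and commit messages are placeholders.

```shell
set -e
# Throwaway repo; the identity and file names below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q

echo one > a.txt && git add a.txt && git commit -q -m "First"
echo two > b.txt && git add b.txt && git commit -q -m "Second"
before=$(git rev-parse HEAD)

echo three > c.txt && git add c.txt
git commit -q --amend -m "Second, amended"   # folds c.txt into the last commit
after=$(git rev-parse HEAD)
[ "$before" != "$after" ] && echo "amend created a new commit"
git rev-list --count HEAD                    # 2: "Second" was replaced, not added to

git reset -q --soft HEAD^                    # move the branch back to "First"
git rev-list --count HEAD                    # 1
git status --short                           # b.txt and c.txt are still staged ("A ")
```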

Commit history

To see the commit history:

git log
git log -- <path> # only show commits related to <path>
git log --graph   # show commit history as a DAG

To inspect working tree at a commit:

git checkout <hash>

To reset HEAD (and the current branch) to a commit:

git reset <hash>         # same as --mixed
git reset --soft  <hash> # do not touch working tree or index
git reset --mixed <hash> # restore index but not working tree; default
git reset --hard  <hash> # restore working tree and index
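
To make the difference between the three modes concrete, the sketch below resets the same commit away three times and looks at what is left behind each time. The file f.txt and its contents are placeholders.

```shell
set -e
# Throwaway repo; the identity and file name below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q

echo v1 > f.txt && git add f.txt && git commit -q -m "v1"
echo v2 > f.txt && git commit -q -am "v2"
v2=$(git rev-parse HEAD)

git reset -q --soft HEAD^    # HEAD moves back; index and working tree keep v2
git status --short           # "M  f.txt": the change is still staged
git reset -q --soft "$v2"    # return to the v2 commit

git reset -q --mixed HEAD^   # HEAD and index move back; working tree keeps v2
git status --short           # " M f.txt": the change is only in the working tree
git reset -q --hard "$v2"    # return to the v2 commit

git reset -q --hard HEAD^    # HEAD, index, and working tree all move back
cat f.txt                    # v1
```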

Branches

To see all local branches (* next to current branch):

git branch
git branch -a # short for --all; also shows remote branches
git branch -vv # --verbose given twice; also shows HEAD commit and upstream

To check out an existing branch:

git switch <branch-name>
git checkout <branch-name> # also works, overloaded legacy command name

Create and switch to a new branch:

git switch -c <branch-name>
git checkout -b <branch-name> # also works, overloaded legacy command name

Delete a branch:

git branch -d <branch-name> # delete branch only if fully merged into upstream
git branch -D <branch-name> # delete branch forcibly
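
Here is a small sketch of the create, switch, and delete cycle, including the case where -d refuses. The branch name topic and the files are placeholders; since topic has no upstream here, Git checks it against the current branch instead.

```shell
set -e
# Throwaway repo; the identity, branch, and file names below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main          # name the first branch "main"
echo base > f.txt && git add f.txt && git commit -q -m "Base"

git switch -q -c topic                         # create and switch
echo work > g.txt && git add g.txt && git commit -q -m "Unmerged work"
git switch -q main

git branch                                     # lists main (marked with *) and topic
if git branch -d topic 2>/dev/null; then
  echo "deleted"
else
  echo "-d refused: topic is not fully merged" # expected here
fi
git branch -D topic                            # force-delete the branch anyway
```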

Set upstream:

git branch -u <upstream> # short for --set-upstream-to=<upstream>

Remotes

List remotes:

git remote      # show remote names only
git remote -v   # show remote names and URL

Create remote:

git remote add <remote-name> <remote-url>

Rename remote:

git remote rename <old> <new>

Set remote URL:

git remote set-url <remote-name> <remote-url>
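
The remote commands can be tried without any network access by using local bare repositories as stand-ins for remote URLs. The names first.git, second.git, and upstream below are arbitrary choices for this sketch.

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q --bare first.git                   # stand-ins for real remote URLs
git init -q --bare second.git
git init -q local
cd local

git remote add origin "$work/first.git"
git remote rename origin upstream
git remote set-url upstream "$work/second.git"
git remote -v                                  # upstream now points at second.git
```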

Collaboration

To push to current branch’s upstream:

git push    # fails if unable to fast-forward upstream
git push -f # short for --force; forcibly updates upstream to mirror local

To push to any remote, any branch (note the lack of / between <remote> and <branch>):

git push <remote> <branch>

To download commits/refs from remote (but not do anything to any local branches):

git fetch <remote>
git fetch <remote> <branch>

To merge commits into current branch:

git merge <from-local-branch>
git merge <from-remote>/<branch>  # should git fetch first

Fetch and merge (“pull”) commits:

git pull                                          # from upstream remote branch
git pull <remote> <branch>                        # from specified remote branch
git fetch <remote> && git merge <remote>/<branch> # equivalent to above

Note that the above commands may create merge commits if the histories of the local and remote branches have diverged. The commands below avoid this:

git pull --ff-only            # fail if diverged
git pull --rebase             # rebase automatically if diverged
git pull --rebase=interactive # rebase interactively if diverged
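
To watch these pull variants deal with a diverged history, the sketch below sets up a local bare repository as the remote and two clones that each add a commit. The clone names alice and bob, and the one-letter commits, are placeholders of mine.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# First clone ("alice") publishes commit A.
git clone -q remote.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main
echo a > a.txt && git add a.txt && git commit -q -m "A"
git push -q -u origin main
cd ..

# Second clone ("bob") publishes commit B on top of A.
git clone -q remote.git bob
cd bob
echo b > b.txt && git add b.txt && git commit -q -m "B"
git push -q origin main
cd ../alice

# Meanwhile alice commits C locally, so the two histories diverge.
echo c > c.txt && git add c.txt && git commit -q -m "C"
if git pull -q --ff-only 2>/dev/null; then
  echo "fast-forwarded"
else
  echo "--ff-only refused: histories have diverged"   # expected here
fi
git pull -q --rebase          # replays C on top of B
git log --format=%s           # C, B, A: linear, no merge commit
```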

Rebasing commit history:

git rebase                        # rebase current branch onto upstream branch
git rebase --onto <other-branch>  # replay commits not in upstream onto <other-branch> as new base

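Note that by default, rebasing does not preserve merge commits: it replays the branch as a linear sequence of commits. Keeping merges requires a special flag (--rebase-merges).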

To amend commit history during rebase (edit, reorder, squash, etc.):

git rebase -i # short for --interactive; specify what to do for each commit

Note that any kind of rebase (via git rebase or via git pull --rebase) may rewrite history. Like git commit --amend, if your history prior to rebase had already been published to a remote, you will need to merge, rebase, and/or force push in order to align the diverged histories.

Concepts

Commits and commit histories

Commits point to a snapshot of what a repo looks like at a certain point in time, and each commit has zero or more parent commits. Commits with zero parents are root commits, while commits with two or more parents are called merge commits.

The commit history of a repo forms a directed, acyclic graph of snapshots of the repo. To visualize this DAG, run:

git log --graph

Note that all commits in this graph are reachable by traversing the parent commits from the latest commit. When every commit has at most one parent, we say that the commit history is linear. This is desirable because it totally orders all the commits, making it as if all modifications to the repo were performed sequentially.
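
One way to check these properties from the command line is to count root commits and merge commits with git rev-list. The sketch below builds a tiny three-commit history to try this on; the commit names A, B, and C are placeholders.

```shell
set -e
# Throwaway repo; the identity and commit names below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q

for name in A B C; do
  echo "$name" > "$name.txt"
  git add "$name.txt" && git commit -q -m "$name"
done

git log --format='%h %s  parents: %p'      # each line lists a commit's parent hashes
git rev-list --count --max-parents=0 HEAD  # 1: the single root commit (A)
git rev-list --count --merges HEAD         # 0: no merge commits, so the history is linear
```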

Branches

It is often useful to work with several commit histories concurrently. In Git, you do this with branches. A branch is just a convenient way of referring to a particular commit, called the HEAD commit. Whenever you make a commit, the current branch is updated to point to the new commit, which becomes its HEAD commit.

You can see all the branches available in your repo with the command:

git branch

There will be a * next to the branch you are currently on. We say this branch is “checked out”. By default, you should be checked out on the main branch (previously, this was named master, but that is being phased out).

To check out another existing branch, run:

git checkout <branch-name>

This will restore your working directory to the snapshot captured by the HEAD commit of the branch you are switching to. Note that this may fail if you have uncommitted changes on your current branch that would be overwritten by checking out the other branch, so you should commit (or stash) those changes before checking out another branch.
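
Here is a sketch of that failure mode and the simplest way around it, in a throwaway repository. The branch name topic and the file f.txt are placeholders.

```shell
set -e
# Throwaway repo; the identity, branch, and file names below are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
echo base > f.txt && git add f.txt && git commit -q -m "Base"
git checkout -q -b topic
echo topic > f.txt && git commit -q -am "Topic change"
git checkout -q main

echo "local edit" > f.txt                 # uncommitted change on main
if git checkout -q topic 2>/dev/null; then
  echo "switched"
else
  echo "checkout refused: f.txt would be overwritten"   # expected here
fi

git commit -q -am "Save local edit"       # commit first...
git checkout -q topic                     # ...and now the checkout succeeds
cat f.txt                                 # topic
```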

Merging vs rebasing

When different commits are made to different branches, their commit histories are said to have diverged. Even if they don’t conflict (e.g., because they only make changes to different files), it still isn’t obvious how best to combine these histories.

For instance, consider two branches, main and topic, with a shared history up to commit B. topic builds commits E, F, and G off of B, but main has since moved on and added commits C and D. This leads to the following graph:

      E---F---G   topic
     /
A---B---C---D     main

Note that from the perspective of topic, C and D just don’t exist; if you run git log on topic, you will only see the following:

A---B---E---F---G   topic

The easiest resolution here is to create a merge commit M to join together the two histories:

      E---F---G---M   topic
     /           /
A---B------C----D     main

If there are any conflicts between E-F-G and C-D, this merge commit is also an opportunity to mark how these conflicts are resolved (recall that it is just a snapshot of the working tree state).

Now, assume that after committing H onto topic, we are finally finished working on it, and want to merge it back into main. If main has not accumulated any subsequent commits since D, then all we need to do is point its HEAD to that of topic, commit H (this is known as a fast-forward):

      E---F---G---M---H   main
     /           /
A---B------C----D

Note that this leaves behind a non-linear history that can quickly grow into a sprawl as main accumulates merge commits from other branches.
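
The whole merge scenario can be replayed in a throwaway repository. In the sketch below, the helper mk_commit is a made-up convenience that writes one small file per commit, so that none of the commits conflict.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
# Hypothetical helper: one new file per commit, named after the commit.
mk_commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk_commit A; mk_commit B
git switch -q -c topic; mk_commit E; mk_commit F; mk_commit G
git switch -q main;     mk_commit C; mk_commit D

git switch -q topic
git merge -q --no-edit main          # creates merge commit M on topic
mk_commit H
git switch -q main
git merge -q --ff-only topic         # main has nothing new, so it just moves to H

git log --graph --oneline            # shows the non-linear history
git rev-list --count --merges main   # 1: the merge commit M
```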

An alternative to merging is to rebase topic’s commits onto main. Consider the commit graph prior to the merge:

      E---F---G   topic
     /
A---B---C---D     main

Instead of reusing commits E, F, and G, we can replay those changes on top of D, and create alternative commits E', F', and G':

              E'--F'--G'  topic
             /
A---B---C---D             main

Note that E', F', and G' are distinct commits from E, F, and G, because their snapshots are now based on the changes introduced by C and D. E, F, and G still exist in the sense that they weren’t actually rewritten here, but are no longer reachable from the topic branch (whose HEAD now points to G'):

        E---F---G         (abandoned)
       /
      /       E'--F'--G'  topic
     /       /
A---B---C---D             main
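
The same scenario can be replayed with a rebase instead of a merge. As a sketch under the same assumptions as the diagrams (one small, non-conflicting file per commit, created by a made-up helper mk_commit), this shows that G' is a new commit while the original G is still in the object store:

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
# Hypothetical helper: one new file per commit, named after the commit.
mk_commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk_commit A; mk_commit B
git switch -q -c topic; mk_commit E; mk_commit F; mk_commit G
git switch -q main;     mk_commit C; mk_commit D

git switch -q topic
old_g=$(git rev-parse HEAD)      # the original G
git rebase -q main               # replay E, F, G on top of D
new_g=$(git rev-parse HEAD)      # G'

git log --format=%s              # G, F, E, D, C, B, A: linear
[ "$old_g" != "$new_g" ] && echo "G' is a different commit from G"
git cat-file -t "$old_g"         # commit: the original G still exists
if git merge-base --is-ancestor "$old_g" topic; then
  echo "G is still reachable from topic"
else
  echo "G is no longer reachable from topic"   # expected here
fi
```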

Yet we still say that we are rewriting history because from the perspective of topic, it does appear that we’ve introduced entirely new commits to replace the old ones. And while we’re rewriting history anyway, we can introduce additional changes as we’re creating these new commits.

We typically access these additional features by performing an interactive rebase:

git rebase -i # short for --interactive

This opens up your editor with the list of the commits the rebase is going to perform, and allows you to edit that list to instruct Git to do something different:

pick E Commit message of E
pick F Commit message of F
pick G Commit message of G

For instance, you can change a commit message, or pause the rebase to edit some files before resuming. You can also reorder commits; to perform E last, after F and G:

pick F Commit message of F
pick G Commit message of G
pick E Commit message of E

Another thing you can do is combine commits, otherwise known as squashing commits. This accumulates their changes into a single commit. For instance, to squash commits F and G together, before performing E:

pick F Commit message of F
squash G Commit message of G
pick E Commit message of E
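
Normally you would make this edit by hand in your editor. To try it without one, the sketch below scripts the same edit: it points GIT_SEQUENCE_EDITOR at a sed command that moves the first todo line (E) after the third (G) and turns G's pick into squash. The sed script and the helper mk_commit are my own stand-ins, and GIT_EDITOR=true simply accepts the default combined commit message.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
# Hypothetical helper: one new file per commit, named after the commit.
mk_commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk_commit A; mk_commit B
git switch -q -c topic; mk_commit E; mk_commit F; mk_commit G

# Stand-in for hand-editing the todo list:
#   line 1 (pick E) is held back and re-emitted after line 3,
#   line 3 (pick G) becomes "squash G".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1{h;d;}' -e '3{s/^pick/squash/;G;}'" \
GIT_EDITOR=true git rebase -i main

git log --format=%s main..topic    # E, then the squashed F+G commit (subject "F")
git rev-list --count main..topic   # 2
```

The working tree still contains E.txt, F.txt, and G.txt afterwards; only the shape of the history has changed.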

If you have a messy or outdated commit history, you can use interactive rebasing to tidy it up locally and bring it up to date, before pushing or merging it elsewhere.

However, be mindful that this does rewrite history, and so if your pre-rebase commits were already published elsewhere, your rebased branch will have diverged from the published one. For instance, if you had already pushed E, F, and G upstream, you will be unable to push your rebased branch without first pulling to merge those commits back in, defeating the purpose of rebasing in the first place. In this case, the solution is to force push:

git push -f # short for --force

This will forcibly overwrite the upstream HEAD with your current HEAD. If the upstream commits contain any data not in your local commit history, that data will be lost, so force pushing is usually done as a last resort.
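
To see the rejection and the force push in action, here is a sketch using a local bare repository as the remote. For brevity it rewrites history with git commit --amend rather than a full rebase; the effect on pushing is the same. The repository names and the commit are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main
git clone -q remote.git local 2>/dev/null
cd local
git symbolic-ref HEAD refs/heads/main

echo e > e.txt && git add e.txt && git commit -q -m "E"
git push -q -u origin main

git commit -q --amend -m "E, reworded"      # rewrite the already-published commit
if git push -q 2>/dev/null; then
  echo "pushed"
else
  echo "push rejected: not a fast-forward"  # expected here
fi
git push -q -f                              # overwrite the upstream branch
git rev-parse HEAD origin/main              # both lines show the same hash
```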