Understanding Git — Branching
This is the second post in my Understanding Git series, so be sure to check out the first post, which deals with git’s data model, before you start with this one.
Let’s start where we left off last time — at git’s data model. Only this time we will simplify it a bit by only displaying the commit objects and giving them some symbolic names instead of checksums (just to make it easier to follow), so we get a graph like this:
Git data model simplified by displaying only commit objects
Those familiar with graph theory will notice that this is a Directed Acyclic Graph (DAG). That means the edges between graph nodes (in git’s case, commits) are directed, and if you start from one node and travel through the graph following the direction of the edges, you can never return to the node you started from (there are no “round trips”).
It is fairly intuitive that we can distinguish three branches in our example graph. We’ll mark them as red (containing commits A, B, C, D, E), blue (containing commits A, B, F, G) and green (containing commits A, B, H, I, J).
Git data graph containing three branches
So that’s one way of defining a branch — to associate it with a list of commits it contains. However, this is not the way git does it. Git uses a simpler and cheaper solution. Instead of having a list of all the commits belonging to a branch and keeping it updated, git only keeps track of the last commit on a branch. By knowing the last commit of a branch it is quite trivial to reconstruct the whole commits list of that branch just by following the directed edges of the git commit graph. For example, to define our blue branch, we only need to know that the last commit on the blue branch is G and from there if we need a list of all commits the blue branch contains we can just follow the directed graph edges starting from G.
Knowing the last commit on the Blue branch we can easily reconstruct its whole commits list
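The walk described above can be tried out with git rev-list, which prints every commit reachable from a starting point by following parent links. The following is a minimal sketch in a throwaway repository; the branch name blue and the commit messages A, B and G are placeholders borrowed from the figure (the real blue branch also contains F), not part of the example repository built later in this post.

```shell
# Throwaway repository; branch name and messages are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Set an identity locally so commits work without any global config.
git config user.name "Example"
git config user.email "example@example.com"
# Name the (still empty) initial branch "blue".
git symbolic-ref HEAD refs/heads/blue

# Build a short linear history: A <- B <- G.
for msg in A B G; do
    git commit -q --allow-empty -m "$msg"
done

# Only the tip (G) is recorded for the branch; everything else is found
# by following each commit's parent pointer back to the root.
commits=$(git rev-list blue | wc -l | tr -d ' ')
subjects=$(git log --format=%s blue | tr '\n' ' ')
echo "$commits commits reachable from blue: $subjects"
```

The list comes out tip first, because the walk starts at the commit the branch points to and moves towards the root.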
And this is how git manages branches: by keeping pointers to commits. So let’s see it “in action”.
First, we will initialise an empty repository
git init
and take a look at the .git directory
$ tree .git/
.git/
├── HEAD
├── config
├── description
├── hooks
│   ├── applypatch-msg.sample
│   ├── commit-msg.sample
│   ├── post-update.sample
│   ├── pre-applypatch.sample
│   ├── pre-commit.sample
│   ├── pre-push.sample
│   ├── pre-rebase.sample
│   ├── pre-receive.sample
│   ├── prepare-commit-msg.sample
│   └── update.sample
├── info
│   └── exclude
├── objects
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags
This time we will focus on the refs sub-directory. It stands for references and this is where git keeps the branch pointers.
Since we haven’t committed any changes yet, the heads and tags sub-directories under refs are empty, so we will create and commit a few files.
echo "Hello World" > helloEarth.txt
git add .
git commit -m "Hello World Commit"

echo "Hello Mars" > helloMars.txt
git add .
git commit -m "Hello Mars Commit"

echo "Hello Saturn" > helloSaturn.txt
git add .
git commit -m "Hello Saturn Commit"
If we do git branch now we see this output
* master
meaning we are now on the master branch (which git created automatically upon our first commit; depending on the init.defaultBranch setting, newer git setups may name it main instead).
If we take another look at .git/refs
└── refs
    ├── heads
    │   └── master
    └── tags
we see there is a file in refs/heads sub-directory and it is named master just as our branch is. This is a text file so we can use cat to take a look at it
cat .git/refs/heads/master
and we see it contains a checksum
c641e4f0d19df0570667977edff860fed8f6c05a
and if we do
git log
we see it is the checksum of our last commit:
commit c641e4f0d19df0570667977edff860fed8f6c05a (HEAD -> master)
Author: zspajich <zspajich@gmail.com>
Date:   Mon Feb 12 16:28:44 2018 +0100

    Hello Saturn Commit
(Note: checksums will have different values on your computer)
So there we have it — a branch in git is just a text file containing a checksum of the last commit on that branch. In other words — a pointer to a commit.
A branch in git is just a pointer to a commit object
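The same check can be scripted. The sketch below assumes a fresh throwaway repository with a single empty commit, and compares what git resolves the branch name to with what the ref file holds on disk. It relies on the ref being stored as a loose file, which is the case right after a commit; after git pack-refs or git gc the same pointer may live in .git/packed-refs instead.

```shell
# Throwaway repository with one commit on master (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
# Make sure the initial branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Hello World Commit"

# What git resolves the branch name to...
resolved=$(git rev-parse master)
# ...and what the ref file on disk contains.
on_disk=$(cat .git/refs/heads/master)

if [ "$resolved" = "$on_disk" ]; then
    echo "master is a file holding $resolved"
fi
```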
If we now create and check out a new feature branch
git checkout -b feature
and take another look at .git/refs
tree .git/refs
sure enough, we see another file called feature
└── refs
    ├── heads
    │   ├── feature
    │   └── master
and if we take a look at its checksum (pointer)
cat .git/refs/heads/feature
we see it’s the same as in the master file (branch)
c641e4f0d19df0570667977edff860fed8f6c05a
since we didn’t do any new commits on that branch.
Creating a new branch means creating a new pointer to the current commit
So that’s how fast and cheap creating a new branch in git is. Git just creates a text file and fills it with the checksum of the current commit.
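To see that nothing more than a small file is involved, the sketch below writes such a file by hand and then asks git what it sees. This is for illustration only: the branch name handmade is made up, the trick assumes git's default file-based ref storage, and in real work git branch or git update-ref is the right tool.

```shell
# Throwaway repository with one commit (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "Hello World Commit"

# Write the current commit's checksum into a new file under refs/heads.
git rev-parse HEAD > .git/refs/heads/handmade

# Git now reports "handmade" as a branch pointing at the same commit.
listed=$(git for-each-ref --format='%(refname:short)' refs/heads/handmade)
target=$(git rev-parse handmade)
echo "$listed -> $target"
```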
But now that we have two branches, there is one question: how does git know which of these two branches we currently have checked out? Well, there is one more special pointer (whose name will probably sound familiar to you) called HEAD. It is special because it (usually) doesn’t point to a commit object, but to a ref (branch), and git uses it to track which branch is currently checked out.
If we look inside HEAD
cat .git/HEAD
we see it currently points to the feature ref file (branch).
ref: refs/heads/feature
Special HEAD pointer tracks current ref/branch
If we ran
git checkout master
and take a look at HEAD
cat .git/HEAD
we would see
ref: refs/heads/master
it would point to the master branch.
HEAD points to master ref after checkout on master branch
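Instead of reading the file directly, HEAD can also be queried with git symbolic-ref. The sketch below, run in a throwaway repository, repeats the two checkouts and then shows the case behind the “usually” above: checking out with --detach makes HEAD hold a raw checksum instead of a ref. The repository setup is an assumption for illustration.

```shell
# Throwaway repository with one commit on master (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Hello World Commit"
git checkout -q -b feature

# HEAD is a symbolic ref: it names a branch rather than a commit.
on_feature=$(git symbolic-ref HEAD)
git checkout -q master
on_master=$(git symbolic-ref HEAD)
echo "HEAD was $on_feature, now $on_master"

# Detaching HEAD makes the file hold a commit checksum directly,
# with no "ref: ..." line and no branch involved.
git checkout -q --detach
detached=$(cat .git/HEAD)
echo "detached HEAD contains $detached"
```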
So that’s git’s branch model. It is very simple, but it is important to know in order to understand the many git operations that work on that graph (merge, rebase, checkout, revert, …).
finally figuring out that git commands are strangely named graph manipulation commands--creating/deleting nodes, moving pointers around
In our next and last part of this series we will look at something that we have skipped so far — git workflow. We all know we have to stage our changes before committing them, but what exactly is that staging directory or index as it is sometimes called? We’ll see in the next post.
Understanding Git — Branching was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.