1. low-level commands
- git ls-files --stage: view the index content.
- git cat-file <blob|tree|commit> hash: view an object's content.
- git ls-tree hash: view a tree object's content. The output of git cat-file tree hash is not human-readable, so use this command instead.
- git show is a high-level command, so it does not show these details.
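To see these commands side by side, here is a minimal sketch. It assumes a scratch repository created with mktemp; the demo identity passed with -c is a placeholder and not part of these notes.

```shell
# Scratch repository; the path and the identity are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
printf '1' > number.txt
git add number.txt
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m '1'

# Index content: mode, blob name, stage number, path.
git ls-files --stage

# Type and raw content of one object, addressed by its name.
blob=$(git rev-parse HEAD:number.txt)
git cat-file -t "$blob"
git cat-file blob "$blob"

# Readable listing of the tree behind the latest commit.
tree=$(git rev-parse 'HEAD^{tree}')
git ls-tree "$tree"
```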
2. check hash
- (printf 'blob %s\0' $(git cat-file blob hash | wc -c); git cat-file blob hash) | sha1sum
- replace blob with another object type, such as tree or commit, and replace hash with the name of your own object.
- ref: what-is-a-git-commid-id
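The same check can be wrapped in a small helper that works on any file, without a repository. This is a sketch: the function name object_hash is made up here, and it assumes sha1sum from GNU coreutils and a repository using the default SHA-1 object format.

```shell
# object_hash TYPE FILE
# Hash FILE's bytes the way Git names an object: SHA-1 over the header
# "TYPE SIZE\0" followed by the content. (Hypothetical helper name.)
object_hash() {
    size=$(wc -c < "$2" | tr -d ' ')   # strip padding that some wc versions add
    { printf '%s %s\0' "$1" "$size"; cat "$2"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp)
printf '1' > "$tmp"
object_hash blob "$tmp"
```

If Git is installed, git hash-object "$tmp" should print the same name.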
3. practice
step 1: initialize a repository and make a commit
~ $ mkdir alpha
~ $ cd alpha
~ $ git init
~ $ printf '1' > number.txt
~ $ git add number.txt
The working copy looks like this:
alpha
|——number.txt
You can check the .git/objects directory.
After the add operation, Git creates a directory named 56 that contains the file a6051ca2b02b04ef92d5150c9ef600403cb1de.
The hash 56a6051ca2b02b04ef92d5150c9ef600403cb1de is the name of the blob; its first two characters are used as the directory name.
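You can confirm the directory layout yourself. The sketch below assumes a fresh scratch repository, so the only file under .git/objects is the new blob; the variable names are just for illustration.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
printf '1' > number.txt
git add number.txt

# Loose objects live at .git/objects/<first 2 hex chars>/<remaining chars>.
find .git/objects -type f

# Rebuild the full object name from the path and ask Git for its type.
path=$(find .git/objects -type f | head -n 1)
name=$(printf '%s' "$path" | sed 's|^\.git/objects/||; s|/||')
git cat-file -t "$name"
```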
~ $ git commit -m '1'
After the commit, Git creates a tree object and a commit object.
The tree object hash is 920ec0a249d3e1cb9fef5927f0e19758ed8f1455
The commit object hash is 9cf95226987e82287cf00f58bbff21a2a4b20685 (yours will differ, because a commit also records the author and the timestamp).
Fig. 1 shows the result:
Fig. 1 Tree graph for commit ‘1’
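To look inside the two new objects, a sketch like the following may help. It assumes a scratch repository and a placeholder identity. It prints the commit object, whose content is the tree name, the author and committer lines with their timestamps, and the message.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
printf '1' > number.txt
git add number.txt
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m '1'

commit=$(git rev-parse HEAD)
tree=$(git rev-parse 'HEAD^{tree}')

# -p pretty-prints an object according to its type.
git cat-file -p "$commit"

# The first line of the commit names the root tree.
git cat-file -p "$commit" | head -n 1
echo "tree $tree"
```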
step 2: commit ‘data/1’
~ $ mkdir data
~ $ printf '1' > data/number.txt
~ $ git add data/number.txt
~ $ git commit -m 'data/1'
The working copy looks like this:
alpha
|—— number.txt
|—— data
    |—— number.txt
We can treat each commit as a new version of the project. As the project goes on, each commit links to the previous commit, forming a commit chain. Each commit points to a tree structure. If an earlier commit already contains an object, Git reuses it: the new tree points to the existing object instead of creating a copy.
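The link between commits is stored inside the commit object itself. Here is a minimal sketch, assuming a scratch repository; the helper demo_commit and its identity are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Hypothetical helper: commit with a fixed placeholder identity.
demo_commit() { git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "$1"; }

printf '1' > number.txt
git add number.txt
demo_commit '1'

mkdir data
printf '1' > data/number.txt
git add data/number.txt
demo_commit 'data/1'

# The second commit carries a "parent" line naming the first commit.
git cat-file -p HEAD | grep '^parent'
git rev-parse HEAD~1

# The first commit has no parent line (grep prints nothing here).
git cat-file -p HEAD~1 | grep '^parent'
```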
Read the tree structure from the bottom up. The leaf nodes are blob objects.
- file (data/number.txt) -> blob object (56)
- blob object (56) -> tree object (92)
- tree object (92) + blob object (56) -> tree object (ee)
Fig. 2 shows the commit graph.
Fig 2 commit ‘data/1’
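The reuse described above can be checked directly. The sketch below assumes a scratch repository and a made-up helper demo_commit; it lists every tree and blob of the second commit and compares object names.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Hypothetical helper: commit with a fixed placeholder identity.
demo_commit() { git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "$1"; }

printf '1' > number.txt
git add number.txt
demo_commit '1'

mkdir data
printf '1' > data/number.txt
git add data/number.txt
demo_commit 'data/1'

# -r recurses into subtrees, -t also prints the tree entries themselves.
git ls-tree -r -t HEAD

# Same content, same name: both files point to one blob.
git rev-parse HEAD:number.txt HEAD:data/number.txt

# The data tree is the same object as the first commit's root tree.
git rev-parse HEAD:data 'HEAD~1^{tree}'
```

Each rev-parse call should print the same name twice.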
step 3: commit ‘2’
We change the content of number.txt. We can infer that Git will create a new blob object. Because the root tree object (ee) points to that blob, a new root tree is needed in place of (ee), and likewise a new commit object that follows commit (44). All other objects are reused unchanged. Let’s verify it.
~ $ printf '2' > number.txt
~ $ git add number.txt
~ $ git commit -m '2'
After commit ‘2’, the commit graph looks like Fig. 3.
Fig 3. commit ‘2’
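One way to verify the prediction is to compare object names before and after the commit. This is a sketch under the same assumptions as before: a scratch repository and a made-up helper demo_commit.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Hypothetical helper: commit with a fixed placeholder identity.
demo_commit() { git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "$1"; }

printf '1' > number.txt
git add number.txt
demo_commit '1'
mkdir data
printf '1' > data/number.txt
git add data/number.txt
demo_commit 'data/1'
printf '2' > number.txt
git add number.txt
demo_commit '2'

# New objects: the blob for number.txt and the root tree (names differ).
git rev-parse HEAD~1:number.txt HEAD:number.txt
git rev-parse 'HEAD~1^{tree}' 'HEAD^{tree}'

# Reused objects: the data tree and the blob below it (names are equal).
git rev-parse HEAD~1:data HEAD:data
git rev-parse HEAD~1:data/number.txt HEAD:data/number.txt
```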
step 4: commit ‘data/3’
~ $ printf '3' > data/number.txt
~ $ git add data/number.txt
~ $ git commit -m 'data/3'
We can be sure that this affects the blob (data/number.txt), the tree (data), the root tree, and the commit object, as shown in Fig. 4.
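Git can also report the affected entries itself. The sketch below assumes a scratch repository and a made-up helper demo_commit; it uses git diff-tree to compare the trees of the last two commits.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Hypothetical helper: commit with a fixed placeholder identity.
demo_commit() { git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "$1"; }

printf '1' > number.txt
git add number.txt
demo_commit '1'
mkdir data
printf '1' > data/number.txt
git add data/number.txt
demo_commit 'data/1'
printf '2' > number.txt
git add number.txt
demo_commit '2'
printf '3' > data/number.txt
git add data/number.txt
demo_commit 'data/3'

# -r recurses into subtrees, -t also reports the changed tree entries.
# Only entries whose object name differs between the two commits are listed.
git diff-tree -r -t --name-only HEAD~1 HEAD
```

The top-level number.txt is not listed, because its blob is reused.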
You can keep exploring what happens when you run add or commit, and verify your predictions. For now, we stop here.
summary
Git is built on a graph. Almost every Git command manipulates this graph. To understand Git deeply, focus on the properties of this graph, not workflows or commands.
To learn more about Git, investigate the .git directory.
- Git is built on a graph data structure. If we focus only on commit objects, we can treat it as a linked list, which is easier to understand.
- If we then focus on one specific commit and ignore object reuse, we get a simple tree, much like a familiar directory tree.
- Finally, when commits share the same objects, Git stores each object only once, and every commit points to that single copy.
- Hashes are generated from the bottom (blob objects) up (commit object).
- Try it yourself to understand Git better.
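The points above can be checked on the four commits from the practice. This is a sketch, assuming a scratch repository and a made-up helper demo_commit: it prints the commit chain, then counts the distinct objects reachable from HEAD.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
# Hypothetical helper: commit with a fixed placeholder identity.
demo_commit() { git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "$1"; }

printf '1' > number.txt
git add number.txt
demo_commit '1'
mkdir data
printf '1' > data/number.txt
git add data/number.txt
demo_commit 'data/1'
printf '2' > number.txt
git add number.txt
demo_commit '2'
printf '3' > data/number.txt
git add data/number.txt
demo_commit 'data/3'

# Linked list view: each line is a commit followed by its parent.
git rev-list --parents HEAD

# Every reachable object is listed once: 4 commits + 5 trees + 3 blobs.
total=$(git rev-list --objects HEAD | wc -l | tr -d ' ')
echo "$total"
```

Without reuse, the four snapshots would need 4 commits, 7 trees, and 7 blobs, which is 18 objects instead of 12.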