Version Control (Git) - MIT Missing Semester

Version control systems (VCSs) are tools used to track changes to source code (or other collections of files and folders). While other VCSs exist, Git is the de facto standard for version control. A common but fragile way to use Git is to memorize a few shell commands and type them to sync up, and, on any error, to save your work elsewhere, delete the project, and download a fresh copy. Understanding Git’s data model makes those commands much easier to reason about.

Git’s data model

Snapshots

In Git terminology, a file is called a “blob”, and it’s just a bunch of bytes. A directory is called a “tree”, and it maps names to blobs or trees (so directories can contain other directories). A snapshot is the top-level tree that is being tracked. For example, we might have a tree as follows:

<root> (tree)
|
+- foo (tree)
|  |
|  + bar.txt (blob, contents = "hello world")
|
+- baz.txt (blob, contents = "git is wonderful")

Modeling history: relating snapshots

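In Git, a history is a directed acyclic graph (DAG) of snapshots. Git calls these snapshots “commits”. Each commit refers to a set of “parents”, the snapshots that preceded it. In ASCII art, a commit history might look like the following, where each o corresponds to an individual commit (snapshot) and the arrows point to the parent of each commit:

o <-- o <-- o <-- o
            ^
             \
              --- o <-- o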
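The parent-pointer structure can be sketched in Python. This is only an illustration, not Git's implementation: the commit names (`c1` to `c5`) and the `ancestors` helper are hypothetical, and each commit is reduced to the list of its parents.

```python
# Toy history: each (hypothetical) commit id maps to the ids of its parents.
# c4 has two parents (a merge), so the history is a DAG rather than a line.
parents = {
    "c1": [],            # root commit, no parent
    "c2": ["c1"],
    "c3": ["c2"],        # one line of development
    "c5": ["c2"],        # a second line branching off c2
    "c4": ["c3", "c5"],  # merge commit joining both lines
}

def ancestors(commit_id):
    """Return every commit reachable from commit_id by following parent arrows."""
    seen = []
    stack = [commit_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already reached via another path through the DAG
        seen.append(current)
        stack.extend(parents[current])
    return seen

history = ancestors("c4")
```

Because arrows only ever point from a commit to its parents, walking them from any commit always terminates at a root.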

Data model, as pseudocode

On disk, all Git stores are objects and references: that’s all there is to Git’s data model.

// a file is a bunch of bytes
type blob = array<byte>

// a directory contains named files and directories
type tree = map<string, tree | blob>

// a commit has parents, metadata, and the top-level tree
type commit = struct {
	parents: array<commit>
	author: string
	message: string
	snapshot: tree
}
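For readers who prefer something executable, the same three types can be approximated in Python. This is a loose sketch: the author name and commit messages are invented for illustration, and real Git stores more metadata than shown here.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

Blob = bytes                            # a file is a bunch of bytes
Tree = Dict[str, Union["Tree", Blob]]   # a directory maps names to trees or blobs

@dataclass
class Commit:
    parents: List["Commit"]
    author: str
    message: str
    snapshot: Tree

# The example directory structure from the Snapshots section.
root: Tree = {
    "foo": {"bar.txt": b"hello world"},
    "baz.txt": b"git is wonderful",
}

# "alice" and both messages are placeholders, not part of the original notes.
first = Commit(parents=[], author="alice", message="initial commit", snapshot=root)
second = Commit(
    parents=[first],
    author="alice",
    message="edit baz.txt",
    # Only baz.txt changes; the unchanged foo subtree is shared, not copied.
    snapshot={**root, "baz.txt": b"git is very wonderful"},
)
```

Note that `second` reuses the `foo` subtree of `first`, which hints at why storing full snapshots is cheaper than it sounds.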

Objects and content-addressing

An “object” is a blob, tree, or commit.

type object = blob | tree | commit

In Git’s data store, all objects are content-addressed by their SHA-1 hash.

objects = map<string, object>

def store(object):
	id = sha1(object)
	objects[id] = object

def load(id):
	return objects[id]
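The pseudocode above translates almost directly into Python with `hashlib`. The sketch below hashes raw bytes, as the pseudocode does; the extra `git_blob_id` helper (a name chosen here) shows the one adjustment Git makes for blobs, which is to hash a short `blob <size>` header and a NUL byte in front of the content.

```python
import hashlib

objects = {}  # maps hex SHA-1 id -> stored bytes

def store(data: bytes) -> str:
    """Store data under the SHA-1 of its content and return that id."""
    object_id = hashlib.sha1(data).hexdigest()
    objects[object_id] = data
    return object_id

def load(object_id: str) -> bytes:
    return objects[object_id]

def git_blob_id(content: bytes) -> str:
    """Id Git gives a blob: SHA-1 of a 'blob <size>' header, a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

first_id = store(b"git is wonderful")
second_id = store(b"git is wonderful")  # identical content, identical id
```

Storing the same bytes twice yields the same id and a single entry, so identical files are deduplicated for free.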

For example, the tree for the example directory structure above (the snapshot) looks like this:

% git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d
100644 blob 4448adbf7ecd394f42ae135bbeed9676e894af85    baz.txt
040000 tree c68d233a33c5c06e0340e4c224f0afca87c8ce87    foo

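The tree itself contains pointers to its contents, baz.txt (a blob) and foo (a tree). When objects reference other objects, they don’t actually contain them in their on-disk representation, but refer to them by their hash. For example, looking up the hash listed for baz.txt returns that file’s contents: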

% git cat-file -p 4448adbf7ecd394f42ae135bbeed9676e894af85
git is wonderful
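Referencing by hash can be sketched as follows. The JSON serialization and the `read_file` helper are assumptions made for this example and do not match Git's real on-disk format; the point is only that a tree stores ids, and every path component costs one lookup.

```python
import hashlib
import json

objects = {}  # id -> (kind, payload)

def store(kind, payload):
    # Serialize deterministically so equal content always hashes to the same id.
    data = json.dumps([kind, payload], sort_keys=True).encode()
    object_id = hashlib.sha1(data).hexdigest()
    objects[object_id] = (kind, payload)
    return object_id

# Build the example directory bottom-up: children first, then the trees naming them.
bar_id = store("blob", "hello world")
baz_id = store("blob", "git is wonderful")
foo_id = store("tree", {"bar.txt": bar_id})
root_id = store("tree", {"foo": foo_id, "baz.txt": baz_id})

def read_file(tree_id, path):
    """Follow a path such as 'foo/bar.txt', looking up one hash per component."""
    kind, payload = objects[tree_id]
    for name in path.split("/"):
        if kind != "tree":
            raise ValueError("cannot descend into a " + kind)
        kind, payload = objects[payload[name]]
    return payload

contents = read_file(root_id, "foo/bar.txt")
```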

References

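Git uses human-readable names for SHA-1 hashes, called “references”. References are pointers to commits. Unlike objects, which are immutable, references are mutable and can be updated to point to a new commit. For example, the master reference usually points to the latest commit in the main branch of development.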

references = map<string, string>

def update_reference(name, id):
	references[name] = id

def read_reference(name):
	return references[name]

def load_reference(name_or_id):
	if name_or_id in references:
		return load(references[name_or_id])
	else:
		return load(name_or_id)
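A runnable approximation of the reference functions is given below. It assumes the content-addressed `store` from earlier, redefined here so the block stands alone, and uses placeholder bytes in place of real commit objects.

```python
import hashlib

objects = {}     # id -> bytes (immutable once stored)
references = {}  # name -> id (mutable)

def store(data):
    object_id = hashlib.sha1(data).hexdigest()
    objects[object_id] = data
    return object_id

def update_reference(name, object_id):
    references[name] = object_id

def read_reference(name):
    return references[name]

def load_reference(name_or_id):
    # A known name is resolved to its id first; anything else is treated as an id.
    object_id = references.get(name_or_id, name_or_id)
    return objects[object_id]

# Placeholder payloads standing in for two commits.
old_id = store(b"first commit (placeholder)")
new_id = store(b"second commit (placeholder)")

update_reference("master", old_id)
before = load_reference("master")
update_reference("master", new_id)  # the name moves; both objects stay in the store
after = load_reference("master")
```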

In Git, “where we currently are” is a special reference called “HEAD”. So, when we take a new snapshot, we know what it is relative to (this is how we set the parents field of the commit).
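How HEAD determines a new commit's parents can be sketched like this. It is a simplification: HEAD is modeled as the string `ref: master` (real Git stores a longer path such as `ref: refs/heads/master`), and the `resolve` and `commit` helpers are hypothetical names.

```python
import hashlib

objects = {}
references = {"HEAD": "ref: master"}  # HEAD names the current branch

def resolve(name):
    """Return the commit id a reference leads to, or None if there is no commit yet."""
    value = references.get(name)
    if value is not None and value.startswith("ref: "):
        return resolve(value[len("ref: "):])
    return value

def commit(message, snapshot):
    parent = resolve("HEAD")
    parents = [parent] if parent else []  # the very first commit has no parent
    record = repr((parents, message, snapshot)).encode()
    commit_id = hashlib.sha1(record).hexdigest()
    objects[commit_id] = {"parents": parents, "message": message, "snapshot": snapshot}
    # Advance the branch HEAD names, so the new commit becomes "where we are".
    branch = references["HEAD"][len("ref: "):]
    references[branch] = commit_id
    return commit_id

first = commit("initial", {"baz.txt": "git is wonderful"})
second = commit("update", {"baz.txt": "git is very wonderful"})
```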

Repositories

We can now define what (roughly) a Git repository is: it is the collection of objects and references. All git commands map to some manipulation of the commit DAG by adding objects and adding/updating references.
Whenever you’re typing in any command, think about what manipulation the command is making to the underlying graph data structure. Conversely, if you’re trying to make a particular kind of change to the commit DAG, e.g. “discard uncommitted changes and make the ‘master’ ref point to commit 5d83f9e”, there’s probably a command to do it:

git checkout master
git reset --hard 5d83f9e
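In terms of the data model, those two commands amount to two reference updates plus an overwrite of the working files. The sketch below makes that explicit; every commit id other than 5d83f9e, the `feature` branch, and the file name are invented, and the real `git checkout` does more than move HEAD.

```python
# Commit ids other than 5d83f9e are made up for this sketch.
snapshots = {
    "5d83f9e": {"notes.txt": "committed text"},
    "a1b2c3d": {"notes.txt": "later committed text"},
}
references = {"HEAD": "ref: feature", "master": "a1b2c3d", "feature": "5d83f9e"}
working_tree = {"notes.txt": "uncommitted changes"}

def checkout(branch):
    """Point HEAD at a branch (the real command also updates the working tree)."""
    references["HEAD"] = "ref: " + branch

def reset_hard(commit_id):
    """Move the current branch to commit_id and overwrite the working tree."""
    branch = references["HEAD"][len("ref: "):]
    references[branch] = commit_id
    working_tree.clear()
    working_tree.update(snapshots[commit_id])

checkout("master")
reset_hard("5d83f9e")
```

No commit is deleted by either step: only names move, which is why `a1b2c3d` still exists in `snapshots` afterwards.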

Staging area

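You do not always want a snapshot to capture the entire current state of the working directory. For example, you may have implemented two separate features and want a separate commit for each, or you may want to commit a bugfix while leaving out the debugging print statements scattered around it. Git accommodates such scenarios by allowing you to specify which modifications should be included in the next snapshot through a mechanism called the “staging area”.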
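A toy model of the staging area helps show why it is useful. The file names and the `add` and `commit` helpers below are hypothetical; the sketch only captures the idea that a commit snapshots what has been staged, not whatever happens to be in the working directory.

```python
working_dir = {
    "feature_a.py": "print('feature a')",
    "feature_b.py": "print('feature b')",
}
index = {}    # the staging area: contents of the next snapshot
commits = []  # (message, snapshot) pairs, oldest first

def add(filename):
    """Stage the current contents of one file."""
    index[filename] = working_dir[filename]

def commit(message):
    # Copy the index so later staging cannot alter an existing snapshot.
    commits.append((message, dict(index)))

# Both features exist on disk, but each gets its own commit.
add("feature_a.py")
commit("add feature a")
add("feature_b.py")
commit("add feature b")
```

The second snapshot contains both files because the index carries forward what was staged before.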

Git command-line interface

Basic

git help <command> # get help for a git command
git init # creates a new git repo, with data stored in the `.git` directory
git status # tells you what's going on
git add <filename> # adds files to staging area
git commit # creates a new commit
git log # show a flattened log of history
git log --all --graph --decorate # visualizes history as a DAG

Branching and merging

git branch <name> # creates a branch
git checkout -b <name> # creates a branch and switches to it. same as `git branch <name>; git checkout <name>`
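Seen through the data model, creating a branch just adds a reference, and switching to it just changes HEAD. A minimal sketch, with hypothetical branch names and HEAD again simplified to `ref: <branch>`:

```python
references = {"HEAD": "ref: master", "master": "5d83f9e"}

def current_commit():
    branch_name = references["HEAD"][len("ref: "):]
    return references[branch_name]

def branch(name):
    """Like `git branch <name>`: a new name for the commit we are on."""
    references[name] = current_commit()

def checkout(name):
    """Like `git checkout <name>`, ignoring the working tree: HEAD follows the branch."""
    references["HEAD"] = "ref: " + name

def checkout_b(name):
    """Like `git checkout -b <name>`: create the branch, then switch to it."""
    branch(name)
    checkout(name)

branch("experiment")    # HEAD still names master
checkout_b("bugfix")    # HEAD now names bugfix
```

No objects are created in either case, which is why branching in Git is cheap.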

Remotes

git remote add <name> <url> # add a remote
git push <remote> <local branch>:<remote branch> # send objects to remote, and update remote reference
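The comment on `git push` (send objects, then update the remote reference) can be illustrated with two in-memory stores. This is a rough sketch: the commit ids are invented, only commits are modeled (no trees or blobs), and real Git negotiates what to send in a more elaborate way.

```python
local_objects = {
    "c1": {"parents": []},
    "c2": {"parents": ["c1"]},
    "c3": {"parents": ["c2"]},
}
local_refs = {"master": "c3"}

remote_objects = {"c1": {"parents": []}}  # the remote already has c1
remote_refs = {"master": "c1"}

def push(local_branch, remote_branch):
    """Copy commits the remote lacks, then update the remote reference."""
    sent = []
    stack = [local_refs[local_branch]]
    while stack:
        commit_id = stack.pop()
        if commit_id in remote_objects:
            continue  # the remote has this commit and, we assume, its ancestors
        remote_objects[commit_id] = local_objects[commit_id]
        sent.append(commit_id)
        stack.extend(local_objects[commit_id]["parents"])
    remote_refs[remote_branch] = local_refs[local_branch]
    return sent

sent = push("master", "master")
```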

Undo

git commit --amend # edit a commit's contents/message
git reset HEAD <file> # unstage a file
git checkout -- <file> # discard changes

Exercises

git clone https://github.com/missing-semester/missing-semester