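The Docker registry is a repository for storing images that conform to the OCI standard; its source code is the distribution project on GitHub. The registry is itself an application, and Docker provides an image for it, named registry, which can be pulled with the docker pull registry command.
An image is essentially a series of layers made up of static files. So how does the registry store images?
The registry stores everything related to images under a single root directory, which contains two subdirectories: blobs and repositories.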
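Start with the blobs directory. In the registry, blobs fall into three categories: the layers that make up an image, image manifest files, and image manifest list files. The registry computes the SHA-256 digest of each file and creates a directory named after the full 64-character hexadecimal digest. That directory contains a single file named data, which holds the actual content. All of these digest directories could sit directly under one parent directory, but to make lookups easier, the first two characters of the digest are used as the name of an additional directory level above them. As a result, digest directories that share the same first two characters end up under the same parent directory.
[Figure: an example of the blobs directory]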
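The path derivation described above can be sketched in Go. This is a minimal illustration, not the registry's own code: the function name blobDataPath is hypothetical, the returned path is relative to the directory that contains blobs and repositories, and the sha256 path segment is an assumption based on the <algorithm> element in the pathFor comment.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// blobDataPath returns the relative path at which a blob with the given
// content would be stored under the layout described in the text:
// blobs/<algorithm>/<first two hex chars>/<full hex digest>/data.
// The name is hypothetical; it is not a function of the distribution project.
func blobDataPath(content []byte) string {
	sum := sha256.Sum256(content)
	hexDigest := hex.EncodeToString(sum[:]) // always 64 hex characters
	return path.Join("blobs", "sha256", hexDigest[:2], hexDigest, "data")
}

func main() {
	// Any byte slice works; an empty blob is used here only for illustration.
	fmt.Println(blobDataPath([]byte("")))
}
```

Because the path depends only on the content, two repositories that push an identical layer end up sharing a single data file.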
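Next is the repositories directory, whose structure is considerably more complex than that of blobs. Its top level contains one directory per repository, named after the repository.
Each repository directory contains three subdirectories: _layers, _manifests, and _uploads.
The _uploads directory needs little attention. When an image is pushed to the registry, this directory holds the data that is still being uploaded. Once the upload finishes, the data is moved into the blobs directory and _uploads is left empty.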
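The _layers and _manifests directories contain only link files, each of which points to the corresponding file in the blobs directory. As noted above, the files under blobs fall into three categories: layers, manifests, and manifest lists. The link files under _layers correspond one-to-one with the layer blobs that belong to the images in this repository.
The _manifests directory has two subdirectories, tags and revisions. The revisions directory holds the link files for every revision of the manifest and manifest list files in the repository.
The tags directory organizes the repository by image tag (for example, an ubuntu repository might have the tags 20.04 and 18.04), with one directory per tag. Each tag directory contains two subdirectories, current and index. The current directory holds the link file for the manifest the tag currently refers to, which points to that manifest in the blobs directory. The index directory exists to support deletion: it holds links to all manifests associated with the tag, so that when a delete is performed, every revision related to that tag can be found through index and removed.
[Example: layout of the repositories directory]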
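To make the relationship between link files and blobs concrete, here is a rough Go sketch. The link paths follow the layout in the pathFor comment; the function names are hypothetical, and the assumption that a link file contains a digest string of the form <algorithm>:<hex digest> is mine rather than something stated in the text.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// tagCurrentLinkPath builds repositories/<name>/_manifests/tags/<tag>/current/link.
func tagCurrentLinkPath(repo, tag string) string {
	return path.Join("repositories", repo, "_manifests", "tags", tag, "current", "link")
}

// tagIndexLinkPath builds
// repositories/<name>/_manifests/tags/<tag>/index/<algorithm>/<hex digest>/link.
func tagIndexLinkPath(repo, tag, algorithm, hexDigest string) string {
	return path.Join("repositories", repo, "_manifests", "tags", tag,
		"index", algorithm, hexDigest, "link")
}

// resolveLink maps the content of a link file to the blob it refers to.
// Assumption: the content looks like "sha256:<64 hex characters>".
func resolveLink(linkContent string) (string, error) {
	algorithm, hexDigest, ok := strings.Cut(strings.TrimSpace(linkContent), ":")
	if !ok || algorithm == "" || len(hexDigest) < 2 {
		return "", fmt.Errorf("malformed link content %q", linkContent)
	}
	return path.Join("blobs", algorithm, hexDigest[:2], hexDigest, "data"), nil
}

func main() {
	// Placeholder digest for illustration only; not taken from a real image.
	hexDigest := strings.Repeat("ab", 32)
	fmt.Println(tagCurrentLinkPath("ubuntu", "20.04"))
	fmt.Println(tagIndexLinkPath("ubuntu", "20.04", "sha256", hexDigest))
	blobPath, err := resolveLink("sha256:" + hexDigest)
	fmt.Println(blobPath, err)
}
```

Under these assumptions, looking up a tag is a two-step read: open the tag's current/link file, then open the data file that the digest inside it resolves to.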
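The description above is based on the comment on the pathFor function in distribution/distribution/v3/registry/storage/paths.go in the distribution project. The relevant content is reproduced below:
func pathFor(spec pathSpec) (string, error) // function declaration
// The relevant comment in the source code reads: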
// The path layout in the storage backend is roughly as follows:
//
//    <root>/v2
//        -> repositories/
//            -><name>/
//                -> _manifests/
//                    revisions
//                        -> <manifest digest path>
//                            -> link
//                    tags/<tag>
//                        -> current/link
//                        -> index
//                            -> <algorithm>/<hex digest>/link
//                -> _layers/
//                    <layer links to blob store>
//                -> _uploads/<id>
//                    data
//                    startedat
//                    hashstates/<algorithm>/<offset>
//        -> blob/<algorithm>
//            <split directory content addressable storage>
//
// The storage backend layout is broken up into a content-addressable blob
// store and repositories. The content-addressable blob store holds most data
// throughout the backend, keyed by algorithm and digests of the underlying
// content. Access to the blob store is controlled through links from the
// repository to blobstore.
//
// A repository is made up of layers, manifests and tags. The layers component
// is just a directory of layers which are "linked" into a repository. A layer
// can only be accessed through a qualified repository name if it is linked in
// the repository. Uploads of layers are managed in the uploads directory,
// which is key by upload id. When all data for an upload is received, the
// data is moved into the blob store and the upload directory is deleted.
// Abandoned uploads can be garbage collected by reading the startedat file
// and removing uploads that have been active for longer than a certain time.
//
// The third component of the repository directory is the manifests store,
// which is made up of a revision store and tag store. Manifests are stored in
// the blob store and linked into the revision store.
// While the registry can save all revisions of a manifest, no relationship is
// implied as to the ordering of changes to a manifest. The tag store provides
// support for name, tag lookups of manifests, using "current/link" under a
// named tag directory. An index is maintained to support deletions of all
// revisions of a given manifest tag.
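
The quoted comment mentions that abandoned uploads can be garbage collected by reading the startedat file. The age check it implies might look roughly like the Go sketch below. The function name uploadAbandoned, the fixed reference time demoNow, and the assumption that startedat holds an RFC 3339 timestamp are all illustrative and not taken from the text.

```go
package main

import (
	"fmt"
	"time"
)

// demoNow is a fixed reference time so that the example is deterministic.
var demoNow = time.Date(2024, 1, 10, 0, 0, 0, 0, time.UTC)

// uploadAbandoned reports whether an upload has been active for longer than
// maxAge as of now. startedAtContent is the content of the upload's
// startedat file; parsing it as RFC 3339 is an assumption about its format.
func uploadAbandoned(startedAtContent string, now time.Time, maxAge time.Duration) (bool, error) {
	startedAt, err := time.Parse(time.RFC3339, startedAtContent)
	if err != nil {
		return false, fmt.Errorf("parse startedat: %w", err)
	}
	return now.Sub(startedAt) > maxAge, nil
}

func main() {
	// An upload started nine days before demoNow, checked against a 7-day limit.
	abandoned, err := uploadAbandoned("2024-01-01T00:00:00Z", demoNow, 7*24*time.Hour)
	fmt.Println(abandoned, err)
}
```

A cleanup job built on this check would walk the _uploads/<id> directories of each repository and remove those for which it returns true.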