ZFS architecture

This page gives a brief overview of the ZFS architecture. It is not intended as an introduction to ZFS: we assume that you are already familiar with the common terms and definitions and have a general sense of file system architecture.


Traditionally, ZFS consists of three main components: the ZPL (ZFS POSIX Layer), the DMU (Data Management Unit), and the SPA (Storage Pool Allocator), as indicated in the image above.


In this picture you can see the three basic layers, though each contains quite a few more elements. We also show the zvol consumers and the management path, namely zfs(1M) and zpool(1M). A brief description of each of these subsystems follows. It is not an exhaustive account of exactly how everything works, but we hope this summary tour is easy to follow. If not, feel free to post to http://java.net/projects/solaris-zfs/lists.

File System Consumers

These are the ordinary applications that interact with ZFS solely through the POSIX file system APIs. Virtually every application falls into this category. Their system calls pass through the generic OpenSolaris VFS layer to the ZPL.


Device Consumers

ZFS provides emulated volumes, known as zvols. These volumes are backed by storage from a storage pool but appear as normal devices under /dev. This is not a typical use case, but there is a small set of cases where the capability is useful. A small number of applications interact directly with these devices, but the most common consumer is a kernel file system or target driver layered on top of the device.


Management GUI

A web-based ZFS GUI is available in Solaris 10 releases and on the ZFS storage appliance.


Management Consumers

These applications manipulate ZFS file systems or storage pools, including examining properties and the dataset hierarchy. While there are some scattered exceptions (zoneadm, zoneadmd, fstyp), the two main applications are zpool(1M) and zfs(1M).


JNI

This library provides a Java interface to libzfs and is tailored specifically for the GUI. As such, it is geared primarily toward observability, because the GUI performs most actions through the CLI.


libzfs

This is the primary interface through which management applications interact with the ZFS kernel module. The library presents a unified, object-based mechanism for accessing and manipulating both storage pools and file systems. Underneath, it communicates with the kernel through ioctl(2) calls on /dev/zfs.


ZPL (ZFS POSIX Layer)

The ZPL is the primary interface for interacting with ZFS as a file system. It is a (relatively) thin layer that sits atop the DMU and presents a file system abstraction of files and directories. It is responsible for bridging the gap between the VFS interfaces and the underlying DMU interfaces. It is also responsible for enforcing ACL (Access Control List) rules as well as synchronous (O_DSYNC) semantics.


ZVOL (ZFS Emulated Volume)

ZFS can present raw devices backed by space from a storage pool. These are known as 'zvols' within the source code and are implemented by a single file in the ZFS source.


/dev/zfs

This device is the primary point of control for libzfs. While consumers could use the ioctl(2) interface directly, it is closely entwined with libzfs and is not a public interface (not that libzfs is, either). It is implemented in a single file, which performs some validation of the ioctl() parameters and then vectors the request to the appropriate place within ZFS.


DMU (Data Management Unit)

The DMU is responsible for presenting a transactional object model, built atop the flat address space presented by the SPA. Consumers interact with the DMU via objsets, objects, and transactions. An objset is a collection of objects, where each object is an arbitrary piece of storage from the SPA. Each transaction is a series of operations that must be committed to disk as a group; this is central to the on-disk consistency of ZFS.
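
The relationship between objsets, objects, and transactions can be sketched with a toy in-memory model. This is only an illustration, not the DMU API: the class names `Objset` and `Transaction` and their methods are hypothetical, and the real DMU writes new blocks copy-on-write rather than swapping a dictionary.

```python
class Objset:
    """Toy object set: maps object numbers to byte buffers (hypothetical model)."""

    def __init__(self):
        self.objects = {}  # object number -> bytearray


class Transaction:
    """Collects writes and applies them to the objset all together or not at all."""

    def __init__(self, objset):
        self.objset = objset
        self.pending = []  # list of (object number, offset, data)

    def write(self, obj, offset, data):
        # Nothing touches the objset until commit() is called.
        self.pending.append((obj, offset, bytes(data)))

    def commit(self):
        # Build the new state on a copy, then swap it in as one step, so a
        # failure part-way through leaves the previous state intact.
        staged = {num: bytearray(buf) for num, buf in self.objset.objects.items()}
        for obj, offset, data in self.pending:
            buf = staged.setdefault(obj, bytearray())
            end = offset + len(data)
            if len(buf) < end:
                buf.extend(b"\0" * (end - len(buf)))
            buf[offset:end] = data
        self.objset.objects = staged
        self.pending = []


objset = Objset()
tx = Transaction(objset)
tx.write(1, 0, b"hello")
tx.write(2, 4, b"zfs")
before_commit = dict(objset.objects)  # still empty: the writes are only staged
tx.commit()
```

Until `commit()` runs, readers of `objset` see none of the staged writes; afterwards they see all of them, which is the group-commit property described above.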


DSL (Dataset and Snapshot Layer)

The DSL aggregates DMU objsets into a hierarchical namespace, with inherited properties as well as quota and reservation enforcement. It is also responsible for managing snapshots and clones of objsets.
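
Property inheritance and quota enforcement over a hierarchy can be sketched as follows. The `Dataset` class and its `get` and `charge` methods are hypothetical names chosen for this example, and the accounting is deliberately simplified (no reservations, snapshots, or clones).

```python
class Dataset:
    """Toy dataset node with inherited properties and a quota (hypothetical model)."""

    def __init__(self, name, parent=None, quota=None, **props):
        self.name = name if parent is None else parent.name + "/" + name
        self.parent = parent
        self.quota = quota        # bytes, or None for no limit at this level
        self.used = 0             # bytes charged to this dataset and its descendants
        self.local_props = props

    def get(self, prop):
        # Walk toward the root until some ancestor sets the property locally.
        node = self
        while node is not None:
            if prop in node.local_props:
                return node.local_props[prop]
            node = node.parent
        return None

    def charge(self, nbytes):
        # Every ancestor's quota must hold before any usage is recorded.
        chain = []
        node = self
        while node is not None:
            if node.quota is not None and node.used + nbytes > node.quota:
                raise OSError("quota exceeded on " + node.name)
            chain.append(node)
            node = node.parent
        for node in chain:
            node.used += nbytes


pool = Dataset("tank", compression="on")
home = Dataset("home", parent=pool, quota=100)
alice = Dataset("alice", parent=home)
alice.charge(60)
```

Here `alice` has no local `compression` setting, so `alice.get("compression")` resolves to the value set on `tank`, and a further charge that would push `tank/home` past its quota is rejected before any counter changes.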


ZAP (ZFS Attribute Processor)

The ZAP is built atop the DMU and uses scalable hash algorithms to create arbitrary (name, object) associations within an objset. It is most commonly used to implement directories within the ZPL, but it is also used extensively throughout the DSL and as a method of storing pool-wide properties. There are two very different ZAP algorithms, designed for different types of directories. The "micro zap" is used when the number of entries is relatively small and each entry is reasonably short. The "fat zap" is used for larger directories, or those with extremely long names.
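
The choice between the two forms can be sketched as a map that starts out "micro" and upgrades to "fat" once an entry no longer fits. The thresholds and the `ToyZap` class below are assumptions made for illustration; the text above gives no exact limits, and a Python dict stands in for both on-disk hash structures.

```python
MICRO_MAX_ENTRIES = 2048   # assumed threshold, for illustration only
MICRO_MAX_NAME_LEN = 50    # assumed threshold, for illustration only


class ToyZap:
    """Name -> object number map that upgrades from 'micro' to 'fat' when needed."""

    def __init__(self):
        self.kind = "micro"
        self.entries = {}

    def add(self, name, obj):
        is_new = name not in self.entries
        too_many = is_new and len(self.entries) >= MICRO_MAX_ENTRIES
        too_long = len(name) > MICRO_MAX_NAME_LEN
        if self.kind == "micro" and (too_many or too_long):
            # Upgrade once; this sketch never converts back.
            self.kind = "fat"
        self.entries[name] = obj

    def lookup(self, name):
        return self.entries.get(name)


directory = ToyZap()
directory.add("passwd", 17)
kind_when_small = directory.kind
directory.add("x" * 200, 18)   # a very long name forces the upgrade
```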


ZIL (ZFS Intent Log)

While ZFS provides always-consistent data on disk, it follows traditional file system semantics in which the majority of data is not written to disk immediately; otherwise performance would be pathologically slow. But some applications require more stringent semantics, where the data is guaranteed to be on disk by the time the read(2) or write(2) call returns. For applications that require this behavior (specified with O_DSYNC), the ZIL provides the necessary semantics using an efficient per-dataset transaction log that can be replayed in the event of a crash.
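
From the application side, requesting these semantics looks like the sketch below. It uses only the standard POSIX flags exposed by Python's `os` module and works on any file system; on ZFS, writes like this are the ones the ZIL serves. The helper name `write_durably` and the file name are made up for the example.

```python
import os
import tempfile

# O_DSYNC is not defined on every platform; fall back to 0 (no extra flag) there.
O_DSYNC = getattr(os, "O_DSYNC", 0)


def write_durably(path, data):
    """Write data and return only once it has been pushed to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_DSYNC, 0o600)
    try:
        written = os.write(fd, data)
        if not O_DSYNC:
            # Platforms without O_DSYNC: ask for the same guarantee explicitly.
            os.fsync(fd)
        return written
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "journal.dat")
    nbytes = write_durably(target, b"committed record\n")
    with open(target, "rb") as f:
        contents = f.read()
```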


Traversal

Traversal provides a safe, efficient, restartable method of walking all data within a live pool. It forms the basis of resilvering and scrubbing. It walks all metadata looking for blocks modified within a certain period of time. Thanks to the copy-on-write nature of ZFS, it can quickly exclude large subtrees that were not touched during an outage period. It is fundamentally a SPA facility, but it has intimate knowledge of some DMU structures in order to handle snapshots, clones, and certain other characteristics of the on-disk format.
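
The pruning idea can be sketched with a toy block tree in which every block records the transaction group in which it was born. The `Block` class and `modified_since` function are hypothetical names; the real traversal code also handles restartability, snapshots, and clones, none of which appear here.

```python
class Block:
    """Toy block pointer: a birth transaction group plus child pointers."""

    def __init__(self, name, birth, children=()):
        self.name = name
        self.birth = birth
        self.children = list(children)


def modified_since(block, min_txg, visited=None):
    """Return names of blocks born at or after min_txg, pruning older subtrees."""
    if visited is not None:
        visited.append(block.name)
    if block.birth < min_txg:
        # Copy-on-write: a changed child forces a new parent, so an old parent
        # cannot have any newer descendants. Skip the whole subtree.
        return []
    found = [block.name]
    for child in block.children:
        found.extend(modified_since(child, min_txg, visited))
    return found


old_subtree = Block("dir-a", 10, [Block("file-a1", 8), Block("file-a2", 10)])
new_subtree = Block("dir-b", 42, [Block("file-b1", 42), Block("file-b2", 7)])
root = Block("root", 42, [old_subtree, new_subtree])

visited = []
changed = modified_since(root, 40, visited)
```

The walk inspects `dir-a` once, sees that it predates transaction group 40, and never descends to `file-a1` or `file-a2`.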


ARC (Adaptive Replacement Cache)

ZFS uses a modified version of an Adaptive Replacement Cache to provide its primary caching needs. This cache is layered between the DMU and the SPA, so it acts at the virtual block level. This allows file systems to share their cached data with their snapshots and clones.
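
Why caching at the block level enables sharing can be shown with a toy cache keyed by block address. This sketch leaves out the adaptive replacement policy entirely (nothing is ever evicted), and the `BlockCache` class and the address tuples are assumptions made for illustration.

```python
class BlockCache:
    """Toy cache keyed by on-disk block address rather than by file name."""

    def __init__(self, read_block):
        self.read_block = read_block   # function: address -> bytes
        self.data = {}
        self.hits = 0
        self.misses = 0

    def get(self, address):
        if address in self.data:
            self.hits += 1
        else:
            self.misses += 1
            self.data[address] = self.read_block(address)
        return self.data[address]


disk = {("vdev0", 4096): b"shared block", ("vdev0", 8192): b"new block"}
cache = BlockCache(disk.__getitem__)

# A file system and its snapshot: the first block is unchanged, so both
# point to the same address.
filesystem = [("vdev0", 4096), ("vdev0", 8192)]
snapshot = [("vdev0", 4096)]

first = cache.get(filesystem[0])   # miss: read from disk
second = cache.get(snapshot[0])    # hit: same address, already cached
```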


Pool Configuration (SPA)

While the entire pool layer is often referred to as the SPA (Storage Pool Allocator), the configuration portion is really the public interface. It is responsible for gluing together the ZIO and vdev layers into a consistent pool object. It includes routines to create and destroy pools from their configuration information, as well as to sync the data out to the vdevs at regular intervals.


ZIO (ZFS I/O Pipeline)

The ZIO pipeline is where all data must pass when going to or from the disk. It is responsible for translating DVAs (Data Virtual Addresses) into logical locations on a vdev, as well as checksumming and compressing data as necessary. It is implemented as a multi-stage pipeline, with a bit mask that controls which stages are executed for each I/O.
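
A bit-mask-controlled pipeline can be sketched as below. The three stages, their names, and the dictionary used to represent an I/O are invented for this example; the real pipeline has many more stages, and CRC32 is only a stand-in checksum available in the Python standard library.

```python
import zlib
from enum import IntFlag


class Stage(IntFlag):
    COMPRESS = 1
    CHECKSUM = 2
    WRITE = 4


def compress(io):
    io["data"] = zlib.compress(io["data"])


def checksum(io):
    io["checksum"] = zlib.crc32(io["data"])


def write(io):
    io["device"][io["offset"]] = io["data"]


# Stages always run in this order; the mask only decides which ones are skipped.
PIPELINE = [(Stage.COMPRESS, compress), (Stage.CHECKSUM, checksum), (Stage.WRITE, write)]


def execute(io, mask):
    ran = []
    for stage, func in PIPELINE:
        if mask & stage:
            func(io)
            ran.append(stage.name)
    return ran


device = {}
io = {"data": b"a" * 1000, "offset": 0, "device": device}
ran = execute(io, Stage.CHECKSUM | Stage.WRITE)   # compression disabled for this I/O
```

Clearing a bit in the mask removes that stage for one I/O without changing the pipeline definition itself.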


VDEV (Virtual Devices)

The virtual device subsystem provides a unified method of arranging and accessing devices. Virtual devices form a tree, with a single root vdev, multiple interior vdevs (mirror and RAID-Z), and leaf vdevs (disk and file). Each vdev is responsible for representing the available space, as well as laying out blocks on the physical disk.
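
The tree shape, and how each level might report its space, can be sketched as follows. The `Vdev` class is hypothetical and the capacity arithmetic is a deliberate simplification that ignores metadata, padding, and allocation overhead.

```python
class Vdev:
    """Toy vdev tree node. kind is 'root', 'mirror', 'raidz', 'disk', or 'file'."""

    def __init__(self, kind, children=(), size=0, parity=1):
        self.kind = kind
        self.children = list(children)
        self.size = size        # bytes; meaningful for leaves only
        self.parity = parity    # meaningful for raidz only

    def capacity(self):
        if not self.children:
            return self.size
        sizes = [child.capacity() for child in self.children]
        if self.kind == "mirror":
            # Every child holds a full copy, so the smallest child sets the limit.
            return min(sizes)
        if self.kind == "raidz":
            # Simplified: parity consumes the equivalent of `parity` children.
            return min(sizes) * (len(sizes) - self.parity)
        # Root: space is the sum of the top-level vdevs.
        return sum(sizes)


GB = 10**9
pool = Vdev("root", [
    Vdev("mirror", [Vdev("disk", size=500 * GB), Vdev("disk", size=400 * GB)]),
    Vdev("raidz", [Vdev("disk", size=100 * GB) for _ in range(4)], parity=1),
])
total = pool.capacity()
```

Under these simplified rules the mirror contributes 400 GB and the four-disk RAID-Z contributes 300 GB.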


LDI (Layered Driver Interface)

At the bottom of the stack, ZFS interacts with the underlying physical devices through LDI, the Layered Driver Interface, and through the VFS interfaces when dealing with files.

