[Advanced Operating Systems] VFS Explained (Virtual File System)

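Preface

This article introduces the virtual file system (VFS) in Linux (Unix). I find that learning works best in question-and-answer form, because that is how I actually resolve my own doubts, so the material below is organized as questions and answers.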


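1 Questions about VFS (and their answers)

1.1 What is VFS?

VFS stands for Virtual File System [1]. It also goes by another name, virtual filesystem switch [2], which I read as a switch that routes file operations to the right file system.

VFS is an abstraction layer on top of more concrete file systems.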

A Virtual File System (VFS) or virtual filesystem switch is an abstract layer on top of a more concrete file system. [2]


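1.2 What is VFS for?

1) It lets user applications access different types of concrete file systems. For example, a VFS can give access to local and network storage devices transparently, without the application noticing the difference.
2) It can bridge Windows, classic Mac OS, and Unix file systems, so that applications can access files on local file systems of those types without having to know which file system they are accessing.
3) It specifies the interface (or contract) between the kernel and a concrete file system, so support for a new file system type can be added to the kernel simply by fulfilling that contract.

In addition, according to [3], it also:
4) keeps track of the available file system types;
5) associates (and disassociates) devices with instances of the appropriate file system;
6) does any reasonable generic processing for operations involving files;
7) when file-system-specific operations become necessary, vectors (dispatches) them to the file system responsible for the file, directory, or inode in question.

And according to [7] and [8]:
8) It separates the low-level file system code from the rest of the kernel.
9) Where there is a virtual file system there must also be real ones. The VFS has to manage all the different file systems, of different types, that are mounted at any given time. To do so it maintains data structures that describe the whole (virtual) file system and the real, mounted file systems.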
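
The contract and dispatch ideas in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (`FileSystem`, `LocalFS`, `NetworkFS`, `VFS`) and the longest-prefix mount lookup are invented for this example and are not how the Linux kernel is written.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical contract that every concrete file system fulfils."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class LocalFS(FileSystem):
    def __init__(self, files: dict):
        self.files = files

    def read(self, path: str) -> bytes:
        return self.files[path]


class NetworkFS(FileSystem):
    def __init__(self, files: dict):
        self.files = files

    def read(self, path: str) -> bytes:
        # A real network file system would contact a server here.
        return self.files[path]


class VFS:
    """Routes each request to the file system mounted at the longest prefix."""

    def __init__(self):
        self.mounts = {}  # mount point -> FileSystem

    def mount(self, mount_point: str, fs: FileSystem) -> None:
        self.mounts[mount_point.rstrip("/") or "/"] = fs

    def _resolve(self, path: str):
        best = None
        for mp in self.mounts:
            prefix = mp if mp.endswith("/") else mp + "/"
            if path == mp or path.startswith(prefix):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        rel = path[len(best):]
        if not rel.startswith("/"):
            rel = "/" + rel
        return self.mounts[best], rel

    def read(self, path: str) -> bytes:
        fs, rel = self._resolve(path)
        return fs.read(rel)  # the caller never learns which type served it
```

An application would call `vfs.read("/mnt/nfs/data.txt")` and `vfs.read("/etc/hosts")` in exactly the same way, even though one path is served by `NetworkFS` and the other by `LocalFS`.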

In the end, though, I unexpectedly found the answer I wanted most in the Wikipedia article on ext2 [9]:

To ease the addition of new file systems and provide a generic file API, VFS, a virtual file system layer, was added to the Linux kernel. The extended file system (ext), was released in April 1992 as the first file system using the VFS API and was included in Linux version 0.96c.

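In short: VFS, a virtual file system layer, was added to the Linux kernel to make it easier to add new file systems and to provide a generic file API. Real file systems plug into this API: ext, released in April 1992 and included in Linux 0.96c, was the first to use it, and later file systems such as ext2 do the same.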


1.3 Who uses VFS?

1) device drivers
2) filesystems

So what is a device driver?
[5] only briefly describes how to write one; for the actual meaning we have to turn to Wikipedia [6].

In computing, a device driver is a computer program that operates or controls a particular type of device that is attached to a computer. A driver provides a software interface to hardware devices, enabling operating systems and other computer programs to access hardware functions without needing to know precise details about the hardware being used.

A driver communicates with the device through the computer bus or communications subsystem to which the hardware connects. When a calling program invokes a routine in the driver, the driver issues commands to the device. Once the device sends data back to the driver, the driver may invoke routines in the original calling program. Drivers are hardware dependent and operating-system-specific. They usually provide the interrupt handling required for any necessary asynchronous time-dependent hardware interface.

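Without going into detail, the gist is:
1) A device driver is a computer program that operates or controls a particular type of device attached to a computer. It gives hardware devices a software interface, so that the operating system and other programs can use hardware functions without knowing the precise details of the hardware in use.
2) A driver communicates with the device through the computer bus or communications subsystem the hardware is connected to. When a calling program invokes a routine in the driver, the driver issues commands to the device; once the device sends data back, the driver may invoke routines in the original calling program.
3) Drivers are hardware dependent and operating-system-specific (that is, different operating systems need different drivers). They usually also provide the interrupt handling that asynchronous, time-dependent hardware interfaces require.

It suddenly struck me how good Wikipedia really is: two short paragraphs are enough to explain a concept this clearly. I should consult it more often instead of dismissing it as mere "general knowledge".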


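1.4 What exactly is the relationship between VFS and file systems?

See the "Structure" part of 1.5 (the next subsection).


1.5 What is a file system in an operating system?

Definition: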

In computing, a file system or filesystem controls how data is stored and retrieved. Without a file system, information placed in a storage medium would be one large body of data with no way to tell where one piece of information stops and the next begins. By separating the data into pieces and giving each piece a name, the information is easily isolated and identified. Taking its name from the way paper-based information systems are named, each group of data is called a “file”. The structure and logic rules used to manage the groups of information and their names is called a “file system”.

Purpose:
A file system manages the data stored on storage devices (storage media such as hard disk drives, SSDs, magnetic tape, and optical discs). [10]

Structure (quoted from [10]):

A file system consists of two or three layers. Sometimes the layers are explicitly separated, and sometimes the functions are combined.

The logical file system is responsible for interaction with the user application. It provides the application program interface (API) for file operations (OPEN, CLOSE, READ, etc.) and passes the requested operation to the layer below it for processing. The logical file system “manage[s] open file table entries and per-process file descriptors.” This layer provides “file access, directory operations, [and] security and protection.”

The second optional layer is the virtual file system. “This interface allows support for multiple concurrent instances of physical file systems, each of which is called a file system implementation.”

The third layer is the physical file system. This layer is concerned with the physical operation of the storage device (e.g. disk). It processes physical blocks being read or written. It handles buffering and memory management and is responsible for the physical placement of blocks in specific locations on the storage medium. The physical file system interacts with the device drivers or with the channel to drive the storage device.

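So the VFS layer turns out to be optional. Its interface supports multiple concurrent instances of physical file systems, each of which is called a file system implementation. In other words, with only one file system I could do without a VFS, but once there are several, the VFS becomes indispensable: it sits in the middle and coordinates and manages them. [11]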
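
To make the top layer more concrete, here is a minimal sketch of a logical file system that keeps open file table entries and per-process file descriptors. Everything in it is an assumption made for illustration: the names `OpenFile` and `LogicalFS`, the plain dictionary standing in for the lower layers, and the rule of handing out the lowest unused descriptor number.

```python
class OpenFile:
    """An open file table entry: the file contents and the current offset."""

    def __init__(self, data: bytes):
        self.data = data
        self.offset = 0


class LogicalFS:
    def __init__(self, lower_layer: dict):
        self.lower = lower_layer  # stand-in for the VFS and physical layers
        self.fd_tables = {}       # pid -> {fd: OpenFile}

    def open(self, pid: int, name: str) -> int:
        table = self.fd_tables.setdefault(pid, {})
        fd = 0
        while fd in table:        # pick the lowest unused descriptor
            fd += 1
        table[fd] = OpenFile(self.lower[name])
        return fd

    def read(self, pid: int, fd: int, size: int) -> bytes:
        entry = self.fd_tables[pid][fd]
        chunk = entry.data[entry.offset:entry.offset + size]
        entry.offset += len(chunk)  # each open file remembers its position
        return chunk

    def close(self, pid: int, fd: int) -> None:
        del self.fd_tables[pid][fd]
```

Two processes that open the same file each get their own descriptor and their own offset, which is the bookkeeping the quoted passage assigns to this layer.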



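According to [11], more precisely, a file system has three components:
1) Device drivers (so device drivers count as part of the file system too): they interact directly with peripheral devices, perform I/O, issue I/O requests, and schedule access to the device to optimize performance.
2) Basic file system: uses the specific device driver; it resembles the third (physical) layer described above.
3) Logical file system: provides the interface that users interact with.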


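1.6 What is VFS made of?

[8] gives a rough picture of how the VFS is organized.
[Figure from [8]: how the VFS is organized, including its caches and the directory cache; image not available]

The VFS describes the system's files in terms of superblocks and inodes, in much the same way as the EXT2 file system uses superblocks and inodes. The inodes describe the files and directories within the system.

Each file system registers itself with the VFS when it is initialized (my reading: every real file system files a record with the VFS before it can be used). This usually happens while the operating system initializes itself at system boot time. Note that registering a file system type is a separate step from mounting a particular file system of that type.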
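
Registration can be pictured as a table of known file system types. The sketch below is hypothetical: `FileSystemType` and `VFSRegistry` are made-up names, and although the method name `register_filesystem` is borrowed loosely from the Linux kernel, the Python signature here is invented for illustration.

```python
class FileSystemType:
    """Describes one kind of file system (hypothetical structure)."""

    def __init__(self, name: str, read_super):
        self.name = name
        self.read_super = read_super  # routine used later, at mount time


class VFSRegistry:
    def __init__(self):
        self._types = {}  # name -> FileSystemType

    def register_filesystem(self, fs_type: FileSystemType) -> None:
        if fs_type.name in self._types:
            raise ValueError(f"{fs_type.name} is already registered")
        self._types[fs_type.name] = fs_type

    def unregister_filesystem(self, name: str) -> None:
        del self._types[name]

    def known_types(self) -> list:
        return sorted(self._types)
```

A file system built into the kernel would register once at boot, while one built as a module would register when the module is loaded; in both cases nothing is mounted yet.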

The real file systems are either built into the kernel itself or are built as loadable modules. File System modules are loaded as the system needs them, so, for example, if the VFAT file system is implemented as a kernel module then it is only loaded when a VFAT file system is mounted. When a block device based file system is mounted, and this includes the root file system, the VFS must read its superblock. Each file system type’s superblock read routine must work out the file system’s topology and map that information onto a VFS superblock data structure. The VFS keeps a list of the mounted file systems in the system together with their VFS superblocks. Each VFS superblock contains information and pointers to routines that perform particular functions. So, for example, the superblock representing a mounted EXT2 file system contains a pointer to the EXT2 specific inode reading routine. This EXT2 inode read routine, like all of the file system specific inode read routines, fills out the fields in a VFS inode. Each VFS superblock contains a pointer to the first VFS inode on the file system. For the root file system, this is the inode that represents the "/" directory. This mapping of information is very efficient for the EXT2 file system but moderately less so for other file systems.

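In short:
1) A real file system is either built into the kernel itself or built as a loadable module. File system modules are loaded only when the system needs them. For example, if the VFAT file system is implemented as a kernel module, it is loaded only when a VFAT file system is mounted.
2) When a block-device-based file system is mounted (and this includes the root file system), the VFS must read its superblock. Each file system type's superblock read routine must work out the file system's topology and map that information onto a VFS superblock data structure.
3) The VFS keeps a list of the file systems mounted in the system, together with their VFS superblocks. Each VFS superblock contains information and pointers to routines (functions that perform particular tasks). For example, the superblock of a mounted EXT2 file system points to the EXT2-specific inode read routine, which, like every such routine, fills in the fields of a VFS inode.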
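
The mount-time steps above can be sketched as follows. This is a toy model, not kernel code: `VFSSuperblock`, `toy_read_super`, `MountTable`, and the dictionary used as an "on-disk layout" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VFSSuperblock:
    device: str
    block_size: int
    fs_type: str
    read_inode: Callable[[int], dict]  # pointer to an fs-specific routine


def toy_read_super(device: str, raw: dict) -> VFSSuperblock:
    """Hypothetical fs-specific routine: map the raw layout onto a VFS superblock."""

    def read_inode(ino: int) -> dict:
        # Fills in the fields of a (very small) VFS inode.
        return {"ino": ino, "size": raw["inodes"][ino]}

    return VFSSuperblock(device, raw["block_size"], "toyfs", read_inode)


class MountTable:
    """The VFS's list of mounted file systems and their superblocks."""

    def __init__(self):
        self.mounted = []

    def mount(self, device: str, raw: dict, read_super) -> VFSSuperblock:
        sb = read_super(device, raw)  # the VFS must read the superblock
        self.mounted.append(sb)
        return sb
```

After `mount`, generic code can call `sb.read_inode(2)` without knowing which file system type supplied the routine.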

So what are the caches and the directory cache in the figure above? [8]

As the system’s processes access directories and files, system routines are called which traverse the VFS inodes in the system.

For example, typing ls for a directory or cat for a file causes the Virtual File System to search through the VFS inodes which represent the file system. As every file and directory on the system is represented by a VFS inode then a number of inodes will be being repeatedly accessed. These inodes are kept in the inode cache which makes access to them quicker. If an inode is not in the inode cache, then a file system specific routine must be called in order to read the appropriate inode. The action of reading the inode causes it to be put into the inode cache and further accesses to the inode keep it in the cache. The less used VFS inodes get removed from the cache.

All of the Linux file systems use a common buffer cache to cache data buffers from the underlying devices to help speed up access by all of the file systems to the physical devices holding the file systems.

This buffer cache is independent of the file systems and is integrated into the mechanisms that the Linux kernel uses to allocate and read and write data buffers. It has the distinct advantage of making the Linux file systems independent from the underlying media and from the device drivers which support them. All block structured devices register themselves with the Linux kernel and present a uniform, block based, usually asynchronous interface. Even relatively complex block devices such as SCSI devices do this. As the real file systems read data from the underlying physical disks, this results in requests to the block device drivers to read physical blocks from the device that they control. Integrated into this block device interface is the buffer cache. As blocks are read by the file systems they are saved in the global buffer cache shared by all of the file systems and the Linux kernel. Buffers within it are identified by their block number and a unique identifier for the device that read it. So, if the same data is often needed, it will be retrieved from the buffer cache rather than read from the disk, which would take somewhat longer. Some devices support read ahead where data blocks are speculatively read just in case they are needed.

The VFS also keeps a cache of directory lookups so that the inodes for frequently used directories can be quickly found.

As an experiment, try listing a directory that you have not listed recently. The first time you list it, you may notice a slight pause but the second time that you list its contents the result is immediate. The directory cache does not store the inodes for the directories itself, these should be in the inode cache, it stores the mapping between the full directory names and their inode numbers.

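In short:
1) All Linux file systems share one common buffer cache that caches data buffers from the underlying devices. This speeds up every file system's access to the physical devices that hold it.
2) The VFS also keeps a cache of directory lookups so that the inodes of frequently used directories can be found quickly. (That is why listing a directory you have not listed recently may pause briefly, while the second listing returns immediately.)
3) The directory cache does not store the inodes of the directories themselves (those belong in the inode cache); it stores the mapping between full directory names and their inode numbers.

4) (I skipped the inode cache, so here it is.) Every file and directory in the system is represented by a VFS inode. Inodes held in the inode cache can be accessed faster. If an inode is not in the cache, a file-system-specific routine must be called to read it. Reading the inode puts it into the cache, and further accesses keep it there, while less used VFS inodes are removed from the cache.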
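
The two caches can be sketched like this. The source only says that less used inodes are removed; the least-recently-used eviction below is my own assumption, as are the names `InodeCache` and `DirectoryCache` and the callback parameters `read_inode` and `lookup`.

```python
from collections import OrderedDict


class InodeCache:
    def __init__(self, capacity: int, read_inode):
        self.capacity = capacity
        self.read_inode = read_inode  # fs-specific routine, called on a miss
        self.entries = OrderedDict()  # inode number -> inode

    def get(self, ino: int):
        if ino in self.entries:
            self.entries.move_to_end(ino)     # mark as recently used
            return self.entries[ino]
        inode = self.read_inode(ino)          # slow path: ask the file system
        self.entries[ino] = inode
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return inode


class DirectoryCache:
    """Maps full directory names to inode numbers, not to the inodes themselves."""

    def __init__(self, lookup):
        self.lookup = lookup  # slow name resolution, called on a miss
        self.names = {}

    def inode_number(self, path: str) -> int:
        if path not in self.names:
            self.names[path] = self.lookup(path)
        return self.names[path]
```

A lookup for a directory would first ask `DirectoryCache` for the inode number and then ask `InodeCache` for the inode, so a repeated `ls` touches neither slow path.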


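1.7 A closer look at superblocks and inodes

Every mounted file system is represented by a VFS superblock.

A VFS superblock contains the following: [12]

  • Device
  • Inode pointers: the mounted inode pointer points to the first inode of this file system; the covered inode pointer points to the inode of the directory on which this file system is mounted.
  • Block size
  • Superblock operations: a pointer to a set of superblock routines for this file system
  • File system type
  • File system specific: information needed by this particular file system

Each file and directory in the VFS is represented by one and only one VFS inode. [13]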

The information in each VFS inode is built from information in the underlying file system by file system specific routines. VFS inodes exist only in the kernel’s memory and are kept in the VFS inode cache so long as they are useful to the system. Amongst other information, VFS inodes contain the following fields:

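A VFS inode contains the following fields:

  • device
  • inode number
  • mode
  • user ids: the owner identifiers
  • times: the creation, modification, and write times
  • block size
  • inode operations
  • count
  • lock
  • dirty: a flag that records whether this VFS inode has been written to; if it has, the underlying file system needs to be updated as well.
  • file system specific information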
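
The role of the dirty flag can be shown with a small sketch. The field selection, the extra `size` field, and the helper names `write` and `sync` are assumptions for illustration; a dictionary stands in for the underlying file system.

```python
from dataclasses import dataclass


@dataclass
class VFSInode:
    device: str
    ino: int
    mode: int
    uid: int
    size: int = 0        # hypothetical payload, used to show a change
    count: int = 0
    dirty: bool = False  # set when the in-memory inode has been written to


def write(inode: VFSInode, new_size: int) -> None:
    inode.size = new_size
    inode.dirty = True   # the underlying file system is now out of date


def sync(inode: VFSInode, backing_store: dict) -> None:
    """Push the change down to the (toy) underlying file system if needed."""
    if inode.dirty:
        backing_store[(inode.device, inode.ino)] = inode.size
        inode.dirty = False
```

A clean inode costs nothing to sync, while a dirty one triggers exactly one update of the backing store, identified here by the pair of device and inode number.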

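2 Summary

Having got this far, I think that is about enough; I now have a deeper understanding of both the VFS and file systems.

Two more articles are well worth reading:
Anatomy of the Linux file system https://www.ibm.com/developerworks/library/l-linux-filesystem/index.html This one covers file systems; it has diagrams, detailed explanations, and a clear logical structure.
Anatomy of the Linux virtual file system switch https://www.ibm.com/developerworks/library/l-virtual-filesystem-switch/index.html This one covers the VFS; it is also a rare and worthwhile resource.

Both come from IBM and go into great detail.

Next, I would like to try writing a minimal operating system and learn its design principles and subtleties through practice (though I probably won't have the time, since it takes a lot of it).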

https://unix.stackexchange.com/questions/34089/writing-a-kernel-from-scratch
https://tutorialsbynick.com/writing-an-os-baby-steps/
https://blog.csdn.net/u010469993/article/details/64126587 (see also the related links in its sidebar)
https://www.reddit.com/r/C_Programming/comments/8oyaux/a_stepbystep_guide_how_to_write_your_own_simple/

References

[1] VFS https://baike.baidu.com/item/VFS/7519887
Baidu Baike.

[2] Virtual file system https://en.wikipedia.org/wiki/Virtual_file_system
Wikipedia; the definition is fairly accurate.

[3] A tour of the Linux VFS https://www.tldp.org/LDP/khg/HyperNews/get/fs/vfstour.html
A classic, veteran article; according to [4] it was written in 1996.

[4] Overview of the Linux Virtual File System https://www.kernel.org/doc/Documentation/filesystems/vfs.txt

[5] What is a Device Driver? https://www.tldp.org/LDP/khg/HyperNews/get/devices/whatis.html
Found via [3].

[6] Device driver https://en.wikipedia.org/wiki/Device_driver

[7] The Linux Virtual File System https://www.win.tue.nl/~aeb/linux/lk/lk-8.html
Well explained; the author clearly knows Linux and the VFS well.

[8] The Virtual File System (VFS) http://www.science.unitn.it/~fiorella/guidelinux/tlk/node102.html
Includes a diagram and complements [7] nicely.

[9] ext2 https://en.wikipedia.org/wiki/Ext2
Unexpectedly, this is where I found the explanation of the origin of the VFS that I was looking for.

[10] File system https://en.wikipedia.org/wiki/File_system
About file systems in general, yet its description of the virtual file system is surprisingly precise. Sometimes it pays to look at neighbouring topics rather than drilling into one spot: I did not find this description on the VFS page, but I did find it in the file system article.

[11] The file system http://www.cs.jhu.edu/~yairamir/cs418/os7/sld004.htm
Found via [10]. Very well explained, but its interface only lets you page through the material one page at a time, which is inconvenient. After some digging I clicked the home button (the "little house" icon) and found that the material can be downloaded from the home page ( http://www.cs.jhu.edu/~yairamir/cs418/600-418.html ). It is very good teaching material, so I bookmarked it.

[12] The VFS Superblock http://www.science.unitn.it/~fiorella/guidelinux/tlk/node103.html#SECTION001121000000000000000
Found via [8].

[13] The VFS Inode http://www.science.unitn.it/~fiorella/guidelinux/tlk/node104.html#SECTION001122000000000000000
Found via [8].
