Reading Notes on Modern Operating Systems (4th Edition, English): Section 4.3, File-System Implementation

Probably the most important issue in implementing file storage is keeping track of which disk blocks go with which file

The most important issue in implementing file storage is keeping track of which disk blocks belong to which file.

contiguous allocation also has a very serious drawback: over the course of time, the disk becomes fragmented
A very serious drawback of contiguous allocation is that, as the disk is used over time, it becomes badly fragmented.

When a file is removed, its blocks are naturally freed, leaving a run of free blocks on the disk. The disk is not compacted on the spot to squeeze out the hole, since that would involve copying all the blocks following the hole, potentially millions of blocks, which would take hours or even days with large disks.

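When a file is deleted, the disk blocks it occupied are freed, leaving a run of free blocks on the disk. The disk is not compacted right away to close up the hole, because that would mean copying every block after the hole, possibly millions of them, which could take hours or even days on a large disk.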
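
A minimal sketch of how holes build up under contiguous allocation. The disk is modelled as a Python list of block owners, and the helpers `allocate`, `free`, and `free_runs` are hypothetical names for illustration, not anything from the book.

```python
def allocate(disk, name, length):
    """First fit: place `name` in the first run of `length` free blocks."""
    run = 0
    for i, owner in enumerate(disk):
        run = run + 1 if owner is None else 0
        if run == length:
            start = i - length + 1
            disk[start:i + 1] = [name] * length
            return start
    return None  # no single hole is large enough


def free(disk, name):
    """Release every block owned by `name`, leaving a hole in place."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def free_runs(disk):
    """Return the lengths of all maximal runs of free blocks."""
    runs, run = [], 0
    for owner in disk:
        if owner is None:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    return runs


disk = [None] * 12
for name, length in [("A", 3), ("B", 3), ("C", 3), ("D", 3)]:
    allocate(disk, name, length)
free(disk, "A")
free(disk, "C")
# Six blocks are free in total, but they sit in two separate holes of 3,
# so a 4-block file cannot be placed without compaction.
print(free_runs(disk), allocate(disk, "E", 4))  # prints: [3, 3] None
```

Closing the holes would mean copying files B and D toward the start of the disk, which is exactly the cost the passage describes.
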

The situation with DVDs is a bit more complicated. In principle, a 90-min movie could be encoded as a single file of length about 4.5 GB, but the file system used, UDF (Universal Disk Format), uses a 30-bit number to represent file length, which limits files to 1 GB. As a consequence, DVD movies are generally stored as three or four 1-GB files, each of which is contiguous. These physical pieces of the single logical file (the movie) are called extents.
The DVD case is a little more complicated. In principle, a 90-minute movie could be encoded as a single contiguous file of about 4.5 GB, but DVDs use the UDF file system, which represents file length with a 30-bit number. Since 2^30 bytes is 1 GB, the maximum file length is 1 GB. DVD movies are therefore usually stored as three or four 1-GB files, each of them contiguous. These physical pieces of the single logical file (the movie) are called extents.
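
A small sketch of what a 30-bit length field implies, taking the quoted figure at face value. The function `split_into_extents` and the 3.5 GiB movie size are assumptions made up for illustration; this is not how a real UDF implementation lays out its descriptors.

```python
MAX_EXTENT = 2**30 - 1  # largest value a 30-bit unsigned field can hold


def split_into_extents(file_len):
    """Split a logical file length into (offset, length) pieces,
    each at most MAX_EXTENT bytes long."""
    extents, offset = [], 0
    while offset < file_len:
        length = min(MAX_EXTENT, file_len - offset)
        extents.append((offset, length))
        offset += length
    return extents


movie_len = 7 * 2**29  # hypothetical 3.5 GiB movie
pieces = split_into_extents(movie_len)
print(len(pieces), [length for _, length in pieces])
```

Each piece stays just under 1 GiB, so one logical movie file ends up as several physical extents.
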

(Linked-list allocation)
Also, the amount of data storage in a block is no longer a power of two because the pointer takes up a few bytes. While not fatal, having a peculiar size is less efficient because many programs read and write in blocks whose size is a power of two. With the first few bytes of each block occupied by a pointer to the next block, reads of the full block size require acquiring and concatenating information from two disk blocks, which generates extra overhead due to the copying.

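In addition, the amount of data stored in a block is no longer a power of two, because the pointer takes up a few bytes. This is not fatal, but an odd size is less efficient, since many programs read and write in units whose size is a power of two. Because the first few bytes of every block are taken by the pointer to the next block, reading a full block's worth of data means fetching and combining data from two disk blocks, which adds overhead.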
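
A minimal sketch of the effect, assuming 1024-byte blocks and a 4-byte next-block pointer (both sizes are assumptions for illustration). The hypothetical helper `blocks_touched` reports which blocks in a file's chain a read has to visit.

```python
BLOCK_SIZE = 1024                    # assumed block size in bytes
POINTER_SIZE = 4                     # assumed size of the next-block pointer
PAYLOAD = BLOCK_SIZE - POINTER_SIZE  # data bytes left per block


def blocks_touched(offset, length):
    """Return the chain positions of the blocks covering
    file bytes [offset, offset + length)."""
    first = offset // PAYLOAD
    last = (offset + length - 1) // PAYLOAD
    return list(range(first, last + 1))


# A program reading in power-of-two units always straddles two blocks.
print(PAYLOAD, blocks_touched(0, BLOCK_SIZE), blocks_touched(BLOCK_SIZE, BLOCK_SIZE))
# prints: 1020 [0, 1] [1, 2]
```

Because the read size (1024) is larger than the payload (1020), every such read needs data from two blocks, wherever it starts.
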

FAT (File Allocation Table)
Both disadvantages of the linked-list allocation can be eliminated by taking the pointer word from each disk block and putting it in a table in memory
The drawbacks of linked-list allocation can be eliminated by taking the pointer out of each disk block and placing it in a table in memory.
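
A minimal sketch of the idea, with the table held as a Python dict. The block numbers, the `END` marker, and the helper names are assumptions for illustration.

```python
END = -1  # assumed end-of-chain marker

# fat[b] is the block that follows block b in its file.
fat = {4: 7, 7: 2, 2: 10, 10: 12, 12: END}


def file_blocks(fat, start):
    """Follow the chain from the file's first block to the end marker."""
    blocks = []
    block = start
    while block != END:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start, n):
    """Random access: walk n links in memory, with no disk reads."""
    block = start
    for _ in range(n):
        block = fat[block]
    return block


print(file_blocks(fat, 4), nth_block(fat, 4, 3))  # prints: [4, 7, 2, 10, 12] 10
```

The chain still has to be followed to reach the n-th block, but the walk happens entirely in memory, and each disk block is free to hold a full power-of-two amount of data.
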

The primary disadvantage of this method is that the entire table must be in memory all the time to make it work. With a 1-TB disk and a 1-KB block size, the table needs 1 billion entries, one for each of the 1 billion disk blocks. Each entry has to be a minimum of 3 bytes. For speed in lookup, they should be 4 bytes. Thus the table will take up 3 GB or 2.4 GB of main memory all the time, depending on whether the system is optimized for space or time. Not wildly practical. Clearly the FAT idea does not scale well to large disks. It was the original MS-DOS file system and is still fully supported by all versions of Windows though.
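The main drawback of this method is that the entire table must stay in memory the whole time for it to work. With a 1-TB disk and 1-KB blocks, the table has about 1 billion entries (2^30), one per disk block. Each entry needs at least 3 bytes, and for fast lookup it should be 4 bytes. So the table permanently occupies 3 GB or 2.4 GB of main memory (the second figure should presumably be 4 GB), depending on whether the system is optimized for space or for time. That is not very practical. Clearly the FAT approach does not scale to very large disks. FAT was the original MS-DOS file system and is still supported by every version of Windows.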
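
A quick check of the arithmetic in this paragraph, taking 1 TB as 2^40 bytes and 1 KB as 2^10 bytes. The function name `fat_table_bytes` is a hypothetical helper.

```python
def fat_table_bytes(disk_bytes, block_bytes, entry_bytes):
    """Size of a table with one entry per disk block."""
    entries = disk_bytes // block_bytes
    return entries * entry_bytes


DISK = 2**40   # 1 TB
BLOCK = 2**10  # 1 KB

for entry_bytes in (3, 4):
    size_gib = fat_table_bytes(DISK, BLOCK, entry_bytes) / 2**30
    print(entry_bytes, "bytes per entry ->", size_gib, "GiB")
```

With 2^30 entries, 3-byte entries come to 3 GiB and 4-byte entries to 4 GiB, which is consistent with reading the "2.4 GB" in the quoted text as a slip for 4 GB.
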

One problem with i-nodes is that if each one has room for a fixed number of disk addresses, what happens when a file grows beyond this limit? One solution is to reserve the last disk address not for a data block, but instead for the address of a block containing more disk-block addresses,

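One problem with i-nodes is that if each i-node has room for only a fixed number of disk addresses, what happens when a file grows beyond that limit? One solution is to reserve the last disk address not for a data block but for the address of a block that holds more disk-block addresses. (In other words, the block that the last address points to contains the addresses of other disk blocks, and those blocks hold the rest of the file's data.)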
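
A minimal sketch of a lookup through one indirect block. The slot count, the block numbers, and the `lookup` helper are assumptions for illustration. The i-node is a plain list of addresses, and `disk` maps a block number to the list of addresses stored in that block.

```python
N_SLOTS = 8  # assumed number of address slots in the i-node


def lookup(inode, disk, index):
    """Map a file-relative block index to a disk block address."""
    direct = N_SLOTS - 1              # every slot but the last is direct
    if index < direct:
        return inode[index]
    indirect_block = disk[inode[-1]]  # costs one extra disk read
    return indirect_block[index - direct]


inode = [20, 21, 35, 36, 50, 51, 52, 900]  # last slot points at block 900
disk = {900: [60, 61, 75]}                 # block 900 holds more addresses

print(lookup(inode, disk, 2), lookup(inode, disk, 7), lookup(inode, disk, 9))
# prints: 35 60 75
```

Small files are served from the i-node alone, and only blocks past the direct slots pay for the extra read of the indirect block.
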
