Cache Manager
The cache manager is a set of kernel-mode functions and system threads that cooperate with the memory manager to provide data caching for all Windows file system drivers (both local and network). In this chapter, we’ll explain how the cache manager, including its key internal data structures and functions, works; how it is sized at system initialization time; how it interacts with other elements of the operating system; and how you can observe its activity through performance counters. We’ll also describe the five flags on the Windows CreateFile function that affect file caching.
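The cache manager provides data caching for Windows file system drivers (local and remote) through a set of kernel functions and system threads that cooperate with the memory manager. This chapter explains:
how the cache manager works, including its main internal data structures and functions;
how its size is determined at startup;
how it interacts with the rest of the operating system;
how to observe its activity through performance counters.
We will also cover the five flags of the Windows CreateFile function and how they affect caching.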
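As a preview of the CreateFile flags mentioned above, the sketch below composes the caching-related values an application could pass in the dwFlagsAndAttributes argument. The text here does not name the five flags, so the list used (FILE_FLAG_WRITE_THROUGH, FILE_FLAG_NO_BUFFERING, FILE_FLAG_RANDOM_ACCESS, FILE_FLAG_SEQUENTIAL_SCAN, FILE_ATTRIBUTE_TEMPORARY) is an assumption based on the Win32 headers. The helpers caching_flags and open_with_hints are hypothetical names introduced only for illustration, and rejecting the sequential-plus-random combination is a choice of this sketch, not a rule stated in the text.

```python
import sys

# Win32 constants, values as defined in the Windows SDK headers.
FILE_ATTRIBUTE_TEMPORARY = 0x00000100
FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
FILE_FLAG_RANDOM_ACCESS = 0x10000000
FILE_FLAG_NO_BUFFERING = 0x20000000
FILE_FLAG_WRITE_THROUGH = 0x80000000

GENERIC_READ = 0x80000000
FILE_SHARE_READ = 0x00000001
OPEN_EXISTING = 3


def caching_flags(*, sequential=False, random_access=False, temporary=False,
                  no_buffering=False, write_through=False):
    """Hypothetical helper: build a dwFlagsAndAttributes value from hints."""
    if sequential and random_access:
        # Design choice of this sketch: treat the two access hints as exclusive.
        raise ValueError("sequential and random_access hints contradict each other")
    flags = 0
    if sequential:
        flags |= FILE_FLAG_SEQUENTIAL_SCAN
    if random_access:
        flags |= FILE_FLAG_RANDOM_ACCESS
    if temporary:
        flags |= FILE_ATTRIBUTE_TEMPORARY
    if no_buffering:
        flags |= FILE_FLAG_NO_BUFFERING
    if write_through:
        flags |= FILE_FLAG_WRITE_THROUGH
    return flags


def open_with_hints(path, flags):
    """Hypothetical helper: open an existing file for reading via CreateFileW."""
    if sys.platform != "win32":
        raise OSError("CreateFileW is only available on Windows")
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = wintypes.HANDLE
    kernel32.CreateFileW.argtypes = [
        wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD, wintypes.LPVOID,
        wintypes.DWORD, wintypes.DWORD, wintypes.HANDLE,
    ]
    handle = kernel32.CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, None,
                                  OPEN_EXISTING, flags, None)
    if handle == wintypes.HANDLE(-1).value:  # INVALID_HANDLE_VALUE
        raise ctypes.WinError(ctypes.get_last_error())
    return handle  # the caller is responsible for CloseHandle
```

On Windows, a log reader might call open_with_hints(path, caching_flags(sequential=True)); on other platforms only the flag composition is usable.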
Key Features of the Cache Manager
The cache manager has several key features:
■ Supports all file system types (both local and network), thus removing the need for each file
system to implement its own cache management code
■ Uses the memory manager to control which parts of which files are in physical memory
(trading off demands for physical memory between user processes and the operating system)
■ Caches data on a virtual block basis (offsets within a file), in contrast to many caching
systems, which cache on a logical block basis (offsets within a disk volume); this allows for
intelligent read-ahead and high-speed access to the cache without involving file system drivers
(This method of caching, called fast I/O, is described later in this chapter.)
■ Supports “hints” passed by applications at file open time (such as random versus sequential
access, temporary file creation, and so on)
■ Supports recoverable file systems (for example, those that use transaction logging) to recover
data after a system failure
Although we’ll talk more throughout this chapter about how these features are used in the cache manager, in this section we’ll introduce you to the concepts behind these features.
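Key features of the cache manager
Supports all file systems (local and network), so no file system has to write its own cache management code
Uses the memory manager to control which parts of which files are in physical memory, letting the memory manager balance the physical memory demands of processes and the system
Caches data on a virtual block basis [file blocks] (offsets within a file), unlike many caching systems that cache on a logical block basis [disk volume blocks] (offsets within a disk volume); this allows intelligent read-ahead and high-speed access to the cache without involving file system drivers (this method of caching is called fast I/O and is covered later in this chapter). Open question: what exactly is fast I/O?
Supports "hints" passed by applications at file open time (random versus sequential access, temporary file creation, and so on) [passed through the flags of the open function]
Supports recoverable file systems (for example, those that use transaction logging) in recovering data after a system failure
Although this chapter discusses in detail how the cache manager uses these features, this section introduces the concepts behind them.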
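The difference between virtual block and logical block caching can be made concrete with a toy model. The sketch below is not how the Windows cache manager is implemented; ToyFileSystem, VirtualBlockCache, and LogicalBlockCache are hypothetical classes, and the 4096-byte page size is an assumption. It only shows that a cache keyed by (file, offset within the file) can satisfy a repeated read without asking the file system to translate the offset, while a cache keyed by volume offset needs that translation on every access.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyFileSystem:
    """Hypothetical file system: maps file pages to volume pages."""

    def __init__(self, extents, volume):
        self.extents = extents      # {file_id: [volume_page_number, ...]}
        self.volume = volume        # raw bytes of the whole "disk volume"
        self.translations = 0       # how often the file system was consulted

    def to_volume_page(self, file_id, file_page):
        self.translations += 1
        return self.extents[file_id][file_page]

    def read_volume_page(self, volume_page):
        start = volume_page * PAGE_SIZE
        return self.volume[start:start + PAGE_SIZE]


class VirtualBlockCache:
    """Keyed by (file, page within the file): hits bypass the file system."""

    def __init__(self, fs):
        self.fs = fs
        self.pages = {}

    def read_page(self, file_id, file_page):
        key = (file_id, file_page)
        if key not in self.pages:   # only a miss involves the file system
            volume_page = self.fs.to_volume_page(file_id, file_page)
            self.pages[key] = self.fs.read_volume_page(volume_page)
        return self.pages[key]


class LogicalBlockCache:
    """Keyed by volume page: every access needs a translation first."""

    def __init__(self, fs):
        self.fs = fs
        self.pages = {}

    def read_page(self, file_id, file_page):
        volume_page = self.fs.to_volume_page(file_id, file_page)
        if volume_page not in self.pages:
            self.pages[volume_page] = self.fs.read_volume_page(volume_page)
        return self.pages[volume_page]
```

Reading the same file page three times through VirtualBlockCache consults the toy file system once; through LogicalBlockCache it consults it three times, which is the overhead the virtual block design avoids.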
Single, Centralized System Cache
Some operating systems rely on each individual file system to cache data, a practice that results either in duplicated caching and memory management code in the operating system or in limitations on the kinds of data that can be cached. In contrast, Windows offers a centralized caching facility that caches all externally stored data, whether on local hard disks, floppy disks, network file servers, or CD-ROMs. Any data can be cached, whether it’s user data streams (the contents of a file and the ongoing read and write activity to that file) or file system metadata (such as directory and file headers). As you’ll discover in this chapter, the method Windows uses to access the cache depends on the type of data being cached.
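Single, centralized system cache
Some operating systems rely on each file system to cache its own data; that approach either duplicates caching and memory management code or restricts the kinds of data that can be cached. By contrast, Windows provides a centralized caching facility that caches all externally stored data, whether it lives on a local hard disk, a floppy disk, a network file server, or a CD-ROM. Any data can be cached, whether it is a user data stream (the contents of a file and the read and write activity against it) or file system metadata (such as directory and file headers).
As this chapter shows, the way Windows accesses the cache depends on the type of data being cached.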
The Memory Manager
One unusual aspect of the cache manager is that it never knows how much cached data is actually in physical memory. This statement might sound strange because the purpose of a cache is to keep a subset of frequently accessed data in physical memory as a way to improve I/O performance. The reason the cache manager doesn’t know how much data is in physical memory is that it accesses data by mapping views of files into system virtual address space, using standard section objects (file mapping objects in Windows API terminology). (Section objects are the basic primitive of the memory manager and are explained in detail in Chapter 10, “Memory Management.”) As addresses in these mapped views are accessed, the memory manager pages in blocks that aren’t in physical memory. And when memory demands dictate, the memory manager unmaps these pages out of the cache and, if the data has changed, pages the data back to the files.
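One unusual aspect of the cache manager is that it never knows how much cached data is actually in physical memory. That may sound strange, because the purpose of a cache is to keep a subset of frequently accessed data in physical memory to improve I/O performance. The cache manager does not know because it accesses data by mapping files into the system virtual address space, using standard section objects. When an address in a mapped view is accessed, data already in physical memory is used directly; if it is not in physical memory, the memory manager pages the block in. When memory demand requires it, the memory manager unmaps these pages from the cache and, if the data has changed, pages the data back to the file.
(Part of the view is resident in physical memory and the rest is paged in on demand;
if something else needs the memory, pages are unmapped.) This part needs to be read together with Chapter 10.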
By caching on the basis of a virtual address space using mapped files, the cache manager avoids generating read or write I/O request packets (IRPs) to access the data for files it’s caching. Instead, it simply copies data to or from the virtual addresses where the portion of the cached file is mapped and relies on the memory manager to fault in (or out) the data into (or out of) memory as needed. This process allows the memory manager to make global trade-offs on how much memory to give to the system cache versus how much to give to user processes. (The cache manager also initiates I/O, such as lazy writing, which is described later in this chapter; however, it calls the memory manager to write the pages.) Also, as you’ll learn in the next section, this design makes it possible for processes that open cached files to see the same data as do processes that are mapping the same files into their user address spaces.
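A user-mode analogy can make the mapped-view idea tangible. The sketch below uses Python's mmap module to map a temporary file, change bytes by copying into the mapped addresses, and then read the same bytes back through an ordinary file read. It is only an analogy under stated assumptions: the cache manager maps views into system address space inside the kernel, whereas this code maps a view into a user process, and demo_mapped_view is a hypothetical name. The explicit flush is there to keep the example deterministic across platforms.

```python
import mmap
import os
import tempfile


def demo_mapped_view():
    """Write through a mapped view, then read the result with plain file I/O."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * mmap.PAGESIZE)    # give the file one page of data
        with mmap.mmap(fd, 0) as view:        # map the whole file as a shared view
            before = bytes(view[:5])
            view[:5] = b"HELLO"               # a memory copy, not a write() call
            view.flush()                      # ask the OS to write dirty pages back
        os.lseek(fd, 0, os.SEEK_SET)
        after = os.read(fd, 5)                # ordinary read sees the mapped change
        return before, after
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo_mapped_view())
```

The modification itself is just a copy into mapped memory; getting the data to and from the file is left to the operating system's paging machinery, which is the division of labor the paragraph above describes.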