Chapter 10. FS-Cache
FS-Cache is a persistent local cache that can be used by file systems to take data retrieved from over the network and cache it on local disk. This helps minimize network traffic for users accessing data from a file system mounted over the network (for example, NFS).
The following diagram is a high-level illustration of how FS-Cache works:
FS-Cache is designed to be as transparent as possible to the users and administrators of a system. Unlike
cachefs
on Solaris, FS-Cache allows a file system on a server to interact directly with a client's local cache without creating an overmounted file system. With NFS, a mount option instructs the client to mount the NFS share with FS-Cache enabled.
FS-Cache does not alter the basic operation of a file system that works over the network - it merely provides that file system with a persistent place in which it can cache data. For instance, a client can still mount an NFS share whether or not FS-Cache is enabled. In addition, cached NFS can handle files that won't fit into the cache (whether individually or collectively) as files can be partially cached and do not have to be read completely up front. FS-Cache also hides all I/O errors that occur in the cache from the client file system driver.
To provide caching services, FS-Cache needs a
cache back-end. A cache back-end is a storage driver configured to provide caching services (i.e.
cachefiles
). In this case, FS-Cache requires a mounted block-based file system that supports
bmap
and extended attributes (e.g. ext3) as its cache back-end.
FS-Cache cannot arbitrarily cache any file system, whether through the network or otherwise: the shared file system's driver must be altered to allow interaction with FS-Cache, data storage/retrieval, and metadata setup and validation. FS-Cache needs
indexing keys and
coherency data from the cached file system to support persistence: indexing keys to match file system objects to cache objects, and coherency data to determine whether the cache objects are still valid.
Note: cachefilesd
In Red Hat Enterprise Linux 6.2 and all previous versions,
cachefilesd
is not installed by default and will need to be installed manually.
10.1. Performance Guarantee
FS-Cache does
not guarantee increased performance; however, it ensures consistent performance by avoiding network congestion. Using a cache back-end incurs a performance penalty: for example, cached NFS shares add disk accesses to cross-network lookups. While FS-Cache tries to be as asynchronous as possible, there are synchronous paths (e.g. reads) where this isn't possible.
For example, using FS-Cache to cache an NFS share between two computers over an otherwise unladen GigE network will not demonstrate any performance improvements on file access. Rather, NFS requests would be satisfied faster from server memory than from local disk.
The use of FS-Cache, therefore, is a
compromise between various factors. If FS-Cache is being used to cache NFS traffic, for instance, it may slow the client down a little, but massively reduce the network and server loading by satisfying read requests locally without consuming network bandwidth.
10.2. Setting Up a Cache
Currently, Red Hat Enterprise Linux 6 only provides the
cachefiles
caching back-end. The
cachefilesd
daemon initiates and manages
cachefiles
. The
/etc/cachefilesd.conf
file controls how
cachefiles
provides caching services. To configure a cache back-end of this type, the
cachefilesd
package must be installed.
The first setting to configure in a cache back-end is which directory to use as a cache. To configure this, use the following parameter:
dir /path/to/cache
Typically, the cache back-end directory is set in
/etc/cachefilesd.conf
as
/var/cache/fscache
, as in:
dir /var/cache/fscache
FS-Cache will store the cache in the file system that hosts
/path/to/cache
. On a laptop, it is advisable to use the root file system (
/
) as the host file system, but for a desktop machine it would be more prudent to mount a disk partition specifically for the cache.
File systems that support functionalities required by FS-Cache cache back-end include the Red Hat Enterprise Linux 6 implementations of the following file systems:
-
ext3 (with extended attributes enabled)
-
ext4
-
BTRFS
-
XFS
The host file system must support user-defined extended attributes; FS-Cache uses these attributes to store coherency maintenance information. To enable user-defined extended attributes for ext3 file systems (i.e.
device
), use:
# tune2fs -o user_xattr /dev/device
Alternatively, extended attributes for a file system can be enabled at mount time, as in:
# mount /dev/device /path/to/cache -o user_xattr
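Whether the host file system actually accepts user-defined extended attributes can be checked before starting the daemon. The following is a minimal sketch and not part of cachefilesd: it assumes the setfattr utility (from the attr package) is installed, and the function name supports_user_xattr and the attribute name user.fscache_probe are hypothetical names chosen for illustration.

```shell
# Probe whether the directory given as $1 accepts user.* extended attributes.
# Returns 0 if an attribute could be set, 1 if not (including when the
# directory is not writable), and 2 if setfattr is not installed.
supports_user_xattr() {
    dir=$1
    command -v setfattr >/dev/null 2>&1 || return 2
    # Create a scratch file inside the directory so the probe hits the
    # same file system that would host the cache.
    probe=$(mktemp "$dir/.xattr-probe.XXXXXX") || return 1
    if setfattr -n user.fscache_probe -v 1 "$probe" 2>/dev/null; then
        rc=0
    else
        rc=1
    fi
    rm -f "$probe"
    return $rc
}
```

For example, running supports_user_xattr /path/to/cache as root and checking the exit status indicates whether the user_xattr option still needs to be enabled.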
The cache back-end works by maintaining a certain amount of free space on the partition hosting the cache. It grows and shrinks the cache in response to other elements of the system using up free space, making it safe to use on the root file system (for example, on a laptop). FS-Cache sets defaults on this behavior, which can be configured via
cache cull limits. For more information about configuring cache cull limits, refer to
Section 10.4, “Setting Cache Cull Limits”.
Once the configuration file is in place, start up the
cachefilesd
daemon:
# service cachefilesd start
To configure
cachefilesd
to start at boot time, execute the following command as root:
# chkconfig cachefilesd on
10.3. Using the Cache With NFS
NFS will not use the cache unless explicitly instructed. To configure an NFS mount to use FS-Cache, include the
-o fsc
option to the
mount
command:
# mount nfs-share:/ /mount/point -o fsc
All access to files under
/mount/point
will go through the cache, unless the file is opened for direct I/O or writing (refer to
Section 10.3.2, “Cache Limitations With NFS” for more information). NFS indexes cache contents using the NFS file handle,
not the file name; this means that hard-linked files share the cache correctly.
Caching is supported in versions 2, 3, and 4 of NFS. However, each version uses different branches for caching.
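To confirm that a share was mounted with FS-Cache enabled, the mount options can be inspected after mounting. The sketch below assumes that the kernel lists fsc among the options in /proc/mounts for such a mount; the function name mount_uses_fsc is hypothetical.

```shell
# Succeed if the mount point given as $1 appears with the fsc option
# (plain "fsc" or "fsc=identifier") in a /proc/mounts-style file.
# The file defaults to /proc/mounts and can be overridden with $2.
mount_uses_fsc() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "fsc" || opts[i] ~ /^fsc=/)
                    found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}
```

For example, mount_uses_fsc /mount/point && echo "FS-Cache enabled" prints the message only when the option is present.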
10.3.1. Cache Sharing
There are several potential issues to do with NFS cache sharing. Because the cache is persistent, blocks of data in the cache are indexed on a sequence of four keys:
-
Level 1: Server details
-
Level 2: Some mount options; security type; FSID; uniquifier
-
Level 3: File Handle
-
Level 4: Page number in file
To avoid coherency management problems between superblocks, all NFS superblocks that wish to cache data have unique Level 2 keys. Normally, two NFS mounts with the same source volume and options will share a superblock, and thus share the caching, even if they mount different directories within that volume.
Example 10.1. Cache sharing
Take the following two
mount
commands:
mount home0:/disk0/fred /home/fred -o fsc
mount home0:/disk0/jim /home/jim -o fsc
Here,
/home/fred
and
/home/jim
will likely share the superblock as they have the same options, especially if they come from the same volume/partition on the NFS server (
home0
). Now, consider the next two subsequent mount commands:
mount home0:/disk0/fred /home/fred -o fsc,rsize=230
mount home0:/disk0/jim /home/jim -o fsc,rsize=231
In this case,
/home/fred
and
/home/jim
will not share the superblock as they have different network access parameters, which are part of the Level 2 key. The same goes for the following mount sequence:
mount home0:/disk0/fred /home/fred1 -o fsc,rsize=230
mount home0:/disk0/fred /home/fred2 -o fsc,rsize=231
Here, the contents of the two subtrees (
/home/fred1
and
/home/fred2
) will be cached
twice.
Another way to avoid superblock sharing is to suppress it explicitly with the
nosharecache
parameter. Using the same example:
mount home0:/disk0/fred /home/fred -o nosharecache,fsc
mount home0:/disk0/jim /home/jim -o nosharecache,fsc
However, in this case only one of the superblocks will be permitted to use the cache since there is nothing to distinguish the Level 2 keys of
home0:/disk0/fred
and
home0:/disk0/jim
. To address this, add a
unique identifier on at least one of the mounts, i.e.
fsc=unique-identifier
. For example:
mount home0:/disk0/fred /home/fred -o nosharecache,fsc
mount home0:/disk0/jim /home/jim -o nosharecache,fsc=jim
Here, the unique identifier
jim
will be added to the Level 2 key used in the cache for
/home/jim
.
10.3.2. Cache Limitations With NFS
Opening a file from a shared file system for direct I/O will automatically bypass the cache. This is because this type of access must be direct to the server.
Caching a file from a shared file system that is opened for writing will not work on NFS versions 2 and 3. The protocols of these versions do not provide sufficient coherency management information for the client to detect a concurrent write to the same file from another client.
As such, opening a file from a shared file system for either direct I/O or writing will flush the cached copy of the file. FS-Cache will not cache the file again until it is no longer opened for direct I/O or writing.
Furthermore, this release of FS-Cache only caches regular NFS files. FS-Cache will
not cache directories, symlinks, device files, FIFOs, or sockets.
10.4. Setting Cache Cull Limits
The
cachefilesd
daemon works by caching remote data from shared file systems to free space on the disk. This could potentially consume all available free space, which would be a problem if the disk also housed the root partition. To control this,
cachefilesd
tries to maintain a certain amount of free space by discarding old objects (i.e. accessed less recently) from the cache. This behavior is known as
cache culling.
Cache culling is done on the basis of the percentage of blocks and the percentage of files available in the underlying file system. There are six limits controlled by settings in
/etc/cachefilesd.conf
:
-
brun N% (percentage of blocks), frun N% (percentage of files)
If the amount of free space and the number of available files in the cache rise above both these limits, then culling is turned off.
-
bcull N% (percentage of blocks), fcull N% (percentage of files)
If the amount of available space or the number of available files in the cache falls below either of these limits, then culling is started.
-
bstop N% (percentage of blocks), fstop N% (percentage of files)
If the amount of available space or the number of available files in the cache falls below either of these limits, then no further allocation of disk space or files is permitted until culling has raised things above these limits again.
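To see roughly where a file system stands relative to these limits, the available blocks and available files can be expressed as percentages of their totals. This is a sketch that mirrors the description above and is not taken from cachefilesd itself; it assumes GNU stat, and the function name fs_available_percent is hypothetical.

```shell
# Print two integers for the file system holding the path given as $1:
# the percentage of blocks available and the percentage of files
# (inodes) available. A total of zero is reported as 0 percent.
fs_available_percent() {
    # %a = blocks available to unprivileged users, %b = total blocks,
    # %d = free file nodes, %c = total file nodes.
    stat -f -c '%a %b %d %c' "$1" | awk '
        {
            blocks = ($2 > 0) ? int($1 * 100 / $2) : 0
            files  = ($4 > 0) ? int($3 * 100 / $4) : 0
            print blocks, files
        }'
}
```

For example, fs_available_percent /var/cache/fscache prints the two percentages on one line, which can then be compared by eye with the brun, bcull, and bstop settings and their file-count counterparts.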
The default value of
N
for each setting is as follows:
-
brun/frun - 10%
-
bcull/fcull - 7%
-
bstop/fstop - 3%
When configuring these settings, the following must hold true:
0 <= bstop < bcull < brun < 100
0 <= fstop < fcull < frun < 100
These are the percentages of available space and available files and do not appear as 100 minus the percentage displayed by the
df
program.
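The ordering constraint on the limits can be checked before a new set of values is written to the configuration file. The following is a minimal sketch; the function name valid_cull_limits is hypothetical, and the values are passed as plain integers without the percent sign.

```shell
# Succeed only if 0 <= stop < cull < run < 100.
# Arguments: stop, cull, and run percentages as integers.
valid_cull_limits() {
    stop=$1
    cull=$2
    run=$3
    [ "$stop" -ge 0 ] &&
    [ "$stop" -lt "$cull" ] &&
    [ "$cull" -lt "$run" ] &&
    [ "$run" -lt 100 ]
}
```

For example, valid_cull_limits 3 7 10 succeeds for the default values, and the same check applies separately to the block limits and to the file limits.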
Important
Culling depends on both bxxx and fxxx pairs simultaneously; they cannot be treated separately.
10.5. Statistical Information
FS-Cache also keeps track of general statistical information. To view this information, use:
cat /proc/fs/fscache/stats
FS-Cache statistics include information on decision points and object counters. For more details on the statistics provided by FS-Cache, refer to the following kernel document:
/usr/share/doc/kernel-doc-version/Documentation/filesystems/caching/fscache.txt
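A single counter can be pulled out of the statistics for use in scripts. The sketch below assumes that the file consists of lines made up of a section label ending in a colon followed by key=value pairs (for example, a Cookies: line with an idx= field); the exact layout varies between kernel versions, so check it against the kernel document above. The function name fscache_stat is hypothetical.

```shell
# Print the value of one counter from an FS-Cache statistics file.
# Arguments: section label without the colon, key name, and optionally
# the file to read (defaults to /proc/fs/fscache/stats).
# Exits non-zero if the section or key is not found.
fscache_stat() {
    section=$1
    key=$2
    stats_file=${3:-/proc/fs/fscache/stats}
    awk -v sec="$section" -v key="$key" '
        $1 == (sec ":") {
            # Scan the key=value fields that follow the section label.
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == key) { print kv[2]; found = 1; exit }
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$stats_file"
}
```

For example, fscache_stat Cookies idx prints the idx field of the Cookies line, if the running kernel provides one.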
10.6. References
For more information on
cachefilesd
and how to configure it, refer to
man cachefilesd
and
man cachefilesd.conf
. The following documents also provide additional information:
-
/usr/share/doc/cachefilesd-version-number/README
-
/usr/share/man/man5/cachefilesd.conf.5.gz
-
/usr/share/man/man8/cachefilesd.8.gz
For general information about FS-Cache, including details on its design constraints, available statistics, and capabilities, refer to the following kernel document:
/usr/share/doc/kernel-doc-version/Documentation/filesystems/caching/fscache.txt