https://access.redhat.com/solutions/203173
What are some of the best practices for tuning XFS filesystems?
SOLUTION Verified - Updated March 17, 2016 at 06:39
Environment
- Red Hat Enterprise Linux 5 [XFS available with the Scalable File System subscription]
- Red Hat Enterprise Linux 6 [XFS available with the Scalable File System subscription]
- Red Hat Enterprise Linux 7
Issue
- Discuss the best practices and tunables for XFS filesystems, for better performance.
- What are some of the best deployment practices for XFS?
Resolution
Defaults
- There are few workloads where using non-default mkfs.xfs or mount options is required. In most cases, the default values will suffice.
- The filesystem driver probes its underlying storage devices and the resulting default values are already optimized.
- mkfs.xfs will detect the difference between a single underlying disk and MD/DM RAID setups and will change the default values it uses to configure the filesystem appropriately.
- The defaults should only be changed if:
  - The specific workload on the machine is known to cause problems with the default settings, and can be worked around via a configuration change.
  - The workload is demonstrating bad performance when using the default configurations.
- Note: If performance is poor, an understanding of why the poor performance occurs is required before applying any changes to XFS.
XFS Design
- A good understanding of XFS design is required to properly tune and use the filesystem.
- The upstream XFS User-guide, file system structure, and documentation (even though not complete in many sections) provide a good source of information.
- XFS is made up of Extents which are placed in Allocation Groups. When an XFS filesystem is created on top of RAID, the filesystem needs to know about the construction of the physical devices, and an Allocation Group is placed on each physical device. When a new directory is created, that directory is placed in a different Allocation Group.
- This allows the system to do I/O to multiple directories at the same time and have that I/O serviced by different underlying physical devices. This does require some thought about how files are laid out on the disk. For example, it makes no sense to store everything in a single directory.
XFS File System Creation Options
- The LVM RAID Calculator and File System Layout Calculator can be used to work out these options.
- If using LVM, mkfs.xfs will query the Logical Volume Manager for the correct RAID geometry. Ensure that the LVM RAID configuration is correct, and no work to organize the filesystem across RAID devices should be required.
- mkfs.xfs also attempts to query SCSI devices for their RAID alignment information, which can be provided in SCSI queries to the Linux SCSI block driver.
- If it is required to manually specify the RAID stripe unit and stripe width, the options su and sw, or sunit and swidth, can be used. These are described in man mkfs.xfs.
- The agcount parameter controls the number of Allocation Groups created, or the agsize parameter can control the size of each created Allocation Group. This can have a performance impact depending on the concurrency of the workload and the layout of the files and directories on the disk.
- A specific example for nested RAID60 can be viewed at: How can I create a XFS filesystem optimal for RAID 60?
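When the stripe geometry does have to be given by hand, the arithmetic can be sketched as below. This is a minimal illustration, not a recommended procedure: the helper names xfs_stripe_opts and xfs_sunit_opts, the 10-disk RAID 6 layout, the 64 KiB chunk size and the /dev/sdX device are all assumptions made for the example. It relies on su being the per-disk chunk size, sw being the number of data-bearing disks, and sunit and swidth being counted in 512-byte sectors, as described in man mkfs.xfs. The script only prints the commands and never touches a device.

```shell
#!/bin/sh
# Sketch only: derives stripe options from a RAID layout and prints the
# resulting mkfs.xfs command. Nothing here is run against a block device.

# xfs_stripe_opts CHUNK_KIB TOTAL_DISKS PARITY_DISKS
# (hypothetical helper) prints "su=<chunk>k,sw=<data disks>".
xfs_stripe_opts() {
    data_disks=$(( $2 - $3 ))
    if [ "$data_disks" -lt 1 ]; then
        echo "error: layout leaves no data disks" >&2
        return 1
    fi
    echo "su=${1}k,sw=${data_disks}"
}

# xfs_sunit_opts CHUNK_KIB TOTAL_DISKS PARITY_DISKS
# (hypothetical helper) prints the same geometry as sunit/swidth,
# both expressed in 512-byte sectors.
xfs_sunit_opts() {
    data_disks=$(( $2 - $3 ))
    if [ "$data_disks" -lt 1 ]; then
        echo "error: layout leaves no data disks" >&2
        return 1
    fi
    sunit=$(( $1 * 1024 / 512 ))
    echo "sunit=${sunit},swidth=$(( sunit * data_disks ))"
}

# Assumed example: RAID 6 over 10 disks (2 parity) with a 64 KiB chunk.
# /dev/sdX is a placeholder, not a real device name.
echo "mkfs.xfs -d $(xfs_stripe_opts 64 10 2) /dev/sdX"
echo "mkfs.xfs -d $(xfs_sunit_opts 64 10 2) /dev/sdX"
```

For the assumed layout the two printed commands carry su=64k,sw=8 and sunit=128,swidth=1024, which describe the same geometry. If mkfs.xfs already detects the geometry from LVM, MD or the SCSI device, none of this is needed.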
Mount Options
noatime
- The noatime mount option can be used to prevent file atime (access time) being written to the log, if such data is not required. This can help prevent log metadata access for read-heavy workloads. The mtime (modification time) is still updated when file contents are changed, which is sufficient for most backup software to realize a file's content has changed.
sync
- The sync mount option can be used to ensure data is actually written to disk on a write() system call, instead of written to cache and flushed to disk later. See man mount for further information on this mount option.
logbufs
- The logbufs parameter can be used to specify the number of log buffers, if the default is too memory-intensive.
logbsize
- The logbsize and delaylog mount options change metadata performance considerably, with some caveats. Increasing logbsize reduces the number of journal I/Os for a given workload, and delaylog will reduce them even further. The trade-off for this increase in metadata performance is that more operations may be "missing" after recovery if the system crashes while actively making modifications.
- The logbsize mount option is recommended for file systems that are modified frequently, or in bursts. The default value is the maximum of either 32 KiB or the log stripe unit, and the maximum size is 256 KiB. A value of 256 KiB is recommended for file systems that undergo heavy modifications.
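A minimal sketch of passing the recommended 256 KiB log buffer size at mount time follows. The helper names logbsize_ok and xfs_log_mount_opts, the device /dev/sdX and the mount point /mnt/data are hypothetical. The list of accepted sizes (16k, 32k, 64k, 128k, 256k) is an assumption taken from the XFS mount option documentation and is not stated in this article; it also ignores any interaction with the log stripe unit. The script only prints the command.

```shell
#!/bin/sh
# Sketch only: builds the logbsize mount option and prints the command.

# logbsize_ok SIZE_KIB (hypothetical helper): succeeds for the sizes
# assumed to be accepted by XFS (powers of two from 16 KiB to 256 KiB).
logbsize_ok() {
    case "$1" in
        16|32|64|128|256) return 0 ;;
        *) return 1 ;;
    esac
}

# xfs_log_mount_opts SIZE_KIB (hypothetical helper): prints the option
# string for mount -o, or fails for a size outside the assumed list.
xfs_log_mount_opts() {
    if ! logbsize_ok "$1"; then
        echo "error: unsupported logbsize ${1}k" >&2
        return 1
    fi
    echo "logbsize=${1}k"
}

# /dev/sdX and /mnt/data are placeholders for this example.
echo "mount -o $(xfs_log_mount_opts 256) /dev/sdX /mnt/data"
```

The same option string can be placed in the options field of an /etc/fstab entry so that it persists across reboots.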
delaylog
- The delaylog mount option can also improve sustained metadata modification performance by reducing the number of changes to the log. It achieves this by aggregating individual changes in memory before writing them to the log: frequently modified metadata is written to the log periodically instead of on every modification. This option increases the memory usage of tracking dirty metadata and increases the potential lost operations when a crash occurs, but can improve metadata modification speed and scalability by an order of magnitude or more.
Important: The delaylog option is the default from kernel-2.6.32-504.el6 (RHEL 6.6), and it is no longer necessary to explicitly use this option on later releases.
inode64
- The inode64 mount option allows inodes to be placed at any location on the filesystem, which can lead to performance improvements by storing a file's inode in the same location as the file's data. This option is default on RHEL7 and later releases.
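Because options such as inode64 may already be in effect by default, it can help to check what a mounted filesystem actually reports before adding anything to /etc/fstab. The sketch below reads /proc/mounts and looks for an option in the fourth field. The helper name has_mount_opt is hypothetical, and whether a given kernel lists inode64 there is an assumption to verify on the system in question.

```shell
#!/bin/sh
# has_mount_opt OPTIONS NAME (hypothetical helper): succeeds if NAME
# appears as a whole entry in a comma-separated mount option list.
has_mount_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Walk /proc/mounts (fields: device, mount point, type, options, ...)
# and report, for each XFS mount, whether inode64 is listed.
if [ -r /proc/mounts ]; then
    while read -r dev mnt fstype opts rest; do
        [ "$fstype" = "xfs" ] || continue
        if has_mount_opt "$opts" inode64; then
            echo "$mnt: inode64"
        else
            echo "$mnt: inode64 not listed"
        fi
    done < /proc/mounts
fi
```

Matching on whole comma-separated entries avoids false positives, for example treating noatime as if it contained an atime option.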
nobarrier
- XFS uses write barriers to ensure file system integrity even when power is lost to a device with write caches enabled. For devices without write caches, or with battery-backed write caches, disable the barriers by using the nobarrier mount option. This can increase performance, but is not suggested on devices with write caches that are not battery-backed.
discard