NUMA Performance Optimization (Doc ID 1488175.1)

badman250

APPLIES TO:

Linux OS - Version 2.6.18 and later
Information in this document applies to any platform.
On architectures that support NUMA (non-uniform memory access), an application running on one cluster node can access memory that is physically attached to another node. Although these accesses use a high-speed transfer bus, they are significantly slower than accesses to memory local to the node.

The Linux kernel makes extensive use of dynamic memory allocation during its normal operations. Depending on several heuristics, the memory may be allocated locally, or it may be allocated on a separate cluster node.

The kernel's memory allocation policy can be tuned using simple command-line settings.

Each cluster node can have a separate allocation policy. As a corollary, the settings should probably be changed on all cluster nodes. Pinning a process to a particular CPU has no impact on this allocation policy.

SYMPTOMS

System performance can vary with system load, as memory allocations may be satisfied by using local memory, or by using remote memory from a different cluster node.

Performance anomalies include unexpected swap usage, unexpectedly poor performance, and out-of-memory process terminations while sufficient memory and swap space appear to be available.

CAUSE

The Linux kernel may satisfy its dynamic memory allocations using either local or remote memory in a NUMA system; this is its default operation. The kernel tries to keep memory allocation local, but may choose to use remote memory. While this does allow the kernel or process to keep running, there can be a significant performance penalty.
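One way to check whether allocations are actually landing off-node is to read the per-node counters the kernel exposes in /sys/devices/system/node/nodeN/numastat. The following is a minimal sketch, not part of the original note: the function name numa_miss_report is hypothetical, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# Hypothetical helper: print numa_hit and numa_miss for every NUMA node.
# $1 (optional) overrides the sysfs directory; default is the real location.
numa_miss_report() {
    root="${1:-/sys/devices/system/node}"
    for dir in "$root"/node[0-9]*; do
        stat_file="$dir/numastat"
        # Skip nodes without a readable numastat file (or an unmatched glob).
        [ -r "$stat_file" ] || continue
        awk -v node="${dir##*/}" '
            BEGIN { hit = 0; miss = 0 }
            $1 == "numa_hit"  { hit = $2 }
            $1 == "numa_miss" { miss = $2 }
            END { print node " numa_hit=" hit " numa_miss=" miss }
        ' "$stat_file"
    done
}
```

A numa_miss value that keeps growing on a node suggests that memory intended for another node is being satisfied there instead.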

SOLUTION

Observe the current kernel memory allocation policy:

# /sbin/sysctl vm.zone_reclaim_mode

The default value of vm.zone_reclaim_mode is 0.
Neither vm.zone_reclaim_mode nor vm.zone_reclaim_interval is present in kernel-xen-2.6.18-xxx.
vm.zone_reclaim_interval is not present in UEK 2.6.32 and later.
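Because these tunables are missing on some kernels, a script may want to check for them before reading. A minimal sketch, assuming the settings appear as files under /proc/sys/vm; the function name show_vm_tunable and its optional directory argument are hypothetical:

```shell
# Hypothetical helper: print a vm.* tunable, or report that it is absent.
# $1 = tunable name without the "vm." prefix
# $2 (optional) overrides the directory; default is /proc/sys/vm.
show_vm_tunable() {
    name="$1"
    vm_dir="${2:-/proc/sys/vm}"
    if [ -r "$vm_dir/$name" ]; then
        printf 'vm.%s = %s\n' "$name" "$(cat "$vm_dir/$name")"
    else
        printf 'vm.%s is not present on this kernel\n' "$name"
    fi
}
```

For example, show_vm_tunable zone_reclaim_interval reports the absence cleanly on UEK 2.6.32 and later instead of failing.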

Add an entry in the /etc/sysctl.conf file to select a different policy:

vm.zone_reclaim_mode = 6

This makes the local node reclaim its own pages first, rather than allocating VM pages on a different cluster node.
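The value of vm.zone_reclaim_mode is a bitmask. According to the kernel's Documentation/sysctl/vm.txt, 1 turns zone reclaim on, 2 lets zone reclaim write dirty pages out, and 4 lets zone reclaim swap pages. The sketch below decodes a value into those flags; the function name describe_zone_reclaim_mode is hypothetical.

```shell
# Hypothetical helper: list the vm.zone_reclaim_mode bits set in $1.
# Bit meanings are taken from the kernel's Documentation/sysctl/vm.txt.
describe_zone_reclaim_mode() {
    mode="$1"
    if [ "$mode" -eq 0 ]; then
        echo "0: zone reclaim off"
        return 0
    fi
    [ $((mode & 1)) -ne 0 ] && echo "1: zone reclaim on"
    [ $((mode & 2)) -ne 0 ] && echo "2: zone reclaim writes dirty pages out"
    [ $((mode & 4)) -ne 0 ] && echo "4: zone reclaim swaps pages"
    return 0
}
```

Calling describe_zone_reclaim_mode 6 lists the write-out and swap flags (2 + 4).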

After modifying /etc/sysctl.conf, make the change take effect with:

# /sbin/sysctl -p
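When the setting has to be rolled out to several nodes, editing /etc/sysctl.conf by hand risks duplicate or conflicting entries. The following sketch updates an existing uncommented entry or appends a new one. The function name set_sysctl_entry is hypothetical, and the approach (rewriting the file through awk) is one possible choice, not a method taken from the original note.

```shell
# Hypothetical helper: set "key = value" in a sysctl.conf-style file.
# Replaces existing uncommented entries for the key, or appends one.
# $1 = file, $2 = key (e.g. vm.zone_reclaim_mode), $3 = value
set_sysctl_entry() {
    conf="$1"; key="$2"; value="$3"
    [ -f "$conf" ] || : > "$conf"
    _sse_tmp="$(mktemp)" || return 1
    awk -v key="$key" -v value="$value" '
        BEGIN { done = 0 }
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # ignore leading whitespace
            n = index(line, "=")
            name = (n > 0) ? substr(line, 1, n - 1) : ""
            sub(/[ \t]+$/, "", name)            # trim spaces before "="
            if (name == key) {                  # commented lines never match
                if (!done) { print key " = " value; done = 1 }
                next                            # drop duplicate entries
            }
            print
        }
        END { if (!done) print key " = " value }
    ' "$conf" > "$_sse_tmp" && cat "$_sse_tmp" > "$conf"
    _sse_status=$?
    rm -f "$_sse_tmp"
    return $_sse_status
}
```

For example, run set_sysctl_entry /etc/sysctl.conf vm.zone_reclaim_mode 6 as root, followed by /sbin/sysctl -p. Running it twice leaves a single entry.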

There is a related setting which determines how frequently local memory is scavenged:

# cat /proc/sys/vm/zone_reclaim_interval

This value can be decreased if unwanted off-node allocations still take place:

vm.zone_reclaim_interval = 10

This causes the reclamation scan to run every 10 seconds instead of the default 30 seconds.

For a more complete description of the possible values for these settings, ensure that the kernel-doc RPM package is installed and consult the /usr/share/doc/kernel-doc-2.6.18/Documentation/sysctl/vm.txt file.

The zone_reclaim_mode setting is a double-edged sword for performance.

Some file operations, such as copy, move, or backup, rely heavily on the system cache memory. With zone reclamation disabled (vm.zone_reclaim_mode=0), memory pressure can result. If /var/log/messages contains entries similar to:

swapper: page allocation failure. order:1, mode:0x20

these are symptomatic of memory pressure. In such a situation, enable zone reclamation by setting vm.zone_reclaim_mode=1 to allow the off-node allocations to succeed.
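To judge how often such failures occur, the log can be scanned for the message and tallied by allocation order. A minimal sketch, assuming the log lines carry the "page allocation failure" text and an order:N field as in the message shown in this note; the function name alloc_failures_by_order is hypothetical.

```shell
# Hypothetical helper: count "page allocation failure" lines per order.
# $1 (optional) is the log file; default is /var/log/messages.
alloc_failures_by_order() {
    log_file="${1:-/var/log/messages}"
    [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
    awk '/page allocation failure/ {
        order = "unknown"
        # "order:" is 6 characters long; keep only the digits after it.
        if (match($0, /order:[0-9]+/))
            order = substr($0, RSTART + 6, RLENGTH - 6)
        count[order]++
    }
    END { for (o in count) print "order " o ": " count[o] }' "$log_file" | sort
}
```

Empty output means no matching messages were found in the file.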