Automatic NUMA balancing in RHEL7

219 篇文章 2 订阅

 

Automatic NUMA balancing in RHEL7

 SOLUTION 已验证 - 已更新 2019年五月14日20:53 - 

English 

环境

  • Red Hat Enterprise Linux 7

问题

  • In which version, the automatic NUMA balancing for processes/memory available?

决议

  • NUMA-Aware Scheduling and Memory Allocation is available in Red Hat Enterprise Linux 7.
  • The kernel automatically relocates processes and memory between NUMA nodes in the same system, in order to improve performance on systems with non-uniform memory access (NUMA).

  • Related sysctl parameters are in below

Raw

kernel.numa_balancing = 0
kernel.numa_balancing_scan_delay_ms = 1000
kernel.numa_balancing_scan_period_max_ms = 60000
kernel.numa_balancing_scan_period_min_ms = 1000
kernel.numa_balancing_scan_size_mb = 256
kernel.numa_balancing_settle_count = 4
  • Below are the explanation for each parameters from kernel documentation

Raw

Automatic NUMA balancing scans tasks address space and unmaps pages to
detect if pages are properly placed or if the data should be migrated to a
memory node local to where the task is running.  Every "scan delay" the task
scans the next "scan size" number of pages in its address space. When the
end of the address space is reached the scanner restarts from the beginning.

In combination, the "scan delay" and "scan size" determine the scan rate.
When "scan delay" decreases, the scan rate increases.  The scan delay and
hence the scan rate of every task is adaptive and depends on historical
behaviour. If pages are properly placed then the scan delay increases,
otherwise the scan delay decreases.  The "scan size" is not adaptive but
the higher the "scan size", the higher the scan rate.

Higher scan rates incur higher system overhead as page faults must be
trapped and potentially data must be migrated. However, the higher the scan
rate, the more quickly a tasks memory is migrated to a local node if the
workload pattern changes and minimises performance impact due to remote
memory accesses. These sysctls control the thresholds for scan delays and
the number of pages scanned.
  • kernel.numa_balancing enables/disables automatic page fault based NUMA memory balancing. Memory is moved automatically to nodes that access it often. Enables/disables automatic NUMA memory balancing. On NUMA machines, there is a performance penalty if remote memory is accessed by a CPU. When this feature is enabled the kernel samples what task thread is accessing memory by periodically unmapping pages and later trapping a page fault. At the time of the page fault, it is determined if the data being accessed should be migrated to a local memory node. The unmapping of pages and trapping faults incur additional overhead that ideally is offset by improved memory locality but there is no universal guarantee. If the target workload is already bound to NUMA nodes then this feature should be disabled. Otherwise, if the system overhead from the feature is too high then the rate the kernel samples for NUMA hinting faults may be controlled by the numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb, and numa_balancing_settle_count sysctls.

  • numa_balancing_scan_period_min_ms is the minimum time in milliseconds to scan a tasks virtual memory. It effectively controls the maximum scanning rate for each task.

  • numa_balancing_scan_delay_ms is the starting "scan delay" used for a task when it initially forks.

  • numa_balancing_scan_period_max_ms is the maximum time in milliseconds to scan a tasks virtual memory. It effectively controls the minimum scanning rate for each task.

  • numa_balancing_scan_size_mb is how many megabytes worth of pages are scanned for a given scan.

  • numa_balancing_settle_count is how many scan periods must complete before the schedule balancer stops pushing the task towards a preferred node. This gives the scheduler a chance to place the task on an alternative node if the preferred node is overloaded.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值