The use of huge pages can significantly increase the performance of many workloads by reducing both memory-management overhead in the kernel and pressure on the system's translation lookaside buffer (TLB). The addition of transparent huge pages (THP) for the 2.6.38 kernel release in 2011 caused the kernel to allocate huge pages automatically to make their benefits available to all workloads without any effort needed on the user-space side. But it turns out that use of huge pages can make some workloads slower as the result of internal memory fragmentation, so the THP feature is often disabled. Two patch sets aimed at better targeting the use of transparent huge pages are currently working their way through the review process.
Over the years, the kernel has evolved a number of ways to control the use of THP; they are described in Documentation/admin-guide/mm/transhuge.rst. At the global level, the /sys/kernel/mm/transparent_hugepage/enabled knob controls behavior system-wide. It can be set to "always" or "never" with obvious results. This knob also supports the "madvise" setting, which only enables THP for processes that explicitly opt in for specific memory regions with a call to madvise(). The kernel, in other words, allows for the imposition of a system-wide policy, with the possibility of restricting THP usage to places where applications have explicitly enabled it.
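As a rough illustration of the two levels described above, the following Python sketch reads the system-wide knob and then opts a single anonymous mapping in with madvise(). It assumes a Linux system and Python 3.8 or later; the helper names (active_setting, read_thp_mode, map_with_thp_hint) are invented for this example and are not part of any kernel or library interface. The sysfs file lists all possible values and marks the active one with square brackets, which is what the parser relies on.

```python
import mmap
from pathlib import Path

# Global THP policy knob named in the text above.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_setting(knob_text):
    """Return the bracketed (active) value from text like 'always [madvise] never'."""
    for word in knob_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_thp_mode(path=THP_ENABLED):
    """Return the current system-wide mode, or None if the knob is unavailable."""
    try:
        return active_setting(path.read_text())
    except OSError:
        return None  # kernel built without THP, or not a Linux system


def map_with_thp_hint(length):
    """Create a private anonymous mapping and request huge pages for it.

    Returns (mapping, hinted); hinted is False when the platform lacks
    MADV_HUGEPAGE or the kernel rejects the request.
    """
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    advice = getattr(mmap, "MADV_HUGEPAGE", None)
    if advice is None or not hasattr(region, "madvise"):
        return region, False
    try:
        region.madvise(advice)  # explicit opt-in for this region only
    except OSError:
        return region, False
    return region, True


if __name__ == "__main__":
    print("system-wide THP mode:", read_thp_mode())
    region, hinted = map_with_thp_hint(8 * 1024 * 1024)
    print("MADV_HUGEPAGE accepted:", hinted)
    region.close()
```

With the knob set to "madvise", only regions hinted this way are candidates for huge pages; with "never", the hint is accepted but has no effect. A successful madvise() call is only a request, so the sketch reports whether the hint was accepted, not whether huge pages were actually used.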
Tweaking prctl()
There are more control points for THP usage, though, including a whole set of knobs for the khugepaged kernel thread (which builds huge pages out of base pages in the background) and a set of kernel command-line options. There is also the PR_SET_THP_DISABLE option to prctl(), which lets a process disa