Configuring HugePages on Linux
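Linux may not be as heavyweight as AIX or HP-UX, but it is an excellent system in its own right, and it uses a number of I/O and memory scheduling mechanisms to improve performance.
Linux manages memory as virtual memory (VM): physical RAM and swap are presented together as one pool of memory, so memory that a process appears to be using may in fact sit on disk. How can we avoid falling back on swap space?
Linux manages memory in units of pages, normally 4 KB each. With large memory (more than 8 GB), tracking that many pages puts a heavy load on the system,
and frequent page-in/page-out activity on top of that can turn memory management into a bottleneck.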
1. Introduction to HugePages
2. Practical configuration
1. Introduction to HugePages
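HugePages were introduced in the Linux 2.6 kernel. They offer a choice between the regular 4 KB page and larger page sizes.
When memory is accessed, the virtual address is first looked up in the page table, and the mapping found there leads to the real storage (RAM or swap).
To speed this up, the CPU contains a fixed-size cache called the TLB (Translation Lookaside Buffer), which holds part of the page table. This follows the principle of keeping data as close to the CPU as possible.
TLB entries point to huge pages through the hugetlb mechanism. The allocated huge pages are offered to processes through hugetlbfs, an in-memory filesystem similar to tmpfs.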
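To see whether a hugetlbfs filesystem is mounted on a given machine, one can filter the mount table by filesystem type. The following is a minimal sketch; the function name hugetlbfs_mounts and the sample mount points are made up for illustration, and the demo runs on sample data rather than on the live system.

```shell
#!/bin/sh
# List hugetlbfs mount points from a mounts table (default: /proc/mounts).
# Fields per line: device, mount point, filesystem type, options, dump, pass.
hugetlbfs_mounts() {
    awk '$3 == "hugetlbfs" { print $2 }' "${1:-/proc/mounts}"
}

# Demo on sample data (hypothetical mount points) so the sketch runs anywhere.
sample=$(mktemp)
cat > "$sample" <<'EOF'
tmpfs /dev/shm tmpfs rw 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
EOF
hugetlbfs_mounts "$sample"
rm -f "$sample"
```

On a real system, calling hugetlbfs_mounts without an argument reads /proc/mounts. An empty result only means that no hugetlbfs is mounted; it does not by itself say whether a huge page pool has been reserved.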
[Figure: regular 4 KB pages]
[Figure: with HugePages enabled]
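Characteristics of HugePages
Huge pages are allocated and reserved when Linux boots and are never paged in or out unless someone intervenes, for example by changing the HugePages configuration.
Depending on the kernel version and the hardware architecture, the huge page size ranges from 2 MB to 256 MB. Because each page is larger, far fewer pages are needed,
which reduces the management pressure on the TLB and the page table.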
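A quick calculation shows how much the page count drops. The sketch below assumes a 16 GB region (an illustrative figure, not taken from the systems in this article) and compares 4 KB pages with 2 MB huge pages; pages_needed is a helper name introduced here.

```shell
#!/bin/sh
# Number of pages needed to map a region, rounded up.
# $1 = region size in KB, $2 = page size in KB
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

mem_kb=$(( 16 * 1024 * 1024 ))   # 16 GB expressed in KB (illustrative)
echo "4 KB pages: $(pages_needed "$mem_kb" 4)"      # 4194304
echo "2 MB pages: $(pages_needed "$mem_kb" 2048)"   # 8192
```

The same memory needs 512 times fewer entries with 2 MB pages, which is where the reduced TLB and page table load comes from.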
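Why use HugePages
With large memory (more than 8 GB), HugePages are very helpful for Oracle performance on Linux:
1) Larger page size and fewer pages: fewer pages cover the same memory, which reduces the work the TLB has to do.
2) No page table lookups: because huge pages are not swappable, they are never subject to page replacement, so the page table lookups that replacement would require do not happen.
3) No swapping: on Linux, huge pages cannot be swapped out.
4) No kswapd operations: on Linux the kswapd process manages swapping. With large memory the number of regular pages is very high, so kswapd is invoked frequently and hurts performance; huge pages are not handled by kswapd.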
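One way to judge how much swap activity a system has seen is to read the cumulative pswpin and pswpout counters from /proc/vmstat. This is a small sketch only; swap_counters is a name introduced here, and the counters are totals since boot, so two readings taken some time apart are needed to see the current rate.

```shell
#!/bin/sh
# Print cumulative swap-in / swap-out page counts from a vmstat-style file
# (default: /proc/vmstat), one "name=value" pair per line.
swap_counters() {
    awk '$1 == "pswpin" || $1 == "pswpout" { print $1 "=" $2 }' "${1:-/proc/vmstat}"
}

# Only query the live system if the file is available.
if [ -r /proc/vmstat ]; then
    swap_counters
fi
```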
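2. Practical configuration
(0) Check the kernel version with uname -r, and list the shared memory segments with ipcs -m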
2.6.18-128.el5
[root@node2 ~]# ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 32768 gdm 600 393216 2 dest
0x7a1b43dc 98305 grid 660 4096 0
0x596be9dc 622594 oracle 660 4833935360 32
(1) Before configuration
[oracle@db101 ~]$ grep HugePages /proc/meminfo
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
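The same fields can be read in a script, for example to report the size of the reserved pool in MB. The helpers below (meminfo_field and hugepage_pool_mb) are illustrative names, not part of any standard tool; missing fields are treated as 0.

```shell
#!/bin/sh
# Read one numeric field (e.g. HugePages_Total) from a meminfo-style file.
# $1 = field name without the colon, $2 = file (default: /proc/meminfo)
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Size of the reserved huge page pool in MB: HugePages_Total * Hugepagesize.
# $1 = file (default: /proc/meminfo)
hugepage_pool_mb() {
    total=$(meminfo_field HugePages_Total "$1")
    size_kb=$(meminfo_field Hugepagesize "$1")
    echo $(( ${total:-0} * ${size_kb:-0} / 1024 ))
}

if [ -r /proc/meminfo ]; then
    echo "huge page pool: $(hugepage_pool_mb) MB"
fi
```

With the values shown above (HugePages_Total 0), the pool size is 0 MB, which confirms that nothing has been reserved yet.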
(2) First, edit limits.conf
[root@db101 ~]# vi /etc/security/limits.conf
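## Set memlock to cover SGA_MAX_SIZE. The unit is KB; the values below lock about 15 GB of memory.
Use free -t to get the system's memory totals.
##zengmuansha add 0122
oracle soft memlock 15826672
oracle hard memlock 15826672
As the oracle user, check the limit with ulimit -l (ulimit -t reports the CPU time limit, not the locked-memory limit).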
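The relationship between the memlock limit and the SGA can be checked with a little arithmetic. The sketch below assumes the limit is given in KB (as ulimit -l prints it) and the SGA in MB; memlock_covers_sga is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds if a memlock limit covers an SGA of the given size.
# $1 = memlock limit in KB, or the word "unlimited"
# $2 = SGA size in MB
memlock_covers_sga() {
    if [ "$1" = "unlimited" ]; then
        return 0
    fi
    [ "$1" -ge $(( $2 * 1024 )) ]
}

# Values from this article: memlock 15826672 KB, SGA 15455 MB (= 15825920 KB).
if memlock_covers_sga 15826672 15455; then
    echo "memlock is sufficient"
else
    echo "memlock is too small"
fi
```

Run as the oracle user, memlock_covers_sga "$(ulimit -l)" 15455 would test the limit that is actually in effect for the session.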
(3) [Oracle 11g] AMM (Automatic Memory Management) must be disabled before HugePages can be used
Set the following initialization parameters:
ALTER SYSTEM SET sga_max_size=15455M SCOPE=SPFILE;
ALTER SYSTEM SET sga_target=15455M SCOPE=SPFILE;
ALTER SYSTEM SET PGA_AGGREGATE_TARGET=2048M SCOPE=SPFILE;
ALTER SYSTEM SET memory_target=0 SCOPE=SPFILE;
ALTER SYSTEM SET memory_max_target=0 SCOPE=SPFILE;
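In 11.2.0.1, setting MEMORY_TARGET=0 has no effect. The parameter has to be removed or commented out in an init.ora pfile, and the SPFILE regenerated from that pfile.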
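One possible way to do the pfile edit is to export the SPFILE with CREATE PFILE ... FROM SPFILE, filter out the two AMM parameters, and rebuild with CREATE SPFILE FROM PFILE. The filter step might look like the sketch below. The function name strip_amm_params and the sample pfile contents are assumptions made for illustration; review the result by hand before rebuilding the SPFILE.

```shell
#!/bin/sh
# Print a text pfile with MEMORY_TARGET / MEMORY_MAX_TARGET settings removed.
# Lines where the parameter only appears after a '#' comment marker are kept.
# $1 = path to the pfile (init.ora)
strip_amm_params() {
    grep -viE '^[^#]*memory_(max_)?target[[:space:]]*=' "$1"
}

# Demo on a hypothetical pfile so the sketch runs without a database.
pfile=$(mktemp)
cat > "$pfile" <<'EOF'
*.memory_target=0
*.memory_max_target=0
*.sga_max_size=15455M
*.sga_target=15455M
*.pga_aggregate_target=2048M
EOF
strip_amm_params "$pfile"
rm -f "$pfile"
```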
(4) Configure the number of huge pages to allocate
Formula for nr_hugepages: nr_hugepages >= SGA (MB) / Hugepagesize (MB)
echo "vm.nr_hugepages=3872" >> /etc/sysctl.conf
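As a worked example of the formula, the sketch below computes the smallest nr_hugepages for a given SGA, rounding up. min_nr_hugepages is a helper name introduced here. Note that the 3872 in the sysctl line above presumably corresponds to a different SGA size than the 15455 MB set in step (3), so the value should be recomputed for each system.

```shell
#!/bin/sh
# Smallest nr_hugepages with nr_hugepages >= SGA(MB) / Hugepagesize(MB).
# $1 = SGA size in MB
# $2 = huge page size in KB (the Hugepagesize value in /proc/meminfo)
min_nr_hugepages() {
    sga_kb=$(( $1 * 1024 ))
    echo $(( (sga_kb + $2 - 1) / $2 ))
}

min_nr_hugepages 15455 2048   # prints 7728
```

After editing /etc/sysctl.conf, the new value is loaded at the next boot; sysctl -p applies it immediately, although on a running system the kernel may not find enough contiguous free memory to reserve every page.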
The script below must be run as the oracle user, with all instances already started and AMM disabled.
hugepages_settings.sh
An annotated earlier version of the script:
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
# Check for the kernel version
# Annotation: check the Linux kernel version, because 2.4 and 2.6 use different
# hugepages parameters: 2.4 uses vm.hugetlb_pool, 2.6 uses vm.nr_hugepages.
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
# Annotation: Hugepagesize is 4096 on x86 without PAE and 2048 on x86 with PAE
# and on x86_64. The unit is KB.
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
# Start from 1 pages to be on the safe side and guarantee 1 free HugePage
# Annotation: the count starts at 1 so that at least one page is guaranteed;
# in MOS Doc ID 401749.1 the counter starts at 0.
NUM_PG=1
# Cumulative number of pages required to handle the running shared memory segments
# Annotation: ipcs -m | awk '{print $5}' | grep "[0-9][0-9]*" lists the size in
# bytes of every shared memory segment. echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q
# divides one segment by the size of a single huge page, which gives the number
# of huge pages that segment needs. The loop adds these up to get the total.
for SEG_BYTES in `ipcs -m | awk '{print $5}' | grep "[0-9][0-9]*"`
do
MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
if [ $MIN_PG -gt 0 ]; then
NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
fi
done
# Finish with results
# Annotation: print the parameter that matches the kernel version.
case $KERN in
'2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
'2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
*) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
# End
The full version of the script, as provided by Doc ID 401749.1 from My Oracle Support:
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please note following:
* For ASM instance, it needs to configure ASMM instead of AMM.
* The 'pga_aggregate_target' is outside the SGA and
you should accommodate this while calculating SGA size.
* In case you changes the DB SGA size,
as the new SGA will not fit in the previous HugePages configuration,
it had better disable the whole HugePages,
start the DB with new SGA size and run the script again.
And make sure that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
echo "The hugepages may not be supported in the system where the script is being executed."
exit 1
fi
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
if [ $MIN_PG -gt 0 ]; then
NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
echo "***********"
echo "** ERROR **"
echo "***********"
echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
# ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured"
exit 1
fi
# Finish with results
case $KERN in
'2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
'2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
'2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
'3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
*) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
# End
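To make the rounding in the script's loop concrete, here is a sketch of the same accumulation applied to the three segment sizes from the ipcs -m output in step (0). It uses shell arithmetic instead of bc, reads the sizes from standard input, and starts the counter at 0 as the full script does; sum_hugepages is a name introduced for this example.

```shell
#!/bin/sh
# Sum the huge pages needed for shared memory segments.
# Segment sizes in bytes are read from stdin, one per line.
# $1 = huge page size in KB
# As in the script above, segments smaller than one huge page are skipped
# and one extra page is added for every segment that is counted.
sum_hugepages() {
    hpg_bytes=$(( $1 * 1024 ))
    num_pg=0
    while read -r seg_bytes; do
        min_pg=$(( seg_bytes / hpg_bytes ))
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# Segment sizes from the ipcs -m listing in step (0), with 2048 KB huge pages.
printf '393216\n4096\n4833935360\n' | sum_hugepages 2048   # prints 2306
```

Only the 4833935360-byte segment is large enough to count: it occupies exactly 2305 huge pages, and the extra page brings the recommendation to 2306.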
(5) Reboot the system
reboot
(6) Start the database
sqlplus / as sysdba
startup
(7) Verify that the setting took effect
root@db101:[/root]grep HugePages /proc/meminfo
HugePages_Total: 3890
HugePages_Free: 17
HugePages_Rsvd: 0
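For the HugePages configuration to be effective, HugePages_Free should be smaller than HugePages_Total. HugePages_Rsvd counts pages that are reserved for the SGA but not yet touched; these are still included in HugePages_Free, so in a tightly sized pool HugePages_Free is close to HugePages_Rsvd.
HugePages_Free and HugePages_Rsvd should both be smaller than the number of pages allocated for the SGA.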
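The check can be expressed as a small classification of the two counters. This is only a sketch: hugepage_status and its messages are invented for this example, and the inputs are the HugePages_Total and HugePages_Free values read from /proc/meminfo.

```shell
#!/bin/sh
# Classify a huge page pool from its counters.
# $1 = HugePages_Total, $2 = HugePages_Free
hugepage_status() {
    if [ "$1" -eq 0 ]; then
        echo "no huge pages configured"
    elif [ "$1" -eq "$2" ]; then
        echo "configured but unused"
    else
        echo "in use: $(( $1 - $2 )) of $1 pages"
    fi
}

hugepage_status 3890 17   # prints "in use: 3873 of 3890 pages"
```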
Before 11.2.0.2, a database's SGA could only use huge pages for all of its memory or not use them at all. From 11.2.0.2 onward, Oracle added a new parameter, USE_LARGE_PAGES, to control how the database uses huge pages.
2.8 Troubleshooting
Some common problems are listed below:
Symptom | Possible Cause | Troubleshooting Action |
System is running out of memory or swapping | Not enough HugePages to cover the SGA(s), so the SGAs are allocated from regular pages and the area reserved for HugePages is wasted | Review your HugePages configuration to make sure that all SGA(s) are covered. |
Databases fail to start | memlock limits are not set properly | Make sure the settings in limits.conf apply to the database owner account. |
One of the databases fails to start while another is up | The SGA of the specific database could not find available HugePages and the remaining RAM is not enough. | Make sure that the RAM and HugePages are enough to cover all your database SGAs. |
Cluster Ready Services (CRS) fail to start | HugePages configured too large (maybe larger than installed RAM) | Make sure the total SGA is less than the installed RAM and re-calculate HugePages. |
HugePages_Total = HugePages_Free | HugePages are not used at all. No database instances are up, or they are using AMM. | Disable AMM and make sure that the database instances are up. |
Database started successfully but performance is slow | The SGA of the specific database could not find available HugePages and is therefore handled by regular pages, which leads to slow performance | Make sure that there are enough HugePages to cover all your database SGAs. |