HugePages on Linux
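HugePages is a Linux kernel feature that lets memory be managed in pages larger than the traditional 4 KB page. Using HugePages brings the following main benefits:
1. Not swappable: HugePages are not swappable, so there is no page-in/page-out overhead. HugePages are universally regarded as pinned.
2. Relief of TLB pressure: each TLB entry covers a larger range of memory, so fewer entries are needed to map the same amount of memory and the TLB hit rate improves.
3. Reduced page table overhead: each page table entry can take up to 64 bytes. Managing 50 GB of physical memory with traditional 4 KB pages needs a page table of about 800 MB, whereas with HugePages only about 40 MB is needed.
4. Better memory performance and lower CPU load, for the same reasons as above.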
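The page table figure can be checked with a little arithmetic. The sketch below assumes one 64-byte entry per page and nothing else (no multi-level table overhead); pagetable_bytes is a hypothetical helper name, not part of any tool. Note that 2 MB pages alone give a far smaller result than the 40 MB quoted above, so that figure presumably allows for part of the memory still being mapped with 4 KB pages; that reading is an assumption.

```shell
#!/bin/bash
# Rough page table size: one entry per page, entry_bytes per entry.
# Arguments: memory in GB, page size in KB, entry size in bytes.
pagetable_bytes() {
    local mem_gb=$1 page_kb=$2 entry_bytes=$3
    local pages=$(( mem_gb * 1024 * 1024 / page_kb ))
    echo $(( pages * entry_bytes ))
}

bytes_4k=$(pagetable_bytes 50 4 64)
echo "4 KB pages: $(( bytes_4k / 1024 / 1024 )) MB"   # 800 MB

bytes_2m=$(pagetable_bytes 50 2048 64)
echo "2 MB pages: $(( bytes_2m / 1024 )) KB"          # 1600 KB
```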
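HugePages and Oracle AMM (Automatic Memory Management) are mutually exclusive, so to use HugePages the memory parameters MEMORY_TARGET / MEMORY_MAX_TARGET must be set to 0.
Steps to configure HugePages
1. Set the memlock limit. The unit is KB. The value should be somewhat smaller than physical memory: on a host with 128 GB of RAM where 100 GB is to be locked, use the value below. Setting the limit larger than the SGA does no harm.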
vi /etc/security/limits.conf
* soft memlock 104857600
* hard memlock 104857600
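The memlock value is just the planned size converted from GB to KB. A minimal sketch of that conversion follows; gb_to_kb is a hypothetical helper name, and the 100 GB figure is the one used in this example.

```shell
#!/bin/bash
# Convert a size in GB to KB, the unit limits.conf expects for memlock.
gb_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the two limits.conf lines for a 100 GB lock limit.
echo "* soft memlock $(gb_to_kb 100)"
echo "* hard memlock $(gb_to_kb 100)"
```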
2. Log in again as the database account and verify the limit
[oracle@dtydb5 ~]$ ulimit -a|grep lock
core file size (blocks, -c) 0
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) 104857600
file locks (-x) unlimited
3. If AMM is in use, disable it: reset MEMORY_TARGET and MEMORY_MAX_TARGET so that they are 0, and size the SGA and PGA explicitly
SQL> alter system reset memory_target scope=spfile ;
SQL> alter system reset memory_max_target scope=spfile;
SQL> alter system set sga_target = 10G scope=spfile;
SQL> alter system set pga_aggregate_target = 4G scope = spfile;
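4. Calculate how many huge pages are needed. HugePages can currently be used only for a few kinds of memory such as shared memory segments, for example the Oracle SGA; the PGA cannot use them. Memory reserved as huge pages is generally unavailable for other purposes, so a value that is too small will not hold all the shared memory segments, and a value that is too large wastes memory.
Check the current huge page size: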
[root@dtydb5 ~]# grep Hugepagesize /proc/meminfo
Hugepagesize: 2048 kB
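The simple rule is: total SGA_MAX_SIZE (summed over all instances) / Hugepagesize + N
N is a small safety margin; about 100 extra pages is usually enough. For example, on a host with 128 GB of RAM where 70 GB is planned for SGA shared memory, the number of 2 MB huge pages needed is 70 × 1024 / 2 = 35840, plus the margin.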
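The rule above can be sketched as a small function. hugepages_needed is a hypothetical helper name; it takes the total SGA in GB, the huge page size in KB (the Hugepagesize value from /proc/meminfo) and the margin N, and rounds the division up so that a partial page still counts as one.

```shell
#!/bin/bash
# Number of huge pages for a given total SGA size.
# Arguments: total SGA in GB, Hugepagesize in KB, margin N in pages.
hugepages_needed() {
    local sga_gb=$1 hp_kb=$2 margin=$3
    local sga_kb=$(( sga_gb * 1024 * 1024 ))
    # Round up to whole pages, then add the margin.
    echo $(( (sga_kb + hp_kb - 1) / hp_kb + margin ))
}

hugepages_needed 70 2048 0     # 35840, the figure from the text
hugepages_needed 70 2048 100   # 35940, with a margin of 100 pages
```

On a real host the second argument could be read with awk '/^Hugepagesize:/ {print $2}' /proc/meminfo instead of being typed in.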
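Alternatively, use the script provided by Oracle, which works by summing the sizes of the shared memory segments reported by ipcs -m. Make sure AMM is disabled before running it.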
vi hugepages_settings.sh
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com

# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m

Press Enter to proceed..."

read

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
case $KERN in
    '2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
esac

# End
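The heart of the script is its per-segment loop: for every shared memory segment at least one huge page in size, it counts the whole pages that fit and adds one more; smaller segments are ignored. The sketch below isolates that logic so it can be tried without a running instance. pages_for_segments is a hypothetical name, the segment sizes are made-up sample values rather than real ipcs -m output, and shell arithmetic stands in for the bc calls of the original.

```shell
#!/bin/bash
# Same counting rule as the loop in hugepages_settings.sh.
# Arguments: Hugepagesize in KB, then one size in bytes per segment.
pages_for_segments() {
    local hpg_kb=$1; shift
    local num_pg=0 seg min_pg
    for seg in "$@"; do
        # Whole huge pages that fit in this segment.
        min_pg=$(( seg / (hpg_kb * 1024) ))
        if [ "$min_pg" -gt 0 ]; then
            # One extra page per counted segment, as in the original.
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# Hypothetical sample: one 10 GB segment and one 4 KB segment.
pages_for_segments 2048 10737418240 4096   # 5121
```

The extra page per segment means the result is slightly above the exact quotient even when a segment is an exact multiple of the huge page size, which acts as a small built-in margin.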
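5. Set vm.nr_hugepages to the value computed in the previous step
vm.nr_hugepages is a number of pages, not a size. To reserve 70 GB of huge pages with a 2048 KB page size, the value is 70 × 1024 × 1024 KB / 2048 KB = 35840.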
vi /etc/sysctl.conf
vm.nr_hugepages = 35840
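Run sysctl -p to apply the setting.
6. Shut down the database, then reboot the host and start the database again. In theory the host does not need a reboot, but one is recommended: on a long-running system memory may be too fragmented for the kernel to allocate all the requested huge pages.
7. Verify that the configuration is correct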
grep HugePages /proc/meminfo
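If HugePages_Free is smaller than HugePages_Total, the configuration has taken effect and the huge pages are being used. HugePages_Rsvd should show only a small number of reserved pages.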
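The check above can be sketched as a small filter over /proc/meminfo-style text. hugepages_in_use is a hypothetical helper name, and the numbers piped into it are made-up sample values, not output from a real host.

```shell
#!/bin/bash
# Summarize huge page counters from /proc/meminfo-style text on stdin.
# "used" is Total minus Free; a value above 0 means pages are in use.
hugepages_in_use() {
    awk '/^HugePages_Total:/ {total=$2}
         /^HugePages_Free:/  {free=$2}
         /^HugePages_Rsvd:/  {rsvd=$2}
         END {printf "total=%d free=%d rsvd=%d used=%d\n", total, free, rsvd, total-free}'
}

# Hypothetical sample input; on a Linux host use: hugepages_in_use < /proc/meminfo
printf 'HugePages_Total:   35840\nHugePages_Free:     1200\nHugePages_Rsvd:     1100\n' | hugepages_in_use
```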
Note: an inappropriate HugePages configuration can degrade system performance, so proceed with care.
References
HugePages on Linux: What It Is... and What It Is Not... [ID 361323.1]
HugePages on Oracle Linux 64-bit [ID 361468.1]
Reposted from: http://blog.csdn.net/hijk139/article/details/7656491
See also:
Official documentation: http://docs.oracle.com/cd/E37670_01/E37355/html/ol_config_hugepages.html
ML's RAC configuration example: http://www.askmaclean.com/archives/exadata-linux-huge-page.html#userconsent#
Two more examples from other users:
http://blog.chinaunix.net/uid-28460966-id-4311132.html
http://www.leodba.com/archives/42#userconsent#