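Background:
Red Hat 4.8 + Oracle 11g R2
I am preparing to install Oracle RAC, but while installing the grid CRS, the last step (running root.sh) failed.
The specific errors are as follows: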
/opt/app/grid/cfgtoollogs/asmca/asmca-1007271PM3756.log
......
[main] [ 2010-07-27 13:37:57.828 CST ] [SQLEngine.initialize:317] Execing SQLPLUS/SVRMGR process...
[main] [ 2010-07-27 13:37:57.832 CST ] [SQLEngine.initialize:354] m_bReaderStarted: false
[main] [ 2010-07-27 13:37:57.832 CST ] [SQLEngine.initialize:358] Starting Reader Thread...
[main] [ 2010-07-27 13:37:57.847 CST ] [UsmcaLogger.logExit:122] Exiting oracle.sysman.assistants.usmca.backend.USMInstance Method : createSQLEngine
[main] [ 2010-07-27 13:37:57.848 CST ] [OracleHome.getVersion:877] OracleHome.getVersion called. Current Version: null
[main] [ 2010-07-27 13:37:57.851 CST ] [InventoryUtil.getOUIInvSession:347] setting OUI READ level to ACCESSLEVEL_READ_LOCKLESS
[main] [ 2010-07-27 13:37:57.852 CST ] [OracleHome.getVersion:896] Homeinfo /opt/app/11.2.0/grid,1
[main] [ 2010-07-27 13:37:57.962 CST ] [OracleHome.getVersion:943] Current Version From Inventory: null
[main] [ 2010-07-27 13:37:57.963 CST ] [OracleHome.getVersion:948] using sqlplus: /opt/app/11.2.0/grid/bin/sqlplus
[main] [ 2010-07-27 13:37:57.963 CST ] [OracleHome.getVersion:981] adding oracle home to sqlplus env
[main] [ 2010-07-27 13:37:57.983 CST ] [OracleHome.getVersion:988] /opt/app/11.2.0/grid/bin/sqlplus Banner:
SQL*Plus: Release 11.2.0.1.0 Production
[main] [ 2010-07-27 13:37:57.984 CST ] [OracleHome.getVersion:1006] Current version from sqlplus: 11.2.0.1.0
[main] [ 2010-07-27 13:37:57.985 CST ] [UsmcaLogger.logInfo:141] Role SYSASM
[main] [ 2010-07-27 13:37:57.985 CST ] [UsmcaLogger.logInfo:141] OS Auth true
[main] [ 2010-07-27 13:38:00.207 CST ] [SQLEngine.done:2148] Done called
[main] [ 2010-07-27 13:38:00.207 CST ] [USMInstance.configureLocalASM:2743] ORA-00600: internal error code, arguments: [SKGMHASH], [1], [1917574924], [0], [0], [], [], [], [], [], [], []
......
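To pull just the Oracle error codes out of a long asmca log like the one above, something along these lines may help. This is a minimal sketch: `ora_codes` is a hypothetical helper name, and it assumes only that errors appear in the `ORA-NNNNN` form seen in the excerpt.

```shell
# Hypothetical helper: list the distinct ORA- error codes found in a log file.
# $1 = path to the log (for example the asmca log named above).
ora_codes() {
    grep -o 'ORA-[0-9][0-9]*' "$1" | sort -u
}
```

Run against the lines shown above, it prints ORA-00600. The elided parts of the log may of course contain other codes.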
Below is an excerpt from rootcrs_%hostname%.log:
2010-07-27 13:37:54: CRS-2672: Attempting to start 'ora.cssd' on 'linux41'
2010-07-27 13:37:54: CRS-2672: Attempting to start 'ora.diskmon' on 'linux41'
2010-07-27 13:37:54: CRS-2676: Start of 'ora.diskmon' on 'linux41' succeeded
2010-07-27 13:37:54: CRS-2676: Start of 'ora.cssd' on 'linux41' succeeded
2010-07-27 13:37:54: Querying for existing CSS voting disks
2010-07-27 13:37:54: Performing initial configuration for cluster
2010-07-27 13:37:55:Start of resource "ora.ctssd -init" Succeeded
2010-07-27 13:37:55: Configuring ASM via ASMCA
2010-07-27 13:37:55: Executing as grid: /opt/app/11.2.0/grid/bin/asmca -silent -diskGroupName CRS -diskList /dev/oracleasm/disks/CRSVOL1 -redundancy EXTERNAL -diskString '/dev/oracleasm/disks' -configureLocalASM
2010-07-27 13:37:55: Running as user grid: /opt/app/11.2.0/grid/bin/asmca -silent -diskGroupName CRS -diskList /dev/oracleasm/disks/CRSVOL1 -redundancy EXTERNAL -diskString '/dev/oracleasm/disks' -configureLocalASM
2010-07-27 13:37:55: Invoking "/opt/app/11.2.0/grid/bin/asmca -silent -diskGroupName CRS -diskList /dev/oracleasm/disks/CRSVOL1 -redundancy EXTERNAL -diskString '/dev/oracleasm/disks' -configureLocalASM" as user "grid"
2010-07-27 13:38:00: Configuration of ASM failed, see logs for details
2010-07-27 13:38:00: Did not succssfully configure and start ASM
2010-07-27 13:38:00: Exiting exclusive mode
2010-07-27 13:38:00: Command return code of 1 (256) from command: /opt/app/11.2.0/grid/bin/crsctl stop resource ora.crsd -init
2010-07-27 13:38:00: Stop of resource "ora.crsd -init" failed
2010-07-27 13:38:00: Failed to stop CRSD
2010-07-27 13:38:00: Command return code of 1 (256) from command: /opt/app/11.2.0/grid/bin/crsctl stop resource ora.asm -init
2010-07-27 13:38:00: Stop of resource "ora.asm -init" failed
2010-07-27 13:38:00: Failed to stop ASM
2010-07-27 13:38:19: Initial cluster configuration failed. See /opt/app/11.2.0/grid/cfgtoollogs/crsconfig/rootcrs_linux41.log for details
Following the hints in Oracle's official documentation on the ORA-600 error, I made the two attempts below. Neither got root.sh to pass.
Below is Oracle's guide:
Symptom(s)
~~~~~~~~~~
+ ORA-600 [SKGMHASH], [1], [2541092464], [0], [0], [], [], []
+ Error occurs during Startup
Cause
~~~~~~~
If shared memory segments and semaphores are cleaned out properly after
the shutdown then, the possible causes for this error are:
1. The shared memory and semaphore values are not set as recommended
2. Not enough Swap space
3. Ulimit value not set properly
4. Low Physical memory
Fix
~~~~
Note 201370.1 LINUX: Quick Start Guide - 9.2.0 RDBMS Installation..
+ The recommended settings for shared memory and semaphores are
SHMMAX = 2147483648
SHMMIN = 1
SHMMNI = 100
SEMMNS = 1000
SEMMSL = 250
SEMMNI = 100
SEMOPM = 100
Note: In Linux the new kernel settings are NOT persistent and must be reset
after each re-boot of the server.
You can set these parameters in /etc/sysctl.conf file , so that the kernel
parameter values are set when the system reboots
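Since the guide relies on /etc/sysctl.conf for persistence, it can be worth confirming which value of a key actually ends up in effect there, especially after several edits. The following is a minimal sketch; `sysctl_conf_get` is a hypothetical helper, and it assumes the usual `key = value` layout with `#` or `;` comment lines.

```shell
# Hypothetical helper: print the effective value of one key in a
# sysctl.conf-style file. `sysctl -p` applies lines in order, so when a key
# appears more than once the last occurrence is the one that sticks.
# $1 = key (e.g. kernel.shmmax), $2 = file (defaults to /etc/sysctl.conf)
sysctl_conf_get() {
    awk -F'=' -v k="$1" '
        /^[ \t]*[#;]/ { next }            # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)       # strip whitespace around the key
            if (name == k) {
                val = $2
                sub(/^[ \t]+/, "", val)   # trim leading whitespace
                sub(/[ \t]+$/, "", val)   # trim trailing whitespace
                last = val
            }
        }
        END { if (last != "") print last }
    ' "${2:-/etc/sysctl.conf}"
}
```

For example, `sysctl_conf_get kernel.shmmax` prints the value that a later `sysctl -p` or reboot would leave in place. It prints nothing if the key is not set in the file.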
The two combinations I tried:
Attempt 1:
kernel.shmmax = 8589934592
kernel.shmall = 2097152
Attempt 2:
kernel.shmall = 4194304
kernel.shmmax = 4294967296
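One thing to keep in mind when comparing these numbers: kernel.shmall is counted in pages while kernel.shmmax is in bytes, so the two are only comparable after multiplying shmall by the page size. The sketch below does that arithmetic. It makes two assumptions: that the page size is 4096 bytes (check with `getconf PAGE_SIZE`), and that the four values above form two pairs. The helper names are hypothetical.

```shell
# Hypothetical helpers; names are illustrative, not part of any Oracle tool.
# shmall_bytes PAGES PAGE_SIZE -> total shared memory allowed, in bytes
shmall_bytes() {
    echo $(( $1 * $2 ))
}

# shm_pair_ok SHMALL_PAGES SHMMAX_BYTES PAGE_SIZE
# Succeeds when the system-wide limit (shmall) can hold one segment of
# the maximum size (shmmax).
shm_pair_ok() {
    [ "$(shmall_bytes "$1" "$3")" -ge "$2" ]
}

# Page size assumed to be 4096 here; on the real host use: getconf PAGE_SIZE
shm_pair_ok 2097152 8589934592 4096 && echo "pair 1 consistent"
shm_pair_ok 4194304 4294967296 4096 && echo "pair 2 consistent"
```

Under those assumptions both pairs pass (2097152 pages is exactly 8589934592 bytes, and 4194304 pages is 17179869184 bytes). So the values are at least consistent with each other. This does not show that they are sufficient for ASM to start.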
Below are the system's current kernel parameters:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
kernel.shmall = 4194304
kernel.shmmax = 4294967296
kernel.shmmni = 4096
kernel.sem = 5010 641280 5010 128
fs.file-max = 6815744
net.core.rmem_default = 262144
net.core.wmem_default = 1048576
net.core.rmem_max = 4194304
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
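The listing above only speaks to the first cause in the guide (shared memory and semaphore settings). For swap space and physical memory, the figures can be read from /proc/meminfo. The following is a minimal sketch, with `meminfo_kb` being a hypothetical helper name.

```shell
# Hypothetical helper: print one field of /proc/meminfo in kB.
# $1 = field name (e.g. MemTotal, SwapTotal)
# $2 = file to read (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v f="$1:" '$1 == f { print $2 }' "${2:-/proc/meminfo}"
}
```

`meminfo_kb MemTotal` and `meminfo_kb SwapTotal` give the two sizes in kB. The ulimit values can be inspected with `ulimit -a`, run as the user that starts ASM (grid in the logs above).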
By the way, I have already passed the following check successfully:
./runcluvfy.sh stage -pre crsinst -fixup -n linux41,linux42