ORA-12547: TNS:lost contact when adding a node on 11.2.0.4

Environment:
A two-node 11.2.0.4 RAC on RHEL 6 Update 5

[root@rac2 ~]# uname -a
Linux rac2 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@rac2 ~]# uname -r
2.6.32-431.el6.x86_64
[oracle@rac2 ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.188.18  rac1
192.168.188.19  rac2
192.168.188.20  rac3
192.168.188.118  rac1-vip
192.168.188.119  rac2-vip
192.168.188.120  rac3-vip
192.168.182.18    rac1-priv
192.168.182.19    rac2-priv
192.168.182.20    rac3-priv
192.168.188.105   scan
[oracle@rac2 ~]$

While running dbca to add the instance for the third node, the following errors appeared and the third DB instance could not be added.

Relevant errors from /u01/app/11.2.0/grid/log/rac3/agent/crsd/oraagent_oracle/oraagent_oracle.log:

2015-09-10 01:38:21.978: [ora.orcl.db][3571566336]{1:28142:484} [start] crsHome = /u01/app/11.2.0/grid
2015-09-10 01:38:21.978: [ora.orcl.db][3571566336]{1:28142:484} [start] oracleHome = /u02/app/oracle/product/11.2.0/dbhome_1
2015-09-10 01:38:21.978: [ora.orcl.db][3571566336]{1:28142:484} [start] command = '/u01/app/11.2.0/grid/bin/setasmgidwrap oracle_binary_path=/u02/app/oracle/product/11.2.0/dbhome_1/bin/oracle'
2015-09-10 01:38:21.979: [ora.orcl.db][3571566336]{1:28142:484} [start] start dependency = hard(ora.DATA.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns,ora.FRA.dg) pullup(ora.DATA.dg)
2015-09-10 01:38:21.979: [ora.orcl.db][3571566336]{1:28142:484} [start] ASM disk group dependency found
2015-09-10 01:38:21.979: [ora.orcl.db][3571566336]{1:28142:484} [start] Utils:execCmd action = 1 flags = 6 ohome = /u01/app/11.2.0/grid cmdname = setasmgidwrap.
2015-09-10 01:38:23.937: [    AGFW][3567363840]{1:28142:484} Agent received the message: RESOURCE_MODIFY_ATTR[ora.orcl.db 3 1] ID 4355:671
2015-09-10 01:38:50.992: [ora.orcl.db][3571566336]{1:28142:484} [start] execCmd ret = 0
2015-09-10 01:38:50.992: [ USRTHRD][3571566336]{1:28142:484} InstConnection::initMutex AttachLock 00ae3210 DetachLock 00ae3228
2015-09-10 01:38:50.994: [ora.orcl.db][3571566336]{1:28142:484} [start] clsnInstConnection::makeConnectStr UsrOraEnv  m_oracleHome /u02/app/oracle/product/11.2.0/dbhome_1 Crshome /u01/app/11.2.0/grid
2015-09-10 01:38:50.994: [ora.orcl.db][3571566336]{1:28142:484} [start] makeConnectStr = (DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/u02/app/oracle/product/11.2.0/dbhome_1/bin/oracle)(ARGV0=oracleorcl3)(ENVS='ORACLE_HOME=/u02/app/oracle/product/11.2.0/dbhome_1,ORACLE_SID=orcl3,LD_LIBRARY_PATH=')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(CONNECT_DATA=(SID=orcl3)))
2015-09-10 01:38:51.223: [ora.orcl.db][3571566336]{1:28142:484} [start] Container:start oracle home /u02/app/oracle/product/11.2.0/dbhome_1
2015-09-10 01:38:51.224: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt: server not attached
2015-09-10 01:38:52.996: [ora.orcl.db][3571566336]{1:28142:484} [start] ORA-12547: TNS:lost contact

2015-09-10 01:38:53.030: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt (1) Exception OCIException
2015-09-10 01:38:53.032: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection:connect:excp OCIException OCI error 12547
2015-09-10 01:38:53.033: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt: server not attached
2015-09-10 01:38:53.712: [ora.orcl.db][3571566336]{1:28142:484} [start] ORA-12547: TNS:lost contact

2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt (1) Exception OCIException
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start: 1 errcode 12547
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::resetConnection  s_statusOfConnectionMap 00ae9760
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::resetConnection sid orcl3 status  2
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] Gimh::check OH /u02/app/oracle/product/11.2.0/dbhome_1 SID orcl3
2015-09-10 01:38:53.754: [ora.orcl.db][3571566336]{1:28142:484} [start] GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation:open failed with status: 2
GIM-00091: OS failure message: No such file or directory
GIM-00092: OS failure occurred at: sskgmsmr_7

2015-09-10 01:38:53.754: [ora.orcl.db][3571566336]{1:28142:484} [start] (:CLSN00007:)DbAgent::check failed gimh state 0
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] clsnDbAgent:checkCbk clsagfw_res_status ret 5
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::stopConnection
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::removeConnection connection count 0
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::removeConnection freed 0
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::stopConnection sid orcl3 status  1
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::check 1 prev clsagfw_res_status 0 current clsagfw_res_status 5
2015-09-10 01:38:53.764: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start not  logged on check state details Abnormal Termination
2015-09-10 01:38:53.764: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start: ORA-1012 or Lost Contact try cleanOracleIpc and start force
2015-09-10 01:38:53.764: [ USRTHRD][3571566336]{1:28142:484} InstConnection:~InstConnection: this b00070c0
2015-09-10 01:38:53.766: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start call sysresv
2015-09-10 01:38:53.766: [ora.orcl.db][3571566336]{1:28142:484} [start] Container:start scls_clean_oracle_ipc Container orcl3 dbHome /u02/app/oracle/product/11.2.0/dbhome_1

Searching MOS for the errors above turned up nothing useful.
So I changed tack and tried logging in with sqlplus / as sysdba to see what error that would give:

[oracle@rac3 oracle]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Thu Sep 10 12:09:13 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12547: TNS:lost contact

Enter user-name: 

ERROR:
ORA-12547: TNS:lost contact

Enter user-name: 
ERROR:
ORA-12547: TNS:lost contact

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
[oracle@rac3 oracle]$

Following the hint in the MOS note SYSDBA Connections Fail With ORA-12547 Error (Doc ID 782276.1),
I found many .trc files under $ORACLE_HOME/rdbms/log. An excerpt from one of them:

---- You may wonder why not look under bdump instead. The instance has not been created at this point, so there is no bdump directory yet.

[oracle@rac3 log]$ more orcl3_ora_14292.trc
Dump file /u02/app/oracle/product/11.2.0/dbhome_1/rdbms/log/orcl3_ora_14292.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u02/app/oracle/product/11.2.0/dbhome_1
System name:    Linux
Node name:      rac3
Release:        2.6.32-431.el6.x86_64
Version:        #1 SMP Sun Nov 10 22:19:54 EST 2013
Machine:        x86_64
Instance name: orcl3
Redo thread mounted by this instance: 0 <none>
Oracle process number: 0
Unix process pid: 14292, image: oracle@rac3


*** 2015-09-10 11:32:38.641
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [500], [1200], [], [], [], [], [], [], [], [], []
----- SQL Statement (None) -----
Current SQL information unavailable - no SGA.
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFFB8AFF650 ? 7FFFB8AFF728 ?
                                                   7FFFB8B041D0 ? 000000002 ?
ksedst1()+103        call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFFB8AFF650 ? 7FFFB8AFF728 ?
                                                   7FFFB8B041D0 ? 000000002 ?

The key error is this one:

[spstp: ORACLE_HOME uid does not match euid], [500], [1200], [], [], [], [], [], [], [], [], []

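A MOS search for it turned up the note ORA-600 [spstp: ORACLE_HOME uid does not match euid] When Changing Permissions On $ORACLE_HOME/bin/oracle (Doc ID 747456.1),
which explains the arguments: in this error, 500 is the uid (the owner of ORACLE_HOME) and 1200 is the euid (the effective uid of the process).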
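
The two numeric arguments can be pulled out of a trace line mechanically, which helps when there are many .trc files to scan. The following is a minimal sketch, assuming the argument order stated above (first the uid, then the euid). `parse_spstp` is a hypothetical helper name for illustration, not an Oracle tool.

```python
import re

# Matches the ORA-600 spstp arguments as they appear in the trace excerpt above.
SPSTP_RE = re.compile(
    r"\[spstp: ORACLE_HOME uid does not match euid\],\s*\[(\d+)\],\s*\[(\d+)\]"
)


def parse_spstp(line):
    """Return (home_uid, euid) from an ORA-600 spstp line, or None if absent."""
    match = SPSTP_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The error line copied from the trace file shown earlier.
trace_line = (
    "ORA-00600: internal error code, arguments: "
    "[spstp: ORACLE_HOME uid does not match euid], [500], [1200], "
    "[], [], [], [], [], [], [], [], []"
)

print(parse_spstp(trace_line))  # (500, 1200)
```

Comparing the second value with the output of `id oracle` shows at once which side is the odd one out.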

So I checked the ids of the oracle and grid users on this node:

[oracle@rac3 oracle]$ id oracle
uid=1200(oracle) gid=1000(oinstall) groups=1000(oinstall),1200(dba),1201(oper),1300(asmdba)
[oracle@rac3 oracle]$ id grid
uid=1100(grid) gid=1000(oinstall) groups=1000(oinstall),1200(dba),1100(asmadmin),1301(asmoper),1300(asmdba)
[oracle@rac3 oracle]$

There is no 500 anywhere in that output, so where does it come from? Checking the ownership of the database ORACLE_HOME revealed the problem:

[oracle@rac3 ~]$ pwd
/home/oracle
[oracle@rac3 ~]$ cd /u02/app/oracle/product/11.2.0/
[oracle@rac3 11.2.0]$ ls -lrt
total 4
drwxrwxr-x 74 500 oinstall 4096 Sep 10 01:12 dbhome_1
[oracle@rac3 11.2.0]$ cd ..
[oracle@rac3 product]$ ls -lrt
total 4
drwxrwxr-x 3 500 oinstall 4096 Sep  9 21:46 11.2.0
[oracle@rac3 product]$ cd ..
[oracle@rac3 oracle]$ ls -lrt
total 12
drwxrwxr-x 3    500 oinstall 4096 Sep  9 21:36 product  ---------> product is owned by uid 500 here; problem located
drwxr-xr-x 3 oracle oinstall 4096 Sep 10 01:37 cfgtoollogs
drwxr-xr-x 3 oracle oinstall 4096 Sep 10 11:31 admin
[oracle@rac3 oracle]$ pwd
/u02/app/oracle
[oracle@rac3 oracle]$

                                                  
After changing the owner to oracle, adding the node went through without problems.
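
Before and after an ownership change like this, it can be useful to list everything under the tree that is still owned by the wrong uid. The following is a rough sketch only: `find_wrong_owner` is a hypothetical helper, and the path and uid in the closing comment are the ones from this session. The post does not show the exact command used to change the owner, so none is assumed here.

```python
import os


def find_wrong_owner(root, expected_uid):
    """Yield (path, uid) for root and every entry below it not owned by expected_uid."""

    def check(path):
        # lstat inspects the entry itself, so symlinks are not followed.
        uid = os.lstat(path).st_uid
        if uid != expected_uid:
            yield path, uid

    yield from check(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield from check(os.path.join(dirpath, name))


# On rac3 this could be called as (values taken from the session above):
#   for path, uid in find_wrong_owner("/u02/app/oracle/product", 1200):
#       print(uid, path)
```

An empty result after the change means no entry under the tree is still owned by the old uid.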

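To summarize: /u02/app/oracle/product showed owner 500 because the oracle user on rac3 originally had uid 500, while on the other two nodes its uid was 1200. As everyone knows, RAC does not work if the uid differs between nodes, so the uid on rac3 was changed. The ownership of /u02/app/oracle/product, however, was not updated to match before the node addition was started, and the failure described above was the result.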
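
A uid mismatch of this kind can be caught before adding a node by comparing the output of `id oracle` collected from every node. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the sample strings are illustrative, modeled on the situation summarized above (rac3 before its uid was changed), not captured output.

```python
import re
from collections import defaultdict

# Extracts the numeric uid from the first field of `id <user>` output.
UID_RE = re.compile(r"uid=(\d+)\(")


def parse_uid(id_output):
    """Return the numeric uid from a line of `id <user>` output."""
    match = UID_RE.search(id_output)
    if match is None:
        raise ValueError("no uid field in: %r" % id_output)
    return int(match.group(1))


def group_nodes_by_uid(outputs):
    """Map each uid to the sorted list of nodes that report it."""
    groups = defaultdict(list)
    for node in sorted(outputs):
        groups[parse_uid(outputs[node])].append(node)
    return dict(groups)


# Illustrative input only: one `id oracle` line per node.
sample = {
    "rac1": "uid=1200(oracle) gid=1000(oinstall)",
    "rac2": "uid=1200(oracle) gid=1000(oinstall)",
    "rac3": "uid=500(oracle) gid=1000(oinstall)",
}

groups = group_nodes_by_uid(sample)
if len(groups) > 1:
    print("uid mismatch:", groups)
```

More than one key in the result means the nodes disagree, and any directory created under the old uid needs its ownership checked after the uid is corrected.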
 

Reposted from: https://my.oschina.net/u/2600747/blog/591549
