The log file path has the form: $ORA_CRS_HOME/log//racg/ora..ons.log
Tracing the log shows the following messages:
2012-06-12 17:21:05.030: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
2012-06-12 17:21:05.032: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
2012-06-12 17:21:05.032: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
onsctl: ons failed to start
2012-06-12 17:21:05.133: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/crs
2012-06-12 17:21:05.133: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: cmd = /u01/app/oracle/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/app/oracle/crs/bin/onsctl start
2012-06-12 17:21:05.133: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: rc = 1, time = 1.650s
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: ons is not running ...
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/crs
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: cmd = /u01/app/oracle/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/app/oracle/crs/bin/onsctl ping
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: rc = 1, time = 0.310s
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: end for resource = ora.rhel2.ons, action = start, status = 1, time = 2.060s
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: onsctl: shutting down ons daemon ...
GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
onsctl: shutdown of ons failed!
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/crs
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: clsrcexecut: cmd = /u01/app/oracle/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/app/oracle/crs/bin/onsctl stop
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: clsrcexecut: rc = 3, time = 0.470s
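The log above suggests that the problem is a port mismatch between the two nodes: the manually created ONS service uses port 6251, whereas the one created by vipca may not use 6251, so the ports on the two sides do not match.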
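To gauge how often the mismatch is reported, the log can be scanned for that message. The following is a minimal sketch, assuming a bash-compatible shell; the function name count_port_mismatch is hypothetical and not part of any Oracle tool, and the log path is supplied by the caller.

```shell
# Hypothetical helper: count the lines in an ONS log that report the
# port-mismatch message quoted in the log excerpt.
count_port_mismatch() {
    local logfile="$1"
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c 'does not match that from OCR' "$logfile" || true
}
```

It would be called with the path of the ons log for the affected node, for example count_port_mismatch /path/to/the/ons.log.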
1. The onsctl tool
Below is the help output of the onsctl tool:
[root@rhel1 bin]#./onsctl
usage: ./onsctl start|stop|ping|reconfig|debug
start - Start opmn only.
stop - Stop ons daemon
ping - Test to see if ons daemon is running
debug - Display debug information for the ons daemon
reconfig - Reload the ons configuration
help - Print a short syntax description (this).
detailed - Print a verbose syntax description.
[root@rhel1 bin]#./onsctl detailed
usage: ./onsctl start|stop|ping|reconfig|debug
start
Start ons daemon
stop
Shutdown ons daemon
reconfig
Trigger ons to re-read it's configuration files.
ping
Test to see if ons daemon is alive
debug
Display debug information about the ons daemon
help
Print a short syntax description.
detailed
Print a verbose syntax description (this message).
Run onsctl ping on the first node:
[root@rhel1 bin]#./onsctl ping
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
GETHOSTBYNAME(rhel): 2
Adding remote host rhel:6251
GETHOSTBYNAME(rhel): 2
1: {node = rhel2, port = 6251}
Adding remote host rhel2:6251
ons is running ...
ONS is already running on the first node.
Run onsctl ping on the second node:
[root@rhel2 bin]#./onsctl ping
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
GETHOSTBYNAME(rhel): 2
Adding remote host rhel:6251
GETHOSTBYNAME(rhel): 2
1: {node = rhel2, port = 6251}
Remote port for local node in local config does not match that from OCR.
ons is not running ...
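ONS on the second node did not start. The output reports a port mismatch: the remote port for the local node in the local configuration does not match the one from OCR.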
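The two ping results can be told apart mechanically by looking for the messages shown in the output. This is a rough sketch under the assumption that the message texts are exactly those quoted here; ons_ping_status is a hypothetical name, not an Oracle utility.

```shell
# Hypothetical helper: classify `onsctl ping` output read from stdin.
# Prints one of: running, port-mismatch, not-running, unknown.
ons_ping_status() {
    local out
    out=$(cat)
    case "$out" in
        *"ons is running"*)               echo "running" ;;
        # Checked before "not running" so the more specific cause wins.
        *"does not match that from OCR"*) echo "port-mismatch" ;;
        *"ons is not running"*)           echo "not-running" ;;
        *)                                echo "unknown" ;;
    esac
}
```

A typical call would pipe the tool output into it, for example ./onsctl ping 2>&1 | ons_ping_status.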
2. Checking the node processes
Check the ons processes on the first node:
[root@rhel1 bin]#ps -ef | grep ons
root 2412 1 0 16:47 ? 00:00:00 sendmail: accepting connections
oracle 13513 1 0 17:17 ? 00:00:00 /u01/app/oracle/crs/opmn/bin/ons -d
oracle 13515 13513 0 17:17 ? 00:00:00 /u01/app/oracle/crs/opmn/bin/ons -d
root 15646 3340 0 17:22 pts/0 00:00:00 grep ons
Check the ons processes on the second node:
[root@rhel2 bin]#ps -ef | grep ons
root 2400 1 0 16:45 ? 00:00:00 sendmail: accepting connections
root 13847 3546 0 17:22 pts/0 00:00:00 grep ons
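Note that a plain grep ons also matches the sendmail line (the word "connections" contains "ons") and the grep process itself. A stricter filter is sketched below, assuming the daemon binary is invoked through a path ending in /ons as in the listing from the first node; filter_ons_daemon is a hypothetical helper name.

```shell
# Hypothetical helper: keep only `ps -ef` lines (read from stdin) whose
# command is the ons daemon binary.
filter_ons_daemon() {
    # "/ons" must be followed by a space or end of line, which excludes
    # "connections" and "grep ons". "|| true" avoids a failing exit status
    # when no daemon line is present.
    grep -E '/ons( |$)' || true
}
```

Used as ps -ef | filter_ons_daemon, it prints nothing when no ons daemon is running, which is what the second node shows here.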
3. The ONS configuration file
Running find located the ONS configuration files:
./opmn/conf/ons.config.tmp
./opmn/conf/ons.config
./opmn/conf/ons.config.backup.10205
[root@rhel1 crs]#cat ./opmn/conf/ons.config
localport=6113
remoteport=6200
loglevel=3
useocr=on
Clearly, the remote port in the configuration file (6200) does not match the 6251 that was configured with racgons.
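The comparison can also be scripted by reading the key=value pairs from the file. A minimal sketch, assuming the simple key=value layout shown in the cat output; ons_config_value is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of one key from an
# ons.config-style file made of key=value lines.
ons_config_value() {
    local file="$1" key="$2"
    # Strip the "key=" prefix from the first matching line and print the rest.
    sed -n "s/^${key}=//p" "$file" | head -n 1
}
```

For example, ons_config_value ./opmn/conf/ons.config remoteport prints the configured remote port, which can then be compared with the port registered in OCR.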
4. The racgons tool
The racgons help output is as follows:
[root@rhel1 bin]#./racgons
To add ONS daemons configuration:
./racgons.bin add_config hostname:port [hostname:port] ...
To remove ONS daemons configuration:
./racgons.bin remove_config hostname[:port] [hostname:port] ...
OCR may contain two sets of ONS entries. Run the following command to remove the existing port 6251 configuration:
[root@rhel1 bin]#./racgons remove_config rhel:6251 rhel2:6251
racgons: Existing key value on rhel = 6251.
racgons: rhel:6251 removed from OCR.
racgons: Existing key value on rhel2 = 6251.
racgons: rhel2:6251 removed from OCR.
Restart nodeapps:
[root@rhel1 bin]# ./srvctl start nodeapps -n rhel2
[root@rhel1 bin]# ./srvctl start nodeapps -n rhel1
Check the status of both nodes:
[root@rhel1 bin]#./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.rhel1.gsd application ONLINE ONLINE rhel1
ora.rhel1.ons application ONLINE ONLINE rhel1
ora.rhel1.vip application ONLINE ONLINE rhel1
ora.rhel2.gsd application ONLINE ONLINE rhel2
ora.rhel2.ons application ONLINE ONLINE rhel2
ora.rhel2.vip application ONLINE ONLINE rhel2
Everything is back to normal.
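Rather than reading the status table by eye, resources that are not ONLINE can be listed with a short filter. This sketch assumes the five-column crs_stat -t layout shown in the status output (Name, Type, Target, State, Host); crs_not_online is a hypothetical helper name.

```shell
# Hypothetical helper: read `crs_stat -t` output from stdin and print the
# names of resources whose State column is not ONLINE.
crs_not_online() {
    # Resource rows start with "ora."; the fourth field is the State column.
    awk '$1 ~ /^ora\./ && $4 != "ONLINE" { print $1 }'
}
```

Running ./crs_stat -t | crs_not_online should print nothing once all nodeapps are up.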