Troubleshooting 11.2 Grid Infrastructure root.sh Issues (Doc ID 1053970.1)


In this Document

 Purpose
 Troubleshooting Steps
 Advanced Root.sh Troubleshooting
 Community Discussions
 References

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.1 to 11.2.0.3 [Release 11.2]
Information in this document applies to any platform.

PURPOSE

This document provides a reference for troubleshooting root.sh issues after installing an 11.2 Grid Infrastructure home for a cluster.  For versions prior to 11.2, see Note 240001.1.

TROUBLESHOOTING STEPS

At the end of a Grid Infrastructure installation, the user is prompted to run the "root.sh" script.  This script configures and starts the Oracle Clusterware stack.  The root.sh script can fail under any of the following conditions:

  • Problem with the network configuration.
  • Problem with the storage location for the OCR and/or voting files.  
  • Permission problem with /var/tmp (specifically /var/tmp/.oracle); see the quick check after this list.
  • Problem with the vendor clusterware (if used).
  • Some other configuration issue.
  • An Oracle bug.
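
For the /var/tmp/.oracle permission issue noted above, a quick sanity check is shown below.  The expected ownership and mode can vary by platform, so treat this as a rough guide rather than a definitive test:

# The clusterware socket directories are typically owned by root with
# world-writable sticky permissions (drwxrwxrwt); a missing directory or
# overly restrictive permissions can prevent the daemons from creating sockets.
ls -ld /var/tmp/.oracle /tmp/.oracle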

Most configuration issues should be detectable by running the Cluster Verification Utility with the following syntax (substitute your node list for <nodelist>):

cd <GRID_HOME>/bin
./cluvfy stage -pre crsinst -n <nodelist> -r 11gR2 -verbose


Additional options can be used for a more thorough check:

USAGE:
cluvfy stage -pre crsinst -n <node_list> [-r {10gR1|10gR2|11gR1|11gR2}]
[-c <ocr_location_list>] [-q <voting_disk_list>]
[-osdba <osdba_group>]
[-orainv <orainventory_group>]
[-asm -asmgrp <asmadmin_group>]
[-asm -asmdev <asm_device_list>]
[-fixup [-fixupdir <fixup_dir>]] [-verbose]
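
For example, a fuller pre-check on the two-node cluster used later in this note might look like the following (the ASM device paths and fixup directory are placeholders; substitute your own values):

cd <GRID_HOME>/bin
./cluvfy stage -pre crsinst -n racbde1,racbde2 -r 11gR2 \
  -asm -asmdev /dev/sdb1,/dev/sdb2,/dev/sdb3 \
  -fixup -fixupdir /tmp/cvu_fixup -verbose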


If the Cluster Verification Utility does not find a configuration problem and your root.sh still fails, you may need the assistance of Oracle Support to troubleshoot further; see also the "Advanced Root.sh Troubleshooting" section below.

Advanced Root.sh Troubleshooting

The root.sh is simply a parent script that calls the following scripts:

<GRID_HOME>/install/utl/rootmacro.sh   # small - validates home and user
<GRID_HOME>/install/utl/rootinstall.sh    # small - creates some local files
<GRID_HOME>/network/install/sqlnet/setowner.sh   # small - opens up /tmp permissions
<GRID_HOME>/rdbms/install/rootadd_rdbms.sh  # small - misc file/permission checks
<GRID_HOME>/rdbms/install/rootadd_filemap.sh  # small - misc file/permission checks
<GRID_HOME>/crs/install/rootcrs.pl  # MAIN CLUSTERWARE CONFIG SCRIPT

If your root.sh is failing in one of the first five scripts, the fix should be straightforward, since those scripts are small and easy to troubleshoot.  However, most problems occur in the rootcrs.pl script, which is the main clusterware configuration script.  This script logs useful trace data to <GRID_HOME>/cfgtoollogs/crsconfig/rootcrs_<nodename>.log.  That said, you should first check the clusterware alert log under <GRID_HOME>/log/<nodename> for any obvious problems or errors.
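
A quick way to get oriented is shown below.  The alert log file name is the typical 11.2 layout; adjust <GRID_HOME> and <nodename> for your environment:

# Check the clusterware alert log first for obvious errors
less <GRID_HOME>/log/<nodename>/alert<nodename>.log

# Then search the main rootcrs trace file for failures
grep -inE "fail|error" <GRID_HOME>/cfgtoollogs/crsconfig/rootcrs_<nodename>.log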

In the following section I show the log output of a new installation on a two-node cluster (racbde1 and racbde2) where the OCR and voting files are stored in ASM in a diskgroup called +SYSTEMDG.  This log information is posted for reference.  It can be useful to compare the clusterware alert log from a working root.sh (mine) to a failing one (yours) to see where it went wrong.  The major landmarks in the clusterware alert log show how far the configuration got:

Node 1 (racbde1) Clusterware Alert Log During root.sh:

Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.
2009-12-23 19:24:33.844
[client(17368)]CRS-2106:The OLR location /u01/app/grid/cdata/racbde1.olr is inaccessible. Details in /u01/app/grid/log/racbde1/client/ocrconfig_17368.log.
2009-12-23 19:24:33.956
[client(17368)]CRS-2101:The OLR was formatted using version 3.
2009-12-23 19:25:02.495
[ohasd(17767)]CRS-2112:The OLR service started on node racbde1.
2009-12-23 19:25:02.833
[ohasd(17767)]CRS-2772:Server 'racbde1' has been assigned to pool 'Free'.
2009-12-23 19:25:34.801
[cssd(18791)]CRS-1713:CSSD daemon is started in exclusive mode
2009-12-23 19:25:37.126
[cssd(18791)]CRS-1709:Lease acquisition failed for node racbde1 because no voting file has been configured; Details at (:CSSNM00031:) in /u01/app/grid/log/racbde1/cssd/ocssd.log
2009-12-23 19:25:54.705
[cssd(18791)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racbde1 .
2009-12-23 19:25:55.431
[ctssd(18848)]CRS-2403:The Cluster Time Synchronization Service on host racbde1 is in observer mode.
2009-12-23 19:25:55.575
[ctssd(18848)]CRS-2407:The new Cluster Time Synchronization Service reference node is host racbde1.
2009-12-23 19:25:56.312
[ctssd(18848)]CRS-2401:The Cluster Time Synchronization Service started on host racbde1.
[client(19034)]CRS-10001:ACFS-9327: Verifying ADVM/ACFS devices.
[client(19038)]CRS-10001:ACFS-9322: done.
2009-12-23 19:30:26.790
[client(19423)]CRS-1006:The OCR location +SYSTEMDG is inaccessible. Details in /u01/app/grid/log/racbde1/client/ocrconfig_19423.log.
2009-12-23 19:30:27.883
[client(19423)]CRS-1001:The OCR was formatted using version 3.
2009-12-23 19:30:40.473
[crsd(19480)]CRS-1012:The OCR service started on node racbde1.
2009-12-23 19:31:53.331
[cssd(18791)]CRS-1605:CSSD voting file is online: /dev/sdb1; details in /u01/app/grid/log/racbde1/cssd/ocssd.log.
2009-12-23 19:31:53.373
[cssd(18791)]CRS-1605:CSSD voting file is online: /dev/sdb2; details in /u01/app/grid/log/racbde1/cssd/ocssd.log.
2009-12-23 19:31:53.417
[cssd(18791)]CRS-1605:CSSD voting file is online: /dev/sdb3; details in /u01/app/grid/log/racbde1/cssd/ocssd.log.
2009-12-23 19:31:54.413
[cssd(18791)]CRS-1626:A Configuration change request completed successfully
2009-12-23 19:31:54.424
[cssd(18791)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racbde1 .
2009-12-23 19:32:10.831
[ctssd(18848)]CRS-2405:The Cluster Time Synchronization Service on host racbde1 is shutdown by user
2009-12-23 19:32:26.536
[cssd(18791)]CRS-1603:CSSD on node racbde1 shutdown by user.
2009-12-23 19:32:26.856
[cssd(18791)]CRS-1625:Node racbde1, number 1, was manually shut down
2009-12-23 19:32:44.826
[cssd(20125)]CRS-1713:CSSD daemon is started in clustered mode
2009-12-23 19:34:07.568
[cssd(20125)]CRS-1707:Lease acquisition for node racbde1 number 1 completed
2009-12-23 19:34:07.690
[cssd(20125)]CRS-1605:CSSD voting file is online: /dev/sdb3; details in /u01/app/grid/log/racbde1/cssd/ocssd.log.
2009-12-23 19:34:07.731
[cssd(20125)]CRS-1605:CSSD voting file is online: /dev/sdb2; details in /u01/app/grid/log/racbde1/cssd/ocssd.log.
2009-12-23 19:34:07.774
[cssd(20125)]CRS-1605:CSSD voting file is online: /dev/sdb1; details in /u01/app/grid/log/racbde1/cssd/ocssd.log.
2009-12-23 19:34:25.380
[cssd(20125)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racbde1 .
2009-12-23 19:34:26.324
[ctssd(20269)]CRS-2403:The Cluster Time Synchronization Service on host racbde1 is in observer mode.
2009-12-23 19:34:26.448
[ctssd(20269)]CRS-2407:The new Cluster Time Synchronization Service reference node is host racbde1.
2009-12-23 19:34:27.278
[ctssd(20269)]CRS-2401:The Cluster Time Synchronization Service started on host racbde1.
2009-12-23 19:34:41.941
[crsd(20392)]CRS-1012:The OCR service started on node racbde1.
2009-12-23 19:34:44.734
[crsd(20392)]CRS-1201:CRSD started on node racbde1.



Node 2 (racbde2) Clusterware Alert Log During root.sh:

Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.
2009-12-23 19:33:43.687
[client(12019)]CRS-2106:The OLR location /u01/app/grid/cdata/racbde2.olr is inaccessible. Details in /u01/app/grid/log/racbde2/client/ocrconfig_12019.log.
2009-12-23 19:33:43.700
[client(12019)]CRS-2101:The OLR was formatted using version 3.
2009-12-23 19:33:50.660
[ohasd(12058)]CRS-2112:The OLR service started on node racbde2.
2009-12-23 19:33:50.946
[ohasd(12058)]CRS-2772:Server 'racbde2' has been assigned to pool 'Free'.
2009-12-23 19:34:15.140
[ohasd(12058)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2009-12-23 19:34:17.910
[cssd(13108)]CRS-1713:CSSD daemon is started in exclusive mode
2009-12-23 19:35:39.019
[cssd(13108)]CRS-1707:Lease acquisition for node racbde2 number 2 completed
[cssd(13108)]CRS-1636:The CSS daemon was started in exclusive mode but found an active CSS daemon on node racbde1 and is terminating; details at (:CSSNM00006:) in /u01/app/grid/log/racbde2/cssd/ocssd.log
2009-12-23 19:35:39.043
[cssd(13108)]CRS-1603:CSSD on node racbde2 shutdown by user.
2009-12-23 19:35:39.152
[ohasd(12058)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'racbde2'.
2009-12-23 19:36:00.206
[cssd(13376)]CRS-1713:CSSD daemon is started in clustered mode
2009-12-23 19:36:19.648
[cssd(13376)]CRS-1707:Lease acquisition for node racbde2 number 2 completed
2009-12-23 19:36:19.762
[cssd(13376)]CRS-1605:CSSD voting file is online: /dev/sdb1; details in /u01/app/grid/log/racbde2/cssd/ocssd.log.
2009-12-23 19:36:19.810
[cssd(13376)]CRS-1605:CSSD voting file is online: /dev/sdb3; details in /u01/app/grid/log/racbde2/cssd/ocssd.log.
2009-12-23 19:36:19.857
[cssd(13376)]CRS-1605:CSSD voting file is online: /dev/sdb2; details in /u01/app/grid/log/racbde2/cssd/ocssd.log.
2009-12-23 19:36:31.342
[cssd(13376)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racbde1 racbde2 .
2009-12-23 19:36:32.707
[ctssd(13443)]CRS-2403:The Cluster Time Synchronization Service on host racbde2 is in observer mode.
2009-12-23 19:36:32.860
[ctssd(13443)]CRS-2407:The new Cluster Time Synchronization Service reference node is host racbde1.
2009-12-23 19:36:33.600
[ctssd(13443)]CRS-2401:The Cluster Time Synchronization Service started on host racbde2.
[client(13473)]CRS-10001:ACFS-9327: Verifying ADVM/ACFS devices.
[client(13477)]CRS-10001:ACFS-9322: done.
2009-12-23 19:39:27.166
[crsd(13606)]CRS-1012:The OCR service started on node racbde2.
2009-12-23 19:39:30.419
[crsd(13606)]CRS-1201:CRSD started on node racbde2.



If further analysis is needed, it can be useful to compare a working rootcrs log (mine) to one that is failing (yours) to see what went wrong.  Again, the rootcrs log is in <GRID_HOME>/cfgtoollogs/crsconfig/rootcrs_<nodename>.log.  I will divide the log into the following rootcrs sections (a quick way to compare your own log against a working one is sketched after the list):

  • First Node Initial Setup (racbde1)
  • First Node Setup OLR for storing Oracle local registry data
  • First Node Setup GPnP wallet and profile
  • First Node Setup and copy files for OHASD daemon
  • First Node Start OHASD Daemon
  • First Node Copy required CRS resources for OHASD to start
  • First Node Start in Exclusive Mode and Configure Diskgroup
  • First Node Push GPnP Profile to Remote Node(s)
  • First Node Start Full Clusterware Stack
  • First Node Adding Clusterware Resources
  • Secondary Node Initial Setup (racbde2)
  • Secondary Node Get GPnP Profile
  • Secondary Node Setup OLR for storing Oracle local registry data
  • Secondary Node Setup and copy files for OHASD daemon
  • Secondary Node Start OHASD Daemon
  • Secondary Node Copy required CRS resources for OHASD to start
  • Secondary Node Start Full Clusterware Stack
  • Secondary Node Adding Clusterware Resources
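
To compare your failing rootcrs log against a working one (for example, the log from a node where root.sh succeeded), it can help to strip the leading timestamps first so that diff shows only real differences.  This assumes, as is typical for this log, that each line begins with a date/time prefix; the file names below are placeholders:

# Strip leading timestamps, then compare a working log against the failing one
sed 's/^[0-9][0-9-]* [0-9:]*: //' rootcrs_racbde1.log > working.txt
sed 's/^[0-9][0-9-]* [0-9:]*: //' rootcrs_yournode.log > failing.txt
diff working.txt failing.txt | less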