messages log reports "logger: Waiting N minutes for filesystem containing crsctl"
Symptom:
CRS starts very slowly after every host reboot, and /var/log/messages shows the following:
Dec 5 18:20:40 saprac4 logger: Oracle Cluster Ready Services starting up automatically.
Dec 5 18:20:40 saprac4 logger: Waiting 10 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:20:54 saprac4 ntpd[4060]: synchronized to LOCAL(0), stratum 10
Dec 5 18:21:40 saprac4 logger: Waiting 9 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:22:40 saprac4 logger: Waiting 8 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:23:40 saprac4 logger: Waiting 7 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:24:40 saprac4 logger: Waiting 6 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:25:40 saprac4 logger: Waiting 5 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:26:40 saprac4 logger: Waiting 4 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:27:40 saprac4 logger: Waiting 3 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:28:40 saprac4 logger: Waiting 2 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Dec 5 18:29:40 saprac4 logger: Waiting 1 minutes for filesystem containing /apps/oracle/11gR2/crs/bin/crsctl.
Analysis:
This host previously had 11gR2 installed (it now runs 11gR1), but that installation was never properly deinstalled, leaving leftover files behind. Because of them, the old 11gR2 CRS is started first; since the /apps filesystem has since been removed, the boot script keeps waiting for that filesystem to be mounted, and only after the full 10-minute countdown expires does the currently installed CRS (11gR1) start.
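One way to confirm this kind of root cause (a sketch, not part of the original note) is to search the boot scripts for references to the removed 11gR2 home. The demo below runs against a scratch directory so it is safe to execute anywhere; point DIR at /etc/init.d on the affected host. The file content written here is only a stand-in for a leftover script.

```shell
#!/bin/sh
# Demo setup: a scratch directory standing in for /etc/init.d, containing a
# leftover script that still references the removed 11gR2 home.
DIR=${DIR:-/tmp/initd_demo}
mkdir -p "$DIR"
printf 'ORA_CRS_HOME=/apps/oracle/11gR2/crs\n' > "$DIR/init.ohasd"

# The actual check: list every script that mentions the stale path.
grep -rl '/apps/oracle/11gR2' "$DIR"
```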
Fix:
Remove the leftover 11gR2 CRS init files (here, by moving them aside):
# cd /etc/init.d
# mkdir bak
# mv init.ohasd bak
# mv ohasd bak
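The same cleanup can be sketched as a re-runnable script. It is demonstrated against a scratch copy so it can be tested without touching the system; set INITD=/etc/init.d and run it as root to apply it for real.

```shell
#!/bin/sh
# Scratch setup standing in for /etc/init.d with the leftover 11gR2 files.
INITD=${INITD:-/tmp/initd_cleanup_demo}
mkdir -p "$INITD"
: > "$INITD/init.ohasd"
: > "$INITD/ohasd"

# The cleanup itself: move leftovers into a bak subdirectory, skipping
# files that are already gone so the script is safe to re-run.
mkdir -p "$INITD/bak"
for f in init.ohasd ohasd; do
    if [ -e "$INITD/$f" ]; then
        mv "$INITD/$f" "$INITD/bak/$f"
    fi
done
ls "$INITD/bak"
```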
Appendix:
How to Deconfigure/Reconfigure (Rebuild OCR) or Deinstall Grid Infrastructure
A. Grid Infrastructure Cluster - Entire Cluster
Deconfiguring and reconfiguring the entire cluster rebuilds the OCR and Voting Disk; user resources (database, instance, service, listener, etc.) must be added back to the cluster manually after the reconfigure finishes.
Why is deconfigure needed?
Deconfigure is needed when:
- OCR is corrupted without any good backup
- Or the GI stack will not come up on any node due to missing Oracle Clusterware related files in /etc or /var/opt/oracle (e.g. init.ohasd missing). If GI is able to come up on at least one node, refer to the next section, "B. Grid Infrastructure Cluster - One or Partial Nodes".
- Note: $GRID_HOME should be intact, as deconfigure will NOT fix $GRID_HOME corruption
Steps to deconfigure
Before deconfiguring, collect the following as grid user if possible to generate a list of user resources to be added back to the cluster after reconfigure finishes:
$GRID_HOME/bin/crsctl stat res -t
$GRID_HOME/bin/crsctl stat res -p
$GRID_HOME/bin/crsctl query css votedisk
$GRID_HOME/bin/ocrcheck
$GRID_HOME/bin/oifcfg getif
$GRID_HOME/bin/srvctl config nodeapps -a
$GRID_HOME/bin/srvctl config scan
$GRID_HOME/bin/srvctl config asm -a
$GRID_HOME/bin/srvctl config listener -l <listener-name> -a
$DB_HOME/bin/srvctl config database -d <dbname> -a
$DB_HOME/bin/srvctl config service -d <dbname> -s <service-name> -v
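The collection above can be wrapped in a small capture script (a sketch; the GRID_HOME default and output path are placeholders for your environment). Each command's output, or its failure, is appended to a timestamped log, so a partially-down stack does not abort the capture. The remaining srvctl/DB_HOME commands can be added the same way.

```shell
#!/bin/sh
# Capture pre-deconfig cluster configuration into one log file.
GRID_HOME=${GRID_HOME:-/u01/app/11.2.0/grid}          # placeholder path
OUT=${OUT:-/tmp/pre_deconfig_$(date +%Y%m%d%H%M%S).log}

# Run one command, logging its output; on failure, note it and continue.
run() {
    echo "==== $* ====" >> "$OUT"
    "$@" >> "$OUT" 2>&1 || echo "(command failed)" >> "$OUT"
}

run "$GRID_HOME/bin/crsctl" stat res -t
run "$GRID_HOME/bin/crsctl" stat res -p
run "$GRID_HOME/bin/crsctl" query css votedisk
run "$GRID_HOME/bin/ocrcheck"
run "$GRID_HOME/bin/oifcfg" getif
run "$GRID_HOME/bin/srvctl" config nodeapps -a
run "$GRID_HOME/bin/srvctl" config scan
run "$GRID_HOME/bin/srvctl" config asm -a
echo "Saved configuration snapshot to $OUT"
```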
To deconfigure:
- If OCR and Voting Disks are NOT on ASM, or if OCR and Voting Disks are on ASM but there is NO user data in the OCR/Voting Disk ASM diskgroup:
On all remote nodes, as root execute:
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose
Once the above command finishes on all remote nodes, on local node, as root execute:
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode
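The two-step ordering above (every remote node first, then -lastnode on the local node) can be sketched as a dry-run driver. NODES, root ssh access, and the GRID_HOME default are illustrative assumptions; the script only builds and prints the plan, so the sequence can be reviewed before anything is executed.

```shell
#!/bin/sh
# Build and print the deconfig command sequence without executing it.
GRID_HOME=${GRID_HOME:-/u01/app/11.2.0/grid}   # placeholder path
NODES=${NODES:-"rac2 rac3"}                    # remote (non-local) nodes

PLAN=""
for n in $NODES; do
    PLAN="${PLAN}ssh root@$n $GRID_HOME/crs/install/rootcrs.pl -deconfig -force -verbose
"
done
# The local node runs last, with -lastnode, only after the remotes finish.
PLAN="${PLAN}$GRID_HOME/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode
"
printf '%s' "$PLAN"
```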
To reconfigure, run $GRID_HOME/crs/config/config.sh; refer to note 1354258.1 for details.
- If OCR or Voting Disks are on ASM and there IS user data in the OCR/Voting Disk ASM diskgroup:
- If the GI version is 11.2.0.3 AND the fixes for bug 13058611 and bug 13001955 have been applied, or the GI version is 11.2.0.3.2 GI PSU (which includes both fixes) or higher:
On all remote nodes, as root execute:
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose
Once the above command finishes on all remote nodes, on local node, as root execute:
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose -keepdg -lastnode
To reconfigure, run $GRID_HOME/crs/config/config.sh; refer to note 1354258.1 for details.
- If the fixes for bug 13058611 and bug 13001955 have NOT been applied:
On all nodes, as root execute:
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose
To reconfigure:
For 11.2.0.1 - deinstall and reinstall with OCR/Voting Disk on a new ASM diskgroup or a supported cluster/network filesystem
For 11.2.0.2 and onward - run $GRID_HOME/crs/config/config.sh and place OCR/Voting Disk on a new ASM diskgroup or a supported cluster/network filesystem. Refer to note 1354258.1 for more details on config.sh/config.bat
B. Grid Infrastructure Cluster - One or Partial Nodes
This procedure applies only when all of the following are true:
- One or more nodes are having problems, but at least one other node is running fine - so there is no need to deconfigure the entire cluster
- And GI is a fresh installation (NOT an upgrade) without any patch set applied on top (an interim patch or patch set update (PSU) is fine). A direct patch set installation is considered a fresh installation regardless of how long it has been running, as long as no Oracle Clusterware was running when it was first installed.
- And cluster parameters have not been changed since the original configuration, e.g. OCR/VD in the same location, network configuration unchanged, etc.
- And $GRID_HOME is intact, as deconfigure will NOT fix $GRID_HOME corruption
If any of the above is NOT true, the node removal/addition procedure should be used instead.
Steps to deconfigure and reconfigure
As root, on each problematic node, execute:
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force
# <$GRID_HOME>/root.sh
C. Grid Infrastructure Standalone (Oracle Restart)
Why is deconfigure needed?
Deconfigure is needed when:
- OLR is corrupted without any good backup
- The GI stack will not come up due to missing Oracle Clusterware related files in /etc or /var/opt/oracle (e.g. init.ohasd missing)
- The node name needs to be changed
Steps to deconfigure
Before deconfiguring, collect the following if possible:
$GRID_HOME/bin/crsctl stat res -t
$GRID_HOME/bin/crsctl stat res -p
$GRID_HOME/bin/srvctl config asm -a
$GRID_HOME/bin/srvctl config listener -l <listener-name> -a
$DB_HOME/bin/srvctl config database -d <dbname> -a
$DB_HOME/bin/srvctl config service -d <dbname> -s <service-name> -v
To deconfigure:
As root execute:
# <$GRID_HOME>/crs/install/roothas.pl -deconfig -force -verbose
To reconfigure, refer to note 1354258.1
D. Grid Infrastructure Deinstall
As grid user, execute:
$ <$GRID_HOME>/deinstall/deinstall
For details, refer to the following documentation for your platform:
Oracle Grid Infrastructure
Installation Guide
How to Modify or Deinstall Oracle Grid Infrastructure
If there is any error, deconfigure the failed GI using the steps in Sections A - C, then deinstall manually with note 1364419.1
For searchability: recreate OCR, recreate Voting Disk, rebuild OCR