HAS service will not start

Last edited by wei-xh on 2013-3-6 15:28

Recording the error messages and a rough outline of the analysis. The problem is still unsolved so far.
oslevel -s
6100-05-01-1016
database version
11.2.0.3

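I first installed only the GI software, without configuring ASM. The software installation reported no errors, the prerequisite checks all went smoothly, and all patches required by the database had already been applied.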

CODE:

# /opt/grid/products/11.2.0/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /opt/grid/products/11.2.0

Enter the full pathname of the local bin directory: [/usr/local/bin]:
Creating /usr/local/bin directory...
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.

To configure Grid Infrastructure for a Stand-Alone Server run the following command as the root user:
/opt/grid/products/11.2.0/perl/bin/perl -I/opt/grid/products/11.2.0/perl/lib -I/opt/grid/products/11.2.0/crs/install /opt/grid/products/11.2.0/crs/install/roothas.pl


To configure Grid Infrastructure for a Cluster execute the following command:
/opt/grid/products/11.2.0/crs/config/config.sh
This command launches the Grid Infrastructure Configuration Wizard. The wizard also supports silent operation, and the parameters can be passed through the response file that is available in the installation media.

# /opt/grid/products/11.2.0/perl/bin/perl -I/opt/grid/products/11.2.0/perl/lib -I/opt/grid/products/11.2.0/crs/install /opt/grid/products/11.2.0/crs/install/roothas.pl
Using configuration parameter file: /opt/grid/products/11.2.0/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
LOCAL ADD MODE
Creating OCR keys for user 'grid', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'system'..
Operation successful.
CRS-4664: Node wxlab92 successfully pinned.
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2013-03-05 23:17:01.115
[client(4194682)]CRS-2101:The OLR was formatted using version 3.
2013-03-05 23:17:03.617
[client(4915276)]CRS-1001:The OCR was formatted using version 3.

ohasd failed to start at /opt/grid/products/11.2.0/crs/install/roothas.pl line 365, line 4.

Then I went through the various log files:

CODE:

# cat clscfg.log
Oracle Database 11g Clusterware Release 11.2.0.3.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2013-03-05 23:17:01.533: [ CLSCFG][1]clscfg_main: Configuration type [3]
2013-03-05 23:17:01.539: [ CLSCFG][1]Using VSNNUM for AV [11.2.0.3.0].
2013-03-05 23:17:01.550: [ OCRINI][1]Using batch protocol to update the keys.
2013-03-05 23:17:01.551: [ OCRINI][1]Creating the key [SYSTEM.crs.usersecurity].
2013-03-05 23:17:01.552: [ OCRINI][1]Creating the key [SYSTEM.ORA_CRS_HOME].
2013-03-05 23:17:01.552: [ OCRINI][1]Creating the key [SYSTEM.crs.deny].
2013-03-05 23:17:01.553: [ OCRINI][1]Creating the key [SYSTEM.WALLET].
2013-03-05 23:17:01.553: [ OCRINI][1]Creating the key [SYSTEM.GNS].
2013-03-05 23:17:01.553: [ OCRINI][1]Creating the key [SYSTEM.crs.user_default_dir].
2013-03-05 23:17:01.554: [ OCRINI][1]Creating the key [SYSTEM.version.localhost].
2013-03-05 23:17:01.554: [ OCRINI][1]Creating the key [SYSTEM.version.activeversion].
2013-03-05 23:17:01.555: [ OCRINI][1]Creating the key [SYSTEM.GPnP].
2013-03-05 23:17:01.555: [ OCRINI][1]Creating the key [SYSTEM.GPnP.profiles].
2013-03-05 23:17:01.556: [ OCRINI][1]Creating the key [SYSTEM.css].
2013-03-05 23:17:01.556: [ OCRINI][1]Creating the key [SYSTEM.network].
2013-03-05 23:17:01.945: [ OCRINI][1]Successfully executed the batch.
# cat clscfg1.log
Oracle Database 11g Clusterware Release 11.2.0.3.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2013-03-05 23:17:03.419: [ CLSCFG][1]clscfg_main: Configuration type [3]
2013-03-05 23:17:03.427: [ OCROSD][1]utread:3: Problem reading buffer 11f9e000 buflen 4096 retval 0 phy_offset 102400 retry 0
2013-03-05 23:17:03.427: [ OCROSD][1]utread:3: Problem reading buffer 11f9e000 buflen 4096 retval 0 phy_offset 102400 retry 1
2013-03-05 23:17:03.427: [ OCROSD][1]utread:3: Problem reading buffer 11f9e000 buflen 4096 retval 0 phy_offset 102400 retry 2
2013-03-05 23:17:03.427: [ OCROSD][1]utread:3: Problem reading buffer 11f9e000 buflen 4096 retval 0 phy_offset 102400 retry 3
2013-03-05 23:17:03.427: [ OCROSD][1]utread:3: Problem reading buffer 11f9e000 buflen 4096 retval 0 phy_offset 102400 retry 4
2013-03-05 23:17:03.427: [ OCROSD][1]utread:3: Problem reading buffer 11f9e000 buflen 4096 retval 0 phy_offset 102400 retry 5
2013-03-05 23:17:03.427: [ OCRRAW][1]propriogid:1_1: Failed to read the whole bootblock. Assumes invalid format.
2013-03-05 23:17:03.427: [ OCRRAW][1]proprioini: all disks are not OCR/OLR formatted
2013-03-05 23:17:03.427: [ OCRRAW][1]proprinit: Could not open raw device
2013-03-05 23:17:03.427: [ default][1]a_init:7!: Backend init unsuccessful : [26]
2013-03-05 23:17:03.615: [ OCRRAW][1]iniconfig:No 92 configuration
2013-03-05 23:17:03.617: [ OCRINI][1]Using batch protocol to update the keys.
2013-03-05 23:17:03.618: [ OCRINI][1]Creating the key [SYSTEM.version].
2013-03-05 23:17:03.618: [ OCRINI][1]Creating the key [SYSTEM.versionstring].
2013-03-05 23:17:03.619: [ OCRINI][1]Creating the key [SYSTEM.version.activeversion].
2013-03-05 23:17:03.619: [ OCRINI][1]Creating the key [SYSTEM.WALLET].
2013-03-05 23:17:03.620: [ OCRINI][1]Creating the key [SYSTEM.GNS].
2013-03-05 23:17:03.620: [ OCRINI][1]Creating the key [SYSTEM.css].
2013-03-05 23:17:03.620: [ OCRINI][1]Creating the key [SYSTEM.css.interfaces].
2013-03-05 23:17:03.621: [ OCRINI][1]Creating the key [SYSTEM.crs.versions].
2013-03-05 23:17:03.622: [ OCRINI][1]Creating the key [SYSTEM.ACFS].
2013-03-05 23:17:03.622: [ OCRINI][1]Creating the key [SYSTEM.ORA_CRS_HOME].
2013-03-05 23:17:03.622: [ OCRINI][1]Creating the key [SYSTEM.evm.debug].
2013-03-05 23:17:03.623: [ OCRINI][1]Creating the key [SYSTEM.DIAG.status].
2013-03-05 23:17:03.623: [ OCRINI][1]Creating the key [SYSTEM.local_only].
2013-03-05 23:17:03.624: [ OCRINI][1]Creating the key [SYSTEM.WLM].
2013-03-05 23:17:03.624: [ OCRINI][1]Creating the key [SYSTEM.GPnP].
2013-03-05 23:17:03.624: [ OCRINI][1]Creating the key [SYSTEM.GPnP.profiles].
2013-03-05 23:17:03.625: [ OCRINI][1]Creating the key [SYSTEM.JAZNFILE].
2013-03-05 23:17:04.124: [ OCRINI][1]Successfully executed the batch.
# cat crsctl_root.log
Oracle Database 11g Clusterware Release 11.2.0.3.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2013-03-05 23:17:06.908: [ CRSCTL][1]crsctlcss_doclusterkeys: the current number of pinned nodes 0, set is 1

2013-03-05 23:17:07.194: [ CRSCTL][1]crsctl_pin_ocr: set OCR keys succ.
2013-03-05 23:17:17.620: [ OCRMSG][1]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2013-03-05 23:17:17.621: [ OCRMSG][1]GIPC error [29] msg [gipcretConnectionRefused]
2013-03-05 23:17:17.621: [ OCRMSG][1]prom_connect: error while waiting for connection complete [24]
# cat crswrapexece.log
05-Mar-13 23:17 Executed cmd: /opt/grid/products/11.2.0/bin/crswrapexece.pl /opt/grid/products/11.2.0/crs/install/s_crsconfig_wxlab92_env.txt /opt/grid/products/11.2.0/bin/ohasd.bin reboot
05-Mar-13 23:17 executing "/opt/grid/products/11.2.0/bin/ohasd.bin reboot"
# cat ocrconfig_4194682.log
Oracle Database 11g Clusterware Release 11.2.0.3.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2013-03-05 23:17:00.819: [ OCRCONF][1]ocrconfig starts...
2013-03-05 23:17:00.849: [ OCRCONF][1]Upgrading OLR data
2013-03-05 23:17:00.849: [ OCRCONF][1]Verifying if OLR is already in latest version.
2013-03-05 23:17:00.893: [ OCRRAW][1]proprioini: all disks are not OCR/OLR formatted
2013-03-05 23:17:00.893: [ OCRRAW][1]proprinit: Could not open raw device
2013-03-05 23:17:00.893: [ default][1]a_init:7!: Backend init unsuccessful : [26]
2013-03-05 23:17:00.904: [ OCRCONF][1]OLR is not in latest version. Upgrade is required.
2013-03-05 23:17:00.904: [ OCRCONF][1]Exporting OLR data to [OCRUPGRADEFILE]
2013-03-05 23:17:00.904: [ OCRAPI][1]a_init:7!: Backend init unsuccessful : [33]
2013-03-05 23:17:00.904: [ OCRCONF][1]There was no previous version of OLR. error:[PROCL-33: Oracle Local Registry is not configured]
2013-03-05 23:17:00.904: [ OCRCONF][1]Verifying if OLR is already in latest version.
2013-03-05 23:17:00.904: [ OCRRAW][1]proprioini: all disks are not OCR/OLR formatted
2013-03-05 23:17:00.904: [ OCRRAW][1]proprinit: Could not open raw device
2013-03-05 23:17:00.904: [ default][1]a_init:7!: Backend init unsuccessful : [26]
2013-03-05 23:17:00.905: [ OCRCONF][1]OLR was not previously formatted. OCRCONFIG will now attempt to format OLR.
2013-03-05 23:17:00.905: [ OCRRAW][1]ibctx: Failed to read the whole bootblock. Assumes invalid format.
2013-03-05 23:17:00.905: [ OCRRAW][1]proprinit:problem reading the bootblock or superbloc 22

2013-03-05 23:17:01.119: [ OCRAPI][1]a_init:6a: Backend init successful
2013-03-05 23:17:01.119: [ OCRCONF][1]The OLR was successfully formatted.
2013-03-05 23:17:01.335: [ OCRCONF][1]Successfully initialized DATABASE keys
2013-03-05 23:17:01.335: [ OCRCONF][1]The OLR was successfully populated.
2013-03-05 23:17:01.335: [ OCRCONF][1]The OLR was successfully upgraded.
2013-03-05 23:17:01.335: [ OCRCONF][1]Exiting [status=success]...
# cd ..
# ls
acfslog acfssec alertwxlab92.log crfmond ctssd evmd gpnpd racg
acfsrepl admin client crsd cvu gipcd mdnsd srvm
acfsreplroot agent crflogd cssd diskmon gnsd ohasd
# cd ohasd
# ls
ohasd.log ohasdOUT.log
# cat ohasd.log
Oracle Database 11g Clusterware Release 11.2.0.3.0 - Production Copyright 1996, 2011 Oracle. All rights reserved.
2013-03-05 23:27:19.769: [ default][1] Created alert : (:OHAS00117:) : TIMED OUT WAITING FOR OHASD MONITOR
# cat ohasdOUT.log
OHASD stderr redirected to ohasdOUT.log
2013-03-05 23:17:19
Changing directory to /opt/grid/products/11.2.0/log/wxlab92/ohasd
OHASD starting
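Timed out waiting for init.ohasd script. to start; posting an alert

A workaround along these lines, described at https://desaitaral.wordpress.com/category/installation/, gets past the problem, but MOS has no such case. After using it to get HAS started, I found that HAS still did not start normally after an OS reboot; the same workaround had to be applied again to start it.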
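For context, the workaround usually cited for this symptom is a single dd read from the ohasd named pipe while roothas.pl is stuck after "Adding Clusterware entries to inittab". The sketch below is an assumption rather than something taken from this post: the function name is hypothetical, the pipe name npohasd comes from the commonly quoted form of the workaround, and the default directory /tmp/.oracle is a guess based on the directory the truss output further down shows being probed (Linux write-ups use /var/tmp/.oracle). Verify the actual path on the host before running anything.

```shell
#!/bin/sh
# Hypothetical helper: read one block from the ohasd named pipe,
# as the commonly cited dd workaround does.
# ASSUMPTION: the pipe is named npohasd; its directory varies by platform.
drain_ohasd_pipe() {
    pipe="${1:-/tmp/.oracle/npohasd}"   # assumed default location, not confirmed
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe" >&2
        return 1
    fi
    # bs and count mirror the values usually quoted for this workaround.
    dd if="$pipe" of=/dev/null bs=1024 count=1 2>/dev/null
}
```

It would be run as root from a second session, for example `drain_ohasd_pipe /tmp/.oracle/npohasd`. As noted above, it had to be repeated after every OS reboot, so it is a workaround and not a fix.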

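Along the way I tried upgrading the OS to 6100-05-06: same error. Then to 6100-05-08: same error. Then to 6100-07-04: same error.
After I posted on Weibo and forums, some people said the storage was the cause. I did not really believe that, because I was only configuring the HAS service and had not yet reached the ASM configuration, so disks should have little to do with it. Still, to rule storage out, I had the SA remove everything storage-related at the host level and then reinstalled the OS. The error persisted. If this is a bug, it is a very serious one: it would affect most AIX levels, and I have already tried 4 of them. Yet there seem to be few reports of people hitting it, so I do not strongly suspect an Oracle bug. MOS has no case in which this problem is solved with dd. So where exactly is the problem?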
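One check that follows from the ohasdOUT.log message above (ohasd timed out waiting for the init.ohasd script) is whether the inittab entry exists and whether init actually spawned the script. A minimal sketch; the function names are hypothetical, and the default path /etc/inittab and the comment prefixes it skips are assumptions about the AIX host:

```shell
#!/bin/sh
# Print non-comment inittab entries that mention init.ohasd.
# Returns non-zero when no such entry exists.
ohasd_inittab_entries() {
    inittab="${1:-/etc/inittab}"   # assumed default location
    # Skip comment lines (':' on AIX; '#' is handled as well).
    grep -v '^[:#]' "$inittab" | grep 'init\.ohasd'
}

# Succeeds when an init.ohasd process is visible in the process list.
# The [i] bracket keeps grep from matching its own command line.
ohasd_init_running() {
    ps -ef | grep '[i]nit\.ohasd' >/dev/null
}
```

If `ohasd_inittab_entries` prints an entry but `ohasd_init_running` fails, init has not (re)spawned the script, which would be consistent with the timeout in ohasdOUT.log.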
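Tracing the installation process with truss shows it reading /opt/grid/products/11.2.0/crs/mesg/crsus.msb over and over. Searching Google for the keywords turned up nothing useful; the cases other people hit do not match mine. I have been at this for almost two weeks straight and am getting sick of it.

CODE:

access("/tmp/.oracle/sOHASD_UI_SOCKET", 0) = -1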
access("/tmp/o/sOHASD_UI_SOCKET", 0) = -1
close(5) = 0
close(3) = 0
kopen("/opt/grid/products/11.2.0/crs/mesg/crsus.msb", O_RDONLY) = 3
kfcntl(3, F_SETFD, 0x0000000000000001) = 0
lseek(3, 0, 0) = 0
kread(3, "1513 "011303\t\t\0\0\0\0".., 256) = 256
lseek(3, 512, 0) = 512
kread(3, "1B [ (\n\0\0\0\0\0\0\0\0".., 512) = 512
lseek(3, 1024, 0) = 1024
kread(3, "\096\0 ?
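To put a number on "over and over", the truss output can be saved to a file and summarized. The sketch below assumes the trace was written to a file in the format shown above (one call per line, optionally prefixed with a pid and a colon); the function names and the file name in the usage note are hypothetical.

```shell
#!/bin/sh
# Count system calls by name in saved truss output.
# $1 = file containing truss output
truss_syscall_counts() {
    # Strip an optional "pid:" prefix, then keep only the call name.
    sed -n -e 's/^[0-9][0-9]*:[[:space:]]*//' \
           -e 's/^\([A-Za-z_][A-Za-z0-9_]*\)(.*/\1/p' "$1" |
        sort | uniq -c | sort -rn
}

# Count how many times a given path was opened with kopen().
# $1 = file containing truss output, $2 = path to look for
truss_open_count() {
    grep -c -F "kopen(\"$2\"" "$1"
}
```

For example, after capturing a trace to a hypothetical /tmp/roothas.truss, `truss_open_count /tmp/roothas.truss /opt/grid/products/11.2.0/crs/mesg/crsus.msb` shows how often the message file is reopened, and `truss_syscall_counts /tmp/roothas.truss` shows which calls dominate the loop.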

From the ITPUB blog, link: http://blog.itpub.net/22034023/viewspace-755386/. If you repost this, please credit the source; otherwise legal liability will be pursued.

