I ran into a fault today.
Environment: Oracle 10.2.0.5 RAC
Node 1 is RAC1
Node 2 is RAC2
On rac2, the output of crs_stat -t looked perfectly fine at first glance:
[oracle@rac2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
ora.racdb.db   application    ONLINE    ONLINE    rac1
ora....b1.inst application    ONLINE    ONLINE    rac1
ora....b2.inst application    ONLINE    ONLINE    rac2
On RAC1, crs_stat -t reported that it could not communicate with the CRS daemon:
[oracle@rac1 ~]$ crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
Usage: crs_stat [resource_name [...]] [-v] [-l] [-q] [-c cluster_member]
crs_stat [resource_name [...]] -t [-v] [-q] [-c cluster_member]
crs_stat -p [resource_name [...]] [-q]
crs_stat [-a] application -g
crs_stat [-a] application -r [-c cluster_member]
crs_stat -f [resource_name [...]] [-q] [-c cluster_member]
crs_stat -ls [resource_name [...]] [-q]
Client connections to node 1 kept failing, while node 2 accepted logins normally.
Next, look at alertrac1.log:
[oracle@rac1 rac1]$ tail -f alertrac1.log
2010-12-02 12:59:23.585
[crsd(3835)]CRS-1012:The OCR service started on node rac1.
2010-12-02 12:59:23.614
[evmd(3743)]CRS-1401:EVMD started on node rac1.
2010-12-02 12:59:44.425
[crsd(3835)]CRS-1201:CRSD started on node rac1.
2010-12-06 11:46:24.187
[crsd(3835)]CRS-1205:Auto-start failed for the CRS resource ora.rac1.LISTENER_RAC1.lsnr. Details in /opt/oracle/product/10gr2/crs_1/log/rac1/crsd/crsd.log.
2010-12-06 11:46:24.530
[crsd(3835)]CRS-1205:Auto-start failed for the CRS resource ora.racdb.racdb1.inst. Details in /opt/oracle/product/10gr2/crs_1/log/rac1/crsd/crsd.log.
Then look at /opt/oracle/product/10gr2/crs_1/log/rac1/crsd/crsd.log:
[oracle@rac1 crsd]$ tail -f crsd.log
2010-12-06 15:19:53.953: [ COMMCRS][1240127808]Authorization failed, network error
2010-12-06 15:19:53.953: [ OCRSRV][1240127808]th_select_answer: Failure in answer. clsc ret [3]
2010-12-06 15:19:54.092: [ COMMCRS][1485596992]Authorization failed, network error
2010-12-06 15:19:54.093: [ CRSMAIN][1485596992]0Command Server for 0xa14e400 accept error, recycling
2010-12-06 15:19:54.093: [ CRSMAIN][1485596992]0IOException : Error answering connect request 3
Log in with SQL*Plus on node 1:
[oracle@rac1 crsd]$ sqlplus /nolog
SQL*Plus: Release 10.2.0.5.0 - Production on Mon Dec 6 15:26:14 2010
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
SQL> conn sys as sysdba
Enter password:
ERROR:
ORA-09817: Write to audit file failed.
Linux-x86_64 Error: 28: No space left on device
ORA-01075: you are currently logged on
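The errors above (ORA-09817 together with Linux error 28, "No space left on device") show that the problem is disk space!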
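ORA-09817 is raised when the audit file for the SYSDBA connection cannot be written, so the filesystem to look at is the one holding the audit directory. Here is a minimal sketch for locating that directory, assuming the audit destination was left at what I believe is the usual default of $ORACLE_HOME/rdbms/audit. The audit_dir helper and the AUDIT_DIR override are hypothetical names made up for this example, not Oracle tools.

```shell
#!/bin/sh
# Print the directory where audit files are assumed to be written.
# AUDIT_DIR (hypothetical override) wins if set; otherwise fall back to
# the assumed default under ORACLE_HOME.
audit_dir() {
    printf '%s\n' "${AUDIT_DIR:-${ORACLE_HOME:-/nonexistent}/rdbms/audit}"
}

# Typical use: show the filesystem that holds the audit directory.
#   df -P "$(audit_dir)"
```

If audit_file_dest has been changed in the instance parameters, set AUDIT_DIR to that path instead of relying on the default.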
Checking disk usage showed that the files I had uploaded to this Linux machine that afternoon had filled up the root partition:
[oracle@rac1 crsd]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              11G  9.6G     0 100% /
/dev/sda1             190M   12M  169M   7% /boot
tmpfs                 250M     0  250M   0% /dev/shm
/dev/sdf1             7.9G  1.3G  6.3G  17% /opt/backup
/dev/hdc              3.4G  3.4G     0 100% /media/RHEL_5.4 x86_64 DVD
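A check like the df listing above is easy to automate so that a nearly full filesystem is noticed before CRS stops responding. The following is only a sketch: full_filesystems is a hypothetical helper name, and the 95 percent threshold is an arbitrary assumption. It reads df -P style output on stdin and prints every mount point at or above the threshold.

```shell
#!/bin/sh
# Print "<mount point> <use>%" for every filesystem whose usage is at or
# above the threshold (first argument, in percent; defaults to 90).
# Expects `df -P` style output on stdin.
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR > 1 {
            use = $5
            sub(/%$/, "", use)              # strip the trailing percent sign
            if (use + 0 >= limit) {
                # A mount point may contain spaces, so join fields 6..NF.
                mount = $6
                for (i = 7; i <= NF; i++) mount = mount " " $i
                printf "%s %s%%\n", mount, use
            }
        }'
}

# Typical use on the affected node:
#   df -P | full_filesystems 95
```

On the df output shown here it would report / and the mounted DVD, both at 100%. A read-only DVD image is always full, so only the root partition matters.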
After deleting some files under / to free up space, crs_stat -t works again:
[oracle@rac1 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
ora.racdb.db   application    ONLINE    ONLINE    rac1
ora....b1.inst application    ONLINE    ONLINE    rac1
ora....b2.inst application    ONLINE    ONLINE    rac2
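Reading through a listing like this by eye is easy to get wrong, so here is a small sketch that picks out any resource whose State differs from its Target. The function name offline_resources is hypothetical, and the filter assumes the five-column crs_stat -t layout shown in this post, with resource names starting with "ora.".

```shell
#!/bin/sh
# Read `crs_stat -t` output on stdin and print every resource whose
# State (column 4) does not match its Target (column 3).
# Header, separator and error lines are skipped because they do not
# start with "ora.".
offline_resources() {
    awk '/^ora\./ && $3 != $4 { print $1, "target=" $3, "state=" $4 }'
}

# Typical use:
#   crs_stat -t | offline_resources
```

Empty output means every listed resource has reached its target state. When crs_stat itself fails with CRS-0184, as it did on rac1 here, the output is also empty, so the exit status of crs_stat should be checked as well.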
From "ITPUB Blog", link: http://blog.itpub.net/24212278/viewspace-681018/. If you repost this article, please credit the source; otherwise legal action may be taken.