最近装了两套AIX平台的11.2.0.3的数据库,在最后使用dbca图形化工具创建数据库的时候都遇到了同样的错误:ORA-03113: end-of-file on communication channel,真的是非常讨论在AIX平台安装Oracle RAC,问题太多,不过话说回来,问题多成长才快嘛,下面把整个过程记录下来。
使用DBCA工具将数据库创建在存储设备对应的ASM磁盘组时遇到了ORA-03113错误。之后回想起之前将数据库创建在本地文件系统时非常的顺利,于是尝试先将数据库创建在本地文件系统,然后利用RMAN工具将所有文件转存到ASM磁盘组中。
# id grid
uid=205(grid) gid=204(oinstall) groups=205(asmadmin),206(asmdba),207(asmoper),208(dba)
# id oracle
uid=206(oracle) gid=204(oinstall) groups=206(asmdba),208(dba),209(oper)
# oslevel -s
6100-07-05-1228
数据库成功创建到本地之后,首先做了以下的尝试:
#su - oracle
$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Sep 10 19:37:38 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> create tablespace test datafile '+DATA01' size 5m ;
create tablespace test datafile '+DATA01' size 5m
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 45023418
Session ID: 131 Serial number: 293
通过上面这里例子很明显的感觉到oracle用户没有向ASM磁盘组写数据的权限。通过这两次的安装我个人认为dbca执行过程中出现ORA-03113错误很大可能是因为oracle用户下的数据库实例没有向grid用户下的磁盘组写数据的权限。
这时检查Oracle数据库的告警日志,可以明显看到有ORA-600的错误报出:ORA-00600 [kfioTranslateIO03],根据这个错误在METALINK很容易到了下面这篇文章:
ORA-00600 [kfioTranslateIO03] [17090] [ID 1336846.1]
修改时间:2012-3-23
类型:PROBLEM
状态:PUBLISHED
优先级:3
Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.2 and later [Release: 11.2 and later ]
Information in this document applies to any platform.
Symptoms
In 11.2.0.2 where role separation between GRID and RDBMS is implemented, the following ORA-600 error prevents database from starting up.
ORA-00600: internal error code, arguments: [kfioTranslateIO03]
ORA-00600: internal error code, arguments: [17090]
Cause
group permission of "oracle" executable from RDBMS home should have the same group information for ASM devices according to note 1084186.1.
$ ls -l $GRID_HOME/bin/oracle
-rwsr-s--x 1 grid oinstall 228954465 Jul 1 13:37 /oh1/grid/product/11.2.0/bin/oracle
$ ls -l $RDBMS_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 228954465 Jul 1 13:37 /oh1/oracle/product/11.2.0/bin/oracle
$ ls -l $ASM_DEVICE/*
brw-rw---- 1 grid asmadmin 8, 33 Feb 15 08:11 /dev/oracleasm/disks/ASMD1
brw-rw---- 1 grid asmadmin 8, 49 Feb 15 08:11 /dev/oracleasm/disks/ASMD2
brw-rw---- 1 grid asmadmin 8, 17 May 4 22:30 /dev/oracleasm/disks/CRSD1
But in this case, "oracle" executable from RDBMS shows different group information which is different from group information for ASM devices.
ORA-600[kfioTranslateIO03] and [17090] occurrs due to the permission issue.
$ ls -l $RDBMS_HOME/bin/oracle
-rwsr-s--x 1 oracle oinstall 228954465 Jul 1 13:37 /oh1/oracle/product/11.2.0/bin/oracle
^^^^^^^ it should be "asmadmin" or at least should be the same group of all ASM devices.
Solution
group information for $RDBMS_HOME/bin/oracle should be changed to the group that can read/write to ASM devices.
Please execute the following action plan from note 1084186.1.
$ su - grid
$ cd /bin
$ ./setasmgidwrap o=<11.2 RDBMS Home>/bin/oracle
References
NOTE:1084186.1 - Database Creation on 11.2 Grid Infracture with Role Separation ( ORA-15025, KFSG-00312, ORA-15081 )
根据上面的文章内容做了如下操作:
$ cd $GRID_HOME/bin
$ ls -al oracle
-rwsr-s--x 1 grid oinstall 264678476 Sep 10 18:58 oracle
$ exit
# cd /dev/
# ls -al rhdisk*
crw-rw---- 2 grid oinstall 15, 0 Jul 19 12:22 rhdisk0
crw-rw---- 1 grid oinstall 15, 1 Jul 19 12:22 rhdisk1
crw-rw---- 1 grid oinstall 15, 16 Sep 10 18:40 rhdisk10
crw-rw---- 1 grid oinstall 15, 10 Sep 10 18:40 rhdisk11
crw-rw---- 1 grid oinstall 15, 14 Sep 10 18:40 rhdisk12
crw-rw---- 1 grid oinstall 15, 7 Sep 10 18:40 rhdisk13
......
# chown -R grid:asmadmin rhdisk*
# ls -al rhdisk*
crw-rw---- 2 grid asmadmin 15, 0 Jul 19 12:22 rhdisk0
crw-rw---- 1 grid asmadmin 15, 1 Jul 19 12:22 rhdisk1
crw-rw---- 1 grid asmadmin 15, 16 Sep 10 18:40 rhdisk10
crw-rw---- 1 grid asmadmin 15, 10 Sep 10 18:40 rhdisk11
crw-rw---- 1 grid asmadmin 15, 14 Sep 10 18:40 rhdisk12
crw-rw---- 1 grid asmadmin 15, 7 Sep 10 18:40 rhdisk13
......
# su - grid
$ cd $GRID_HOME/dbs
$ cd ../bin
$ ls -al oracle
-rwsr-s--x 1 grid oinstall 264678476 Sep 10 18:58 oracle
$ pwd
/u01/app/11.2.0/grid/bin
$ ./setasmgidwrap o=/u01/app/oracle/product/11.2.0/db_1/bin/oracle
$ exit
# su - oracle
$ cd $ORACLE_HOME/bin
$ ls -al oracle
-rwsr-s--x 1 oracle asmadmin 300832186 Sep 10 19:17 oracle
通过以上的调整之后,GRID_HOME/bin/oracle和$RDBMS_HOME/bin/oracle两个程序都具备了对/dev/rhdisk*的读写权限,这是出现ORA-03113问题的根源。
$ sql
SQL*Plus: Release 11.2.0.3.0 Production on Mon Sep 10 19:57:07 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> create tablespace test datafile '+DATA01' size 5m;
create tablespace test datafile '+DATA01' size 5m
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 25690140
Session ID: 212 Serial number: 23
$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Sep 10 19:59:52 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 9887760384 bytes
Fixed Size 2229944 bytes
Variable Size 1577060680 bytes
Database Buffers 8287944704 bytes
Redo Buffers 20525056 bytes
Database mounted.
Database opened.
SQL> create tablespace test datafile '+DATA01' size 5m;
Tablespace created.
SQL> drop tablespace test including contents and datafiles;
Tablespace dropped.
实例经过重启,之前所做的更改才会生效。
通过这个例子,可以总结出试图将数据库存放到Grid软件下的ASM磁盘组的时候一定要注意以下两点:
1).按照Oracle文档的要求,对共享磁盘执行以下两个修改:
chown grid:asmadmin /dev/rhdisk*
chmod 660 /dev/rhdisk*
2).确保$GRID_HOME/bin/oracle和$RDBMS_HOME/bin/oracle的两个程序都具备读写共享磁盘文件/dev/rhdisk*的权限。
相信部署满足了以上两个条件的Oracle数据库不再会出现ORA-03113的错误。
相关文章:《Oracle RAC 11gR2 ORA-15055 ORA-27140 ORA-27300 ORA-27301 ORA-27302 ORA-27303》http://space.itpub.net/?uid-23135684-action-viewspace-itemid-751960,文章重点讨论的是$GRID_HOME/bin/oracle和$ORACLE_HOME/bin/oracle两个文件的权限和/dev/rhdisk*的对应关系;而这篇文章重点讨论的是$GRID_HOME/bin/oracle和$ORACLE_HOME/bin/oracle两个文件的所有者、组和/dev/rhdisk*的对应关系。