Exadata Hits ORA-27603 and ORA-27626


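Today the CRM team reported that one of their jobs had failed with an error and never finished. I checked, and sure enough, the alert log on node 2 of the Exadata showed errors at exactly the same time: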

Wed Oct 17 06:22:51 2012

Errors in file /u01/app/oracle/diag/rdbms/srcbfin/SRCBFIN2/trace/SRCBFIN2_arc2_72209.trc:

ORA-27603: Cell storage I/O error, I/O failed on disk o/172.11.211.9/DATA_DM01_CD_01_dm01cel01 at offset 464519168 for data length 1048576

ORA-27626: Exadata error: 201 (Generic I/O error)

WARNING: Read Failed. group:1 disk:25 AU:110 offset:3145728 size:1048576

WARNING: failed to read mirror side 1 of virtual extent 29 logical extent 0 of file 266 in group [1.2063103479] from disk DATA_DM01_CD_01_DM01CEL01 allocation unit 110 reason error; if possible, will try another mirror side

NOTE: successfully read mirror side 2 of virtual extent 29 logical extent 1 of file 266 in group [1.2063103479] from disk DATA_DM01_CD_10_DM01CEL03 allocation unit 113

The corresponding trace file contains only the same few lines:

*** 2012-10-17 06:22:51.655

ORA-27626: Exadata error: 201 (Generic I/O error)

WARNING: Read Failed. group:1 disk:25 AU:110 offset:3145728 size:1048576

path:o/172.11.211.9/DATA_DM01_CD_01_dm01cel01

          incarnation:0xe9688586 asynchronous result:'I/O error'

          subsys:OSS iop:0x2b58752a2680 bufp:0x2b5879517000 osderr:0xc9 osderr1:0x0

          Exadata error:'Generic I/O error'

          IO elapsed time: 12334426 usec Time waited on I/O: 12334426 usec

WARNING: failed to read mirror side 1 of virtual extent 29 logical extent 0 of file 266 in group [1.2063103479] from disk DATA_DM01_CD_01_DM01CEL01 allocation unit 110 reason error; if possible, will try another mirror side

NOTE: successfully read mirror side 2 of virtual extent 29 logical extent 1 of file 266 in group [1.2063103479] from disk DATA_DM01_CD_10_DM01CEL03 allocation unit 113
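The log shows the read from mirror side 1 failing and the retry from mirror side 2 succeeding. When scanning a longer alert log, that pairing can be checked mechanically. The following is only a minimal Python sketch: the function names are hypothetical, and the regular expression assumes the message wording seen in the excerpt above (it tolerates a missing space in "of virtual").

```python
import re

# Assumes the ASM message wording shown in the alert log excerpt above.
MIRROR_READ = re.compile(
    r"(?P<outcome>failed to read|successfully read) mirror side (?P<side>\d+) "
    r"of\s*virtual extent (?P<vext>\d+) logical extent \d+ "
    r"of file (?P<file>\d+) in group \[(?P<group>[\d.]+)\]"
)

def summarize_mirror_reads(log_text):
    """Group mirror-read messages by (disk group, file number, virtual extent)."""
    summary = {}
    for m in MIRROR_READ.finditer(log_text):
        key = (m["group"], int(m["file"]), int(m["vext"]))
        entry = summary.setdefault(key, {"failed": [], "ok": []})
        bucket = "ok" if m["outcome"].startswith("successfully") else "failed"
        entry[bucket].append(int(m["side"]))
    return summary

def unrecovered_extents(summary):
    """Extents with at least one failed mirror read and no successful one."""
    return [key for key, e in summary.items() if e["failed"] and not e["ok"]]
```

An empty result from `unrecovered_extents` only means every failed read in the scanned text was followed by a successful read from another mirror; it says nothing about why the first read failed.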

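A search on MOS suggests this may be an Exadata bug. I'm speechless: two months in production, bug after bug and outage after outage, and Oracle still hypes it as something amazing!!!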

Bug 8782572  ARCH fenced in ASM disks causing internal errors at shutdown

 This note gives a brief overview of bug 8782572.
 The content was last updated on: 17-JUN-2011

Affects:

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions BELOW 12.1

Versions confirmed as being affected

Platforms affected

Generic (all / most platforms affected)

Fixed:

This issue is fixed in

Symptoms:

Related To:

Description

This bug causes ARCH processes to keep issuing IO's (or archive logs)
even after an RDBMS instance has been dismounted. Such IO's are
fenced off in ASM disks after instance is no longer part of the cluster
(i.e. after dismount), and the ASM diskgroup can be dismounted
as a result of these IOs.
 
Here is an example excerpt from alert log showing the IO errors due to fence:
 
ORA-27603: Cell storage I/O error, I/O failed on disk o/<IP Address>/<ASM Disk> at offset <offset#> for data length <length>
WARNING: IO Failed. group:<group#> disk(number.incarnation):<number.inc> disk_path:o/<IP Address>/<ASM disk>
         AU:<AU> disk_offset(bytes):<bytes> io_size:<IO size> operation:Read type:asynchronous
         result:I/O error process_id:<pid>
         Exadata error:221 (I/O request fenced)
 
Another example from a cell alert log:
 
Information: Cellsrv canceling OSSMSG_COMMAND_BREAD request from host 
xxxx[pid:<pid>] for fencing, send port <port#> open fd 2
 

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

References

Bug:8782572 (This link will only work for PUBLISHED bugs)
Note:245840.1 Information on the sections in this article
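The excerpt in the bug note shows Exadata error 221 (I/O request fenced), while the alert log at the top of this post shows error 201 (Generic I/O error). To compare the two in a longer log, the error codes can be pulled out with a short script. This is only a sketch: the function names are hypothetical, and it assumes the `Exadata error: <code> (<text>)` format seen in both excerpts.

```python
import re

# Matches e.g. "Exadata error: 201 (Generic I/O error)" or "Exadata error:221 (I/O request fenced)".
EXADATA_ERROR = re.compile(r"Exadata error:\s*(\d+)\s*\(([^)]*)\)")

# Code taken from the bug note's example excerpt.
FENCED_CODE = 221

def exadata_error_codes(log_text):
    """Return every (code, description) pair found in the log text."""
    return [(int(code), desc.strip()) for code, desc in EXADATA_ERROR.findall(log_text)]

def looks_fenced(log_text):
    """True if any Exadata error in the text carries the 'I/O request fenced' code."""
    return any(code == FENCED_CODE for code, _ in exadata_error_codes(log_text))
```

As the note itself warns, matching or not matching a symptom does not confirm or rule out the bug; the check just makes the difference between the two excerpts easy to see.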

 

A corresponding patch can be found for this bug.

 

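Here is another article. It does not seem to describe quite the same situation (what I hit is most likely a bug), but I am attaching it anyway: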


Exadata/Rac - Ora-27603: Cell Storage I/O Error [ID 1445223.1]



Modified: 2012-5-30 Type: PROBLEM Status: PUBLISHED Priority: 3


Applies to:

Oracle Exadata Hardware - Version 11.2.0.1 to 11.2.0.1 [Release 11.2]
Information in this document applies to any platform.

Symptoms


Getting following errors in the exadata environment running 11.2.0.1 database:

Errors in file /u01/app/oracle/diag/rdbms/test/test2/trace/test2_ora_22485.trc (incident=242913):
ORA-00600: internal error code, arguments: [kssadpm1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/test/test2/incident/incdir_242913/test2_ora_22485_i242913.trc
Errors in file /u01/app/oracle/diag/rdbms/test/test2/trace/test2_ora_22485.trc (incident=242914):
ORA-00600: internal error code, arguments: [kfddsGet03], [56861], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/test/test2/incident/incdir_242914/test2_ora_22485_i242914.trc

Changes

No recent changes

Cause

IOs are fenced even before the txn state object gets deleted, and deleting it needs to perform IOs in order to do the txn rollback. This is causing the error.

ORA-600[kssadpm1] is raised because of bug 9750033
The second error, ORA-600[kfddsGet03], is caused by ORA-600[kssadpm1].


We can conclude on the bug 9750033 based on the following criteria:

1. Callstack matches as follows:
kssadpm 
ksz_gen_reid 
kfddsGet 
kfioTranslateIO 
kfioRqSetPrepare 

2. Problematic state object is 'ksz parent'

This can be verified from the trace file.
Example:
SO: 0x6cbc55f40, type: 22, owner: (nil), flag: -/FLST/-/0x00 if: 0x0 c: 0x0
proc=(nil), name=ksz parent, file=ksz2.h LINE:394, pg=0
Dump of memory from 0x00000006CBC55F40 to 0x00000006CBC55F98
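The two criteria above (the call stack order and the 'ksz parent' state object) can be checked against a trace with a few lines of Python. This is a rough sketch under assumptions: the function names are hypothetical, the stack is assumed to have been split into a list of function names already (trace file layouts vary), and the frames are only required to appear in the listed order, not to be adjacent.

```python
# Frame order as listed in the note above.
EXPECTED_FRAMES = [
    "kssadpm",
    "ksz_gen_reid",
    "kfddsGet",
    "kfioTranslateIO",
    "kfioRqSetPrepare",
]

def frames_in_order(stack_frames, expected=EXPECTED_FRAMES):
    """True if the expected frames occur in stack_frames in the same order.

    Other frames may appear in between (ordered subsequence check).
    """
    remaining = iter(stack_frames)
    # Each any() call consumes the iterator up to the matching frame,
    # so the next expected frame must appear later in the stack.
    return all(any(frame == want for frame in remaining) for want in expected)

def has_ksz_parent_state_object(trace_text):
    """True if the trace text contains a state object named 'ksz parent'."""
    return "name=ksz parent" in trace_text
```

Both checks passing would only indicate that the trace resembles the pattern described in the note; confirmation of the bug still has to come from Oracle Support.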



Solution


The impact of the bug is a process failure. As the error comes from a server process (not a background process), there is no effect at the instance level and no corruption either.
Only the process encountering the error will be terminated. Also, the error occurs during normal server process exit, so the impact is minimal.

Fix included in 11.2.0.1 BP12. 
Other option is to upgrade to 11.2.0.2 or above which includes the fix.
