mysql os waits 6437_ora-00028 – 搜索结果 – 提供7*24专业数据库(Oracle,SQL Server,MySQL等)恢复和Oracle技术服务@Tel:+86 1...

今天一朋友让我帮忙分析他们的9.2.0.2 rac 节点2异常down掉原因,相关信息如下:

前提信息

OS:HP-UX B.11.31

DB:9.2.0.2.0 RAC

节点2alert日志信息

Fri Sep 7 13:13:49 2012

ARC0: Completed archiving log 11 thread 2 sequence 11651

Fri Sep 7 13:31:56 2012

Errors in file /oracle/admin/agent/udump/agent2_ora_797.trc:

ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []

ORA-00028: your session has been killed

Fri Sep 7 13:31:58 2012

Errors in file /oracle/admin/agent/bdump/agent2_pmon_5938.trc:

ORA-30013: undo tablespace 'UNDOTBS1' is currently in use

Fri Sep 7 13:31:58 2012

PMON: terminating instance due to error 30013

Fri Sep 7 13:31:58 2012

Errors in file /oracle/admin/agent/bdump/agent2_lms7_6033.trc:

ORA-30013: undo tablespace '' is currently in use

Fri Sep 7 13:31:58 2012

…………

Errors in file /oracle/admin/agent/bdump/agent2_lms0_6027.trc:

ORA-30013: undo tablespace '' is currently in use

Fri Sep 7 13:31:58 2012

System state dump is made for local instance

Fri Sep 7 13:32:03 2012

Instance terminated by PMON, pid = 5938

Fri Sep 7 14:34:35 2012

这里可以看到因为ORA-30013的错误使得pmon进程异常,从而使得该rac的节点2 down掉.同时这里还发现了ORA-00600[kgavsd_3]错误,是否是因为该ORA-600导致了数据库异常down还是一个偶然机会,我们继续分析

查看ORA-600[kgavsd_3]相关trace文件

*********START PLSQL RUNTIME DUMP************

***Got ORA-28 while running PLSQL***

***********END PLSQL RUNTIME DUMP************

*** 2012-09-07 13:31:56.740

ksedmp: internal or fatal error

ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []

ORA-00028: your session has been killed

Current SQL statement for this session:

--用户补档

DECLARE

OUT_ERR_CODE NUMBER;

OUT_ERR_MSG VARCHAR2(1000);

V_COUNT NUMBER;

BEGIN

WHILE TRUE LOOP

SELECT COUNT(*) INTO V_COUNT FROM amc_stat_log where proc_name in('pRunOdsChannelWareData')

and run_param=201208 AND STATE='A';

IF V_COUNT>0 THEN

dbms_output.put_line('exit loop '|| sysdate);

EXIT;

END IF;

sys.Dbms_Lock.sleep(600);

dbms_output.put_line('wake up '|| sysdate);

END LOOP;

PKG_AME_ODS_DATA.P_Add_TO_AgentServ(201208,OUT_ERR_CODE,OUT_ERR_MSG);

PKG_AME_ODS_DATA.P_Update_Serv_Ware_ID(201208, OUT_ERR_CODE, OUT_ERR_MSG);

PKG_AMS_SETTLE.P_COMMISION_51PRE_FLAG(201208,OUT_ERR_CODE, OUT_ERR_MSG);

END;

----- PL/SQL Call Stack -----

object line object

handle number name

c000000e84ec59d0 14 anonymous block

----- Call Stack Trace -----

calling call entry argument values in hex

location type point (? means dubious value)

-------------------- -------- -------------------- ----------------------------

Cannot find symbol in .

Cannot find symbol in .

Cannot find symbol in .

ksedmp()+512 call 9fffffffffff3940 000000000 ?

C000000000000A17 ?

40000000025B6540 ?

ksfdmp()+64 call 9fffffffffff3940 000000003 ?

kgerinv()+352 call 9fffffffffff3940 60000000000466B0 ?

000000003 ?

C000000000000714 ?

4000000004EBA6A0 ?

00001821B ?

6000000000468EA8 ?

kgesinv()+48 call 9fffffffffff3940 60000000000466B0 ?

600000000059BD98 ?

600000000046B070 ?

60000000000179C0 ?

6000000000017950 ?

kgesin()+112 call 9fffffffffff3940 60000000000466B0 ?

600000000059BD98 ?

4000000000B44C10 ?

000000001 ?

9FFFFFFFFFFF4310 ?

$cold_kgavsd_stackl call 9fffffffffff3940 60000000000466B0 ?

et_done()+1184 600000000059BD98 ?

4000000000B44C10 ?

000000001 ? 000000000 ?

000000000 ? 000000000 ?

00001FE19 ?

pbesd_stacklet_done call 9fffffffffff3940 60000000000466B0 ?

()+96 000000002 ? 000000000 ?

9FFFFFFFBEC6AE68 ?

pfrrun()+3328 call 9fffffffffff3940 9FFFFFFFBEC6AE68 ?

C000000000001D42 ?

4000000001ABBDA0 ?

9FFFFFFFBEC6B06C ?

9FFFFFFFFFFF64F0 ?

6000000000468EA8 ?

000000000 ? 000000000 ?

peicnt()+480 call 0000000000000000 9FFFFFFFBEC6AEE8 ?

C000000000000996 ?

40000000019F3680 ?

kkxexe()+816 call 0000000000000000 9FFFFFFFFFFF64F0 ?

9FFFFFFFBEC6AE68 ?

C00000000000099B ?

4000000001DD90F0 ?

00000FE4F ?

9FFFFFFFFFFF5F00 ?

60000000000467F0 ?

4000000000B603F0 ?

opiexe()+11168 call 0000000000000000 000000000 ?

C000000000002C60 ?

4000000001BEF980 ?

00000800F ?

9FFFFFFFFFFF6470 ?

9FFFFFFFBEC6AEB2 ?

6000000000040900 ?

9FFFFFFFBEC6B534 ?

opiall0()+3184 call _etext_f()+23058430 000000004 ? 0000000C4 ?

09110686928 9FFFFFFFFFFF7B40 ?

C000000000002BDF ?

4000000001B26CD0 ?

000000000 ? 00000C893 ?

9FFFFFFFFFFF6690 ?

Cannot find symbol in .

kpoal8()+2064 call 9fffffffffff7ad0 000000001 ?

9FFFFFFFFFFF8304 ?

FFFFFFFFBFFFFFFF ?

9FFFFFFFFFFF83E4 ?

FFFFFFFFFFE7FBDF ?

9FFFFFFFFFFF7B88 ?

000000000 ?

6000000000474528 ?

opiodr()+3584 call 9fffffffffff81fc 6000000000040950 ?

000000000 ? 000000000 ?

C000000000002C60 ?

4000000001C09FE0 ?

00000C50B ?

9FFFFFFFFFFF81F0 ?

9FFFFFFFFFFF81D8 ?

ttcpip()+3776 call _etext_f()+23058430 00000005E ? 000000014 ?

09114957288 9FFFFFFFFFFFA5F0 ?

6000000000040918 ?

C000000000001ABD ?

4000000001AB3BA0 ?

000000000 ? 00000C59B ?

opitsk()+1872 call 9fffffffffffa200 6000000000049FC0 ?

000000001 ?

9FFFFFFFFFFFA5F0 ?

000000001 ?

9FFFFFFFFFFFA740 ?

9FFFFFFFFFFFA564 ?

9FFFFFFFBF780058 ?

000000000 ?

opiino()+3184 call 000000000000057b 000000000 ? 000000000 ?

C00000000000132B ?

4000000001F78730 ?

000008001 ?

opiodr()+3584 call 0000000000000000 6000000000548A38 ?

4000000000B606F0 ?

6000000000548A38 ?

C000000000002C60 ?

4000000001C09FE0 ?

00000A201 ?

9FFFFFFFFFFFBC90 ?

4000000000B606F0 ?

opidrv()+976 call _etext_f()+23058430 00000003C ? 000000004 ?

09114957288 9FFFFFFFFFFFEFB0 ?

6000000000040918 ?

sou2o()+80 call _etext_f()+23058430 000000004 ? 000000004 ?

09114957288 9FFFFFFFFFFFEFB0 ?

main()+352 call _etext_f()+23058430 9FFFFFFFFFFFEFD0 ?

09114957288 9FFFFFFFFFFFEFD4 ?

60000000004744F0 ?

9FFFFFFFFFFFEFB0 ?

main_opd_entry()+80 call _etext_f()+23058430 000000000 ?

09114957288 9FFFFFFFFFFFF498 ?

C000000000000004 ?

C00000000002BE30 ?

--------------------- Binary Stack Dump ---------------------

Process global information:

process: c000000d6428c0c0, call: c000000e46e772a8, xact: 0000000000000000,

curses: c000000d6437d020, usrses: c000000d6437d020

----------------------------------------

SO: c000000d6428c0c0, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00

(process) Oracle pid=282, calls cur/top: c000000e46e772a8/c000000e46e772a8, flag: (0) -

int error: 28, call error: 0, sess error: 0, txn error 0

(post info) last post received: 0 0 0

last post received-location: No post

last process to post me: c000000d64234f18 1 6

last post sent: 0 0 104

last post sent-location: kglpsl: in loop

last process posted by me: c000000d6428e900 23 0

(latch info) wait_event=0 bits=0

Process Group: DEFAULT, pseudo proc: c000000d62234ee0

O/S info: user: oracle, term: UNKNOWN, ospid: 797

OSD pid info: Unix process pid: 797, image: oracle@gzagent2 (TNS V1-V3)

----------------------------------------

SO: c000000d6437d020, type: 4, owner: c000000d6428c0c0, flag: INIT/-/-/0x00

(session) trans: 0000000000000000, creator: c000000d6428c0c0, flag: (41) USR/- BSY/-/-/-/KIL/-

DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000

txn branch: 0000000000000000

oct: 0, prv: 0, sql: c000000e76b03f10, psql: 0000000000000000, user: 31/CUSTOM

O/S info: user: huangqianhai_lc, term: SVCTAG-D1MLV2X, ospid: 11124:11796, machine: WORKGROUP\SVCTAG-D1MLV2X

program: plsqldev.exe

application name: PL/SQL Developer, hash value=1190136663

action name: 测试窗口 - 新建, hash value=3604520210

last wait for 'null event' blocking sess=0x0 seq=142 wait_time=567341620

=ea60, =0, =0

temporary object counter: 0

----------------------------------------

通过这里可以看出来是因为pl/sql dev进行一个plsql的操作导致该错误发生,查询MOS[ID 403575.1]发现

Applies to:

Oracle Server - Enterprise Edition - Version: 9.2.0.7 and later [Release: 9.2 and later ]

Information in this document applies to any platform.

***Checked for relevance on 27-Oct-2010***

Symptoms

The following errors appears in the alert log file :

Probe:read_pipe: receive failed, status 3

Probe:S:debug_loop: timeout. Action 1

*********START PLSQL RUNTIME DUMP************

***Got ORA-604 while running PLSQL***

***********END PLSQL RUNTIME DUMP************

ksedmp: internal or fatal error

ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []

ORA-00604: error occurred at recursive SQL level 2

Current SQL statement for this session:

begin :id := sys.dbms_transaction.local_transaction_id; end;

.

Cause

The ora-600 kgavsd_3 appears when calling kgavsd_stacklet_done function which is related to PLSQL DEBUG.

From traces, dbms_debug package is being used during trace generation.

The return code of 3 further indicates that the dbms_pipe message was interrupted.

Probably user cancelled a plsql program, so the appeared while trying to dump the stack

Solution

There is no data corruption over here.

The error appears to be due to abnormal termination of aPL/SQL Developer application while executing a PL/SQL block.

Changing the PL/SQL and/or the procedure code could help in avoiding this error message.

Hence, this error can be safely ignored.

查找trace文件确实发现有name=SYS.DBMS_DEBUG,进一步表明该错误是由于plsql dev工具使用debug模式运行上面的plsql而引起该错误的发生,但是因为mos中记录和错误不是完全的一致,所以不能十分确定是该错误导致数据库down掉

继续分析ORA-30013

Error: ORA-30013 (ORA-30013)

Text: undo tablespace '%s' is currently in use

---------------------------------------------------------------------------

Cause: the specified undo tablespace is currently used by another instance.

Action: Wait for the undo tablespace to become available or change to another name and reissue the statement.

这个说明是没有疑问的:因为2节点配置的当前undo是UNDOTBS2,而UNDOTBS1是1节点使用的,证明这里的undo确实发生了错误,继续查询mos发现Bug 3368552

Hdr: 3368552 9.2.0.3 RDBMS 9.2.0.3 RAC PRODID-5 PORTID-23

Abstract: RAC: ORA-30013 WHEN INSTANCE 2 ATTEMPTS TO ACCESS UNDO TABLESPACE OF INSTANCE 1

*** 01/12/04 06:21 am ***

TAR:

----

3554549.995

PROBLEM:

--------

The RAC database has been stable, but experienced an instance termination due

to ORA-30031 error in the alert log (instance 2):

...

Tue Dec 23 03:01:46 2003

ARC1: Evaluating archive log 4 thread 2 sequence 1116

ARC1: Beginning to archive log 4 thread 2 sequence 1116

Creating archive destination LOG_ARCHIVE_DEST_1:

'/oracle/oradata/VLDB/logs/archives/VLDBN2/VLDB_0000001116_0002.arc'

ARC1: Completed archiving log 4 thread 2 sequence 1116

Tue Dec 23 08:14:09 2003

Errors in file /oracle/admin/VLDB/bdump/vldbn2_pmon_22860.trc:

ORA-30013: undo tablespace 'UNDOTBS1' is currently in use

Tue Dec 23 08:14:09 2003

PMON: terminating instance due to error 30013

Tue Dec 23 08:14:10 2003

System state dump is made for local instance

Tue Dec 23 08:14:12 2003

Trace dumping is performing id=[cdmp_20031223081410]

Tue Dec 23 08:14:14 2003

Instance terminated by PMON, pid = 22860

Instance 1 alert log shows only the reconfiguration and the cdump info:

..

Tue Dec 23 03:54:13 2003

Errors in file /oracle/admin/VLDB/udump/vldbn1_ora_13564.trc:

ORA-27037: unable to obtain file status

SVR4 Error: 2: No such file or directory

Additional information: 4

Tue Dec 23 08:14:10 2003

Trace dumping is performing id=[cdmp_20031223081410]

Tue Dec 23 08:14:12 2003

Reconfiguration started

List of nodes: 0,

Global Resource Directory frozen

one node partition

Communication channels reestablished

...

因为在9.2.0.3的RAC中有着该bug,那么我们可以大胆猜测在9.2.0.2中应该存在该bug,那么结合上面的ORA-00600[kgavsd_3]错误,我们大概还原该事故的全部:

1.节点1 dml操作了程序中报错的plsql中要范围的部分表对象,但是未提交(或者正在执行)

2.节点2 有用户使用pl/sql dev去执行程序中的plsql,因为是debug模式执行,需要UNDOTBS1的块来构建cr,从而使得节点2去访问UNDOTBS1,引发了Bug 3368552 从而使得数据库直接kill掉该plsql dev会话,进而出现ORA-00600[kgavsd_3]错误和pmon进程异常使得节点2 down掉

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值