oracle avl 导入,使用ErrorStack进行Oracle错误跟踪及诊断

在使用Oracle数据库的过程中,可能会遇到各种各样的错误或异常,很多异常的提示并不具体,我们有必要了解一下Oracle的ErrorStack跟踪方式。

68782343_1

ErrorStack是Oracle提供的一种对于错误堆栈进行跟踪的方法,通过设置跟踪可以将一些错误的后台信息详尽地转储出来,写入跟踪文件,对于错误的研究与诊断非常有效。

设置ErrorStack主要有4个级别:0 - 仅转储错误堆栈(0级已经被逐渐废弃)

1-转储错误堆栈和函数调用堆栈

2-Level1 + ProcessState

3-Level2 + Context area(显示所有cursors,着重显示当前cursor)

ErrorStack可以在实例级或会话级别设置,也可以在参数文件中设置,这个设置仅当某个特定的错误出现时才被触发,如设置ORA-00942事件的跟踪:

alter session set events '942 trace name errorstack level 1';

一个客户曾经出现如下ORA-01438错误,提示数据的精度超过允许值,是后台JOB调度的任务:

Mon Jul 13 10:27:312009

Errors in file /admin/erpdb/bdump/erpdb1_j000_447020.trc:

ORA-12012: error on auto execute of job 22

ORA-01438: value larger than specified precision allowed for this column

ORA-06512: at 'ERP.TIMRDU', line 13

ORA-06512: at line 1

跟踪文件中默认的不会记录具体的SQL、绑定变量等信息,我们可以通过ErrorStack进行后台跟踪,获得更详细的信息,执行如下代码中的SQL:

alter system set events='1438 trace name errorstack forever,level 3';

然后可以手工执行出错的存储过程(代码如下),获得跟踪文件,再关闭跟踪:

alter system set events='1438 trace name errorstack off';

在Oracle 10g中,这样的操作会被记录到日志文件中:

Mon Jul 13 10:48:392009

OS Pid: 541528 executed alter system set events '1438 trace name Errorstack forever,level 3'

Mon Jul 13 10:56:06 2009

Errors in file /admin/erpdb/udump/erpdb1_ora_267056.trc:

ORA-01438: value larger than specified precision allowed for this column

Mon Jul 13 10:56:08 2009

Trace dumping is performing id=[cdmp_20090713105608]

Mon Jul 13 10:57:15 2009

OS Pid: 541528 executed alter system set events '1438 trace name Errorstack off'

接下来分析获得的跟踪文件,就可以获得SQL文本线索,找到根本问题。在这个案例中,我们得到的跟踪文件,其关键SQL内容如下,通过这个SQL对照数据表很快就找到了精度超过的Number型字段:

***SESSION ID:(857.16304) 2009-07-13 10:56:06.429

*** 2009-07-13 10:56:06.429

ksedmp: internal or fatal error

ORA-01438: value larger than specified precision allowed for this column

Current SQL statement for this session:

INSERT INTO CONTRAPAYM (IHCODE,GTICODE,IDX,HCODE,PORDATE,FCODE,FCY,CCODE,ECODE,FLAG,MATCHFLAG,BCODE,MONTHZL,STATUS,AUTOFLAG ,REMARK) SELECTDISTINCT IHCODE,'0000000000000000' GTICODE,(IDX+100) IDX,HCODE,:B3 PORDATE,FCODE,:B2FCY,CCODE,ECODE, FLAG,0 MATCHFLAG,BCODE,MONTHZL,STATUS,4 AUTOFLAG ,'Proc_AdjustContractpayment' REMARK FROM CONTRAPAYM WHERE IHCODE=:B1 ANDIDX=(SELECT MAX(IDX) FROM CONTRACTPAYMENT WHERE IHCODE=:B1 ) AND ROWNUM=1

----- PL/SQL Call Stack -----

object      line object

handle    number name

7000002ca366e80       100   procedure ERP.PROC_ADJCONTRAPAYM

700000336a1a070       236   procedure ERP.PROC_AUTOBATPROC

7000002ca367df0         5   procedure ERP.TIMRDU

700000342eb7c20         1   anonymous block

在跟踪文件中,还有大量的堆栈信息,对于复杂的问题,还可以通过进一步细致的堆栈分析进行深入追踪。

可以很容易地测试这一功能的使用,比如使用如下代码中的测试过程:

SQL>  alter system set events='1438 trace name errorstack forever,level 3';

SQL> connect eygle/eygle

SQL> create table t (name varchar2(10),id number);

Table created.

SQL> insert into t values(a,1);

insert into t values(a,1)

*

ERROR at line 1:

ORA-00984: column not allowed here

SQL> alter system set events '984 trace name errorstack off';

在告警日志文件中就可以获得如下信息:

Mon Jul 13 22:55:592009

OS Pid: 2431 executed alter system set events '984 trace name ERRORSTACK level3'

Mon Jul 13 22:59:12 2009

Errors in file /opt/oracle/admin/mmstest/udump/mmstest_ora_2520.trc:

ORA-00984: column not allowed here

Mon Jul 13 23:01:01 2009

OS Pid: 2431 executed alter system set events '984 trace name errorstack off'

获得的跟踪文件里记录了insert的相关信息:

*** 2009-07-13 22:59:12.928

ksedmp: internal or fatal error

ORA-00984: column not allowed here

Current SQL statement for this session:

insert into t values(a,1)

----- Call Stack Trace -----

calling                 call      entry                   argument values in hex

location                type     point                    (?means dubious value)

--------------------    --------  --------------------     ----------------------------

ksedst()+27             all       ksedst1()                0? 1 ?

以下引用在ITPUB上的一个讨论,是使用ErrorStack跟踪解决问题的典型案例(原文链接指向ITPUB):

问题概述

在进行数据库例行维护时,遇到数据库正常导出,但导入时遇到有张表不能导入的问题。这张表存储工艺路线,大约共有160万条记录。表的不能导入影响一部分功能的实现,各单位反映强烈。

初步分析

首先排除是系统bug所带来的问题,因为前期导入导出未发现问题;另外由于是对同一数据库进行的导入导出,也排除了字符集问题。

单独再次导入此表,出现错误如下;[oracle@zhx25 ~]$ imp system/ceshi fromuser=dba_mgr touser=dba_mgr tables=t_routdef ignore=y file=1.dmp

Import: Release 10.2.0.1.0 - Production on Tue Mar 18 14:19:49 2008

Copyright (c) 1982, 2005, Oracle.All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production

With the Partitioning, OLAP and Data Mining options

Export file created by EXPORT:V10.02.01 via conventional path

import done in ZHS16GBK character set and ZHS16GBK NCHAR character set

. importing DBA_MGR's objects into DBA_MGR

. . importing table 'T_ROUTDEF'

IMP-00058: ORACLE error 1438 encountered

ORA-01438: value larger than specified precision allowed for this column

IMP-00028: partial import of previous table rolled back: 20501 rows rolled back

解决方向:有可能是数据中存在问题,很有可能是输入一些特殊字符,这些字符和ORACLE自身使用的通配符一致,所以导致系统不能识别。

问题解决

1、对问题进行跟踪

对系统设置1438 errorstack跟踪:SQL> alter system set events='1438 trace name Errorstack forever,level 10';

System altered

然后重新执行IMP;

2、查看UDUMP下日志,如下:/opt/oracle/admin/orcl/udump/orcl_ora_32355.trc

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production

With the Partitioning, OLAP and Data Mining options

ORACLE_HOME = /opt/oracle/product/10.2.0

System name:Linux

Node name:zhx25.cccc.com

Release:2.6.9-22.ELsmp

Version:#1 SMP Mon Sep 19 18:00:54 EDT 2005

Machine:x86_64

Instance name: orcl

Redo thread mounted by this instance: 1

Oracle process number: 17

Unix process pid: 32355, image: oracle@zhx25.cccc.com (TNS V1-V3)

*** SERVICE NAMESYS$USERS) 2008-03-18 17:11:02.014

*** SESSION ID135.949) 2008-03-18 17:11:02.014

*** 2008-03-18 17:11:02.014

ksedmp: internal or fatal error

ORA-01438: value larger than specified precision allowed for this column

Current SQL statement for this session:

INSERT /*+NESTED_TABLE_SET_REFS+*/ INTO 'T_ROUTDEF'('PRTNO', 'SEQNO', 'SEQDESC', 'PRTEFLAG', 'SEQCC', 'SEQNXTNO', 'SEQDEPT', 'WKCCODE', 'WKCALTCODE', 'SEQSETHR', 'LAB_SEQSETHR', 'SEQRUNHR', 'LAB_SEQRUNHR', 'SEQQUEHR', 'LAB_SEQQUEHR', 'SEQTRANHR', 'LAB_SEQTRANHR', 'SEQNOQLTRAT', 'SEQCHGDAT', 'SEQCHGRSN', 'SEQCHGCTLR', 'MACHCODE', 'MACHALTCODE', 'RESCODE', 'KEYSEQ', 'SEQCRYLOT', 'INITOVERRATE', 'COMWORK', 'SEQLOTFLAG', 'SEQEFDAT', 'SEQIEFDAT', 'SEQEFLOTNO', 'SEQIEFLOTNO', 'SEQECONO', 'PRTSEQCST', 'SEQSELFCST', 'INVYORN', 'SEQQTY', 'SEQNOTE', 'NOPRTAVELZ', 'OUTSEQFLAG', 'WXCST', 'SEQGSCHGCTLR', 'SEQGSCHGDAT', 'SEQSHCTL', 'SEQSHDAT', 'SEQSHSIGN', 'PRICE', 'ACCNO', 'ROUSTAT', 'FIXMFLAG', 'BANCL' VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, :15, :16, :17, :18, :19, :20, :21, :22, :23, :24, :25, :26, :27, :28, :29, :30, :31, :32, :33, :34, :35, :36, :37, :38, :39, :40, :41, :42, :43, :44, :45, :46, :47, :48, :49, :50, :51, :52)

----- Call Stack Trace -----

calling

call

entry

argument values in hex

location

type

point

(? means dubious value)

…………

oacdef = 6a9477a0

Offsi = 48, Offsi = 1440

Opcode = 2

Bind Twotask Scalar Sql In (may be out) Copy

oacdef = 6a9477d0

Offsi = 48, Offsi = 1472

Opcode = 2

Bind Twotask Scalar Sql In (may be out) Copy

oacdef = 6a947800

Offsi = 48, Offsi = 1504

Opcode = 2

Bind Twotask Scalar Sql In (may be out) Copy

oacdef = 6a947830

Offsi = 48, Offsi = 1536

Opcode = 2

Bind Twotask Scalar Sql In (may be out) Copy

oacdef = 6a947860

Offsi = 48, Offsi = 1568

Opcode = 2

Bind Twotask Scalar Sql In (may be out) Copy

oacdef = 6a947890

Offsi = 48, Offsi = 1600

Opcode = 2

Bind Twotask Scalar Sql In (may be out) Copy

oacdef = 6a9478c0

Offsi = 48, Offsi = 1632

kkscoacd

Bind#0

oacdty=01 mxl=32(24) mxlc=00 mal=00 scl=00 pre=00

oacflg=03 fl2=1000010 frm=01 csi=852 siz=3344 off=0

kxsbbbfp=2a974dc398

bln=32

avl=23

flg=05

value='?

Bind#1

oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00

oacflg=03 fl2=1000000 frm=00 csi=00 siz=0 off=32

kxsbbbfp=2a974dc3b8

bln=22 avl=22 flg=01

value=###

An invalid number has been seen.Memory contents are :

Dump of memory from 0x0000002A974CDE88 to 0x0000002A974CDE9E

2A974CDE80 3331312E 0406C102 [.113....]

2A974CDE90 C9BBC6C8 52024D01 30310248 00003304[.....M.RH.10.3..]

Bind#2

oacdty=01 mxl=32(30) mxlc=00 mal=00 scl=00 pre=00

oacflg=03 fl2=1000010 frm=01 csi=852 siz=0 off=56

kxsbbbfp=2a974dc3d0

bln=32

avl=00

flg=01

Bind#3

oacdty=01 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00

oacflg=03 fl2=1000010 frm=01 csi=852 siz=0 off=88

kxsbbbfp=2a974dc3f0

bln=32

avl=01

flg=01

value='M'

…………

从跟踪文件可以看到,问题是系统认为地址为2A974CDE88的数值非法,在导入会提示ORA-1438错误。本值在数据库中定义为number(8)类型,但从结果来看,系统把这个值当成了number(24)类型,并且由于格式不对,不能取出值。

3、找到DMP中相关数据

用winhex打开dmp文件,搜索十六进制串“2E313133 02C10604 C8C6BBC9 014D0252 48023130 0433F3FF”(对trace文件中的值要进行顺序调整)

4、对导出数据进行更改

由于“2E313133 02C10604 C8C6BBC9 014D0252 48023130 0433F3FF”不是一个有效的number类型,需要对它进行调整。调整如下,只要确保调整的值是number类型的即可。

5、对数据库相关数据项的类型进行调整

把数据项的类型由number(*)改为number(38);

6、重新进行数据导入[oracle@zhx25 ~]$ imp system/ceshi fromuser=dba_mgr touser=dba_mgr tables=t_routdef ignore=y file=1.dmp

Import: Release 10.2.0.1.0 - Production on Tue Mar 18 14:19:49 2008

Copyright (c) 1982, 2005, Oracle.All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production

With the Partitioning, OLAP and Data Mining options

Export file created by EXPORT:V10.02.01 via conventional path

import done in ZHS16GBK character set and ZHS16GBK NCHAR character set

. importing DBA_MGR's objects into DBA_MGR

. . importing table

'T_ROUTDEF'

1646404

本文通过介绍ErrorStack的应用,将这一常用的诊断工具介绍给大家,作为DBA,在遇到复杂问题时,通过这些手段深入分析,不仅有助于问题的解决,还将完善自己的知识体系,祝大家学习能够不断进步、深入。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值