记录一次ORA-00600 [kdsgrp1]导致的large tracefile的处理过程

1.问题描叙

收到check_mk告警,约在15:17左右硬盘空间消耗暴增
在这里插入图片描述

2 查因过程

2.1 查看磁盘空间状况

u1很快被耗尽

[oracle@gsierp ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              45G   16G   27G  38% /
/dev/sda1             2.0G   52M  1.8G   3% /boot
tmpfs                  16G   16K   16G   1% /dev/shm
/dev/sdd               99G   78G   16G  84% /u3
/dev/sde               99G   30G   64G  33% /u4
/dev/sdb1             197G  196G  114M 100% /u1
/dev/sdc1              74G   68G  2.6G  97% /u2

2.2 检查表空间增长情况

从输出结果来看,并没有异常增长,看来问题不在数据库数据文件

select a.snap_id,a.dbid,a.tablespace_id,a.tablespace_size max_tablespace_size,b.tablespace_size min_tablespace_size,
a.tablespace_usedsize max_tablespace_usedsize,b.tablespace_usedsize min_tablespace_usedsize,
round(a.tablespace_size-b.tablespace_size,2) diff_tablespace_size,
round((a.tablespace_usedsize-b.tablespace_usedsize)*8/1024/1024,2) diff_tablespace_usedsize
from 
(select su.snap_id,su.dbid,su.tablespace_id,su.tablespace_size,su.tablespace_usedsize from dba_hist_tbspc_space_usage su
join dba_hist_snapshot sn on su.snap_id=sn.snap_id
join (select MAX(SNAP_ID) MAX_SNAP_ID,MIN(SNAP_ID) MIN_SNAP_ID from dba_hist_snapshot where to_char(begin_interval_time,'YYYY-MM-DD HH24:MI:SS') LIKE '2023-08-29%') msn
on su.snap_id=msn.max_snap_id) a,
(select su.snap_id,su.dbid,su.tablespace_id,su.tablespace_size,su.tablespace_usedsize from dba_hist_tbspc_space_usage su
join dba_hist_snapshot sn on su.snap_id=sn.snap_id
join (select MAX(SNAP_ID) MAX_SNAP_ID,MIN(SNAP_ID) MIN_SNAP_ID from dba_hist_snapshot where to_char(begin_interval_time,'YYYY-MM-DD HH24:MI:SS') LIKE '2023-08-29%') msn
on su.snap_id=msn.min_snap_id) b
where a.dbid=a.dbid and a.tablespace_id=b.tablespace_id
  SNAP_ID       DBID TABLESPACE_ID MAX_TABLESPACE_SIZE MIN_TABLESPACE_SIZE MAX_TABLESPACE_USEDSIZE MIN_TABLESPACE_USEDSIZE DIFF_TABLESPACE_SIZE DIFF_TABLESPACE_USEDSIZE
---------- ---------- ------------- ------------------- ------------------- ----------------------- ----------------------- -------------------- ------------------------
     76920 2922139705             0               92800               92800                   92344                   92344                    0                        0
     76920 2922139705             1             4971008             4971008                   67408                   55632                    0                      .09
     76920 2922139705             2              179200              179200                  160072                  160560                    0                        0
     76920 2922139705             4              145440              145440                  144776                  144776                    0                        0
     76920 2922139705             5            11308928            11308928                10512584                10512584                    0                        0
     76920 2922139705             6             1876224             1876224                 1544152                 1544152                    0                        0
     76920 2922139705             8             3055360             3055360                 2682048                 2682048                    0                        0
     76920 2922139705             9             1462016             1462016                 1179416                 1179416                    0                        0
     76920 2922139705            11            12390896            12390896                12068792                12022072                    0                      .36
     76920 2922139705            12             1420800             1420800                  976688                  976688                    0                        0
     76920 2922139705            14             2706560             2706560                 2659040                 2659040                    0                        0

   SNAP_ID       DBID TABLESPACE_ID MAX_TABLESPACE_SIZE MIN_TABLESPACE_SIZE MAX_TABLESPACE_USEDSIZE MIN_TABLESPACE_USEDSIZE DIFF_TABLESPACE_SIZE DIFF_TABLESPACE_USEDSIZE
---------- ---------- ------------- ------------------- ------------------- ----------------------- ----------------------- -------------------- ------------------------
     76920 2922139705            15              749440              749440                  735440                  735440                    0                        0
     76920 2922139705            16                1280                1280                     232                     232                    0                        0
     76920 2922139705            17             1310720             1310720                       0                       0                    0                        0

2.3 检查$ORACLE_BASE相关文件

最终检查到udump目录的大小达27G之多

[oracle@gsierp pyia]$ du -sh $ORACLE_BASE/admin/pyia/*
63M     ./adump
20G     ./adump.old
364M    ./bdump
24K     ./cdump
27G     ./udump
[oracle@gsierp udump]$ ls -alh --full-time|grep "2023-08-29"
-rw-r----- 1 oracle oinstall   25G 2023-08-29 15:49:52.771222574 +0700 pyia_ora_11160.trc

2.4查看trace file内容

由于文件size太大,先使用head命令查看头部信息

[oracle@gsierp udump]$ head -n 100 pyia_ora_11160.trc |more
/u1/admin/pyia/udump/pyia_ora_11160.trc
Oracle Database 10g Release 10.2.0.5.0 - 64bit Production
ORACLE_HOME = /oracle/app/product/10.2.0
System name:    Linux
Node name:      gsierp
Release:        2.6.32-300.10.1.el5uek
Version:        #1 SMP Wed Feb 22 17:37:40 EST 2012
Machine:        x86_64
Instance name: pyia
Redo thread mounted by this instance: 1
Oracle process number: 56
Unix process pid: 11160, image: oracle@gsierp

*** 2023-08-29 15:02:41.341
*** ACTION NAME:() 2023-08-29 15:02:41.341
*** MODULE NAME:(app_sis_sap.exe) 2023-08-29 15:02:41.341
*** SERVICE NAME:(SYS$USERS) 2023-08-29 15:02:41.341
*** SESSION ID:(920.29479) 2023-08-29 15:02:41.341
            row 088e32c5.1 continuation at
            file# 34 block# 930501 slot 2 not found
**************************************************
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 2 ..... nrows: 15
**************************************************
*** 2023-08-29 15:02:41.341
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Current SQL statement for this session:
AND TRIM(NVL(TRIM(A.POSNR), TRIM(SIZE_NO))) = TRIM(DECODE(TRIM(B.POSNR),'000000', B.WO_SIZE
FROM SAP_WO A,BARUM(B.QTY) QTY,03', 'ZP04', 'ZP10', 'ZP14')) = 'REL'
SELEN ALLA.AUFNRIN ('ZP02', 'ZP03', 'ZP04', 'ZP10', 'ZP14')) = 'REL'
AND NOT A.AUART IN ('ZP02', 'ZP03', 'ZP04', 'ZP10ASTNR,'N')) = 'REL'
AND NOTSTR(A.PLNBEZ, 1, 2) = 'FG' AND TRIM(NVL(A.ASTNR,'N')) = 'REL'
AND SUBSTR(A.PLNBEZ, 1, 2) = 'FG'  B.PROD_BATCH ='10422479-0010'
/*ERE A.AUFNR = B.AUFNR(+), 'ZP03', 'ZP04', 'ZP10', 'ZP14')
SELECT A.AUFNR,A.WO_SIZE,A.GAMNG,B.QTY BAR_QTY
  FROM
  (SELECT B.AUFNR, B.WO_SIZE WO_SIZE, SUM(B.GAMNG) GAMNG
     FROM ODRD A, SAP_WO B
    WHERE A.FACT_ODR_NO = B.PROD_BATCH
      AND TRIM(NVL(TRIM(A.POSNR), TRIM(SIZE_NO))) = TRIM(DECODE(TRIM(B.POSNR),'000000', B.WO_SIZE, B.POSNR))
      AND B.PROD_BATCH = :AS_PROD_BATCH
      AND SUBSTR(B.PLNBEZ, 1, 2) = :AS_KIND  AND TRIM(NVL(B.ASTNR,'N')) = 'REL'
      AND NOT B.AUART IN ('ZP02', 'ZP03', 'ZP04', 'ZP10', 'ZP14')
   GROUP BY B.AUFNR, B.WO_SIZE) A,
  (SELECT A.AUFNR,SUM(B.QTY) QTY
     FROM SAP_WO A,BARM_1 B
    WHERE A.AUFNR = B.AUFNR
      AND A.PROD_BATCH = :AS_PROD_BATCH
      AND SUBSTR(A.PLNBEZ, 1, 2) = :AS_KIND  AND TRIM(NVL(A.ASTNR,'N')) = 'REL'
      AND NOT A.AUART IN ('ZP02', 'ZP03', 'ZP04', 'ZP10', 'ZP14')
   GROUP BY A.AUFNR
UNION ALL
   SELECT A.AUFNR,SUM(B.QTY) QTY
     FROM SAP_WO A,BARM_3 B
    WHERE A.AUFNR = B.AUFNR
      AND A.PROD_BATCH = :AS_PROD_BATCH 
      AND SUBSTR(A.PLNBEZ, 1, 2) = :AS_KIND  AND TRIM(NVL(A.ASTNR,'N')) = 'REL'
      AND NOT A.AUART IN ('ZP02', 'ZP03', 'ZP04', 'ZP10', 'ZP14')
   GROUP BY A.AUFNR
   UNION ALL
   SELECT A.AUFNR,SUM(B.QTY) QTY
     FROM SAP_WO A,BARM_4 B
    WHERE A.AUFNR = B.AUFNR
      AND A.PROD_BATCH = :AS_PROD_BATCH
      AND SUBSTR(A.PLNBEZ, 1, 2) = :AS_KIND  AND TRIM(NVL(A.ASTNR,'N')) = 'REL'
      AND NOT A.AUART IN ('ZP02', 'ZP03', 'ZP04', 'ZP10', 'ZP14')
   GROUP BY A.AUFNR
   UNION ALL
   SELECT A.AUFNR,SUM(B.QTY) QTY
     FROM SAP_WO A,BARM_5 B
    WHERE A.AUFNR = B.AUFNR
      AND A.PROD_BATCH = :AS_PROD_BATCH
      AND SUBSTR(A.PLNBEZ, 1, 2) = :AS_KIND  AND TRIM(NVL(A.ASTNR,'N')) = 'REL'
      AND NOT A.AUART IN ('ZP02', 'ZP03', 'ZP04', 'ZP10', 'ZP14')
   GROUP BY A.AUFNR) B
 WHERE A.AUFNR = B.AUFNR(+)
/*
SELECT B.AUFNR,B.WO_SIZE WO_SIZE,SUM(B.GAMNG) GAMNG,
       NVL(SUM((SELECT SUM(QTY) FROM BARM_1 Z WHERE Z.AUFNR=B.AUFNR)),0)+
       NVL(SUM((SELECT SUM(QTY) FROM BARM_3 Z WHERE Z.AUFNR=B.AUFNR)),0)+
       NVL(SUM((SELECT SUM(QTY) FROM BARM_4 Z WHERE Z.AUFNR=B.AUFNR)),0)+
       NVL(SUM((SELECT SUM(QTY) FROM BARM_5 Z WHERE Z.AUFNR=B.AUFNR)),0) BAR_QTY 
FROM ODRD A,SAP_WO B
WHERE A.FACT_ODR_NO       = B.PROD_BATCH   AND 
                TRIM(NVL(TRIM(A.POSNR),TRIM(SIZE_NO)))=TRIM(DECODE(TRIM(B.POSNR),'000000',B.WO_SIZE,B.POSNR)) AND
           B.PROD_BATCH        =:AS_PROD_BATCH  AND 
      SUBSTR(B.PLNBEZ,1,2)=:AS_KIND        AND
                b.AUART NOT IN ('ZP02','ZP03','ZP04','ZP10','ZP14') 
GROUP BY B.AUFNR,B.WO_SIZE 
*/

从输出看出:
file# 34 block# 930501 slot 2 not found
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
触发错误的是一个select查询语句,由此推断,select在执行过程中,遇到file# 34 block# 930501 slot 2 not found,导致ORA-00600错误
这很可能是一个物理错误或者逻辑错误

3.处理过程

3.1数据文件检查

先对涉及的file# 34作一个dbv检验,看看是否有物理损坏

[oracle@gsierp ~]$ dbv userid='system/"XXX123OSrr07g4KaFLCCs7c/"'  file='/u4/oradata/pyia/sisdata03.dbf'  blocksize=8192

DBVERIFY: Release 10.2.0.5.0 - Production on Wed Aug 30 15:58:47 2023
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u4/oradata/pyia/sisdata03.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 1937920
Total Pages Processed (Data) : 965136
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 793739
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 172036
Total Pages Processed (Seg)  : 354
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 6655
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 335941877 (2475.335941877)

从检查结果来看,数据文件正常,没有发现问题

3.2 表及索引段检查

先定位出file# 34 block# 930501 slot 2 not found所涉及的表

SQL> select * from dba_extents where file_id=34 and 930501 between block_id and block_id+blocks;

OWNER                SEGMENT_NAME                   PARTITION_NAME                 SEGMENT_TYPE       TABLESPACE_NAME       EXTENT_ID    FILE_ID   BLOCK_ID      BYTES     BLOCKS RELATIVE_FNO
-------------------- ------------------------------ ------------------------------ ------------------ -------------------- ---------- ---------- ---------- ---------- ---------- ------------
PGDSISSAP            SAP_WO                                                        TABLE              SIS_DATA                      2         34     929801   41943040       5120           34

分析涉事的表及其索引,结果也正常

SQL> analyze table pgdsissap.sap_wo validate structure cascade;
Table analyzed.

清空buffer_cache后,做一次全表扫描涉事的表,还是正常

ALTER SYSTEM FLUSH BUFFER_CACHE;
select * from pgdsissap.sap_wo
 COUNT(*)
----------
    298678

4.结论:

最后,结合整理网上网友观点:
坏块可能出现再内存中,经过一段时间,buffer cache中的坏块内容已经被age out(老化),所以已经无法重现(甚至不用做任何操作)

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值