生产故障:ADG备库通过索引查询数据报ORA-01555问题分析

IT数据库行业小学生,记录日常工作中数据库知识及一些故障案例,如有不对请指正,欢迎关注小编,小编微信xh870545795,CSDN:dba_notes

750a0f51452e65ad2c7f474ae2510c3b.jpeg


一、环境信息及现象

      客户一套oracle11.2.0.4.0 rac的adg环境,操作系统为AIX,在做完主备切换的时候备库alert日志经常出现ORA-01555错误,由于备库承担部分查询业务,已经影响到业务的正常进行。

二、故障分析及解决办法

2.2.1 初步分析

       通过alert日志发现,大部分ORA-01555报错时的Query Duration0或者很短的时间,初步感觉不应该是由于UNDO被覆盖导致的(undo_retention值为7小时,undo表空间使用率为60%左右),这里选取一条最简单的语句进行分析,alert日志报错信息如下:

Fri Sep 24 09:42:53 2021
ORA-01555 caused by SQL statement below (SQL ID: 44zgyk79038wn, Query Duration=1 sec, SCN:0x08de.232508ad):
SELECT "PROPOSALNO" FROM "LIS"."LCRNEWSTATEHISTORY""LCR"

通过MOS文档IF: ORA-1555 Reported with Query Duration = 0 , or a Few Seconds (Doc ID 1950577.1)提示,造成这个问题的原因主要有索引/表不一致,或者主键索引损坏。

08b69cd79dc6aa45decfaed5b16599e4.png

2.2.2 进一步分析

对以上语句分别进行全表扫描查询和通过索引查询,结果如下:

SQL> select /*+ FULL(t1) */ count(distinct PROPOSALNO) from LIS.LCRNEWSTATEHISTORY t1;
COUNT(DISTINCTPROPOSALNO)
-------------------------
 720922
select /*+ index(LIS.LCRNEWSTATEHISTORYPK_LCRNEWSTATEHISTORY) */ count(distinct PROPOSALNO) from LIS.LCRNEWSTATEHISTORY where PROPOSALNO is not null;
select /*+ index(LIS.LCRNEWSTATEHISTORY PK_LCRNEWSTATEHISTORY) */ count(distinct PROPOSALNO) from LIS.LCRNEWSTATEHISTORY where PROPOSALNO is not null
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 247 with name "_SYSSMU247_3266529229$" too small

通过以上测试可以发现,只有通过索引进行查询就会报ORA-01555(错误可复现),这里初步判断为索引的问题,可能会有坏块,于是分析索引结构。

SQL> analyze index LIS.PK_LCRNEWSTATEHISTORY validate structure;
analyze index LIS.PK_LCRNEWSTATEHISTORY validate structure
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 247 with name "_SYSSMU247_3266529229$" too small

同样报错,于是使用dbv进行检测(有关dbv用法请参考MOS:DBVERIFY - Database file Verification Utility (Doc ID 35512.1)

oracle@uchod1053:/oradb>dbv userid=grid/xxxxxx file=+DATA/lispr/datafile/lis_indx.461.1045046387


DBVERIFY: Release 11.2.0.4.0 - Production on Fri Sep 24 23:59:08 2021


Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBVERIFY - Verification starting : FILE = +DATA/lispr/datafile/lis_indx.461.1045046387
itl[8] has higher commit scn(0x08de.b36e6585) than block scn (0x08de.20676e2b)
Page 253980 failed with check code 6056
itl[31] has higher commit scn(0x08de.f08d561f) than block scn (0x08de.2604d295)
Page 277252 failed with check code 6056
itl[3] has higher commit scn(0x08de.dcbb6062) than block scn (0x08de.21e73996)
Page 516258 failed with check code 6056
itl[2] has higher commit scn(0x08de.f10978c4) than block scn (0x08de.19db6530)
Page 807019 failed with check code 6056
itl[30] has higher commit scn(0x08de.b394bdd7) than block scn (0x08de.23d3d46f)
Page 1245654 failed with check code 6056
itl[3] has higher commit scn(0x08de.d9da9fb7) than block scn (0x08de.17dbaf65)
Page 1632911 failed with check code 6056
itl[105] has higher commit scn(0x08de.f95d825f) than block scn (0x08de.23b26365)
Page 1642993 failed with check code 6056
itl[15] has higher commit scn(0x08de.fbad5538) than block scn (0x08de.1726f072)
Page 1682799 failed with check code 6056
itl[7] has higher commit scn(0x08de.b2bb2abc) than block scn (0x08de.210ff5e6)
Page 1697414 failed with check code 6056
itl[19] has higher commit scn(0x08de.d96db16a) than block scn (0x08de.1860cf5f)
Page 1708346 failed with check code 6056
itl[7] has higher commit scn(0x08de.a3cce991) than block scn (0x08de.054255e0)
Page 1710322 failed with check code 6056
itl[3] has higher commit scn(0x08de.c20adcb7) than block scn (0x08de.21e737c2)
Page 1757396 failed with check code 6056
itl[2] has higher commit scn(0x08de.f35465c3) than block scn (0x08de.17cfa5ff)
Page 1808863 failed with check code 6056
itl[45] has higher commit scn(0x08de.f9405a45) than block scn (0x08de.19c88446)
Page 1865058 failed with check code 6056
itl[24] has higher commit scn(0x08de.e971967f) than block scn (0x08de.25acba10)
Page 1865120 failed with check code 6056
itl[40] has higher commit scn(0x08de.f2ca7710) than block scn (0x08de.21b26967)
Page 1958970 failed with check code 6056
itl[68] has higher commit scn(0x08de.e7338b41) than block scn (0x08de.17081303)
Page 1960838 failed with check code 6056
itl[53] has higher commit scn(0x08de.e20917fa) than block scn (0x08de.1e8d0a9e)
Page 1962776 failed with check code 6056
itl[3] has higher commit scn(0x08de.fbd92d2a) than block scn (0x08de.00973487)
Page 1993799 failed with check code 6056
itl[46] has higher commit scn(0x08de.e169fa87) than block scn (0x08de.1c70daad)
Page 2058441 failed with check code 6056
itl[56] has higher commit scn(0x08de.f8467e62) than block scn (0x08de.1c5fdfaa)
Page 2058569 failed with check code 6056
itl[46] has higher commit scn(0x08de.f2953225) than block scn (0x08de.1776ebdd)
Page 2058626 failed with check code 6056
itl[71] has higher commit scn(0x08de.ecf63db2) than block scn (0x08de.17e3979e)
Page 2058786 failed with check code 6056
itl[66] has higher commit scn(0x08de.f2a63874) than block scn (0x08de.17aaa322)
Page 2059971 failed with check code 6056
itl[24] has higher commit scn(0x08de.ef7a9ddb) than block scn (0x08de.1e224667)
Page 2060778 failed with check code 6056
itl[85] has higher commit scn(0x08de.f96cb454) than block scn (0x08de.0f4da702)
Page 2060828 failed with check code 6056
itl[60] has higher commit scn(0x08de.df8eb33d) than block scn (0x08de.1d53f8b6)
Page 2061723 failed with check code 6056
itl[30] has higher commit scn(0x08de.fb578f9b) than block scn (0x08de.1ffd8cc7)
Page 2063118 failed with check code 6056
itl[51] has higher commit scn(0x08de.f28da248) than block scn (0x08de.1c705496)
Page 2064079 failed with check code 6056
itl[55] has higher commit scn(0x08de.eba5752a) than block scn (0x08de.17e9094d)
Page 2064575 failed with check code 6056
itl[57] has higher commit scn(0x08de.ef0758d7) than block scn (0x08de.178d1c6f)
Page 2065667 failed with check code 6056
itl[2] has higher commit scn(0x08de.f87983b0) than block scn (0x08de.1fb8fc7a)
Page 2071748 failed with check code 6056
itl[126] has higher commit scn(0x08de.f9c8ca4f) than block scn (0x08de.018d26c6)
Page 2073865 failed with check code 6056




DBVERIFY - Verification complete


Total Pages Examined         : 3840000
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 2194626
Total Pages Failing   (Index): 33
Total Pages Processed (Other): 4789
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1640585
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)

这里可以看到,没有标记坏块的出现,但是检测过程中报了6056错误,通过MOS查询该错误,发现以下文档ALERT Bug 22241601 ORA-600 [kdsgrp1] / ORA-1555 / ORA-600 [ktbdchk1: bad dscn] / ORA-600 [2663] due to Invalid Commit SCN in INDEX (Doc ID 1608167.1)内容跟本次问题高度相似,以下为文档摘录。

98c885f15e4e79e031ff8a9f95dd2d41.png

通过该文档的WORKAROUND,首选方法还是打补丁,但是由于客户原因,客户希望在不打补丁的情况下看有没有方法解决。经过和客户讨论,可以在主库将有问题的表的索引进行删除并重建,自动同步到备库,达到备库索引重建的目的(待验证)

2.2.3确认影响范围

在备库采用hint方式扫描所有的索引,发现有问题的索引39个,由于涉及到隐私性,在此就不一一列举。

2.2.4 处理方案验证

主库选取一张问题表,复制该表和所有索引,DG同步到备库,备库查询验证,具体测试如下:

SQL>select a.uniqueness,b.index_name,b.column_name from dba_indexesa,dba_ind_columns b where a.table_name=b.table_name anda.index_name=b.index_name and a.table_owner='LIS' anda.table_name='LCRNEWSTATEHISTORY' order by a.uniqueness desc;UNIQUENES        INDEX_NAME               COLUMN_NAME---------    ------------------------------ --------------------UNIQUE        PK_LCRNEWSTATEHISTORY          PROPOSALNONONUNIQUE    INX_RNEWHISTORY_POLNO            POLNONONUNIQUE    INX_RNEWHISTORY_CONTNO           CONTNOSQL>create table LIS.LCRNEWSTATEHISTORY_TT as select * from LIS.LCRNEWSTATEHISTORY;Table created.SQL>alter table LIS.LCRNEWSTATEHISTORY_TT add constraint PK_LCRNEWSTATEHISTORY_TT primary key(PROPOSALNO);Table altered.SQL>create index LIS.INX_RNEWHISTORY_POLNO_TT on LIS.LCRNEWSTATEHISTORY_TT(POLNO);Index created.SQL>create index LIS.INX_RNEWHISTORY_CONTNO_TT on LIS.LCRNEWSTATEHISTORY_TT(CONTNO);Index created.

备库通过索引查询,未发现报错。

SQL>select /*+ index(t PK_LCRNEWSTATEHISTORY_TT) */ count(*) from LIS.LCRNEWSTATEHISTORY_TT t; COUNT(*)---------- 721260SQL>select /*+ index(t INX_RNEWHISTORY_CONTNO_TT) */ count(*) from LIS.LCRNEWSTATEHISTORY_TT t; COUNT(*)---------- 721260SQL>select /*+ index(t INX_RNEWHISTORY_POLNO_TT) */ count(*) from LIS.LCRNEWSTATEHISTORY_TT t; COUNT(*)---------- 721260

以上方式验证了通过重建索引解决该问题的可行性。

2.2.5 后续处理

后面通过dbms_metadata.get_ddl获取到索引的创建语句,然后在非业务高峰期在主库删除有问题表的索引进行创建。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

dba_notes

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值