背景:某个数据抽取系统(OLAP)一个跑批存储过程在老环境(40分钟)迁到 新环境(140分钟)
老环境是一体机,新环境是双节点RAC.其实不算是迁移,可以看作俩系统同时存在 一条SQL分别在两个系统上跑,性能差异很大
SQL语句如下:
INSERT INTO TTTT_AAA
SELECT POST_DATE,
……
……
BY3
FROM TTTT_AAA_T I
LEFT JOIN RRR_NNNNNNN_428 R
ON R.SEQUENCE_NO=I.SEQUENCE_NO_EFTM;
1.为什么会变慢?
猜想是因为统计信息过期导致的SQL执行计划变了
2.如何验证我们猜想?
使用以下SQL查询SQL的历史执行计划
SQL> SELECT * FROM DBA_HIST_SQL_PLAN WHERE SQL_ID ='1612mq5nyjcuh';
DBID SQL_ID PLAN_HASH_VALUE ID OPERATION OPTIONS OBJECT_NAME OBJECT_ALIAS
---------- ------------- --------------- ---------- ------------------------------ --------------- ------------- ------------------------------
1049186783 1612mq5nyjcuh 2807630616 0 INSERT STATEMENT
1049186783 1612mq5nyjcuh 2807630616 1 LOAD TABLE CONVENTIONAL
1049186783 1612mq5nyjcuh 2807630616 2 HASH JOIN RIGHT OUTER
1049186783 1612mq5nyjcuh 2807630616 3 TABLE ACCESS FULL REP_NETJRNL_428 R@SEL$1
1049186783 1612mq5nyjcuh 2807630616 4 TABLE ACCESS FULL INCT_ALL_T I@SEL$2
3.执行计划没有变的情况下 应该从哪几个方向入手去查?
使用以下SQL查询问题SQL的等待事件信息
SQL> SELECT EVENT,COUNT(1) FROM DBA_HIST_ACTIVE_SESS_HISTORY A
2 WHERE TO_CHAR(SAMPLE_TIME,'YYYY-MM-DD HH24:MI:SS')>='2018-03-21 04:06:51'
3 AND TO_CHAR(SAMPLE_TIME,'YYYY-MM-DD HH24:MI:SS')<='2018-03-21 06:25:03'
4 AND SQL_ID='1612mq5nyjcuh'
5 GROUP BY EVENT ORDER BY 2 DESC;
EVENT COUNT(1)
---------------------------------------------------------------- ----------
db file sequential read 502
244
gc current grant 2-way 49
gc current request 3
log file switch completion 1
gc current multi block request 1
6 rows selected
4.全表扫描为什么会出单块读(db file sequential read ),哪几种情况下会出现这种情况?
1)读undo
2)维护索引
3)行链接、行迁移(如果是这样db file sequential read 的比例应该非常小)
所以 需要定位等待事件发生在哪个object上面(undo file 还是 index file)
5.怎么查询等待事件发生在哪个object上面?
SQL> SELECT SQL_ID, P1,P2,P3,P1TEXT,P2TEXT,P3TEXT FROM DBA_HIST_ACTIVE_SESS_HISTORY A
2 WHERE TO_CHAR(SAMPLE_TIME,'YYYY-MM-DD HH24:MI:SS')>='2018-03-21 04:06:51'
3 AND TO_CHAR(SAMPLE_TIME,'YYYY-MM-DD HH24:MI:SS')<='2018-03-21 06:25:03'
4 AND SQL_ID='1612mq5nyjcuh'
5 AND EVENT ='db file sequential read';
SQL_ID P1 P2 P3 P1TEXT P2TEXT P3TEXT
------------- ---------- ---------- ---------- ------------ ------------- ---------
1612mq5nyjcuh 1622 2112160 1 file# block# blocks
1612mq5nyjcuh 312 2632329 1 file# block# blocks
1612mq5nyjcuh 271 3688861 1 file# block# blocks
1612mq5nyjcuh 299 2735754 1 file# block# blocks
1612mq5nyjcuh 279 4059154 1 file# block# blocks
1612mq5nyjcuh 308 4054881 1 file# block# blocks
1612mq5nyjcuh 340 2609359 1 file# block# blocks
1612mq5nyjcuh 1627 558361 1 file# block# blocks
1612mq5nyjcuh 287 2595376 1 file# block# blocks
1612mq5nyjcuh 1631 489405 1 file# block# blocks
1612mq5nyjcuh 283 2650425 1 file# block# blocks
1612mq5nyjcuh 1633 733414 1 file# block# blocks
1612mq5nyjcuh 327 2662011 1 file# block# blocks
1612mq5nyjcuh 284 2560285 1 file# block# blocks
1612mq5nyjcuh 1620 2063611 1 file# block# blocks
1612mq5nyjcuh 528 1547673 1 file# block# blocks
1612mq5nyjcuh 292 2659311 1 file# block# blocks
1612mq5nyjcuh 277 4019727 1 file# block# blocks
1612mq5nyjcuh 303 2707764 1 file# block# blocks
1612mq5nyjcuh 368 144874 1 file# blocks
...
...
...
^ 502 ROWS
SELECT * FROM DBA_EXTENTS WHERE FILE_ID=1622 AND 2112160 BETWEEN BLOCK_ID AND BLOCK_ID+BLOCKS;
OWNER SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------- ------------- ---------------- ------------------ ---------------- ---------- ---------- ---------- ---------- ---------- ------------
RRRRRR IDX_TTTT_AAA TTTT_AAA_6050 INDEX PARTITION INDX 108 1622 2112064 8388608 256 599
可以看出等待事件发生在index file上面
查一下TTTT_AAA表的索引信息:
RRRRRR IDX_TTTT_AAA Normal ACCT_NO, TRAN_CODE N N Y
RRRRRR TTTT_AAA_PARUK_02 Unique POST_DATE, ACCT_NO, REC_NO N N N
RRRRRR TTTT_AAA_PAR_IDX2 Normal POST_DATE, ACCT_NO, JRNL_NO N N N
RRRRRR LOC_TTTT_AAA_PAR_IDX1 Normal POST_DATE, ACCT_NO N N Y
索引设计不合理,相当冗余
6.解决问题:删除冗余的索引 (以 POST_DATE开头的组合索引 只留一个)
查询索引的使用频率
SQL> SELECT OBJECT_NAME, COUNT(1)
2 FROM DBA_HIST_SQL_PLAN
3 WHERE OPERATION LIKE '%INDEX%'
4 AND OBJECT_NAME IN ('IDX_TTTT_AAA',
5 'TTTT_AAA_PARUK_02',
6 'TTTT_AAA_PAR_IDX2',
7 'LOC_TTTT_AAA_PAR_IDX1')
8 GROUP BY OBJECT_NAME
9 ORDER BY 2 DESC;
OBJECT_NAME COUNT(1)
------------------------------- ----------
TTTT_AAA_PARUK_02 387
LOC_TTTT_AAA_PAR_IDX1 70
IDX_TTTT_AAA 8
TTTT_AAA_PAR_IDX2 5
所以DROP两个冗余的LOC_TTTT_AAA_PAR_IDX1 和 TTTT_AAA_PAR_IDX2 索引即可
7.难道老环境上面不需要维护索引?
1) 查询老环境发现表TTTT_AAA 上面只有两个索引(新环境是有开发人员发现有SQL慢私自建了两个索引)
2) Exadata 维护索引的时候使用flash cache执行效率很高