涉及知识点:
1.NL连接的优化
2.收集统计信息方式
3.查看执行计划的6种方式及其优缺点
4.SQL执行计划的执行顺序查看方法
问题现象:在进行压力测试登录的时候,发现有大量查询消息的操作。SQL卡顿时间长达1分钟。
image.png
具体SQL如下:
select
*
from
( select
row_. * ,
rownum rownum_
from
( select
。。COLORKEY ,
RESENDTIMES
from
sm_msg_content
where
receiver = '1001WW1000000000054U'
and msgtype = 'nc'
and destination = 'inbox'
and msgsourcetype in (
select
typecode
from
sm_msg_msgtype
where
displocation = 'worklist'
)
and (
pk_detail in (
select
nt.pk_checkflow
from
pub_workflownote nt
where
nt.PK_BILLTYPE in (
select
bt.pk_billtypecode
from
bd_billtype bt
where
(
bt.pk_billtypecode in (
select
acbt.billtype
from
wfm_acceptnctype acbt
where
acbt.ext4 = 'ALLOW_MSG'
oracbt.ext4 is null
)
andbt.parentbilltype = '~'
)
or (
bt.parentbilltype < > '~'
andbt.parentbilltype in (
select
accbt.billtype
from
wfm_acceptnctype accbt
where
accbt.ext4 = 'ALLOW_MSG'
oraccbt.ext4 is null
)
)
)
)
or pk_detail is null
)
and not exists (
select
PK_CHECKFLOW
from
pub_workflownote nt
where
nt.ACTIONTYPE ='MAKEBILL'
and nt.PK_CHECKFLOW =PK_DETAIL
)
order by
sendtime desc ) row_
where
rownum < = 100
)
where
rownum_ > 0
问题分析:
针对问题现象,可以得到初步结论。该SQL返回结果集很少,SQL解析的时间长达50秒左右。所以应该是可以优化到2s以内。
由于可以执行出SQL结果,并且不需要查看物理读和逻辑读。
因此,通过statistics_level=all的方式查看执行计划。
查看执行顺序的时候,我们可以将执行计划拷贝到TXT文本里,用光标大法进行查询。
找到缩进的最深的第一个语句,就是SQL执行的第一条语句,然后按照从上到下,逐步向外合并的顺序进行阅读。
| Id | Operation | Name |Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT | | 1 | | 100 |00:00:49.67 | 952K| | | |
|* 1 | VIEW | | 1 | 100 | 100 |00:00:49.67 | 952K| | | |
|* 2 | COUNT STOPKEY | | 1 | | 100 |00:00:49.67 | 952K| | | |
| 3 | VIEW | | 1 | 199 | 100 |00:00:49.67 | 952K| | | |
|* 4 | SORT ORDER BY STOPKEY | | 1 | 199 | 100 |00:00:49.67 | 952K| 160K| 160K| 142K (0)|
|* 5 | FILTER | | 1 | | 14017 |00:00:49.34 | 952K| | | |
|* 6 | HASH JOIN RIGHT ANTI | | 1 | 3933 | 14487 |00:00:01.63 | 41121 | 2616K| 2616K| 161K (0)|
|* 7 | TABLE ACCESS FULL |PUB_WORKFLOWNOTE | 1 | 1| 0 |00:00:00.67 | 26438 | | | |
| 8 | NESTED LOOPS | | 1 | 3934 | 14487 |00:00:00.85 | 14683 | | | |
| 9 | NESTED LOOPS | | 1 | 3934 | 14487 |00:00:00.19 | 203 | | | |
| 10 | SORT UNIQUE | | 1 | 11 | 11 |00:00:00.01 | 6 | 2048 | 2048 | 2048 (0)|
|* 11 | TABLE ACCESS FULL | SM_MSG_MSGTYPE | 1 | 11 | 11 |00:00:00.01 | 6 | | | |
|* 12 | INDEX RANGE SCAN | I_RCVMSGTYPE | 11 | 183 | 14487 |00:00:00.18 | 197 | | | |
|* 13 | TABLE ACCESSBY INDEX ROWID| SM_MSG_CONTENT | 14487 | 1311 | 14487 |00:00:00.61 | 14480 | | | |
|* 14 | FILTER | | 14487 | | 14017 |00:00:47.06 | 911K| | | |
| 15 | NESTED LOOPS | | 14487 | 1| 14017 |00:00:46.80 | 911K| | | |
| 16 | TABLE ACCESS BY INDEX ROWID| PUB_WORKFLOWNOTE | 14487 | 1 | 14017 |00:00:02.25 | 56851 | | | |
|* 17 | INDEX UNIQUE SCAN | PK_PUB_WORKFLOWNOT| 14487 | 1 | 14017 |00:00:01.72| 42834 | | | |
|* 18 | TABLE ACCESSFULL | BD_BILLTYPE | 14017 | 1 | 14017 |00:00:44.46 | 854K| | | |
|* 19 | TABLE ACCESS FULL |WFM_ACCEPTNCTYPE | 1 | 1| 1 |00:00:00.01 | 2 | | | |
|* 20 | TABLE ACCESS FULL |WFM_ACCEPTNCTYPE | 0 | 1| 0 |00:00:00.01 | 0 | | | |
可以明显发现,时间在第| 15 | NESTED LOOPS | | 14487 | 1 | 14017 |00:00:46.80 | 911K| | | |有了突然的上涨。
该处属于NL连接。而细看我们可以发现,驱动表为PK_PUB_WORKFLOWNOT 而 被驱动表为WFM_ACCEPTNCTYPE 。
驱动表为1W4的行数,而被驱动表为1行。由于NL的机制,在两表进行合并操作的时候会进行被驱动表次数的循环。
这里显然是驱动表的顺序错误,变成了大表驱动小表。而这样情况的发生,很多时候是由于统计信息不准确导致。
这也就是为什么采用statistics_level=all的方式进行执行计划的查看。该方法是唯一可以看到预估行数E-ROWS和实际行数A-ROWs的方式。
这里我们看到,PK_PUB_WORKFLOWNOT 预估值为1,而实际值为14017。显然是这里的预估出现了问题。
问题解决:
知道是统计信息不准确导致的问题,就比较好解决了。这里注意,需要使用oracle 11g推荐的方式来收集统计信息。
exec dbms_stats.gather_table_stats(‘nc_user','PUB_WORKFLOWNOTE',CASCADE=>true,method_opt=>'for all indexed columns');
同时,由于NL的机制比较关注关联列的效率问题,而第18
18 - filter("NT"."PK_BILLTYPE"="BT"."PK_BILLTYPECODE")走了filter。而这里是可以通过走索引来避免全表扫描的。于是在这两个关联列上都添加了索引。
最终,该SQL的执行时间从50秒变为了1秒以内,达到了优化的效果。使得压力测试顺利进行。
| Id | Operation | Name |Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT | | 1 | | 100 |00:00:00.58 | 181K| | | |
|* 1 | VIEW | | 1 | 1| 100 |00:00:00.58 | 181K| | | |
|* 2 | COUNT STOPKEY | | 1 | | 100 |00:00:00.58 | 181K| | | |
| 3 | VIEW | | 1 | 1| 100 |00:00:00.58 | 181K| | | |
|* 4 | SORT ORDER BY STOPKEY | | 1 | 1| 100 |00:00:00.58 | 181K| 142K| 142K| 126K (0)|
|* 5 | FILTER | | 1 | | 17988 |00:00:00.55 | 181K| | | |
| 6 | NESTED LOOPS ANTI | | 1 | 4| 18458 |00:00:00.33 | 73030 | | | |
|* 7 | HASH JOIN | | 1 | 376 | 18458 |00:00:00.14 | 18589 | 1645K| 1645K| 1298K (0)|
| 8 | SORT UNIQUE | | 1 | 11 | 11|00:00:00.01 | 6 | 2048 | 2048 | 2048 (0)|
|* 9 | TABLE ACCESS FULL | SM_MSG_MSGTYPE | 1 | 11 | 11|00:00:00.01 | 6 | | | |
|* 10 | TABLE ACCESSBY INDEX ROWID| SM_MSG_CONTENT | 1 | 1377 | 18458 |00:00:00.12 | 18583 | | | |
|* 11 | INDEX RANGE SCAN |I_RCV_ISREAD | 1 | 548 | 18459 |00:00:00.01 | 133 | | | |
|* 12 | TABLE ACCESS BYINDEX ROWID | PUB_WORKFLOWNOTE | 18458 | 6048 | 0 |00:00:00.17| 54441 | | | |
|* 13 | INDEX UNIQUESCAN | PK_PUB_WORKFLOWNOT | 18458 | 1| 17988 |00:00:00.10 | 36453 | | | |
|* 14 | FILTER | | 18458 | | 17988 |00:00:00.18 | 108K| | | |
| 15 | NESTED LOOPS | | 18458 | 2| 17988 |00:00:00.17 | 108K| | | |
| 16 | TABLE ACCESS BY INDEX ROWID| PUB_WORKFLOWNOTE | 18458| 1 | 17988 |00:00:00.08 | 72722 | | | |
|* 17 | INDEX UNIQUE SCAN |PK_PUB_WORKFLOWNOT | 18458 | 1 | 17988 |00:00:00.06 | 54734 | | | |
| 18 | TABLE ACCESS BY INDEX ROWID| BD_BILLTYPE | 17988 | 2| 17988 |00:00:00.07 | 35981 | | | |
|* 19 | INDEX RANGE SCAN |PK_BILLTYPEIDCODE | 17988 | 2| 17988 |00:00:00.04 | 17993 | | | |
|* 20 | TABLE ACCESS FULL | WFM_ACCEPTNCTYPE | 1 | 1 | 1 |00:00:00.01 | 2 | | | |
|* 21 | TABLE ACCESS FULL | WFM_ACCEPTNCTYPE | 0 | 1 | 0 |00:00:00.01 | 0 | | | |