Statspack之问题发现
在一次例行Statspack检查中,发现数据库出现了性能异常,从以下15分钟的采样报告中可看出:
Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 2554 30-Jul-0909:58:04 983 515.5
End Snap: 2555 30-Jul-0910:13:04 978 518.2
Elapsed: 15.00 (mins)
系统的CPU消耗上升,以下是系统的负载概要信息:
Load Profile
~~~~~~~~~~~~ Per Second PerTransaction
--------------- ---------------
Redosize: 876,497.06 25,821.52
Logicalreads: 39,103.18 1,151.98
Blockchanges: 3,662.94 107.91
Physicalreads: 287.40 8.47
Physical writes: 180.96 5.33
User calls: 8,700.70 256.32
Parses: 715.31 21.07
Hard parses: 0.16 0.00
Sorts: 143.47 4.23
Logons: 0.03 0.00
Executes: 4,277.99 126.03
Transactions: 33.94
在接下来的性能指标中,Rows per Sort引起了我们的重点关注,平均每次排序的行数高达9 058行:
% Blocks changed per Read: 9.37 Recursive Call %: 37.52
Rollback per transaction %: 0.13 Rows per Sort: 9058.18
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
BufferNowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 99.63 In-memorySort %: 100.00
Library Hit %: 99.99 Soft Parse %: 99.98
Executeto Parse %: 83.28 Latch Hit %: 99.88
Parse CPU to Parse Elapsd %: 67.45 % Non-ParseCPU: 98.75
CPU使用率的飙升,会不会和排序有关呢?
作为一个产品数据库,竟然会出现这么高的rows/sort简直不可思议!要知道排序是需要CPU计算,要消耗大量CPU的。从这个出发点去定位SQL,继续看statspack找到最消耗CPU的SQL:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time(s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 3,878 88.43
db file sequential read 20,986 170 3.88
log file sync 62,767 103 2.36
db file scattered read 10,886 55 1.25
direct path write 61,261 40 .92
-------------------------------------------------------------
SQL ordered by Gets forDB: MESWP Instance: meswp Snaps: 2554 -2555
-> End Buffer GetsThreshold: 10000
-> Note that resourcesreported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possibleand valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets perExec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
14,991,160 8,799 1,703.7 42.6 2888.67 3172.58 2784903626
Module: InSiteXMLServer.exe
select PRODUCTMATERIALLISTITEMIDFrom orderedmateriallist where
bomid = :"SYS_B_0" order by DisplayPriority
注意到Hash_Value=2784903626的SQL消耗的CPU Time已经达到总消耗的2 888.67/3 878 =74.5%,消耗的Logical Reads 已经达到总消耗的14 991 160/(39 103.18*60*15)=42.6%。
这个SQL的CPU Time / Elapsd Time =2 888.67/3 172.58= 91.1% 也就是说这个SQL执行的总消耗中CPU的时间消耗占到了91.1%,消耗这么多的CPU时间到底做了什么呢?会不会是排序呢?
SQL排序之问题定位
首先做一个查询验证,记录之前的排序统计数据:
SQL> SELECT * FROMv$statname WHERE NAME = 'sorts (rows)';
STATISTIC# NAME CLASS
---------- -------------------------------------------------- ----------
247 sorts (rows) 64
SQL> select * fromv$mystat where statistic#=247;
SID STATISTIC# VALUE
---------- ---------- ----------
1004 247 63
执行一次这个可疑的SQL:
SELECT productmateriallistitemid
FROM orderedmateriallist
WHERE bomid = '0018a68000093231'
ORDER BY displaypriority;
PRODUCTMATERIALLISTITEMID
-------------------------
0017c080000b5176
0017c080000b5177
0017c080000b5178
0017c080000b517b
0017c080000b517c
0017c080000b5172
0017c080000b517d
0017c080000b519d
0017c080000b519e
0017c080000b519f
0017c080000b5171
0017c080000b5173
0017c080000b5174
0017c080000b5175
0017c080000b5179
0017c080000b517a
16 rows selected
Executed in 0.765 seconds
再次查出统计数据:
select * from v$mystat where statistic#=247;
SID STATISTIC# VALUE
---------- ---------- ----------
1004 247 250175
Executed in 0.031 seconds
可见一个返回16行的查询竟然做了250 175-63 = 250 112行的排序。
文章来源:《OracleDBA手记1》第19章 一次排序的调整与优化 作者:段林仲
配图来源:http://rioit.co.uk/sites/default/files/remote-IT-support-figure_710x396.jpg
回复“手记1”可下载本书前三篇内容:DBA工作手记/诊断案例篇/SQL调优篇(建议复制链接至电脑端)