某客户数据库性能诊断报告

一、故障描述
***数据库近期由于业务量加大,持续出现事务响应缓慢,在业务高峰期(9:00左右)甚至出现卡住现象,导致前台无法及时响应。通过远程拨号持续观察3天,并对数据库性能做出了响应的临时性优化,在一定程度上得到了缓解。以目前数据库性能缓慢主要有两方面原因:
1、 数据库soft parse过多,导致library cache争用。
2、 数据库部分SQL执行效率不高,在大并发环境下导致热点块(hot block)争用。

二、故障分析
1、library cache争用
Library cache争用主要集中体现在2011年3月15日9:00-11:00,由于此原因(此原因在此次故障占主导地位),导致前台业务hang住,后台SQLPLUS无法连接。最终通过后台杀进程,重启数据库解决。
故障发生时,数据库主要体现为:
Cache Sizes
Begin End
Buffer Cache: 2,512M 2,512M Std Block Size: 8K
Shared Pool Size: 3,584M 3,584M Log Buffer: 14,336K

Top 5 Timed Events
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
latch: library cache 4,216 1,056 251 25.6 Concurrency
db file sequential read 190,794 658 3 15.9 User I/O
db file scattered read 87,955 313 4 7.6 User I/O
read by other session 147,808 291 2 7.0 User I/O
CPU time 279 6.7
Wait Events
? s - second
? cs - centisecond - 100th of a second
? ms - millisecond - 1000th of a second
? us - microsecond - 1000000th of a second
? ordered by wait time desc, waits desc (idle events last)
Event Waits %Time -outs Total Wait Time (s) Avg wait (ms) Waits /txn
latch: library cache 4,216 0.00 1,056 251 0.39
db file sequential read 190,794 0.00 658 3 17.54
db file scattered read 87,955 0.00 313 4 8.09
read by other session 147,808 0.00 291 2 13.59
latch free 587 1.87 129 220 0.05
latch: shared pool 1,046 0.00 73 70 0.10
enq: TX - row lock contention 25 96.00 70 2813 0.00
latch: library cache lock 180 0.00 51 281 0.02
buffer busy waits 72 58.33 47 648 0.01
direct path read temp 10,960 0.00 45 4 1.01
db file parallel write 4,561 0.00 38 8 0.42
latch: cache buffers chains 2,259 0.00 34 15 0.21
latch: session allocation 97 0.00 26 265 0.01

SQL ordered by Version Count
? Only Statements with Version Count greater than 20 are displayed
Version Count Executions SQL Id SQL Module SQL Text
11,265 gdy6tymxpwjz0
** SQL Text Not Available **
10,658 7zw1mj3v5wqcq
** SQL Text Not Available **
2,194 bfp3m30c0c936
** SQL Text Not Available **
1,965 461mvk5ra20xg
** SQL Text Not Available **
892 dvqq0hguxt4ua
SELECT COUNT(:"SYS_B_00") FROM...

? 说明:
(1) Oracle数据库shared pool设置为3,584M,过大的shared pool设置,导致过高的latch管理成本。
(2) Oracle数据库cursor_sharing参数设置为similar,此参数的设置在避免过多硬解析的同时,在表格选择性列存在柱状图的条件下,又带来执行计划产生的低效率。在version count高达11265的SQL下,又带来了library cache扫描时间的加长,又进一步加重了library cache的争用。

? 解决办法:
(1) 将Oracle数据库SGA设置为3G,即从6G降至3G,在不影响业务前提下,降低latch管理成本。
SQL> show sga

Total System Global Area 3456106496 bytes
Fixed Size 2087968 bytes
Variable Size 1107297248 bytes
Database Buffers 2332033024 bytes
Redo Buffers 14688256 bytes
(2) 将cursor_sharing参数设置为force,同时将session_cached_cursors设置为100,进一步降低library cache的争用。
通过以上临时解决办法,数据库library cache的争用得到了极大的缓解。

3、 热点块争用
目前Oracle数据库存在着严重的热点块争用,即大量的并发会话同时读取表格中同一个block,导致会话间相互等待。由于会话间等待时间较高,进而加剧了事务行级锁(enq: TX - row lock contention)的发生。目前系统中等待事件主要如下:
Top 5 Timed Events
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
read by other session 1,811,592 2,797 2 50.0 User I/O
enq: TX - row lock contention 311 895 2,879 16.0 Application
db file scattered read 428,596 857 2 15.3 User I/O
CPU time 561 10.0
db file sequential read 221,017 385 2 6.9 User I/O

Instance Efficiency Percentages (Target 100%)
Buffer Nowait %: 94.72 Redo NoWait %: 99.99
Buffer Hit %: 86.97 In-memory Sort %: 100.00
Library Hit %: 99.62 Soft Parse %: 99.43
Execute to Parse %: -16.93 Latch Hit %: 90.88
Parse CPU to Parse Elapsd %: 6.05 % Non-Parse CPU: 99.91
? 说明:
由红色字体标出可以看出目前系统中存在着大量的热点块争用,而这些热点块争用是主要是由于SQL语句执行效率不高(全表扫描,Buffer Hit为86.97)导致的。以下为引起该问题的SQL。
SQL1:
select 1 from hz_qyhznr where trim(zch)=:1 and qylxdl not between21 and 26。
由于表格W_HZ_QYHZNR字段ZGH没有建立函数索引trim,导致了全表扫描。
解决办法:
在字段ZGH建立了函数索引trim(zch),SQL执行效率比原先得到了极大提高,但SQL中存在判断条件qylxdl not between21 and 26,使得执行效率不理想。
在表格W_HZ_QYHZNR存在着如下数据分布,可以看到此判断条件显得多余,如由于业务逻辑关系,必需此判断条件,建议在程序代码中加以判断,而不是在SQL中判断。
SQL> select QYLXDL,count(*) from W_HZ_QYHZNR group by QYLXDL order by QYLXDL;

QYLXDL COUNT(*)
---------- ----------
11 86698
12 13811
13 4517
14 2074
16 3
31 29132
32 638
33 7268
34 129

9 rows selected.

SQL2:
select 1 from hz_qyhznr where trim(zch) =:1 and trim(qymc) = :2
解决办法:
SQL> select count(distinct qymc) from W_HZ_QYHZNR;

COUNT(DISTINCTQYMC)
-------------------
149858

SQL> select count(*) from W_HZ_QYHZNR;

COUNT(*)
----------
149858
由于在列qymc选择性较好,在其上建立函数索引,效率得到极大提高。

SQL3:
SELECT T.ZCH, T.BGZXRQ FROM BF_QYHZNR T WHERE T.ZCH = :B1 AND T.ZT = 'D';
SELECT COUNT(1) FROM BF_QYHZNR T WHERE T.ZCH = :B1 AND T.ZT = 'X';
解决办法:
SQL> conn qn_yc/xxb1002
Connected.
SQL> select count(*) from BF_QYHZNR;

COUNT(*)
----------
527350

SQL> select zt,count(*) from BF_QYHZNR group by zt;

ZT COUNT(*)
-- ----------
9
D 54164
Q 41
B 338511
X 134621
C 4

6 rows selected.
SQL> explain plan for select T.ZCH, T.BGZXRQ FROM BF_QYHZNR T WHERE T.ZCH = :B1 AND T.ZT = 'D';

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 24 | 9428 (1)|
|* 1 | TABLE ACCESS FULL| W_BF_QYHZNR | 1 | 24 | 9428 (1)|
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
1 - filter("T"."ZCH"=:B1 AND "T"."ZT"='D')
可以看到ZT列选择性不佳,建议在程序代码中加以判断,而不是在SQL中判断。
否则执行计划为全表扫描,容易导致热点块冲突。
临时解决办法:
SQL> conn hz_yc/xxb1002
Connected.
SQL> select count(*) from W_BF_QYHZNR;

COUNT(*)
----------
527350

SQL> create index W_BF_QYHZNR_zch_zt on W_BF_QYHZNR(zch,zt);

Index created.
SQL> conn qn_yc/xxb1002
Connected.
SQL> explain plan for select T.ZCH, T.BGZXRQ FROM BF_QYHZNR T WHERE T.ZCH = :B1 AND T.ZT = 'D';

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
-------
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)|
--------------------------------------------------------------------------------
-------
| 0 | SELECT STATEMENT | | 1 | 24 | 4
(0)|
| 1 | TABLE ACCESS BY INDEX ROWID| W_BF_QYHZNR | 1 | 24 | 4
(0)|

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|* 2 | INDEX RANGE SCAN | W_BF_QYHZNR_ZCH_ZT | 1 | | 3
(0)|
--------------------------------------------------------------------------------
-------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T"."ZCH"=:B1 AND "T"."ZT"='D')
目前在T"."ZCH"=:B1 AND "T"."ZT"='D'建立联合索引,执行计划效率较高。

SQL4:
仔细检查存储过程CALL pkg_check.chk_grade(:1,:2,:3),根据统计资料显示,该执行计划是引起数据库性能缓慢的主要原因,需要开发商进一步优化。
RECOMMENDATION 1: SQL Tuning, 64% benefit (3566 seconds)
ACTION: Investigate the SQL statement with SQL_ID "57xrg07tjm0qn" for possible performance improvements.
RELEVANT OBJECT: SQL statement with SQL_ID 57xrg07tjm0qn
CALL pkg_check.chk_grade(:1,:2,:3)
RATIONALE: SQL statement with SQL_ID "57xrg07tjm0qn" was executed 30
times and had an average elapsed time of 117 seconds.
RATIONALE: Waiting for event "read by other session" in wait class "User
I/O" accounted for 49% of the database time spent in processing the
SQL statement with SQL_ID "57xrg07tjm0qn".
RATIONALE: Waiting for event "enq: TX - row lock contention" in wait
class "Application" accounted for 26% of the database time spent in
processing the SQL statement with SQL_ID "57xrg07tjm0qn".
RATIONALE: Waiting for event "db file scattered read" in wait class
"User I/O" accounted for 9% of the database time spent in processing
the SQL statement with SQL_ID "57xrg07tjm0qn".
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值