http://www.db2china.net/club/thread-18522-1-3.html

客户环境:db2 v82 fp13, aix 5.3一个采购管理系统,

问题描述:某些页面响应时间很慢,系统管理员通过系统监控,发现 CPU、I/O资源使用率高。


问题分析:这是典型的性能问题,分析的思路自然从database snapshot入手,发现rows read/rows selected过高,达到600左右,观察bufferpool同步读I/O,发现很低。这说明database有性能瓶颈。接着抓取sql snapshot快照,发现很多语句有问题。以下是找到的比较有意思的一条。

通过sql快照:


Number of executions = 3

Number of compilations = 1

Worst preparation time (ms) = 18

Best preparation time (ms) = 18

Internal rows deleted = 0

Internal rows inserted = 0

Rows read = 225976098

Internal rows updated = 0

Rows written = 77448

Statement sorts = 5520

Statement sort overflows = 6

Total sort time = 108

Buffer pool data logical reads = 166248

Buffer pool data physical reads = 0

Buffer pool temporary data logical reads = 3391188

Buffer pool temporary data physical reads = 0

Buffer pool index logical reads = 404475

Buffer pool index physical reads = 3467

Buffer pool temporary index logical reads = 0

Buffer pool temporary index physical reads = 0

Total execution time (sec.ms) = 134.657283

Total user cpu time (sec.ms) = 96.790617

Total system cpu time (sec.ms) = 1.878419

Statement text = select sum(a.zctamount) amount,aa.invoicecode code,aa.status from invoicedetail a,invoice aa,billing b,deliverydetail c,orderdetail d, ord dd,ordertype e WHERE 1=1 and a.invoiceid=aa.invoiceid and a.billingid=b.billingid and b.relatedtrandetailid=c.deliverydetailid and c.orderdetailid=d.orderdetailid and dd.ordertypeid=e.ordertypeid and dd.orderid=d.orderid and e.saleflag>0 and aa.partnerid=dd.billto and aa.orgid=dd.sellerid and b.billto=dd.billto and a.relatedtrantype=10 and aa.status=80 and dd.sellerid=100000001 and dd.billto=100002441 and dd.status in ( 50, 60, 70) and dd.ordertypeid in (100000242,100000000,100000001,100000002,100000003,100000004,100000011,100000005,100000170,100000006,100000007,100000008,100000151,100000155,100000156,100000157,100000158,100000070,100000163,100000165,100000194,100000202,100000203,100000159,100000132) GROUP BY aa.invoicecode ,aa.status
复制代码


发现该语句每次执行大概需要45秒,每次读的行数7100万行,发生排序1800次,根据这些指标,说明该SQL存在严重的性能问题。Rows_read对系统的性能影响很大,通常是由于过多的表扫描引起。


找到了语句,接下来很自然的看它的执行计划。通过explain + db2exfmt,发现语句的cost为537.262,也不算高啊。


继续分析,发现很多连接操作符是MSJOIN和NLJOIN,根据经验,这些操作符有可能是出问题的地方。我们找到比较可疑的一个NLJOIN(21)。(篇幅所限,只粘贴了最关键的部分)


这个NLJOIN的预测rows只有0.0529,继续定位它的左子树,Fetch的cost达到165,而IXSCAN 25)的估计返回值是83行,以下是25)操作符的详细解释:


25) IXSCAN: (Index Scan)

Cumulative Total Cost: 25.7581

Cumulative CPU Cost: 210702

Cumulative I/O Cost: 2

Cumulative Re-Total Cost: 0.0427741

Cumulative Re-CPU Cost: 155241

Cumulative Re-I/O Cost: 0

Cumulative First Row Cost: 25.7183

Estimated Bufferpool Buffers: 3


Arguments:

---------

MAXPAGES: (Maximum pages for prefetch)

1

PREFETCH: (Type of Prefetch)

NONE

ROWLOCK : (Row Lock intent)

NONE

SCANDIR : (Scan Direction)

FORWARD

TABLOCK : (Table Lock intent)

INTENT NONE

TBISOLVL: (Table access Isolation Level)

CURSOR STABILITY


Predicates:

----------

7) Start Key Predicate

Relational Operator: Equal (=)

Subquery Input Required: No

Filter Factor: 0.000172325


Predicate Text:

--------------

(Q2."BILLTO" = 100002441)


7) Stop Key Predicate

Relational Operator: Equal (=)

Subquery Input Required: No

Filter Factor: 0.000172325


Predicate Text:

--------------

(Q2."BILLTO" = 100002441)


Input Streams:

-------------

24) From Object DB2ADMIN.ORD_IDX9


Estimated number of rows: 486884

Number of columns: 2

Subquery predicate ID: Not Applicable


Column Names:

------------

+Q2.$RID$+Q2."BILLTO"


Output Streams:

--------------

25) To Operator #24


Estimated number of rows: 83.9021

Number of columns: 1

Subquery predicate ID: Not Applicable


Column Names:

------------

+Q2."BILLTO"
复制代码

附件: 您需要登录才可以下载或查看附件。没有帐号?注册
本主题由 System 于 2012-5-15 05:00 撤销鉴定





发短消息
加为好友
飞天 当前离线

UID6761 帖子185 主题11 金钱1333 米 阅读权限50 注册时间2009-9-15
2#
五星级会员 飞天发表于 2011-6-20 18:20 | 只看该作者
本帖最后由 飞天 于 2011-6-20 18:28 编辑

回复 1# 飞天


观察到该操作符执行的过滤条件为Q2."BILLTO" = 100002441,那么我们要看一下该条件的返回行数是多少,结果是2580,而估计的值是83.9,差了几十倍。


db2 “select count(*) from ord where BILLTO=100002441”

2957


这是典型的优化器低估问题,那么为什么会出现这种情况呢?统计信息,没错,就是统计信息,如果统计信息不准确,就会误导优化器选择错误的路径。


这时,检查syscat.tables的stats_time统计时间,发现该表最近做过统计信息收集。那么为什么造成这种情况呢?继续通过以下语句查看数据分布情况,

db2 “select billto, count(*) from ord group by billto order by desc”


发现数据及其不均匀,这时可以怀疑分布统计信息问题。查看sysstat.coldist系统表,果然不出所料,该表并没有做分布统计信息。这时,执行 runstats with distribution。但奇怪的是,执行后,访问计划没有变,时间仍然为45秒。


继续查sysstat.coldist,发现收集了10个frequency value,但查询条件100002441并没有出现在该表中,也就是并没有为该值收集统计信息。


TABNAME COLNAME TYPE SEQNO COLVALUE VALCOUNT DISTCOUNT

---------- ---------- ---- ------ ------------ --------- ----------

ORD BILLTO F 1 100000002 5145 -

ORD BILLTO F 2 100004319 4533 -

ORD BILLTO F 3 100008682 4129 -

ORD BILLTO F 4 100000003 4001 -

ORD BILLTO F 5 100004037 3696 -

ORD BILLTO F 6 100017876 3435 -

ORD BILLTO F 7 100005877 3309 -

ORD BILLTO F 8 100000001 3195 -

ORD BILLTO F 9 100004046 3071 -

ORD BILLTO F 10 100004030 3008 -

ORD BILLTO Q 1 100000001 3195 1

ORD BILLTO Q 2 100001808 26186 85

ORD BILLTO Q 3 100002389 51546 179

ORD BILLTO Q 4 100003137 76945 276

ORD BILLTO Q 5 100003272 102535 355

ORD BILLTO Q 6 100003993 128771 513

ORD BILLTO Q 7 100004078 154800 559

ORD BILLTO Q 8 100004558 179535 706

ORD BILLTO Q 9 100005661 205113 964

ORD BILLTO Q 10 100006660 231133 1192

ORD BILLTO Q 11 100007976 256415 1489

ORD BILLTO Q 12 100008682 282549 1695

ORD BILLTO Q 13 100009725 307643 1923

ORD BILLTO Q 14 100011096 333224 2257

ORD BILLTO Q 15 100011827 359223 2592

ORD BILLTO Q 16 100013373 384522 2923

ORD BILLTO Q 17 100015551 410143 3295

ORD BILLTO Q 18 100016969 436002 3806

ORD BILLTO Q 19 100018193 461560 4293

ORD BILLTO Q 20 100024510 487015 5827
复制代码
这时更改数据库参数NUM_FREQVALUES到25,观察sysstat.coldist,发现已经为100002441 值收集了统计信息。
TABNAME COLNAME TYPE SEQNO COLVALUE VALCOUNT DISTCOUNT

---------- ---------- ---- ------ ------------ --------- ----------

ORD BILLTO F 1 100000002 5145 -

ORD BILLTO F 2 100004319 4533 -

ORD BILLTO F 3 100008682 4129 -

ORD BILLTO F 4 100000003 4001 -

ORD BILLTO F 5 100004037 3696 -

ORD BILLTO F 6 100017876 3435 -

ORD BILLTO F 7 100005877 3309 -

ORD BILLTO F 8 100000001 3195 -

ORD BILLTO F 9 100004046 3071 -

ORD BILLTO F 10 100004030 3008 -

ORD BILLTO F 11 100002441 2957 -

ORD BILLTO F 12 100011195 2690 -

ORD BILLTO F 13 100002354 2383 -

ORD BILLTO F 14 100004078 2311 -

ORD BILLTO F 15 100003218 2300 -

ORD BILLTO F 16 100002402 2252 -

ORD BILLTO F 17 100004087 2094 -

ORD BILLTO F 18 100004127 1950 -

ORD BILLTO F 19 100010910 1934 -

ORD BILLTO F 20 100003300 1894 -

ORD BILLTO F 21 100004006 1877 -

ORD BILLTO F 22 100003932 1835 -

ORD BILLTO F 23 100001852 1830 -

ORD BILLTO F 24 100002367 1796 -

ORD BILLTO F 25 100002342 1710 -

ORD BILLTO Q 1 100000001 3195 1

ORD BILLTO Q 2 100001808 26186 85

ORD BILLTO Q 3 100002389 51546 179

ORD BILLTO Q 4 100003137 76945 276

ORD BILLTO Q 5 100003272 102535 355

ORD BILLTO Q 6 100003993 128771 513

ORD BILLTO Q 7 100004078 154800 559

ORD BILLTO Q 8 100004558 179535 706

ORD BILLTO Q 9 100005661 205113 964

ORD BILLTO Q 10 100006660 231133 1192

ORD BILLTO Q 11 100007976 256415 1489

ORD BILLTO Q 12 100008682 282549 1695

ORD BILLTO Q 13 100009725 307643 1923

ORD BILLTO Q 14 100011096 333224 2257

ORD BILLTO Q 15 100011827 359223 2592

ORD BILLTO Q 16 100013373 384522 2923

ORD BILLTO Q 17 100015551 410143 3295

ORD BILLTO Q 18 100016969 436002 3806

ORD BILLTO Q 19 100018193 461560 4293

ORD BILLTO Q 20 100024510 487015 5827
复制代码

这时,再次执行该sql,响应时间降为1秒,返回的行数为90850。


Rows selected = 1317

Rows read = 90850

Rows written = 13183


Total User CPU Time used by agent (s) = 1.077983

Total System CPU Time used by agent (s) = 0.000324

Host execution elapsed time = 0.000023
复制代码

总结:

当数据分别不均匀时,需要收集分布式统计信息,否则可能会误导优化器选择错误的路径。当发现cost很低,并出现MSJOIN和NLJOIN操作符时,很大的可能是优化器低估。这时需要找到哪个操作符发生的问题,并一步一步地验证优化器估计值与实际值的偏差。
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值