一个sql语句的优化

最新推荐文章于 2024-09-20 16:43:14 发布

hellen1029

最新推荐文章于 2024-09-20 16:43:14 发布

阅读量956

点赞数

分类专栏： db2 文章标签： sql db2 date workflow header reference

本文链接：https://blog.csdn.net/hellen1029/article/details/232954

版权

db2 专栏收录该内容

0 篇文章 0 订阅

订阅专栏

by hellen

一、优化前的sql 及存取方案图

1. 原Sql:

SELECT * FROM (SELECT ROW_NUMBER() OVER (ORDER BY INNER_SET.ORDER_DATE DESC) AS ROW_NO, INNER_SET.* FROM

( select order_id,order_date

from

(select order_Id, order_Type_Id, status_Id, owner_Id, order_date,created_By from Order_Header where 1=1 and order_Type_Id = 'DRPR_ORDER' order by order_date desc ) ord

where 1=1 and

(1=2 or exists

(select order_id from Order_Attribute where attr_name='ORDER_CORPORATION_ID' and (1=2 or attr_Value = '10000' or attr_Value = '11000' ) and ord.order_id=order_id )

or exists (select order_id from Order_Attribute where attr_name='BUSINESS_MODE' and attr_Value='BM_DDCG' and ord.order_id=order_id ) )

and exists(select work_effort_id from Work_Effort where workflow_Activity_Id='purchase_approve' and current_Status_Id='WF_RUNNING' and ord.order_id = source_Reference_Id)

and exists(select oi.order_id from Order_Item oi ,product prdt where 1=1 and oi.product_id=prdt.product_id and prdt.product_Name like '%海尔洗衣机%' ) ORDER BY ord.ORDER_DATE DESC ) AS INNER_SET) AS OUTER_SET WHERE OUTER_SET.ROW_NO > 0 AND OUTER_SET.ROW_NO <= 15

2.原语句sql分析:

图1

二、优化过程:

分析1：

分析：图1 WorkEffort表：使用的是TBSCAN ,未利用索引，效率低，

尝试加索引 pk_effort (workflow_Activity_Id、current_Status_Id )

处理：sql不变，对WorkEffort加索引后，sql解释如下

对比：

Work_Effort索引状况	查找时间	总成本	效率
未加索引	122,451.53	240,639.38	低
加索引	120,614.60	238,177.41	改善

分析2： ORDER BY ord.ORDER_DATE DESC

有两个地方提到，显然重复，去掉第一个ORDER BY ord.ORDER_DATE DESC

去掉重复的order by	总成本	效率
去掉前	238,177.41
去掉后	158,733.84	改善

分析3：

分析：本查询采用select * from order_header where …, where语句在基本表order_header基础上加了大量的判断.

处理：采取减少基本表 order_header的大小。Sql改为：

SELECT * FROM (SELECT ROW_NUMBER() OVER (ORDER BY INNER_SET.ORDER_DATE DESC) AS ROW_NO, INNER_SET.* FROM

( select order_id,order_date

from

(select ord.order_Id, order_Type_Id, ord.status_Id, owner_Id, order_date,created_By

from Order_Header ord, Work_Effort effort

where ord.order_Type_Id = 'DRPR_ORDER' and

effort.workflow_Activity_Id='purchase_approve' and current_Status_Id='WF_RUNNING'

and ord.order_id = effort.source_Reference_Id

) tmp

where 1=1 and

(1=2 or exists (select order_id from Order_Attribute where attr_name='ORDER_CORPORATION_ID' and (1=2 or attr_Value = '10000'

or attr_Value = '11000' or attr_Value = '20000' or attr_Value = '30000' or attr_Value = '31000' or attr_Value = '50000' or

attr_Value = '60000' or attr_Value = '90000' ) and tmp.order_id=order_id )

or exists (select order_id from Order_Attribute where attr_name='BUSINESS_MODE' and attr_Value='BM_DDCG' and tmp.order_id=order_id ) )

and exists(select oi.order_id from Order_Item oi ,product prdt where 1=1 and oi.product_id=prdt.product_id and prdt.product_Name like '%海尔洗衣机%' )

ORDER BY tmp.ORDER_DATE DESC ) AS INNER_SET) AS OUTER_SET WHERE OUTER_SET.ROW_NO > 0 AND OUTER_SET.ROW_NO <= 15

对比：

Order_header 处理	总成本	效率
原sql	158,733.84	低
现sql	146,423.36	改善

分析4：

分析：分析3的sql 对order_attribute 的条件限制有重复；

处理: 修改sql

SELECT *

FROM

(SELECT ROW_NUMBER() OVER (ORDER BY INNER_SET.ORDER_DATE

DESC) AS ROW_NO, INNER_SET.*

FROM (

select order_id,order_date

from

(select header.order_Id, order_Type_Id, header.status_Id,

owner_Id, order_date,created_By

from Order_Header header,Work_Effort effort

where 1=1 and header.order_Type_Id = 'DRPR_ORDER' and

workflow_Activity_Id='purchase_approve' and current_Status_Id='WF_RUNNING'

and header.order_id = effort.source_Reference_Id

) ord

where 1=1 and (1=2 or exists

(select order_id

from Order_Attribute

where ((attr_name='ORDER_CORPORATION_ID' and (1=2 or

attr_Value = '10000' or attr_Value = '11000' or

attr_Value = '20000' or attr_Value = '30000' or

attr_Value = '31000' or attr_Value = '50000' or

attr_Value = '60000' or attr_Value = '90000' ))

or ( attr_name='BUSINESS_MODE' and attr_Value='BM_DDCG'

)) and ord.order_id=order_id )) and exists(select

oi.order_id

from Order_Item oi ,product prdt

where 1=1 and oi.product_id=prdt.product_id and prdt.product_Name

like '%海尔%' )

ORDER BY ord.ORDER_DATE DESC ) AS INNER_SET) AS OUTER_SET

WHERE OUTER_SET.ROW_NO > 0 AND OUTER_SET.ROW_NO <= 15

对比：

Sql修改	总成本	效率
原sql	146,423.36	低
现sql	136,035.56	改善

分析5：

提高本机的内存后分析4中的sql执行效率为： 26,783.24

分析 IXSCAN的时间，表order_item 索引查找时间11,778异常长，远高于其他表的查找时间。因此分析表order_item。表order_item在product_id上建有索引

select oi.order_id from Order_Item oi ,product prdt

where 1=1 and oi.product_id=prdt.product_id and prdt.product_Name

like '%海尔洗衣机%'

表中有13万条记录，product表有10万条记录，在product_id也建有索引

查看order_item表的统计信息。看出该表的统计信息未更新。

处理：

1 . 察看order_item 表的统计信息

db2 => reorgchk current statistics on table db2inst1.order_item

表统计信息：

F1: 100 * OVERFLOW / CARD < 5

F2: 100 * TSIZE / ((FPAGES-1) * (TABLEPAGESIZE-76)) > 70

F3: 100 * NPAGES / FPAGES > 80

CREATOR NAME CARD OV NP FP TSIZE F 1 F 2 F 3 REORG

--------------------------------------------------------------------------------

DB2INST1 ORDER_ITEM 138064 81 7116 7116 27888928 0 97 100 ---

--------------------------------------------------------------------------------

索引统计信息：

F4: CLUSTERRATIO 或正常化的 CLUSTERFACTOR > 80

F5: 100 * (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) / (NLEAF * INDEXPAGESIZE) > 50

F6: (100-PCTFREE) * (INDEXPAGESIZE-96) / (ISIZE+12) ** (NLEVELS-2) * (INDEXPAGES

IZE-96) / (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) < 100

CREATOR NAME CARD LEAF LVLS ISIZE KEYS F 4 F 5 F 6 REORG

--------------------------------------------------------------------------------

表：DB2INST1.ORDER_ITEM

DB2INST1 ORDER_ITEM_HDR 138064 369 3 10 48245 97 81 53 ---

DB2INST1 ORDER_ITEM_ORTYP 138064 192 2 6 1 100 70 0 ---

DB2INST1 ORDER_ITEM_PRODUCT 138064 201 3 12 2465 43 71 101 *-*

DB2INST1 ORDER_ITEM_STTS 138064 193 2 17 26 88 69 0 ---

DB2INST1 ORDER_ITEM_SYST 138064 192 2 6 1 100 70 0 ---

DB2INST1 PK_ORDER_ITEM 138064 783 3 15 138064 97 99 16 ---

--------------------------------------------------------------------------------

CLUSTERRATIO 或正常化的 CLUSTERFACTOR（F4）将指示索引需要

REORG，该索引与基本表不在相同的序列中。当在表中定义了多个索引时，一个或多个索引

可能被标记为需要 REORG。指定 REORG 顺序的最重要索引。

2. 重新组织索引：

reorg table db2inst1.order_item index db2inst1.ORDER_ITEM_PRODUCT use tempspace1

3. 进行统计 RUNSTATS ON TABLE db2inst1.order_item AND INDEXES ALL

4.重新运行 reorgchk

索引统计信息：

F4: CLUSTERRATIO 或正常化的 CLUSTERFACTOR > 80

F5: 100 * (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) / (NLEAF * INDEXPAGESIZE) > 50

F6: (100-PCTFREE) * (INDEXPAGESIZE-96) / (ISIZE+12) ** (NLEVELS-2) * (INDEXPAGES

IZE-96) / (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) < 100

CREATOR NAME CARD LEAF LVLS ISIZE KEYS F 4 F 5 F 6 REORG

--------------------------------------------------------------------------------

表：DB2INST1.ORDER_ITEM

DB2INST1 ORDER_ITEM_HDR 138086 366 3 10 48266 28 81 53 *--

DB2INST1 ORDER_ITEM_ORTYP 138086 193 2 6 1 100 69 0 ---

DB2INST1 ORDER_ITEM_PRODUCT 138086 202 3 12 2465 99 71 101 --*

DB2INST1 ORDER_ITEM_STTS 138086 194 2 17 26 83 69 0 ---

DB2INST1 ORDER_ITEM_SYST 138086 193 2 6 1 100 69 0 ---

DB2INST1 PK_ORDER_ITEM 138086 726 3 15 138086 26 106 16 *--

--------------------------------------------------------------------------------

CLUSTERRATIO 或正常化的 CLUSTERFACTOR（F4）将指示索引需要

REORG，该索引与基本表不在相同的序列中。当在表中定义了多个索引时，一个或多个索引

可能被标记为需要 REORG。指定 REORG 顺序的最重要索引。

5. REBIND 所有的数据库包，以便静态 SQL 可以利用最新的系统统计信息。

db2rbind spider81 -l logfile.out -u db2inst1 -p ibmdb2

结果：

对比：

	总成本	效率
Reorg前	26,783.24
Reorg后	48,721.15	更差了!

分析：

product 原来是ixscan,现在是 tbscan. Reorg可能改变了与之相关的表的信息。

而相关的Sql 是:

select oi.order_id from Order_Item oi ,product prdt

where 1=1 and oi.product_id=prdt.product_id and prdt.product_Name

like '%海尔洗衣机%'

处理: 在product表对字段 product_name 创建索引 PRODUCT_PRDNAME 。

再分析sql

分析: 索引用到了，但是时间 33,794.18和tbscan的34,156 相比并没有显著改善，为什么呢

运行一下: reorgchk current statistics on table db2inst1.product

表统计信息：

F1: 100 * OVERFLOW / CARD < 5

F2: 100 * TSIZE / ((FPAGES-1) * (TABLEPAGESIZE-76)) > 70

F3: 100 * NPAGES / FPAGES > 80

CREATOR NAME CARD OV NP FP TSIZE F 1 F 2 F 3 REORG

--------------------------------------------------------------------------------

DB2INST1 PRODUCT 103168 2 8688 8688 33839104 0 96 100 ---

--------------------------------------------------------------------------------

索引统计信息：

F4: CLUSTERRATIO 或正常化的 CLUSTERFACTOR > 80

F5: 100 * (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) / (NLEAF * INDEXPAGESIZE) > 50

F6: (100-PCTFREE) * (INDEXPAGESIZE-96) / (ISIZE+12) ** (NLEVELS-2) * (INDEXPAGES

IZE-96) / (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) < 100

CREATOR NAME CARD LEAF LVLS ISIZE KEYS F 4 F 5 F 6 REORG

--------------------------------------------------------------------------------

表：DB2INST1.PRODUCT

DB2INST1 PK_PRODUCT - - - - - - - - ---

DB2INST1 PRODUCT_PRDNAME - - - - - - - - ---

--------------------------------------------------------------------------------

CLUSTERRATIO 或正常化的 CLUSTERFACTOR（F4）将指示索引需要

REORG，该索引与基本表不在相同的序列中。当在表中定义了多个索引时，一个或多个索引

可能被标记为需要 REORG。指定 REORG 顺序的最重要索引。

结果令人吃惊，为什么索引没内容呢

再运行 RUNSTATS ON TABLE db2inst1.product AND INDEXES ALL

表统计信息：

F1: 100 * OVERFLOW / CARD < 5

F2: 100 * TSIZE / ((FPAGES-1) * (TABLEPAGESIZE-76)) > 70

F3: 100 * NPAGES / FPAGES > 80

CREATOR NAME CARD OV NP FP TSIZE F 1 F 2 F 3 REORG

--------------------------------------------------------------------------------

DB2INST1 PRODUCT 103168 2 8688 8688 33839104 0 96 100 ---

--------------------------------------------------------------------------------

索引统计信息：

F4: CLUSTERRATIO 或正常化的 CLUSTERFACTOR > 80

F5: 100 * (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) / (NLEAF * INDEXPAGESIZE) > 50

F6: (100-PCTFREE) * (INDEXPAGESIZE-96) / (ISIZE+12) ** (NLEVELS-2) * (INDEXPAGES

IZE-96) / (KEYS * (ISIZE+8) + (CARD-KEYS) * 4) < 100

CREATOR NAME CARD LEAF LVLS ISIZE KEYS F 4 F 5 F 6 REORG

--------------------------------------------------------------------------------

表：DB2INST1.PRODUCT

DB2INST1 PK_PRODUCT 103168 480 3 10 103168 95 94 35 ---

DB2INST1 PRODUCT_PRDNAME 103168 271 3 23 16107 53 76 48 *--

--------------------------------------------------------------------------------

CLUSTERRATIO 或正常化的 CLUSTERFACTOR（F4）将指示索引需要

REORG，该索引与基本表不在相同的序列中。当在表中定义了多个索引时，一个或多个索引

可能被标记为需要 REORG。指定 REORG 顺序的最重要索引。

运行sql

对product建索引	总成本	效率
索引前	48,721.15
索引后	22,207.1

也比分析5的26,783.24 有了提高。

分析6：

分析：将分析4的sql 放到程序中运行，发现不加商品名称和加商品名称查询时居然记录数是一样的！仔细看sql ，原来 Order_Item 和 product关联时没和order_header关联。

处理：修改sql如下：红字为新加

SELECT *

FROM

(SELECT ROW_NUMBER() OVER (ORDER BY INNER_SET.ORDER_DATE

DESC) AS ROW_NO, INNER_SET.*

FROM (

select order_id,order_date

from

(select header.order_Id, order_Type_Id, header.status_Id,

owner_Id, order_date,created_By

from Order_Header header,Work_Effort effort

where 1=1 and header.order_Type_Id = 'DRPR_ORDER' and

workflow_Activity_Id='purchase_approve' and current_Status_Id='WF_RUNNING'

and header.order_id = effort.source_Reference_Id

) ord

where 1=1 and (1=2 or exists

(select order_id

from Order_Attribute

where ((attr_name='ORDER_CORPORATION_ID' and (1=2 or

attr_Value = '10000' or attr_Value = '11000' or

attr_Value = '20000' or attr_Value = '30000' or

attr_Value = '31000' or attr_Value = '50000' or

attr_Value = '60000' or attr_Value = '90000' ))

or ( attr_name='BUSINESS_MODE' and attr_Value='BM_DDCG'

)) and ord.order_id=order_id )) and exists(select

oi.order_id

from Order_Item oi ,product prdt

where 1=1 and oi.product_id=prdt.product_id and prdt.product_Name

like '%海尔%' and oi.order_id=ord.order_id)

ORDER BY ord.ORDER_DATE DESC ) AS INNER_SET) AS OUTER_SET

WHERE OUTER_SET.ROW_NO > 0 AND OUTER_SET.ROW_NO <= 15

Sql存取图:

	总成本	效率
改动前	22,207.1
改动后	6,819.9

再分析存取图：显然针对 product表的productName所作的索引没用上，考虑去掉productname索引 ,在运行一下sql 总成本不变。因此去掉 productName的索引。

最终处理方式:

1. work_effort加索引： pk_effort

2. sql 为分析6的sql

优化的建议：

1. 尽量减少order by、exsist 字句的使用

2. 在两个表连接列上建索引 where a=b

3．对于where (a=b or c=d) and e=f 情况, 在 and 后的e,f表的索引起作用

4. 对表作改动时，要进行必要的reorg 和runstats

contact me: mousehellen@hotmail.com

hellen1029

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
1
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录

一个sql语句的优化

by hellen

一、优化前的sql 及存取方案图

1. 原Sql:

2.原语句sql分析:

二、优化过程:

分析1：

分析6：

最终 处理方式:

最终处理方式: