从 TPCH 测试学习性能优化技巧之 Q22

润乾软件

于 2020-05-21 09:54:56 发布

阅读量137

点赞数

分类专栏：集算器

本文链接：https://blog.csdn.net/raqsoft/article/details/106250284

版权

集算器专栏收录该内容

263 篇文章 5 订阅

订阅专栏

一、查询要求

Q22语句查询获得消费者可能购买的地理分布。本查询计算在指定的国家，比平均水平更持肯定态度但还没下七年订单的消费者数量。能反应出普通消费者的态度，即购买意向。

Q22语句的特点是：带有分组、排序、聚集、子查询、NOT EXISTS子查询操作并存的两表连接操作。

二、 Oracle执行

Oracle编写的查询SQL语句如下：

select /*+ parallel(n) */

cntrycode,

count(*) as numcust,

sum(c_acctbal) as totacctbal

from

(

select

substr(c_phone,1,2) as cntrycode,

c_acctbal

from

customer

where

substr(c_phone,1,2) in

('11', '14', '15', '19', '20', '21', '23')

and c_acctbal > (

select

avg(c_acctbal)

from

customer

where

c_acctbal > 0.00

and substr(c_phone,1,2) in

('11', '14', '15', '19', '20', '21', '23')

)

and not exists (

select

from

orders

where

o_custkey = c_custkey

)

) custsale

group by

cntrycode

order by

cntrycode;

其中/*+ parallel(n) */ 是Oracle的并行查询语法，n是并行数。

脚本执行时间，单位：秒

并行数	1	2	4	8	12
Oracle	128	86	70	53	48

三、 SPL优化

这个查询的外层是个常规的分组聚合，其主要复杂度在于内层的两个用于条件的子查询。这两个子查询能对应一些满足条件的customer记录，而后一个子查询是用了not exist修饰，最后结果应当是这两个子查询对应的customer记录集合的差集。

SPL脚本如下：

	A
1	=1
2	=now()
3	=["11", "14", "15", "19", "20", "21", "23"]
4	=file(path+"customer.ctx").create().cursor@m(C_CUSTKEY,C_PHONE,C_ACCTBAL;C_ACCTBAL > 0.0 && A3.contain(left(C_PHONE,2));A1).fetch()
5	=A4.avg(C_ACCTBAL)
6	=A4.select(C_ACCTBAL > A5).derive@o().keys@i(C_CUSTKEY)
7	=file(path+"orders.ctx").create().cursor@m(O_CUSTKEY;A6.find(O_CUSTKEY);A1)
8	=A7.groups(O_CUSTKEY:C_CUSTKEY)
9	=[A6,A8].merge@d(C_CUSTKEY)
10	=A9.groups(left(C_PHONE,2):cntrycode; count(1):numcust, sum(C_ACCTBAL):totacctbal)
11	=now()
12	=interval@s(A2,A11)