优化案例1:列值迭代拆分查询-【单列值转行查询】

DM技术交流QQ群:940124259

1. 场景描述

前些天群上有一波开发人员一直追问某一条列值切割查询的SQL运行很慢,他们自己几个人使尽浑身懈数,改来改去没撤,始终执行几十秒,列值越长,运行结果越慢。近日有些忙,又因他们群上光是放个截图,完全没法左右优化,今日把旁边单位事务办完,被他们追着询问,让我方帮忙优化此类慢SQL。

大致需求:通过某个用户ID返回其所属组织ID传值与另一A表做半连接查询,继续将A表所在行中列ID进行切割返回多行数据,有点类似于列转行。

在这里插入图片描述
在这里插入图片描述
看到上图,那个BI_CODE过滤条件直接写明常量值,属于中等的切割分段查询,时间都需要1.5秒,如果遇到10个ID半连接传值,岂不是15.5秒,更别谈很长的列值切分查询,难哟,难上加难。


2. 优化手段

在这里插入图片描述

SQL改写:

-- 0.110秒
with ss(职务ID, temp_num) as (
    select 职务ID, regexp_count(职务ID, ',') + 1 
    FROM "ITAI_PLATFORM"."View_会议议题申报表" 
    where bi_code='202140370' 
-- 以上先针对单值拆分优化
-- where bi_code in (SELECT cast(Ou_OrganizationID as varchar(200)) FROM ITAI_PLATFORM.Bus_OrganizationUser WHERE Ou_UserID = 20201475)
)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(职务ID, ',', level), ',', -1)
from ss
connect by  level <= temp_num;

优化后的执行计划:

1   #NSET2: [4, 2, 1092] 
2     #PIPE2: [4, 2, 1092] 
3       #PRJT2: [3, 2, 1092]; exp_num(1), is_atom(FALSE) 
4         #PRJT2: [3, 2, 1092]; exp_num(2), is_atom(FALSE) 
5           #HIERARCHICAL QUERY: [3, 2, 1092]; KEY_NUM(0);
6             #HEAP TABLE SCAN: [1, 2, 1092]; table_no(0), 
7             #SLCT2: [1, 1, 1092]; ss.temp_num >= var1
8               #HEAP TABLE SCAN: [1, 2, 1092]; table_no(0), 
9       #HEAP TABLE: [1, 2, 1092]; table_no(0) full(FALSE), mpp_full(0) autoid(FALSE) -- with查询结果存放在临时堆表,便于后续的取集运算。
10        #PRJT2: [1, 2, 1092]; exp_num(2), is_atom(FALSE) 
11          #PRJT2: [1, 2, 1092]; exp_num(1), is_atom(FALSE) 
12            #UNION ALL: [1, 2, 1092]  -- 说明View_会议议题申报表是一个视图,合并结果集的sql查询,而且视图条件进行下放到基表。
13              #PRJT2: [1, 1, 1092]; exp_num(25), is_atom(FALSE) 
14                #CSEK2: [1, 1, 1092]; scan_type(UNIQUE), INDEX33555609(TB_会议议题申报表_MYFIELDS), scan_range['202140370','202140370']
15              #PRJT2: [1, 1, 948]; exp_num(25), is_atom(FALSE) 
16                #INDEX JOIN LEFT JOIN2: [1, 1, 948]  ret_null(0)
17                  #SLCT2: [1, 1, 948]; a.INSTANCE_CODE = '202140370'
18                    #CSCN2: [1, 12, 948]; INDEX33556088(FB_会议议题申报表_MAIN as a)
19                  #BLKUP2: [1, 1, 0]; INDEX33555758(b)
20                    #SSEK2: [1, 1, 0]; scan_type(ASC), INDEX33555758(TOPIC_CONFERENCE_TYPE as b), scan_range[a.HYLX,a.HYLX]

3. 案例再现

本次省略半连接查询表,因为现场观察其表很小结果集,对整体影响不大。

create table fstest (c1 varchar(200), c2 varchar(8000));
insert into fstest values('202140370', '20219704,20219980,20219996,20219908,20219613,20219707,20219910,20219911,20219912,20219713,202110172,202110168,202110167,20219741,20219742,20219743,20219744,20219745,20219746,202110056,202110002,20219749,20219750,20219300,20219302,20219301,20219303,20219478,202110088,202110091,20219306,20219794,20219308,20218962,20208642,20208643,202110111,20207853,20219781,20219312,20219053,20207833,20207834,20208535,20208839,20208533,202110200,20219863,202110132,20208351,202110080,20219378,20219405,20219978,20219906,20208155,20208648,20219826,202110022,20208824,20208651,20208652,20208544,20208654,20208538,20207868,20219364,20208266,202110066,20218925,20219960,202110202,20208340,20208339,20208829,20208656,20208657,20208658,20218965,20219080,20219810,20219518,20208267,20219045,20219731,20219200,20208354,20207885,202110082,202110146,20219264,20219566,20219898,20208270,20219333,20208661,20207890,202110124,20218993,202110209,20218996,202110024,20208271,20208663,20208847,202110054,20219752,20208356,202110023,20208347,20219967,20219221,20219538,202110108,20208272,20208665,20208666,20208667,20219537,20219531,20208668,20208669,202110092,20207901,20219089,20208875,20208856,20219082,202110151,20208670,20208671,20208672,20207907,20208294,20208186,202110064,202110011,202110138,202110012,202110041,202110042,20219760,202110069,202110129,20219880,20219883,202110101,20219884,20208676,20219827,20219806,20219739,20219635,20219634,20219633,202110078,20208519,20208678,20219545,20219598,20219605,20219608,20219904,202110115,202110043,20219013,20219016,20219018,202110203,20219022,20219023,20219024,20219025,20219520,20219938,202110106,20219805,20219017,20219021,20219026,20219513,20208872,20219899,20208690,20208513,20208692,202110201,20208843,20219532,20218931,20219609,20208694,20208695,202110185,20219775,20208579,20219864,20219811,20208419,20208421,202110030,20219503,202110139,20208701,20208161,20208163,20207925,20208702,20207927,20218901,20219787,20219717,20219886,20208704,20208705,202110204,20219812,202110198,20208425,20208426,20219768,202110100,20208854,20208859,20208706,20219231,202110045,20219641,20208828,20208710,202110127,202110159,20219764,20219839,20219859,20219846,20219842,20219851,20219044,20208713,20208714,20219643,20219766,20208330,20208428,20208429,20219460,202110104,202110195,20208717,20208882,20219757,20208720,202110093,202110196,20208431,202110154,202110187,202110060,20208588,20218988,20207951,20208617,20208481,20208469,20208433,20219229,20219576,202110199,20207965,20207966,20207967,20207968,20207969,20208580,20219476,20219469,20219857,20208326,20208722,20219226,202110079,20219565,20219582,202110113,20219039,20208725,20208726,20208564,20218969,20219535,20208508,20219834,20207979,20208834,20208501,20208502,20219041,20208504,20208505,20208808,20207990,20218956,20218958,20219577,20207805,20219155,20219821,20207807,20207812,202110205,20219756,20219761,202110175,20219224,20208728,202110047,20208000,20208168,20208169,20208880,20208004,20208438,20218953,20219152,202110015,20208594,20208010,20208729,20208730,20208731,20219505,20218984,20219047,20208439,20219341,20219511,202110032,20219313,20219282,20219773,202110095,202110094,20219223,20208738,20208739,20219250,20208023,202110061,20208028,20208740,20208742,20208743,20219732,20208791,20208030,202110105,20208744,20208745,20219589,20207799,20219365,20219588,20208804,20207801,20208749,20208750,20208751,20208035,20208752,202110206,20208754,202110070,20219459,20219457,20208755,20208756,20208040,20208041,202110207,20208042,20219327,20219328,20219888,20219968');
insert into fstest values('202140379','0219577,20207805,20219155,20219821,20207807,20207812,202110205,20219756,20219761,202110175,20219224,20208728,202110047,20208000,20208168,20208169,20208880,20208004,20208438,20218953,20219152,202110015,20208594,20208010,20208729,20208730,20208731,20219505,20218984,20219047,20208439,20219341,20219511,202110032,20219313,20219282,20219773,202110095,202110094,20219223,20208738,20208739,20219250,20208023,202110061,20208028,20208740,20208742,20208743,20219732,20208791,20208030,202110105,20208744,20208745,20219589,20207799,20219365,20219588,20208804,20207801,20208749,20208750,20208751,20208035,20208752,202110206,20208754,202110070,20219459,20219457,20208755,20208756,20208040,20208041,202110207,20208042,20219327');
commit;

3.1 再现案例写法

尝试书写类似于开发人员的SQL完成一小节的需求

select 
        c1, regexp_substr(c2, '[^,]+', 1, temp.num)
from
        (
            select  c1,
                    c2,
                    length(c2) - length(regexp_replace(c2, ',', '')) + 1 as split_num
            from
                    fstest
            where
                    c1='202140370'
        )
        x
join
        (
            select level num from dual connect by level <= 500
        )
        temp
on
        temp.num <= split_num;       

优化前计划:


1   #NSET2: [11, 1, 97] 
2     #PRJT2: [11, 1, 97]; exp_num(2), is_atom(FALSE) 
3       #SLCT2: [11, 1, 97]; TEMP.NUM <= X.SPLIT_NUM -- 连接条件置后,说明下面的连接操作符属于交叉连接,结果集庞大.
4         #NEST LOOP INNER JOIN2: [11, 1, 97];  -- 嵌套循环连接(外表:层次查询表SYSDUAL, 内表:FSTEST)
5           #PRJT2: [1, 1, 1]; exp_num(1), is_atom(FALSE) 
6             #PRJT2: [1, 1, 1]; exp_num(1), is_atom(FALSE) 
7               #HIERARCHICAL QUERY: [1, 1, 1]; KEY_NUM(0); -- 采用NLIJ方式层次查询
8                 #CSCN2: [1, 1, 1]; SYSINDEXSYSDUAL2(SYSDUAL2 as DUAL)
9                 #SLCT2: [1, 1, 0]; var1 <= 500 -- 层次查询子句使用LEVEL变量,不能使用HI方式
10                  #CSCN2: [1, 1, 0]; SYSINDEXSYSDUAL2(SYSDUAL2 as DUAL)
11          #PRJT2: [1, 1, 96]; exp_num(3), is_atom(FALSE) 
12            #SLCT2: [1, 1, 96]; FSTEST.C1 = '202140370'
13              #CSCN2: [1, 2, 96]; INDEX33555463(FSTEST) -- 被反复多次全表扫描

3.2 SQL改写提升

3.2.1 句型1:CTE + REGEXP_SUBSTR

-- 慢:10.058秒
-- 语句1:固化表查询,提前处理表过滤
with ss(c1,c2, temp_num) as (
select  c1, c2, regexp_count(c2, ',') + 1
from fstest 
where c1 = '202140370'
)
select c1, regexp_substr(c2, '[^,]+', 1, level)
from ss
connect by  level <= temp_num; 

在这里插入图片描述

3.2.2 句型2:CTE + SUBSTRING_INDEX

-- 语句2: 替换regexp_substr正则表达式截取函数,换用substring_index普通字符串函数.
with ss(c1, c2, temp_num) as (
    select  c1, c2, regexp_count(c2, ',') + 1
    from fstest 
    where c1 = '202140370'
)
select c1, substring_index(substring_index(c2, ',', level), ',', -1)
from ss
connect by  level <= temp_num; 

在这里插入图片描述

1   #NSET2: [2, 1, 96] 
2     #PIPE2: [2, 1, 96]  -- 合并临时表与层次查询运算结果,关联过滤SS.TEMP_NUM >= var1
3       #PRJT2: [2, 1, 96]; exp_num(2), is_atom(FALSE) 
4         #PRJT2: [2, 1, 96]; exp_num(3), is_atom(FALSE) 
5           #HIERARCHICAL QUERY: [2, 1, 96]; KEY_NUM(0);
6             #HEAP TABLE SCAN: [1, 1, 96]; table_no(0),  -- 左表:驱动表,依次传值到右表关联过滤查询
7             #SLCT2: [1, 1, 96]; SS.TEMP_NUM >= var1 -- 迭代比较
8               #HEAP TABLE SCAN: [1, 1, 96]; table_no(0), 
9       #HEAP TABLE: [1, 1, 96]; table_no(0) full(FALSE), mpp_full(0) autoid(FALSE) -- 内存堆表临时存放已经过滤好结果集,只被扫描一次全表
10        #PRJT2: [1, 1, 96]; exp_num(3), is_atom(FALSE) 
11          #SLCT2: [1, 1, 96]; FSTEST.C1 = '202140370'
12            #CSCN2: [1, 2, 96]; INDEX33555463(FSTEST)

在这里插入图片描述

3.2.3 句型3:派生表查询 + INSTR + SUBSTR + TRANSLATE

如果数据库版本是8-1-1-190以下的,substring_index函数没得现成,换其他思路并改用其他文本函数。

-- 语句3: 不用正则表达式函数,奇思妙想,换花招。
select  c1, 
		substr(c2, startpos + 1, endpos - startpos - 1) split_column, 
		startpos, 
		endpos, 
		endpos - startpos - 1 lenchar
from (
	select c1, c2, instr(c2, ',', 1, level) startpos,  instr(c2, ',', 1, level + 1 ) endpos
	from (
		select  c1, ',' || c2 || ',' c2
		from fstest 
		where c1 = '202140370'
		)
	connect by level <= length(translate(c2, ',' || c2, ',')) - 1
);

在这里插入图片描述


4. YY瞎论

刚开始通过表面执行计划比较,改写大方向上的SQL代码,以为有所提升,胸有成竹。随着细致地排查,终发现端倪,也就是REGEXP_SUBSTR这个函数在1-126版本的库环境执行是飞叉叉地快,拿到2-94版本的库环境执行像蜗牛,令i非常费解,难道新版本对这函数升级过代码?暂时先不管,大家可以按着我的思路改写,至于这个bug,本人向公司研发上报。如果大伙儿有更为高效和震撼的方法实现同目标问题,欢迎加入鄙人的QQ群,一起技术讨论,共同成长。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值