梁敬彬梁敬弘兄弟出品
往期回顾
SQL等价改写优化案例精选第一部【一统天下】① count(case when)的合并魔力
SQL等价改写优化案例精选第一部【一统天下】②分析函数的穿透之力
本案例将展示如何利用GROUP BY与CASE WHEN的巧妙结合,将一个复杂的UNION查询转化为简洁高效的单一查询,减少表扫描次数,大幅提升性能。
案例3 GROUP BY (CASE WHEN) 之合并大杀器
select decode(so.sFileName, 'SNP_20', 'SNP', 'HNIC_2', 'HNIC', 'IBRC_2', 'IBRC', 'IISMP_', 'IISMP', 'NIC_20', 'NIC', 'NIG_20', 'NIG', 'IIC_20', 'IIC', 'HIIC_2', 'HIIC', 'CA.D.A', 'CA.D.ATSR', 'ULH_20', 'ULH', 'IBRST_', 'IBRST', so.sFileName) 业务名称,
so.sFileCount 合并前文件个数,
so.sRecordNum 合并前总记录数,
ta.tFileCount 合并后文件个数,
ta.tRecordNum 合并后总记录数,
NVL(so1.sFileCount, 0) 合并前当天文件个数,
NVL(so1.sRecordNum, 0) 合并前当天文件总记录数,
NVL(so2.sFileCount, 0) 合并前昨天文件个数,
NVL(so2.sRecordNum, 0) 合并前昨天文件总记录数
from (select substr(a.file_name, 1, 6) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')) a
group by substr(a.file_name, 1, 6)
order by substr(a.file_name, 1, 6)) so
LEFT JOIN
(select substr(a.file_name, 1, 6) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
AND FILE_NAME LIKE '%20100805%') a
group by substr(a.file_name, 1, 6)
order by substr(a.file_name, 1, 6)) so1
ON (so.sFileName = so1.sFileName)
LEFT JOIN
(select substr(a.file_name, 1, 6) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
AND FILE_NAME not LIKE '%20100805%') a
group by substr(a.file_name, 1, 6)
order by substr(a.file_name, 1, 6)) so2
ON (so.sFileName = so2.sFileName)
LEFT JOIN
(select substr(a.file_name, 1, 6) tFileName,
count(*) tFileCount,
sum(record_num) tRecordNum
from (select distinct ipsf.file_name,
ipsf.record_num
from idep_plugin_send_filelist ipsf
where trunc(ipsf.create_time) = to_date('2010-08-05', 'yyyy-mm-dd')
and remark = '处理成功') a
group by substr(a.file_name, 1, 6)
order by substr(a.file_name, 1, 6)) ta
ON (so.sFileName = ta.tFileName)
where so.sFileName not like 'MVI%'
union
select so.sFileName,
so.sFileCount,
(so.sRecordNum - (so.sFileCount * 2)) sRecordNum,
ta.tFileCount,
ta.tRecordNum,
NVL(so1.sFileCount, 0),
(nvl(so1.sRecordNum, 0) - (nvl(so1.sFileCount, 0) * 2)),
NVL(so2.sFileCount, 0),
(nvl(so2.sRecordNum, 0) - (nvl(so2.sFileCount, 0) * 2))
from (select substr(a.file_name, 1, 3) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')) a
group by substr(a.file_name, 1, 3)
order by substr(a.file_name, 1, 3)) so
LEFT JOIN
(select substr(a.file_name, 1, 3) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
AND FILE_NAME LIKE 'MVI100805%') a
group by substr(a.file_name, 1, 3)
order by substr(a.file_name, 1, 3)) so1
ON (so.sFileName = so1.sFileName)
LEFT JOIN
(select substr(a.file_name, 1, 3) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
AND FILE_NAME not LIKE 'MVI100805%') a
group by substr(a.file_name, 1, 3)
order by substr(a.file_name, 1, 3)) so2
ON (so.sFileName = so2.sFileName)
LEFT JOIN
(select substr(a.file_name, 1, 3) tFileName,
count(*) tFileCount,
sum(record_num) tRecordNum
from (select distinct ipsf.file_name,
ipsf.record_num
from idep_plugin_send_filelist ipsf
where trunc(ipsf.create_time) = to_date('2010-08-05', 'yyyy-mm-dd')
and remark = '处理成功') a
group by substr(a.file_name, 1, 3)
order by substr(a.file_name, 1, 3)) ta
ON (so.sFileName = ta.tFileName)
WHERE so.sFileName = 'MVI'
分析
1.将trunc(ipar.relation_time) = to_date(‘2010-08-05’, ‘yyyy-mm-dd’) 等等类似之处改写为如下,要避免对列进行运算,这样会导致用不上索引,除非是建立了函数索引。
ipsf.create_time >= to_date('2010-08-05', 'yyyy-mm-dd') and
ipsf.create_time < to_date('2010-08-05', 'yyyy-mm-dd')+1
2.可通过CASE WHEN 语句进一步减少表扫描次数,如
(count(CASE WHEN FILE_NAME LIKE '%20100805%' THEN 1 END) sFileCount1)
,类似如上的修改,可以等价改写,将本应用的表扫描从8次减少为4次。
3.更进一步 利用
group by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)
,可将表扫描从4次再次减少为2次
最终SQL改写优化为:
select decode(so.sFileName, 'SNP_20', 'SNP', 'HNIC_2', 'HNIC', 'IBRC_2', 'IBRC', 'IISMP_', 'IISMP', 'NIC_20', 'NIC', 'NIG_20', 'NIG', 'IIC_20', 'IIC', 'HIIC_2', 'HIIC', 'CA.D.A', 'CA.D.ATSR', 'ULH_20', 'ULH', 'IBRST_', 'IBRST', so.sFileName) 业务名称,
so.sFileCount 合并前文件个数,
case when so.sfilename like 'MVI%' then (so.sRecordNum - (so.sFileCount * 2)) else so.sRecordNum end 合并前总记录数,
ta.tFileCount 合并后文件个数,
ta.tRecordNum 合并后总记录数,
NVL(so.sFileCount1, 0) 合并前当天文件个数,
case when so.sfilename like 'MVI%'
then (nvl(so.sRecordNum1, 0) - (nvl(so.sFileCount1, 0) * 2))
else NVL(so.sRecordNum1, 0) end 合并前当天文件总记录数,
NVL(so.sFileCount2, 0) 合并前昨天文件个数,
case when so.sfilename like 'MVI%'
then (nvl(so.sRecordNum2, 0) - (nvl(so.sFileCount2, 0) * 2))
else NVL(so.sRecordNum2, 0) end 合并前昨天文件总记录数
from (select substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END) sfileName,
count(*) sFileCount,
sum(sRecordNum) sRecordNum,
count(CASE WHEN (FILE_NAME LIKE '%20100805%' AND FILE_NAME not like 'MVI%') OR (FILE_NAME LIKE 'MVI100805%' AND FILE_NAME like 'MVI%') THEN 1 END) sFileCount1,
sum (CASE WHEN (FILE_NAME LIKE '%20100805%' AND FILE_NAME not like 'MVI%') OR (FILE_NAME LIKE 'MVI100805%' AND FILE_NAME like 'MVI%') THEN sRecordNum END) sRecordNum1,
count(CASE WHEN (FILE_NAME NOT LIKE '%20100805%' AND FILE_NAME not like 'MVI%') OR (FILE_NAME NOT LIKE 'MVI100805%' AND FILE_NAME like 'MVI%') THEN 1 END) sFileCount2,
sum (CASE WHEN (FILE_NAME NOT LIKE '%20100805%' AND FILE_NAME not like 'MVI%') OR (FILE_NAME NOT LIKE 'MVI100805%' AND FILE_NAME like 'MVI%') THEN sRecordNum END) sRecordNum2
from (select distinct bsf.file_name,
bsf.record_num sRecordNum,
bsf.create_time
from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
where bsf.file_id = ipar.bus_file_id
and ipar.relation_time >= to_date('2010-08-05', 'yyyy-mm-dd')
and ipar.relation_time < to_date('2010-08-05', 'yyyy-mm-dd')+1) a
group by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)
order by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)) so
LEFT JOIN
(select substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END) tFileName,
count(*) tFileCount,
sum(record_num) tRecordNum
from (select distinct ipsf.file_name,
ipsf.record_num
from idep_plugin_send_filelist ipsf
where ipsf.create_time >= to_date('2010-08-05', 'yyyy-mm-dd')
and ipsf.create_time < to_date('2010-08-05', 'yyyy-mm-dd')+1
and remark = '处理成功') a
group by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)
order by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)) ta
ON (so.sFileName = ta.tFileName)
优化亮点解析
这个优化案例展示了几个精巧的SQL改写技巧:
- 动态分组长度:使用CASE WHEN动态决定截取文件名的长度,统一处理MVI文件和非MVI文件,消除了UNION操作。
- 条件聚合函数:通过COUNT(CASE WHEN…)和SUM(CASE
WHEN…)在一次查询中完成多种条件的统计,替代了原来的多个LEFT JOIN。 - 索引友好的日期条件:使用范围条件代替TRUNC函数,使索引能够发挥作用。
- 内联计算特殊值:直接在主查询中处理MVI文件的特殊记录数计算,避免了额外的分支逻辑。
性能提升效果
通过这一系列优化,我们将表扫描次数从原来的8次减少到了2次,同时保持了原有的业务逻辑不变。这带来了显著的性能改善:
- 查询响应时间大幅降低
- 数据库资源占用减少
- 系统整体吞吐量提升
总结
本案例充分展示了GROUP BY与CASE WHEN组合使用的强大威力。通过动态调整分组条件和使用条件聚合函数,我们可以大幅减少重复查询和表扫描次数,提升SQL执行效率。这种"一统天下"的优化思路,不仅简化了SQL结构,还带来了显著的性能提升,是数据库优化中值得掌握的关键技巧。
当面对类似的复杂查询时,不妨尝试从合并相似逻辑、减少表扫描次数的角度思考,看是否能用GROUP BY与CASE WHEN的组合来简化查询结构,提升性能。
未完待续…
SQL等价改写优化案例精选第一部【一统天下】④ or 与 union all的优化交响