SQL等价改写优化案例精选第一部【一统天下】③ group by (case when)的聚合艺术

梁敬彬梁敬弘兄弟出品

往期回顾

SQL等价改写优化案例精选第一部【一统天下】① count(case when)的合并魔力
SQL等价改写优化案例精选第一部【一统天下】②分析函数的穿透之力

本案例将展示如何利用GROUP BY与CASE WHEN的巧妙结合,将一个复杂的UNION查询转化为简洁高效的单一查询,减少表扫描次数,大幅提升性能。

在这里插入图片描述

案例3 GROUP BY (CASE WHEN) 之合并大杀器

select decode(so.sFileName, 'SNP_20', 'SNP', 'HNIC_2', 'HNIC', 'IBRC_2', 'IBRC', 'IISMP_', 'IISMP', 'NIC_20', 'NIC', 'NIG_20', 'NIG', 'IIC_20', 'IIC', 'HIIC_2', 'HIIC', 'CA.D.A', 'CA.D.ATSR', 'ULH_20', 'ULH', 'IBRST_', 'IBRST', so.sFileName) 业务名称,
       so.sFileCount 合并前文件个数,
       so.sRecordNum 合并前总记录数,
       ta.tFileCount 合并后文件个数,
       ta.tRecordNum 合并后总记录数,
       NVL(so1.sFileCount, 0) 合并前当天文件个数,
       NVL(so1.sRecordNum, 0) 合并前当天文件总记录数,
       NVL(so2.sFileCount, 0) 合并前昨天文件个数,
       NVL(so2.sRecordNum, 0) 合并前昨天文件总记录数
from (select substr(a.file_name, 1, 6) sfileName,
             count(*) sFileCount,
             sum(sRecordNum) sRecordNum
      from (select distinct bsf.file_name,
                   bsf.record_num sRecordNum,
                   bsf.create_time
            from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
            where bsf.file_id = ipar.bus_file_id 
                  and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')) a
      group by substr(a.file_name, 1, 6)
      order by substr(a.file_name, 1, 6)) so
LEFT JOIN 
      (select substr(a.file_name, 1, 6) sfileName,
              count(*) sFileCount,
              sum(sRecordNum) sRecordNum
       from (select distinct bsf.file_name,
                    bsf.record_num sRecordNum,
                    bsf.create_time
             from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
             where bsf.file_id = ipar.bus_file_id
                   and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
                   AND FILE_NAME LIKE '%20100805%') a
       group by substr(a.file_name, 1, 6)
       order by substr(a.file_name, 1, 6)) so1
ON (so.sFileName = so1.sFileName)
LEFT JOIN 
       (select substr(a.file_name, 1, 6) sfileName,
               count(*) sFileCount,
               sum(sRecordNum) sRecordNum
        from (select distinct bsf.file_name,
                     bsf.record_num sRecordNum,
                     bsf.create_time
              from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
              where bsf.file_id = ipar.bus_file_id
                    and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
                    AND FILE_NAME not LIKE '%20100805%') a
        group by substr(a.file_name, 1, 6)
        order by substr(a.file_name, 1, 6)) so2
ON (so.sFileName = so2.sFileName)
LEFT JOIN 
        (select substr(a.file_name, 1, 6) tFileName,
                count(*) tFileCount,
                sum(record_num) tRecordNum
         from (select distinct ipsf.file_name, 
                      ipsf.record_num
               from idep_plugin_send_filelist ipsf
               where trunc(ipsf.create_time) = to_date('2010-08-05', 'yyyy-mm-dd')
                     and remark = '处理成功') a
         group by substr(a.file_name, 1, 6)
         order by substr(a.file_name, 1, 6)) ta
         ON (so.sFileName = ta.tFileName)
         where so.sFileName not like 'MVI%'
union
select so.sFileName,
       so.sFileCount,
       (so.sRecordNum - (so.sFileCount * 2)) sRecordNum,
       ta.tFileCount,
       ta.tRecordNum,
       NVL(so1.sFileCount, 0),
       (nvl(so1.sRecordNum, 0) - (nvl(so1.sFileCount, 0) * 2)),
       NVL(so2.sFileCount, 0),
       (nvl(so2.sRecordNum, 0) - (nvl(so2.sFileCount, 0) * 2))
from (select substr(a.file_name, 1, 3) sfileName,
             count(*) sFileCount,
             sum(sRecordNum) sRecordNum
      from (select distinct bsf.file_name,
                   bsf.record_num sRecordNum,
                   bsf.create_time
            from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
            where bsf.file_id = ipar.bus_file_id
                  and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')) a
      group by substr(a.file_name, 1, 3)
      order by substr(a.file_name, 1, 3)) so
LEFT JOIN 
     (select substr(a.file_name, 1, 3) sfileName,
             count(*) sFileCount,
             sum(sRecordNum) sRecordNum
      from (select distinct bsf.file_name,
                   bsf.record_num sRecordNum,
                   bsf.create_time
            from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
            where bsf.file_id = ipar.bus_file_id
                  and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
                  AND FILE_NAME LIKE 'MVI100805%') a
      group by substr(a.file_name, 1, 3)
      order by substr(a.file_name, 1, 3)) so1
ON (so.sFileName = so1.sFileName)
LEFT JOIN 
     (select substr(a.file_name, 1, 3) sfileName,
             count(*) sFileCount,
             sum(sRecordNum) sRecordNum
      from (select distinct bsf.file_name,
                   bsf.record_num sRecordNum,
                   bsf.create_time
            from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
            where bsf.file_id = ipar.bus_file_id
                  and trunc(ipar.relation_time) = to_date('2010-08-05', 'yyyy-mm-dd')
                  AND FILE_NAME not LIKE 'MVI100805%') a
      group by substr(a.file_name, 1, 3)
      order by substr(a.file_name, 1, 3)) so2
ON (so.sFileName = so2.sFileName)
LEFT JOIN 
     (select substr(a.file_name, 1, 3) tFileName,
             count(*) tFileCount,
             sum(record_num) tRecordNum
      from (select distinct ipsf.file_name, 
                   ipsf.record_num
            from idep_plugin_send_filelist ipsf
            where trunc(ipsf.create_time) = to_date('2010-08-05', 'yyyy-mm-dd')
                  and remark = '处理成功') a
      group by substr(a.file_name, 1, 3)
      order by substr(a.file_name, 1, 3)) ta
ON (so.sFileName = ta.tFileName)
WHERE so.sFileName = 'MVI' 

分析

1.将trunc(ipar.relation_time) = to_date(‘2010-08-05’, ‘yyyy-mm-dd’) 等等类似之处改写为如下,要避免对列进行运算,这样会导致用不上索引,除非是建立了函数索引。

ipsf.create_time >= to_date('2010-08-05', 'yyyy-mm-dd') and 
ipsf.create_time < to_date('2010-08-05', 'yyyy-mm-dd')+1

2.可通过CASE WHEN 语句进一步减少表扫描次数,如

count(CASE WHEN FILE_NAME LIKE '%20100805%' THEN 1 END) sFileCount1)

,类似如上的修改,可以等价改写,将本应用的表扫描从8次减少为4次。

3.更进一步 利用

group by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)

,可将表扫描从4次再次减少为2次

最终SQL改写优化为:

select decode(so.sFileName, 'SNP_20', 'SNP', 'HNIC_2', 'HNIC', 'IBRC_2', 'IBRC', 'IISMP_', 'IISMP', 'NIC_20', 'NIC', 'NIG_20', 'NIG', 'IIC_20', 'IIC', 'HIIC_2', 'HIIC', 'CA.D.A', 'CA.D.ATSR', 'ULH_20', 'ULH', 'IBRST_', 'IBRST', so.sFileName) 业务名称,
       so.sFileCount 合并前文件个数,
       case when so.sfilename like 'MVI%' then (so.sRecordNum - (so.sFileCount * 2)) else so.sRecordNum end 合并前总记录数,
       ta.tFileCount 合并后文件个数,
       ta.tRecordNum 合并后总记录数,
       NVL(so.sFileCount1, 0) 合并前当天文件个数,
       case when so.sfilename like 'MVI%' 
       then  (nvl(so.sRecordNum1, 0) - (nvl(so.sFileCount1, 0) * 2))  
       else NVL(so.sRecordNum1, 0) end  合并前当天文件总记录数,
       NVL(so.sFileCount2, 0) 合并前昨天文件个数,
       case when so.sfilename like 'MVI%' 
       then  (nvl(so.sRecordNum2, 0) - (nvl(so.sFileCount2, 0) * 2))  
       else NVL(so.sRecordNum2, 0) end 合并前昨天文件总记录数
from (select substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END) sfileName,
             count(*) sFileCount,
             sum(sRecordNum) sRecordNum,
             count(CASE WHEN (FILE_NAME LIKE '%20100805%'  AND FILE_NAME not like 'MVI%') OR (FILE_NAME LIKE  'MVI100805%'  AND FILE_NAME like 'MVI%') THEN 1 END) sFileCount1,
             sum  (CASE WHEN (FILE_NAME LIKE '%20100805%'  AND FILE_NAME not like 'MVI%') OR (FILE_NAME LIKE  'MVI100805%'  AND FILE_NAME like 'MVI%') THEN sRecordNum END) sRecordNum1,
             count(CASE WHEN (FILE_NAME NOT LIKE '%20100805%' AND FILE_NAME not like 'MVI%') OR (FILE_NAME NOT LIKE  'MVI100805%' AND FILE_NAME like 'MVI%') THEN 1 END) sFileCount2,
             sum  (CASE WHEN (FILE_NAME NOT LIKE '%20100805%' AND FILE_NAME not like 'MVI%') OR (FILE_NAME NOT LIKE  'MVI100805%' AND FILE_NAME like 'MVI%') THEN sRecordNum END) sRecordNum2
      from (select distinct bsf.file_name,
                   bsf.record_num sRecordNum,
                   bsf.create_time
            from BUSINESS_SEND_FILELIST bsf, IDEP_PLUGIN_AUTO_RELATION ipar
            where bsf.file_id = ipar.bus_file_id 
                  and ipar.relation_time >= to_date('2010-08-05', 'yyyy-mm-dd')
                  and ipar.relation_time < to_date('2010-08-05', 'yyyy-mm-dd')+1) a
      group by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)
      order by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)) so
LEFT JOIN 
        (select substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END) tFileName,
                count(*) tFileCount,
                sum(record_num) tRecordNum
         from (select distinct ipsf.file_name, 
                      ipsf.record_num
               from idep_plugin_send_filelist ipsf
               where ipsf.create_time >= to_date('2010-08-05', 'yyyy-mm-dd')
                     and ipsf.create_time < to_date('2010-08-05', 'yyyy-mm-dd')+1
                     and remark = '处理成功') a
         group by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)
         order by substr(a.file_name, 1, CASE WHEN a.file_name like 'MVI%' THEN 3 ELSE 6 END)) ta
         ON (so.sFileName = ta.tFileName)

优化亮点解析

这个优化案例展示了几个精巧的SQL改写技巧:

  1. 动态分组长度:使用CASE WHEN动态决定截取文件名的长度,统一处理MVI文件和非MVI文件,消除了UNION操作。
  2. 条件聚合函数:通过COUNT(CASE WHEN…)和SUM(CASE
    WHEN…)在一次查询中完成多种条件的统计,替代了原来的多个LEFT JOIN。
  3. 索引友好的日期条件:使用范围条件代替TRUNC函数,使索引能够发挥作用。
  4. 内联计算特殊值:直接在主查询中处理MVI文件的特殊记录数计算,避免了额外的分支逻辑。

性能提升效果

通过这一系列优化,我们将表扫描次数从原来的8次减少到了2次,同时保持了原有的业务逻辑不变。这带来了显著的性能改善:

  • 查询响应时间大幅降低
  • 数据库资源占用减少
  • 系统整体吞吐量提升

总结

本案例充分展示了GROUP BY与CASE WHEN组合使用的强大威力。通过动态调整分组条件和使用条件聚合函数,我们可以大幅减少重复查询和表扫描次数,提升SQL执行效率。这种"一统天下"的优化思路,不仅简化了SQL结构,还带来了显著的性能提升,是数据库优化中值得掌握的关键技巧。

当面对类似的复杂查询时,不妨尝试从合并相似逻辑、减少表扫描次数的角度思考,看是否能用GROUP BY与CASE WHEN的组合来简化查询结构,提升性能。

未完待续…
SQL等价改写优化案例精选第一部【一统天下】④ or 与 union all的优化交响

系列回顾

“大白话人工智能” 系列
“数据库拍案惊奇” 系列
“世事洞明皆学问” 系列

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

收获不止数据库

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值