SQL等价改写优化案例精选第一部【一统天下】① count(case when)的合并魔力

前言

SQL优化的终极境界是回归需求本质。理解需求,正是高效SQL的起点,而改写SQL往往是优化的关键手段。

本系列暂时不对执行计划做过多的分析,虽然它在SQL优化中至关重要。原因很简单:如果不能深入理解业务需求,即使精通执行计划分析,面对那些"九曲十八弯"的复杂SQL,我们依然会束手无策。

文中案例的SQL语句全部来自真实的生产系统,SQL语句大多非常繁琐,希望大家有空的时候能克服烦躁,耐心阅读完。我在审核这样的SQL时不但要看懂这么繁琐的语句,还要完成高效的等价改写来调优,这需要读懂开发人员代码编写的需求背后的真正含义。

低效语句往往在开发投入生产初期由于数据量少而不能被识别出来,等运行了一段时间数据量大涨后才暴露出来。根据经验,生产中和数据库相关的80%的故障都来自不良SQL,因此今天根据审核SQL的心得,总结为如下三部分,共计12个经典案例,希望对大家有用。

在这里插入图片描述

一.统一天下(合并代码,减少表扫描)

合理整合SQL逻辑,一次查询解决多个需求,让数据库少跑路。

案例1 注重COUNT(CASE WHEN)的合并魔力

看到这个SQL,你可能第一反应是"天啊,这么长!":

select distinct ne_state.peer_id peer_name,
                         to_char(ne_state.ne_state) peer_state,
                         (case
                           when ne_state.ne_state = 0 then
                            to_char(0)
                           else
                            (select distinct to_char(nvl(ne_active.active, 0))
                               from dcc_sys_log,
                                    (select peer_id,
                                            decode(action,
                                                   'active',
                                                   1,
                                                   'de-active',
                                                   0,
                                                   0) active,
                                            max(log_time)
                                       from dcc_sys_log
                                      where action = 'active'
                                         or action = 'de-active'
                                      group by (peer_id, action)) ne_active
                              where dcc_sys_log.peer_id = ne_active.peer_id(+)
                                and dcc_sys_log.peer_id = ne_state.peer_id)
                         end) peer_active,
                         (case
                           when ne_state.ne_state = 0 then
                            to_char(0)
                           else
                            (to_char(nvl((select count(*)
                                           from dcc_ne_log
                                          where dcc_ne_log.result <> 1
                                            and peer_id = ne_state.peer_id
                                            and log_time between
                                                trunc(sysdate) and sysdate
                                          group by (peer_id)),
                                         0)))
                         end) err_cnt,
                         (case
                           when ne_state.ne_state = 0 then
                            to_char(0)
                           else
                            (to_char(nvl((select count(*)
                                           from dcc_ne_log in_dnl
                                          where in_dnl.direction = 'recv'
                                            and in_dnl.peer_id =
                                                ne_state.peer_id
                                            and log_time between
                                                trunc(sysdate) and sysdate),
                                         0)))
                         end) recv_cnt,
                         (case
                           when ne_state.ne_state = 0 then
                            to_char(0)
                           else
                            (to_char(nvl((select sum(length)
                                           from dcc_ne_log in_dnl
                                          where in_dnl.direction = 'recv'
                                            and in_dnl.peer_id =
                                                ne_state.peer_id
                                            and log_time between
                                                trunc(sysdate) and sysdate),
                                         0)))
                         end) recv_byte,
                         (case
                           when ne_state.ne_state = 0 then
                            to_char(0)
                           else
                            (to_char(nvl((select count(*)
                                           from dcc_ne_log in_dnl
                                          where in_dnl.direction = 'send'
                                            and in_dnl.peer_id =
                                                ne_state.peer_id
                                            and log_time between
                                                trunc(sysdate) and sysdate),
                                         0)))
                         end) send_cnt,
                         (case
                           when ne_state.ne_state = 0 then
                            to_char(0)
                           else
                            (to_char(nvl((select sum(length)
                                           from dcc_ne_log in_dnl
                                          where in_dnl.direction = 'send'
                                            and in_dnl.peer_id =
                                                ne_state.peer_id
                                            and log_time between
                                                trunc(sysdate) and sysdate),
                                         0)))
                         end) send_byte
           from dcc_ne_log,
                (select distinct dsl1.peer_id peer_id,
                                 nvl(ne_disconnect_info.ne_state, 1) ne_state
                   from dcc_sys_log dsl1,
                        (select distinct dnl.peer_id peer_id,
                                         decode(action,
                                                'disconnect',
                                                0,
                                                'connect',
                                                0,
                                                1) ne_state
                           from dcc_sys_log dsl, dcc_ne_log dnl
                          where dsl.peer_id = dnl.peer_id
                            and ((dsl.action = 'disconnect' and
                                dsl.cause = '关闭对端') or
                                (dsl.action = 'connect' and
                                dsl.cause = '连接主机失败'))
                            and dsl.log_time =
                                (select max(log_time)
                                   from dcc_sys_log
                                  where peer_id = dnl.peer_id
                                    and log_type = '对端交互')) ne_disconnect_info
                  where dsl1.peer_id = ne_disconnect_info.peer_id(+)) ne_state
          where ne_state.peer_id = dcc_ne_log.peer_id(+)

问题诊断:

  • 一眼看去,代码冗长复杂
  • 多次重复扫描同一张表(dcc_ne_log),仅条件不同
  • 存在多个子查询和复杂嵌套
  • 执行时间:超过300秒!

优化思路:寻找合并机会

分析发现:err_cnt、recv_cnt、recv_byte、send_cnt、send_byte这5个字段全部来自同一张表,只是过滤条件不同。这是典型的"可合并查询"场景!

魔法武器:COUNT(CASE WHEN)和SUM(CASE WHEN)

select peer_id
          ,COUNT(CASE WHEN RESULT <> 1 THEN 1 END) err_cnt
          ,COUNT(CASE WHEN direction = 'recv' THEN 1 END) recv_cnt
          ,SUM(CASE WHEN direction = 'recv' THEN length END) recv_byte
          ,COUNT(CASE WHEN direction = 'send' THEN 1 END) send_cnt
          ,SUM(CASE WHEN direction = 'send' THEN length END) send_byte
  from dcc_ne_log 
          where log_time >=trunc(sysdate)
          GROUP BY peer_id

进一步分析如下ne_disconnect_info结果集的写法:

select distinct dsl1.peer_id peer_id,
                                 nvl(ne_disconnect_info.ne_state, 1) ne_state
                   from dcc_sys_log dsl1,
                        (select distinct dnl.peer_id peer_id,
                                         decode(action,
                                                'disconnect',
                                                0,
                                                'connect',
                                                0,
                                                1) ne_state
                           from dcc_sys_log dsl, dcc_ne_log dnl
                          where dsl.peer_id = dnl.peer_id
                            and ((dsl.action = 'disconnect' and
                                dsl.cause = '关闭对端') or
                                (dsl.action = 'connect' and
                                dsl.cause = '连接主机失败'))
                            and log_type = '对端交互'
                            and dsl.log_time =
                                (select max(log_time)
                                   from dcc_sys_log
                                  where peer_id = dnl.peer_id
                                    and log_type = '对端交互')) ne_disconnect_info
                  where dsl1.peer_id = ne_disconnect_info.peer_id(+)

为了清晰的将结构展现出来,将上述两次改造的脚本用WITH子句分别封装为ne_state和dcc_ne_log_time,最终整体SQL语句改写完毕后,代码量大大减少的同时性能极大的提升了,表扫描次数从总计7次缩减为2次,执行完成时间从原先的300秒缩短为5秒。完整改造后的最终SQL代码优雅精致,具体如下:

with ne_state as                                                                 
(SELECT a.peer_id,
CASE WHEN dnl.peer_id IS NOT NULL AND str IN ('disconnect关闭对端','connect连接主机失败') THEN '0' ELSE '1' END ne_state
FROM (SELECT peer_id,MIN(action||cause) KEEP(DENSE_RANK LAST ORDER BY log_time) str 
      FROM dcc_sys_log dsl
      WHERE log_type = '对端交互'
      GROUP BY peer_id
) a,(SELECT DISTINCT peer_id FROM dcc_ne_log) dnl 
WHERE a.peer_id = dnl.peer_id(+)),
dcc_ne_log_time as (select peer_id
                          ,COUNT(CASE WHEN RESULT <> 1 THEN 1 END) err_cnt
                          ,COUNT(CASE WHEN direction = 'recv' THEN 1 END) recv_cnt
                          ,SUM(CASE WHEN direction = 'recv' THEN length END) recv_byte
                          ,COUNT(CASE WHEN direction = 'send' THEN 1 END) send_cnt
                          ,SUM(CASE WHEN direction = 'send' THEN length END) send_byte
                     from dcc_ne_log 
                     where log_time >=trunc(sysdate) ---- between trunc(sysdate) and sysdate 
                     GROUP BY peer_id)                                                                    
select distinct ne_state.peer_id peer_name,                                       
                         to_char(ne_state.ne_state) peer_state,                   
                         (case                                                    
                           when ne_state.ne_state = 0 then                        
                            to_char(0)                                            
                           else                                                   
                         NVL((select '1' from dcc_sys_log where peer_id = ne_state.peer_id and action = 'active' and rownum=1),'0')  
                         end) peer_active,   
                         decode(ne_state.ne_state,0,'0',nvl(dnlt.ERR_CNT,0)) ERR_CNT, ---注意NVL改造
                         decode(ne_state.ne_state,0,'0',nvl(dnlt.recv_cnt,0)) recv_cnt, 
                         decode(ne_state.ne_state,0,'0',nvl(dnlt.recv_byte,0)) recv_byte,
                         decode(ne_state.ne_state,0,'0',nvl(dnlt.send_cnt,0)) send_cnt, 
                         decode(ne_state.ne_state,0,'0',nvl(dnlt.send_byte,0)) send_byte                                      
           from ne_state ,dcc_ne_log_time dnlt  
           where    ne_state.peer_id=dnlt.peer_id(+)

关键启示

本案例完美诠释了"统一天下"的优化理念。当代码逻辑散落在多个相似的子查询中时,我们可以通过巧妙的SQL技巧(如条件聚合)将它们合并,减少数据访问次数,实现性能的质的提升。

优化SQL前,不要急于分析执行计划细节,而是先退一步看全局:这段SQL到底想完成什么?有没有更简洁的表达方式?回归需求本质,往往能找到突破性的优化思路。

未完待续…
SQL等价改写优化案例精选第一部【一统天下】②分析函数的穿透之力

系列回顾

“大白话人工智能” 系列
“数据库拍案惊奇” 系列
“世事洞明皆学问” 系列

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

收获不止数据库

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值