SQL等价改写优化案例精选第二部【精兵简政】 ② 无脑复制粘贴-CSDN博客

本文链接：https://blog.csdn.net/weixin_39534833/article/details/147904548

梁敬彬梁敬弘兄弟出品

往期回顾

SQL等价改写优化案例精选第一部【一统天下】① count(case when)的合并魔力
 SQL等价改写优化案例精选第一部【一统天下】②分析函数的穿透之力
 SQL等价改写优化案例精选第一部【一统天下】③ group by (case when)的聚合艺术
 SQL等价改写优化案例精选第一部【一统天下】④ or 与 union all的优化交响
 SQL等价改写优化案例精选第二部【精兵简政】① 莫名其妙的行列转换

在上一篇案例中，我们探讨了"莫名其妙"的行列转换如何导致SQL性能低下。本篇，我们将继续探讨"精兵简政"理念，聚焦于另一个常见的代码编写误区——无脑复制粘贴。这种做法不仅使代码冗长难维护，还会带来严重的性能问题。

案例6 代码需要如此的复制粘贴吗？

select distinct to_char(a.svcctx_id),
                to_char(0),
                to_char(nvl((select peer_id
                              from dcc_ne_config
                             where peer_name = a.peer),
                            0)),
                to_char(a.priority),
                to_char(nvl((select peer_id
                              from dcc_ne_config
                             where peer_name = b.peer),
                            0)),
                to_char(b.priority),
                to_char(nvl((select peer_id
                              from dcc_ne_config
                             where peer_name = c.peer),
                            0)),
                to_char(c.priority)
  from (select hopbyhop,
               svcctx_id,
               substr(cause,
                      instr(cause, 'Host = ') + 7,
                      instr(cause, 'Priority = ') - instr(cause, 'Host = ') - 11) peer,
               substr(cause,
                      instr(cause, 'Priority = ') + 11,
                      instr(cause, 'reachable = ') -
                      instr(cause, 'Priority = ') - 13) priority
          from dcc_sys_log
         where cause like '%SC路由应答%'
           and hopbyhop in (select distinct hopbyhop from dcc_sys_log)) a,
       (select hopbyhop,
               svcctx_id,
               substr(cause,
                      instr(cause, 'Host = ') + 7,
                      instr(cause, 'Priority = ') - instr(cause, 'Host = ') - 11) peer,
               substr(cause,
                      instr(cause, 'Priority = ') + 11,
                      instr(cause, 'reachable = ') -
                      instr(cause, 'Priority = ') - 13) priority
          from dcc_sys_log
         where cause like '%SC路由应答%'
           and hopbyhop in (select distinct hopbyhop from dcc_sys_log)) b,
       (select hopbyhop,
               svcctx_id,
               substr(cause,
                      instr(cause, 'Host = ') + 7,
                      instr(cause, 'Priority = ') - instr(cause, 'Host = ') - 11) peer,
               substr(cause,
                      instr(cause, 'Priority = ') + 11,
                      instr(cause, 'reachable = ') -
                      instr(cause, 'Priority = ') - 13) priority
          from dcc_sys_log
         where cause like '%SC路由应答%'
           and hopbyhop in (select distinct hopbyhop from dcc_sys_log)) c
 where a.hopbyhop = b.hopbyhop
   and a.hopbyhop = c.hopbyhop
   and a.peer <> b.peer
   and a.peer <> c.peer
   and b.peer <> c.peer
   and a.priority <> b.priority
   and a.priority <> c.priority
   and b.priority <> c.priority

一眼望去，这段SQL充满了复制粘贴的痕迹。同一个子查询被复制了三次，只是别名不同（a、b、c）。这种写法不仅使代码臃肿难读，更严重的是，它导致数据库需要多次扫描相同的表和执行相同的计算逻辑，极大地降低了查询性能。

这个SQL语句其实是为了实现行列转换，但是效率非常低，扫描表多次，而且最终结果也不正确。如何进行分析呢？

需求理解

首先了解该语句的需求是什么，经了解，需求如下：

DROP TABLE TEST;
CREATE TABLE TEST ( ID1 NUMBER,ID2 NUMBER,VALUE1 VARCHAR2(20),VALUE2 VARCHAR2(20));
INSERT INTO TEST VALUES (1,2,'A','B');
INSERT INTO TEST VALUES (1,2,'C','D');
INSERT INTO TEST VALUES (1,2,'E','F');
INSERT INTO TEST VALUES (1,2,'G','H');
INSERT INTO TEST VALUES (3,8,'I','J');
INSERT INTO TEST VALUES (3,8,'K','L');
INSERT INTO TEST VALUES (3,8,'M','N');
INSERT INTO TEST VALUES (8,9,'O','P');
INSERT INTO TEST VALUES (8,9,'Q','R');
INSERT INTO TEST VALUES (11,12,'S','T');
COMMIT;

SQL> SELECT * FROM TEST;
 
       ID1        ID2   VALUE1           VALUE2
---------- ---------- -------------------- ------------------------------
         1          2  A                   B
         1          2  C                   D
         1          2  E                   F
         1          2  G                   H
         3          8  I                    J
         3          8  K                   L
         3          8  M                  N
         8          9  O                   P
         8          9  Q                   R
        11         12  S                   T
 
10 rows selected

要求为(行列转换，超过3个的只取三个，不足3个的用空格来补列）

ID1        ID2 VALUE1               VALUE2  VALUE3               VALUE4  VALUE5               VALUE6 
---------- ---------- -------------------- ----------------------------------------------------------------------------------------------------------------------------
1           2    A                    B      C                    D        E                    F
3           8    I                     J      K                    L         M                   N
8           9    O                    P      Q                    R        NULL                NULL
11         12    S                    T      NULL                 NULL     NULL                NULL

问题分析

原SQL存在以下几个主要问题：

重复子查询：相同的子查询被复制三次，导致数据库必须重复执行相同的逻辑
自连接复杂性：使用三个相同子查询的自连接，并通过多个条件进行筛选，逻辑复杂且性能低下
字符串处理僵硬：使用固定偏移量进行字符串截取，缺乏灵活性
IN子查询低效：hopbyhop in (select distinct hopbyhop from
dcc_sys_log)子句使得每个子查询都会扫描同一个表两次

优化方案

针对行列转换需求，我们可以使用更高效的方法，我们可以通过MAX+分析函数实现如下：

SELECT ID1,ID2
      ,MAX(DECODE(RN,1,VALUE1))
      ,MAX(DECODE(RN,1,VALUE2))
      ,MAX(DECODE(RN,2,VALUE1))
      ,MAX(DECODE(RN,2,VALUE2))
      ,MAX(DECODE(RN,3,VALUE1))
      ,MAX(DECODE(RN,3,VALUE2))
  FROM (SELECT TEST.*, ROW_NUMBER() OVER(PARTITION BY ID1,ID2 ORDER BY VALUE1,VALUE2) RN FROM TEST) T
WHERE RN<=3
GROUP BY ID1,ID2;

这种方式利用分析函数和条件聚合实现行列转换，只需一次表扫描，代码简洁明了。对于原始SQL中的字符串处理，我们也可以采用更灵活的方式：

with data as (SELECT 'SC路由应答：Host = SR2@001.ChinaTelecom.com, Priority = 1, reachable = true' as str 
                    ,'Host = ' k1
                    ,'Priority = ' k2
               FROM DUAL)
,data2 AS (SELECT data.*,INSTR(str,k1) p1,INSTR(str,k2) p2 FROM data)
select SUBSTR(str,p1+LENGTH(k1),INSTR(str,',',p1+1)-p1-LENGTH(k1))
      ,SUBSTR(str,p2+LENGTH(k2),INSTR(str,',',p2+1)-p2-LENGTH(k2))
  from data2;

这样通过计算相对位置来提取字符串，更加健壮可靠。

完整优化方案

将上述思路应用到原始SQL中，优化后的代码如下：

select distinct to_char(svcctx_id),
                to_char(0),
                to_char(nvl((select peer_id
                              from dcc_ne_config
                             where peer_name = PEER1),
                            0)),
                to_char(priority1),
                to_char(nvl((select peer_id
                              from dcc_ne_config
                             where peer_name =PEER2),
                            0)),
                to_char(priority2),
                to_char(nvl((select peer_id
                              from dcc_ne_config
                             where peer_name = PEER3),
                            0)),
                to_char(priority3)
  from
(with data as (select hopbyhop,
               svcctx_id,
               cause as str,
               'Host = ' k1,
               'Priority = ' k2 
               from dcc_sys_log  where cause like '%SC路由应答%')
,data2 as (select data.*,instr(str,k1) p1, instr(str,k2) p2 from data)
,data3 as 
(select hopbyhop,
       svcctx_id,
       SUBSTR(str,p1+LENGTH(k1),INSTR(str,',',p1+1)-p1-LENGTH(k1)) peer
      ,SUBSTR(str,p2+LENGTH(k2),INSTR(str,',',p2+1)-p2-LENGTH(k2)) PRIORITY
  from data2)
SELECT hopbyhop,svcctx_id,
       MAX(DECODE(RN,1,PEER)) PEER1
      ,MAX(DECODE(RN,1,PRIORITY)) PRIORITY1
      ,MAX(DECODE(RN,2,PEER))  PEER2
      ,MAX(DECODE(RN,2,PRIORITY))  PRIORITY2
      ,MAX(DECODE(RN,3,PEER)) PEER3
      ,MAX(DECODE(RN,3,PRIORITY)) PRIORITY3
 FROM (SELECT data3.*, ROW_NUMBER() OVER(PARTITION BY hopbyhop,svcctx_id ORDER BY PEER,PRIORITY) RN FROM data3) 
WHERE RN<=3
GROUP BY hopbyhop,svcctx_id) t2