Oracle SQL - 合并重叠的期间

  • 数据和目标

有如下数据存储了各组件的有效期间(此处起止日期用数字代替以便查阅),目标为将有重叠的期间合并到一起。

SQL> SELECT * FROM demo_eff_periods;

COMPONENT_ITEM_ID EFFECTIVITY_DATE DISABLE_DATE
----------------- ---------------- ------------
                1             1000         1999
                1             1500         3000
                1             3000         3999
                1             4000         4999
                2             1000         4999
                2             2000         3999
                3             1000         1999
                3             1500         2999
                3             1000         3999
                4             1000         1999
                4             2000         2999
                4             3000         3999
                5             1000         1999
                5             1000         1999
                5             4000         4999

15 rows selected

将数据用下图表示,除组件1的第二段区间为闭区间,其它均为左闭右开。

  •  例程一:转化为连续日期合并

先使用递归语法把每段期间拆解出具体的每一天,再将连续的日期整合为新区间。

这里使用dense_rank是因为可能有重复的日期出现。

如果出现上一期间的结束日期与下一期间的开始日期刚好连续的情况,这个方法将会把这两个区间合并到一起。

SQL> SELECT component_item_id,
  2         MIN(new_date) effectivity_date,
  3         MAX(new_date) disable_date
  4    FROM (SELECT dep.component_item_id,
  5                 dep.effectivity_date + LEVEL - 1 new_date,
  6                 dense_rank() over(PARTITION BY dep.component_item_id ORDER BY dep.effectivity_date + LEVEL) dr
  7            FROM demo_eff_periods dep
  8          CONNECT BY LEVEL + dep.effectivity_date - 1 <= dep.disable_date
  9                 AND PRIOR dep.rowid = dep.rowid
 10                 AND PRIOR sys_guid() IS NOT NULL)
 11   GROUP BY component_item_id, new_date - dr;

COMPONENT_ITEM_ID EFFECTIVITY_DATE DISABLE_DATE
----------------- ---------------- ------------
                1             1000         4999
                2             1000         4999
                3             1000         3999
                4             1000         3999
                5             1000         1999
                5             4000         4999

6 rows selected
  • 例程二:递归串联区间合并

使用RSF语法,先找到所有不被其它区间包含的期间开始日期,再递归串联后续区间,最终得以合并。

将SQL中注释的部分解除,则可实现如例程一的进一步合并连续区间的效果。

SQL> WITH rsf_eff_periods(component_item_id, effectivity_date, disable_date) AS
  2   (SELECT dep.component_item_id, dep.effectivity_date, dep.disable_date
  3      FROM demo_eff_periods dep
  4     WHERE NOT EXISTS
  5     (SELECT 1
  6              FROM demo_eff_periods dep0
  7             WHERE dep0.component_item_id = dep.component_item_id
  8               AND dep0.effectivity_date < dep.effectivity_date
  9               AND dep0.disable_date >= dep.effectivity_date /*- 1*/)
 10    UNION ALL
 11    SELECT rep.component_item_id, rep.effectivity_date, dep.disable_date
 12      FROM rsf_eff_periods rep, demo_eff_periods dep
 13     WHERE rep.component_item_id = dep.component_item_id
 14       AND rep.disable_date >= dep.effectivity_date /*- 1*/
 15       AND rep.disable_date < dep.disable_date)
 16  SELECT component_item_id, effectivity_date, MAX(disable_date) disable_date
 17    FROM rsf_eff_periods
 18   GROUP BY component_item_id, effectivity_date;

COMPONENT_ITEM_ID EFFECTIVITY_DATE DISABLE_DATE
----------------- ---------------- ------------
                1             1000         3999
                1             4000         4999
                2             1000         4999
                3             1000         3999
                4             1000         1999
                4             2000         2999
                4             3000         3999
                5             1000         1999
                5             4000         4999

9 rows selected

 CONNECT BY写法:

SELECT component_item_id, effectivity_date, MAX(disable_date) disable_date
  FROM (SELECT dep.component_item_id,
               connect_by_root effectivity_date effectivity_date,
               dep.disable_date
          FROM demo_eff_periods dep
         START WITH NOT EXISTS
         (SELECT 1
                       FROM demo_eff_periods dep0
                      WHERE dep0.component_item_id = dep.component_item_id
                        AND dep0.effectivity_date < dep.effectivity_date
                        AND dep0.disable_date >= dep.effectivity_date /*- 1*/
                     )
        CONNECT BY PRIOR dep.component_item_id = dep.component_item_id
               AND PRIOR dep.disable_date >= dep.effectivity_date /*- 1*/
               AND PRIOR dep.disable_date < dep.disable_date)
 GROUP BY component_item_id, effectivity_date;
  • 例程三:区间端点排序重组

分别找到不被其它期间包含的开始日期和结束日期,即合并后应得的区间端点,两类端点(开始和结束)分别排序,同位次的端点即组成新区间。

这里使用dense_rank是因为可能有重复的日期出现。

将SQL中注释的部分解除,则可实现如例程一的进一步合并连续区间的效果。

SQL> SELECT component_item_id,
  2         MIN(new_date) effectivity_date,
  3         MAX(new_date) disable_date
  4    FROM (SELECT dep.component_item_id,
  5                 dep.effectivity_date new_date,
  6                 dense_rank() over(PARTITION BY dep.component_item_id ORDER BY dep.effectivity_date) dr
  7            FROM demo_eff_periods dep
  8           WHERE NOT EXISTS
  9           (SELECT 1
 10                    FROM demo_eff_periods dep0
 11                   WHERE dep0.component_item_id = dep.component_item_id
 12                     AND dep0.effectivity_date < dep.effectivity_date
 13                     AND dep0.disable_date >= dep.effectivity_date /*- 1*/)
 14          UNION ALL
 15          SELECT dep.component_item_id,
 16                 dep.disable_date new_date,
 17                 dense_rank() over(PARTITION BY dep.component_item_id ORDER BY dep.disable_date) dr
 18            FROM demo_eff_periods dep
 19           WHERE NOT EXISTS
 20           (SELECT 1
 21                    FROM demo_eff_periods dep0
 22                   WHERE dep0.component_item_id = dep.component_item_id
 23                     AND dep0.effectivity_date <= dep.disable_date /*+ 1*/
 24                     AND dep0.disable_date > dep.disable_date))
 25   GROUP BY component_item_id, dr;

COMPONENT_ITEM_ID EFFECTIVITY_DATE DISABLE_DATE
----------------- ---------------- ------------
                1             1000         3999
                1             4000         4999
                2             1000         4999
                3             1000         3999
                4             1000         1999
                4             2000         2999
                4             3000         3999
                5             1000         1999
                5             4000         4999

9 rows selected
  • 例程四:区间端点排序重组(减少扫描表次数)

对于每一个原始期间,如有比自己早开始的期间其结束日期比自己的开始日期还大,则自己的开始日期将被吞并;如有比自己晚结束的期间其开始日期比自己的结束日期还小,则自己的结束日期将被吞并。

筛选出合并后应得的区间端点后,两类端点(开始和结束)分别排序,同位次的端点即组成新区间。

这里使用dense_rank是因为可能有重复的日期出现。

将SQL中注释的部分解除,则可实现如例程一的进一步合并连续区间的效果。

SQL> SELECT component_item_id,
  2         MIN(new_date) effectivity_date,
  3         MAX(new_date) disable_date
  4    FROM (SELECT component_item_id,
  5                 new_date,
  6                 dense_rank() over(PARTITION BY component_item_id, date_pos ORDER BY new_date) dr
  7            FROM (SELECT dep.component_item_id,
  8                         CASE
  9                           WHEN MAX(dep.disable_date)
 10                            over(PARTITION BY dep.component_item_id ORDER BY
 11                                     dep.effectivity_date
 12                                     RANGE BETWEEN unbounded preceding AND 1
 13                                     preceding) >= dep.effectivity_date /*- 1*/ THEN
 14                            NULL
 15                           ELSE
 16                            dep.effectivity_date
 17                         END AS effectivity_date,
 18                         CASE
 19                           WHEN MIN(dep.effectivity_date)
 20                            over(PARTITION BY dep.component_item_id ORDER BY
 21                                     dep.disable_date DESC
 22                                     RANGE BETWEEN unbounded preceding AND 1
 23                                     preceding) <= dep.disable_date /*+ 1*/ THEN
 24                            NULL
 25                           ELSE
 26                            dep.disable_date
 27                         END AS disable_date
 28                    FROM demo_eff_periods dep)
 29          UNPIVOT(new_date FOR date_pos IN(effectivity_date, disable_date)))
 30   GROUP BY component_item_id, dr;

COMPONENT_ITEM_ID EFFECTIVITY_DATE DISABLE_DATE
----------------- ---------------- ------------
                1             1000         3999
                1             4000         4999
                2             1000         4999
                3             1000         3999
                4             1000         1999
                4             2000         2999
                4             3000         3999
                5             1000         1999
                5             4000         4999

9 rows selected

更多方法,欢迎大家讨论

  • 5
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值