题目
表a包含日期和股市涨跌标记(1:跌,0:涨),请用任意SQL得到连续跌的天数,得到例如表b的效果。
表a
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) |
---|---|
2023-1-16 | 1 |
2023-1-15 | 1 |
2023-1-14 | 1 |
2023-1-13 | 1 |
2023-1-12 | 0 |
2023-1-11 | 0 |
2023-1-10 | 0 |
2023-1-9 | 1 |
2023-1-8 | 0 |
表b
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | conti(连续跌的天数) |
---|---|---|
2023-1-16 | 1 | 4 |
2023-1-15 | 1 | 3 |
2023-1-14 | 1 | 2 |
2023-1-13 | 1 | 1 |
2023-1-12 | 0 | 0 |
2023-1-11 | 0 | 0 |
2023-1-10 | 0 | 0 |
2023-1-9 | 1 | 1 |
2023-1-8 | 0 | 0 |
求连续天数类型的题,我们核心的思路就是找出能够将每组连续值分开的标记,下面介绍该题的两种解法。
方法一
第一种方法也是求连续类问题时用到的最常规的方法,对adate和tag两列得到排好的row_number,作差,根据差值即可得到分组。最后再对组内排序即可,完整代码如下
Select adate, tag,
Case When tag = 0 Then 0 Else row_number()Over(Partition By diff Order By adate) End conti From
(
Select adate, tag,
Case When tag = 1 Then
row_number()Over(Order By adate desc) -
row_number()Over(Partition By tag Order By adate desc)
End diff
From a
)tmp
Order By adate desc
下面是逐步拆解
首先对两列进行分组排序,得到两列值。
Select adate, tag,
row_number()Over(Order By adate desc) rn1,
row_number()Over(Partition By tag Order By adate desc)rn2
From a
结果为
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | rn1 | rn2 |
---|---|---|---|
2023-1-16 | 1 | 1 | 1 |
2023-1-15 | 1 | 2 | 2 |
2023-1-14 | 1 | 3 | 3 |
2023-1-13 | 1 | 4 | 4 |
2023-1-12 | 0 | 5 | 1 |
2023-1-11 | 0 | 6 | 2 |
2023-1-10 | 0 | 7 | 3 |
2023-1-9 | 1 | 8 | 5 |
2023-1-8 | 0 | 9 | 4 |
对于tag=1时,观察 rn1 和 rn2 ,不难发现,对于连续跌的天,两值作差得到的数是一致的。
Select adate, tag,
Case When tag = 1 Then
row_number()Over(Order By adate desc) -
row_number()Over(Partition By tag Order By adate desc)
End diff
From a
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | diff |
---|---|---|
2023-1-16 | 1 | 0 |
2023-1-15 | 1 | 0 |
2023-1-14 | 1 | 0 |
2023-1-13 | 1 | 0 |
2023-1-12 | 0 | |
2023-1-11 | 0 | |
2023-1-10 | 0 | |
2023-1-9 | 1 | 3 |
2023-1-8 | 0 |
对作差得到的值作为标记进行分组,最后对分组以日期为顺序进行排序即可。
Select adate, tag,
Case When tag = 0 Then 0 Else row_number()Over(Partition By diff Order By adate) End conti From
(
Select adate, tag,
Case When tag = 1 Then
row_number()Over(Order By adate desc) -
row_number()Over(Partition By tag Order By adate desc)
End diff
From a
)tmp
Order By adate desc
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | conti(连续跌的天数) |
---|---|---|
2023-1-16 | 1 | 4 |
2023-1-15 | 1 | 3 |
2023-1-14 | 1 | 2 |
2023-1-13 | 1 | 1 |
2023-1-12 | 0 | 0 |
2023-1-11 | 0 | 0 |
2023-1-10 | 0 | 0 |
2023-1-9 | 1 | 1 |
2023-1-8 | 0 | 0 |
方法二
第二种方法是先用Lag开窗将发生涨跌变化的天标记为1,其他为0,然后用Sum开窗求和即可得到连续涨 / 跌天的统一分组标记。完整代码如下
Select adate, tag,
Case When tag = 1 Then row_number()Over(Partition By grp Order By adate Desc) Else 0 End conti
From (
Select adate, tag, change_flag,
Sum(change_flag)Over(Order By adate Desc) grp
From (
Select adate, tag,
Case When Lag(tag)Over(Order By adate Desc) = tag Then 0 Else 1 End change_flag
From a Order By adate
) t1
) t2
Order By adate Desc;
下面是逐步拆解
首先对tag进行偏移,偏移列的值与原tag值不一致的即为发生涨跌变化的行,将发生变化的行标记为1,未发生变化的行标记为0。
Select adate, tag,
Case When Lag(tag)Over(Order By adate Desc) = tag Then 0 Else 1 End change_flag
From a Order By adate Desc
得到
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | change_flag |
---|---|---|
2023-1-16 | 1 | 1 |
2023-1-15 | 1 | 0 |
2023-1-14 | 1 | 0 |
2023-1-13 | 1 | 0 |
2023-1-12 | 0 | 1 |
2023-1-11 | 0 | 0 |
2023-1-10 | 0 | 0 |
2023-1-9 | 1 | 1 |
2023-1-8 | 0 | 1 |
对变化标记change_flag进行开窗求和,
由于变化行为1未变化行为0,此时窗口开到的部分,
对于未变化的行有:求和一致为上一次发生变化时的sum值;
对于发生变化的行有:求和为上一次变化时的sum值+1。
Select adate, tag, change_flag,
Sum(change_flag)Over(Order By adate Desc) grp
From (
Select adate, tag,
Case When Lag(tag)Over(Order By adate Desc) = tag Then 0 Else 1 End change_flag
From a Order By adate
)
得到
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | change_flag | grp |
---|---|---|---|
2023-1-16 | 1 | 1 | 1 |
2023-1-15 | 1 | 0 | 1 |
2023-1-14 | 1 | 0 | 1 |
2023-1-13 | 1 | 0 | 1 |
2023-1-12 | 0 | 1 | 2 |
2023-1-11 | 0 | 0 | 2 |
2023-1-10 | 0 | 0 | 2 |
2023-1-9 | 1 | 1 | 3 |
2023-1-8 | 0 | 1 | 4 |
以grp将连续相同的值(连续跌/涨)分组,
分组后case when对tag = 1的行按照组进行row_number开窗即可得到结果。
Select adate, tag,
Case When tag = 1 Then row_number()Over(Partition By grp Order By adate Desc) Else 0 End conti
From (
Select adate, tag, change_flag,
Sum(change_flag)Over(Order By adate Desc) grp
From (
Select adate, tag,
Case When Lag(tag)Over(Order By adate Desc) = tag Then 0 Else 1 End change_flag
From a Order By adate
) t1
) t2
Order By adate Desc;
adate(日期) | tag(股市涨跌标记(1:跌,0:涨) | conti(连续跌的天数) |
---|---|---|
2023-1-16 | 1 | 4 |
2023-1-15 | 1 | 3 |
2023-1-14 | 1 | 2 |
2023-1-13 | 1 | 1 |
2023-1-12 | 0 | 0 |
2023-1-11 | 0 | 0 |
2023-1-10 | 0 | 0 |
2023-1-9 | 1 | 1 |
2023-1-8 | 0 | 0 |