很多业务场景下都需要获取用户当前订单的下一订单信息,例如淘宝和京东的订单的下一订单,滴滴打车司机的下一单号。或者浏览场景下,用户浏览商品的顺序等等。
样例数据1:表名order_info
userid | orderid | ordertime | orderinfo |
---|---|---|---|
1 | 10000 | 2017-01-14 12:25:56 | 手机 |
1 | 10001 | 2017-01-17 11:05:23 | 手机壳 |
1 | 10002 | 2017-01-20 10:05:28 | 瓜子 |
2 | 10003 | 2017-01-14 12:25:56 | 电脑 |
2 | 10004 | 2017-01-17 11:05:23 | 瓜子 |
3 | 10005 | 2017-01-20 10:05:28 | 手机 |
目标输出:
userid | orderid | ordertime | orderinfo | next_orderid | next_ordertime | next_orderinfo |
---|---|---|---|---|---|---|
1 | 10000 | 2017-01-14 12:25:56 | 手机 | 10001 | 2017-01-17 11:05:23 | 手机壳 |
1 | 10001 | 2017-01-17 11:05:23 | 手机壳 | 10002 | 2017-01-20 10:05:28 | 瓜子 |
2 | 10003 | 2017-01-14 12:25:56 | 电脑 | 10004 | 2017-01-17 11:05:23 | 瓜子 |
sql分析:目标是以用户分组,拿到当前用户订单后面所有订单信息,按照下单时间排序,取得时间差最小的那一个订单信息。
select
t.userid,
t.orderid,
t.ordertime,
t.orderinfo,
t.next_orderid,
t.next_ordertime,
t.next_orderinfo
from
(
select
a.userid,
a.orderid,
a.ordertime,
a.orderinfo,
b.orderid as next_orderid,
b.ordertime as next_ordertime,
b.orderinfo as next_orderinfo,
row_number() over(partition by a.userid, a.orderid
order by unix_timestamp(b.ordertime) - unix_timestamp(a.ordertime) asc) as seq_rank
from
(
select userid, orderid, ordertime, orderinfo from order_info
)
a
JOIN
(
select userid, orderid, ordertime, orderinfo from order_info
)
b
on
a.userid = b.userid
where
a.ordertime < b.ordertime
)
t
where
t.seq_rank = 1
通过seq_rank 控制不仅可以拿到下一单还可以拿到下下单,下K单等等。
可以作为类似于n-gram算法的数据源。