数据集
session_id | row_number | sku_id |
aa | 2 | a |
aa | 3 | c |
aa | 1 | b |
bb | 2 | d |
bb | 3 | c |
bb | 1 | a |
bb | 4 | a |
根据session_id分组,将sku_id按照row_number的顺序,多行拼接成一行。
select
session_id,
collect_list(sku_id),
collect_set(sku_id)
from
(
select * from temp.tmp_as_test order by row_number desc
)
b
group by
session_id
代码比较简单,主要是比较巧妙。输出结果为:
aa | ["c","a","b"] | ["c","a","b"] |
bb | ["a","c","d","a"] | ["a","c","d"] |